Get content between HTML or XML tags from an https URL

Post by MESSIAH » 2019-11-07 02:18

I need to read an XML or HTML file from a website and get the content between tags such as url or div.

The URL starts with https.

I tried several XML parsers without success; they are outdated and do not support https/SSL.

Code:
xmlstarlet sel --net -t -c "*" https://xmlstar.sourceforge.net/xmlstarlet-xsa.xml


failed to load external entity "https://xmlstar.sourceforge.net/xmlstarlet-xsa.xml"
Last edited by MESSIAH on 2019-11-07 14:55, edited 1 time in total.

Re: Get content between HTML or XML tags from an https URL

Post by Dai_trying » 2019-11-07 08:21

I would use Python with BeautifulSoup and requests. A short tutorial can be found in this YouTube video, and searching for "web scraping" turns up many other tutorials.
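
A rough sketch of that approach, assuming the requests and beautifulsoup4 packages are installed. The helper names fetch and tag_contents are made up for this example, and "url" is just the tag mentioned in the question, so swap in whatever tag you need. requests handles https itself, so the download does not depend on the parser's network support.

```python
import requests
from bs4 import BeautifulSoup


def fetch(url, timeout=10):
    """Download a page over http or https and return its body as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.text


def tag_contents(markup, tag_name):
    """Return the stripped text inside every <tag_name> element."""
    # "html.parser" ships with Python, so no extra parser package is needed.
    soup = BeautifulSoup(markup, "html.parser")
    return [element.get_text(strip=True) for element in soup.find_all(tag_name)]


if __name__ == "__main__":
    # URL taken from the first post; the tag name is an assumption.
    page = fetch("https://xmlstar.sourceforge.net/xmlstarlet-xsa.xml")
    for text in tag_contents(page, "url"):
        print(text)
```

html.parser is lenient and lowercases tag names, which is usually fine for pulling text out of simple documents. For strict XML handling, BeautifulSoup also accepts "xml" as the parser name, but that needs the lxml package installed.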

