
How to click on all links in a page with "cURL" or "wget"?

Posted: 2021-02-07 08:33
by hack3rcon
Hello,
I want to click on all links in a page with the "cURL" or "wget" tool. I found a curl command, but it shows me the error below:

Code: Select all

$ curl -r -l 2 https://www.TARGET.com/
Warning: Invalid character is found in given range. A specified range MUST 
Warning: have only digits in 'start'-'stop'. The server's response to this 
Warning: request is uncertain.
curl: (7) Couldn't connect to server
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request - Invalid Header</h2>
<hr><p>HTTP Error 400. The request has an invalid header name.</p>
</BODY></HTML>
How can I fix it?

Thank you.

Re: How to click on all links in a page with "cURL" or "wget"?

Posted: 2021-02-07 09:38
by Head_on_a_Stick
Read the man page. And what does "click on all links" mean, exactly? That doesn't seem to make any sense at all in the context of curl or wget. What are you actually trying to do?

Re: How to click on all links in a page with "cURL" or "wget"?

Posted: 2021-03-22 16:34
by hack3rcon
Head_on_a_Stick wrote:Read the man page. And what does "click on all links" mean, exactly? That doesn't seem to make any sense at all in the context of curl or wget. What are you actually trying to do?
Thanks.
Consider a web page that has some links in it. I'd like to use wget or cURL to send a request to each of those links, as if clicking on all of them.
I saw https://askubuntu.com/questions/639069/ ... e-webpages, but:

Code: Select all

$ curl -r -l 2 https://www.URL.com/
Warning: Invalid character is found in given range. A specified range MUST 
Warning: have only digits in 'start'-'stop'. The server's response to this 
Warning: request is uncertain.
curl: (7) Couldn't connect to server
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request - Invalid Header</h2>
<hr><p>HTTP Error 400. The request has an invalid header name.</p>
</BODY></HTML>

Re: How to click on all links in a page with "cURL" or "wget"?

Posted: 2021-03-22 17:05
by reinob
That's what happens when you blindly type something somebody wrote in some forum.
Read the manual.
-r in wget is not the same as -r in curl.
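The askubuntu answer was written for wget. To curl, -r means --range, so "curl -r -l 2 URL" appears to take "-l" as a byte range (hence the range warning), treat "2" as a separate URL (hence "Couldn't connect to server") and send the malformed range with the real request (hence the 400). A minimal sketch of the wget form that answer probably intended, with www.example.com standing in for the real site:

Code: Select all

# -r turns on recursive retrieval, -l 2 limits the recursion depth to 2
$ wget -r -l 2 https://www.example.com/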

Re: How to click on all links in a page with "cURL" or "wget"?

Posted: 2021-03-23 14:42
by hack3rcon
reinob wrote:That's what happens when you blindly type something somebody wrote in some forum.
Read the manual.
-r in wget is not the same as -r in curl.
In cURL:
-r, --range <range> Retrieve only the bytes within RANGE
--raw Do HTTP "raw"; no transfer decoding
I know a wget command like the one below can do it, but it downloads the whole website:

Code: Select all

$ wget -r -p -k http://website
I just want wget to send a request to each link, as if clicking on all of them.

Re: How to click on all links in a page with "cURL" or "wget"?

Posted: 2021-03-23 15:02
by reinob
hack3rcon wrote:
reinob wrote:That's what happens when you blindly type something somebody wrote in some forum.
Read the manual.
-r in wget is not the same as -r in curl.
In cURL:
-r, --range <range> Retrieve only the bytes within RANGE
--raw Do HTTP "raw"; no transfer decoding
I know a wget command like the one below can do it, but it downloads the whole website:

Code: Select all

$ wget -r -p -k http://website
I just want wget to send a request to each link, as if clicking on all of them.
You're going to have to define "click".
If you mean that every link on a webpage should be requested (GET /.../ HTTP/1.0, etc.) then recursive wget is what you want.

If you have a problem with wget actually storing the downloaded page, then you have another problem to deal with, which is easy enough (you can just wipe the folder when you're done). Or use "-O /dev/null".

If your "click" can also be a HEAD request (instead of a GET request), then you can use "wget --recursive --spider", which will "click" (HEAD) every link without downloading anything.

Re: How to click on all links in a page with "cURL" or "wget"?

Posted: 2021-03-24 10:22
by hack3rcon
reinob wrote:
hack3rcon wrote:
reinob wrote:That's what happens when you blindly type something somebody wrote in some forum.
Read the manual.
-r in wget is not the same as -r in curl.
In cURL:
-r, --range <range> Retrieve only the bytes within RANGE
--raw Do HTTP "raw"; no transfer decoding
I know a wget command like the one below can do it, but it downloads the whole website:

Code: Select all

$ wget -r -p -k http://website
I just want wget to send a request to each link, as if clicking on all of them.
You're going to have to define "click".
If you mean that every link on a webpage should be requested (GET /.../ HTTP/1.0, etc.) then recursive wget is what you want.

If you have a problem with wget actually storing the downloaded page, then you have another problem to deal with, which is easy enough (you can just wipe the folder when you're done). Or use "-O /dev/null".

If your "click" can also be a HEAD request (instead of a GET request), then you can use "wget --recursive --spider", which will "click" (HEAD) every link without downloading anything.
Thank you.
Consider the https://www.amazon.com/s?k=debian&i=str ... nb_sb_noss URL; you can see a list of books on that page. I want to use cURL or wget to click on all the books on that page. Is that clear?
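A minimal sketch of that idea with plain curl, assuming the goal is just to pull every absolute href out of one page and send a GET to each of them (www.example.com stands in for the real page, and the grep pattern is a rough approximation that will miss relative links):

Code: Select all

# fetch the page, extract absolute links, then "click" (GET) each one
# while discarding the response bodies
$ curl -s https://www.example.com/ \
    | grep -oE 'href="https?://[^"]*"' \
    | sed 's/^href="//; s/"$//' \
    | while read -r link; do
        curl -s -o /dev/null "$link"
      done

Note that pages which build their content with JavaScript may not expose those links in the raw HTML, so this works best on simple, static pages.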