Help with multiple CURL commands

eKo Member

Hello,
Can someone help me out with a command like this:

curl --compressed -m 5 --retry 2 --retry-delay 2 --silent -H 'Accept-Encoding: ' --connect-timeout 5 -w 'www.google.com\t:\t%{time_total}\n' -o /dev/null https://www.google.com > /home/url-list.txt

I would like to make a bash script that runs this for multiple URLs read from a txt file (like google.com, google.co.uk, google.co.in, etc.) and saves the results to a txt file.
I'm asking if someone is good with bash and willing to help (I need to run the script from crontab).

p.s. I know 0 about bash...

Thanks!

Comments

  • risharde Patron Provider, Veteran

    I'm on mobile so I can't test said command... are you trying to scrape URLs from Google? If so, Google usually detects these searches as automated after you do a few page loads. I have never tried to bypass the protection; you might need to factor in proxies to achieve the desired result. This was when I tested it a few years back, so I'm not sure if Google has changed since then.

  • eKo Member

    Hello,
    No, actually I'm trying to get the page load time of the URLs listed in a txt file (in this case I used Google as example domains). I need to monitor my own domains' page loads and save them every hour into another txt file (like results-date-hour.txt?).

    a sample urls.txt can be:
    mydomain.com
    mydomain.net
    mydomain.org

    and a sample results.txt can be:

    #

    mydomain.com : 0.140
    mydomain.net : 0.333
    mydomain.org : 1.100

    Tested on Date Time

    #

    Thanks!

  • risharde Patron Provider, Veteran

    Gotcha. As soon as I get to a terminal I will work on it for you, as long as someone else doesn't beat me to it.

  • IonSwitch_Stan Member, Host Rep

    [root@test-vps1 ~]# cat domains
    www.google.com
    www.reddit.com
    www.ionswitch.com

    [root@test-vps1 ~]# cat domains | xargs -i curl --compressed -m 5 --retry 2 --retry-delay 2 --silent -H 'Accept-Encoding: ' --connect-timeout 5 -w '{}\t:\t%{time_total}\n' -o /dev/null https://{} >> log

    [root@test-vps1 ~]# cat log
    www.google.com : 0.491
    www.reddit.com : 0.183
    www.ionswitch.com : 0.180
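
    For anyone unfamiliar with the flags: xargs -i (the older spelling of GNU xargs -I {}) takes one line of input at a time and substitutes it for every {} in the curl command, so each hostname ends up both in the -w label and in the URL. The same one-liner written with the explicit form (a sketch, assuming GNU xargs):

    cat domains | xargs -I {} curl --compressed -m 5 --retry 2 --retry-delay 2 --silent -H 'Accept-Encoding: ' --connect-timeout 5 -w '{}\t:\t%{time_total}\n' -o /dev/null https://{} >> log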

  • eKo Member

    Hello Stan,
    I see you have done it pretty quickly, but can you please make it import the domains from a domains.txt instead of cat domains? And I presume that if I change >> log to >> results.txt it's the same?

    Can a timestamp be added at the bottom of the results, like Date-Time?
    I really appreciate your help, guys!

  • ricardo Member

    In his example, "domains" is the name of the input text file.
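
    A minimal sketch of a script along those lines, assuming GNU xargs and date, bare hostnames (no https://) in /home/domains.txt, and that all paths and filenames here are just examples:

    #!/bin/bash
    # Write each run into its own timestamped file, e.g. /home/results-2017-05-20-14.txt
    OUT="/home/results-$(date +%F-%H).txt"
    # Time each domain listed in /home/domains.txt; -o /dev/null discards the page itself
    cat /home/domains.txt | xargs -i curl --compressed -m 5 --retry 2 --retry-delay 2 --silent -H 'Accept-Encoding: ' --connect-timeout 5 -w '{}\t:\t%{time_total}\n' -o /dev/null https://{} > "$OUT"
    # Footer with the date and time of the test
    echo "Tested on $(date '+%Y-%m-%d %H:%M')" >> "$OUT"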

  • eKo Member

    OK,
    In domains.txt I use:

    https://www.google.co.in  
    https://www.google.co.uk

    the command is:

    rm -rf /home/results.txt && cat /home/domains.txt | xargs -i curl --compressed -m 5 --retry 2 --retry-delay 2 --silent -H 'Accept-Encoding: ' --connect-timeout 5 -w '{}\t:\t%{time_total}\n' -o /dev/null {} >> /home/results.txt

    the results.txt is:

    https://www.google.co.in
        :   0.000
    https://www.google.co.uk    :   0.048

    o.O How is this possible?
    I need to first delete the old results.txt and then create the new one, that's why the rm -rf ...

  • edited May 2017

    -o /dev/null sends the downloaded page itself to /dev/null, i.e. discards it.

    -w '{}\t:\t%{time_total}\n' specifies what is written to stdout.

    >> /home/results.txt appends stdout to the results.txt file, which is why you're only seeing how long it took to download each page.

    As some general advice, read the man page before running any command posted on the Internet. One, you can check that the command isn't malicious, and two, you'll understand what is going on.

    man curl or https://curl.haxx.se/docs/manpage.html.

    Also, rm /home/results.txt will suffice since it's your file.
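
    For the crontab part mentioned earlier, an entry like this (added with crontab -e) would run such a script at the top of every hour; the script path /home/check-pageload.sh is just a placeholder:

    # m h dom mon dow  command
    0 * * * * /bin/bash /home/check-pageload.sh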
