Monday, 7 April 2014

Wget How-To

Well, most Windows users know and use download managers, and there are some of these apps for Linux as well. But since I am at the terminal most of the time, I use a small utility known as wget.

It comes pre-installed on most Linux systems. Yeah, it's a command line download manager.

NOTE: For the website (websitenamehere), I type the full address, including the "http://", and it works.


Mirror an entire website

To do that, type:
wget -m websitenamehere
 
and that will basically mirror the website, that is, make a copy of it on your system in a directory with the same name as the site.
You could also do:
wget -H -r --level=1 -k -p websitenamehere

What this baby does is:
  • -r tells wget to download pages recursively, following the links it finds
  • --level=1 limits the recursion to just one level. Simply, let's say your website links to website A and website A links to website B. Then, this command says download website A but not B
  • -H is for spanning hosts, so the recursion may follow links that lead to other websites
  • -k converts the links in the downloaded pages so that they point to the copies on the hard drive and not to the real thing
  • -p tells wget to download all materials needed to properly display the page (like images, stylesheets, you get the idea)
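
If the short flags are hard to remember, here is a rough sketch of the same command written with wget's long option names. The helper name mirror_one_level is made up for this post, and http://example.com/ in the usage comment is only a placeholder.

```shell
# Same command as above, spelled with long option names.
# mirror_one_level is a hypothetical helper name, not part of wget.
mirror_one_level() {
    wget --span-hosts --recursive --level=1 \
         --convert-links --page-requisites "$1"
}

# Usage (placeholder URL):
# mirror_one_level http://example.com/
```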



Resume downloads

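Well, if you are on a flaky connection and the download keeps breaking off, then
wget -c fullfilenamehere
Here fullfilenamehere means the entire address of that file. Run the same command again in the same directory and wget continues from where the partial file stops.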
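
wget already retries on its own, but on a really bad line it can still give up. Below is a small sketch that keeps calling wget -c until it succeeds. The helper name resume_until_done, the default of 5 attempts and the RETRY_DELAY variable are my own assumptions, not wget features.

```shell
# resume_until_done is a hypothetical helper name.
# $1 = full address of the file, $2 = maximum attempts (assumed default: 5)
resume_until_done() {
    url=$1
    max_attempts=${2:-5}
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # -c continues the partial file instead of starting over
        wget -c "$url" && return 0
        attempt=$((attempt + 1))
        # pause before trying again; RETRY_DELAY is an assumed variable
        sleep "${RETRY_DELAY:-5}"
    done
    return 1
}

# Usage (placeholder URL):
# resume_until_done http://example.com/big-file.iso 10
```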




Specify output document

At times the link doesn't precisely tell the document name, or the name could be lengthy. Then you could use
wget --output-document=filename fullfilenamehere
NOTE: filename has to be the full filename with extension, and fullfilenamehere is again the entire address of the file




Download music automatically

First create a file with the list of your mp3 music sources (URLs), one per line, then
wget -r -l1 -H -t1 -nd -N -np -A.mp3 -erobots=off -i filename.txt
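
That is a lot of flags in one go, so here is a sketch of the same command with long option names and a comment for each one. The helper name fetch_mp3s is made up; the only argument is your list file.

```shell
# fetch_mp3s is a hypothetical helper name; $1 is the file with the URLs.
fetch_mp3s() {
    # --recursive --level=1 : follow links, but only one level deep
    # --span-hosts          : allow links that lead to other hosts
    # --tries=1             : try each URL once, no retries
    # --no-directories      : save everything into the current directory
    # --timestamping        : skip files that are not newer than the local copy
    # --no-parent           : never climb above the starting directory
    # --accept=.mp3         : keep only files whose names end in .mp3
    # -e robots=off         : ignore robots.txt
    # --input-file          : read the starting URLs from the file
    wget --recursive --level=1 --span-hosts --tries=1 \
         --no-directories --timestamping --no-parent \
         --accept=.mp3 -e robots=off --input-file="$1"
}

# Usage:
# fetch_mp3s filename.txt
```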


Bulk downloads

Make a file with the list of URLs of the stuff that you want to download, then
wget -i filename.txt
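
If you would rather not open an editor for the list, a tiny sketch like this can write it from the terminal. make_url_list is a hypothetical helper name and the example.com addresses are placeholders.

```shell
# make_url_list is a hypothetical helper name.
# $1 = list file to write, remaining arguments = URLs, written one per line
make_url_list() {
    list=$1
    shift
    printf '%s\n' "$@" > "$list"
}

# Usage (placeholder URLs):
# make_url_list filename.txt http://example.com/a.zip http://example.com/b.zip
# wget -i filename.txt
```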


Generate a log for checking broken links

For this, simply type
wget --spider -o wget.log -e robots=off --wait 1 -r -p websitenamehere
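
The log gets long, so here is a rough sketch for pulling the dead links out of it. It assumes that each request in wget.log starts with a line like "--date time--  URL" and that a missing page shows up as "404 Not Found" in the response line; check your own log first, since the format can differ between wget versions. list_404s is a made-up helper name.

```shell
# list_404s is a hypothetical helper name; $1 is the log file.
# Assumption: request lines start with "--" and end with the URL,
# and a dead link is reported with "404 Not Found".
list_404s() {
    awk '/^--/ { url = $NF } / 404 Not Found/ { print url }' "$1"
}

# Usage:
# list_404s wget.log
```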

There are many more uses for this versatile tool. If I have not mentioned your trick or there is some mistake in mine, then please comment.

As always, thanks for reading.

