7.3 Very Advanced Usage
=======================
• If you wish Wget to keep a mirror of a page (or FTP
subdirectories), use ‘--mirror’ (‘-m’), which is the shorthand for
‘-r -l inf -N’. You can put Wget in the crontab file asking it to
recheck a site each Sunday:
# crontab entry: runs at 00:00 every Sunday
0 0 * * 0 wget --mirror https://www.gnu.org/ -o /home/me/weeklog
• In addition to the above, you want the links to be converted for
local viewing. But, after having read this manual, you know that
link conversion doesn’t play well with timestamping, so you also
want Wget to back up the original HTML files before the conversion.
The Wget invocation would look like this:
wget --mirror --convert-links --backup-converted \
https://www.gnu.org/ -o /home/me/weeklog
• But you’ve also noticed that local viewing doesn’t work all that
well when HTML files are saved under extensions other than ‘.html’,
perhaps because they were served as ‘index.cgi’. So you’d like
Wget to rename all the files served with content-type ‘text/html’
or ‘application/xhtml+xml’ to ‘NAME.html’.
wget --mirror --convert-links --backup-converted \
--adjust-extension -o /home/me/weeklog \
https://www.gnu.org/
Or, with less typing:
wget -m -k -K -E https://www.gnu.org/ -o /home/me/weeklog
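If you want to run the final command from the crontab, it can be convenient to keep it in a small wrapper script so that the URL and log file live in one place. The following is only a minimal sketch under stated assumptions: the variable names ‘MIRROR_URL’, ‘MIRROR_LOG’ and ‘DRY_RUN’ and the function names are hypothetical conveniences of this example, not Wget settings. By default the script only prints the command it would run; it invokes Wget only when ‘DRY_RUN’ is set to ‘0’.

```shell
#!/bin/sh
# Sketch of a cron wrapper for the mirror command shown above.
# MIRROR_URL, MIRROR_LOG and DRY_RUN are hypothetical names used only
# by this example; Wget itself does not read them.

MIRROR_URL="${MIRROR_URL:-https://www.gnu.org/}"
MIRROR_LOG="${MIRROR_LOG:-/home/me/weeklog}"

# Print the command line that would be run, so it can be inspected first.
mirror_command() {
    # -m = --mirror, -k = --convert-links,
    # -K = --backup-converted, -E = --adjust-extension
    printf 'wget -m -k -K -E %s -o %s\n' "$MIRROR_URL" "$MIRROR_LOG"
}

# Dry run unless DRY_RUN=0: print the command instead of fetching anything.
run_mirror() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        wget -m -k -K -E "$MIRROR_URL" -o "$MIRROR_LOG"
    else
        mirror_command
    fi
}

run_mirror
```

Assuming the script were saved as an executable file at a path of your choosing (say, the hypothetical ‘/home/me/bin/weekly-mirror’), the crontab command would become ‘DRY_RUN=0 /home/me/bin/weekly-mirror’ with the same ‘0 0 * * 0’ schedule. Running the script by hand without ‘DRY_RUN=0’ lets you check the command line without touching the network.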