8.1 Proxies
===========
“Proxies” are special-purpose HTTP servers designed to transfer data
from remote servers to local clients. One typical use of proxies is
lightening network load for users behind a slow connection. This is
achieved by channeling all HTTP and FTP requests through the proxy, which
caches the transferred data. When a cached resource is requested again,
the proxy returns the data from its cache. Another use for proxies is for
companies that separate (for security reasons) their internal networks
from the rest of the Internet. In order to obtain information from the Web,
their users connect and retrieve remote data using an authorized proxy.
Wget supports proxies for both HTTP and FTP retrievals. The standard
way to specify the proxy location, which Wget recognizes, is through the
following environment variables:
‘http_proxy’
‘https_proxy’
If set, the ‘http_proxy’ and ‘https_proxy’ variables should contain
the URLs of the proxies for HTTP and HTTPS connections
respectively.
‘ftp_proxy’
This variable should contain the URL of the proxy for FTP
connections. It is quite common that ‘http_proxy’ and ‘ftp_proxy’
are set to the same URL.
‘no_proxy’
This variable should contain a comma-separated list of domain
extensions for which the proxy should _not_ be used. For instance, if
the value of ‘no_proxy’ is ‘.mit.edu’, the proxy will not be used to
retrieve documents from MIT.
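
As a minimal sketch of the variables described above, the following
shell fragment sets all four before Wget is started. The proxy host
and port are borrowed from the example used later in this section, and
the second ‘no_proxy’ entry is a hypothetical addition; substitute the
values for your own site.

```shell
# Hypothetical proxy location; replace with your site's proxy URL.
proxy_url="http://proxy.company.com:8001/"

# Wget reads these variables from its environment.
export http_proxy="$proxy_url"
export https_proxy="$proxy_url"
export ftp_proxy="$proxy_url"   # commonly the same URL as http_proxy

# Comma-separated domain extensions that should bypass the proxy.
# '.company.com' is an assumed internal domain, for illustration only.
export no_proxy=".mit.edu,.company.com"

# A retrieval started from this shell would then go through the proxy:
#   wget http://www.gnu.org/
echo "http_proxy=$http_proxy no_proxy=$no_proxy"
```

The ‘wget’ invocation is left as a comment so that the fragment can be
run without network access.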
In addition to the environment variables, proxy location and settings
may be specified from within Wget itself.
‘--no-proxy’
‘proxy = on/off’
This option and the corresponding startup file command may be used
to suppress the use of a proxy, even if the appropriate environment
variables are set.
‘http_proxy = URL’
‘https_proxy = URL’
‘ftp_proxy = URL’
‘no_proxy = STRING’
These startup file variables allow you to override the proxy
settings specified by the environment.
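
The startup file settings above might be tried out as follows. This is
only a sketch: it writes a throwaway startup file with illustrative
values and points Wget at it through the ‘WGETRC’ environment variable,
which is assumed here to be honored by your Wget build.

```shell
# Create a temporary startup file; all values are illustrative.
rc_file="$(mktemp)"
cat > "$rc_file" <<'EOF'
# Override the proxy settings taken from the environment.
http_proxy = http://proxy.company.com:8001/
https_proxy = http://proxy.company.com:8001/
ftp_proxy = http://proxy.company.com:8001/
no_proxy = .mit.edu
# Uncomment to suppress proxy use entirely:
# proxy = off
EOF

# Ask Wget to load this file as the user startup file.
export WGETRC="$rc_file"

# To suppress the proxy for a single run instead, one would use:
#   wget --no-proxy http://www.gnu.org/
echo "startup file written to $WGETRC"
```

Keeping the settings in a separate file makes it easy to switch
between proxy configurations without editing ‘.wgetrc’ itself.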
Some proxy servers require authorization before you can use them.
The authorization consists of a “username” and a “password”, which must
be sent by Wget. As with HTTP authorization, several authentication
schemes exist. For proxy authorization, only the ‘Basic’ authentication
scheme is currently implemented.
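
To illustrate what the ‘Basic’ scheme amounts to, the sketch below
builds the kind of ‘Proxy-Authorization’ header a client sends to a
proxy: the string ‘username:password’ encoded in base64. The
credentials are hypothetical, and the fragment assumes a ‘base64’
utility is installed; it shows the general scheme, not Wget's internal
code.

```shell
# Hypothetical credentials, for illustration only.
proxy_user="alice"
proxy_password="secret"

# Basic scheme: base64-encode "username:password".
# This is an encoding, not encryption; the password is easily recovered.
token="$(printf '%s' "${proxy_user}:${proxy_password}" | base64)"

# Shape of the request header sent to the proxy.
echo "Proxy-Authorization: Basic ${token}"
```

Because the credentials are only encoded, anyone who can observe the
connection to the proxy can read them.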
You may specify your username and password either through the proxy
URL or through the command-line options. Assuming that a company’s
proxy is located at ‘proxy.company.com’, port 8001, a proxy URL
containing authorization data might look like this:
http://hniksic:mypassword@proxy.company.com:8001/
Alternatively, you may use the ‘--proxy-user’ and ‘--proxy-password’
options, or the equivalent ‘.wgetrc’ settings ‘proxy_user’ and
‘proxy_password’, to set the proxy username and password.
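
Both ways of supplying credentials can be sketched in shell as follows,
reusing the example username, password, host and port from above. The
variable names are hypothetical helpers introduced only for this
illustration.

```shell
# Example credentials and proxy location from the text above.
proxy_user="hniksic"
proxy_password="mypassword"
proxy_host="proxy.company.com"
proxy_port=8001

# Option 1: embed the credentials in the proxy URL.
export http_proxy="http://${proxy_user}:${proxy_password}@${proxy_host}:${proxy_port}/"
echo "$http_proxy"

# Option 2: keep the URL free of credentials and pass them as options:
#   http_proxy="http://proxy.company.com:8001/" \
#     wget --proxy-user="$proxy_user" --proxy-password="$proxy_password" \
#          http://www.gnu.org/
```

The second form keeps the password out of the proxy URL, although a
password given on the command line may still be visible to other users
of the system.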