Recursive Accept/Reject Options
===============================

`-A ACCLIST --accept ACCLIST'
`-R REJLIST --reject REJLIST'
     Specify comma-separated lists of file name suffixes or patterns to
     accept or reject (see Types of Files for more details).
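
     For instance, to accept only JPEG and PNG files during a recursive
     download (`SITE' is a placeholder for a real host):

          wget -r -A jpg,png http://SITE/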

`-D DOMAIN-LIST'
`--domains=DOMAIN-LIST'
     Set domains to be followed.  DOMAIN-LIST is a comma-separated list
     of domains.  Note that it does _not_ turn on `-H'.
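
     Since `-D' does not itself enable host spanning, it is typically
     combined with `-H'.  For instance (the domain is a placeholder):

          wget -r -H -Dserver.com http://www.server.com/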

`--exclude-domains DOMAIN-LIST'
     Specify the domains that are _not_ to be followed (see Spanning
     Hosts).
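
     For instance, to span hosts within one domain while skipping a
     single subdomain (all names are placeholders):

          wget -rH -Dfoo.edu --exclude-domains sunsite.foo.edu \
              http://www.foo.edu/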

`--follow-ftp'
     Follow FTP links from HTML documents.  Without this option, Wget
     will ignore all FTP links.
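
     For instance (`SITE' is a placeholder):

          wget -r --follow-ftp http://SITE/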

`--follow-tags=LIST'
     Wget has an internal table of HTML tag / attribute pairs that it
     considers when looking for linked documents during a recursive
     retrieval.  If you want only a subset of those tags to be
     considered, specify them in a comma-separated LIST with this
     option.
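
     For instance, to follow only `<A>' and `<AREA>' links (`SITE' and
     DOCUMENT are placeholders):

          wget -r --follow-tags=a,area http://SITE/DOCUMENT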

`-G LIST'
`--ignore-tags=LIST'
     This is the opposite of the `--follow-tags' option.  To skip
     certain HTML tags when recursively looking for documents to
     download, specify them in a comma-separated LIST.

     In the past, the `-G' option was the best bet for downloading a
     single page and its requisites, using a command line like:

          wget -Ga,area -H -k -K -r http://SITE/DOCUMENT

     However, the author of this option came across a page with tags
     like `<LINK REL="home" HREF="/">' and realized that `-G' was not
     enough.  One can't just tell Wget to ignore `<LINK>', because then
     stylesheets will not be downloaded.  Now the best bet for
     downloading a single page and its requisites is the dedicated
     `--page-requisites' option.
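
     For instance, to fetch a single page together with everything
     needed to display it, converting links for local viewing:

          wget -p -k http://SITE/DOCUMENT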

`-H'
`--span-hosts'
     Enable spanning across hosts when doing recursive retrieval (see
     Spanning Hosts).
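
     Unrestricted spanning can download far more than intended, so one
     might bound it, e.g. by recursion depth with `-l' (`SITE' is a
     placeholder):

          wget -r -l1 -H http://SITE/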

`-L'
`--relative'
     Follow relative links only.  Useful for retrieving a specific home
     page without any distractions, not even those from the same host
     (see Relative Links).
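
     For instance (`SITE' is a placeholder):

          wget -r -L http://SITE/index.html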

`-I LIST'
`--include-directories=LIST'
     Specify a comma-separated list of directories you wish to follow
     when downloading (see Directory-Based Limits for more details).
     Elements of LIST may contain wildcards.
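
     For instance, to follow only two directory trees (host and paths
     are placeholders):

          wget -r -I /people,/cgi-bin http://SITE/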

`-X LIST'
`--exclude-directories=LIST'
     Specify a comma-separated list of directories you wish to exclude
     from download (see Directory-Based Limits for more details).
     Elements of LIST may contain wildcards.
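
     For instance, to skip two directory trees (host and paths are
     placeholders):

          wget -r -X /cgi-bin,/private http://SITE/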

`-np'
`--no-parent'
     Do not ever ascend to the parent directory when retrieving
     recursively.  This is a useful option, since it guarantees that
     only the files _below_ a certain hierarchy will be downloaded.
     See Directory-Based Limits for more details.
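
     For instance, to download a manual without ascending above its
     directory (host and path are placeholders):

          wget -r -np http://SITE/docs/manual/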

