Recursive Accept/Reject Options
===============================

`-A ACCLIST --accept ACCLIST'
`-R REJLIST --reject REJLIST'
     Specify comma-separated lists of file name suffixes or patterns to
     accept or reject (see Types of Files for more details).
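
     For instance, to accept only JPEG and PNG files during a recursive
     download (`SITE' is a placeholder for a real host):

          wget -r -A jpg,png http://SITE/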

`-D DOMAIN-LIST'
`--domains=DOMAIN-LIST'
     Set domains to be followed.  DOMAIN-LIST is a comma-separated list
     of domains.  Note that it does _not_ turn on `-H'.
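
     Since `-D' does not itself enable host spanning, it is typically
     combined with `-H'.  For instance (the domain is a placeholder):

          wget -r -H -Dserver.com http://www.server.com/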

`--exclude-domains DOMAIN-LIST'
     Specify the domains that are _not_ to be followed (see Spanning
     Hosts).
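
     For instance, to span hosts within one domain while skipping a
     single subdomain (all names are placeholders):

          wget -rH -Dfoo.edu --exclude-domains sunsite.foo.edu \
              http://www.foo.edu/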

`--follow-ftp'
     Follow FTP links from HTML documents.  Without this option, Wget
     will ignore all FTP links.
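
     For instance (`SITE' is a placeholder):

          wget -r --follow-ftp http://SITE/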

`--follow-tags=LIST'
     Wget has an internal table of HTML tag / attribute pairs that it
     considers when looking for linked documents during a recursive
     retrieval.  If you want only a subset of those tags to be
     considered, specify them in a comma-separated LIST with this
     option.
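
     For instance, to follow only `<A>' and `<AREA>' links (`SITE' and
     DOCUMENT are placeholders):

          wget -r --follow-tags=a,area http://SITE/DOCUMENT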

`-G LIST'
`--ignore-tags=LIST'
     This is the opposite of the `--follow-tags' option.  To skip
     certain HTML tags when recursively looking for documents to
     download, specify them in a comma-separated LIST.

     In the past, the `-G' option was the best bet for downloading a
     single page and its requisites, using a command line like:

          wget -Ga,area -H -k -K -r http://SITE/DOCUMENT

     However, the author of this option came across a page with tags
     like `<LINK REL="home" HREF="/">' and realized that `-G' was not
     enough.  One can't just tell Wget to ignore `<LINK>', because then
     stylesheets will not be downloaded.  Now the best bet for
     downloading a single page and its requisites is the dedicated
     `--page-requisites' option.
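
     For instance, to fetch a single page together with everything
     needed to display it, converting links for local viewing:

          wget -p -k http://SITE/DOCUMENT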

`-H'
`--span-hosts'
     Enable spanning across hosts when doing recursive retrieval (see
     Spanning Hosts).
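
     Unrestricted spanning can download far more than intended, so one
     might bound it, e.g. by recursion depth with `-l' (`SITE' is a
     placeholder):

          wget -r -l1 -H http://SITE/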

`-L'
`--relative'
     Follow relative links only.  Useful for retrieving a specific home
     page without any distractions, not even those from the same host
     (see Relative Links).
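
     For instance (`SITE' is a placeholder):

          wget -r -L http://SITE/index.html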

`-I LIST'
`--include-directories=LIST'
     Specify a comma-separated list of directories you wish to follow
     when downloading (see Directory-Based Limits for more details).
     Elements of LIST may contain wildcards.
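
     For instance, to follow only two directory trees (host and paths
     are placeholders):

          wget -r -I /people,/cgi-bin http://SITE/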

`-X LIST'
`--exclude-directories=LIST'
     Specify a comma-separated list of directories you wish to exclude
     from download (see Directory-Based Limits for more details).
     Elements of LIST may contain wildcards.
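
     For instance, to skip two directory trees (host and paths are
     placeholders):

          wget -r -X /cgi-bin,/private http://SITE/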

`-np'
`--no-parent'
     Do not ever ascend to the parent directory when retrieving
     recursively.  This is a useful option, since it guarantees that
     only the files _below_ a certain hierarchy will be downloaded.
     See Directory-Based Limits for more details.
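
     For instance, to download a manual without ascending above its
     directory (host and path are placeholders):

          wget -r -np http://SITE/docs/manual/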

