Manpage of WWW

SUBROUTINES

extract_description( FILE )

Extracts a description from an HTML or plain text file given by the FILE name; FILE should be an absolute path. The first $description::chars (default: 2048) characters are read. If the file ends in one of the extensions htm, html, or shtml, it is presumed to be an HTML file; if the file ends in txt, it is presumed to be a plain text file. Other extensions are not recognized and no description is returned for them.

For HTML files, if a <META NAME="description" CONTENT="..."> or a <META NAME="DC.description" CONTENT="..."> (Dublin Core) element is found, then the words specified as the value of the CONTENT attribute are returned as the description.

Otherwise, all HTML comments, text between <SCRIPT>, <STYLE>, and <TITLE> tags, and all other HTML tags are stripped. If <AREA ... ALT="..."> or <IMG ... ALT="..."> elements are found, then the words specified as the value of the ALT attributes are extracted.

Finally, for either HTML or plain text files, at most $description::words (default: 50) words are returned.
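The steps above can be sketched in a few lines. This is an illustrative Python re-implementation of the described behavior, not the module's own Perl code. The function name describe, the constants CHARS and WORDS (standing in for $description::chars and $description::words), and the choice to return None for unrecognized extensions are assumptions made for the example. It takes already-read text plus an extension rather than a file name, and it assumes the word limit also applies to META content.

```python
import re

CHARS = 2048   # stands in for $description::chars (assumed name)
WORDS = 50     # stands in for $description::words (assumed name)
HTML_EXTS = ("htm", "html", "shtml")

def describe(text, ext, chars=CHARS, words=WORDS):
    """Return a description for file content, or None if ext is unrecognized."""
    ext = ext.lower()
    if ext not in HTML_EXTS and ext != "txt":
        return None                      # assumption: "no description" means None
    text = text[:chars]                  # only the first `chars` characters count
    if ext in HTML_EXTS:
        meta = re.search(
            r'<meta\s+name\s*=\s*"(?:DC\.)?description"\s+content\s*=\s*"([^"]*)"',
            text, re.I)
        if meta:
            text = meta.group(1)         # META description wins outright
        else:
            # Strip comments, then SCRIPT/STYLE/TITLE elements with their text.
            text = re.sub(r'<!--.*?-->', ' ', text, flags=re.S)
            text = re.sub(r'<(script|style|title)\b[^>]*>.*?</\1\s*>', ' ',
                          text, flags=re.I | re.S)
            # Keep the ALT text of AREA and IMG elements.
            text = re.sub(r'<(?:area|img)\b[^>]*?\balt\s*=\s*"([^"]*)"[^>]*>',
                          r' \1 ', text, flags=re.I)
            # Drop every remaining tag.
            text = re.sub(r'<[^>]*>', ' ', text)
    # Assumption: the word limit is applied to META content as well.
    return " ".join(text.split()[:words])
```

For instance, describe('<title>T</title><p>Hello <img src="a.png" alt="big cat"> world</p>', "html") yields the page text with the title removed and the ALT words kept in place.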

extract_meta( FILE, NAME )

Extracts the value of the CONTENT attribute from a META element having the given NAME attribute from an HTML file given by the FILE name; FILE should be an absolute path. The file must end in one of the extensions htm, html, or shtml to be considered an HTML file. The first $description::chars (default: 2048) characters are read. The characters are cached between consecutive calls using the same filename.

hyperlink( LIST )

Adds hyperlinks to strings: substrings that are valid URLs (according to RFC 1630) have the appropriate HTML tags "wrapped" around them so that they are selectable when displayed in a browser. The ftp, gopher, http, https, mailto, news, telnet, and wais URL schemes are recognized. Example:



     Read all about it at
     http://www.usatoday.com/

becomes:

     Read all about it at
     <A HREF="http://www.usatoday.com/">http://www.usatoday.com/</A>
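The transformation above can be approximated with a single regular-expression substitution. The Python sketch below is an assumption-laden illustration, not the module's code: add_hyperlinks is a hypothetical name, the URL pattern is far looser than the RFC 1630 grammar, trailing punctuation is trimmed off the link as a convenience, and the function returns a new string where the documented subroutine takes a LIST.

```python
import re

# The eight schemes named in the text; "https" is listed before "http"
# so the longer scheme is tried first.
SCHEMES = "ftp|gopher|https|http|mailto|news|telnet|wais"

# Loose approximation of a URL: scheme, colon, then anything up to
# whitespace, an angle bracket, or a double quote.
URL_RE = re.compile(r'\b(?:%s):[^\s<>"]+' % SCHEMES)

def add_hyperlinks(text):
    """Wrap each recognized URL in text with an <A HREF="..."> element."""
    def wrap(match):
        url = match.group(0)
        trail = ""
        # Assumption: sentence punctuation after a URL is not part of it.
        while url and url[-1] in ".,;!?)":
            trail = url[-1] + trail
            url = url[:-1]
        if url.endswith(":"):
            return match.group(0)        # nothing left after the scheme
        return '<A HREF="%s">%s</A>%s' % (url, url, trail)
    return URL_RE.sub(wrap, text)
```

Applied to the example text above, the sketch leaves the first line untouched and wraps only the URL on the second line.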
