Tokenizer for Python source
===========================
Lexical scanner for Python source code. This module was written by
Ka-Ping Yee.
This manual section was written by Fred L. Drake, Jr. <fdrake@acm.org>.
The `tokenize' module provides a lexical scanner for Python source
code, implemented in Python. The scanner in this module returns
comments as tokens as well, making it useful for implementing
"pretty-printers," including colorizers for on-screen displays.
The scanner is exposed by a single function:
`tokenize(readline[, tokeneater])'
The `tokenize()' function accepts two parameters: one representing
the input stream, and one providing an output mechanism for
`tokenize()'.
The first parameter, READLINE, must be a callable object which
provides the same interface as the `readline()' method of built-in
file objects (see the section on File Objects). Each call
to the function should return one line of input as a string.
The second parameter, TOKENEATER, must also be a callable object.
It is called with five parameters: the token type, the token
string, a tuple `(SROW, SCOL)' specifying the row and column where
the token begins in the source, a tuple `(EROW, ECOL)' giving the
ending position of the token, and the line on which the token was
found. The line passed is the _logical_ line; continuation lines
are included.
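As a minimal sketch of this interface, the following token-eater
prints each token's position, type name, and text; the filename
`hello.py' is only a placeholder. (This assumes the `tok_name'
dictionary from the `token' module, which maps token type codes to
their names, is also reachable as `tokenize.tok_name'.)

     import tokenize

     def print_token(type, string, start, end, line):
         # Unpack the (row, column) position tuples.
         (srow, scol) = start
         (erow, ecol) = end
         # One line per token: position range, type name, token text.
         print("%d,%d-%d,%d:\t%s\t%s"
               % (srow, scol, erow, ecol,
                  tokenize.tok_name[type], repr(string)))

     file = open('hello.py')
     tokenize.tokenize(file.readline, print_token)
     file.close()

Here the `readline()' method of an open file object serves as the
READLINE parameter, and `print_token()' as the TOKENEATER.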
All constants from the `token' module are also exported from
`tokenize', as are two additional token type values that might be
passed to the TOKENEATER function by `tokenize()':
`COMMENT'
Token value used to indicate a comment.
`NL'
Token value used to indicate a non-terminating newline. The
NEWLINE token indicates the end of a logical line of Python code;
NL tokens are generated when a logical line of code is continued
over multiple physical lines.
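To see the distinction, the following sketch (using the `StringIO'
module to supply the READLINE callable) tokenizes one logical line
that spans two physical lines:

     import tokenize
     from StringIO import StringIO

     source = StringIO("x = (1 +   # comment on a continued line\n"
                       "     2)\n")

     def show_type(type, string, start, end, line):
         # Print only the name of each token type as it is seen.
         print(tokenize.tok_name[type])

     tokenize.tokenize(source.readline, show_type)

The comment yields a COMMENT token, the line break inside the
parentheses yields NL (the logical line is not yet complete), and
only the newline after the closing parenthesis yields NEWLINE.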