Tokenizer for Python source
===========================
Lexical scanner for Python source code. This module was written by
Ka-Ping Yee.
This manual section was written by Fred L. Drake, Jr. <fdrake@acm.org>.
The `tokenize' module provides a lexical scanner for Python source
code, implemented in Python. The scanner in this module returns
comments as tokens as well, making it useful for implementing
"pretty-printers," including colorizers for on-screen displays.
The scanner is exposed by a single function:
`tokenize(readline[, tokeneater])'
The `tokenize()' function accepts two parameters: one representing
the input stream, and one providing an output mechanism for
`tokenize()'.
The first parameter, READLINE, must be a callable object which
provides the same interface as the `readline()' method of built-in
file objects (see the section on File Objects). Each call
to the function should return one line of input as a string.
The second parameter, TOKENEATER, must also be a callable object.
It is called with five parameters: the token type, the token
string, a tuple `(SROW, SCOL)' specifying the row and column where
the token begins in the source, a tuple `(EROW, ECOL)' giving the
ending position of the token, and the line on which the token was
found. The line passed is the _logical_ line; continuation lines
are included.
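As a minimal sketch of this interface, the following token-eater
prints each token's position, type name, and text; the filename
`hello.py' is only a placeholder. (This assumes the `tok_name'
dictionary from the `token' module, which maps token type codes to
their names, is also reachable as `tokenize.tok_name'.)

     import tokenize

     def print_token(type, string, start, end, line):
         # Unpack the (row, column) position tuples.
         (srow, scol) = start
         (erow, ecol) = end
         # One line per token: position range, type name, token text.
         print("%d,%d-%d,%d:\t%s\t%s"
               % (srow, scol, erow, ecol,
                  tokenize.tok_name[type], repr(string)))

     file = open('hello.py')
     tokenize.tokenize(file.readline, print_token)
     file.close()

Here the `readline()' method of an open file object serves as the
READLINE parameter, and `print_token()' as the TOKENEATER.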
All constants from the `token' module are also exported from
`tokenize', as are two additional token type values that might be
passed to the TOKENEATER function by `tokenize()':
`COMMENT'
Token value used to indicate a comment.
`NL'
Token value used to indicate a non-terminating newline. The
NEWLINE token indicates the end of a logical line of Python code;
NL tokens are generated when a logical line of code is continued
over multiple physical lines.
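To see the distinction, the following sketch (using the `StringIO'
module to supply the READLINE callable) tokenizes one logical line
that spans two physical lines:

     import tokenize
     from StringIO import StringIO

     source = StringIO("x = (1 +   # comment on a continued line\n"
                       "     2)\n")

     def show_type(type, string, start, end, line):
         # Print only the name of each token type as it is seen.
         print(tokenize.tok_name[type])

     tokenize.tokenize(source.readline, show_type)

The comment yields a COMMENT token, the line break inside the
parentheses yields NL (the logical line is not yet complete), and
only the newline after the closing parenthesis yields NEWLINE.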