Tokenizer for Python source
===========================

Lexical scanner for Python source code.  This module was written by
Ka Ping Yee <>.  This manual section was written by Fred L. Drake,
Jr. <fdrake@acm.org>.

The `tokenize' module provides a lexical scanner for Python source
code, implemented in Python.  The scanner in this module returns
comments as tokens as well, making it useful for implementing
"pretty-printers," including colorizers for on-screen displays.

The scanner is exposed by a single function:

`tokenize(readline[, tokeneater])'
     The `tokenize()' function accepts two parameters: one
     representing the input stream, and one providing an output
     mechanism for `tokenize()'.

     The first parameter, READLINE, must be a callable object which
     provides the same interface as the `readline()' method of
     built-in file objects (see the section on File Objects).  Each
     call to the function should return one line of input as a
     string.

     The second parameter, TOKENEATER, must also be a callable
     object.  It is called with five parameters: the token type, the
     token string, a tuple `(SROW, SCOL)' specifying the row and
     column where the token begins in the source, a tuple `(EROW,
     ECOL)' giving the ending position of the token, and the line on
     which the token was found.  The line passed is the _logical_
     line; continuation lines are included.

All constants from the `token' module are also exported from
`tokenize', as are two additional token type values that might be
passed to the TOKENEATER function by `tokenize()':

`COMMENT'
     Token value used to indicate a comment.

`NL'
     Token value used to indicate a non-terminating newline.  The
     NEWLINE token indicates the end of a logical line of Python
     code; NL tokens are generated when a logical line of code is
     continued over multiple physical lines.
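
The following is a minimal sketch of a TOKENEATER callback passed to
`tokenize()'.  The input file name `example.py' and the output format
are illustrative assumptions, not part of the module; only the
callback signature and the use of `tokenize.tok_name' follow from the
description above.

     import tokenize

     def print_token(type, string, (srow, scol), (erow, ecol), line):
         # tok_name maps the numeric token type to a readable name;
         # tokenize extends it with the COMMENT and NL values
         # described above.
         name = tokenize.tok_name[type]
         print "%d,%d-%d,%d:\t%s\t%r" % (srow, scol, erow, ecol,
                                         name, string)

     f = open("example.py")        # hypothetical source file to scan
     tokenize.tokenize(f.readline, print_token)
     f.close()

Because the scanner reports comments and non-terminating newlines,
the output of such a callback includes COMMENT and NL entries in
addition to the token types defined in the `token' module.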