Info Node: (python2.1-lib.info)parser

Access Python parse trees
=========================

Access parse trees for Python source code.  This module and this
manual section were written by Fred L. Drake, Jr. <fdrake@acm.org>.

The `parser' module provides an interface to Python's internal parser
and byte-code compiler.  The primary purpose for this interface is to
allow Python code to edit the parse tree of a Python expression and
create executable code from the result.  This is better than parsing
and modifying an arbitrary Python code fragment as a string, because
the parsing is performed in a manner identical to that of the code
forming the application.  It is also faster.

There are a few things to note about this module which are important to
making use of the data structures created.  This is not a tutorial on
editing the parse trees for Python code, but some examples of using the
`parser' module are presented.

Most importantly, a good understanding of the Python grammar processed
by the internal parser is required.  For full information on the
language syntax, refer to the Python Language Reference.  The parser
itself is created from a grammar specification defined in the file
`Grammar/Grammar' in the standard Python distribution.  The parse
trees stored in the AST
objects created by this module are the actual output from the internal
parser when created by the `expr()' or `suite()' functions, described
below.  The AST objects created by `sequence2ast()' faithfully simulate
those structures.  Be aware that the values of the sequences which are
considered "correct" will vary from one version of Python to another as
the formal grammar for the language is revised.  However, transporting
code from one Python version to another as source text will always
allow correct parse trees to be created in the target version, with the
only restriction being that migrating to an older version of the
interpreter will not support more recent language constructs.  The
parse trees are not typically compatible from one version to another,
whereas source code has always been forward-compatible.

Each element of the sequences returned by `ast2list()' or `ast2tuple()'
has a simple form.  Sequences representing non-terminal elements in the
grammar always have a length greater than one.  The first element is an
integer which identifies a production in the grammar.  These integers
are given symbolic names in the C header file `Include/graminit.h' and
the Python module `symbol'.  Each additional element of the sequence
represents a component of the production as recognized in the input
string: these are always sequences which have the same form as the
parent.  An important aspect of this structure which should be noted is
that keywords used to identify the parent node type, such as the keyword
`if' in an `if_stmt', are included in the node tree without any special
treatment.  For example, the `if' keyword is represented by the tuple
`(1, 'if')', where `1' is the numeric value associated with all `NAME'
tokens, including variable and function names defined by the user.  In
an alternate form returned when line number information is requested,
the same token might be represented as `(1, 'if', 12)', where the `12'
represents the line number at which the terminal symbol was found.
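
The nesting described above can be explored with a short recursive
walk.  The following is a minimal sketch, not part of the `parser'
module: the helper name `terminals' is hypothetical, the tree is
hand-written rather than produced by `ast2tuple()', and the production
numbers 900 and 901 are made up for illustration.  It assumes only
what the text states: a terminal carries its source text as the second
element, while every element after the first in a non-terminal is
itself a sequence.

```python
def terminals(node):
    """Return the terminal tuples found in a parse-tree sequence."""
    if isinstance(node[1], str):
        # Terminal: (number, text) or (number, text, line_number).
        return [tuple(node)]
    # Non-terminal: (number, child, child, ...); visit each child.
    leaves = []
    for child in node[1:]:
        leaves.extend(terminals(child))
    return leaves


# Hand-written example tree; 900 and 901 are invented production numbers.
tree = (900,
        (1, 'if'),
        (901, (1, 'x')))

leaves = terminals(tree)
```

Here `leaves' holds the two `NAME' tuples in source order.  The same
function accepts the line-number form unchanged, since it only inspects
the second element of each sequence.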

Terminal elements are represented in much the same way, but without any
child elements and with the addition of the source text which was
identified.  The example of the `if' keyword above is representative.
The various types of terminal symbols are defined in the C header file
`Include/token.h' and the Python module `token'.
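
As an illustration of how the `token' module relates to these tuples,
the sketch below turns a terminal tuple into a readable label.  The
helper name `describe' is hypothetical; `token.tok_name',
`token.NAME' and `token.ISTERMINAL()' are taken from the standard
`token' module.

```python
import token


def describe(leaf):
    """Return a label such as "NAME 'if'" for a terminal tuple."""
    number, text = leaf[0], leaf[1]
    # tok_name maps a numeric token type to its symbolic name; fall
    # back to the bare number if the value is not a known token.
    name = token.tok_name.get(number, str(number))
    return "%s %r" % (name, text)


label = describe((1, 'if'))            # "NAME 'if'"
is_leaf = token.ISTERMINAL(token.NAME)  # true for token numbers
```

The keyword `if' and a user-defined name such as `x' both come out
labelled `NAME', which matches the remark above that keywords receive
no special treatment in the tree.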

The AST objects are not required to support the functionality of this
module, but are provided for three purposes: to allow an application to
amortize the cost of processing complex parse trees, to provide a parse
tree representation which conserves memory space when compared to the
Python list or tuple representation, and to ease the creation of
additional modules in C which manipulate parse trees.  A simple
"wrapper" class may be created in Python to hide the use of AST objects.

The `parser' module defines functions for a few distinct purposes.  The
most important purposes are to create AST objects and to convert AST
objects to other representations such as parse trees and compiled code
objects, but there are also functions which serve to query the type of
parse tree represented by an AST object.

See also:
     `symbol'
          Useful constants representing internal nodes of the parse
          tree.

     `token'
          Useful constants representing leaf nodes of the parse tree
          and functions for testing node values.

Creating AST Objects
Converting AST Objects
Queries on AST Objects
Exceptions and Error Handling
AST Objects
Examples
