Generating C++ scanners
***********************

   `flex' provides two different ways to generate scanners for use with
C++.  The first way is to simply compile a scanner generated by `flex'
using a C++ compiler instead of a C compiler.  You should not encounter
any compilation errors (please report any you find to the email address
given in the Author section below).  You can then use C++ code in your
rule actions instead of C code.  Note that the default input source for
your scanner remains `yyin', and default echoing is still done to
`yyout'.  Both of these remain `FILE *' variables and not C++ `streams'.

   You can also use `flex' to generate a C++ scanner class, using the
`-+' option (or, equivalently, `%option c++'), which is automatically
specified if the name of the flex executable ends in a `+', such as
`flex++'.  When using this option, flex defaults to generating the
scanner to the file `lex.yy.cc' instead of `lex.yy.c'.  The generated
scanner includes the header file `FlexLexer.h', which defines the
interface to two C++ classes.

   The first class, `FlexLexer', provides an abstract base class
defining the general scanner class interface.  It provides the
following member functions:

`const char* YYText()'
     returns the text of the most recently matched token, the
     equivalent of `yytext'.

`int YYLeng()'
     returns the length of the most recently matched token, the
     equivalent of `yyleng'.

`int lineno() const'
     returns the current input line number (see `%option yylineno'), or
     1 if `%option yylineno' was not used.

`void set_debug( int flag )'
     sets the debugging flag for the scanner, equivalent to assigning to
     `yy_flex_debug' (see the Options section above).  Note that you
     must build the scanner using `%option debug' to include debugging
     information in it.

`int debug() const'
     returns the current setting of the debugging flag.

   Also provided are member functions equivalent to
`yy_switch_to_buffer()', `yy_create_buffer()' (though the first argument
is an `istream*' object pointer and not a `FILE*'), `yy_flush_buffer()',
`yy_delete_buffer()', and `yyrestart()' (again, the first argument is an
`istream*' object pointer).

   The second class defined in `FlexLexer.h' is `yyFlexLexer', which is
derived from `FlexLexer'.  It defines the following additional member
functions:

`yyFlexLexer( istream* arg_yyin = 0, ostream* arg_yyout = 0 )'
     constructs a `yyFlexLexer' object using the given streams for
     input and output.  If not specified, the streams default to `cin'
     and `cout', respectively.

`virtual int yylex()'
     performs the same role as `yylex()' does for ordinary flex
     scanners: it scans the input stream, consuming tokens, until a
     rule's action returns a value.  If you derive a subclass S from
     `yyFlexLexer' and want to access the member functions and
     variables of S inside `yylex()', then you need to use `%option
     yyclass="S"' to inform `flex' that you will be using that subclass
     instead of `yyFlexLexer'.  In this case, rather than generating
     `yyFlexLexer::yylex()', `flex' generates `S::yylex()' (and also
     generates a dummy `yyFlexLexer::yylex()' that calls
     `yyFlexLexer::LexerError()' if called).

`virtual void switch_streams(istream* new_in = 0, ostream* new_out = 0)'
     reassigns `yyin' to `new_in' (if non-nil) and `yyout' to `new_out'
     (ditto), deleting the previous input buffer if `yyin' is
     reassigned.

`int yylex( istream* new_in = 0, ostream* new_out = 0 )'
     first switches the input and output streams via `switch_streams(
     new_in, new_out )' and then returns the value of `yylex()'.
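
   The `yyclass' arrangement described under `yylex()' above can be
sketched without running `flex'.  In this minimal sketch,
`StandInFlexLexer' is a hypothetical stand-in for `yyFlexLexer' (it is
not part of `FlexLexer.h'), and the body of `S::yylex()' is written by
hand where `flex' would normally generate it from your rules.  The
stand-in `LexerError()' throws instead of exiting, purely so that the
behavior can be checked.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for yyFlexLexer, reduced to the two members this
// sketch needs. The real class is declared in FlexLexer.h.
class StandInFlexLexer {
public:
    virtual ~StandInFlexLexer() {}

    // Plays the role of the dummy yyFlexLexer::yylex() described above:
    // reaching it means the subclass scanner was bypassed.
    virtual int yylex() {
        LexerError("base-class yylex() called");
        return 0;
    }

protected:
    // Assumption for the sketch: throw rather than print and exit.
    virtual void LexerError(const char* msg) {
        throw std::runtime_error(msg);
    }
};

// Subclass S. With %option yyclass="S", flex would emit S::yylex()
// itself; here it is hand-written and simply pretends to see 3 tokens.
class S : public StandInFlexLexer {
public:
    S() : tokens_seen_(0) {}

    virtual int yylex() {
        // Rule actions run inside S::yylex(), so they can use S's own
        // members (tokens_seen_) directly.
        assert(tokens_seen_ >= 0 && tokens_seen_ <= 3);
        if (tokens_seen_ == 3)
            return 0;              // pretend end of input
        return ++tokens_seen_;     // nonzero token code
    }

    int tokens_seen() const { return tokens_seen_; }

private:
    int tokens_seen_;
};
```

   Calling `yylex()' through a base-class pointer to an `S' object
dispatches to `S::yylex()', while calling it on a plain base-class
object ends up in the error hook.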

   In addition, `yyFlexLexer' defines the following protected virtual
functions which you can redefine in derived classes to tailor the
scanner:

`virtual int LexerInput( char* buf, int max_size )'
     reads up to `max_size' characters into BUF and returns the number
     of characters read.  To indicate end-of-input, return 0
     characters.  Note that "interactive" scanners (see the `-B' and
     `-I' flags) define the macro `YY_INTERACTIVE'.  If you redefine
     `LexerInput()' and need to take different actions depending on
     whether or not the scanner might be scanning an interactive input
     source, you can test for the presence of this name via `#ifdef'.

`virtual void LexerOutput( const char* buf, int size )'
     writes out SIZE characters from the buffer BUF, which, while
     NUL-terminated, may also contain "internal" NUL's if the scanner's
     rules can match text with NUL's in them.

`virtual void LexerError( const char* msg )'
     reports a fatal error message.  The default version of this
     function writes the message to the stream `cerr' and exits.
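
   As an illustration of the `LexerInput()' contract (copy at most
`max_size' characters, return the count, return 0 at end-of-input),
here is a minimal sketch that feeds a scanner from an in-memory string.
`StandInLexer' and its `drain()' helper are hypothetical stand-ins so
that the example compiles without `FlexLexer.h'; in a real scanner you
would derive from `yyFlexLexer' instead and the generated code would
call `LexerInput()' for you.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for yyFlexLexer: it declares only the hook
// being demonstrated.
class StandInLexer {
public:
    virtual ~StandInLexer() {}

    // Mimics a scanner refilling its buffer: calls LexerInput() until it
    // returns 0 and collects everything that was read.
    std::string drain(int chunk_size) {
        assert(chunk_size > 0);
        std::string all;
        std::string chunk(static_cast<std::size_t>(chunk_size), '\0');
        int n;
        while ((n = LexerInput(&chunk[0], chunk_size)) > 0)
            all.append(chunk, 0, static_cast<std::size_t>(n));
        return all;
    }

protected:
    virtual int LexerInput(char* buf, int max_size) = 0;
};

// Supplies input from a string rather than from a stream.
class StringInputLexer : public StandInLexer {
public:
    explicit StringInputLexer(const std::string& text)
        : text_(text), pos_(0) {}

protected:
    // Copy at most max_size characters into buf and return how many
    // were copied; 0 signals end-of-input.
    virtual int LexerInput(char* buf, int max_size) {
        if (max_size <= 0)
            return 0;
        std::size_t remaining = text_.size() - pos_;
        std::size_t n =
            std::min(remaining, static_cast<std::size_t>(max_size));
        std::memcpy(buf, text_.data() + pos_, n);
        pos_ += n;
        return static_cast<int>(n);
    }

private:
    std::string text_;
    std::size_t pos_;
};
```

   The override never writes more than `max_size' characters and does
not NUL-terminate BUF; the returned count is the only length
information the caller receives.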

   Note that a `yyFlexLexer' object contains its _entire_ scanning
state.  Thus you can use such objects to create reentrant scanners.
You can instantiate multiple instances of the same `yyFlexLexer' class,
and you can also combine multiple C++ scanner classes together in the
same program using the `-P' option discussed above.  Finally, note that
the `%array' feature is not available to C++ scanner classes; you must
use `%pointer' (the default).
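
   The point about each object holding its entire scanning state can
be illustrated with a small sketch.  `WordScanner' below is a
hypothetical, hand-written stand-in for a generated scanner class (it
does not use `FlexLexer.h'); what matters is only that its input stream
and line count are data members rather than globals, so two instances
can be advanced in any interleaving without disturbing each other.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical stand-in for a flex-generated scanner class. All state
// (input stream, line count) lives in members, not in globals.
class WordScanner {
public:
    explicit WordScanner(std::istream* in) : in_(in), lineno_(1) {
        assert(in != 0);
    }

    // Returns 1 for each whitespace-separated word, 0 at end of input.
    int yylex() {
        const int eof = std::istream::traits_type::eof();
        int c;
        // Skip whitespace, counting newlines as we go.
        while ((c = in_->get()) != eof &&
               (c == ' ' || c == '\t' || c == '\n')) {
            if (c == '\n')
                ++lineno_;
        }
        if (c == eof)
            return 0;
        // Consume the rest of the word, leaving the delimiter unread.
        while ((c = in_->peek()) != eof &&
               c != ' ' && c != '\t' && c != '\n') {
            in_->get();
        }
        return 1;
    }

    int lineno() const { return lineno_; }

private:
    std::istream* in_;
    int lineno_;
};
```

   With the real class, the analogous usage is to construct several
`yyFlexLexer' objects, each with its own `istream*', and call `yylex()'
on them independently.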

   Here is an example of a simple C++ scanner:

         // An example of using the flex C++ scanner class.
     
     %{
     int mylineno = 0;
     %}
     
     string  \"[^\n"]+\"
     
     ws      [ \t]+
     
     alpha   [A-Za-z]
     dig     [0-9]
     name    ({alpha}|{dig}|\$)({alpha}|{dig}|[_.\-/$])*
     num1    [-+]?{dig}+\.?([eE][-+]?{dig}+)?
     num2    [-+]?{dig}*\.{dig}+([eE][-+]?{dig}+)?
     number  {num1}|{num2}
     
     %%
     
     {ws}    /* skip blanks and tabs */
     
     "/*"    {
             int c;
     
             while((c = yyinput()) != 0)
                 {
                 if(c == '\n')
                     ++mylineno;
     
                 else if(c == '*')
                     {
                     if((c = yyinput()) == '/')
                         break;
                     else
                         unput(c);
                     }
                 }
             }
     
     {number}  cout << "number " << YYText() << '\n';
     
     \n        mylineno++;
     
     {name}    cout << "name " << YYText() << '\n';
     
     {string}  cout << "string " << YYText() << '\n';
     
     %%
     
     int main( int /* argc */, char** /* argv */ )
         {
         FlexLexer* lexer = new yyFlexLexer;
         while(lexer->yylex() != 0)
             ;
         delete lexer;
         return 0;
         }

   If you want to create multiple (different) lexer classes, you use
the `-P' flag (or the `prefix=' option) to rename each `yyFlexLexer' to
some other `xxFlexLexer'.  You can then include `<FlexLexer.h>' in your
other sources once per lexer class, first renaming `yyFlexLexer' as
follows:

     #undef yyFlexLexer
     #define yyFlexLexer xxFlexLexer
     #include <FlexLexer.h>
     
     #undef yyFlexLexer
     #define yyFlexLexer zzFlexLexer
     #include <FlexLexer.h>

   if, for example, you used `%option prefix="xx"' for one of your
scanners and `%option prefix="zz"' for the other.

   IMPORTANT: the present form of the scanning class is _experimental_
and may change considerably between major releases.

