flex provides two different ways to generate scanners for
use with C++. The first way is to simply compile a
scanner generated by flex using a C++ compiler instead of a C
compiler. You should not encounter any compilation
errors (please report any you find to the email address
given in the Author section below). You can then use C++
code in your rule actions instead of C code. Note that
the default input source for your scanner remains yyin,
and default echoing is still done to yyout. Both of these
remain `FILE *' variables and not C++ streams.
You can also use flex to generate a C++ scanner class, using
the `-+' option (or, equivalently, `%option c++'), which
is automatically specified if the name of the flex executable ends
in a `+', such as flex++. When using this option, flex
defaults to generating the scanner to the file `lex.yy.cc' instead
of `lex.yy.c'. The generated scanner includes the header file
`FlexLexer.h', which defines the interface to two C++ classes.
The first class, FlexLexer, provides an abstract base
class defining the general scanner class interface. It
provides the following member functions:
`const char* YYText()'
     returns the text of the most recently matched
     token, the equivalent of yytext.
`int YYLeng()'
     returns the length of the most recently matched
     token, the equivalent of yyleng.
`int lineno() const'
     returns the current input line number (see `%option yylineno'),
     or 1 if `%option yylineno' was not used.
`void set_debug( int flag )'
     sets the debugging flag for the scanner, equivalent to assigning to
     yy_flex_debug (see the Options section above). Note that you
     must build the scanner using `%option debug' to include debugging
     information in it.
`int debug() const'
     returns the current setting of the debugging flag.
Also provided are member functions equivalent to
`yy_switch_to_buffer()', `yy_create_buffer()' (though the
first argument is an `istream*' object pointer and not a
`FILE*'), `yy_flush_buffer()', `yy_delete_buffer()',
and `yyrestart()' (again, the first argument is an `istream*'
object pointer).
The second class defined in `FlexLexer.h' is yyFlexLexer,
which is derived from FlexLexer. It defines the following
additional member functions:
`yyFlexLexer( istream* arg_yyin = 0, ostream* arg_yyout = 0 )'
     constructs a yyFlexLexer object using the given
     streams for input and output. If not specified,
     the streams default to cin and cout, respectively.
`virtual int yylex()'
     performs the same role as `yylex()' does for ordinary
     flex scanners: it scans the input stream, consuming
     tokens, until a rule's action returns a value. If you
     derive a subclass S from yyFlexLexer and want to access
     the member functions and variables of S inside `yylex()',
     then you need to use `%option yyclass="S"' to inform flex
     that you will be using that subclass instead of yyFlexLexer.
     In this case, rather than generating `yyFlexLexer::yylex()',
     flex generates `S::yylex()' (and also generates a dummy
     `yyFlexLexer::yylex()' that calls `yyFlexLexer::LexerError()'
     if called).
`yylex()' called with the stream arguments `new_in' and `new_out'
     first switches the input streams via `switch_streams( new_in, new_out )'
     and then returns the value of `yylex()'.
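The effect of `%option yyclass="S"' can be pictured with a small model that does not need a generated scanner. In the minimal sketch below, `BaseLexer' is a hypothetical stand-in for yyFlexLexer, the class S stands in for the user's subclass, and `tokens_seen_' and `base_yylex_reports_error()' are invented for illustration. The error text and the use of an exception are also assumptions of the sketch; the default LexerError() writes to cerr and exits.

```cpp
#include <stdexcept>

// Hypothetical stand-in for yyFlexLexer when a subclass has been named
// with %option yyclass: its yylex() is only a dummy that reports an error.
class BaseLexer {
public:
    virtual ~BaseLexer() {}
    virtual int yylex() {
        LexerError("base-class yylex() called although a subclass was named");
        return 0;
    }

protected:
    // Assumption of this sketch: throw instead of printing and exiting,
    // so that the call can be observed by the caller.
    virtual void LexerError(const char* msg) {
        throw std::runtime_error(msg);
    }
};

// Plays the role of the user's subclass S. Because the scanning routine
// is S::yylex(), it can use S's own members directly.
class S : public BaseLexer {
public:
    S() : tokens_seen_(0) {}

    // Stands in for the code flex would generate as S::yylex().
    int yylex() override { return ++tokens_seen_; }

    int tokens_seen() const { return tokens_seen_; }

private:
    int tokens_seen_;   // hypothetical per-scanner member
};

// Returns true if calling the dummy base-class yylex() reports an error.
bool base_yylex_reports_error() {
    BaseLexer base;
    try {
        base.yylex();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Calling `yylex()' through a `BaseLexer*' that points at an S object runs `S::yylex()', which is why code that holds only a pointer to the base class still reaches the subclass's scanning routine.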
In addition, yyFlexLexer defines the following protected
virtual functions which you can redefine in derived
classes to tailor the scanner:
`virtual int LexerInput( char* buf, int max_size )'
     reads up to `max_size' characters into buf and
     returns the number of characters read. To indicate
     end-of-input, return 0 characters. Note that
     "interactive" scanners (see the `-B' and `-I' flags)
     define the macro YY_INTERACTIVE. If you redefine
     LexerInput() and need to take different actions
     depending on whether or not the scanner might be
     scanning an interactive input source, you can test
     for the presence of this name via `#ifdef'.
`virtual void LexerOutput( const char* buf, int size )'
     writes out size characters from the buffer buf,
     which, while NUL-terminated, may also contain
     "internal" NUL's if the scanner's rules can match
     text with NUL's in them.
`virtual void LexerError( const char* msg )'
     reports a fatal error message. The default version
     of this function writes the message to the stream
     cerr and exits.
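The contract of these three functions can be tried out without a generated scanner. The following is a minimal sketch, not flex's own code: `StubLexer' is a hypothetical stand-in for yyFlexLexer that declares only the three virtual functions with the signatures listed above, and `StringLexer', `drain()', and `demo_drain()' are names invented for illustration. In a real scanner you would derive from yyFlexLexer instead.

```cpp
#include <algorithm>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <string>

// Hypothetical stand-in for yyFlexLexer: it declares only the three
// protected virtual hooks, with the signatures given in the text.
class StubLexer {
public:
    virtual ~StubLexer() {}

    // Pulls all input through LexerInput() in small chunks and echoes it
    // through LexerOutput(), roughly as a scanner refills its buffer.
    void drain() {
        char buf[8];
        int n;
        while ((n = LexerInput(buf, static_cast<int>(sizeof buf))) > 0)
            LexerOutput(buf, n);
    }

protected:
    virtual int LexerInput(char* buf, int max_size) {
        std::cin.read(buf, max_size);
        return static_cast<int>(std::cin.gcount());
    }
    virtual void LexerOutput(const char* buf, int size) {
        std::cout.write(buf, size);
    }
    virtual void LexerError(const char* msg) {
        std::cerr << msg << '\n';
        std::exit(1);
    }
};

// Reads from an in-memory string instead of a stream and collects the
// output in a string instead of writing it to cout.
class StringLexer : public StubLexer {
public:
    explicit StringLexer(const std::string& text) : text_(text), pos_(0) {}
    const std::string& output() const { return out_; }

protected:
    int LexerInput(char* buf, int max_size) override {
        std::size_t left = text_.size() - pos_;
        std::size_t n = std::min(left, static_cast<std::size_t>(max_size));
        std::memcpy(buf, text_.data() + pos_, n);
        pos_ += n;
        return static_cast<int>(n);   // 0 signals end-of-input
    }
    void LexerOutput(const char* buf, int size) override {
        // Append by length so that internal NULs are preserved.
        out_.append(buf, static_cast<std::size_t>(size));
    }

private:
    std::string text_;
    std::size_t pos_;
    std::string out_;
};

// Runs a StringLexer over text and returns everything it wrote.
std::string demo_drain(const std::string& text) {
    StringLexer lexer(text);
    lexer.drain();
    return lexer.output();
}
```

Note that LexerOutput() is handed an explicit size rather than relying on NUL termination, which is what allows matched text containing NUL's to pass through intact.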
Note that a yyFlexLexer object contains its entire
scanning state. Thus you can use such objects to create
reentrant scanners. You can instantiate multiple instances of
the same yyFlexLexer class, and you can also combine
multiple C++ scanner classes together in the same program
using the `-P' option discussed above.
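To illustrate why keeping all scanning state inside the object makes a scanner reentrant, here is a minimal sketch that uses a hand-written stand-in rather than a generated yyFlexLexer. The class `WordLexer' and the function `demo_interleave()' are hypothetical; the stand-in borrows the names `yylex()', `YYText()', and `lineno()' only to mirror the interface described above.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a scanner class: the input stream, the
// current token, and the line counter all live in the object.
class WordLexer {
public:
    explicit WordLexer(std::istream* in) : in_(in), lineno_(1) {}

    // Scans the next whitespace-separated word. Returns 1 when a word
    // was found and 0 at end-of-input.
    int yylex() {
        const int eof = std::istream::traits_type::eof();
        text_.clear();
        int c = in_->get();
        while (c == ' ' || c == '\t' || c == '\n') {
            if (c == '\n')
                ++lineno_;
            c = in_->get();
        }
        if (c == eof)
            return 0;
        while (c != eof && c != ' ' && c != '\t' && c != '\n') {
            text_ += static_cast<char>(c);
            c = in_->get();
        }
        if (c != eof)
            in_->unget();   // leave the delimiter for the next call
        return 1;
    }

    const char* YYText() const { return text_.c_str(); }
    int lineno() const { return lineno_; }

private:
    std::istream* in_;
    std::string text_;
    int lineno_;
};

// Alternates between two lexers to show that neither disturbs the other.
std::string demo_interleave() {
    std::istringstream in_a("one two\nthree");
    std::istringstream in_b("x\ny\nz");
    WordLexer a(&in_a), b(&in_b);
    std::string log;
    bool more_a = true, more_b = true;
    while (more_a || more_b) {
        if (more_a) {
            more_a = (a.yylex() != 0);
            if (more_a)
                log += std::string("a:") + a.YYText() + "@" +
                       std::to_string(a.lineno()) + " ";
        }
        if (more_b) {
            more_b = (b.yylex() != 0);
            if (more_b)
                log += std::string("b:") + b.YYText() + "@" +
                       std::to_string(b.lineno()) + " ";
        }
    }
    return log;
}
```

Each object tracks its own position and line number, so the calls can be interleaved freely. A counter kept in a global variable would instead be shared by every scanner instance.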
Finally, note that the `%array' feature is not available to
C++ scanner classes; you must use `%pointer' (the default).
Here is an example of a simple C++ scanner:
        // An example of using the flex C++ scanner class.
    %{
    int mylineno = 0;
    %}
    string  \"[^\n"]+\"
    ws      [ \t]+
    alpha   [A-Za-z]
    dig     [0-9]
    name    ({alpha}|{dig}|\$)({alpha}|{dig}|[_.\-/$])*
    num1    [-+]?{dig}+\.?([eE][-+]?{dig}+)?
    num2    [-+]?{dig}*\.{dig}+([eE][-+]?{dig}+)?
    number  {num1}|{num2}
    %%
    {ws}    /* skip blanks and tabs */
    "/*"    {
            int c;
            while((c = yyinput()) != 0)
                {
                if(c == '\n')
                    ++mylineno;
                else if(c == '*')
                    {
                    if((c = yyinput()) == '/')
                        break;
                    else
                        unput(c);
                    }
                }
            }
    {number}  cout << "number " << YYText() << '\n';
    \n        mylineno++;
    {name}    cout << "name " << YYText() << '\n';
    {string}  cout << "string " << YYText() << '\n';
    %%
    int main( int /* argc */, char** /* argv */ )
        {
        FlexLexer* lexer = new yyFlexLexer;
        while(lexer->yylex() != 0)
            ;
        return 0;
        }
If you want to create multiple (different) lexer classes,
use the `-P' flag (or the `prefix=' option) to rename each
yyFlexLexer to some other xxFlexLexer. You can then
include `<FlexLexer.h>' in your other sources once per lexer
class, first renaming yyFlexLexer as follows: