Some scanners (such as those which support "include"
files) require reading from several input streams. As
flex scanners do a large amount of buffering, one cannot
control where the next input will be read from by simply
writing a YY_INPUT which is sensitive to the scanning
context. YY_INPUT is only called when the scanner reaches
the end of its buffer, which may be a long time after
scanning a statement such as an "include" which requires
switching the input source.
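The effect of buffering can be seen in a toy model. The sketch below is not flex's implementation; `toy_reader', `toy_next_char()' and `toy_late_switch()' are hypothetical names, and the refill step merely plays the role of YY_INPUT. It shows that changing the source after a few characters have been consumed has no effect until the characters already buffered run out.

```c
#include <stdio.h>
#include <string.h>

#define TOY_BUF_SIZE 16   /* tiny on purpose; real scanner buffers are far larger */

/* Toy model of a buffered reader (hypothetical, not part of flex). */
struct toy_reader {
    const char *src;          /* current input source */
    size_t src_pos;           /* next unread offset in src */
    char buf[TOY_BUF_SIZE];   /* characters already fetched */
    size_t len;               /* number of valid characters in buf */
    size_t pos;               /* next character to hand out */
};

/* Return the next character, refilling only when the buffer is empty. */
int toy_next_char(struct toy_reader *r)
{
    if (r->pos == r->len) {
        /* This refill plays the role of YY_INPUT. */
        size_t avail = strlen(r->src + r->src_pos);
        size_t n = avail < TOY_BUF_SIZE ? avail : TOY_BUF_SIZE;
        if (n == 0)
            return EOF;
        memcpy(r->buf, r->src + r->src_pos, n);
        r->src_pos += n;
        r->len = n;
        r->pos = 0;
    }
    return (unsigned char) r->buf[r->pos++];
}

/* Read 4 characters from first, change the source, then drain the reader.
 * The characters seen are stored in out as a NUL-terminated string. */
void toy_late_switch(const char *first, const char *second,
                     char *out, size_t cap)
{
    struct toy_reader r = { first, 0, {0}, 0, 0 };
    size_t n = 0;
    int c;

    if (cap == 0)
        return;

    while (n < 4 && n + 1 < cap && (c = toy_next_char(&r)) != EOF)
        out[n++] = (char) c;

    /* Naive "switch": only the source changes, the buffer is untouched. */
    r.src = second;
    r.src_pos = 0;

    while (n + 1 < cap && (c = toy_next_char(&r)) != EOF)
        out[n++] = (char) c;
    out[n] = '\0';
}
```

Calling `toy_late_switch("inc;rest", "BBBB", out, sizeof out)' leaves "inc;restBBBB" in out: the text "rest" was already buffered, so it is delivered before anything from the second source.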
To negotiate these sorts of problems, flex provides a
mechanism for creating and switching between multiple
input buffers. An input buffer is created by using:
YY_BUFFER_STATE yy_create_buffer( FILE *file, int size )
which takes a FILE pointer and a size and creates a buffer
associated with the given file and large enough to hold
size characters (when in doubt, use YY_BUF_SIZE for the
size). It returns a YY_BUFFER_STATE handle, which may
then be passed to other routines (see below). The
YY_BUFFER_STATE type is a pointer to an opaque struct yy_buffer_state structure, so you may safely initialize
YY_BUFFER_STATE variables to `((YY_BUFFER_STATE) 0)' if you
wish, and also refer to the opaque structure in order to
correctly declare input buffers in source files other than
that of your scanner. Note that the FILE pointer in the
call to yy_create_buffer is only used as the value of yyin
seen by YY_INPUT; if you redefine YY_INPUT so it no longer
uses yyin, then you can safely pass a nil FILE pointer to
yy_create_buffer. You select a particular buffer to scan
from using:
void yy_switch_to_buffer( YY_BUFFER_STATE new_buffer )
switches the scanner's input buffer so subsequent tokens
will come from new_buffer. Note that
`yy_switch_to_buffer()' may be used by `yywrap()' to set
things up for continued scanning, instead of opening a new
file and pointing yyin at it. Note also that switching
input sources via either `yy_switch_to_buffer()' or `yywrap()'
does not change the start condition.
void yy_delete_buffer( YY_BUFFER_STATE buffer )
is used to reclaim the storage associated with a buffer.
You can also clear the current contents of a buffer using:
void yy_flush_buffer( YY_BUFFER_STATE buffer )
This function discards the buffer's contents, so the next time the
scanner attempts to match a token from the buffer, it will first fill
the buffer anew using YY_INPUT.
`yy_new_buffer()' is an alias for `yy_create_buffer()',
provided for compatibility with the C++ use of new and delete
for creating and destroying dynamic objects.
Finally, the YY_CURRENT_BUFFER macro returns a
YY_BUFFER_STATE handle to the current buffer.
Here is an example of using these features for writing a
scanner which expands include files (the `<<EOF>>' feature
is discussed below):
/* the "incl" state is used for picking up the name
 * of an include file
 */
%x incl
%{
#define MAX_INCLUDE_DEPTH 10
YY_BUFFER_STATE include_stack[MAX_INCLUDE_DEPTH];
int include_stack_ptr = 0;
%}
%%
include             BEGIN(incl);
[a-z]+              ECHO;
[^a-z\n]*\n?        ECHO;
<incl>[ \t]*        /* eat the whitespace */
<incl>[^ \t\n]+     { /* got the include file name */
        if ( include_stack_ptr >= MAX_INCLUDE_DEPTH )
            {
            fprintf( stderr, "Includes nested too deeply\n" );
            exit( 1 );
            }
        include_stack[include_stack_ptr++] =
            YY_CURRENT_BUFFER;
        yyin = fopen( yytext, "r" );
        if ( ! yyin )
            error( ... );
        yy_switch_to_buffer(
            yy_create_buffer( yyin, YY_BUF_SIZE ) );
        BEGIN(INITIAL);
        }
<<EOF>> {
        if ( --include_stack_ptr < 0 )
            {
            yyterminate();
            }
        else
            {
            yy_delete_buffer( YY_CURRENT_BUFFER );
            yy_switch_to_buffer(
                include_stack[include_stack_ptr] );
            }
        }
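The bookkeeping in the example above is a bounded last-in, first-out stack of buffer handles. A minimal sketch of that stack on its own, outside any scanner, might look as follows. The names `buffer_handle', `push_include()' and `pop_include()' are hypothetical, and `buffer_handle' is only a stand-in for YY_BUFFER_STATE so that the fragment compiles without flex.

```c
#include <stddef.h>

#define MAX_INCLUDE_DEPTH 10

/* Stand-in for YY_BUFFER_STATE (an assumption made for illustration). */
typedef void *buffer_handle;

static buffer_handle include_stack[MAX_INCLUDE_DEPTH];
static int include_stack_ptr = 0;

/* Save the buffer being suspended before entering an include file.
 * Returns 0 on success, -1 if includes are nested too deeply. */
int push_include(buffer_handle current)
{
    if (include_stack_ptr >= MAX_INCLUDE_DEPTH)
        return -1;
    include_stack[include_stack_ptr++] = current;
    return 0;
}

/* At end of file, fetch the buffer to resume.
 * Returns 1 and stores the handle in *resume, or returns 0 when the
 * stack is empty, meaning the outermost input has ended. */
int pop_include(buffer_handle *resume)
{
    if (include_stack_ptr == 0)
        return 0;
    *resume = include_stack[--include_stack_ptr];
    return 1;
}
```

In a scanner, the `<<EOF>>' action would call `yyterminate()' when `pop_include()' returns 0 and otherwise delete the current buffer and switch to the resumed one. Unlike the test `--include_stack_ptr < 0', this variant never leaves the stack pointer negative.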
Three routines are available for setting up input buffers
for scanning in-memory strings instead of files. All of
them create a new input buffer for scanning the string,
and return a corresponding YY_BUFFER_STATE handle (which
you should delete with `yy_delete_buffer()' when done with
it). They also switch to the new buffer using
`yy_switch_to_buffer()', so the next call to `yylex()' will
start scanning the string.
`yy_scan_string(const char *str)'
scans a NUL-terminated string.
`yy_scan_bytes(const char *bytes, int len)'
scans len bytes (including possibly NUL's) starting
at location bytes.
Note that both of these functions create and scan a copy
of the string or bytes. (This may be desirable, since
`yylex()' modifies the contents of the buffer it is
scanning.) You can avoid the copy by using:
`yy_scan_buffer(char *base, yy_size_t size)'
which scans in place the buffer starting at base,
consisting of size bytes, the last two bytes of
which must be YY_END_OF_BUFFER_CHAR (ASCII NUL).
These last two bytes are not scanned; thus,
scanning consists of `base[0]' through `base[size-3]',
inclusive.
If you fail to set up base in this manner (i.e.,
forget the final two YY_END_OF_BUFFER_CHAR bytes),
then `yy_scan_buffer()' returns a nil pointer instead
of creating a new input buffer.
The type yy_size_t is an integral type to which you
can cast an integer expression reflecting the size
of the buffer.
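Preparing a buffer in the layout that `yy_scan_buffer()' expects can be sketched as follows. The helper `make_scan_buffer()' is a hypothetical name, not part of flex; it assumes that YY_END_OF_BUFFER_CHAR is NUL, as stated above, and that the caller allocates and later frees the memory.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy len bytes from src into a fresh heap block followed by the two
 * NUL bytes required at the end of the buffer.  The total size (len + 2)
 * is stored in *size_out.  Returns NULL if the size would overflow or
 * the allocation fails.  (Hypothetical helper, not part of flex.) */
char *make_scan_buffer(const char *src, size_t len, size_t *size_out)
{
    char *base;

    if (len > SIZE_MAX - 2)
        return NULL;

    base = malloc(len + 2);
    if (base == NULL)
        return NULL;

    memcpy(base, src, len);
    base[len] = '\0';       /* first end-of-buffer byte */
    base[len + 1] = '\0';   /* second end-of-buffer byte */
    *size_out = len + 2;
    return base;
}
```

Inside a scanner, the result would be passed as `yy_scan_buffer( base, size )', with size cast to yy_size_t if needed. Because the scan happens in place, the block must stay allocated until the buffer has been deleted with `yy_delete_buffer()'; freeing base afterwards is presumably left to the caller, since the scanner did not allocate it.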