GNU Info

Info Node: (gawk.info)Future Extensions


Probable Future Extensions
==========================

     AWK is a language similar to PERL, only considerably more elegant.
     Arnold Robbins

     Hey!
     Larry Wall

   This minor node briefly lists extensions and possible improvements
that indicate the directions we are currently considering for `gawk'.
The file `FUTURES' in the `gawk' distribution lists these extensions as
well.

   Following is a list of probable future changes visible at the `awk'
language level:

Loadable Module Interface
     It is not clear that the `awk'-level interface to the modules
     facility is as good as it should be.  The interface needs to be
     redesigned, particularly taking namespace issues into account, as
     well as possibly including issues such as library search path order
     and versioning.

`RECLEN' variable for fixed length records
     Along with `FIELDWIDTHS', this would speed up the processing of
     fixed-length records.  `PROCINFO["RS"]' would be `"RS"' or
     `"RECLEN"', depending upon which kind of record processing is in
     effect.

Additional `printf' specifiers
     The 1999 ISO C standard added a number of additional `printf'
     format specifiers.  These should be evaluated for possible
     inclusion in `gawk'.

Databases
     It may be possible to map a GDBM/NDBM/SDBM file into an `awk'
     array.

Large Character Sets
     It would be nice if `gawk' could handle UTF-8 and other character
     sets that are larger than eight bits.

More `lint' warnings
     There are more things that could be checked for portability.
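
   To make the `RECLEN' idea above more concrete, here is a minimal
sketch in Python rather than `awk', since `RECLEN' is only a proposal
and does not exist in `gawk'.  It reads records of a fixed length and
cuts each one into fields by width, much as `FIELDWIDTHS' does for
fields.  The names `read_fixed_records', `reclen', and `fieldwidths'
are hypothetical and chosen only for illustration.

```python
import io


def read_fixed_records(stream, reclen, fieldwidths):
    """Yield one list of fields per fixed-length record.

    Hypothetical illustration of the proposed RECLEN behavior:
    a record is exactly `reclen` characters, with no separator to scan for.
    """
    if sum(fieldwidths) > reclen:
        raise ValueError("field widths exceed record length")
    while True:
        record = stream.read(reclen)
        if len(record) < reclen:
            # End of input (a short trailing record is dropped in this sketch).
            break
        fields, pos = [], 0
        for width in fieldwidths:
            # Slice by position instead of searching for a field separator.
            fields.append(record[pos:pos + width])
            pos += width
        yield fields


# Two 7-character records, each split into fields of width 2, 3, and 2.
data = io.StringIO("AB123XYCD456ZW")
records = list(read_fixed_records(data, reclen=7, fieldwidths=[2, 3, 2]))
```

   The possible speedup comes from never having to scan the input for
a record separator: each read is a fixed-size block.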

   Following is a list of probable improvements that will make `gawk''s
source code easier to work with:

Loadable Module Mechanics
     The current extension mechanism works (see "Adding New Built-in
     Functions to `gawk'"), but is rather primitive.  It requires a
     fair amount of manual work to create and integrate a loadable
     module.  Nor is the current mechanism as portable as might be
     desired.  The GNU `libtool' package provides a number of features
     that would make using loadable modules much easier.  `gawk' should
     be changed to use `libtool'.

Loadable Module Internals
     The API to its internals that `gawk' "exports" should be revised.
     Too many things are needlessly exposed.  A new API should be
     designed and implemented to make module writing easier.

Better Array Subscript Management
     `gawk''s management of array subscript storage could use revamping,
     so that using the same value to index multiple arrays only stores
     one copy of the index value.

Integrating the DBUG Library
     Integrating Fred Fish's DBUG library would be helpful during
     development, but it's a lot of work to do.
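
   The subscript-sharing idea under "Better Array Subscript
Management" can be sketched as a small intern table.  This is an
illustration in Python under stated assumptions, not a description of
`gawk''s internals: the class name `SubscriptPool' and its `intern'
method are hypothetical.

```python
class SubscriptPool:
    """Keep one canonical copy of each subscript value (hypothetical sketch)."""

    def __init__(self):
        self._pool = {}

    def intern(self, key):
        # Return the copy already stored for an equal key, or store this one.
        return self._pool.setdefault(key, key)

    def __len__(self):
        return len(self._pool)


pool = SubscriptPool()
a, b = {}, {}

# Build two equal subscript strings at run time, as two separate
# array references in a script might.
k1 = "".join(["user", "42"])
k2 = "".join(["user", "42"])

a[pool.intern(k1)] = 1
b[pool.intern(k2)] = 2

# Both arrays now hold the very same key object, not two equal copies.
shared = next(iter(a)) is next(iter(b))
```

   A real implementation would also need some way, such as reference
counting, to release a pooled subscript once no array uses it; the
sketch leaves that out.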

   Following is a list of probable improvements that will make `gawk'
perform better:

An Improved Version of `dfa'
     The `dfa' pattern matcher from GNU `grep' has some problems.
     Either a new version or a fixed one will deal with some important
     regexp matching issues.

Compilation of `awk' programs
     `gawk' uses a Bison (YACC-like) parser to convert the script given
     it into a syntax tree; the syntax tree is then executed by a
     simple recursive evaluator.  This method incurs a lot of overhead,
     since the recursive evaluator performs many procedure calls to do
     even the simplest things.

     It should be possible for `gawk' to convert the script's parse tree
     into a C program which the user would then compile, using the
     normal C compiler and a special `gawk' library to provide all the
     needed functions (regexps, fields, associative arrays, type
     coercion, and so on).

     An easier possibility might be for an intermediate phase of `gawk'
     to convert the parse tree into a linear byte code form like the
     one used in GNU Emacs Lisp.  The recursive evaluator would then be
     replaced by a straight-line byte code interpreter that would be
     intermediate in speed between running a compiled program and doing
     what `gawk' does now.
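
   The difference between the two evaluation strategies described
above can be shown with a toy arithmetic language.  The sketch below
is in Python and is purely illustrative: the tree layout, the opcode
names `PUSH', `ADD', and `MUL', and the function names are all
assumptions, and none of it reflects `gawk''s actual data structures.

```python
# Parse tree nodes: ("num", value) or (op, left, right) with op "+" or "*".

def eval_tree(node):
    """Recursive evaluator: one function call per tree node."""
    if node[0] == "num":
        return node[1]
    op, left, right = node
    lhs, rhs = eval_tree(left), eval_tree(right)
    return lhs + rhs if op == "+" else lhs * rhs


def compile_tree(node, code=None):
    """Flatten the tree, in post-order, into a linear list of instructions."""
    if code is None:
        code = []
    if node[0] == "num":
        code.append(("PUSH", node[1]))
    else:
        op, left, right = node
        compile_tree(left, code)
        compile_tree(right, code)
        code.append(("ADD" if op == "+" else "MUL", None))
    return code


def run(code):
    """Straight-line interpreter: a single loop over the instructions."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs if opcode == "ADD" else lhs * rhs)
    return stack.pop()


# 2 + 3 * 4
tree = ("+", ("num", 2), ("*", ("num", 3), ("num", 4)))
code = compile_tree(tree)
```

   Both `eval_tree(tree)' and `run(code)' compute the same value.  The
compile step is paid once, after which execution is a flat loop with
no recursion through the tree.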

   Finally, the programs in the test suite could use documenting in
this Info file.

   See "Making Additions to `gawk'" if you are interested in tackling
any of these projects.

