Always check out the latest perl5-porters discussions on these subjects
before embarking on an implementation tour.

Bugs
    remove recursion in regular expression engine
    fix memory leaks during compile failures
    make signal handling safe

Tie Modules
    VecArray        Implement array using vec()
    SubstrArray     Implement array using substr()
    VirtualArray    Implement array using a file
    ShiftSplice     Defines shift et al in terms of splice method

Would be nice to have
    pack "(stuff)*", "(stuff)?", "(stuff)+", "(stuff)4", ...
    contiguous bitfields in pack/unpack
    lexperl
    bundled perl preprocessor/macro facility
        this would solve many of the syntactic nice-to-haves
    use posix calls internally where possible
    gettimeofday (possibly best left for a module?)
    format BOTTOM
    -i rename file only when successfully changed
    all ARGV input should act like <>
    report HANDLE [formats].
    support in perlmain to rerun debugger
    regression tests using __DIE__ hook
    lexically scoped functions: my sub foo { ... }
        the basic concept is easy and sound,
        the difficulties begin with self-referential
        and mutually referential lexical subs: how to
        declare the subs?
    lexically scoped typeglobs? (lexical I/O handles work now)
    wantlvalue?  more generalized want()/caller()?
    named prototypes: sub foo ($foo, @bar) { ... } ?
    regression/sanity tests for suidperl
    iterators/lazy evaluation/continuations/first/
        first_defined/short-circuiting grep/??
        This is a very thorny and hotly debated subject,
        tread carefully and do your homework first
    generalise Errno way of extracting cpp symbols and use that in
        Errno, Fcntl, POSIX (ExtUtils::CppSymbol?)
    the _r-problem: for all the {set,get,end}*() system database
        calls (and a couple more: readdir, *rand*, crypt, *time,
        tmpnam) there are in many systems the _r versions
        to be used in re-entrant (=multithreaded) code
        Icky things: the _r API is not standardized and
        the _r-forms require per-thread data to store their state
    cross-compilation support
        host vs target: compile in the host, get the executable to
        the target, get the possible input files to the target,
        execute in the target (and do not assume a UNIXish shell in
        the target! e.g. no command redirection can be assumed),
        get possible output files back to the host.  this needs to work
        both during Configure and during the build.  You cannot assume
        shared filesystems between the host and the target (you may need
        e.g. ftp), executing the target executable may involve e.g. rsh
    a way to make << and >> to shift bitvectors instead of numbers

Possible pragmas
    debugger
    optimize (use less qw[memory cpu])

Optimizations
    constant function cache
    switch structures
    foreach(reverse...)
    cache eval tree (unless lexical outer scope used (mark in &compiling?))
    rcatmaybe
    shrink opcode tables via multiple implementations selected in peep
    cache hash value?  (Not a win, according to Guido)
    optimize away @_ where possible
    tail recursion removal
    "one pass" global destruction
    rewrite regexp parser for better integrated optimization
    LRU cache of regexp: foreach $pat (@pats) { foo() if /$pat/ }

Vague possibilities
    ref function in list context?
    make tr/// return histogram in list context?
    loop control on do{} et al
    explicit switch statements
    built-in globbing
    compile to real threaded code
    structured types
    autocroak?
    modifiable $1 et al
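
Sketch for the VecArray entry under Tie Modules.  This is only one way
the idea might be approached, not an agreed design.  It assumes the
class inherits from Tie::Array so that push, pop, shift and friends
fall back on the four methods defined here.  The constructor argument
(bits per element) and the internal field names buf, bits and size are
invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical VecArray: a tied array whose elements are packed into a
# single string and accessed with vec().
package VecArray;
use Tie::Array;
our @ISA = ('Tie::Array');

# tie @a, 'VecArray', $bits;   $bits defaults to 8 (an assumption)
sub TIEARRAY {
    my ($class, $bits) = @_;
    $bits = 8 unless defined $bits;
    die "bits must be 1, 2, 4, 8, 16 or 32\n"
        unless grep { $bits == $_ } 1, 2, 4, 8, 16, 32;
    return bless { bits => $bits, buf => '', size => 0 }, $class;
}

sub FETCHSIZE { $_[0]{size} }

sub STORESIZE {
    my ($self, $n) = @_;
    # Zero the elements being dropped so that a later grow reads 0.
    # The buffer itself is not trimmed in this sketch.
    vec($self->{buf}, $_, $self->{bits}) = 0 for $n .. $self->{size} - 1;
    $self->{size} = $n;
}

sub FETCH {
    my ($self, $i) = @_;
    return undef if $i >= $self->{size};
    return vec($self->{buf}, $i, $self->{bits});
}

sub STORE {
    my ($self, $i, $val) = @_;
    $self->{size} = $i + 1 if $i >= $self->{size};
    vec($self->{buf}, $i, $self->{bits}) = $val;
}

package main;

tie my @nibbles, 'VecArray', 4;     # four bits per element
@nibbles = (1, 2, 3);
push @nibbles, 15;                  # PUSH is inherited from Tie::Array
print "@nibbles\n";                 # prints "1 2 3 15"
print length(tied(@nibbles)->{buf}), "\n";   # prints "2" (bytes used)
```

Elements are limited to unsigned integers that fit in the chosen
width, which is the price paid for the compact storage.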
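
Sketch for the "LRU cache of regexp" entry under Optimizations.  The
entry concerns a cache inside the interpreter.  What follows is only a
user-level analogue of the same idea, written with qr//, to show the
access pattern such a cache would serve.  The names cached_re, $MAX,
%compiled and @recent are invented for illustration.

```perl
use strict;
use warnings;

# Keep at most $MAX compiled patterns and evict the least recently
# used one when the cache is full.
my $MAX = 3;
my %compiled;   # pattern string => compiled qr// object
my @recent;     # pattern strings, least recently used first

sub cached_re {
    my ($pat) = @_;
    if (exists $compiled{$pat}) {
        # Cache hit: remove the old position, re-added at the end below.
        @recent = grep { $_ ne $pat } @recent;
    }
    else {
        if (@recent >= $MAX) {
            my $old = shift @recent;        # least recently used
            delete $compiled{$old};
        }
        $compiled{$pat} = qr/$pat/;         # compile once
    }
    push @recent, $pat;                     # now the most recently used
    return $compiled{$pat};
}

my @pats  = ('^foo', 'bar$', '\d+');
my @lines = ('foo 1', 'rebar', 'nothing');
my $hits  = 0;
for my $line (@lines) {
    for my $pat (@pats) {
        my $re = cached_re($pat);
        $hits++ if $line =~ $re;
    }
}
print "$hits\n";    # prints "3"
```

The linear scan of @recent is acceptable for a handful of patterns.  A
cache meant for many patterns would want a cheaper way to reorder
entries.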