Copyright (C) 2000-2012 |
Manpages Text::ParseWordsSection: Perl Programmers Reference Guide (3perl)Updated: 2001-02-22 Index Return to Main Contents NAMEText::ParseWords - parse text into an array of tokens or array of arraysSYNOPSISuse Text::ParseWords; @lists = &nested_quotewords($delim, $keep, @lines); @words = "ewords($delim, $keep, @lines); @words = &shellwords(@lines); @words = &parse_line($delim, $keep, $line); @words = &old_shellwords(@lines); # DEPRECATED! DESCRIPTIONThe &nested_quotewords() and "ewords() functions accept a delimiter (which can be a regular expression) and a list of lines and then breaks those lines up into a list of words ignoring delimiters that appear inside quotes. "ewords() returns all of the tokens in a single long list, while &nested_quotewords() returns a list of token lists corresponding to the elements of @lines. &parse_line() does tokenizing on a single string. The &*quotewords() functions simply call &parse_lines(), so if you're only splitting one line you can call &parse_lines() directly and save a function call.The $keep argument is a boolean flag. If true, then the tokens are split on the specified delimiter, but all other characters (quotes, backslashes, etc.) are kept in the tokens. If $keep is false then the &*quotewords() functions remove all quotes and backslashes that are not themselves backslash-escaped or inside of single quotes (i.e., "ewords() tries to interpret these characters just like the Bourne shell). NB: these semantics are significantly different from the original version of this module shipped with Perl 5.000 through 5.004. As an additional feature, $keep may be the keyword ``delimiters'' which causes the functions to preserve the delimiters in each string as tokens in the token lists, in addition to preserving quote and backslash characters. &shellwords() is written as a special case of "ewords(), and it does token parsing with whitespace as a delimiter- similar to most Unix shells. EXAMPLESThe sample program:
use Text::ParseWords; @words = "ewords('\s+', 0, q{this is "a test" of\ quotewords \"for you}); $i = 0; foreach (@words) { print "$i: <$_>\n"; $i++; }produces:
0: <this> 1: <is> 2: <a test> 3: <of quotewords> 4: <"for> 5: <you>demonstrating:
Replacing ""ewords('\s+', 0, q{this is...})" with "&shellwords(q{this is...})" is a simpler way to accomplish the same thing. AUTHORSMaintainer is Hal Pomeranz <pomeranz@netcom.com>, 1994-1997 (Original author unknown). Much of the code for &parse_line() (including the primary regexp) from Joerk Behrends <jbehrends@multimediaproduzenten.de>.Examples section another documentation provided by John Heidemann <johnh@ISI.EDU> Bug reports, patches, and nagging provided by lots of folks- thanks everybody! Special thanks to Michael Schwern <schwern@envirolink.org> for assuring me that a &nested_quotewords() would be useful, and to Jeff Friedl <jfriedl@yahoo-inc.com> for telling me not to worry about error-checking (sort of- you had to be there).
IndexThis document was created by man2html, using the manual pages. Time: 04:16:35 GMT, April 25, 2024 |