Diff of /code/trunk/doc/pcretest.txt

revision 155 by ph10, Tue Apr 24 13:36:11 2007 UTC revision 289 by ph10, Sun Dec 23 12:17:20 2007 UTC
# Line 85  DESCRIPTION Line 85  DESCRIPTION
85         "re>" to prompt for regular expressions, and "data>" to prompt for data         "re>" to prompt for regular expressions, and "data>" to prompt for data
86         lines.         lines.
87    
88           When  pcretest  is  built,  a  configuration option can specify that it
89           should be linked with the libreadline library. When this  is  done,  if
90           the input is from a terminal, it is read using the readline() function.
91           This provides line-editing and history facilities. The output from  the
92           -help option states whether or not readline() will be used.
93    
94         The program handles any number of sets of input on a single input file.         The program handles any number of sets of input on a single input file.
95         Each set starts with a regular expression, and continues with any  num-         Each set starts with a regular expression, and continues with any  num-
96         ber of data lines to be matched against the pattern.         ber of data lines to be matched against the pattern.
# Line 146  PATTERN MODIFIERS Line 152  PATTERN MODIFIERS
152         The following table shows additional modifiers for setting PCRE options         The following table shows additional modifiers for setting PCRE options
153         that do not correspond to anything in Perl:         that do not correspond to anything in Perl:
154    
155           /A          PCRE_ANCHORED           /A              PCRE_ANCHORED
156           /C          PCRE_AUTO_CALLOUT           /C              PCRE_AUTO_CALLOUT
157           /E          PCRE_DOLLAR_ENDONLY           /E              PCRE_DOLLAR_ENDONLY
158           /f          PCRE_FIRSTLINE           /f              PCRE_FIRSTLINE
159           /J          PCRE_DUPNAMES           /J              PCRE_DUPNAMES
160           /N          PCRE_NO_AUTO_CAPTURE           /N              PCRE_NO_AUTO_CAPTURE
161           /U          PCRE_UNGREEDY           /U              PCRE_UNGREEDY
162           /X          PCRE_EXTRA           /X              PCRE_EXTRA
163           /<cr>       PCRE_NEWLINE_CR           /<cr>           PCRE_NEWLINE_CR
164           /<lf>       PCRE_NEWLINE_LF           /<lf>           PCRE_NEWLINE_LF
165           /<crlf>     PCRE_NEWLINE_CRLF           /<crlf>         PCRE_NEWLINE_CRLF
166           /<anycrlf>  PCRE_NEWLINE_ANYCRLF           /<anycrlf>      PCRE_NEWLINE_ANYCRLF
167           /<any>      PCRE_NEWLINE_ANY           /<any>          PCRE_NEWLINE_ANY
168           /<bsr_anycrlf>  PCRE_BSR_ANYCRLF
169           /<bsr_unicode>  PCRE_BSR_UNICODE
170    
171         Those  specifying  line ending sequencess are literal strings as shown.         Those  specifying  line  ending sequences are literal strings as shown,
172         This example sets multiline matching  with  CRLF  as  the  line  ending         but the letters can be in either  case.  This  example  sets  multiline
173         sequence:         matching with CRLF as the line ending sequence:
174    
175           /^abc/m<crlf>           /^abc/m<crlf>
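
The angle-bracket modifiers in the table above can be pictured as a simple name lookup. The following Python sketch is only an illustration and is not pcretest's own parsing code: the function name `angle_modifiers` and the dictionary are hypothetical, and the case-insensitive lookup reflects the v.289 wording ("the letters can be in either case").

```python
import re

# Names taken from the modifier table above; the mapping itself is illustrative.
ANGLE_MODIFIERS = {
    "cr": "PCRE_NEWLINE_CR",
    "lf": "PCRE_NEWLINE_LF",
    "crlf": "PCRE_NEWLINE_CRLF",
    "anycrlf": "PCRE_NEWLINE_ANYCRLF",
    "any": "PCRE_NEWLINE_ANY",
    "bsr_anycrlf": "PCRE_BSR_ANYCRLF",
    "bsr_unicode": "PCRE_BSR_UNICODE",
}


def angle_modifiers(modifier_text):
    """Return the PCRE option names for every <...> item in the modifier text.

    modifier_text is whatever follows the closing delimiter of the pattern,
    for example "m<crlf>" in /^abc/m<crlf>. Single-letter modifiers such as
    "m" are ignored here; only the angle-bracket items are looked up.
    """
    options = []
    for name in re.findall(r"<([^>]*)>", modifier_text):
        key = name.lower()  # letters can be in either case (v.289 text)
        if key not in ANGLE_MODIFIERS:
            raise ValueError("unknown modifier <%s>" % name)
        options.append(ANGLE_MODIFIERS[key])
    return options
```

For the example pattern above, `angle_modifiers("m<crlf>")` yields a one-item list naming PCRE_NEWLINE_CRLF.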
176    
# Line 369  DATA LINES Line 377  DATA LINES
377         The use of \x{hh...} to represent UTF-8 characters is not dependent  on         The use of \x{hh...} to represent UTF-8 characters is not dependent  on
378         the  use  of  the  /8 modifier on the pattern. It is recognized always.         the  use  of  the  /8 modifier on the pattern. It is recognized always.
379         There may be any number of hexadecimal digits inside  the  braces.  The         There may be any number of hexadecimal digits inside  the  braces.  The
380         result  is from one to six bytes, encoded according to the UTF-8 rules.         result  is  from  one  to  six bytes, encoded according to the original
381           UTF-8 rules of RFC 2279. This allows for  values  in  the  range  0  to
382           0x7FFFFFFF.  Note  that not all of those are valid Unicode code points,
383           or indeed valid UTF-8 characters according to the later  rules  in  RFC
384           3629.
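
The one-to-six-byte encoding described above can be sketched in Python. This is a minimal illustration of the original RFC 2279 scheme under the stated range of 0 to 0x7FFFFFFF, not the code pcretest uses; the function name `encode_utf8_rfc2279` is an assumption made for this example.

```python
def encode_utf8_rfc2279(value):
    """Encode an integer in the range 0..0x7FFFFFFF as 1 to 6 bytes.

    Follows the original UTF-8 layout of RFC 2279, so it accepts values
    that RFC 3629 no longer allows (above 0x10FFFF, or surrogates).
    """
    if not 0 <= value <= 0x7FFFFFFF:
        raise ValueError("value out of range for RFC 2279 UTF-8")
    if value < 0x80:
        return bytes([value])  # single byte, same as ASCII

    # Largest value that fits with 1, 2, 3, 4 or 5 continuation bytes.
    limits = [0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]
    for extra, limit in enumerate(limits, start=1):
        if value <= limit:
            break

    # Lead byte marker: 0xC0, 0xE0, 0xF0, 0xF8 or 0xFC.
    lead_marker = (0xFF << (7 - extra)) & 0xFF

    out = []
    for _ in range(extra):
        out.append(0x80 | (value & 0x3F))  # continuation byte 10xxxxxx
        value >>= 6
    out.append(lead_marker | value)        # remaining high bits go in the lead byte
    return bytes(reversed(out))
```

Values up to 0x10FFFF (excluding surrogates) produce the same bytes as a modern UTF-8 encoder; larger values produce the five- and six-byte forms that RFC 3629 later disallowed.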
385    
386    
387  THE ALTERNATIVE MATCHING FUNCTION  THE ALTERNATIVE MATCHING FUNCTION
388    
389         By  default,  pcretest  uses  the  standard  PCRE  matching   function,         By   default,  pcretest  uses  the  standard  PCRE  matching  function,
390         pcre_exec() to match each data line. From release 6.0, PCRE supports an         pcre_exec() to match each data line. From release 6.0, PCRE supports an
391         alternative matching function, pcre_dfa_exec(),  which  operates  in  a         alternative  matching  function,  pcre_dfa_exec(),  which operates in a
392         different  way,  and has some restrictions. The differences between the         different way, and has some restrictions. The differences  between  the
393         two functions are described in the pcrematching documentation.         two functions are described in the pcrematching documentation.
394    
395         If a data line contains the \D escape sequence, or if the command  line         If  a data line contains the \D escape sequence, or if the command line
396         contains  the -dfa option, the alternative matching function is called.         contains the -dfa option, the alternative matching function is  called.
397         This function finds all possible matches at a given point. If, however,         This function finds all possible matches at a given point. If, however,
398         the  \F escape sequence is present in the data line, it stops after the         the \F escape sequence is present in the data line, it stops after  the
399         first match is found. This is always the shortest possible match.         first match is found. This is always the shortest possible match.
400    
401    
402  DEFAULT OUTPUT FROM PCRETEST  DEFAULT OUTPUT FROM PCRETEST
403    
404         This section describes the output when the  normal  matching  function,         This  section  describes  the output when the normal matching function,
405         pcre_exec(), is being used.         pcre_exec(), is being used.
406    
407         When a match succeeds, pcretest outputs the list of captured substrings         When a match succeeds, pcretest outputs the list of captured substrings
408         that pcre_exec() returns, starting with number 0 for  the  string  that         that  pcre_exec()  returns,  starting with number 0 for the string that
409         matched the whole pattern. Otherwise, it outputs "No match" or "Partial         matched the whole pattern. Otherwise, it outputs "No match" or "Partial
410         match" when pcre_exec() returns PCRE_ERROR_NOMATCH  or  PCRE_ERROR_PAR-         match"  when  pcre_exec() returns PCRE_ERROR_NOMATCH or PCRE_ERROR_PAR-
411         TIAL,  respectively, and otherwise the PCRE negative error number. Here         TIAL, respectively, and otherwise the PCRE negative error number.  Here
412         is an example of an interactive pcretest run.         is an example of an interactive pcretest run.
413    
414           $ pcretest           $ pcretest
# Line 409  DEFAULT OUTPUT FROM PCRETEST Line 421  DEFAULT OUTPUT FROM PCRETEST
421           data> xyz           data> xyz
422           No match           No match
423    
424           Note  that unset capturing substrings that are not followed by one that
425           is set are not returned by pcre_exec(), and are not shown by  pcretest.
426           In  the following example, there are two capturing substrings, but when
427           the first data line is matched, the  second,  unset  substring  is  not
428           shown.  An "internal" unset substring is shown as "<unset>", as for the
429           second data line.
430    
431               re> /(a)|(b)/
432             data> a
433              0: a
434              1: a
435             data> b
436              0: b
437              1: <unset>
438              2: b
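
The rule for unset substrings can be imitated with a short Python sketch. It uses Python's own `re` module as a stand-in, which is not PCRE and does not call pcre_exec(); the function name `pcretest_style_output` and the exact number formatting are assumptions chosen to resemble the listing above.

```python
import re


def pcretest_style_output(pattern, subject):
    """Format a match roughly the way the listing above shows it.

    Trailing unset groups are dropped; an unset group that is followed by
    a set one is kept and shown as "<unset>".
    """
    match = re.search(pattern, subject)
    if match is None:
        return ["No match"]

    # Group 0 is the whole match; groups() holds the capturing substrings,
    # with None for any that did not take part in the match.
    groups = [match.group(0)] + list(match.groups())

    # Drop unset groups that are not followed by a set one.
    while groups[-1] is None:
        groups.pop()

    return [
        "%2d: %s" % (number, "<unset>" if text is None else text)
        for number, text in enumerate(groups)
    ]
```

With the pattern `(a)|(b)`, the subject `a` gives two lines (0 and 1), while the subject `b` gives three, with group 1 reported as `<unset>`.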
439    
440         If the strings contain any non-printing characters, they are output  as         If the strings contain any non-printing characters, they are output  as
441         \0x  escapes,  or  as \x{...} escapes if the /8 modifier was present on         \0x  escapes,  or  as \x{...} escapes if the /8 modifier was present on
442         the pattern. See below for the definition of  non-printing  characters.         the pattern. See below for the definition of  non-printing  characters.
# Line 621  AUTHOR Line 649  AUTHOR
649    
650  REVISION  REVISION
651    
652         Last updated: 24 April 2007         Last updated: 18 December 2007
653         Copyright (c) 1997-2007 University of Cambridge.         Copyright (c) 1997-2007 University of Cambridge.
