
Diff of /code/trunk/doc/pcretest.txt


revision 48 by nigel, Sat Feb 24 21:39:29 2007 UTC revision 49 by nigel, Sat Feb 24 21:39:33 2007 UTC
# Line 43  backslash, because Line 43  backslash, because
43  is interpreted as the first line of a pattern that starts with "abc/", causing  is interpreted as the first line of a pattern that starts with "abc/", causing
44  pcretest to read the next line as a continuation of the regular expression.  pcretest to read the next line as a continuation of the regular expression.
45    
46    
47    PATTERN MODIFIERS
48    -----------------
49    
50  The pattern may be followed by i, m, s, or x to set the PCRE_CASELESS,  The pattern may be followed by i, m, s, or x to set the PCRE_CASELESS,
51  PCRE_MULTILINE, PCRE_DOTALL, or PCRE_EXTENDED options, respectively. For  PCRE_MULTILINE, PCRE_DOTALL, or PCRE_EXTENDED options, respectively. For
52  example:  example:
# Line 103  compiled, and the results used when the Line 107  compiled, and the results used when the
107  The /M modifier causes the size of memory block used to hold the compiled  The /M modifier causes the size of memory block used to hold the compiled
108  pattern to be output.  pattern to be output.
109    
110  Finally, the /P modifier causes pcretest to call PCRE via the POSIX wrapper API  The /P modifier causes pcretest to call PCRE via the POSIX wrapper API rather
111  rather than its native API. When this is done, all other modifiers except /i,  than its native API. When this is done, all other modifiers except /i, /m, and
112  /m, and /+ are ignored. REG_ICASE is set if /i is present, and REG_NEWLINE is  /+ are ignored. REG_ICASE is set if /i is present, and REG_NEWLINE is set if /m
113  set if /m is present. The wrapper functions force PCRE_DOLLAR_ENDONLY always,  is present. The wrapper functions force PCRE_DOLLAR_ENDONLY always, and
114  and PCRE_DOTALL unless REG_NEWLINE is set.  PCRE_DOTALL unless REG_NEWLINE is set.
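The /P rules in this hunk can be restated as a small lookup. The following is only an illustrative sketch in Python, not code from pcretest itself: the function name `posix_wrapper_options` and the use of plain strings for the flag names are assumptions made for the example.

```python
def posix_wrapper_options(modifiers):
    """Summarise what the text says happens to pattern modifiers under /P.

    `modifiers` is a string of single-letter modifiers such as "imx".
    Returns (regcomp_flags, forced_pcre_options, ignored_modifiers).
    Hypothetical helper: it models the documented rules only.
    """
    # Only /i, /m and /+ are honoured when the POSIX wrapper is in use.
    ignored = {m for m in modifiers if m not in "im+"}

    regcomp_flags = set()
    if "i" in modifiers:
        regcomp_flags.add("REG_ICASE")
    if "m" in modifiers:
        regcomp_flags.add("REG_NEWLINE")

    # The wrapper always forces PCRE_DOLLAR_ENDONLY, and forces
    # PCRE_DOTALL unless REG_NEWLINE was requested.
    forced = {"PCRE_DOLLAR_ENDONLY"}
    if "REG_NEWLINE" not in regcomp_flags:
        forced.add("PCRE_DOTALL")

    return regcomp_flags, forced, ignored
```

For instance, `posix_wrapper_options("mx")` reports REG_NEWLINE as set, only PCRE_DOLLAR_ENDONLY as forced, and /x as ignored.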
115    
116    The /8 modifier causes pcretest to call PCRE with the PCRE_UTF8 option set.
117    This turns on the (currently incomplete) support for UTF-8 character handling
118    in PCRE, provided that it was compiled with this support enabled. This modifier
119    also causes any non-printing characters in output strings to be printed using
120    the \x{hh...} notation if they are valid UTF-8 sequences.
121    
122    
123    DATA LINES
124    ----------
125    
126  Before each data line is passed to pcre_exec(), leading and trailing whitespace  Before each data line is passed to pcre_exec(), leading and trailing whitespace
127  is removed, and it is then scanned for \ escapes. The following are recognized:  is removed, and it is then scanned for \ escapes. The following are recognized:
128    
129    \a     alarm (= BEL)    \a         alarm (= BEL)
130    \b     backspace    \b         backspace
131    \e     escape    \e         escape
132    \f     formfeed    \f         formfeed
133    \n     newline    \n         newline
134    \r     carriage return    \r         carriage return
135    \t     tab    \t         tab
136    \v     vertical tab    \v         vertical tab
137    \nnn   octal character (up to 3 octal digits)    \nnn       octal character (up to 3 octal digits)
138    \xhh   hexadecimal character (up to 2 hex digits)    \xhh       hexadecimal character (up to 2 hex digits)
139      \x{hh...}  hexadecimal UTF-8 character
140    \A     pass the PCRE_ANCHORED option to pcre_exec()  
141    \B     pass the PCRE_NOTBOL option to pcre_exec()    \A         pass the PCRE_ANCHORED option to pcre_exec()
142    \Cdd   call pcre_copy_substring() for substring dd after a successful match    \B         pass the PCRE_NOTBOL option to pcre_exec()
143             (any decimal number less than 32)    \Cdd       call pcre_copy_substring() for substring dd after a successful
144    \Gdd   call pcre_get_substring() for substring dd after a successful match                 match (any decimal number less than 32)
145             (any decimal number less than 32)    \Gdd       call pcre_get_substring() for substring dd after a successful
146    \L     call pcre_get_substringlist() after a successful match                 match (any decimal number less than 32)
147    \N     pass the PCRE_NOTEMPTY option to pcre_exec()    \L         call pcre_get_substringlist() after a successful match
148    \Odd   set the size of the output vector passed to pcre_exec() to dd    \N         pass the PCRE_NOTEMPTY option to pcre_exec()
149             (any number of decimal digits)    \Odd       set the size of the output vector passed to pcre_exec() to dd
150    \Z     pass the PCRE_NOTEOL option to pcre_exec()                 (any number of decimal digits)
151      \Z         pass the PCRE_NOTEOL option to pcre_exec()
152    
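The escape table above can be read as a small scanner over the data line. Below is a rough Python sketch of that scan, written only from the table; it is not the pcretest.c implementation. The names `decode_data_line`, `take`, and the tuple format of the returned requests are invented for the example. It also makes some assumptions the text does not state: octal values are masked to one byte, `\x` with no hex digits yields a zero byte, and `\x{hh...}` is encoded with Python's own UTF-8 encoder, so it is limited to code points up to U+10FFFF and raises `ValueError` on malformed input.

```python
SIMPLE = {"a": 7, "b": 8, "e": 27, "f": 12, "n": 10, "r": 13, "t": 9, "v": 11}
EXEC_OPTIONS = {"A": "PCRE_ANCHORED", "B": "PCRE_NOTBOL",
                "N": "PCRE_NOTEMPTY", "Z": "PCRE_NOTEOL"}
OCTAL = "01234567"
HEX = "0123456789abcdefABCDEF"
DECIMAL = "0123456789"


def take(s, i, allowed, limit=None):
    """Return (run, next_index) for the run of `allowed` characters at s[i:]."""
    j = i
    while j < len(s) and s[j] in allowed and (limit is None or j - i < limit):
        j += 1
    return s[i:j], j


def decode_data_line(line):
    """Return (subject_bytes, exec_options, requests) for one data line."""
    s = line.strip()                   # leading/trailing whitespace is removed
    subject, options, requests = bytearray(), [], []
    i = 0
    while i < len(s):
        ch = s[i]
        i += 1
        if ch != "\\":
            subject += ch.encode("utf-8")
            continue
        if i == len(s):
            break                      # a final backslash is ignored
        ch = s[i]
        i += 1
        if ch in SIMPLE:
            subject.append(SIMPLE[ch])
        elif ch in OCTAL:              # \nnn: up to 3 octal digits in total
            rest, i = take(s, i, OCTAL, 2)
            subject.append(int(ch + rest, 8) & 0xFF)   # masking is an assumption
        elif ch == "x" and s[i:i + 1] == "{" and "}" in s[i:]:
            end = s.index("}", i)      # \x{hh...}: a UTF-8 encoded character
            subject += chr(int(s[i + 1:end], 16)).encode("utf-8")
            i = end + 1
        elif ch == "x":                # \xhh: up to 2 hex digits
            digits, i = take(s, i, HEX, 2)
            subject.append(int(digits or "0", 16))
        elif ch in EXEC_OPTIONS:       # \A \B \N \Z set pcre_exec() options
            options.append(EXEC_OPTIONS[ch])
        elif ch in "CGO":              # \Cdd, \Gdd, \Odd carry a decimal number
            digits, i = take(s, i, DECIMAL)
            requests.append((ch, int(digits or "0")))
        elif ch == "L":                # \L asks for the substring list
            requests.append(("L", None))
        else:
            subject += ch.encode("utf-8")   # anything else escapes itself
    return bytes(subject), options, requests
```

In this sketch the subject bytes and the side effects (options and substring requests) are returned separately, which keeps the distinction between "characters passed to the match" and "instructions to pcretest" visible.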
153  A backslash followed by anything else just escapes the anything else. If the  A backslash followed by anything else just escapes the anything else. If the
154  very last character is a backslash, it is ignored. This gives a way of passing  very last character is a backslash, it is ignored. This gives a way of passing
# Line 143  If /P was present on the regex, causing Line 158  If /P was present on the regex, causing
158  \B, and \Z have any effect, causing REG_NOTBOL and REG_NOTEOL to be passed to  \B, and \Z have any effect, causing REG_NOTBOL and REG_NOTEOL to be passed to
159  regexec() respectively.  regexec() respectively.
160    
161    The use of \x{hh...} to represent UTF-8 characters is not dependent on the use
162    of the /8 modifier on the pattern. It is recognized always. There may be any
163    number of hexadecimal digits inside the braces. The result is from one to six
164    bytes, encoded according to the UTF-8 rules.
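The "one to six bytes" range corresponds to the original UTF-8 scheme, which covers values up to 0x7FFFFFFF. As a rough illustration of how a value taken from `\x{hh...}` would be turned into bytes, here is a minimal Python sketch; the function name `encode_utf8` is made up for the example, and the code is not taken from pcretest.

```python
def encode_utf8(value):
    """Encode an integer as 1 to 6 bytes using the original UTF-8 rules."""
    if not 0 <= value <= 0x7FFFFFFF:
        raise ValueError("value does not fit in six UTF-8 bytes")
    if value < 0x80:
        return bytes([value])          # plain ASCII: one byte

    # (total bytes, first value that no longer fits, lead-byte marker)
    layouts = ((2, 0x800, 0xC0), (3, 0x10000, 0xE0), (4, 0x200000, 0xF0),
               (5, 0x4000000, 0xF8), (6, 0x80000000, 0xFC))
    for nbytes, limit, lead in layouts:
        if value < limit:
            break

    # Peel off six bits at a time for the continuation bytes (10xxxxxx),
    # lowest bits first; what remains goes into the lead byte.
    tail = []
    for _ in range(nbytes - 1):
        tail.append(0x80 | (value & 0x3F))
        value >>= 6
    return bytes([lead | value] + tail[::-1])
```

Under this scheme `encode_utf8(0x20AC)` gives the three bytes E2 82 AC, and the largest value, 0x7FFFFFFF, gives the six bytes FD BF BF BF BF BF.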
165    
166    
167    OUTPUT FROM PCRETEST
168    --------------------
169    
170  When a match succeeds, pcretest outputs the list of captured substrings that  When a match succeeds, pcretest outputs the list of captured substrings that
171  pcre_exec() returns, starting with number 0 for the string that matched the  pcre_exec() returns, starting with number 0 for the string that matched the
172  whole pattern. Here is an example of an interactive pcretest run.  whole pattern. Here is an example of an interactive pcretest run.
# Line 158  whole pattern. Here is an example of an Line 182  whole pattern. Here is an example of an
182    No match    No match
183    
184  If the strings contain any non-printing characters, they are output as \0x  If the strings contain any non-printing characters, they are output as \0x
185  escapes. If the pattern has the /+ modifier, then the output for substring 0 is  escapes, or as \x{...} escapes if the /8 modifier was present on the pattern.
186  followed by the rest of the subject string, identified by "0+" like this:  If the pattern has the /+ modifier, then the output for substring 0 is followed
187    by the rest of the subject string, identified by "0+" like this:
188    
189      re> /cat/+      re> /cat/+
190    data> cataract    data> cataract
# Line 190  Note that while patterns can be continue Line 215  Note that while patterns can be continue
215  prompt is used for continuations), data lines may not. However newlines can be  prompt is used for continuations), data lines may not. However newlines can be
216  included in data by means of the \n escape.  included in data by means of the \n escape.
217    
218    
219    COMMAND LINE OPTIONS
220    --------------------
221    
222  If the -p option is given to pcretest, it is equivalent to adding /P to each  If the -p option is given to pcretest, it is equivalent to adding /P to each
223  regular expression: the POSIX wrapper API is used to call PCRE. None of the  regular expression: the POSIX wrapper API is used to call PCRE. None of the
224  following flags has any effect in this case.  following flags has any effect in this case.
# Line 208  a synonym for -m. Line 237  a synonym for -m.
237    
238  If the -t option is given, each compile, study, and match is run 20000 times  If the -t option is given, each compile, study, and match is run 20000 times
239  while being timed, and the resulting time per compile or match is output in  while being timed, and the resulting time per compile or match is output in
240  milliseconds. Do not set -t with -s, because you will then get the size output  milliseconds. Do not set -t with -m, because you will then get the size output
241  20000 times and the timing will be distorted. If you want to change the number  20000 times and the timing will be distorted. If you want to change the number
242  of repetitions used for timing, edit the definition of LOOPREPEAT at the top of  of repetitions used for timing, edit the definition of LOOPREPEAT at the top of
243  pcretest.c  pcretest.c
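The timing approach described for -t is a plain repeat-and-average loop. A minimal Python sketch of the idea follows; it is not the pcretest.c code. The constant name `LOOPREPEAT` and the value 20000 come from the text, while the function name `time_per_call_ms`, the use of `time.perf_counter()`, and the sample pattern are assumptions made for the example.

```python
import re
import time

LOOPREPEAT = 20000  # repetition count quoted in the text


def time_per_call_ms(operation, repeat=LOOPREPEAT):
    """Run operation() `repeat` times; return the mean time per call in ms."""
    start = time.perf_counter()
    for _ in range(repeat):
        operation()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeat


if __name__ == "__main__":
    # Python's `re` stands in for PCRE here purely for illustration.
    pattern = re.compile("cat")
    per_match = time_per_call_ms(lambda: pattern.search("cataract"))
    print("match time: %.6f milliseconds" % per_match)
```

Any output produced inside the timed operation would be emitted on every repetition and would count toward the measured time, which is the same reason the text warns against combining -t with -m.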
244    
245  Philip Hazel <ph10@cam.ac.uk>  Philip Hazel <ph10@cam.ac.uk>
246  January 2000  August 2000
