Diff of /code/tags/pcre-7.2/doc/pcretest.txt

revision 53 by nigel, Sat Feb 24 21:39:42 2007 UTC revision 63 by nigel, Sat Feb 24 21:40:03 2007 UTC
# Line 3  NAME Line 3  NAME
3       expressions.       expressions.
4    
5    
   
6  SYNOPSIS  SYNOPSIS
7       pcretest [-d] [-i] [-m] [-o osize] [-p] [-t] [source]  [des-       pcretest [-d] [-i] [-m] [-o osize] [-p] [-t] [source]  [des-
8       tination]       tination]
9    
10       pcretest was written as a test program for the PCRE  regular       pcretest was written as a test program for the PCRE  regular
11       expression  library  itself,  but  it  can  also be used for       expression  library  itself,  but  it  can  also be used for
12       experimenting  with  regular  expressions.  This  man   page       experimenting  with  regular  expressions.   This   document
13       describes  the  features of the test program; for details of       describes  the  features of the test program; for details of
14       the regular expressions themselves, see the pcre man page.       the regular  expressions  themselves,  see  the  pcrepattern
15         documentation.  For details of PCRE and its options, see the
16         pcreapi documentation.
17    
18    
19  OPTIONS  OPTIONS
20    
21    
22         -C        Output the version number of the PCRE library, and
23                   all   available  information  about  the  optional
24                   features that are included, and then exit.
25    
26       -d        Behave as if each regex had the /D  modifier  (see       -d        Behave as if each regex had the /D  modifier  (see
27                 below); the internal form is output after compila-                 below); the internal form is output after compila-
28                 tion.                 tion.
# Line 42  OPTIONS Line 48  OPTIONS
48                 wrapper  API  is  used  to  call PCRE. None of the                 wrapper  API  is  used  to  call PCRE. None of the
49                 other options has any effect when -p is set.                 other options has any effect when -p is set.
50    
51       -t        Run each compile, study,  and  match  20000  times       -t        Run each compile, study, and match many times with
52                 with  a  timer, and output resulting time per com-                 a  timer, and output resulting time per compile or
53                 pile or match (in milliseconds).  Do  not  set  -t                 match (in milliseconds). Do not set  -t  with  -m,
54                 with -m, because you will then get the size output                 because  you  will  then get the size output 20000
55                 20000 times and the timing will be distorted.                 times and the timing will be distorted.
   
56    
57    
58  DESCRIPTION  DESCRIPTION
59    
60       If pcretest is given two filename arguments, it  reads  from       If pcretest is given two filename arguments, it  reads  from
61       the  first and writes to the second. If it is given only one       the  first and writes to the second. If it is given only one
   
   
   
   
 SunOS 5.8                 Last change:                          1  
   
   
   
62       filename argument, it reads from that  file  and  writes  to       filename argument, it reads from that  file  and  writes  to
63       stdout. Otherwise, it reads from stdin and writes to stdout,       stdout. Otherwise, it reads from stdin and writes to stdout,
64       and prompts for each line of input, using  "re>"  to  prompt       and prompts for each line of input, using  "re>"  to  prompt
# Line 70  SunOS 5.8 Last change: Line 68  SunOS 5.8 Last change:
68       The program handles any number of sets of input on a  single       The program handles any number of sets of input on a  single
69       input  file.  Each set starts with a regular expression, and       input  file.  Each set starts with a regular expression, and
70       continues with any  number  of  data  lines  to  be  matched       continues with any  number  of  data  lines  to  be  matched
71       against  the  pattern.  An empty line signals the end of the       against the pattern.
72       data lines, at which point a new regular expression is read.  
73       The  regular  expressions  are  given  enclosed  in any non-       Each line is matched separately and  independently.  If  you
74       alphameric delimiters other than backslash, for example       want  to  do  multiple-line  matches, you have to use the \n
75         escape sequence in a single line of input to encode the
76         newline characters. The maximum length of a data line is
77         30,000 characters.
78    
79         An empty line signals the end of the data  lines,  at  which
80         point  a new regular expression is read. The regular expres-
81         sions are given enclosed in  any  non-alphameric  delimiters
82         other than backslash, for example
83    
84         /(a|bc)x+yz/         /(a|bc)x+yz/
85    
# Line 104  SunOS 5.8 Last change: Line 110  SunOS 5.8 Last change:
110       continuation of the regular expression.       continuation of the regular expression.
111    
112    
   
113  PATTERN MODIFIERS  PATTERN MODIFIERS
114    
115       The pattern may be followed by i, m, s,  or  x  to  set  the       The pattern may be followed by i, m, s,  or  x  to  set  the
116       PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, or PCRE_EXTENDED       PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, or PCRE_EXTENDED
117       options, respectively. For example:       options, respectively. For example:
# Line 165  PATTERN MODIFIERS Line 171  PATTERN MODIFIERS
171       pcre_fullinfo()  after  compiling an expression, and output-       pcre_fullinfo()  after  compiling an expression, and output-
172       ting the information it gets back. If the  pattern  is  stu-       ting the information it gets back. If the  pattern  is  stu-
173       died, the results of that are also output.       died, the results of that are also output.
174    
175       The /D modifier is a  PCRE  debugging  feature,  which  also       The /D modifier is a  PCRE  debugging  feature,  which  also
176       assumes /I.  It causes the internal form of compiled regular       assumes /I.  It causes the internal form of compiled regular
177       expressions to be output after compilation.       expressions to be output after compilation. If  the  pattern
178         was studied, the information returned is also output.
179    
180       The /S modifier causes pcre_study() to be called  after  the       The /S modifier causes pcre_study() to be called  after  the
181       expression  has been compiled, and the results used when the       expression  has been compiled, and the results used when the
# Line 185  PATTERN MODIFIERS Line 193  PATTERN MODIFIERS
193       REG_NEWLINE is set.       REG_NEWLINE is set.
194    
195       The /8 modifier  causes  pcretest  to  call  PCRE  with  the       The /8 modifier  causes  pcretest  to  call  PCRE  with  the
196       PCRE_UTF8  option  set.  This turns on the (currently incom-       PCRE_UTF8  option set. This turns on support for UTF-8 char-
197       plete) support for UTF-8 character handling  in  PCRE,  pro-       acter handling in PCRE, provided that it was  compiled  with
198       vided  that  it was compiled with this support enabled. This       this  support  enabled.  This  modifier also causes any non-
199       modifier also causes any non-printing characters  in  output       printing characters in output strings to  be  printed  using
200       strings  to  be printed using the \x{hh...} notation if they       the \x{hh...} notation if they are valid UTF-8 sequences.
201       are valid UTF-8 sequences.  
202    
203    CALLOUTS
204    
205         If the pattern contains  any  callout  requests,  pcretest's
206         callout function will be called. By default, it displays the
207         callout number, and the start and current positions  in  the
208         text at the callout time. For example, the output
209    
210           --->pqrabcdef
211             0    ^  ^
212    
213         indicates that callout number 0 occurred for a match attempt
214         starting at the fourth character of the subject string, when
215         the pointer was at the seventh character. The callout  func-
216         tion returns zero (carry on matching) by default.
217    
218         Inserting callouts may be helpful  when  using  pcretest  to
219         check  complicated regular expressions. For further informa-
220         tion about callouts, see the pcrecallout documentation.
221    
222         For testing the PCRE library, additional control of  callout
223         behaviour  is available via escape sequences in the data, as
224         described in the following section.  In  particular,  it  is
225         possible to pass in a number as callout data (the default is
226         zero). If the callout function receives a  non-zero  number,
227         it returns that value instead of zero.
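
Example (not part of either revision): the callout display described above can be reproduced with a short Python sketch. The function name and the column widths are assumptions read off the sample output (a number right-aligned in three columns, one space, then carets under the subject text that follows the four-character "--->" prefix). This is not pcretest's own code.

```python
def callout_marker_line(number, start, current):
    """Build the marker line printed under '--->subject'.

    Assumption: the callout number is right-aligned in three columns
    and followed by one space, so the carets line up with the subject
    text that follows the four-character '--->' prefix.
    """
    cells = [" "] * (max(start, current) + 1)
    cells[start] = "^"      # offset where this match attempt started
    cells[current] = "^"    # offset of the pointer at callout time
    return f"{number:>3} " + "".join(cells)


subject = "pqrabcdef"
print("--->" + subject)
print(callout_marker_line(0, 3, 6))
```

Offsets are zero-based, so 3 and 6 correspond to the fourth and seventh characters mentioned in the text.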
228    
229    
230  DATA LINES  DATA LINES
231    
232       Before each data line is passed to pcre_exec(), leading  and       Before each data line is passed to pcre_exec(), leading  and
233       trailing whitespace is removed, and it is then scanned for \       trailing whitespace is removed, and it is then scanned for \
234       escapes. The following are recognized:       escapes.  Some  of  these  are  pretty  esoteric   features,
235         intended  for  checking  out  some  of  the more complicated
236         features of PCRE. If you are just testing "ordinary" regular
237         expressions,  you probably don't need any of these. The fol-
238         lowing escapes are recognized:
239    
240         \a         alarm (= BEL)         \a         alarm (= BEL)
241         \b         backspace         \b         backspace
# Line 209  DATA LINES Line 247  DATA LINES
247         \v         vertical tab         \v         vertical tab
248         \nnn       octal character (up to 3 octal digits)         \nnn       octal character (up to 3 octal digits)
249         \xhh       hexadecimal character (up to 2 hex digits)         \xhh       hexadecimal character (up to 2 hex digits)
250         \x{hh...}  hexadecimal UTF-8 character         \x{hh...}  hexadecimal character, any number of digits
251                        in UTF-8 mode
252         \A         pass the PCRE_ANCHORED option to pcre_exec()         \A         pass the PCRE_ANCHORED option to pcre_exec()
253         \B         pass the PCRE_NOTBOL option to pcre_exec()         \B         pass the PCRE_NOTBOL option to pcre_exec()
254         \Cdd       call pcre_copy_substring() for substring dd         \Cdd       call pcre_copy_substring() for substring dd
255                       after a successful match (any decimal number                      after a successful match (any decimal number
256                       less than 32)                      less than 32)
257           \Cname     call pcre_copy_named_substring() for substring
258                        "name" after a successful match (name terminated
259                        by next non-alphanumeric character)
260           \C+        show the current captured substrings at callout
261                        time
262    
263           \C-        do not supply a callout function
264           \C!n       return 1 instead of 0 when callout number n is
265                        reached
266           \C!n!m     return 1 instead of 0 when callout number n is
267                        reached for the mth time
268           \C*n       pass the number n (may be negative) as callout
269                        data
270         \Gdd       call pcre_get_substring() for substring dd         \Gdd       call pcre_get_substring() for substring dd
271                        after a successful match (any decimal number
272                       after a successful match (any decimal number                      less than 32)
273                       less than 32)         \Gname     call pcre_get_named_substring() for substring
274                        "name" after a successful match (name termin-
275                        ated by next non-alphanumeric character)
276         \L         call pcre_get_substringlist() after a         \L         call pcre_get_substringlist() after a
277                       successful match                      successful match
278           \M         discover the minimum MATCH_LIMIT setting
279         \N         pass the PCRE_NOTEMPTY option to pcre_exec()         \N         pass the PCRE_NOTEMPTY option to pcre_exec()
280         \Odd       set the size of the output vector passed to         \Odd       set the size of the output vector passed to
281                       pcre_exec() to dd (any number of decimal                      pcre_exec() to dd (any number of decimal
282                       digits)                      digits)
283         \Z         pass the PCRE_NOTEOL option to pcre_exec()         \Z         pass the PCRE_NOTEOL option to pcre_exec()
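
Example (not part of either revision): a minimal Python sketch of how the character escapes visible in this diff could be decoded. It handles only \a, \b, \v, \nnn and \xhh. Escapes listed in the parts of the file not shown here, the \x{hh...} form, and the option escapes such as \A or \Z are deliberately left untouched. The function name is hypothetical, and pcretest's own scanner may behave differently in edge cases.

```python
SIMPLE = {"a": "\x07", "b": "\x08", "v": "\x0b"}  # BEL, backspace, vertical tab
OCTAL = "01234567"
HEX = "0123456789abcdefABCDEF"


def decode_escapes(line):
    """Decode a subset of the data-line escapes into characters."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch != "\\" or i + 1 == len(line):
            out.append(ch)
            i += 1
            continue
        nxt = line[i + 1]
        if nxt in SIMPLE:
            out.append(SIMPLE[nxt])
            i += 2
        elif nxt in OCTAL:
            # \nnn: up to 3 octal digits
            j = i + 1
            while j < len(line) and j < i + 4 and line[j] in OCTAL:
                j += 1
            out.append(chr(int(line[i + 1:j], 8)))
            i = j
        elif nxt == "x" and i + 2 < len(line) and line[i + 2] in HEX:
            # \xhh: up to 2 hex digits
            j = i + 2
            while j < len(line) and j < i + 4 and line[j] in HEX:
                j += 1
            out.append(chr(int(line[i + 2:j], 16)))
            i = j
        else:
            # Anything else is outside this sketch; keep the backslash.
            out.append(ch)
            i += 1
    return "".join(out)


print(repr(decode_escapes(r"a\x41\101\v")))
```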
284    
285         If \M is present, pcretest calls pcre_exec() several  times,
286         with  different  values  in  the  match_limit  field  of the
287         pcre_extra data structure, until it finds the minimum number
288         that is needed for pcre_exec() to complete. This number is a
289         measure of the amount of  recursion  and  backtracking  that
290         takes  place,  and  checking  it out can be instructive. For
291         most simple matches, the number is quite small, but for pat-
292         terns  with very large numbers of matching possibilities, it
293         can become large very quickly with increasing length of sub-
294         ject string.
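
Example (not part of either revision): the text says only that pcre_exec() is called several times with different match_limit values. One plausible way to find the minimum is to double the limit until the match completes and then bisect, as in the Python sketch below. The search strategy, the function name, and the match_completes callback (a stand-in for running pcre_exec() with a given limit) are all assumptions. The sketch also assumes that once a limit is sufficient, every larger limit is sufficient too.

```python
def minimum_match_limit(match_completes, upper=10_000_000):
    """Return the smallest limit for which match_completes(limit) is true.

    match_completes is a hypothetical callback standing in for a call
    to pcre_exec() with the given match_limit that reports whether the
    match ran to completion without hitting the limit.
    """
    low, high = 1, 1
    # Phase 1: double the limit until the match completes.
    while not match_completes(high):
        if high >= upper:
            raise ValueError("no limit up to 'upper' is sufficient")
        low = high + 1
        high = min(high * 2, upper)
    # Phase 2: bisect between the last failure and the first success.
    while low < high:
        mid = (low + high) // 2
        if match_completes(mid):
            high = mid
        else:
            low = mid + 1
    return low


# A made-up match that needs a limit of at least 37 to complete.
print(minimum_match_limit(lambda limit: limit >= 37))
```

For this made-up predicate the sketch prints 37. With a real matcher, each probe would be one call to pcre_exec().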
295    
296       When \O is used, it may be higher or lower than the size set       When \O is used, it may be higher or lower than the size set
297       by  the  -o  option (or defaulted to 45); \O applies only to       by  the  -o  option (or defaulted to 45); \O applies only to
298       the call of pcre_exec() for the line in which it appears.       the call of pcre_exec() for the line in which it appears.
# Line 249  DATA LINES Line 314  DATA LINES
314       bytes, encoded according to the UTF-8 rules.       bytes, encoded according to the UTF-8 rules.
315    
316    
   
317  OUTPUT FROM PCRETEST  OUTPUT FROM PCRETEST
318    
319       When a match succeeds, pcretest outputs the list of captured       When a match succeeds, pcretest outputs the list of captured
320       substrings  that pcre_exec() returns, starting with number 0       substrings  that pcre_exec() returns, starting with number 0
321       for the string that matched the whole pattern.  Here  is  an       for the string that matched the whole pattern.  Here  is  an
322       example of an interactive pcretest run.       example of an interactive pcretest run.
323    
324         $ pcretest         $ pcretest
325         PCRE version 2.06 08-Jun-1999         PCRE version 4.00 08-Jan-2003
326    
327           re> /^abc(\d+)/           re> /^abc(\d+)/
328         data> abc123         data> abc123
# Line 307  OUTPUT FROM PCRETEST Line 372  OUTPUT FROM PCRETEST
372       of the \n escape.       of the \n escape.
373    
374    
   
375  AUTHOR  AUTHOR
376    
377       Philip Hazel <ph10@cam.ac.uk>       Philip Hazel <ph10@cam.ac.uk>
378       University Computing Service,       University Computing Service,
      New Museums Site,  
379       Cambridge CB2 3QG, England.       Cambridge CB2 3QG, England.
      Phone: +44 1223 334714  
380    
381       Last updated: 15 August 2001  Last updated: 03 February 2003
382       Copyright (c) 1997-2001 University of Cambridge.  Copyright (c) 1997-2003 University of Cambridge.
