Diff of /code/trunk/doc/pcretest.1

revision 77 by nigel, Sat Feb 24 21:40:45 2007 UTC revision 91 by nigel, Sat Feb 24 21:41:34 2007 UTC
# Line 4  pcretest - a program for testing Perl-co Line 4  pcretest - a program for testing Perl-co
4  .SH SYNOPSIS  .SH SYNOPSIS
5  .rs  .rs
6  .sp  .sp
7  .B pcretest "[-C] [-d] [-dfa] [-i] [-m] [-o osize] [-p] [-t] [source]"  .B pcretest "[options] [source] [destination]"
8  .ti +5n  .sp
 .B "[destination]"  
 .P  
9  \fBpcretest\fP was written as a test program for the PCRE regular expression  \fBpcretest\fP was written as a test program for the PCRE regular expression
10  library itself, but it can also be used for experimenting with regular  library itself, but it can also be used for experimenting with regular
11  expressions. This document describes the features of the test program; for  expressions. This document describes the features of the test program; for
# Line 59  Behave as if each regex has the \fB/P\fP Line 57  Behave as if each regex has the \fB/P\fP
57  used to call PCRE. None of the other options has any effect when \fB-p\fP is  used to call PCRE. None of the other options has any effect when \fB-p\fP is
58  set.  set.
59  .TP 10  .TP 10
60    \fB-q\fP
61    Do not output the version number of \fBpcretest\fP at the start of execution.
62    .TP 10
63    \fB-S\fP \fIsize\fP
64    On Unix-like systems, set the size of the runtime stack to \fIsize\fP
65    megabytes.
66    .TP 10
67  \fB-t\fP  \fB-t\fP
68  Run each compile, study, and match many times with a timer, and output  Run each compile, study, and match many times with a timer, and output
69  resulting time per compile or match (in milliseconds). Do not set \fB-m\fP with  resulting time per compile or match (in milliseconds). Do not set \fB-m\fP with
# Line 80  set starts with a regular expression, an Line 85  set starts with a regular expression, an
85  lines to be matched against the pattern.  lines to be matched against the pattern.
86  .P  .P
87  Each data line is matched separately and independently. If you want to do  Each data line is matched separately and independently. If you want to do
88  multiple-line matches, you have to use the \en escape sequence in a single line  multi-line matches, you have to use the \en escape sequence (or \er or \er\en,
89  of input to encode the newline characters. The maximum length of data line is  depending on the newline setting) in a single line of input to encode the
90  30,000 characters.  newline characters. There is no limit on the length of data lines; the input
91    buffer is automatically extended if it is too small.
92  .P  .P
93  An empty line signals the end of the data lines, at which point a new regular  An empty line signals the end of the data lines, at which point a new regular
94  expression is read. The regular expressions are given enclosed in any  expression is read. The regular expressions are given enclosed in any
95  non-alphanumeric delimiters other than backslash, for example  non-alphanumeric delimiters other than backslash, for example:
96  .sp  .sp
97    /(a|bc)x+yz/    /(a|bc)x+yz/
98  .sp  .sp
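The input layout described in this hunk (a pattern line, then data lines, then an empty line that ends the set) can be sketched as follows. This is only an illustration, not how \fBpcretest\fP itself reads input: the function name `split_test_sets` is hypothetical, and the sketch assumes every pattern fits on one line and does not parse delimiters or modifiers.

```python
def split_test_sets(lines):
    """Group input lines into (pattern_line, [data_lines]) sets.

    Assumption: each pattern occupies exactly one line. Delimiters and
    trailing modifiers are kept as part of the pattern line, unparsed.
    """
    sets = []
    current = None  # the set being collected, or None between sets
    for raw in lines:
        line = raw.rstrip("\n")
        if current is None:
            if line == "":
                continue  # skip blank lines between sets
            current = (line, [])  # first non-empty line is the pattern
        elif line == "":
            sets.append(current)  # an empty line ends the data lines
            current = None
        else:
            current[1].append(line)  # each data line is kept separately
    if current is not None:
        sets.append(current)  # input ended without a closing empty line
    return sets


example = ["/(a|bc)x+yz/", "axyz", "bcxxyz", "", "/abc/", "abc"]
for pattern, data in split_test_sets(example):
    print(pattern, data)
```

Each data line stays a separate list entry, mirroring the statement that data lines are matched separately and independently.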
# Line 134  effect as they do in Perl. For example: Line 140  effect as they do in Perl. For example:
140  The following table shows additional modifiers for setting PCRE options that do  The following table shows additional modifiers for setting PCRE options that do
141  not correspond to anything in Perl:  not correspond to anything in Perl:
142  .sp  .sp
143    \fB/A\fP    PCRE_ANCHORED    \fB/A\fP       PCRE_ANCHORED
144    \fB/C\fP    PCRE_AUTO_CALLOUT    \fB/C\fP       PCRE_AUTO_CALLOUT
145    \fB/E\fP    PCRE_DOLLAR_ENDONLY    \fB/E\fP       PCRE_DOLLAR_ENDONLY
146    \fB/f\fP    PCRE_FIRSTLINE    \fB/f\fP       PCRE_FIRSTLINE
147    \fB/N\fP    PCRE_NO_AUTO_CAPTURE    \fB/J\fP       PCRE_DUPNAMES
148    \fB/U\fP    PCRE_UNGREEDY    \fB/N\fP       PCRE_NO_AUTO_CAPTURE
149    \fB/X\fP    PCRE_EXTRA    \fB/U\fP       PCRE_UNGREEDY
150      \fB/X\fP       PCRE_EXTRA
151      \fB/<cr>\fP    PCRE_NEWLINE_CR
152      \fB/<lf>\fP    PCRE_NEWLINE_LF
153      \fB/<crlf>\fP  PCRE_NEWLINE_CRLF
154    .sp
155    Those specifying line endings are literal strings as shown. Details of the
156    meanings of these PCRE options are given in the
157    .\" HREF
158    \fBpcreapi\fP
159    .\"
160    documentation.
161    .
162    .
163    .SS "Finding all matches in a string"
164    .rs
165  .sp  .sp
166  Searching for all possible matches within each subject string can be requested  Searching for all possible matches within each subject string can be requested
167  by the \fB/g\fP or \fB/G\fP modifier. After finding a match, PCRE is called  by the \fB/g\fP or \fB/G\fP modifier. After finding a match, PCRE is called
# Line 157  flags set in order to search for another Line 178  flags set in order to search for another
178  If this second match fails, the start offset is advanced by one, and the normal  If this second match fails, the start offset is advanced by one, and the normal
179  match is retried. This imitates the way Perl handles such cases when using the  match is retried. This imitates the way Perl handles such cases when using the
180  \fB/g\fP modifier or the \fBsplit()\fP function.  \fB/g\fP modifier or the \fBsplit()\fP function.
181  .P  .
182    .
183    .SS "Other modifiers"
184    .rs
185    .sp
186  There are yet more modifiers for controlling the way \fBpcretest\fP  There are yet more modifiers for controlling the way \fBpcretest\fP
187  operates.  operates.
188  .P  .P
# Line 234  recognized: Line 259  recognized:
259    \ee         escape    \ee         escape
260    \ef         formfeed    \ef         formfeed
261    \en         newline    \en         newline
262    .\" JOIN
263      \eqdd       set the PCRE_MATCH_LIMIT limit to dd
264                   (any number of digits)
265    \er         carriage return    \er         carriage return
266    \et         tab    \et         tab
267    \ev         vertical tab    \ev         vertical tab
# Line 242  recognized: Line 270  recognized:
270  .\" JOIN  .\" JOIN
271    \ex{hh...}  hexadecimal character, any number of digits    \ex{hh...}  hexadecimal character, any number of digits
272                 in UTF-8 mode                 in UTF-8 mode
273    .\" JOIN
274    \eA         pass the PCRE_ANCHORED option to \fBpcre_exec()\fP    \eA         pass the PCRE_ANCHORED option to \fBpcre_exec()\fP
275                   or \fBpcre_dfa_exec()\fP
276    .\" JOIN
277    \eB         pass the PCRE_NOTBOL option to \fBpcre_exec()\fP    \eB         pass the PCRE_NOTBOL option to \fBpcre_exec()\fP
278                   or \fBpcre_dfa_exec()\fP
279  .\" JOIN  .\" JOIN
280    \eCdd       call pcre_copy_substring() for substring dd    \eCdd       call pcre_copy_substring() for substring dd
281                 after a successful match (number less than 32)                 after a successful match (number less than 32)
# Line 276  recognized: Line 308  recognized:
308  .\" JOIN  .\" JOIN
309    \eL         call pcre_get_substringlist() after a    \eL         call pcre_get_substringlist() after a
310                 successful match                 successful match
311    \eM         discover the minimum MATCH_LIMIT setting  .\" JOIN
312      \eM         discover the minimum MATCH_LIMIT and
313                   MATCH_LIMIT_RECURSION settings
314    .\" JOIN
315    \eN         pass the PCRE_NOTEMPTY option to \fBpcre_exec()\fP    \eN         pass the PCRE_NOTEMPTY option to \fBpcre_exec()\fP
316                   or \fBpcre_dfa_exec()\fP
317  .\" JOIN  .\" JOIN
318    \eOdd       set the size of the output vector passed to    \eOdd       set the size of the output vector passed to
319                 \fBpcre_exec()\fP to dd (any number of digits)                 \fBpcre_exec()\fP to dd (any number of digits)
320  .\" JOIN  .\" JOIN
321    \eP         pass the PCRE_PARTIAL option to \fBpcre_exec()\fP    \eP         pass the PCRE_PARTIAL option to \fBpcre_exec()\fP
322                 or \fBpcre_dfa_exec()\fP                 or \fBpcre_dfa_exec()\fP
323    .\" JOIN
324      \eQdd       set the PCRE_MATCH_LIMIT_RECURSION limit to dd
325                   (any number of digits)
326    \eR         pass the PCRE_DFA_RESTART option to \fBpcre_dfa_exec()\fP    \eR         pass the PCRE_DFA_RESTART option to \fBpcre_dfa_exec()\fP
327    \eS         output details of memory get/free calls during matching    \eS         output details of memory get/free calls during matching
328    .\" JOIN
329    \eZ         pass the PCRE_NOTEOL option to \fBpcre_exec()\fP    \eZ         pass the PCRE_NOTEOL option to \fBpcre_exec()\fP
330                   or \fBpcre_dfa_exec()\fP
331  .\" JOIN  .\" JOIN
332    \e?         pass the PCRE_NO_UTF8_CHECK option to    \e?         pass the PCRE_NO_UTF8_CHECK option to
333                 \fBpcre_exec()\fP                 \fBpcre_exec()\fP or \fBpcre_dfa_exec()\fP
334    \e>dd       start the match at offset dd (any number of digits);    \e>dd       start the match at offset dd (any number of digits);
335    .\" JOIN
336                 this sets the \fIstartoffset\fP argument for \fBpcre_exec()\fP                 this sets the \fIstartoffset\fP argument for \fBpcre_exec()\fP
337                   or \fBpcre_dfa_exec()\fP
338    .\" JOIN
339      \e<cr>      pass the PCRE_NEWLINE_CR option to \fBpcre_exec()\fP
340                   or \fBpcre_dfa_exec()\fP
341    .\" JOIN
342      \e<lf>      pass the PCRE_NEWLINE_LF option to \fBpcre_exec()\fP
343                   or \fBpcre_dfa_exec()\fP
344    .\" JOIN
345      \e<crlf>    pass the PCRE_NEWLINE_CRLF option to \fBpcre_exec()\fP
346                   or \fBpcre_dfa_exec()\fP
347  .sp  .sp
348    The escapes that specify line endings are literal strings, exactly as shown.
349  A backslash followed by anything else just escapes the anything else. If the  A backslash followed by anything else just escapes the anything else. If the
350  very last character is a backslash, it is ignored. This gives a way of passing  very last character is a backslash, it is ignored. This gives a way of passing
351  an empty line as data, since a real empty line terminates the data input.  an empty line as data, since a real empty line terminates the data input.
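The character escapes in the table above, together with the two rules in this paragraph (an unrecognized escape yields the escaped character itself, and a final backslash is dropped), can be sketched as a small decoder. This is a hedged illustration only: `decode_data_line` is a hypothetical name, only the character escapes visible in this diff are handled, and the option-setting sequences (such as \eA, \eOdd, \e?, \e>dd or \e<cr>) are not modelled, so the sketch rejects any escaped alphanumeric character or `?`, `<`, `>` that it does not know rather than guess.

```python
import string

# Character escapes shown in the table above (a subset of what pcretest accepts).
SIMPLE_ESCAPES = {"e": "\x1b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}


def decode_data_line(line):
    """Decode the character escapes in one data line (sketch, subset only)."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 == len(line):
            break  # a backslash as the very last character is ignored
        nxt = line[i + 1]
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 2
            continue
        if nxt == "x" and line.startswith("{", i + 2):
            end = line.find("}", i + 3)
            digits = line[i + 3:end] if end != -1 else ""
            if digits and all(d in string.hexdigits for d in digits):
                out.append(chr(int(digits, 16)))  # \x{hh...}, any number of digits
                i = end + 1
                continue
        if nxt.isalnum() or nxt in "?<>":
            # Possibly an option-setting sequence; deliberately out of scope here.
            raise ValueError("escape not modelled in this sketch: \\" + nxt)
        out.append(nxt)  # backslash + anything else escapes that character
        i += 2
    return "".join(out)


print(repr(decode_data_line("abc\\ndef")))
print(repr(decode_data_line("\\")))
```

A data line consisting of a single backslash decodes to the empty string, which is how an empty subject can be supplied without ending the data input.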
352  .P  .P
353  If \eM is present, \fBpcretest\fP calls \fBpcre_exec()\fP several times, with  If \eM is present, \fBpcretest\fP calls \fBpcre_exec()\fP several times, with
354  different values in the \fImatch_limit\fP field of the \fBpcre_extra\fP data  different values in the \fImatch_limit\fP and \fImatch_limit_recursion\fP
355  structure, until it finds the minimum number that is needed for  fields of the \fBpcre_extra\fP data structure, until it finds the minimum
356  \fBpcre_exec()\fP to complete. This number is a measure of the amount of  numbers for each parameter that allow \fBpcre_exec()\fP to complete. The
357  recursion and backtracking that takes place, and checking it out can be  \fImatch_limit\fP number is a measure of the amount of backtracking that takes
358  instructive. For most simple matches, the number is quite small, but for  place, and checking it out can be instructive. For most simple matches, the
359  patterns with very large numbers of matching possibilities, it can become large  number is quite small, but for patterns with very large numbers of matching
360  very quickly with increasing length of subject string.  possibilities, it can become large very quickly with increasing length of
361    subject string. The \fImatch_limit_recursion\fP number is a measure of how much
362    stack (or, if PCRE is compiled with NO_RECURSE, how much heap) memory is needed
363    to complete the match attempt.
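One way to find such a minimum is to raise a trial limit until the match completes and then bisect between the last failing and first succeeding values. The sketch below illustrates that idea; it is not taken from the \fBpcretest\fP source. The names `find_min_limit` and `completes` are hypothetical: `completes(limit)` stands in for one \fBpcre_exec()\fP call made with the given limit, and is assumed to be monotonic (if a limit is enough, every larger limit is enough) and to fail for a limit of 0.

```python
def find_min_limit(completes, start=64):
    """Return the smallest limit for which completes(limit) is true.

    Assumptions: completes is monotonic in its argument, and a limit of 0
    never suffices. Both are labelled assumptions of this sketch.
    """
    lo, hi = 0, start
    # Phase 1: double the trial limit until the match attempt completes.
    while not completes(hi):
        lo, hi = hi, hi * 2
    # Phase 2: bisect; lo is known (or assumed) to fail, hi is known to work.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if completes(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Stand-in for a match that needs a limit of at least 1000 to complete.
print(find_min_limit(lambda limit: limit >= 1000))
```

The same search could be run once for \fImatch_limit\fP and once for \fImatch_limit_recursion\fP, since the text describes a separate minimum for each parameter.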
364  .P  .P
365  When \eO is used, the value specified may be higher or lower than the size set  When \eO is used, the value specified may be higher or lower than the size set
366  by the \fB-O\fP command line option (or defaulted to 45); \eO applies only to  by the \fB-O\fP command line option (or defaulted to 45); \eO applies only to
367  the call of \fBpcre_exec()\fP for the line in which it appears.  the call of \fBpcre_exec()\fP for the line in which it appears.
368  .P  .P
369  If the \fB/P\fP modifier was present on the pattern, causing the POSIX wrapper  If the \fB/P\fP modifier was present on the pattern, causing the POSIX wrapper
370  API to be used, only \eB and \eZ have any effect, causing REG_NOTBOL and  API to be used, the only option-setting sequences that have any effect are \eB
371  REG_NOTEOL to be passed to \fBregexec()\fP respectively.  and \eZ, causing REG_NOTBOL and REG_NOTEOL, respectively, to be passed to
372    \fBregexec()\fP.
373  .P  .P
374  The use of \ex{hh...} to represent UTF-8 characters is not dependent on the use  The use of \ex{hh...} to represent UTF-8 characters is not dependent on the use
375  of the \fB/8\fP modifier on the pattern. It is recognized always. There may be  of the \fB/8\fP modifier on the pattern. It is recognized always. There may be
# Line 397  parentheses after each string for \fB\eC Line 454  parentheses after each string for \fB\eC
454  .P  .P
455  Note that while patterns can be continued over several lines (a plain ">"  Note that while patterns can be continued over several lines (a plain ">"
456  prompt is used for continuations), data lines may not. However newlines can be  prompt is used for continuations), data lines may not. However newlines can be
457  included in data by means of the \en escape.  included in data by means of the \en escape (or \er or \er\en for those newline
458    settings).
459  .  .
460  .  .
461  .SH "OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION"  .SH "OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION"
# Line 568  University Computing Service, Line 626  University Computing Service,
626  Cambridge CB2 3QG, England.  Cambridge CB2 3QG, England.
627  .P  .P
628  .in 0  .in 0
629  Last updated: 28 February 2005  Last updated: 29 June 2006
630  .br  .br
631  Copyright (c) 1997-2005 University of Cambridge.  Copyright (c) 1997-2006 University of Cambridge.
