| 97 |
stdout, and prompts for each line of input, using "re>" to prompt for regular |
stdout, and prompts for each line of input, using "re>" to prompt for regular |
| 98 |
expressions, and "data>" to prompt for data lines. |
expressions, and "data>" to prompt for data lines. |
| 99 |
.P |
.P |
| 100 |
|
When \fBpcretest\fP is built, a configuration option can specify that it should |
| 101 |
|
be linked with the \fBlibreadline\fP library. When this is done, if the input |
| 102 |
|
is from a terminal, it is read using the \fBreadline()\fP function. This |
| 103 |
|
provides line-editing and history facilities. The output from the \fB-help\fP |
| 104 |
|
option states whether or not \fBreadline()\fP will be used. |
| 105 |
|
.P |
| 106 |
The program handles any number of sets of input on a single input file. Each |
The program handles any number of sets of input on a single input file. Each |
| 107 |
set starts with a regular expression, and continues with any number of data |
set starts with a regular expression, and continues with any number of data |
| 108 |
lines to be matched against the pattern. |
lines to be matched against the pattern. |
| 163 |
The following table shows additional modifiers for setting PCRE options that do |
The following table shows additional modifiers for setting PCRE options that do |
| 164 |
not correspond to anything in Perl: |
not correspond to anything in Perl: |
| 165 |
.sp |
.sp |
| 166 |
\fB/A\fP PCRE_ANCHORED |
\fB/A\fP PCRE_ANCHORED |
| 167 |
\fB/C\fP PCRE_AUTO_CALLOUT |
\fB/C\fP PCRE_AUTO_CALLOUT |
| 168 |
\fB/E\fP PCRE_DOLLAR_ENDONLY |
\fB/E\fP PCRE_DOLLAR_ENDONLY |
| 169 |
\fB/f\fP PCRE_FIRSTLINE |
\fB/f\fP PCRE_FIRSTLINE |
| 170 |
\fB/J\fP PCRE_DUPNAMES |
\fB/J\fP PCRE_DUPNAMES |
| 171 |
\fB/N\fP PCRE_NO_AUTO_CAPTURE |
\fB/N\fP PCRE_NO_AUTO_CAPTURE |
| 172 |
\fB/U\fP PCRE_UNGREEDY |
\fB/U\fP PCRE_UNGREEDY |
| 173 |
\fB/X\fP PCRE_EXTRA |
\fB/X\fP PCRE_EXTRA |
| 174 |
\fB/<cr>\fP PCRE_NEWLINE_CR |
\fB/<cr>\fP PCRE_NEWLINE_CR |
| 175 |
\fB/<lf>\fP PCRE_NEWLINE_LF |
\fB/<lf>\fP PCRE_NEWLINE_LF |
| 176 |
\fB/<crlf>\fP PCRE_NEWLINE_CRLF |
\fB/<crlf>\fP PCRE_NEWLINE_CRLF |
| 177 |
\fB/<anycrlf>\fP PCRE_NEWLINE_ANYCRLF |
\fB/<anycrlf>\fP PCRE_NEWLINE_ANYCRLF |
| 178 |
\fB/<any>\fP PCRE_NEWLINE_ANY |
\fB/<any>\fP PCRE_NEWLINE_ANY |
| 179 |
.sp |
\fB/<bsr_anycrlf>\fP PCRE_BSR_ANYCRLF |
| 180 |
Those specifying line ending sequencess are literal strings as shown. This |
\fB/<bsr_unicode>\fP PCRE_BSR_UNICODE |
| 181 |
example sets multiline matching with CRLF as the line ending sequence: |
.sp |
| 182 |
|
Those specifying line ending sequences are literal strings as shown, but the |
| 183 |
|
letters can be in either case. This example sets multiline matching with CRLF |
| 184 |
|
as the line ending sequence: |
| 185 |
.sp |
.sp |
| 186 |
/^abc/m<crlf> |
/^abc/m<crlf> |
| 187 |
.sp |
.sp |
| 420 |
The use of \ex{hh...} to represent UTF-8 characters is not dependent on the use |
The use of \ex{hh...} to represent UTF-8 characters is not dependent on the use |
| 421 |
of the \fB/8\fP modifier on the pattern. It is recognized always. There may be |
of the \fB/8\fP modifier on the pattern. It is recognized always. There may be |
| 422 |
any number of hexadecimal digits inside the braces. The result is from one to |
any number of hexadecimal digits inside the braces. The result is from one to |
| 423 |
six bytes, encoded according to the UTF-8 rules. |
six bytes, encoded according to the original UTF-8 rules of RFC 2279. This |
| 424 |
|
allows for values in the range 0 to 0x7FFFFFFF. Note that not all of those are |
| 425 |
|
valid Unicode code points, or indeed valid UTF-8 characters according to the |
| 426 |
|
later rules in RFC 3629. |
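As an illustration of the one-to-six-byte encoding described above, the following is a minimal Python sketch of the original RFC 2279 scheme. It is not the code used by pcretest; the function name encode_rfc2279 is hypothetical and chosen only for this example.

```python
def encode_rfc2279(value):
    """Encode an integer in 0..0x7FFFFFFF with the original UTF-8 rules.

    Hypothetical helper for illustration; not part of PCRE or pcretest.
    """
    if not 0 <= value <= 0x7FFFFFFF:
        raise ValueError("value out of range for RFC 2279 UTF-8")
    if value < 0x80:
        # Values below 0x80 are a single byte, identical to ASCII.
        return bytes([value])
    # (largest value, lead-byte prefix) for sequences of 2 to 6 bytes.
    limits = [
        (0x7FF, 0xC0),
        (0xFFFF, 0xE0),
        (0x1FFFFF, 0xF0),
        (0x3FFFFFF, 0xF8),
        (0x7FFFFFFF, 0xFC),
    ]
    for extra, (limit, lead) in enumerate(limits, start=1):
        if value <= limit:
            break
    # Peel off six bits at a time for the continuation bytes (10xxxxxx).
    tail = []
    for _ in range(extra):
        tail.append(0x80 | (value & 0x3F))
        value >>= 6
    # The remaining high bits go into the lead byte.
    return bytes([lead | value] + tail[::-1])
```

For values that are valid Unicode code points, the result agrees with Python's own UTF-8 encoder. Values above 0x10FFFF produce the five- and six-byte forms that RFC 3629 later disallowed.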
| 427 |
. |
. |
| 428 |
. |
. |
| 429 |
.SH "THE ALTERNATIVE MATCHING FUNCTION" |
.SH "THE ALTERNATIVE MATCHING FUNCTION" |
| 469 |
data> xyz |
data> xyz |
| 470 |
No match |
No match |
| 471 |
.sp |
.sp |
| 472 |
|
Note that unset capturing substrings that are not followed by one that is set |
| 473 |
|
are not returned by \fBpcre_exec()\fP, and are not shown by \fBpcretest\fP. In |
| 474 |
|
the following example, there are two capturing substrings, but when the first |
| 475 |
|
data line is matched, the second, unset substring is not shown. An "internal" |
| 476 |
|
unset substring is shown as "<unset>", as for the second data line. |
| 477 |
|
.sp |
| 478 |
|
re> /(a)|(b)/ |
| 479 |
|
data> a |
| 480 |
|
0: a |
| 481 |
|
1: a |
| 482 |
|
data> b |
| 483 |
|
0: b |
| 484 |
|
1: <unset> |
| 485 |
|
2: b |
| 486 |
|
.sp |
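The display rule in the example above can be imitated with a short Python sketch. It uses Python's own re module rather than PCRE, so it only simulates the behaviour: Python reports every group, and the hypothetical helper show_captures drops the trailing unset ones by hand.

```python
import re


def show_captures(pattern, subject):
    """Return output lines in the style of the example above.

    Hypothetical helper for illustration; it uses Python's re module,
    not pcre_exec(), and trims trailing unset groups explicitly.
    """
    m = re.search(pattern, subject)
    if m is None:
        return ["No match"]
    # Group 0 is the whole match, followed by the capturing groups.
    groups = [m.group(0)] + list(m.groups())
    # Unset groups not followed by a set one are not shown.
    while groups[-1] is None:
        groups.pop()
    # An "internal" unset group is shown as <unset>.
    return [
        "%d: %s" % (i, "<unset>" if g is None else g)
        for i, g in enumerate(groups)
    ]
```

Calling show_captures with the pattern (a)|(b) and the data lines a and b yields the same sets of lines as the example.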
| 487 |
If the strings contain any non-printing characters, they are output as \e0x |
If the strings contain any non-printing characters, they are output as \e0x |
| 488 |
escapes, or as \ex{...} escapes if the \fB/8\fP modifier was present on the |
escapes, or as \ex{...} escapes if the \fB/8\fP modifier was present on the |
| 489 |
pattern. See below for the definition of non-printing characters. If the |
pattern. See below for the definition of non-printing characters. If the |
| 717 |
.rs |
.rs |
| 718 |
.sp |
.sp |
| 719 |
.nf |
.nf |
| 720 |
Last updated: 24 April 2007 |
Last updated: 18 December 2007 |
| 721 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2007 University of Cambridge. |
| 722 |
.fi |
.fi |