| 9 |
.\" HREF |
.\" HREF |
| 10 |
\fBpcresyntax\fP |
\fBpcresyntax\fP |
| 11 |
.\" |
.\" |
| 12 |
page. Perl's regular expressions are described in its own documentation, and |
page. PCRE tries to match Perl syntax and semantics as closely as it can. PCRE |
| 13 |
|
also supports some alternative regular expression syntax (which does not |
| 14 |
|
conflict with the Perl syntax) in order to provide some compatibility with |
| 15 |
|
regular expressions in Python, .NET, and Oniguruma. |
| 16 |
|
.P |
| 17 |
|
Perl's regular expressions are described in its own documentation, and |
| 18 |
regular expressions in general are covered in a number of books, some of which |
regular expressions in general are covered in a number of books, some of which |
| 19 |
have copious examples. Jeffrey Friedl's "Mastering Regular Expressions", |
have copious examples. Jeffrey Friedl's "Mastering Regular Expressions", |
| 20 |
published by O'Reilly, covers regular expressions in great detail. This |
published by O'Reilly, covers regular expressions in great detail. This |
| 315 |
.\" |
.\" |
| 316 |
. |
. |
| 317 |
. |
. |
| 318 |
|
.SS "Absolute and relative subroutine calls" |
| 319 |
|
.rs |
| 320 |
|
.sp |
| 321 |
|
For compatibility with Oniguruma, the non-Perl syntax \eg followed by a name or |
| 322 |
|
a number enclosed either in angle brackets or single quotes, is an alternative |
| 323 |
|
syntax for referencing a subpattern as a "subroutine". Details are discussed |
| 324 |
|
.\" HTML <a href="#onigurumasubroutines"> |
| 325 |
|
.\" </a> |
| 326 |
|
later. |
| 327 |
|
.\" |
| 328 |
|
Note that \eg{...} (Perl syntax) and \eg<...> (Oniguruma syntax) are \fInot\fP |
| 329 |
|
synonymous. The former is a back reference; the latter is a subroutine call. |
| 330 |
|
. |
| 331 |
|
. |
| 332 |
.SS "Generic character types" |
.SS "Generic character types" |
| 333 |
.rs |
.rs |
| 334 |
.sp |
.sp |
| 1053 |
branch is abandoned before the option setting. This is because the effects of |
branch is abandoned before the option setting. This is because the effects of |
| 1054 |
option settings happen at compile time. There would be some very weird |
option settings happen at compile time. There would be some very weird |
| 1055 |
behaviour otherwise. |
behaviour otherwise. |
| 1056 |
|
.P |
| 1057 |
|
\fBNote:\fP There are other PCRE-specific options that can be set by the |
| 1058 |
|
application when the compile or match functions are called. In some cases the |
| 1059 |
|
pattern can contain special leading sequences to override what the application |
| 1060 |
|
has set or what has been defaulted. Details are given in the section entitled |
| 1061 |
|
.\" HTML <a href="#newlineseq"> |
| 1062 |
|
.\" </a> |
| 1063 |
|
"Newline sequences" |
| 1064 |
|
.\" |
| 1065 |
|
above. |
| 1066 |
. |
. |
| 1067 |
. |
. |
| 1068 |
.\" HTML <a name="subpattern"></a> |
.\" HTML <a name="subpattern"></a> |
| 1259 |
which may be several bytes long (and they may be of different lengths). |
which may be several bytes long (and they may be of different lengths). |
| 1260 |
.P |
.P |
| 1261 |
The quantifier {0} is permitted, causing the expression to behave as if the |
The quantifier {0} is permitted, causing the expression to behave as if the |
| 1262 |
previous item and the quantifier were not present. |
previous item and the quantifier were not present. This may be useful for |
| 1263 |
|
subpatterns that are referenced as |
| 1264 |
|
.\" HTML <a href="#subpatternsassubroutines"> |
| 1265 |
|
.\" </a> |
| 1266 |
|
subroutines |
| 1267 |
|
.\" |
| 1268 |
|
from elsewhere in the pattern. Items other than subpatterns that have a {0} |
| 1269 |
|
quantifier are omitted from the compiled pattern. |
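The {0} behaviour described in the added lines can be sketched as follows. This is a minimal illustration that assumes Python's standard `re` module as a stand-in for PCRE. `re` has no subroutine calls, so only the "as if not present" part is shown, and the pattern and variable names are hypothetical.

```python
import re

# Assumption: Python's re module stands in for PCRE; both accept {0}.
pattern = re.compile(r"x(abc){0}y")

# The group and its {0} quantifier behave as if they were absent.
skipped = pattern.fullmatch("xy")

# The quantified group may not match even once.
present = pattern.fullmatch("xabcy")

print(skipped is not None, present is None)
```

In PCRE the group would still be compiled, so it could be called as a subroutine from elsewhere in the pattern. That part cannot be reproduced with `re`.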
| 1270 |
.P |
.P |
| 1271 |
For convenience, the three most common quantifiers have single-character |
For convenience, the three most common quantifiers have single-character |
| 1272 |
abbreviations: |
abbreviations: |
| 2049 |
processing option does not affect the called subpattern. |
processing option does not affect the called subpattern. |
| 2050 |
. |
. |
| 2051 |
. |
. |
| 2052 |
|
.\" HTML <a name="onigurumasubroutines"></a> |
| 2053 |
|
.SH "ONIGURUMA SUBROUTINE SYNTAX" |
| 2054 |
|
.rs |
| 2055 |
|
.sp |
| 2056 |
|
For compatibility with Oniguruma, the non-Perl syntax \eg followed by a name or |
| 2057 |
|
a number enclosed either in angle brackets or single quotes, is an alternative |
| 2058 |
|
syntax for referencing a subpattern as a subroutine, possibly recursively. Here |
| 2059 |
|
are two of the examples used above, rewritten using this syntax: |
| 2060 |
|
.sp |
| 2061 |
|
(?<pn> \e( ( (?>[^()]+) | \eg<pn> )* \e) ) |
| 2062 |
|
(sens|respons)e and \eg'1'ibility |
| 2063 |
|
.sp |
| 2064 |
|
PCRE supports an extension to Oniguruma: if a number is preceded by a |
| 2065 |
|
plus or a minus sign it is taken as a relative reference. For example: |
| 2066 |
|
.sp |
| 2067 |
|
(abc)(?i:\eg<-1>) |
| 2068 |
|
.sp |
| 2069 |
|
Note that \eg{...} (Perl syntax) and \eg<...> (Oniguruma syntax) are \fInot\fP |
| 2070 |
|
synonymous. The former is a back reference; the latter is a subroutine call. |
| 2071 |
|
. |
| 2072 |
|
. |
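The difference between a back reference and a subroutine call can be sketched with the "sense and responsibility" example from this section. The sketch assumes Python's standard `re` module, which supports neither \g{...} nor \g<...> inside a pattern. The back reference is therefore written as `\1`, and the subroutine call is emulated by writing the subpattern out a second time. All variable names are illustrative.

```python
import re

# Back reference: \1 must match the same text that group 1 captured.
backref = re.compile(r"(sens|respons)e and \1ibility")

# Subroutine call, emulated: the subpattern itself is applied again,
# so it may match different text from what group 1 captured.
subroutine = re.compile(r"(sens|respons)e and (?:sens|respons)ibility")

same = "sense and sensibility"
mixed = "sense and responsibility"

backref_same = backref.fullmatch(same) is not None
backref_mixed = backref.fullmatch(mixed) is not None
sub_same = subroutine.fullmatch(same) is not None
sub_mixed = subroutine.fullmatch(mixed) is not None

print(backref_same, backref_mixed, sub_same, sub_mixed)
```

Only the emulated subroutine form accepts the mixed string, because a back reference repeats the captured text while a subroutine call reruns the subpattern.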
| 2073 |
.SH CALLOUTS |
.SH CALLOUTS |
| 2074 |
.rs |
.rs |
| 2075 |
.sp |
.sp |
| 2115 |
production code should be noted to avoid problems during upgrades." The same |
production code should be noted to avoid problems during upgrades." The same |
| 2116 |
remarks apply to the PCRE features described in this section. |
remarks apply to the PCRE features described in this section. |
| 2117 |
.P |
.P |
| 2118 |
Since these verbs are specifically related to backtracking, they can be used |
Since these verbs are specifically related to backtracking, most of them can be |
| 2119 |
only when the pattern is to be matched using \fBpcre_exec()\fP, which uses a |
used only when the pattern is to be matched using \fBpcre_exec()\fP, which uses |
| 2120 |
backtracking algorithm. They cause an error if encountered by |
a backtracking algorithm. With the exception of (*FAIL), which behaves like a |
| 2121 |
|
failing negative assertion, they cause an error if encountered by |
| 2122 |
\fBpcre_dfa_exec()\fP. |
\fBpcre_dfa_exec()\fP. |
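The comparison of (*FAIL) with a failing negative assertion can be illustrated roughly as follows. The sketch assumes Python's standard `re` module, which does not support backtracking verbs, so an empty negative lookahead `(?!)` stands in for (*FAIL). The pattern is hypothetical.

```python
import re

# (?!) is a negative lookahead with an empty body. It can never succeed,
# so the matcher is forced to backtrack at that point.
pattern = re.compile(r"a(?!)|b")

# The first alternative always fails after consuming "a",
# so only the second alternative can produce a match.
match = pattern.search("ab")

print(match.group(0) if match else None)
```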
| 2123 |
.P |
.P |
| 2124 |
The new verbs make use of what was previously invalid syntax: an opening |
The new verbs make use of what was previously invalid syntax: an opening |
| 2240 |
.rs |
.rs |
| 2241 |
.sp |
.sp |
| 2242 |
.nf |
.nf |
| 2243 |
Last updated: 14 September 2007 |
Last updated: 19 April 2008 |
| 2244 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2008 University of Cambridge. |
| 2245 |
.fi |
.fi |