| 79 |
changes the convention to CR. That pattern matches "a\enb" because LF is no |
changes the convention to CR. That pattern matches "a\enb" because LF is no |
| 80 |
longer a newline. Note that these special settings, which are not |
longer a newline. Note that these special settings, which are not |
| 81 |
Perl-compatible, are recognized only at the very start of a pattern, and that |
Perl-compatible, are recognized only at the very start of a pattern, and that |
| 82 |
they must be in upper case. |
they must be in upper case. If more than one of them is present, the last one |
| 83 |
|
is used. |
| 84 |
|
.P |
| 85 |
|
The newline convention does not affect what the \eR escape sequence matches. By |
| 86 |
|
default, this is any Unicode newline sequence, for Perl compatibility. However, |
| 87 |
|
this can be changed; see the description of \eR in the section entitled |
| 88 |
|
.\" HTML <a href="#newlineseq"> |
| 89 |
|
.\" </a> |
| 90 |
|
"Newline sequences" |
| 91 |
|
.\" |
| 92 |
|
below. A change of \eR setting can be combined with a change of newline |
| 93 |
|
convention. |
| 94 |
. |
. |
| 95 |
. |
. |
| 96 |
.SH "CHARACTERS AND METACHARACTERS" |
.SH "CHARACTERS AND METACHARACTERS" |
| 399 |
is discouraged. |
is discouraged. |
| 400 |
. |
. |
| 401 |
. |
. |
| 402 |
|
.\" HTML <a name="newlineseq"></a> |
| 403 |
.SS "Newline sequences" |
.SS "Newline sequences" |
| 404 |
.rs |
.rs |
| 405 |
.sp |
.sp |
| 406 |
Outside a character class, the escape sequence \eR matches any Unicode newline |
Outside a character class, by default, the escape sequence \eR matches any |
| 407 |
sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \eR is equivalent to |
Unicode newline sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \eR is |
| 408 |
the following: |
equivalent to the following: |
| 409 |
.sp |
.sp |
| 410 |
(?>\er\en|\en|\ex0b|\ef|\er|\ex85) |
(?>\er\en|\en|\ex0b|\ef|\er|\ex85) |
| 411 |
.sp |
.sp |
| 425 |
Unicode character property support is not needed for these characters to be |
Unicode character property support is not needed for these characters to be |
| 426 |
recognized. |
recognized. |
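As an illustration only, the non-UTF-8 equivalence shown above can be sketched as a small C function. This is not PCRE's implementation; the function name bsr_match_len is hypothetical. It returns the number of bytes that \eR would consume at the start of a buffer, treating CRLF as a single unit in the way the atomic group does.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE.
   Returns the length in bytes of the newline sequence at the start of
   s[0..len), or 0 if there is none. Mirrors the non-UTF-8 alternation
   (?>\r\n|\n|\x0b|\f|\r|\x85): CR followed by LF is taken as one
   two-byte sequence, every other alternative is a single byte. */
size_t bsr_match_len(const unsigned char *s, size_t len)
{
    if (len == 0)
        return 0;

    switch (s[0]) {
    case '\r':
        /* Prefer CRLF when the LF is present, otherwise a lone CR. */
        return (len > 1 && s[1] == '\n') ? 2 : 1;
    case '\n':   /* LF */
    case 0x0b:   /* VT */
    case '\f':   /* FF */
    case 0x85:   /* NEL */
        return 1;
    default:
        return 0;
    }
}
```

Because the CRLF case is checked first and never given back, a subject containing CR LF yields one match of length 2 rather than two matches of length 1.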
| 427 |
.P |
.P |
| 428 |
|
It is possible to restrict \eR to match only CR, LF, or CRLF (instead of the |
| 429 |
|
complete set of Unicode line endings) by setting the option PCRE_BSR_ANYCRLF |
| 430 |
|
either at compile time or when the pattern is matched. (BSR is an abbreviation
| 431 |
|
for "backslash R".) This can be made the default when PCRE is built; if this is |
| 432 |
|
the case, the other behaviour can be requested via the PCRE_BSR_UNICODE option. |
| 433 |
|
It is also possible to specify these settings by starting a pattern string with |
| 434 |
|
one of the following sequences: |
| 435 |
|
.sp |
| 436 |
|
(*BSR_ANYCRLF) CR, LF, or CRLF only |
| 437 |
|
(*BSR_UNICODE) any Unicode newline sequence |
| 438 |
|
.sp |
| 439 |
|
These override the default and the options given to \fBpcre_compile()\fP, but |
| 440 |
|
they can be overridden by options given to \fBpcre_exec()\fP. Note that these |
| 441 |
|
special settings, which are not Perl-compatible, are recognized only at the |
| 442 |
|
very start of a pattern, and that they must be in upper case. If more than one |
| 443 |
|
of them is present, the last one is used. They can be combined with a change of |
| 444 |
|
newline convention; for example, a pattern can start with:
| 445 |
|
.sp |
| 446 |
|
(*ANY)(*BSR_ANYCRLF) |
| 447 |
|
.sp |
| 448 |
Inside a character class, \eR matches the letter "R". |
Inside a character class, \eR matches the letter "R". |
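The rules for the leading sequences (start of pattern only, upper case only, last one wins, may be mixed with a newline convention item) can be sketched in C as follows. This is a minimal illustration, not PCRE's own parser: the enum and the function leading_bsr are hypothetical names. The newline convention items it skips over are (*CR), (*LF), (*CRLF), (*ANYCRLF) and (*ANY), the set PCRE documents for that purpose.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types and names, for illustration only. */
enum bsr_setting { BSR_DEFAULT, BSR_ANYCRLF, BSR_UNICODE };

/* Scan the special items at the very start of a pattern. Returns the
   last \R setting seen (BSR_DEFAULT if none) and stores in *rest the
   offset of the first byte after the leading items. Comparison is
   case-sensitive, so lower-case spellings are not recognized. */
enum bsr_setting leading_bsr(const char *pattern, size_t *rest)
{
    static const char *const newline_items[] = {
        "(*CR)", "(*LF)", "(*CRLF)", "(*ANYCRLF)", "(*ANY)"
    };
    static const char anycrlf[] = "(*BSR_ANYCRLF)";
    static const char unicode[] = "(*BSR_UNICODE)";
    enum bsr_setting setting = BSR_DEFAULT;
    size_t pos = 0;

    for (;;) {
        size_t i, n = 0;

        if (strncmp(pattern + pos, anycrlf, sizeof anycrlf - 1) == 0) {
            setting = BSR_ANYCRLF;          /* a later item overrides */
            pos += sizeof anycrlf - 1;
            continue;
        }
        if (strncmp(pattern + pos, unicode, sizeof unicode - 1) == 0) {
            setting = BSR_UNICODE;
            pos += sizeof unicode - 1;
            continue;
        }
        /* Step over a newline convention item without interpreting it. */
        for (i = 0; i < sizeof newline_items / sizeof newline_items[0]; i++) {
            size_t len = strlen(newline_items[i]);
            if (strncmp(pattern + pos, newline_items[i], len) == 0) {
                n = len;
                break;
            }
        }
        if (n == 0)
            break;                          /* ordinary pattern text */
        pos += n;
    }

    *rest = pos;
    return setting;
}
```

With this sketch, a pattern that starts with (*ANY)(*BSR_ANYCRLF) yields BSR_ANYCRLF, and the same items appearing anywhere other than the very start are left as ordinary pattern text.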
| 449 |
. |
. |
| 450 |
. |
. |
| 992 |
.rs |
.rs |
| 993 |
.sp |
.sp |
| 994 |
The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and |
The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and |
| 995 |
PCRE_EXTENDED options can be changed from within the pattern by a sequence of |
PCRE_EXTENDED options (which are Perl-compatible) can be changed from within |
| 996 |
Perl option letters enclosed between "(?" and ")". The option letters are |
the pattern by a sequence of Perl option letters enclosed between "(?" and ")". |
| 997 |
|
The option letters are |
| 998 |
.sp |
.sp |
| 999 |
i for PCRE_CASELESS |
i for PCRE_CASELESS |
| 1000 |
m for PCRE_MULTILINE |
m for PCRE_MULTILINE |
| 1008 |
permitted. If a letter appears both before and after the hyphen, the option is |
permitted. If a letter appears both before and after the hyphen, the option is |
| 1009 |
unset. |
unset. |
| 1010 |
.P |
.P |
| 1011 |
|
The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA can be |
| 1012 |
|
changed in the same way as the Perl-compatible options by using the characters |
| 1013 |
|
J, U and X respectively. |
| 1014 |
|
.P |
| 1015 |
When an option change occurs at top level (that is, not inside subpattern |
When an option change occurs at top level (that is, not inside subpattern |
| 1016 |
parentheses), the change applies to the remainder of the pattern that follows. |
parentheses), the change applies to the remainder of the pattern that follows. |
| 1017 |
If the change is placed right at the start of a pattern, PCRE extracts it into |
If the change is placed right at the start of a pattern, PCRE extracts it into |
| 1035 |
option settings happen at compile time. There would be some very weird |
option settings happen at compile time. There would be some very weird |
| 1036 |
behaviour otherwise. |
behaviour otherwise. |
| 1037 |
.P |
.P |
| 1038 |
The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA can be |
\fBNote:\fP There are other PCRE-specific options that can be set by the |
| 1039 |
changed in the same way as the Perl-compatible options by using the characters |
application when the compile or match functions are called. In some cases the |
| 1040 |
J, U and X respectively. |
pattern can contain special leading sequences to override what the application |
| 1041 |
|
has set or what has been defaulted. Details are given in the section entitled |
| 1042 |
|
.\" HTML <a href="#newlineseq"> |
| 1043 |
|
.\" </a> |
| 1044 |
|
"Newline sequences" |
| 1045 |
|
.\" |
| 1046 |
|
above. |
| 1047 |
. |
. |
| 1048 |
. |
. |
| 1049 |
.\" HTML <a name="subpattern"></a> |
.\" HTML <a name="subpattern"></a> |
| 2059 |
documentation. |
documentation. |
| 2060 |
. |
. |
| 2061 |
. |
. |
| 2062 |
.SH "BACTRACKING CONTROL" |
.SH "BACKTRACKING CONTROL" |
| 2063 |
.rs |
.rs |
| 2064 |
.sp |
.sp |
| 2065 |
Perl 5.10 introduced a number of "Special Backtracking Control Verbs", which |
Perl 5.10 introduced a number of "Special Backtracking Control Verbs", which |
| 2192 |
.rs |
.rs |
| 2193 |
.sp |
.sp |
| 2194 |
.nf |
.nf |
| 2195 |
Last updated: 21 August 2007 |
Last updated: 17 September 2007 |
| 2196 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2007 University of Cambridge. |
| 2197 |
.fi |
.fi |