| 79 |
changes the convention to CR. That pattern matches "a\enb" because LF is no |
changes the convention to CR. That pattern matches "a\enb" because LF is no |
| 80 |
longer a newline. Note that these special settings, which are not |
longer a newline. Note that these special settings, which are not |
| 81 |
Perl-compatible, are recognized only at the very start of a pattern, and that |
Perl-compatible, are recognized only at the very start of a pattern, and that |
| 82 |
they must be in upper case. |
they must be in upper case. If more than one of them is present, the last one |
| 83 |
|
is used. |
| 84 |
|
.P |
| 85 |
|
The newline convention does not affect what the \eR escape sequence matches. By |
| 86 |
|
default, this is any Unicode newline sequence, for Perl compatibility. However, |
| 87 |
|
this can be changed; see the description of \eR in the section entitled |
| 88 |
|
.\" HTML <a href="#newlineseq"> |
| 89 |
|
.\" </a> |
| 90 |
|
"Newline sequences" |
| 91 |
|
.\" |
| 92 |
|
below. |
| 93 |
. |
. |
| 94 |
. |
. |
| 95 |
.SH "CHARACTERS AND METACHARACTERS" |
.SH "CHARACTERS AND METACHARACTERS" |
| 398 |
is discouraged. |
is discouraged. |
| 399 |
. |
. |
| 400 |
. |
. |
| 401 |
|
.\" HTML <a name="newlineseq"></a> |
| 402 |
.SS "Newline sequences" |
.SS "Newline sequences" |
| 403 |
.rs |
.rs |
| 404 |
.sp |
.sp |
| 405 |
Outside a character class, the escape sequence \eR matches any Unicode newline |
Outside a character class, by default, the escape sequence \eR matches any |
| 406 |
sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \eR is equivalent to |
Unicode newline sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \eR is |
| 407 |
the following: |
equivalent to the following: |
| 408 |
.sp |
.sp |
| 409 |
(?>\er\en|\en|\ex0b|\ef|\er|\ex85) |
(?>\er\en|\en|\ex0b|\ef|\er|\ex85) |
| 410 |
.sp |
.sp |
| 424 |
Unicode character property support is not needed for these characters to be |
Unicode character property support is not needed for these characters to be |
| 425 |
recognized. |
recognized. |
| 426 |
.P |
.P |
| 427 |
|
It is possible to restrict \eR to match only CR, LF, or CRLF (instead of the |
| 428 |
|
complete set of Unicode line endings) by setting the option PCRE_BSR_ANYCRLF |
| 429 |
|
either at compile time or when the pattern is matched. This can be made the |
| 430 |
|
default when PCRE is built; if this is the case, the other behaviour can be |
| 431 |
|
requested via the PCRE_BSR_UNICODE option. It is also possible to specify these |
| 432 |
|
settings by starting a pattern string with one of the following sequences: |
| 433 |
|
.sp |
| 434 |
|
(*BSR_ANYCRLF) CR, LF, or CRLF only |
| 435 |
|
(*BSR_UNICODE) any Unicode newline sequence |
| 436 |
|
.sp |
| 437 |
|
These override the default and the options given to \fBpcre_compile()\fP, but |
| 438 |
|
they can be overridden by options given to \fBpcre_exec()\fP. Note that these |
| 439 |
|
special settings, which are not Perl-compatible, are recognized only at the |
| 440 |
|
very start of a pattern, and that they must be in upper case. If more than one |
| 441 |
|
of them is present, the last one is used. |
| 442 |
|
.P |
| 443 |
Inside a character class, \eR matches the letter "R". |
Inside a character class, \eR matches the letter "R". |
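The difference between the two \eR settings described above can be sketched outside PCRE. The following is only an illustration, assuming Python's standard re module, which has no \eR escape and does not recognize (*BSR_ANYCRLF) or (*BSR_UNICODE); the two sets are therefore spelled out by hand. The names BSR_UNICODE, BSR_ANYCRLF and split_lines are hypothetical, and a plain non-capturing group stands in for the atomic group shown in the non-UTF-8 equivalent, which changes backtracking behaviour but not what a simple split finds.

```python
import re

# Hypothetical stand-ins for the two \R settings; not PCRE API names.
# Non-UTF-8 equivalent of \R from the text, with (?:...) replacing (?>...).
BSR_UNICODE = r"(?:\r\n|\n|\x0b|\f|\r|\x85)"
# Restricted form: CR, LF, or CRLF only.
BSR_ANYCRLF = r"(?:\r\n|\n|\r)"


def split_lines(text, newline):
    """Split text on every match of the given newline pattern."""
    return re.split(newline, text)


# CRLF, then a vertical tab (0x0b), then LF.
sample = "a\r\nb\x0bc\nd"

unicode_parts = split_lines(sample, BSR_UNICODE)
anycrlf_parts = split_lines(sample, BSR_ANYCRLF)

print(unicode_parts)  # vertical tab counts as a line ending here
print(anycrlf_parts)  # vertical tab stays inside the second piece
```

Because CRLF is listed first in both alternations, it is consumed as a single line ending rather than as a CR followed by an LF.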
| 444 |
. |
. |
| 445 |
. |
. |
| 987 |
.rs |
.rs |
| 988 |
.sp |
.sp |
| 989 |
The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and |
The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and |
| 990 |
PCRE_EXTENDED options can be changed from within the pattern by a sequence of |
PCRE_EXTENDED options (which are Perl-compatible) can be changed from within |
| 991 |
Perl option letters enclosed between "(?" and ")". The option letters are |
the pattern by a sequence of Perl option letters enclosed between "(?" and ")". |
| 992 |
|
The option letters are |
| 993 |
.sp |
.sp |
| 994 |
i for PCRE_CASELESS |
i for PCRE_CASELESS |
| 995 |
m for PCRE_MULTILINE |
m for PCRE_MULTILINE |
| 1003 |
permitted. If a letter appears both before and after the hyphen, the option is |
permitted. If a letter appears both before and after the hyphen, the option is |
| 1004 |
unset. |
unset. |
| 1005 |
.P |
.P |
| 1006 |
|
The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA can be |
| 1007 |
|
changed in the same way as the Perl-compatible options by using the characters |
| 1008 |
|
J, U and X respectively. |
| 1009 |
|
.P |
| 1010 |
When an option change occurs at top level (that is, not inside subpattern |
When an option change occurs at top level (that is, not inside subpattern |
| 1011 |
parentheses), the change applies to the remainder of the pattern that follows. |
parentheses), the change applies to the remainder of the pattern that follows. |
| 1012 |
If the change is placed right at the start of a pattern, PCRE extracts it into |
If the change is placed right at the start of a pattern, PCRE extracts it into |
| 1029 |
branch is abandoned before the option setting. This is because the effects of |
branch is abandoned before the option setting. This is because the effects of |
| 1030 |
option settings happen at compile time. There would be some very weird |
option settings happen at compile time. There would be some very weird |
| 1031 |
behaviour otherwise. |
behaviour otherwise. |
|
.P |
|
|
The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA can be |
|
|
changed in the same way as the Perl-compatible options by using the characters |
|
|
J, U and X respectively. |
|
| 1032 |
. |
. |
| 1033 |
. |
. |
| 1034 |
.\" HTML <a name="subpattern"></a> |
.\" HTML <a name="subpattern"></a> |
| 2177 |
.rs |
.rs |
| 2178 |
.sp |
.sp |
| 2179 |
.nf |
.nf |
| 2180 |
Last updated: 21 August 2007 |
Last updated: 11 September 2007 |
| 2181 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2007 University of Cambridge. |
| 2182 |
.fi |
.fi |