| 95 |
they must be in upper case. If more than one of them is present, the last one |
they must be in upper case. If more than one of them is present, the last one |
| 96 |
is used. |
is used. |
| 97 |
.P |
.P |
| 98 |
The newline convention does not affect what the \eR escape sequence matches. By |
The newline convention affects the interpretation of the dot metacharacter when |
| 99 |
default, this is any Unicode newline sequence, for Perl compatibility. However, |
PCRE_DOTALL is not set, and also the behaviour of \eN. However, it does not |
| 100 |
this can be changed; see the description of \eR in the section entitled |
affect what the \eR escape sequence matches. By default, this is any Unicode |
| 101 |
|
newline sequence, for Perl compatibility. However, this can be changed; see the |
| 102 |
|
description of \eR in the section entitled |
| 103 |
.\" HTML <a href="#newlineseq"> |
.\" HTML <a href="#newlineseq"> |
| 104 |
.\" </a> |
.\" </a> |
| 105 |
"Newline sequences" |
"Newline sequences" |
| 298 |
All the sequences that define a single character value can be used both inside |
All the sequences that define a single character value can be used both inside |
| 299 |
and outside character classes. In addition, inside a character class, the |
and outside character classes. In addition, inside a character class, the |
| 300 |
sequence \eb is interpreted as the backspace character (hex 08). The sequences |
sequence \eb is interpreted as the backspace character (hex 08). The sequences |
| 301 |
\eB, \eR, and \eX are not special inside a character class. Like any other |
\eB, \eN, \eR, and \eX are not special inside a character class. Like any other |
| 302 |
unrecognized escape sequences, they are treated as the literal characters "B", |
unrecognized escape sequences, they are treated as the literal characters "B", |
| 303 |
"R", and "X" by default, but cause an error if the PCRE_EXTRA option is set. |
"N", "R", and "X" by default, but cause an error if the PCRE_EXTRA option is |
| 304 |
Outside a character class, these sequences have different meanings |
set. Outside a character class, these sequences have different meanings. |
|
.\" HTML <a href="#uniextseq"> |
|
|
.\" </a> |
|
|
(see below). |
|
|
.\" |
|
| 305 |
. |
. |
| 306 |
. |
. |
| 307 |
.SS "Absolute and relative back references" |
.SS "Absolute and relative back references" |
| 343 |
.SS "Generic character types" |
.SS "Generic character types" |
| 344 |
.rs |
.rs |
| 345 |
.sp |
.sp |
| 346 |
Another use of backslash is for specifying generic character types. The |
Another use of backslash is for specifying generic character types: |
|
following are always recognized: |
|
| 347 |
.sp |
.sp |
| 348 |
\ed any decimal digit |
\ed any decimal digit |
| 349 |
\eD any character that is not a decimal digit |
\eD any character that is not a decimal digit |
| 356 |
\ew any "word" character |
\ew any "word" character |
| 357 |
\eW any "non-word" character |
\eW any "non-word" character |
| 358 |
.sp |
.sp |
| 359 |
Each pair of escape sequences partitions the complete set of characters into |
There is also the single sequence \eN, which matches a non-newline character. |
| 360 |
two disjoint sets. Any given character matches one, and only one, of each pair. |
This is the same as |
| 361 |
|
.\" HTML <a href="#fullstopdot"> |
| 362 |
|
.\" </a> |
| 363 |
|
the "." metacharacter |
| 364 |
|
.\" |
| 365 |
|
when PCRE_DOTALL is not set. |
| 366 |
|
.P |
| 367 |
|
Each pair of lower and upper case escape sequences partitions the complete set |
| 368 |
|
of characters into two disjoint sets. Any given character matches one, and only |
| 369 |
|
one, of each pair. |
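The partition property can be checked mechanically. The following is a minimal sketch, assuming Python's re module as a stand-in engine: it is not PCRE, although it shares the \ed, \eD, \ew, and \eW escapes. The names PAIRS, SAMPLE, and exactly_one are hypothetical and exist only for this illustration.

```python
import re

# Lower/upper case pairs of character type escapes shown in the list above.
# Assumption: Python's "re" engine is used for illustration; it is not PCRE.
PAIRS = [(r"\d", r"\D"), (r"\w", r"\W")]

# A small hand-picked sample of single characters (hypothetical test data).
SAMPLE = ["7", "a", "_", " ", "\n", "-"]


def exactly_one(lower, upper, ch):
    """Return True if ch matches exactly one of the two escapes."""
    hits = [re.fullmatch(p, ch) is not None for p in (lower, upper)]
    return hits.count(True) == 1


# Every sampled character falls into one, and only one, half of each pair.
partitioned = all(
    exactly_one(lower, upper, ch) for lower, upper in PAIRS for ch in SAMPLE
)
```

A sample of this size demonstrates the property rather than proving it; it only shows that no sampled character matches both or neither escape of a pair.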
| 370 |
.P |
.P |
| 371 |
These character type sequences can appear both inside and outside character |
These character type sequences can appear both inside and outside character |
| 372 |
classes. They each match one character of the appropriate type. If the current |
classes. They each match one character of the appropriate type. If the current |
| 870 |
\eA it is always anchored, whether or not PCRE_MULTILINE is set. |
\eA it is always anchored, whether or not PCRE_MULTILINE is set. |
| 871 |
. |
. |
| 872 |
. |
. |
| 873 |
.SH "FULL STOP (PERIOD, DOT)" |
.\" HTML <a name="fullstopdot"></a> |
| 874 |
|
.SH "FULL STOP (PERIOD, DOT) AND \eN" |
| 875 |
.rs |
.rs |
| 876 |
.sp |
.sp |
| 877 |
Outside a character class, a dot in the pattern matches any one character in |
Outside a character class, a dot in the pattern matches any one character in |
| 893 |
The handling of dot is entirely independent of the handling of circumflex and |
The handling of dot is entirely independent of the handling of circumflex and |
| 894 |
dollar, the only relationship being that they both involve newlines. Dot has no |
dollar, the only relationship being that they both involve newlines. Dot has no |
| 895 |
special meaning in a character class. |
special meaning in a character class. |
| 896 |
|
.P |
| 897 |
|
The escape sequence \eN always behaves as a dot does when PCRE_DOTALL is not |
| 898 |
|
set. In other words, it matches any one character except one that signifies the |
| 899 |
|
end of a line. |
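The difference between dot and a sequence that never matches a newline can be sketched as follows. This is an illustration under stated assumptions, not PCRE itself: Python's re module has no \eN escape, so the class [^\en] stands in for it here, and Python recognizes only the linefeed character as the line end for dot, whereas PCRE follows the configured newline convention. The names NON_NEWLINE and matches_whole are hypothetical.

```python
import re

# Assumption: Python's "re" is a different engine from PCRE and has no \N
# escape, so the class [^\n] is used as a stand-in for \N.
NON_NEWLINE = r"[^\n]"


def matches_whole(pattern, text, flags=0):
    """Return True if pattern matches the entire text."""
    return re.fullmatch(pattern, text, flags) is not None


# Without DOTALL, dot does not match the newline between "a" and "b".
dot_default = matches_whole(r"a.b", "a\nb")

# With DOTALL, dot matches any character, including the newline.
dot_dotall = matches_whole(r"a.b", "a\nb", re.DOTALL)

# The stand-in for \N is unaffected by DOTALL: it still rejects the newline
# but accepts any other single character.
stand_in_newline = matches_whole("a" + NON_NEWLINE + "b", "a\nb", re.DOTALL)
stand_in_other = matches_whole("a" + NON_NEWLINE + "b", "a-b", re.DOTALL)
```

Only the dot results change when the DOTALL flag is toggled; the stand-in behaves the same in both cases, which is the property the text ascribes to \eN.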
| 900 |
. |
. |
| 901 |
. |
. |
| 902 |
.SH "MATCHING A SINGLE BYTE" |
.SH "MATCHING A SINGLE BYTE" |