| 241 |
\et tab (hex 09) |
\et tab (hex 09) |
| 242 |
\eddd character with octal code ddd, or back reference |
\eddd character with octal code ddd, or back reference |
| 243 |
\exhh character with hex code hh |
\exhh character with hex code hh |
| 244 |
\ex{hhh..} character with hex code hhh.. |
\ex{hhh..} character with hex code hhh.. (non-JavaScript mode) |
| 245 |
|
\euhhhh character with hex code hhhh (JavaScript mode only) |
| 246 |
.sp |
.sp |
| 247 |
The precise effect of \ecx is as follows: if x is a lower case letter, it |
The precise effect of \ecx is as follows: if x is a lower case letter, it |
| 248 |
is converted to upper case. Then bit 6 of the character (hex 40) is inverted. |
is converted to upper case. Then bit 6 of the character (hex 40) is inverted. |
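The two steps described for \ecx (upper-casing, then inverting hex 40) can be illustrated with a minimal Python sketch. The function name `control_escape` is hypothetical, and the restriction to ASCII input is an assumption of the sketch rather than something stated in the lines above.

```python
def control_escape(x):
    # Return the character code produced by a control escape followed by x.
    # Assumption: x is a single ASCII character (code below 128).
    code = ord(x)
    if not 0 <= code < 128:
        raise ValueError("this sketch only handles ASCII input")
    # Step 1: a lower case letter is converted to upper case.
    if "a" <= x <= "z":
        code = ord(x.upper())
    # Step 2: bit 6 (hex 40) is inverted.
    return code ^ 0x40


if __name__ == "__main__":
    for ch in "zZ{;":
        print(repr(ch), hex(control_escape(ch)))
```

Under these rules "z" and "Z" both give hex 1A, while "{" (hex 7B) gives hex 3B and ";" (hex 3B) gives hex 7B.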
| 253 |
values are valid. A lower case letter is converted to upper case, and then the |
values are valid. A lower case letter is converted to upper case, and then the |
| 254 |
0xc0 bits are flipped.) |
0xc0 bits are flipped.) |
| 255 |
.P |
.P |
| 256 |
After \ex, from zero to two hexadecimal digits are read (letters can be in |
By default, after \ex, from zero to two hexadecimal digits are read (letters |
| 257 |
upper or lower case). Any number of hexadecimal digits may appear between \ex{ |
can be in upper or lower case). Any number of hexadecimal digits may appear |
| 258 |
and }, but the value of the character code must be less than 256 in non-UTF-8 |
between \ex{ and }, but the value of the character code must be less than 256 |
| 259 |
mode, and less than 2**31 in UTF-8 mode. That is, the maximum value in |
in non-UTF-8 mode, and less than 2**31 in UTF-8 mode. That is, the maximum |
| 260 |
hexadecimal is 7FFFFFFF. Note that this is bigger than the largest Unicode code |
value in hexadecimal is 7FFFFFFF. Note that this is bigger than the largest |
| 261 |
point, which is 10FFFF. |
Unicode code point, which is 10FFFF. |
| 262 |
.P |
.P |
| 263 |
If characters other than hexadecimal digits appear between \ex{ and }, or if |
If characters other than hexadecimal digits appear between \ex{ and }, or if |
| 264 |
there is no terminating }, this form of escape is not recognized. Instead, the |
there is no terminating }, this form of escape is not recognized. Instead, the |
| 265 |
initial \ex will be interpreted as a basic hexadecimal escape, with no |
initial \ex will be interpreted as a basic hexadecimal escape, with no |
| 266 |
following digits, giving a character whose value is zero. |
following digits, giving a character whose value is zero. |
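The default-mode rules for \ex described above can be sketched in Python as follows. This is an illustration of the documented behaviour, not PCRE's actual code: the name `parse_x_escape` and the `utf8` flag are hypothetical, raising `ValueError` for an out-of-range value is an assumption (the text only says the value must be below the limit), and treating empty braces as value zero is also an assumption.

```python
import string


def parse_x_escape(pattern, pos, utf8=False):
    # pos is the index just after the "x" of the escape.
    # Returns (character value, index of the next unread character).
    if pos < len(pattern) and pattern[pos] == "{":
        close = pattern.find("}", pos + 1)
        if close != -1:
            body = pattern[pos + 1:close]
            if all(ch in string.hexdigits for ch in body):
                # Assumption: empty braces are treated as value zero.
                value = int(body, 16) if body else 0
                limit = 2**31 if utf8 else 256
                if value >= limit:
                    raise ValueError("character value too large")
                return value, close + 1
        # Non-hex characters or no closing brace: the braced form is not
        # recognized, so this is a basic escape with no digits (value zero).
        return 0, pos
    # Basic form: read from zero to two hexadecimal digits.
    end = pos
    while end < len(pattern) and end - pos < 2 and pattern[end] in string.hexdigits:
        end += 1
    digits = pattern[pos:end]
    return (int(digits, 16) if digits else 0), end
```

For example, `parse_x_escape("dc", 0)` and `parse_x_escape("{dc}", 0)` both yield the value hex DC, whereas `parse_x_escape("{zz}", 0)` yields zero and leaves the brace to be read as ordinary pattern text.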
| 267 |
.P |
.P |
| 268 |
|
If the PCRE_JAVASCRIPT_COMPAT option is set, the interpretation of \ex is |
| 269 |
|
as just described only when it is followed by two hexadecimal digits. |
| 270 |
|
Otherwise, it matches a literal "x" character. In JavaScript mode, support for |
| 271 |
|
code points greater than 255 is provided by \eu, which must be followed by |
| 272 |
|
four hexadecimal digits; otherwise it matches a literal "u" character. |
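The JavaScript-mode behaviour described in this paragraph can be sketched in Python. The function `parse_js_escape` is hypothetical and only models the rule as stated: \ex needs exactly two hexadecimal digits and \eu exactly four, and otherwise the letter is matched literally.

```python
import string


def parse_js_escape(pattern, pos):
    # pattern[pos] is the letter after the backslash: "x" or "u".
    # Returns (character value, index of the next unread character).
    letter = pattern[pos]
    width = {"x": 2, "u": 4}[letter]
    digits = pattern[pos + 1:pos + 1 + width]
    if len(digits) == width and all(ch in string.hexdigits for ch in digits):
        return int(digits, 16), pos + 1 + width
    # Too few digits, or a non-hex character: match the letter itself and
    # leave whatever follows to be read as ordinary pattern text.
    return ord(letter), pos + 1
```

With this sketch, `parse_js_escape("u00dc", 0)` gives the value hex DC, while `parse_js_escape("x{41}", 0)` gives the code for a literal "x" because the braced form is not recognized in this mode.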
| 273 |
|
.P |
| 274 |
Characters whose value is less than 256 can be defined by either of the two |
Characters whose value is less than 256 can be defined by either of the two |
| 275 |
syntaxes for \ex. There is no difference in the way they are handled. For |
syntaxes for \ex (or by \eu in JavaScript mode). There is no difference in the |
| 276 |
example, \exdc is exactly the same as \ex{dc}. |
way they are handled. For example, \exdc is exactly the same as \ex{dc} (or |
| 277 |
|
\eu00dc in JavaScript mode). |
| 278 |
.P |
.P |
| 279 |
After \e0 up to two further octal digits are read. If there are fewer than two |
After \e0 up to two further octal digits are read. If there are fewer than two |
| 280 |
digits, just those that are present are used. Thus the sequence \e0\ex\e07 |
digits, just those that are present are used. Thus the sequence \e0\ex\e07 |
| 336 |
set. Outside a character class, these sequences have different meanings. |
set. Outside a character class, these sequences have different meanings. |
| 337 |
. |
. |
| 338 |
. |
. |
| 339 |
|
.SS "Unsupported escape sequences" |
| 340 |
|
.rs |
| 341 |
|
.sp |
| 342 |
|
In Perl, the sequences \el, \eL, \eu, and \eU are recognized by its string |
| 343 |
|
handler and used to modify the case of following characters. By default, PCRE |
| 344 |
|
does not support these escape sequences. However, if the PCRE_JAVASCRIPT_COMPAT |
| 345 |
|
option is set, \eU matches a "U" character, and \eu can be used to define a |
| 346 |
|
character by code point, as described in the previous section. |
| 347 |
|
. |
| 348 |
|
. |
| 349 |
.SS "Absolute and relative back references" |
.SS "Absolute and relative back references" |
| 350 |
.rs |
.rs |
| 351 |
.sp |
.sp |
| 405 |
.\" </a> |
.\" </a> |
| 406 |
the "." metacharacter |
the "." metacharacter |
| 407 |
.\" |
.\" |
| 408 |
when PCRE_DOTALL is not set. |
when PCRE_DOTALL is not set. Perl also uses \eN to match characters by name; |
| 409 |
|
PCRE does not support this. |
| 410 |
.P |
.P |
| 411 |
Each pair of lower and upper case escape sequences partitions the complete set |
Each pair of lower and upper case escape sequences partitions the complete set |
| 412 |
of characters into two disjoint sets. Any given character matches one, and only |
of characters into two disjoint sets. Any given character matches one, and only |
| 983 |
.P |
.P |
| 984 |
The escape sequence \eN behaves like a dot, except that it is not affected by |
The escape sequence \eN behaves like a dot, except that it is not affected by |
| 985 |
the PCRE_DOTALL option. In other words, it matches any character except one |
the PCRE_DOTALL option. In other words, it matches any character except one |
| 986 |
that signifies the end of a line. |
that signifies the end of a line. Perl also uses \eN to match characters by |
| 987 |
|
name; PCRE does not support this. |
| 988 |
. |
. |
| 989 |
. |
. |
| 990 |
.SH "MATCHING A SINGLE BYTE" |
.SH "MATCHING A SINGLE BYTE" |
| 2874 |
.rs |
.rs |
| 2875 |
.sp |
.sp |
| 2876 |
.nf |
.nf |
| 2877 |
Last updated: 19 October 2011 |
Last updated: 14 November 2011 |
| 2878 |
Copyright (c) 1997-2011 University of Cambridge. |
Copyright (c) 1997-2011 University of Cambridge. |
| 2879 |
.fi |
.fi |