| 4 |
.SH "PCRE REGULAR EXPRESSION DETAILS" |
.SH "PCRE REGULAR EXPRESSION DETAILS" |
| 5 |
.rs |
.rs |
| 6 |
.sp |
.sp |
| 7 |
The syntax and semantics of the regular expressions supported by PCRE are |
The syntax and semantics of the regular expressions that are supported by PCRE |
| 8 |
described below. Regular expressions are also described in the Perl |
are described in detail below. There is a quick-reference syntax summary in the |
| 9 |
documentation and in a number of books, some of which have copious examples. |
.\" HREF |
| 10 |
Jeffrey Friedl's "Mastering Regular Expressions", published by O'Reilly, covers |
\fBpcresyntax\fP |
| 11 |
regular expressions in great detail. This description of PCRE's regular |
.\" |
| 12 |
expressions is intended as reference material. |
page. Perl's regular expressions are described in its own documentation, and |
| 13 |
|
regular expressions in general are covered in a number of books, some of which |
| 14 |
|
have copious examples. Jeffrey Friedl's "Mastering Regular Expressions", |
| 15 |
|
published by O'Reilly, covers regular expressions in great detail. This |
| 16 |
|
description of PCRE's regular expressions is intended as reference material. |
| 17 |
.P |
.P |
| 18 |
The original operation of PCRE was on strings of one-byte characters. However, |
The original operation of PCRE was on strings of one-byte characters. However, |
| 19 |
there is now also support for UTF-8 character strings. To use this, you must |
there is now also support for UTF-8 character strings. To use this, you must |
| 244 |
.SS "Absolute and relative back references" |
.SS "Absolute and relative back references" |
| 245 |
.rs |
.rs |
| 246 |
.sp |
.sp |
| 247 |
The sequence \eg followed by a positive or negative number, optionally enclosed |
The sequence \eg followed by an unsigned or a negative number, optionally |
| 248 |
in braces, is an absolute or relative back reference. A named back reference |
enclosed in braces, is an absolute or relative back reference. A named back |
| 249 |
can be coded as \eg{name}. Back references are discussed |
reference can be coded as \eg{name}. Back references are discussed |
| 250 |
.\" HTML <a href="#backreferences"> |
.\" HTML <a href="#backreferences"> |
| 251 |
.\" </a> |
.\" </a> |
| 252 |
later, |
later, |
| 1294 |
.sp |
.sp |
| 1295 |
\ed++foo |
\ed++foo |
| 1296 |
.sp |
.sp |
| 1297 |
|
Note that a possessive quantifier can be used with an entire group, for |
| 1298 |
|
example: |
| 1299 |
|
.sp |
| 1300 |
|
(abc|xyz){2,3}+ |
| 1301 |
|
.sp |
| 1302 |
Possessive quantifiers are always greedy; the setting of the PCRE_UNGREEDY |
Possessive quantifiers are always greedy; the setting of the PCRE_UNGREEDY |
| 1303 |
option is ignored. They are a convenient notation for the simpler forms of |
option is ignored. They are a convenient notation for the simpler forms of |
| 1304 |
atomic group. However, there is no difference in the meaning of a possessive |
atomic group. However, there is no difference in the meaning of a possessive |
| 1373 |
.P |
.P |
| 1374 |
Another way of avoiding the ambiguity inherent in the use of digits following a |
Another way of avoiding the ambiguity inherent in the use of digits following a |
| 1375 |
backslash is to use the \eg escape sequence, which is a feature introduced in |
backslash is to use the \eg escape sequence, which is a feature introduced in |
| 1376 |
Perl 5.10. This escape must be followed by a positive or a negative number, |
Perl 5.10. This escape must be followed by an unsigned number or a negative |
| 1377 |
optionally enclosed in braces. These examples are all identical: |
number, optionally enclosed in braces. These examples are all identical: |
| 1378 |
.sp |
.sp |
| 1379 |
(ring), \e1 |
(ring), \e1 |
| 1380 |
(ring), \eg1 |
(ring), \eg1 |
| 1381 |
(ring), \eg{1} |
(ring), \eg{1} |
| 1382 |
.sp |
.sp |
| 1383 |
A positive number specifies an absolute reference without the ambiguity that is |
An unsigned number specifies an absolute reference without the ambiguity that |
| 1384 |
present in the older syntax. It is also useful when literal digits follow the |
is present in the older syntax. It is also useful when literal digits follow |
| 1385 |
reference. A negative number is a relative reference. Consider this example: |
the reference. A negative number is a relative reference. Consider this |
| 1386 |
|
example: |
| 1387 |
.sp |
.sp |
| 1388 |
(abc(def)ghi)\eg{-1} |
(abc(def)ghi)\eg{-1} |
| 1389 |
.sp |
.sp |
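As a rough illustration of the relative form shown in this hunk: in PCRE, \eg{-1} refers to the most recently started capturing group before the reference, so in the example above it is equivalent to \e2. The sketch below is not part of PCRE. It is a hypothetical Python helper (the name `resolve_relative_backrefs` is an assumption made for illustration) that rewrites relative references into absolute ones, so that the behaviour can be tried with Python's `re` module, which has no relative back references of its own. It assumes that only plain `(...)` and `(?P<name>...)` groups capture, and it handles character classes only in a simplified way.

```python
import re

# Matches the relative forms \g{-n} and \g-n at a given position.
_RELATIVE_REF = re.compile(r"\\g\{-(\d+)\}|\\g-(\d+)")


def resolve_relative_backrefs(pattern):
    """Rewrite relative back references as absolute ones.

    Hypothetical helper, for illustration only. Assumptions: plain "("
    and "(?P<name>" open capturing groups, every other "(?" construct
    does not; a character class ends at the first unescaped "]".
    """
    out = []
    groups_opened = 0  # capturing groups started so far
    in_class = False   # inside [...] ?
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            m = None if in_class else _RELATIVE_REF.match(pattern, i)
            if m:
                offset = int(m.group(1) or m.group(2))
                # -1 is the most recently started group, -2 the one before.
                target = groups_opened - offset + 1
                if offset < 1 or target < 1:
                    raise ValueError("relative reference out of range")
                # Wrap in (?:...) so that following digits stay literal.
                out.append("(?:\\%d)" % target)
                i = m.end()
                continue
            out.append(pattern[i:i + 2])  # copy any other escape verbatim
            i += 2
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch == "(":
            if not pattern.startswith("(?", i) or pattern.startswith("(?P<", i):
                groups_opened += 1
        out.append(ch)
        i += 1
    return "".join(out)


converted = resolve_relative_backrefs(r"(abc(def)ghi)\g{-1}")
matched = re.fullmatch(converted, "abcdefghidef") is not None
```

Here `converted` refers to group 2, the `(def)` group, which is the last group opened before the reference. Emitting `(?:\e2)` rather than a bare `\e2` avoids the digit ambiguity that the surrounding text describes.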
| 1986 |
.rs |
.rs |
| 1987 |
.sp |
.sp |
| 1988 |
.nf |
.nf |
| 1989 |
Last updated: 19 June 2007 |
Last updated: 06 August 2007 |
| 1990 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2007 University of Cambridge. |
| 1991 |
.fi |
.fi |