| 66 |
page. |
page. |
| 67 |
. |
. |
| 68 |
. |
. |
| 69 |
|
.\" HTML <a name="newlines"></a> |
| 70 |
.SH "NEWLINE CONVENTIONS" |
.SH "NEWLINE CONVENTIONS" |
| 71 |
.rs |
.rs |
| 72 |
.sp |
.sp |
| 211 |
\eQabc\eE\e$\eQxyz\eE abc$xyz abc$xyz |
\eQabc\eE\e$\eQxyz\eE abc$xyz abc$xyz |
| 212 |
.sp |
.sp |
| 213 |
The \eQ...\eE sequence is recognized both inside and outside character classes. |
The \eQ...\eE sequence is recognized both inside and outside character classes. |
| 214 |
|
An isolated \eE that is not preceded by \eQ is ignored. |
| 215 |
. |
. |
| 216 |
. |
. |
| 217 |
.\" HTML <a name="digitsafterbackslash"></a> |
.\" HTML <a name="digitsafterbackslash"></a> |
| 2110 |
.P |
.P |
| 2111 |
If the PCRE_EXTENDED option is set, an unescaped # character outside a |
If the PCRE_EXTENDED option is set, an unescaped # character outside a |
| 2112 |
character class introduces a comment that continues to immediately after the |
character class introduces a comment that continues to immediately after the |
| 2113 |
next newline in the pattern. |
next newline character or character sequence in the pattern. Which characters |
| 2114 |
|
are interpreted as newlines is controlled by the options passed to |
| 2115 |
|
\fBpcre_compile()\fP or by a special sequence at the start of the pattern, as |
| 2116 |
|
described in the section entitled |
| 2117 |
|
.\" HTML <a href="#newlines"> |
| 2118 |
|
.\" </a> |
| 2119 |
|
"Newline conventions" |
| 2120 |
|
.\" |
| 2121 |
|
above. Note that the end of a comment is a literal newline sequence in the pattern; |
| 2122 |
|
escape sequences that happen to represent a newline do not terminate a comment. |
| 2123 |
|
For example, consider this pattern when PCRE_EXTENDED is set, and the default |
| 2124 |
|
newline convention is in force: |
| 2125 |
|
.sp |
| 2126 |
|
abc #comment \en still comment |
| 2127 |
|
.sp |
| 2128 |
|
On encountering the # character, \fBpcre_compile()\fP skips along, looking for |
| 2129 |
|
a newline in the pattern. The sequence \en is still literal at this stage, so |
| 2130 |
|
it does not terminate the comment. Only an actual character with the code value |
| 2131 |
|
0x0a does so. |
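The distinction between an escaped \en and a real newline character can be tried out in a short sketch. This is an illustration only and makes one loud assumption: it uses Python's standard re module with re.VERBOSE as a stand-in for PCRE_EXTENDED, not PCRE itself. Python's comment handling happens to behave the same way in this respect. The variable names are hypothetical.

```python
import re

# ASSUMPTION: Python's re.VERBOSE stands in for PCRE_EXTENDED here;
# this is not a call into the PCRE library.

# The two characters backslash and 'n' appear in the pattern text.
# Inside a comment they are still literal, so the comment runs to the
# end of the pattern and only "abc" is compiled.
escaped = re.compile(r"abc #comment \n still comment", re.VERBOSE)

# A real newline character (code value 0x0a) appears in the pattern text.
# It ends the comment, so "still" becomes part of the pattern.
literal = re.compile("abc #comment \n still", re.VERBOSE)

escaped_matches_abc = escaped.fullmatch("abc") is not None
literal_matches_abc = literal.fullmatch("abc") is not None
literal_matches_abcstill = literal.fullmatch("abcstill") is not None

print(escaped_matches_abc, literal_matches_abc, literal_matches_abcstill)
```

In the first pattern everything after # is comment text. In the second, the text after the real newline is compiled as part of the pattern, with the leading space ignored because of extended mode.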
| 2132 |
. |
. |
| 2133 |
. |
. |
| 2134 |
.\" HTML <a name="recursion"></a> |
.\" HTML <a name="recursion"></a> |
| 2664 |
overall match fails. If (*THEN) is not directly inside an alternation, it acts |
overall match fails. If (*THEN) is not directly inside an alternation, it acts |
| 2665 |
like (*PRUNE). |
like (*PRUNE). |
| 2666 |
. |
. |
| 2667 |
|
.P |
| 2668 |
|
The above verbs provide four different "strengths" of control when subsequent |
| 2669 |
|
matching fails. (*THEN) is the weakest, carrying on the match at the next |
| 2670 |
|
alternation. (*PRUNE) comes next, failing the match at the current starting |
| 2671 |
|
position, but allowing an advance to the next character (for an unanchored |
| 2672 |
|
pattern). (*SKIP) is similar, except that the advance may be more than one |
| 2673 |
|
character. (*COMMIT) is the strongest, causing the entire match to fail. |
| 2674 |
|
.P |
| 2675 |
|
If more than one is present in a pattern, the "strongest" one wins. For example, |
| 2676 |
|
consider this pattern, where A, B, etc. are complex pattern fragments: |
| 2677 |
|
.sp |
| 2678 |
|
(A(*COMMIT)B(*THEN)C|D) |
| 2679 |
|
.sp |
| 2680 |
|
Once A has matched, PCRE is committed to this match, at the current starting |
| 2681 |
|
position. If subsequently B matches, but C does not, the normal (*THEN) action |
| 2682 |
|
of trying the next alternation (that is, D) does not happen because (*COMMIT) |
| 2683 |
|
overrides. |
| 2684 |
|
. |
| 2685 |
. |
. |
| 2686 |
.SH "SEE ALSO" |
.SH "SEE ALSO" |
| 2687 |
.rs |
.rs |
| 2704 |
.rs |
.rs |
| 2705 |
.sp |
.sp |
| 2706 |
.nf |
.nf |
| 2707 |
Last updated: 10 October 2010 |
Last updated: 26 October 2010 |
| 2708 |
Copyright (c) 1997-2010 University of Cambridge. |
Copyright (c) 1997-2010 University of Cambridge. |
| 2709 |
.fi |
.fi |