.P
The new verbs make use of what was previously invalid syntax: an opening
parenthesis followed by an asterisk. They are generally of the form
(*VERB) or (*VERB:NAME). Some may take either form, with differing behaviour,
depending on whether or not an argument is present. A name is a sequence of
letters, digits, and underscores. If the name is empty, that is, if the closing
parenthesis immediately follows the colon, the effect is as if the colon were
not there. Any number of these verbs may occur in a pattern.
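.P
As an illustration only, the syntax rules above can be sketched in a few lines
of Python. This is not PCRE's own parser: the function name parse_verb is
hypothetical, the restriction of verb keywords to upper-case ASCII letters is
an assumption, and no check is made that the keyword is a verb PCRE actually
recognizes.

```python
import string

# Characters the text allows in a verb name: letters, digits, underscores.
NAME_CHARS = set(string.ascii_letters + string.digits + "_")


def parse_verb(text):
    """Split "(*VERB)" or "(*VERB:NAME)" into a (verb, name) pair.

    Returns None if text does not have that shape. An empty name is
    reported as None, as if the colon were not there.
    """
    if not (text.startswith("(*") and text.endswith(")")):
        return None
    body = text[2:-1]
    verb, _colon, name = body.partition(":")
    # Assumption: verb keywords consist of upper-case ASCII letters only.
    if any(c not in string.ascii_uppercase for c in verb):
        return None
    if any(c not in NAME_CHARS for c in name):
        return None
    if verb == "" and name == "":
        return None
    # An empty keyword with a name is the (*:NAME) form of (*MARK:NAME).
    return (verb or "MARK", name or None)
```

For example, parse_verb("(*PRUNE:)") and parse_verb("(*PRUNE)") both give
("PRUNE", None), reflecting the rule that an empty name behaves as if the
colon were absent.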
.P
PCRE contains some optimizations that are used to speed up matching by running
some checks at the start of each match attempt. For example, it may know the
minimum length of a matching subject, or that a particular character must be
present. When one of these optimizations suppresses the running of a match, any
included backtracking verbs will not, of course, be processed. You can suppress
the start-of-match optimizations by setting the PCRE_NO_START_OPTIMIZE option
when calling \fBpcre_exec()\fP.
.
.
.SS "Verbs that act immediately"
.rs
.sp
The following verbs act as soon as they are encountered. They may not be
followed by a name.
.sp
(*ACCEPT)
.SS "Recording which path was taken"
.rs
.sp
There is one verb whose main purpose is to track how a match was arrived at,
though it also has a secondary use in conjunction with advancing the match
starting point (see (*SKIP) below).
.sp
(*MARK:NAME) or (*:NAME)
.\" </a>
section on \fIpcre_extra\fP
.\"
in the
.\" HREF
\fBpcreapi\fP
.\"
0: XZ
MK: B
.sp
The (*MARK) name is tagged with "MK:" in this output, and in this example it
indicates which of the two alternatives matched. This is a more efficient way
of obtaining this information than putting each alternative in its own
capturing parentheses.
.P
No match
.sp
There are three potential starting points for this match (starting with X,
starting with P, and with an empty string). If the pattern is anchored, the
result is different:
.sp
/^X(*MARK:A)Y|^X(*MARK:B)Z/K
XP
No match, mark = B
.sp
PCRE's start-of-match optimizations can also interfere with this. For example,
if, as a result of a call to \fBpcre_study()\fP, it knows the minimum
subject length for a match, a shorter subject will not be scanned at all.
.P
Note that similar anomalies (though different in detail) exist in Perl, no
doubt for the same reasons. The use of (*MARK) data after a failed match of an
unanchored pattern is not recommended, unless (*COMMIT) is involved.
.
.
the verb. However, when one of these verbs appears inside an atomic group, its
effect is confined to that group, because once the group has been matched,
there is never any backtracking into it. In this situation, backtracking can
"jump back" to the left of the entire atomic group. (Remember, as stated
above, that this localization also applies in subroutine calls and assertions.)
.P
These verbs differ in exactly what kind of failure occurs when backtracking
a+(*COMMIT)b
.sp
This matches "xxaab" but not "aacaab". It can be thought of as a kind of
dynamic anchor, or "I've started, so I must finish." The name of the most
recently passed (*MARK) in the path is passed back when (*COMMIT) forces a
match failure.
.P
Note that (*COMMIT) at the start of a pattern is not the same as an anchor,
unless PCRE's start-of-match optimizations are turned off, as shown in this
\fBpcretest\fP example:
.sp
/(*COMMIT)abc/
xyzabc\eY
No match
.sp
PCRE knows that any match must start with "a", so the optimization skips along
the subject to "a" before running the first match attempt, which succeeds. When
the optimization is disabled by the \eY escape in the second subject, the match
starts at "x" and so the (*COMMIT) causes it to fail without trying any other
starting points.
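.P
The interaction just described can be imitated with a toy model. The Python
sketch below does not call PCRE; the function match_commit_literal and its
start_optimize flag are hypothetical stand-ins for matching a pattern of the
form (*COMMIT) followed by a non-empty literal string, with and without the
first-character optimization.

```python
def match_commit_literal(literal, subject, start_optimize=True):
    """Toy model of matching /(*COMMIT)literal/ against subject.

    Returns the offset of the match, or None for no match. The literal
    must be a non-empty plain string with no metacharacters.
    """
    start = 0
    if start_optimize:
        # Stand-in for the start-of-match optimization: skip ahead to
        # the first place where the literal's first character occurs.
        start = subject.find(literal[0])
        if start < 0:
            return None
    # (*COMMIT) comes first in the pattern, so only this one starting
    # position is ever tried; a failure here is final.
    if subject.startswith(literal, start):
        return start
    return None
```

With the arguments "abc" and "xyzabc" the model reports a match at offset 3.
With start_optimize set to False the single attempt begins at "x" and the
result is None, mirroring the "No match" line in the pcretest output.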
.sp
(*PRUNE) or (*PRUNE:NAME)
.sp
This verb causes the match to fail at the current starting position in the
subject if the rest of the pattern does not match. If the pattern is
unanchored, the normal "bumpalong" advance to the next starting character then
happens. Backtracking can occur as usual to the left of (*PRUNE), before it is
.sp
(*SKIP:NAME)
.sp
When (*SKIP) has an associated name, its behaviour is modified. If the
following pattern fails to match, the previous path through the pattern is
searched for the most recent (*MARK) that has the same name. If one is found,
the "bumpalong" advance is to the subject position that corresponds to that
(*MARK) instead of to where (*SKIP) was encountered. If no (*MARK) with a
matching name is found, normal "bumpalong" of one character happens (the
(*SKIP) is ignored).
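.P
The choice of the next starting position can be summarized as a small
function. This Python sketch is a model of the rule as stated here, not code
taken from PCRE; the function name, its parameters, and the representation of
the path as a list of (name, offset) pairs are all assumptions made for
illustration.

```python
def next_start_after_skip(start, skip_pos, marks, name=None):
    """Return the subject offset at which the next match attempt begins.

    start    -- offset where the failed attempt began
    skip_pos -- subject offset at which (*SKIP) was encountered
    marks    -- (name, offset) pairs for each (*MARK) passed, in order
    name     -- the NAME in (*SKIP:NAME), or None for a bare (*SKIP)
    """
    if name is None:
        # Bare (*SKIP): advance to where the verb was encountered.
        return skip_pos
    # Search the path backwards for the most recent matching (*MARK).
    for mark_name, offset in reversed(marks):
        if mark_name == name:
            return offset
    # No matching (*MARK): the (*SKIP) is ignored and the normal
    # one-character "bumpalong" happens.
    return start + 1
```

For a path that passed marks A at offset 2, B at offset 4, and A again at
offset 5, a failing (*SKIP:A) moves the start to offset 5, while (*SKIP:C),
which has no matching mark, falls back to start + 1.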
.sp
(*THEN) or (*THEN:NAME)