the above patterns match "SUNDAY" as well as "Saturday".
.
.
.SH "DUPLICATE SUBPATTERN NUMBERS"
.rs
.sp
Perl 5.10 introduced a feature whereby each alternative in a subpattern uses
the same numbers for its capturing parentheses. Such a subpattern starts with
(?| and is itself a non-capturing subpattern. For example, consider this
pattern:
.sp
  (?|(Sat)ur|(Sun))day
.sp
Because the two alternatives are inside a (?| group, both sets of capturing
parentheses are numbered one. Thus, when the pattern matches, you can look
at captured substring number one, whichever alternative matched. This construct
is useful when you want to capture part, but not all, of one of a number of
alternatives. Inside a (?| group, parentheses are numbered as usual, but the
number is reset at the start of each branch. The numbers of any capturing
buffers that follow the subpattern start after the highest number used in any
branch. The following example is taken from the Perl documentation.
The numbers underneath show in which buffer the captured content will be
stored.
.sp
  # before  ---------------branch-reset----------- after
  / ( a )  (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z ) /x
  # 1            2         2  3        2     3     4
.sp
A backreference or a recursive call to a numbered subpattern always refers to
the first one in the pattern with the given number.
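The numbering rule described above can be modelled in a few lines of Python.
This is only an illustrative sketch, not how PCRE or Perl implement it: the
function name number_groups is hypothetical, and the model understands only
plain capturing parentheses and (?| groups. It treats every other (?...) group
as non-capturing, skips backslash escapes, and does not handle character
classes.

```python
def number_groups(pattern):
    """Return the capture number given to each capturing '(' in pattern order.

    Simplified model: plain '(' captures, '(?|' opens a branch-reset group,
    and any other '(?...' group is treated as non-capturing.
    """
    numbers = []   # capture number of each capturing group, in pattern order
    stack = []     # one frame per currently open parenthesis
    counter = 0    # last capture number handed out on the current path
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the escaped character
            continue
        if ch == "(":
            if pattern.startswith("(?|", i):
                # Remember where numbering stood on entry, and the highest
                # number reached by any branch seen so far.
                stack.append({"reset": True, "start": counter, "high": counter})
                i += 3
                continue
            if pattern.startswith("(?", i):
                stack.append({"reset": False})  # assumed non-capturing
            else:
                counter += 1
                numbers.append(counter)
                stack.append({"reset": False})
        elif ch == "|" and stack and stack[-1]["reset"]:
            # New branch directly inside (?| ... ): restart the numbering.
            frame = stack[-1]
            frame["high"] = max(frame["high"], counter)
            counter = frame["start"]
        elif ch == ")" and stack:
            frame = stack.pop()
            if frame["reset"]:
                # Groups after the subpattern continue from the highest
                # number used in any branch.
                counter = max(frame["high"], counter)
        i += 1
    return numbers


example = "( a )  (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z )"
print(number_groups(example))
print(number_groups("(?|(Sat)ur|(Sun))day"))
```

Under these assumptions the first call reproduces the numbers shown beneath
the Perl example, and the second gives number one to both sets of parentheses.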
.P
An alternative approach to using this "branch reset" feature is to use
duplicate named subpatterns, as described in the next section.
.
.
.SH "NAMED SUBPATTERNS"
.rs
.sp
.\" (lines 996-1038 of the file are not shown here)
  (?<DN>Sat)(?:urday)?
.sp
There are five capturing substrings, but only one is ever set after a match.
(An alternative way of solving this problem is to use a "branch reset"
subpattern, as described in the previous section.)
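To make concrete what "only one is ever set after a match" means, here is a
small sketch using Python's standard re module. Several things in it are
assumptions made for illustration: the pattern is a hypothetical
two-alternative reduction rather than the full pattern discussed here, the
names DN1 and DN2 stand in for the duplicate name DN because Python's re
requires unique group names, (?P<name>...) is Python's spelling of a named
group, and first_set_group is a hypothetical helper.

```python
import re

# Hypothetical two-alternative pattern; DN1 and DN2 are stand-ins for the
# duplicate name DN, because Python's re module requires unique group names.
DAY = re.compile(r"(?P<DN1>Sun)(?:day)?|(?P<DN2>Sat)(?:urday)?")


def first_set_group(match):
    """Return (number, text) of the lowest-numbered group that captured."""
    for number, text in enumerate(match.groups(), start=1):
        if text is not None:
            return number, text
    return None


m = DAY.match("Saturday")
print(m.groups())          # one slot per capturing group; unset ones are None
print(first_set_group(m))  # the search that a shared name makes unnecessary
```

Every capturing group gets its own slot, but only the one in the alternative
that matched is filled, so the caller has to scan for it.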
.P
The convenience function for extracting the data by name returns the substring
for the first (and in this example, the only) subpattern of that name that
matched. This saves searching to find which numbered subpattern it was. If you
.\" (lines 1048-1932 of the file are not shown here)
.rs
.sp
.nf
Last updated: 29 May 2007 |
Last updated: 11 June 2007 |
Copyright (c) 1997-2007 University of Cambridge.
.fi