| 96 |
simultaneously). |
simultaneously). |
| 97 |
</P> |
</P> |
| 98 |
<P> |
<P> |
| 99 |
|
Although the general principle of this matching algorithm is that it scans the |
| 100 |
|
subject string only once, without backtracking, there is one exception: when a |
| 101 |
|
lookaround assertion is encountered, the characters following or preceding the |
| 102 |
|
current point have to be independently inspected. |
| 103 |
|
</P> |
| 104 |
|
<P> |
| 105 |
The scan continues until either the end of the subject is reached, or there are |
The scan continues until either the end of the subject is reached, or there are |
| 106 |
no more unterminated paths. At this point, terminated paths represent the |
no more unterminated paths. At this point, terminated paths represent the |
| 107 |
different matching possibilities (if there are none, the match has failed). |
different matching possibilities (if there are none, the match has failed). |
| 108 |
Thus, if there is more than one possible match, this algorithm finds all of |
Thus, if there is more than one possible match, this algorithm finds all of |
| 109 |
them, and in particular, it finds the longest. In PCRE, there is an option to |
them, and in particular, it finds the longest. There is an option to stop the |
| 110 |
stop the algorithm after the first match (which is necessarily the shortest) |
algorithm after the first match (which is necessarily the shortest) is found. |
|
has been found. |
|
| 111 |
</P> |
</P> |
| 112 |
<P> |
<P> |
| 113 |
Note that all the matches that are found start at the same point in the |
Note that all the matches that are found start at the same point in the |
| 121 |
matches that start at later positions. |
matches that start at later positions. |
| 122 |
</P> |
</P> |
| 123 |
<P> |
<P> |
|
Although the general principle of this matching algorithm is that it scans the |
|
|
subject string only once, without backtracking, there is one exception: when a |
|
|
lookbehind assertion is encountered, the preceding characters have to be |
|
|
re-inspected. |
|
|
</P> |
|
|
<P> |
|
| 124 |
There are a number of features of PCRE regular expressions that are not |
There are a number of features of PCRE regular expressions that are not |
| 125 |
supported by the alternative matching algorithm. They are as follows: |
supported by the alternative matching algorithm. They are as follows: |
| 126 |
</P> |
</P> |
| 185 |
2. Because the alternative algorithm scans the subject string just once, and |
2. Because the alternative algorithm scans the subject string just once, and |
| 186 |
never needs to backtrack, it is possible to pass very long subject strings to |
never needs to backtrack, it is possible to pass very long subject strings to |
| 187 |
the matching function in several pieces, checking for partial matching each |
the matching function in several pieces, checking for partial matching each |
| 188 |
time. |
time. The |
| 189 |
|
<a href="pcrepartial.html"><b>pcrepartial</b></a> |
| 190 |
|
documentation gives details of partial matching. |
| 191 |
</P> |
</P> |
| 192 |
<br><a name="SEC6" href="#TOC1">DISADVANTAGES OF THE ALTERNATIVE ALGORITHM</a><br> |
<br><a name="SEC6" href="#TOC1">DISADVANTAGES OF THE ALTERNATIVE ALGORITHM</a><br> |
| 193 |
<P> |
<P> |
| 216 |
</P> |
</P> |
| 217 |
<br><a name="SEC8" href="#TOC1">REVISION</a><br> |
<br><a name="SEC8" href="#TOC1">REVISION</a><br> |
| 218 |
<P> |
<P> |
| 219 |
Last updated: 05 September 2009 |
Last updated: 29 September 2009 |
| 220 |
<br> |
<br> |
| 221 |
Copyright © 1997-2009 University of Cambridge. |
Copyright © 1997-2009 University of Cambridge. |
| 222 |
<br> |
<br> |