| 139 |
.SH "PARTIAL MATCHING AND WORD BOUNDARIES" |
.SH "PARTIAL MATCHING AND WORD BOUNDARIES" |
| 140 |
.rs |
.rs |
| 141 |
.sp |
.sp |
| 142 |
If a pattern ends with one of sequences \ew or \eW, which test for word |
If a pattern ends with one of the sequences \eb or \eB, which test for word |
| 143 |
boundaries, partial matching with PCRE_PARTIAL_SOFT can give counter-intuitive |
boundaries, partial matching with PCRE_PARTIAL_SOFT can give counter-intuitive |
| 144 |
results. Consider this pattern: |
results. Consider this pattern: |
| 145 |
.sp |
.sp |
| 247 |
data> The date is 23ja\eP |
data> The date is 23ja\eP |
| 248 |
Partial match: 23ja |
Partial match: 23ja |
| 249 |
.sp |
.sp |
| 250 |
The this stage, an application could discard the text preceding "23ja", add on |
At this stage, an application could discard the text preceding "23ja", add on |
| 251 |
text from the next segment, and call \fBpcre_exec()\fP again. Unlike |
text from the next segment, and call \fBpcre_exec()\fP again. Unlike |
| 252 |
\fBpcre_dfa_exec()\fP, the entire matching string must always be available, and |
\fBpcre_dfa_exec()\fP, the entire matching string must always be available, and |
| 253 |
the complete matching process occurs for each call, so more memory and more |
the complete matching process occurs for each call, so more memory and more |
| 337 |
1234|ABCD |
1234|ABCD |
| 338 |
.sp |
.sp |
| 339 |
where no string can be a partial match for both alternatives. This is not a |
where no string can be a partial match for both alternatives. This is not a |
| 340 |
problem if \fPpcre_exec()\fP is used, because the entire match has to be rerun |
problem if \fBpcre_exec()\fP is used, because the entire match has to be rerun |
| 341 |
each time: |
each time: |
| 342 |
.sp |
.sp |
| 343 |
re> /1234|3789/ |
re> /1234|3789/ |
| 347 |
0: 3789 |
0: 3789 |
| 348 |
.sp |
.sp |
| 349 |
Of course, instead of using PCRE_DFA_RESTART, the same technique of re-running |
Of course, instead of using PCRE_DFA_RESTART, the same technique of re-running |
| 350 |
the entire match can also be used with \fBpcre_dfa_exec()\fP. |
the entire match can also be used with \fBpcre_dfa_exec()\fP. Another |
| 351 |
|
possibility is to work with two buffers. If a partial match at offset \fIn\fP |
| 352 |
|
in the first buffer is followed by "no match" when PCRE_DFA_RESTART is used on |
| 353 |
|
the second buffer, you can then try a new match starting at offset \fIn+1\fP in |
| 354 |
|
the first buffer. |
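The two-buffer bookkeeping described above can be sketched as a toy model. This is not the PCRE API: it handles only the literal alternatives of the pattern 1234|3789, and the helper names (scan, resume, match_two_buffers) and the sample inputs are assumptions made for illustration. The resume step stands in for a call with PCRE_DFA_RESTART, and the second buffer is assumed to be long enough to finish any pending alternative.

```python
# Toy model of the two-buffer technique; literal alternatives only.
ALTERNATIVES = ("1234", "3789")


def scan(buffer, start=0):
    """Scan `buffer` from `start`.

    Returns ("match", offset, text), ("partial", offset, live) or
    ("nomatch", None, None). `live` lists the text still required to
    complete each alternative that has matched up to the buffer's end.
    """
    for i in range(start, len(buffer)):
        tail = buffer[i:]
        for alt in ALTERNATIVES:
            if tail.startswith(alt):
                return ("match", i, alt)
        live = [alt[len(tail):] for alt in ALTERNATIVES if alt.startswith(tail)]
        if live:
            return ("partial", i, live)
    return ("nomatch", None, None)


def resume(live, buffer):
    """Continue a partial match into the next buffer (models a restart).

    Returns the text consumed from `buffer`, or None for "no match".
    """
    for rest in live:
        if buffer.startswith(rest):
            return rest
    return None


def match_two_buffers(first, second):
    """Find a match in `first`, or one that straddles `first` and `second`."""
    start = 0
    while True:
        kind, offset, info = scan(first, start)
        if kind == "match":
            return info
        if kind == "nomatch":
            break
        consumed = resume(info, second)
        if consumed is not None:
            return first[offset:] + consumed
        # "No match" after the restart: retry at offset n+1 in the first buffer.
        start = offset + 1
    # Nothing starts in the first buffer; look at the second one on its own.
    kind, offset, info = scan(second)
    return info if kind == "match" else None
```

With the first buffer "ABC123" and the second buffer "789XYZ", the scan reports a partial match at offset 3 ("123"), the restart fails because "4" does not follow, and the retry from offset 4 finds a partial match at offset 5 ("3") that the second buffer completes.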
| 355 |
. |
. |
| 356 |
. |
. |
| 357 |
.SH AUTHOR |
.SH AUTHOR |
| 368 |
.rs |
.rs |
| 369 |
.sp |
.sp |
| 370 |
.nf |
.nf |
| 371 |
Last updated: 18 October 2009 |
Last updated: 19 October 2009 |
| 372 |
Copyright (c) 1997-2009 University of Cambridge. |
Copyright (c) 1997-2009 University of Cambridge. |
| 373 |
.fi |
.fi |