
Diff of /code/trunk/doc/pcrepartial.3

revision 435 by ph10, Sat Sep 5 10:20:44 2009 UTC revision 468 by ph10, Mon Oct 19 11:46:26 2009 UTC
# Line 32  whether or not a partial match is prefer Line 32  whether or not a partial match is prefer
32  though the details differ between the two matching functions. If both options  though the details differ between the two matching functions. If both options
33  are set, PCRE_PARTIAL_HARD takes precedence.  are set, PCRE_PARTIAL_HARD takes precedence.
34  .P  .P
35  Setting a partial matching option disables one of PCRE's optimizations. PCRE  Setting a partial matching option disables two of PCRE's optimizations. PCRE
36  remembers the last literal byte in a pattern, and abandons matching immediately  remembers the last literal byte in a pattern, and abandons matching immediately
37  if such a byte is not present in the subject string. This optimization cannot  if such a byte is not present in the subject string. This optimization cannot
38  be used for a subject string that might match only partially.  be used for a subject string that might match only partially. If the pattern
39    was studied, PCRE knows the minimum length of a matching string, and does not
40    bother to run the matching function on shorter strings. This optimization is
41    also disabled for partial matching.
42  .  .
43  .  .
44  .SH "PARTIAL MATCHING USING pcre_exec()"  .SH "PARTIAL MATCHING USING pcre_exec()"
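The paragraph changed in this hunk names two shortcuts that a partial matching option switches off: the check for the last literal byte and, for studied patterns, the minimum-length check. The sketch below only illustrates that reasoning. `can_skip_match` and its parameters are hypothetical names, not part of the PCRE API, and the code is not taken from PCRE's source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PCRE function): returns 1 when the matcher
   need not be run at all because the subject cannot match, else 0.
   last_literal is the last literal byte of the pattern; min_length is
   the minimum length of a matching string, as known after studying. */
int can_skip_match(const char *subject, size_t length,
                   unsigned char last_literal, size_t min_length,
                   int partial)
{
    /* Under partial matching the missing byte, or the missing length,
       may still arrive in a later segment, so neither shortcut is safe. */
    if (partial)
        return 0;

    /* A subject shorter than the minimum match length cannot match. */
    if (length < min_length)
        return 1;

    /* A subject that lacks the required last literal byte cannot match. */
    if (memchr(subject, last_literal, length) == NULL)
        return 1;

    return 0;
}
```

For a pattern such as /abcd/ (last literal byte 'd', minimum length 4), the subject "abc" would be rejected up front in a normal match, but with a partial matching option set it has to go to the matcher, because it may be the start of "abcd".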
# Line 53  instead of PCRE_ERROR_NOMATCH. If there Line 56  instead of PCRE_ERROR_NOMATCH. If there
56  vector, the first of them is set to the offset of the earliest character that  vector, the first of them is set to the offset of the earliest character that
57  was inspected when the partial match was found. For convenience, the second  was inspected when the partial match was found. For convenience, the second
58  offset points to the end of the string so that a substring can easily be  offset points to the end of the string so that a substring can easily be
59  extracted.  identified.
60  .P  .P
61  For the majority of patterns, the first offset identifies the start of the  For the majority of patterns, the first offset identifies the start of the
62  partially matched string. However, for patterns that contain lookbehind  partially matched string. However, for patterns that contain lookbehind
# Line 136  so returns that when PCRE_PARTIAL_HARD i Line 139  so returns that when PCRE_PARTIAL_HARD i
139  .SH "PARTIAL MATCHING AND WORD BOUNDARIES"  .SH "PARTIAL MATCHING AND WORD BOUNDARIES"
140  .rs  .rs
141  .sp  .sp
142  If a pattern ends with one of sequences \ew or \eW, which test for word  If a pattern ends with one of sequences \eb or \eB, which test for word
143  boundaries, partial matching with PCRE_PARTIAL_SOFT can give counter-intuitive  boundaries, partial matching with PCRE_PARTIAL_SOFT can give counter-intuitive
144  results. Consider this pattern:  results. Consider this pattern:
145  .sp  .sp
# Line 244  Consider an unanchored pattern that matc Line 247  Consider an unanchored pattern that matc
247    data> The date is 23ja\eP    data> The date is 23ja\eP
248    Partial match: 23ja    Partial match: 23ja
249  .sp  .sp
250  The this stage, an application could discard the text preceding "23ja", add on  At this stage, an application could discard the text preceding "23ja", add on
251  text from the next segment, and call \fBpcre_exec()\fP again. Unlike  text from the next segment, and call \fBpcre_exec()\fP again. Unlike
252  \fBpcre_dfa_exec()\fP, the entire matching string must always be available, and  \fBpcre_dfa_exec()\fP, the entire matching string must always be available, and
253  the complete matching process occurs for each call, so more memory and more  the complete matching process occurs for each call, so more memory and more
# Line 317  matching multi-segment data. The example Line 320  matching multi-segment data. The example
320  .P  .P
321  4. Patterns that contain alternatives at the top level which do not all  4. Patterns that contain alternatives at the top level which do not all
322  start with the same pattern item may not work as expected when  start with the same pattern item may not work as expected when
323  \fBpcre_dfa_exec()\fP is used. For example, consider this pattern:  PCRE_DFA_RESTART is used with \fBpcre_dfa_exec()\fP. For example, consider this
324    pattern:
325  .sp  .sp
326    1234|3789    1234|3789
327  .sp  .sp
# Line 333  patterns or patterns such as: Line 337  patterns or patterns such as:
337    1234|ABCD    1234|ABCD
338  .sp  .sp
339  where no string can be a partial match for both alternatives. This is not a  where no string can be a partial match for both alternatives. This is not a
340  problem if \fPpcre_exec()\fP is used, because the entire match has to be rerun  problem if \fBpcre_exec()\fP is used, because the entire match has to be rerun
341  each time:  each time:
342  .sp  .sp
343      re> /1234|3789/      re> /1234|3789/
# Line 342  each time: Line 346  each time:
346    data> 1237890    data> 1237890
347     0: 3789     0: 3789
348  .sp  .sp
349    Of course, instead of using PCRE_DFA_PARTIAL, the same technique of re-running
350    the entire match can also be used with \fBpcre_dfa_exec()\fP. Another
351    possibility is to work with two buffers. If a partial match at offset \fIn\fP
352    in the first buffer is followed by "no match" when PCRE_DFA_RESTART is used on
353    the second buffer, you can then try a new match starting at offset \fIn+1\fP in
354    the first buffer.
355  .  .
356  .  .
357  .SH AUTHOR  .SH AUTHOR
# Line 358  Cambridge CB2 3QH, England. Line 368  Cambridge CB2 3QH, England.
368  .rs  .rs
369  .sp  .sp
370  .nf  .nf
371  Last updated: 05 September 2009  Last updated: 19 October 2009
372  Copyright (c) 1997-2009 University of Cambridge.  Copyright (c) 1997-2009 University of Cambridge.
373  .fi  .fi
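
The hunk at line 247 describes multi-segment matching with pcre_exec(): after a partial match, the application discards the text before the partial match, adds on the next segment, and calls pcre_exec() again. The buffer handling for that step might look like the sketch below. `carry_over` is a hypothetical helper, not a PCRE function, and `keep_from` is assumed to be the first offset that pcre_exec() stored in the offsets vector when it returned PCRE_ERROR_PARTIAL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PCRE function). Drops buf[0..keep_from),
   moves the remaining bytes to the front of buf, and appends the next
   segment. Returns the new subject length, or (size_t)-1 if keep_from
   is out of range or the result (plus a terminating NUL) does not fit. */
size_t carry_over(char *buf, size_t buf_size, size_t length,
                  size_t keep_from, const char *next, size_t next_length)
{
    size_t kept;

    if (keep_from > length)
        return (size_t)-1;
    kept = length - keep_from;

    /* Require kept + next_length < buf_size, leaving room for the NUL. */
    if (kept >= buf_size || next_length >= buf_size - kept)
        return (size_t)-1;

    memmove(buf, buf + keep_from, kept);     /* regions may overlap */
    memcpy(buf + kept, next, next_length);
    buf[kept + next_length] = '\0';
    return kept + next_length;
}
```

With the example from the hunk, the subject "The date is 23ja" and a first offset of 12 leave "23ja" in the buffer, ready for the next segment to be appended. Because the first offset is the earliest character that was inspected, keeping everything from that offset onward also retains any text that a lookbehind needed.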

