<html>
<head>
<title>pcrepartial specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcrepartial man page</h1>
<p>
Return to the <a href="index.html">PCRE index page</a>.
</p>
<p>
This page is part of the PCRE HTML documentation. It was generated automatically
from the original man page. If there is any nonsense in it, please consult the
man page, in case the conversion went wrong.
<br>
<ul>
<li><a name="TOC1" href="#SEC1">PARTIAL MATCHING IN PCRE</a>
<li><a name="TOC2" href="#SEC2">RESTRICTED PATTERNS FOR PCRE_PARTIAL</a>
<li><a name="TOC3" href="#SEC3">EXAMPLE OF PARTIAL MATCHING USING PCRETEST</a>
</ul>
<br><a name="SEC1" href="#TOC1">PARTIAL MATCHING IN PCRE</a><br>
<P>
In normal use of PCRE, if the subject string that is passed to
<b>pcre_exec()</b> matches as far as it goes, but is too short to match the
entire pattern, PCRE_ERROR_NOMATCH is returned. There are circumstances where
it might be helpful to distinguish this case from other cases in which there is
no match.
</P>
<P>
Consider, for example, an application where a human is required to type in data
for a field with specific formatting requirements. An example might be a date
in the form <i>ddmmmyy</i>, defined by this pattern:
<pre>
  ^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$
</pre>
If the application sees the user's keystrokes one by one, and can check that
what has been typed so far is potentially valid, it can raise an error
as soon as a mistake is made, perhaps by beeping and not echoing the
character that was typed. This immediate feedback is likely to be a better
user interface than a check that is delayed until the entire string has been
entered.
</P>
<P>
PCRE supports the concept of partial matching by means of the PCRE_PARTIAL
option, which can be set when calling <b>pcre_exec()</b>. When this is done, the
return code PCRE_ERROR_NOMATCH is converted into PCRE_ERROR_PARTIAL if at any
time during the matching process the entire subject string matched part of the
pattern. No captured substrings are set when this occurs.
</P>
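<P>
As a rough illustration of the three possible outcomes, the following Python
sketch classifies a date string as a complete match, a partial match, or no
match. It does not call PCRE: Python's standard library has no equivalent of
PCRE_PARTIAL, so the check is hand-rolled for the <i>ddmmmyy</i> pattern only.
The names <b>classify()</b>, MONTHS, and DIGITS are assumptions made for this
example and are not part of PCRE.
</P>

```python
# Hand-rolled stand-in for partial matching; this is NOT how PCRE works internally.
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")
DIGITS = "0123456789"


def classify(subject):
    """Return "match", "partial", or "no match" for a ddmmmyy date string."""
    outcome = "no match"
    for day_len in (1, 2):              # the pattern allows one or two day digits
        for month in MONTHS:
            # "#" stands for "any digit"; every other character is literal.
            shape = "#" * day_len + month + "##"
            if len(subject) > len(shape):
                continue
            fits = all(
                (c in DIGITS) if s == "#" else (c == s)
                for s, c in zip(shape, subject)
            )
            if not fits:
                continue
            if len(subject) == len(shape):
                return "match"          # the whole shape was consumed
            outcome = "partial"         # a valid prefix of at least one shape
    return outcome
```

<P>
An application that validates keystrokes could call <b>classify()</b> on the
text typed so far and refuse the latest character whenever the result is
"no match".
</P>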
<P>
Using PCRE_PARTIAL disables one of PCRE's optimizations. PCRE remembers the
last literal byte in a pattern, and abandons matching immediately if such a
byte is not present in the subject string. This optimization cannot be used
for a subject string that might match only partially.
</P>
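<P>
The reason this shortcut conflicts with partial matching can be seen in a
simplified model. The Python sketch below is not PCRE code. It assumes a
hypothetical pattern, ^id-(\d)+;$, whose last literal byte is taken to be the
semicolon, and the function name <b>shortcut_rejects()</b> is invented for the
example.
</P>

```python
def shortcut_rejects(subject: bytes, required: bytes) -> bool:
    """Model of the shortcut: True means "give up before matching starts"."""
    return required not in subject


# Assumed last literal byte of the hypothetical pattern ^id-(\d)+;$
REQUIRED = b";"

complete = b"id-42;"     # contains the required byte, so matching proceeds
typed_so_far = b"id-42"  # still a viable prefix, but lacks the required byte
```

<P>
Here <b>shortcut_rejects(typed_so_far, REQUIRED)</b> is true even though the
input could still grow into a complete match, so the shortcut would hide
exactly the case that partial matching is meant to report.
</P>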
<br><a name="SEC2" href="#TOC1">RESTRICTED PATTERNS FOR PCRE_PARTIAL</a><br>
<P>
Because of the way certain internal optimizations are implemented in PCRE, the
PCRE_PARTIAL option cannot be used with all patterns. Repeated single
characters such as
<pre>
  a{2,4}
</pre>
and repeated single metasequences such as
<pre>
  \d+
</pre>
are not permitted if the maximum number of occurrences is greater than one.
Optional items such as \d? (where the maximum is one) are permitted.
Quantifiers with any values are permitted after parentheses, so the invalid
examples above can be coded thus:
<pre>
  (a){2,4}
  (\d)+
</pre>
These constructions run more slowly, but for the kinds of application that are
envisaged for this facility, this is not felt to be a major restriction.
</P>
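<P>
The rewritten forms accept the same subject strings as the originals. The
sketch below checks this with Python's <b>re</b> module, which is not PCRE and
knows nothing about PCRE_PARTIAL. It is used here only because these simple
patterns behave the same way in both. The names REWRITES, SUBJECTS, and
<b>same_verdicts()</b> are assumptions made for the example.
</P>

```python
import re

# Restricted pattern -> parenthesized equivalent, as listed above.
REWRITES = {r"a{2,4}": r"(a){2,4}", r"\d+": r"(\d)+"}

SUBJECTS = ["", "a", "aa", "aaaa", "aaaaa", "7", "2004", "20a4"]


def same_verdicts(plain, grouped, subjects):
    """True if both patterns accept exactly the same subjects in full."""
    return all(
        bool(re.fullmatch(plain, s)) == bool(re.fullmatch(grouped, s))
        for s in subjects
    )
```

<P>
One side effect of the rewrite is that the parentheses create a capturing
group, which holds the text matched by the last repetition.
</P>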
<P>
If PCRE_PARTIAL is set for a pattern that does not conform to the restrictions,
<b>pcre_exec()</b> returns the error code PCRE_ERROR_BADPARTIAL (-13).
</P>
<br><a name="SEC3" href="#TOC1">EXAMPLE OF PARTIAL MATCHING USING PCRETEST</a><br>
<P>
If the escape sequence \P is present in a <b>pcretest</b> data line, the
PCRE_PARTIAL flag is used for the match. Here is a run of <b>pcretest</b> that
uses the date example quoted above:
<pre>
    re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
  data> 25jun04\P
   0: 25jun04
   1: jun
  data> 25dec3\P
  Partial match
  data> 3ju\P
  Partial match
  data> 3juj\P
  No match
  data> j\P
  No match
</pre>
The first data string is matched completely, so <b>pcretest</b> shows the
matched substrings. The remaining four strings do not match the complete
pattern, but the first two are partial matches.
</P>
<P>
Last updated: 08 September 2004
<br>
Copyright © 1997-2004 University of Cambridge.
<p>
Return to the <a href="index.html">PCRE index page</a>.
</p>