Contents of /code/trunk/doc/html/pcrecallout.html

Revision 75
Sat Feb 24 21:40:37 2007 UTC by nigel
File MIME type: text/html
File size: 7400 byte(s)
Load pcre-5.0 into code/trunk.

<html>
<head>
<title>pcrecallout specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcrecallout man page</h1>
<p>
Return to the <a href="index.html">PCRE index page</a>.
</p>
<p>
This page is part of the PCRE HTML documentation. It was generated automatically
from the original man page. If there is any nonsense in it, please consult the
man page, in case the conversion went wrong.
<br>
<ul>
<li><a name="TOC1" href="#SEC1">PCRE CALLOUTS</a>
<li><a name="TOC2" href="#SEC2">MISSING CALLOUTS</a>
<li><a name="TOC3" href="#SEC3">THE CALLOUT INTERFACE</a>
<li><a name="TOC4" href="#SEC4">RETURN VALUES</a>
</ul>
<br><a name="SEC1" href="#TOC1">PCRE CALLOUTS</a><br>
<P>
<b>int (*pcre_callout)(pcre_callout_block *);</b>
</P>
<P>
PCRE provides a feature called "callout", which is a means of temporarily
passing control to the caller of PCRE in the middle of pattern matching. The
caller of PCRE provides an external function by putting its entry point in the
global variable <i>pcre_callout</i>. By default, this variable contains NULL,
which disables all calling out.
</P>
<P>
Within a regular expression, (?C) indicates the points at which the external
function is to be called. Different callout points can be identified by putting
a number less than 256 after the letter C. The default value is zero.
For example, this pattern has two callout points:
<pre>
(?C1)\deabc(?C2)def
</pre>
If the PCRE_AUTO_CALLOUT option bit is set when <b>pcre_compile()</b> is called,
PCRE automatically inserts callouts, all with number 255, before each item in
the pattern. For example, if PCRE_AUTO_CALLOUT is used with the pattern
<pre>
A(\d{2}|--)
</pre>
it is processed as if it were
<pre>
(?C255)A(?C255)((?C255)\d{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)
</pre>
Notice that there is a callout before and after each parenthesis and
alternation bar. Automatic callouts can be used for tracking the progress of
pattern matching. The
<a href="pcretest.html"><b>pcretest</b></a>
command has an option that sets automatic callouts; when it is used, the output
indicates how the pattern is matched. This is useful information when you are
trying to optimize the performance of a particular pattern.
</P>
<br><a name="SEC2" href="#TOC1">MISSING CALLOUTS</a><br>
<P>
You should be aware that, because of optimizations in the way PCRE matches
patterns, callouts sometimes do not happen. For example, if the pattern is
<pre>
ab(?C4)cd
</pre>
PCRE knows that any matching string must contain the letter "d". If the subject
string is "abyz", the lack of "d" means that matching doesn't ever start, and
the callout is never reached. However, with "abyd", though the result is still
no match, the callout is obeyed.
</P>
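<P>
The effect of such an optimization can be pictured with a small sketch. This is
not PCRE's actual code: the function name <i>may_contain_match</i> and the idea
of passing the required character explicitly are assumptions made purely for
illustration. When a check of this kind fails, no match attempt is started, so
no callout point is ever reached.
</P>

```c
#include <string.h>

/* Hypothetical pre-check (not a PCRE function): returns 1 if the subject
   contains the byte that every match must contain, 0 otherwise. When it
   returns 0, a matcher can give up before trying any start position, so
   no callout is ever made. */
int may_contain_match(const char *subject, size_t length, char required)
{
  return memchr(subject, required, length) != NULL;
}
```

<P>
With the required character "d", the subject "abyz" fails this check, whereas
"abyd" passes it, so matching starts and the callout can be reached.
</P>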
<br><a name="SEC3" href="#TOC1">THE CALLOUT INTERFACE</a><br>
<P>
During matching, when PCRE reaches a callout point, the external function
defined by <i>pcre_callout</i> is called (if it is set). The only argument is a
pointer to a <b>pcre_callout_block</b> structure, which contains the following
fields:
<pre>
int <i>version</i>;
int <i>callout_number</i>;
int *<i>offset_vector</i>;
const char *<i>subject</i>;
int <i>subject_length</i>;
int <i>start_match</i>;
int <i>current_position</i>;
int <i>capture_top</i>;
int <i>capture_last</i>;
void *<i>callout_data</i>;
int <i>pattern_position</i>;
int <i>next_item_length</i>;
</pre>
The <i>version</i> field is an integer containing the version number of the
block format. The initial version was 0; the current version is 1. The version
number will change again in future if additional fields are added, but the
intention is never to remove any of the existing fields.
</P>
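<P>
Because fields are only ever added, a callout function can test <i>version</i>
before touching a newer field. The following is a minimal sketch under stated
assumptions: the structure is a local stand-in that mirrors the field list
above so that the example compiles without the library (real code would include
<b>pcre.h</b> instead), and the function name <i>next_item_offset</i> is
hypothetical.
</P>

```c
#include <stddef.h>

/* Local stand-in mirroring the field list above; real code should use the
   declaration from pcre.h rather than this copy. */
typedef struct {
  int version;
  int callout_number;
  int *offset_vector;
  const char *subject;
  int subject_length;
  int start_match;
  int current_position;
  int capture_top;
  int capture_last;
  void *callout_data;
  int pattern_position;   /* present from version 1 */
  int next_item_length;   /* present from version 1 */
} pcre_callout_block;

/* Hypothetical helper: returns the pattern offset of the next item, or -1
   when the block uses the version 0 format, which lacks that field. */
int next_item_offset(const pcre_callout_block *cb)
{
  if (cb->version < 1) return -1;
  return cb->pattern_position;
}
```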
<P>
The <i>callout_number</i> field contains the number of the callout, as compiled
into the pattern (that is, the number after ?C for manual callouts, and 255 for
automatically generated callouts).
</P>
<P>
The <i>offset_vector</i> field is a pointer to the vector of offsets that was
passed by the caller to <b>pcre_exec()</b>. The contents can be inspected in
order to extract substrings that have been matched so far, in the same way as
for extracting substrings after a match has completed.
</P>
<P>
The <i>subject</i> and <i>subject_length</i> fields contain copies of the values
that were passed to <b>pcre_exec()</b>.
</P>
<P>
The <i>start_match</i> field contains the offset within the subject at which the
current match attempt started. If the pattern is not anchored, the callout
function may be called several times from the same point in the pattern for
different starting points in the subject.
</P>
<P>
The <i>current_position</i> field contains the offset within the subject of the
current match pointer.
</P>
<P>
The <i>capture_top</i> field contains one more than the number of the highest
numbered captured substring so far. If no substrings have been captured,
the value of <i>capture_top</i> is one.
</P>
<P>
The <i>capture_last</i> field contains the number of the most recently captured
substring. If no substrings have been captured, its value is -1.
</P>
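<P>
The <i>offset_vector</i>, <i>subject</i> and <i>capture_top</i> fields can be
combined to read a substring captured so far. The sketch below makes several
assumptions: the structure is a local stand-in for the one in <b>pcre.h</b>,
the helper name <i>copy_capture_so_far</i> is hypothetical, and offsets are
taken to be stored in pairs with a negative value marking an unset group, as
they are after a completed match.
</P>

```c
#include <string.h>

/* Local stand-in mirroring the documented field list; real code should
   include pcre.h instead. */
typedef struct {
  int version;
  int callout_number;
  int *offset_vector;
  const char *subject;
  int subject_length;
  int start_match;
  int current_position;
  int capture_top;
  int capture_last;
  void *callout_data;
  int pattern_position;
  int next_item_length;
} pcre_callout_block;

/* Hypothetical helper: copies captured substring n (numbered from 1) into
   buf as a NUL-terminated string. Returns its length, or -1 if n is not
   below capture_top, the group is unset, or the text does not fit. */
int copy_capture_so_far(const pcre_callout_block *cb, int n,
                        char *buf, int size)
{
  int start, end, len;
  if (n < 1 || n >= cb->capture_top) return -1;
  start = cb->offset_vector[2 * n];      /* offset of first character */
  end = cb->offset_vector[2 * n + 1];    /* offset just past the last one */
  if (start < 0 || end < start) return -1;
  len = end - start;
  if (len >= size) return -1;
  memcpy(buf, cb->subject + start, (size_t)len);
  buf[len] = '\0';
  return len;
}
```

<P>
The range check against <i>capture_top</i> comes first because entries at or
beyond it do not describe captures made so far.
</P>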
<P>
The <i>callout_data</i> field contains a value that is passed to
<b>pcre_exec()</b> by the caller specifically so that it can be passed back in
callouts. It is passed in the <i>callout_data</i> field of the <b>pcre_extra</b>
data structure. If no such data was passed, the value of <i>callout_data</i> in
a <b>pcre_callout_block</b> structure is NULL. There is a description of the
<b>pcre_extra</b> structure in the
<a href="pcreapi.html"><b>pcreapi</b></a>
documentation.
</P>
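<P>
A typical use of <i>callout_data</i> is to give the callout function somewhere
to keep state without resorting to global variables. The following sketch
counts how often the function is called. The <i>callout_stats</i> type and the
function name <i>counting_callout</i> are hypothetical, and the structure is a
local stand-in for the one declared in <b>pcre.h</b>.
</P>

```c
#include <stddef.h>

/* Local stand-in mirroring the documented field list; real code should
   include pcre.h instead. */
typedef struct {
  int version;
  int callout_number;
  int *offset_vector;
  const char *subject;
  int subject_length;
  int start_match;
  int current_position;
  int capture_top;
  int capture_last;
  void *callout_data;
  int pattern_position;
  int next_item_length;
} pcre_callout_block;

/* Hypothetical caller-owned state, reached through callout_data. */
typedef struct {
  long calls;
} callout_stats;

/* Counts each callout and lets matching continue. A NULL callout_data
   (no data was passed) is tolerated and simply not counted. */
int counting_callout(pcre_callout_block *cb)
{
  callout_stats *stats = (callout_stats *)cb->callout_data;
  if (stats != NULL) stats->calls++;
  return 0;   /* zero: matching proceeds as normal */
}
```

<P>
In a real program the address of a <i>callout_stats</i> variable would be
placed in the <b>pcre_extra</b> block handed to <b>pcre_exec()</b>, and
<i>pcre_callout</i> would be set to point to <i>counting_callout</i>; see the
<a href="pcreapi.html"><b>pcreapi</b></a>
documentation for the details of that structure.
</P>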
<P>
The <i>pattern_position</i> field is present from version 1 of the
<b>pcre_callout_block</b> structure. It contains the offset to the next item to
be matched in the pattern string.
</P>
<P>
The <i>next_item_length</i> field is present from version 1 of the
<b>pcre_callout_block</b> structure. It contains the length of the next item to
be matched in the pattern string. When the callout immediately precedes an
alternation bar, a closing parenthesis, or the end of the pattern, the length
is zero. When the callout precedes an opening parenthesis, the length is that
of the entire subpattern.
</P>
<P>
The <i>pattern_position</i> and <i>next_item_length</i> fields are intended to
help in distinguishing between different automatic callouts, which all have the
same callout number. However, they are set for all callouts.
</P>
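<P>
Given the pattern string, these two fields are enough to recover the text of
the item that is about to be matched. The callout block does not carry the
pattern itself, so the sketch below assumes that the caller has passed it via
<i>callout_data</i>; that choice, the helper name <i>next_item_text</i>, and
the local stand-in structure are all assumptions of this example.
</P>

```c
#include <string.h>

/* Local stand-in mirroring the documented field list; real code should
   include pcre.h instead. */
typedef struct {
  int version;
  int callout_number;
  int *offset_vector;
  const char *subject;
  int subject_length;
  int start_match;
  int current_position;
  int capture_top;
  int capture_last;
  void *callout_data;
  int pattern_position;
  int next_item_length;
} pcre_callout_block;

/* Hypothetical helper: copies the next pattern item into buf as a
   NUL-terminated string and returns its length. Returns -1 for a version 0
   block, a missing pattern string, or a buffer that is too small.
   Assumes callout_data points to the pattern string. */
int next_item_text(const pcre_callout_block *cb, char *buf, int size)
{
  const char *pattern = (const char *)cb->callout_data;
  int len;
  if (cb->version < 1 || pattern == NULL) return -1;
  len = cb->next_item_length;
  if (len < 0 || len >= size) return -1;
  memcpy(buf, pattern + cb->pattern_position, (size_t)len);
  buf[len] = '\0';
  return len;
}
```

<P>
A length of zero yields an empty string, which corresponds to a callout that
precedes an alternation bar, a closing parenthesis, or the end of the pattern.
</P>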
<br><a name="SEC4" href="#TOC1">RETURN VALUES</a><br>
<P>
The external callout function returns an integer to PCRE. If the value is zero,
matching proceeds as normal. If the value is greater than zero, matching fails
at the current point, but backtracking to test other matching possibilities
goes ahead, just as if a lookahead assertion had failed. If the value is less
than zero, the match is abandoned, and <b>pcre_exec()</b> returns the negative
value.
</P>
<P>
Negative values should normally be chosen from the set of PCRE_ERROR_xxx
values. In particular, PCRE_ERROR_NOMATCH forces a standard "no match" failure.
The error number PCRE_ERROR_CALLOUT is reserved for use by callout functions;
it will never be used by PCRE itself.
</P>
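<P>
The three kinds of return value can be seen together in one small callout
function. This is an illustrative sketch only: the <i>callout_limit</i> type,
the function name <i>limited_callout</i>, and the rule applied to callout
number 1 are hypothetical; the structure is a local stand-in for the one in
<b>pcre.h</b>; and the fallback value of PCRE_ERROR_CALLOUT is assumed to match
that header, which real code should include instead.
</P>

```c
#include <stddef.h>

#ifndef PCRE_ERROR_CALLOUT
#define PCRE_ERROR_CALLOUT (-9)   /* assumed to match pcre.h */
#endif

/* Local stand-in mirroring the documented field list; real code should
   include pcre.h instead. */
typedef struct {
  int version;
  int callout_number;
  int *offset_vector;
  const char *subject;
  int subject_length;
  int start_match;
  int current_position;
  int capture_top;
  int capture_last;
  void *callout_data;
  int pattern_position;
  int next_item_length;
} pcre_callout_block;

/* Hypothetical caller-owned state: how many more callouts are allowed. */
typedef struct {
  int budget;
} callout_limit;

int limited_callout(pcre_callout_block *cb)
{
  callout_limit *limit = (callout_limit *)cb->callout_data;

  /* Negative: once the budget is used up, abandon the whole match. */
  if (limit != NULL && limit->budget-- <= 0)
    return PCRE_ERROR_CALLOUT;

  /* Positive: at callout 1, fail this path if nothing has been consumed
     since the match attempt started; backtracking continues. */
  if (cb->callout_number == 1 && cb->current_position == cb->start_match)
    return 1;

  return 0;   /* Zero: matching proceeds as normal. */
}
```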
<P>
Last updated: 09 September 2004
<br>
Copyright &copy; 1997-2004 University of Cambridge.
<p>
Return to the <a href="index.html">PCRE index page</a>.
</p>
