
Diff of /code/trunk/doc/html/pcrejit.html

revision 690 by ph10, Sun Aug 28 15:23:03 2011 UTC revision 691 by ph10, Sun Sep 11 14:31:21 2011 UTC
# Line 13  from the original man page. If there is Line 13  from the original man page. If there is
13  man page, in case the conversion went wrong.  man page, in case the conversion went wrong.
14  <br>  <br>
15  <ul>  <ul>
16    <li><a name="TOC1" href="#SEC1">PCRE JUST-IN-TIME COMPILER SUPPORT</a>
17    <li><a name="TOC2" href="#SEC2">AVAILABILITY OF JIT SUPPORT</a>
18    <li><a name="TOC3" href="#SEC3">SIMPLE USE OF JIT</a>
19    <li><a name="TOC4" href="#SEC4">UNSUPPORTED OPTIONS AND PATTERN ITEMS</a>
20    <li><a name="TOC5" href="#SEC5">RETURN VALUES FROM JIT EXECUTION</a>
21    <li><a name="TOC6" href="#SEC6">SAVING AND RESTORING COMPILED PATTERNS</a>
22    <li><a name="TOC7" href="#SEC7">CONTROLLING THE JIT STACK</a>
23    <li><a name="TOC8" href="#SEC8">EXAMPLE CODE</a>
24    <li><a name="TOC9" href="#SEC9">SEE ALSO</a>
25    <li><a name="TOC10" href="#SEC10">AUTHOR</a>
26    <li><a name="TOC11" href="#SEC11">REVISION</a>
27  </ul>  </ul>
28    <br><a name="SEC1" href="#TOC1">PCRE JUST-IN-TIME COMPILER SUPPORT</a><br>
29    <P>
30    Just-in-time compiling is a heavyweight optimization that can greatly speed up
31    pattern matching. However, it comes at the cost of extra processing before the
32    match is performed. Therefore, it is of most benefit when the same pattern is
33    going to be matched many times. This does not necessarily mean many calls of
34    <b>pcre_exec()</b>; if the pattern is not anchored, matching attempts may take
35    place many times at various positions in the subject, even for a single call to
36    <b>pcre_exec()</b>. If the subject string is very long, it may still pay to use
37    JIT for one-off matches.
38    </P>
39    <P>
40    JIT support applies only to the traditional matching function,
41    <b>pcre_exec()</b>. It does not apply when <b>pcre_dfa_exec()</b> is being used.
42    The code for this support was written by Zoltan Herczeg.
43    </P>
44    <br><a name="SEC2" href="#TOC1">AVAILABILITY OF JIT SUPPORT</a><br>
45    <P>
46    JIT support is an optional feature of PCRE. The "configure" option --enable-jit
47    (or equivalent CMake option) must be set when PCRE is built if you want to use
48    JIT. The support is limited to the following hardware platforms:
49    <pre>
50      ARM v5, v7, and Thumb2
51      Intel x86 32-bit and 64-bit
52      MIPS 32-bit
53      Power PC 32-bit and 64-bit
54    </pre>
55    If --enable-jit is set on an unsupported platform, compilation fails.
56    </P>
57    <P>
58    A program can tell if JIT support is available by calling <b>pcre_config()</b>
59    with the PCRE_CONFIG_JIT option. The result is 1 when JIT is available, and 0
60    otherwise. However, a simple program does not need to check this in order to
61    use JIT. The API is implemented in a way that falls back to the ordinary PCRE
62    code if JIT is not available.
63    </P>
64    <br><a name="SEC3" href="#TOC1">SIMPLE USE OF JIT</a><br>
65    <P>
66    You have to do two things to make use of the JIT support in the simplest way:
67    <pre>
68      (1) Call <b>pcre_study()</b> with the PCRE_STUDY_JIT_COMPILE option for
69          each compiled pattern, and pass the resulting <b>pcre_extra</b> block to
70          <b>pcre_exec()</b>.
71    
72      (2) Use <b>pcre_free_study()</b> to free the <b>pcre_extra</b> block when it is
73          no longer needed instead of just freeing it yourself. This
74          ensures that any JIT data is also freed.
75    </pre>
76    In some circumstances you may need to call additional functions. These are
77    described in the section entitled
78    <a href="#stackcontrol">"Controlling the JIT stack"</a>
79    below.
80    </P>
81    <P>
82    If JIT support is not available, PCRE_STUDY_JIT_COMPILE is ignored, and no JIT
83    data is set up. Otherwise, the compiled pattern is passed to the JIT compiler,
84    which turns it into machine code that executes much faster than the normal
85    interpretive code. When <b>pcre_exec()</b> is passed a <b>pcre_extra</b> block
86    containing a pointer to JIT code, it obeys that instead of the normal code. The
87    result is identical, but the code runs much faster.
88    </P>
89    <P>
90    There are some <b>pcre_exec()</b> options that are not supported for JIT
91    execution. There are also some pattern items that JIT cannot handle. Details
92    are given below. In both cases, execution automatically falls back to the
93    interpretive code.
94    </P>
95    <P>
96    If the JIT compiler finds an unsupported item, no JIT data is generated. You
97    can find out if JIT execution is available after studying a pattern by calling
98    <b>pcre_fullinfo()</b> with the PCRE_INFO_JIT option. A result of 1 means that
99    JIT compilation was successful. A result of 0 means that JIT support is not
100    available, or the pattern was not studied with PCRE_STUDY_JIT_COMPILE, or the
101    JIT compiler was not able to handle the pattern.
102    </P>
103    <br><a name="SEC4" href="#TOC1">UNSUPPORTED OPTIONS AND PATTERN ITEMS</a><br>
104    <P>
105    The only <b>pcre_exec()</b> options that are supported for JIT execution are
106    PCRE_NO_UTF8_CHECK, PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY, and
107    PCRE_NOTEMPTY_ATSTART. Note in particular that partial matching is not
108    supported.
109    </P>
110    <P>
111    The unsupported pattern items are:
112    <pre>
113      \C            match a single byte, even in UTF-8 mode
114      (?Cn)          callouts
115      (?(&#60;name&#62;)...  conditional test on setting of a named subpattern
116      (?(R)...       conditional test on whole pattern recursion
117      (?(Rn)...      conditional test on recursion, by number
118      (?(R&name)...  conditional test on recursion, by name
119      (*COMMIT)      )
120      (*MARK)        )
121      (*PRUNE)       ) the backtracking control verbs
122      (*SKIP)        )
123      (*THEN)        )
124    </pre>
125    Support for some of these may be added in future.
126    </P>
127    <br><a name="SEC5" href="#TOC1">RETURN VALUES FROM JIT EXECUTION</a><br>
128    <P>
129    When a pattern is matched using JIT execution, the return values are the same
130    as those given by the interpretive <b>pcre_exec()</b> code, with the addition of
131    one new error code: PCRE_ERROR_JIT_STACKLIMIT. This means that the memory used
132    for the JIT stack was insufficient. See
133    <a href="#stackcontrol">"Controlling the JIT stack"</a>
134    below for a discussion of JIT stack usage. For compatibility with the
135    interpretive <b>pcre_exec()</b> code, no more than two-thirds of the
136    <i>ovector</i> argument is used for passing back captured substrings.
137    </P>
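<P>
The two-thirds rule can be made concrete with a little arithmetic. The sketch
below uses hypothetical helper names and assumes the usual <b>pcre_exec()</b>
convention that each captured substring is reported as a pair of offsets.
</P>

```c
/* Hypothetical helper: number of (start, end) offset pairs that can be
   passed back in an ovector of ovecsize ints. Only the first two-thirds
   of the vector carries results, so each pair costs three ints. */
int max_offset_pairs(int ovecsize)
{
  return ovecsize / 3;
}

/* Hypothetical helper: number of ints at the front of the vector that
   may hold offsets on return. */
int usable_ovector_ints(int ovecsize)
{
  return 2 * (ovecsize / 3);
}
```

<P>
For a vector of 30 ints this gives 10 pairs in the first 20 ints: the whole
match plus up to nine captured substrings.
</P>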
138    <P>
139    The error code PCRE_ERROR_MATCHLIMIT is returned by the JIT code if searching a
140    very large pattern tree goes on for too long, just as it is when JIT is not
141    used. However, the details of exactly what is counted are not the same. The
142    PCRE_ERROR_RECURSIONLIMIT error code is never returned by JIT
143    execution.
144    </P>
145    <br><a name="SEC6" href="#TOC1">SAVING AND RESTORING COMPILED PATTERNS</a><br>
146    <P>
147    The code that is generated by the JIT compiler is architecture-specific, and is
148    also position dependent. For those reasons it cannot be saved and restored like
149    the bytecode and other data of a compiled pattern. You should be able to run
150    <b>pcre_study()</b> on a saved and restored pattern, and thereby recreate the
151    JIT data, but because JIT compilation uses significant resources, it is
152    probably not worth doing this.
153    <a name="stackcontrol"></a></P>
154    <br><a name="SEC7" href="#TOC1">CONTROLLING THE JIT STACK</a><br>
155    <P>
156    When the compiled JIT code runs, it needs a block of memory to use as a stack.
157    By default, it uses 32K on the machine stack. However, some large or
158    complicated patterns need more than this. The error PCRE_ERROR_JIT_STACKLIMIT
159    is given when there is not enough stack. Three functions are provided for
160    managing blocks of memory for use as JIT stacks.
161    </P>
162    <P>
163    The <b>pcre_jit_stack_alloc()</b> function creates a JIT stack. Its arguments
164    are a starting size and a maximum size, and it returns a pointer to an opaque
165    structure of type <b>pcre_jit_stack</b>, or NULL if there is an error. The
166    <b>pcre_jit_stack_free()</b> function can be used to free a stack that is no
167    longer needed. (For the technically minded: the address space is allocated by
168    mmap or VirtualAlloc.)
169    </P>
170    <P>
171    JIT uses far less memory for recursion than the interpretive code,
172    and a maximum stack size of 512K to 1M should be more than enough for any
173    pattern.
174    </P>
175    <P>
176    The <b>pcre_assign_jit_stack()</b> function specifies which stack JIT code
177    should use. Its arguments are as follows:
178    <pre>
179      pcre_extra         *extra
180      pcre_jit_callback  callback
181      void               *data
182    </pre>
183    The <i>extra</i> argument must be the result of studying a pattern with
184    PCRE_STUDY_JIT_COMPILE. There are three cases for the values of the other two
185    arguments:
186    <pre>
187      (1) If <i>callback</i> is NULL and <i>data</i> is NULL, an internal 32K block
188          on the machine stack is used.
189    
190      (2) If <i>callback</i> is NULL and <i>data</i> is not NULL, <i>data</i> must be
191          a valid JIT stack, the result of calling <b>pcre_jit_stack_alloc()</b>.
192    
193      (3) If <i>callback</i> is not NULL, it must point to a function that is called
194          with <i>data</i> as an argument at the start of matching, in order to
195          set up a JIT stack. If the result is NULL, the internal 32K stack
196          is used; otherwise the return value must be a valid JIT stack,
197          the result of calling <b>pcre_jit_stack_alloc()</b>.
198    </pre>
199    You may safely assign the same JIT stack to more than one pattern, as long as
200    they are all matched sequentially in the same thread. In a multithreaded
201    application, each thread must use its own JIT stack.
202    </P>
203    <P>
204    Strictly speaking, even more is allowed. You can assign the same stack to any
205    number of patterns as long as they are not used for matching by multiple
206    threads at the same time. For example, you can assign the same stack to all
207    compiled patterns, and use a global mutex in the callback to wait until the
208    stack is available for use. However, this is an inefficient solution, and
209    not recommended.
210    </P>
211    <P>
212    This is a suggestion for how a typical multithreaded program might operate:
213    <pre>
214      During thread initialization
215        thread_local_var = pcre_jit_stack_alloc(...)
216    
217      During thread exit
218        pcre_jit_stack_free(thread_local_var)
219    
220      Use a one-line callback function
221        return thread_local_var
222    </pre>
223    All the functions described in this section do nothing if JIT is not available,
224    and <b>pcre_assign_jit_stack()</b> does nothing unless the <b>extra</b> argument
225    is non-NULL and points to a <b>pcre_extra</b> block that is the result of a
226    successful study with PCRE_STUDY_JIT_COMPILE.
227    </P>
228    <br><a name="SEC8" href="#TOC1">EXAMPLE CODE</a><br>
229    <P>
230    This is a single-threaded example that specifies a JIT stack without using a
231    callback.
232    <pre>
233      int rc;
234      int ovector[30];
235      pcre *re;
236      pcre_extra *extra;
237      pcre_jit_stack *jit_stack;
238    
239      re = pcre_compile(pattern, 0, &error, &erroffset, NULL);
240      /* Check for errors */
241      extra = pcre_study(re, PCRE_STUDY_JIT_COMPILE, &error);
242      jit_stack = pcre_jit_stack_alloc(32*1024, 512*1024);
243      /* Check for error (NULL) */
244      pcre_assign_jit_stack(extra, NULL, jit_stack);
245      rc = pcre_exec(re, extra, subject, length, 0, 0, ovector, 30);
246      /* Check results */
247      pcre_free(re);
248      pcre_free_study(extra);
249      pcre_jit_stack_free(jit_stack);
250    
251    </pre>
252    </P>
253    <br><a name="SEC9" href="#TOC1">SEE ALSO</a><br>
254    <P>
255    <b>pcreapi</b>(3)
256    </P>
257    <br><a name="SEC10" href="#TOC1">AUTHOR</a><br>
258    <P>
259    Philip Hazel
260    <br>
261    University Computing Service
262    <br>
263    Cambridge CB2 3QH, England.
264    <br>
265    </P>
266    <br><a name="SEC11" href="#TOC1">REVISION</a><br>
267    <P>
268    Last updated: 06 September 2011
269    <br>
270    Copyright &copy; 1997-2011 University of Cambridge.
271    <br>
272  <p>  <p>
273  Return to the <a href="index.html">PCRE index page</a>.  Return to the <a href="index.html">PCRE index page</a>.
274  </p>  </p>
