Diff of /code/trunk/doc/html/pcrebuild.html

revision 197 by ph10, Tue Jul 31 10:50:18 2007 UTC revision 453 by ph10, Fri Sep 18 19:12:35 2009 UTC
# Line 18  man page, in case the conversion went wr Line 18  man page, in case the conversion went wr
18  <li><a name="TOC3" href="#SEC3">UTF-8 SUPPORT</a>  <li><a name="TOC3" href="#SEC3">UTF-8 SUPPORT</a>
19  <li><a name="TOC4" href="#SEC4">UNICODE CHARACTER PROPERTY SUPPORT</a>  <li><a name="TOC4" href="#SEC4">UNICODE CHARACTER PROPERTY SUPPORT</a>
20  <li><a name="TOC5" href="#SEC5">CODE VALUE OF NEWLINE</a>  <li><a name="TOC5" href="#SEC5">CODE VALUE OF NEWLINE</a>
21  <li><a name="TOC6" href="#SEC6">BUILDING SHARED AND STATIC LIBRARIES</a>  <li><a name="TOC6" href="#SEC6">WHAT \R MATCHES</a>
22  <li><a name="TOC7" href="#SEC7">POSIX MALLOC USAGE</a>  <li><a name="TOC7" href="#SEC7">BUILDING SHARED AND STATIC LIBRARIES</a>
23  <li><a name="TOC8" href="#SEC8">HANDLING VERY LARGE PATTERNS</a>  <li><a name="TOC8" href="#SEC8">POSIX MALLOC USAGE</a>
24  <li><a name="TOC9" href="#SEC9">AVOIDING EXCESSIVE STACK USAGE</a>  <li><a name="TOC9" href="#SEC9">HANDLING VERY LARGE PATTERNS</a>
25  <li><a name="TOC10" href="#SEC10">LIMITING PCRE RESOURCE USAGE</a>  <li><a name="TOC10" href="#SEC10">AVOIDING EXCESSIVE STACK USAGE</a>
26  <li><a name="TOC11" href="#SEC11">CREATING CHARACTER TABLES AT BUILD TIME</a>  <li><a name="TOC11" href="#SEC11">LIMITING PCRE RESOURCE USAGE</a>
27  <li><a name="TOC12" href="#SEC12">USING EBCDIC CODE</a>  <li><a name="TOC12" href="#SEC12">CREATING CHARACTER TABLES AT BUILD TIME</a>
28  <li><a name="TOC13" href="#SEC13">SEE ALSO</a>  <li><a name="TOC13" href="#SEC13">USING EBCDIC CODE</a>
29  <li><a name="TOC14" href="#SEC14">AUTHOR</a>  <li><a name="TOC14" href="#SEC14">PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT</a>
30  <li><a name="TOC15" href="#SEC15">REVISION</a>  <li><a name="TOC15" href="#SEC15">PCRETEST OPTION FOR LIBREADLINE SUPPORT</a>
31    <li><a name="TOC16" href="#SEC16">SEE ALSO</a>
32    <li><a name="TOC17" href="#SEC17">AUTHOR</a>
33    <li><a name="TOC18" href="#SEC18">REVISION</a>
34  </ul>  </ul>
35  <br><a name="SEC1" href="#TOC1">PCRE BUILD-TIME OPTIONS</a><br>  <br><a name="SEC1" href="#TOC1">PCRE BUILD-TIME OPTIONS</a><br>
36  <P>  <P>
37  This document describes the optional features of PCRE that can be selected when  This document describes the optional features of PCRE that can be selected when
38  the library is compiled. They are all selected, or deselected, by providing  the library is compiled. It assumes use of the <b>configure</b> script, where
39  options to the <b>configure</b> script that is run before the <b>make</b>  the optional features are selected or deselected by providing options to
40  command. The complete list of options for <b>configure</b> (which includes the  <b>configure</b> before running the <b>make</b> command. However, the same
41  standard ones such as the selection of the installation directory) can be  options can be selected in both Unix-like and non-Unix-like environments using
42  obtained by running  the GUI facility of <b>cmake-gui</b> if you are using <b>CMake</b> instead of
43    <b>configure</b> to build PCRE.
44    </P>
45    <P>
46    There is a lot more information about building PCRE in non-Unix-like
47    environments in the file called <i>NON_UNIX_USE</i>, which is part of the PCRE
48    distribution. You should consult this file as well as the <i>README</i> file if
49    you are building in a non-Unix-like environment.
50    </P>
51    <P>
52    The complete list of options for <b>configure</b> (which includes the standard
53    ones such as the selection of the installation directory) can be obtained by
54    running
55  <pre>  <pre>
56    ./configure --help    ./configure --help
57  </pre>  </pre>
# Line 58  to the configure command. Line 73  to the configure command.
73  </P>  </P>
74  <br><a name="SEC3" href="#TOC1">UTF-8 SUPPORT</a><br>  <br><a name="SEC3" href="#TOC1">UTF-8 SUPPORT</a><br>
75  <P>  <P>
76  To build PCRE with support for UTF-8 character strings, add  To build PCRE with support for UTF-8 Unicode character strings, add
77  <pre>  <pre>
78    --enable-utf8    --enable-utf8
79  </pre>  </pre>
# Line 67  strings as UTF-8. As well as compiling P Line 82  strings as UTF-8. As well as compiling P
82  have to set the PCRE_UTF8 option when you call the <b>pcre_compile()</b>  have to set the PCRE_UTF8 option when you call the <b>pcre_compile()</b>
83  function.  function.
84  </P>  </P>
85    <P>
86    If you set --enable-utf8 when compiling in an EBCDIC environment, PCRE expects
87    its input to be either ASCII or UTF-8 (depending on the runtime option). It is
88    not possible to support both EBCDIC and UTF-8 codes in the same version of the
89    library. Consequently, --enable-utf8 and --enable-ebcdic are mutually
90    exclusive.
91    </P>
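As a rough illustration of the UTF-8 option above, the following sketch assumes a Unix-like shell started in the top-level directory of an unpacked PCRE source tree. Installation prefix and other standard options are left at their defaults.

```shell
# Sketch only: assumes the current directory is the top of a PCRE source tree.
# Build the library with UTF-8 support.
# Do not add --enable-ebcdic here; the two options are mutually exclusive.
./configure --enable-utf8
make
```

Even with a library built this way, a pattern is treated as UTF-8 only when the PCRE_UTF8 option is passed to <b>pcre_compile()</b>.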
92  <br><a name="SEC4" href="#TOC1">UNICODE CHARACTER PROPERTY SUPPORT</a><br>  <br><a name="SEC4" href="#TOC1">UNICODE CHARACTER PROPERTY SUPPORT</a><br>
93  <P>  <P>
94  UTF-8 support allows PCRE to process character values greater than 255 in the  UTF-8 support allows PCRE to process character values greater than 255 in the
# Line 89  documentation. Line 111  documentation.
111  </P>  </P>
112  <br><a name="SEC5" href="#TOC1">CODE VALUE OF NEWLINE</a><br>  <br><a name="SEC5" href="#TOC1">CODE VALUE OF NEWLINE</a><br>
113  <P>  <P>
114  By default, PCRE interprets character 10 (linefeed, LF) as indicating the end  By default, PCRE interprets the linefeed (LF) character as indicating the end
115  of a line. This is the normal newline character on Unix-like systems. You can  of a line. This is the normal newline character on Unix-like systems. You can
116  compile PCRE to use character 13 (carriage return, CR) instead, by adding  compile PCRE to use carriage return (CR) instead, by adding
117  <pre>  <pre>
118    --enable-newline-is-cr    --enable-newline-is-cr
119  </pre>  </pre>
# Line 120  Whatever line ending convention is selec Line 142  Whatever line ending convention is selec
142  overridden when the library functions are called. At build time it is  overridden when the library functions are called. At build time it is
143  conventional to use the standard for your operating system.  conventional to use the standard for your operating system.
144  </P>  </P>
145  <br><a name="SEC6" href="#TOC1">BUILDING SHARED AND STATIC LIBRARIES</a><br>  <br><a name="SEC6" href="#TOC1">WHAT \R MATCHES</a><br>
146    <P>
147    By default, the sequence \R in a pattern matches any Unicode newline sequence,
148    whatever has been selected as the line ending sequence. If you specify
149    <pre>
150      --enable-bsr-anycrlf
151    </pre>
152    the default is changed so that \R matches only CR, LF, or CRLF. Whatever is
153    selected when PCRE is built can be overridden when the library functions are
154    called.
155    </P>
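The two build-time defaults described in this section and the previous one can be combined on one command line. This is a minimal sketch, again assuming a shell in the top-level directory of a PCRE source tree; the particular combination is chosen only for illustration.

```shell
# Sketch only: assumes the current directory is the top of a PCRE source tree.
# Make CR the default line ending, and make \R match only CR, LF, or CRLF
# by default instead of any Unicode newline sequence.
./configure --enable-newline-is-cr --enable-bsr-anycrlf
make
```

Both settings are only defaults; as the text notes, they can be overridden when the library functions are called.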
156    <br><a name="SEC7" href="#TOC1">BUILDING SHARED AND STATIC LIBRARIES</a><br>
157  <P>  <P>
158  The PCRE building process uses <b>libtool</b> to build both shared and static  The PCRE building process uses <b>libtool</b> to build both shared and static
159  Unix libraries by default. You can suppress one of these by adding one of  Unix libraries by default. You can suppress one of these by adding one of
# Line 130  Unix libraries by default. You can suppr Line 163  Unix libraries by default. You can suppr
163  </pre>  </pre>
164  to the <b>configure</b> command, as required.  to the <b>configure</b> command, as required.
165  </P>  </P>
166  <br><a name="SEC7" href="#TOC1">POSIX MALLOC USAGE</a><br>  <br><a name="SEC8" href="#TOC1">POSIX MALLOC USAGE</a><br>
167  <P>  <P>
168  When PCRE is called through the POSIX interface (see the  When PCRE is called through the POSIX interface (see the
169  <a href="pcreposix.html"><b>pcreposix</b></a>  <a href="pcreposix.html"><b>pcreposix</b></a>
# Line 146  such as Line 179  such as
179  </pre>  </pre>
180  to the <b>configure</b> command.  to the <b>configure</b> command.
181  </P>  </P>
182  <br><a name="SEC8" href="#TOC1">HANDLING VERY LARGE PATTERNS</a><br>  <br><a name="SEC9" href="#TOC1">HANDLING VERY LARGE PATTERNS</a><br>
183  <P>  <P>
184  Within a compiled pattern, offset values are used to point from one part to  Within a compiled pattern, offset values are used to point from one part to
185  another (for example, from an opening parenthesis to an alternation  another (for example, from an opening parenthesis to an alternation
# Line 162  to the configure command. The val Line 195  to the configure command. The val
195  longer offsets slows down the operation of PCRE because it has to load  longer offsets slows down the operation of PCRE because it has to load
196  additional bytes when handling them.  additional bytes when handling them.
197  </P>  </P>
198  <br><a name="SEC9" href="#TOC1">AVOIDING EXCESSIVE STACK USAGE</a><br>  <br><a name="SEC10" href="#TOC1">AVOIDING EXCESSIVE STACK USAGE</a><br>
199  <P>  <P>
200  When matching with the <b>pcre_exec()</b> function, PCRE implements backtracking  When matching with the <b>pcre_exec()</b> function, PCRE implements backtracking
201  by making recursive calls to an internal function called <b>match()</b>. In  by making recursive calls to an internal function called <b>match()</b>. In
# Line 193  perform better than malloc() and Line 226  perform better than malloc() and
226  slowly when built in this way. This option affects only the <b>pcre_exec()</b>  slowly when built in this way. This option affects only the <b>pcre_exec()</b>
227  function; it is not relevant for the <b>pcre_dfa_exec()</b> function.  function; it is not relevant for the <b>pcre_dfa_exec()</b> function.
228  </P>  </P>
229  <br><a name="SEC10" href="#TOC1">LIMITING PCRE RESOURCE USAGE</a><br>  <br><a name="SEC11" href="#TOC1">LIMITING PCRE RESOURCE USAGE</a><br>
230  <P>  <P>
231  Internally, PCRE has a function called <b>match()</b>, which it calls repeatedly  Internally, PCRE has a function called <b>match()</b>, which it calls repeatedly
232  (sometimes recursively) when matching a pattern with the <b>pcre_exec()</b>  (sometimes recursively) when matching a pattern with the <b>pcre_exec()</b>
# Line 222  constraints. However, you can set a lowe Line 255  constraints. However, you can set a lowe
255  </pre>  </pre>
256  to the <b>configure</b> command. This value can also be overridden at run time.  to the <b>configure</b> command. This value can also be overridden at run time.
257  </P>  </P>
258  <br><a name="SEC11" href="#TOC1">CREATING CHARACTER TABLES AT BUILD TIME</a><br>  <br><a name="SEC12" href="#TOC1">CREATING CHARACTER TABLES AT BUILD TIME</a><br>
259  <P>  <P>
260  PCRE uses fixed tables for processing characters whose code values are less  PCRE uses fixed tables for processing characters whose code values are less
261  than 256. By default, PCRE is built with a set of tables that are distributed  than 256. By default, PCRE is built with a set of tables that are distributed
# Line 239  compiling, because dftables is ru Line 272  compiling, because dftables is ru
272  create alternative tables when cross compiling, you will have to do so "by  create alternative tables when cross compiling, you will have to do so "by
273  hand".)  hand".)
274  </P>  </P>
275  <br><a name="SEC12" href="#TOC1">USING EBCDIC CODE</a><br>  <br><a name="SEC13" href="#TOC1">USING EBCDIC CODE</a><br>
276  <P>  <P>
277  PCRE assumes by default that it will run in an environment where the character  PCRE assumes by default that it will run in an environment where the character
278  code is ASCII (or Unicode, which is a superset of ASCII). This is the case for  code is ASCII (or Unicode, which is a superset of ASCII). This is the case for
# Line 250  EBCDIC environment by adding Line 283  EBCDIC environment by adding
283  </pre>  </pre>
284  to the <b>configure</b> command. This setting implies  to the <b>configure</b> command. This setting implies
285  --enable-rebuild-chartables. You should only use it if you know that you are in  --enable-rebuild-chartables. You should only use it if you know that you are in
286  an EBCDIC environment (for example, an IBM mainframe operating system).  an EBCDIC environment (for example, an IBM mainframe operating system). The
287    --enable-ebcdic option is incompatible with --enable-utf8.
288    </P>
289    <br><a name="SEC14" href="#TOC1">PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
290    <P>
291    By default, <b>pcregrep</b> reads all files as plain text. You can build it so
292    that it recognizes files whose names end in <b>.gz</b> or <b>.bz2</b>, and reads
293    them with <b>libz</b> or <b>libbz2</b>, respectively, by adding one or both of
294    <pre>
295      --enable-pcregrep-libz
296      --enable-pcregrep-libbz2
297    </pre>
298    to the <b>configure</b> command. These options naturally require that the
299    relevant libraries are installed on your system. Configuration will fail if
300    they are not.
301    </P>
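A possible build with both compression options enabled might look like the sketch below. It assumes that the <b>libz</b> and <b>libbz2</b> development files are already installed, and that the commands are run in the top-level directory of a PCRE source tree. The pattern and file name in the last line are hypothetical.

```shell
# Sketch only: assumes libz and libbz2 (with headers) are installed.
# configure is expected to fail if either library is missing.
./configure --enable-pcregrep-libz --enable-pcregrep-libbz2
make

# Hypothetical pattern and file name, for illustration only.
# The .gz suffix is what tells pcregrep to read the file through libz.
./pcregrep 'error' logfile.gz
```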
302    <br><a name="SEC15" href="#TOC1">PCRETEST OPTION FOR LIBREADLINE SUPPORT</a><br>
303    <P>
304    If you add
305    <pre>
306      --enable-pcretest-libreadline
307    </pre>
308    to the <b>configure</b> command, <b>pcretest</b> is linked with the
309    <b>libreadline</b> library, and when its input is from a terminal, it reads it
310    using the <b>readline()</b> function. This provides line-editing and history
311    facilities. Note that <b>libreadline</b> is GPL-licenced, so if you distribute a
312    binary of <b>pcretest</b> linked in this way, there may be licensing issues.
313    </P>
314    <P>
315    Setting this option causes the <b>-lreadline</b> option to be added to the
316    <b>pcretest</b> build. In many operating environments with a system-installed
317    <b>libreadline</b> this is sufficient. However, in some environments (e.g.
318    if an unmodified distribution version of readline is in use), some extra
319    configuration may be necessary. The INSTALL file for <b>libreadline</b> says
320    this:
321    <pre>
322      "Readline uses the termcap functions, but does not link with the
323      termcap or curses library itself, allowing applications which link
324      with readline the to choose an appropriate library."
325    </pre>
326    If your environment has not been set up so that an appropriate library is
327    automatically included, you may need to add something like
328    <pre>
329      LIBS="-lncurses"
330    </pre>
331    immediately before the <b>configure</b> command.
332  </P>  </P>
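Putting a variable assignment "immediately before" the command means writing it on the same command line, so that it applies to that one invocation. The sketch below assumes the termcap functions are provided by the ncurses library, which is linked with <b>-lncurses</b>; on other systems a different library may be appropriate.

```shell
# Sketch only: assumes ncurses supplies the termcap functions readline needs.
# LIBS is set for this single configure run and is not exported to the shell.
LIBS="-lncurses" ./configure --enable-pcretest-libreadline
make
```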
333  <br><a name="SEC13" href="#TOC1">SEE ALSO</a><br>  <br><a name="SEC16" href="#TOC1">SEE ALSO</a><br>
334  <P>  <P>
335  <b>pcreapi</b>(3), <b>pcre_config</b>(3).  <b>pcreapi</b>(3), <b>pcre_config</b>(3).
336  </P>  </P>
337  <br><a name="SEC14" href="#TOC1">AUTHOR</a><br>  <br><a name="SEC17" href="#TOC1">AUTHOR</a><br>
338  <P>  <P>
339  Philip Hazel  Philip Hazel
340  <br>  <br>
# Line 265  University Computing Service Line 343  University Computing Service
343  Cambridge CB2 3QH, England.  Cambridge CB2 3QH, England.
344  <br>  <br>
345  </P>  </P>
346  <br><a name="SEC15" href="#TOC1">REVISION</a><br>  <br><a name="SEC18" href="#TOC1">REVISION</a><br>
347  <P>  <P>
348  Last updated: 30 July 2007  Last updated: 06 September 2009
349  <br>  <br>
350  Copyright &copy; 1997-2007 University of Cambridge.  Copyright &copy; 1997-2009 University of Cambridge.
351  <br>  <br>
352  <p>  <p>
353  Return to the <a href="index.html">PCRE index page</a>.  Return to the <a href="index.html">PCRE index page</a>.