| 32 |
exists as well, but as it specifies the default, it is not described. |
exists as well, but as it specifies the default, it is not described. |
| 33 |
. |
. |
| 34 |
. |
. |
| 35 |
|
.SH "BUILDING 8-BIT and 16-BIT LIBRARIES" |
| 36 |
|
.rs |
| 37 |
|
.sp |
| 38 |
|
By default, a library called \fBlibpcre\fP is built, containing functions that |
| 39 |
|
take string arguments contained in vectors of bytes, either as single-byte |
| 40 |
|
characters, or interpreted as UTF-8 strings. You can also build a separate |
| 41 |
|
library, called \fBlibpcre16\fP, in which strings are contained in vectors of |
| 42 |
|
16-bit data units and interpreted either as single-unit characters or UTF-16 |
| 43 |
|
strings, by adding |
| 44 |
|
.sp |
| 45 |
|
--enable-pcre16 |
| 46 |
|
.sp |
| 47 |
|
to the \fBconfigure\fP command. If you do not want the 8-bit library, add |
| 48 |
|
.sp |
| 49 |
|
--disable-pcre8 |
| 50 |
|
.sp |
| 51 |
|
as well. At least one of the two libraries must be built. Note that the C++ and |
| 52 |
|
POSIX wrappers are for the 8-bit library only, and that \fBpcregrep\fP is an |
| 53 |
|
8-bit program. None of these are built if you select only the 16-bit library. |
| 54 |
|
. |
| 55 |
|
. |
| 56 |
.SH "BUILDING SHARED AND STATIC LIBRARIES" |
.SH "BUILDING SHARED AND STATIC LIBRARIES" |
| 57 |
.rs |
.rs |
| 58 |
.sp |
.sp |
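As an illustration of the library-selection flags added in the hunk above, the following shell sketch prints the flags for a chosen set of libraries. The function name `pcre_lib_flags` and its yes/no arguments are hypothetical and not part of PCRE; only `--enable-pcre16` and `--disable-pcre8` are taken from the text.

```sh
# Hypothetical helper (not part of PCRE): print the configure flags
# for a chosen set of libraries.
# $1 = build the 8-bit library (yes/no)
# $2 = build the 16-bit library (yes/no)
pcre_lib_flags() {
  want8=$1
  want16=$2
  # At least one of the two libraries must be built.
  if [ "$want8" != yes ] && [ "$want16" != yes ]; then
    echo "error: at least one of the 8-bit and 16-bit libraries must be built" >&2
    return 1
  fi
  flags=""
  # The 16-bit library is not built by default, so it needs an explicit flag.
  if [ "$want16" = yes ]; then
    flags="--enable-pcre16"
  fi
  # The 8-bit library is built by default, so only disabling it needs a flag.
  if [ "$want8" != yes ]; then
    flags="$flags --disable-pcre8"
  fi
  echo "$flags"
}
```

A call such as `./configure $(pcre_lib_flags no yes)` would then request a 16-bit-only build, in which case the C++ and POSIX wrappers and \fBpcregrep\fP are not built.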
| 68 |
.SH "C++ SUPPORT" |
.SH "C++ SUPPORT" |
| 69 |
.rs |
.rs |
| 70 |
.sp |
.sp |
| 71 |
By default, the \fBconfigure\fP script will search for a C++ compiler and C++ |
By default, if the 8-bit library is being built, the \fBconfigure\fP script |
| 72 |
header files. If it finds them, it automatically builds the C++ wrapper library |
will search for a C++ compiler and C++ header files. If it finds them, it |
| 73 |
for PCRE. You can disable this by adding |
automatically builds the C++ wrapper library (which supports only 8-bit |
| 74 |
|
strings). You can disable this by adding |
| 75 |
.sp |
.sp |
| 76 |
--disable-cpp |
--disable-cpp |
| 77 |
.sp |
.sp |
| 78 |
to the \fBconfigure\fP command. |
to the \fBconfigure\fP command. |
| 79 |
. |
. |
| 80 |
. |
. |
| 81 |
.SH "UTF-8 SUPPORT" |
.SH "UTF-8 and UTF-16 SUPPORT" |
| 82 |
.rs |
.rs |
| 83 |
.sp |
.sp |
| 84 |
To build PCRE with support for UTF-8 Unicode character strings, add |
To build PCRE with support for UTF Unicode character strings, add |
| 85 |
.sp |
.sp |
| 86 |
--enable-utf8 |
--enable-utf |
| 87 |
.sp |
.sp |
| 88 |
to the \fBconfigure\fP command. Of itself, this does not make PCRE treat |
to the \fBconfigure\fP command. This setting applies to both libraries, adding |
| 89 |
strings as UTF-8. As well as compiling PCRE with this option, you also have |
support for UTF-8 to the 8-bit library and support for UTF-16 to the 16-bit |
| 90 |
have to set the PCRE_UTF8 option when you call the \fBpcre_compile()\fP |
library. It is not possible to build one library with UTF support and the other |
| 91 |
or \fBpcre_compile2()\fP functions. |
without in the same configuration. (For backwards compatibility, --enable-utf8 |
| 92 |
|
is a synonym of --enable-utf.) |
| 93 |
|
.P |
| 94 |
|
Of itself, this setting does not make PCRE treat strings as UTF-8 or UTF-16. As |
| 95 |
|
well as compiling PCRE with this option, you also have to set the |
| 96 |
|
PCRE_UTF8 or PCRE_UTF16 option when you call one of the pattern compiling |
| 97 |
|
functions. |
| 98 |
.P |
.P |
| 99 |
If you set --enable-utf8 when compiling in an EBCDIC environment, PCRE expects |
If you set --enable-utf when compiling in an EBCDIC environment, PCRE expects |
| 100 |
its input to be either ASCII or UTF-8 (depending on the runtime option). It is |
its input to be either ASCII or UTF-8 (depending on the runtime option). It is |
| 101 |
not possible to support both EBCDIC and UTF-8 codes in the same version of the |
not possible to support both EBCDIC and UTF-8 codes in the same version of the |
| 102 |
library. Consequently, --enable-utf8 and --enable-ebcdic are mutually |
library. Consequently, --enable-utf and --enable-ebcdic are mutually |
| 103 |
exclusive. |
exclusive. |
| 104 |
. |
. |
| 105 |
. |
. |
| 106 |
.SH "UNICODE CHARACTER PROPERTY SUPPORT" |
.SH "UNICODE CHARACTER PROPERTY SUPPORT" |
| 107 |
.rs |
.rs |
| 108 |
.sp |
.sp |
| 109 |
UTF-8 support allows PCRE to process character values greater than 255 in the |
UTF support allows the libraries to process character codepoints up to 0x10ffff |
| 110 |
strings that it handles. On its own, however, it does not provide any |
in the strings that they handle. On its own, however, it does not provide any |
| 111 |
facilities for accessing the properties of such characters. If you want to be |
facilities for accessing the properties of such characters. If you want to be |
| 112 |
able to use the pattern escapes \eP, \ep, and \eX, which refer to Unicode |
able to use the pattern escapes \eP, \ep, and \eX, which refer to Unicode |
| 113 |
character properties, you must add |
character properties, you must add |
| 114 |
.sp |
.sp |
| 115 |
--enable-unicode-properties |
--enable-unicode-properties |
| 116 |
.sp |
.sp |
| 117 |
to the \fBconfigure\fP command. This implies UTF-8 support, even if you have |
to the \fBconfigure\fP command. This implies UTF support, even if you have |
| 118 |
not explicitly requested it. |
not explicitly requested it. |
| 119 |
.P |
.P |
| 120 |
Including Unicode property support adds around 30K of tables to the PCRE |
Including Unicode property support adds around 30K of tables to the PCRE |
| 196 |
.SH "POSIX MALLOC USAGE" |
.SH "POSIX MALLOC USAGE" |
| 197 |
.rs |
.rs |
| 198 |
.sp |
.sp |
| 199 |
When PCRE is called through the POSIX interface (see the |
When the 8-bit library is called through the POSIX interface (see the |
| 200 |
.\" HREF |
.\" HREF |
| 201 |
\fBpcreposix\fP |
\fBpcreposix\fP |
| 202 |
.\" |
.\" |
| 221 |
metacharacter). By default, two-byte values are used for these offsets, leading |
metacharacter). By default, two-byte values are used for these offsets, leading |
| 222 |
to a maximum size for a compiled pattern of around 64K. This is sufficient to |
to a maximum size for a compiled pattern of around 64K. This is sufficient to |
| 223 |
handle all but the most gigantic patterns. Nevertheless, some people do want to |
handle all but the most gigantic patterns. Nevertheless, some people do want to |
| 224 |
process truyl enormous patterns, so it is possible to compile PCRE to use |
process truly enormous patterns, so it is possible to compile PCRE to use |
| 225 |
three-byte or four-byte offsets by adding a setting such as |
three-byte or four-byte offsets by adding a setting such as |
| 226 |
.sp |
.sp |
| 227 |
--with-link-size=3 |
--with-link-size=3 |
| 228 |
.sp |
.sp |
| 229 |
to the \fBconfigure\fP command. The value given must be 2, 3, or 4. Using |
to the \fBconfigure\fP command. The value given must be 2, 3, or 4. For the |
| 230 |
longer offsets slows down the operation of PCRE because it has to load |
16-bit library, a value of 3 is rounded up to 4. Using longer offsets slows |
| 231 |
additional bytes when handling them. |
down the operation of PCRE because it has to load additional data when handling |
| 232 |
|
them. |
| 233 |
. |
. |
| 234 |
. |
. |
| 235 |
.SH "AVOIDING EXCESSIVE STACK USAGE" |
.SH "AVOIDING EXCESSIVE STACK USAGE" |
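The link-size rule in the hunk above (values 2, 3, or 4, with 3 rounded up to 4 for the 16-bit library) can be sketched as a small shell function. The name `effective_link_size` and its arguments are hypothetical; the sketch only restates the rule from the text and does not reproduce how \fBconfigure\fP implements it.

```sh
# Hypothetical helper (not part of PCRE): print the link size a build
# would use, following the rule stated in the text.
# $1 = library width in bits (8 or 16)
# $2 = value passed to --with-link-size
effective_link_size() {
  width=$1
  size=$2
  # The value given must be 2, 3, or 4.
  case $size in
    2|3|4) ;;
    *) echo "error: link size must be 2, 3, or 4" >&2; return 1 ;;
  esac
  # For the 16-bit library, a value of 3 is rounded up to 4.
  if [ "$width" = 16 ] && [ "$size" = 3 ]; then
    size=4
  fi
  echo "$size"
}
```

For example, `effective_link_size 16 3` prints 4, while `effective_link_size 8 3` prints 3.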
| 330 |
to the \fBconfigure\fP command. This setting implies |
to the \fBconfigure\fP command. This setting implies |
| 331 |
--enable-rebuild-chartables. You should only use it if you know that you are in |
--enable-rebuild-chartables. You should only use it if you know that you are in |
| 332 |
an EBCDIC environment (for example, an IBM mainframe operating system). The |
an EBCDIC environment (for example, an IBM mainframe operating system). The |
| 333 |
--enable-ebcdic option is incompatible with --enable-utf8. |
--enable-ebcdic option is incompatible with --enable-utf. |
| 334 |
. |
. |
| 335 |
. |
. |
| 336 |
.SH "PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT" |
.SH "PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT" |
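Because the hunk above renames the option that conflicts with `--enable-ebcdic`, a build script may want to check its argument list before calling \fBconfigure\fP. The sketch below is one way to do that. The function name `check_utf_ebcdic` is hypothetical, and treating `--enable-unicode-properties` as a UTF request is an assumption based on the statement that it implies UTF support; \fBconfigure\fP itself may report the conflict differently.

```sh
# Hypothetical pre-flight check (not part of PCRE): reject an argument
# list that combines UTF support with EBCDIC support.
# Returns 0 if the list is acceptable, 1 otherwise.
check_utf_ebcdic() {
  utf=no
  ebcdic=no
  for arg in "$@"; do
    case $arg in
      # --enable-utf8 is kept as a synonym of --enable-utf.
      --enable-utf|--enable-utf8) utf=yes ;;
      # Assumption: counted as UTF because the text says it implies UTF support.
      --enable-unicode-properties) utf=yes ;;
      --enable-ebcdic) ebcdic=yes ;;
    esac
  done
  if [ "$utf" = yes ] && [ "$ebcdic" = yes ]; then
    echo "error: --enable-utf and --enable-ebcdic are mutually exclusive" >&2
    return 1
  fi
  return 0
}
```

A script could run `check_utf_ebcdic "$@" && ./configure "$@"` so that the conflicting combination is caught before the build starts.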
| 400 |
.SH "SEE ALSO" |
.SH "SEE ALSO" |
| 401 |
.rs |
.rs |
| 402 |
.sp |
.sp |
| 403 |
\fBpcreapi\fP(3), \fBpcre_config\fP(3). |
\fBpcreapi\fP(3), \fBpcre16\fP(3), \fBpcre_config\fP(3). |
| 404 |
. |
. |
| 405 |
. |
. |
| 406 |
.SH AUTHOR |
.SH AUTHOR |
| 417 |
.rs |
.rs |
| 418 |
.sp |
.sp |
| 419 |
.nf |
.nf |
| 420 |
Last updated: 06 September 2011 |
Last updated: 07 January 2012 |
| 421 |
Copyright (c) 1997-2011 University of Cambridge. |
Copyright (c) 1997-2012 University of Cambridge. |
| 422 |
.fi |
.fi |