| 729 |
.SH "LOCALE SUPPORT"
.SH "LOCALE SUPPORT"
| 730 |
.rs |
.rs |
| 731 |
.sp |
.sp |
| 732 |
PCRE handles caseless matching, and determines whether characters are letters |
PCRE handles caseless matching, and determines whether characters are letters, |
| 733 |
digits, or whatever, by reference to a set of tables, indexed by character |
digits, or whatever, by reference to a set of tables, indexed by character |
| 734 |
value. When running in UTF-8 mode, this applies only to characters with codes |
value. When running in UTF-8 mode, this applies only to characters with codes |
| 735 |
less than 128. Higher-valued codes never match escapes such as \ew or \ed, but |
less than 128. Higher-valued codes never match escapes such as \ew or \ed, but |
| 736 |
can be tested with \ep if PCRE is built with Unicode character property |
can be tested with \ep if PCRE is built with Unicode character property |
| 737 |
support. The use of locales with Unicode is discouraged. |
support. The use of locales with Unicode is discouraged. If you are handling |
| 738 |
.P |
characters with codes greater than 128, you should either use UTF-8 and |
| 739 |
An internal set of tables is created in the default C locale when PCRE is |
Unicode, or use locales, but not try to mix the two. |
| 740 |
built. This is used when the final argument of \fBpcre_compile()\fP is NULL, |
.P |
| 741 |
and is sufficient for many applications. An alternative set of tables can, |
PCRE contains an internal set of tables that are used when the final argument |
| 742 |
however, be supplied. These may be created in a different locale from the |
of \fBpcre_compile()\fP is NULL. These are sufficient for many applications. |
| 743 |
default. As more and more applications change to using Unicode, the need for |
Normally, the internal tables recognize only ASCII characters. However, when |
| 744 |
this locale support is expected to die away. |
PCRE is built, it is possible to cause the internal tables to be rebuilt in the |
| 745 |
|
default "C" locale of the local system, which may cause them to be different. |
| 746 |
|
.P |
| 747 |
|
The internal tables can always be overridden by tables supplied by the |
| 748 |
|
application that calls PCRE. These may be created in a different locale from |
| 749 |
|
the default. As more and more applications change to using Unicode, the need |
| 750 |
|
for this locale support is expected to die away. |
| 751 |
.P |
.P |
| 752 |
External tables are built by calling the \fBpcre_maketables()\fP function, |
External tables are built by calling the \fBpcre_maketables()\fP function, |
| 753 |
which has no arguments, in the relevant locale. The result can then be passed |
which has no arguments, in the relevant locale. The result can then be passed |
| 760 |
tables = pcre_maketables(); |
tables = pcre_maketables(); |
| 761 |
re = pcre_compile(..., tables); |
re = pcre_compile(..., tables); |
| 762 |
.sp |
.sp |
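As a rough illustration of what "a set of tables, indexed by character value" built "in the relevant locale" can mean, the sketch below fills a 256-entry classification table from the standard <ctype.h> functions under whatever locale is current. It is not PCRE's implementation: the names build_class_table, is_letter, and the CLASS_* flags are hypothetical, and the real tables produced by pcre_maketables() have a different, internal layout.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; PCRE's real table layout differs. */
enum { CLASS_LETTER = 0x01, CLASS_DIGIT = 0x02, CLASS_SPACE = 0x04 };

/* Fill a 256-entry table, indexed by character value, using the
   <ctype.h> classification of whatever locale is current at call time. */
void build_class_table(unsigned char table[256])
{
    assert(table != NULL);
    for (int c = 0; c < 256; c++) {
        unsigned char flags = 0;
        if (isalpha(c)) flags |= CLASS_LETTER;
        if (isdigit(c)) flags |= CLASS_DIGIT;
        if (isspace(c)) flags |= CLASS_SPACE;
        table[c] = flags;
    }
}

/* Look up one byte in a previously built table. */
int is_letter(const unsigned char table[256], unsigned char c)
{
    return (table[c] & CLASS_LETTER) != 0;
}
```

Calling setlocale(LC_CTYPE, "C") before build_class_table() yields a table in which only the 26 unaccented letters in each case count as letters. Under a single-byte locale selected with a different setlocale() call, bytes above 127 may be classified as letters too, depending on what the local system provides. The table records the locale that was active when it was built, not the one active when it is consulted.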
| 763 |
|
The locale name "fr_FR" is used on Linux and other Unix-like systems; if you |
| 764 |
|
are using Windows, the name for the French locale is "french". |
| 765 |
|
.P |
| 766 |
When \fBpcre_maketables()\fP runs, the tables are built in memory that is |
When \fBpcre_maketables()\fP runs, the tables are built in memory that is |
| 767 |
obtained via \fBpcre_malloc\fP. It is the caller's responsibility to ensure |
obtained via \fBpcre_malloc\fP. It is the caller's responsibility to ensure |
| 768 |
that the memory containing the tables remains available for as long as it is |
that the memory containing the tables remains available for as long as it is |