The original operation of PCRE was on strings of one-byte characters. However,
there is now also support for UTF-8 character strings. To use this, you must
build PCRE to include UTF-8 support, and then call \fBpcre_compile()\fP with
the PCRE_UTF8 option. There is also a special sequence that can be given at the
start of a pattern:
.sp
(*UTF8)
.sp
Starting a pattern with this sequence is equivalent to setting the PCRE_UTF8
option. This feature is not Perl-compatible. How setting UTF-8 mode affects
pattern matching is mentioned in several places below. There is also a summary
.\" </a>
"Newline sequences"
.\"
above. There is also the (*UTF8) leading sequence that can be used to set UTF-8
mode; this is equivalent to setting the PCRE_UTF8 option.
.
.