| 601 |
PCRE_NO_UTF8_CHECK |
PCRE_NO_UTF8_CHECK |
| 602 |
.sp |
.sp |
| 603 |
When PCRE_UTF8 is set, the validity of the pattern as a UTF-8 string is |
When PCRE_UTF8 is set, the validity of the pattern as a UTF-8 string is |
| 604 |
automatically checked. If an invalid UTF-8 sequence of bytes is found, |
automatically checked. Note that the check is for a syntactically valid UTF-8 |
| 605 |
\fBpcre_compile()\fP returns an error. If you already know that your pattern is |
byte string, as defined by RFC 2279. It is \fInot\fP a check for a UTF-8 string |
| 606 |
valid, and you want to skip this check for performance reasons, you can set the |
of assigned or allowable Unicode code points. |
| 607 |
PCRE_NO_UTF8_CHECK option. When it is set, the effect of passing an invalid |
.P |
| 608 |
UTF-8 string as a pattern is undefined. It may cause your program to crash. |
If an invalid UTF-8 sequence of bytes is found, \fBpcre_compile()\fP returns an |
| 609 |
Note that this option can also be passed to \fBpcre_exec()\fP and |
error. If you already know that your pattern is valid, and you want to skip |
| 610 |
\fBpcre_dfa_exec()\fP, to suppress the UTF-8 validity checking of subject |
this check for performance reasons, you can set the PCRE_NO_UTF8_CHECK option. |
| 611 |
strings. |
When it is set, the effect of passing an invalid UTF-8 string as a pattern is |
| 612 |
|
undefined. It may cause your program to crash. Note that this option can also |
| 613 |
|
be passed to \fBpcre_exec()\fP and \fBpcre_dfa_exec()\fP, to suppress the UTF-8 |
| 614 |
|
validity checking of subject strings. |
| 615 |
. |
. |
| 616 |
. |
. |
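The distinction drawn in the revised text, a syntactic check of the byte string rather than a check of code point assignment, can be illustrated with a small standalone sketch. The function name looks_like_utf8 is hypothetical and is not part of the PCRE API, and the code is not PCRE's own checker. It assumes the RFC 2279 forms of one to six bytes, and rejecting overlong encodings is an assumption of this sketch rather than something the text above states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the PCRE API.
   Returns 1 if the buffer has the byte-level shape of UTF-8 in the
   RFC 2279 sense (sequences of 1 to 6 bytes), 0 otherwise.
   It never asks whether a decoded code point is assigned or allowable. */
int looks_like_utf8(const unsigned char *s, size_t len)
{
    /* Indexed by the number of continuation bytes. If the lead byte is the
       "bare" one (no payload bits set), the second byte must be at least
       min_second, otherwise the encoding is overlong (assumption of this
       sketch: overlong forms are rejected). */
    static const unsigned char bare_lead[6]  = { 0, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC };
    static const unsigned char min_second[6] = { 0, 0x80, 0xA0, 0x90, 0x88, 0x84 };
    size_t i = 0;

    assert(s != NULL || len == 0);

    while (i < len) {
        unsigned char c = s[i++];
        size_t extra, k;

        if (c < 0x80) continue;        /* single-byte character */
        if (c < 0xC2) return 0;        /* stray continuation byte, or overlong 0xC0/0xC1 */
        else if (c < 0xE0) extra = 1;
        else if (c < 0xF0) extra = 2;
        else if (c < 0xF8) extra = 3;
        else if (c < 0xFC) extra = 4;
        else if (c < 0xFE) extra = 5;
        else return 0;                 /* 0xFE and 0xFF never appear in UTF-8 */

        if (extra > len - i) return 0; /* sequence is truncated */
        if (c == bare_lead[extra] && s[i] < min_second[extra])
            return 0;                  /* overlong encoding */

        for (k = 0; k < extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return 0; /* not 10xxxxxx */
        }
        i += extra;
    }
    return 1;
}
```

With this sketch, the three bytes EF BF BF (U+FFFF, a noncharacter) are accepted because they are well formed, while a lone C3 or 80 is rejected. That is the sense in which the check is syntactic only.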
| 617 |
.SH "COMPILATION ERROR CODES" |
.SH "COMPILATION ERROR CODES" |
| 1234 |
.sp |
.sp |
| 1235 |
When PCRE_UTF8 is set at compile time, the validity of the subject as a UTF-8 |
When PCRE_UTF8 is set at compile time, the validity of the subject as a UTF-8 |
| 1236 |
string is automatically checked when \fBpcre_exec()\fP is subsequently called. |
string is automatically checked when \fBpcre_exec()\fP is subsequently called. |
| 1237 |
The value of \fIstartoffset\fP is also checked to ensure that it points to the |
Note that the check is for a syntactically valid UTF-8 byte string, as defined |
| 1238 |
start of a UTF-8 character. If an invalid UTF-8 sequence of bytes is found, |
by RFC 2279. It is \fInot\fP a check for a UTF-8 string of assigned or |
| 1239 |
\fBpcre_exec()\fP returns the error PCRE_ERROR_BADUTF8. If \fIstartoffset\fP |
allowable Unicode code points. The value of \fIstartoffset\fP is also checked |
| 1240 |
contains an invalid value, PCRE_ERROR_BADUTF8_OFFSET is returned. |
to ensure that it points to the start of a UTF-8 character. If an invalid UTF-8 |
| 1241 |
|
sequence of bytes is found, \fBpcre_exec()\fP returns the error |
| 1242 |
|
PCRE_ERROR_BADUTF8. If \fIstartoffset\fP contains an invalid value, |
| 1243 |
|
PCRE_ERROR_BADUTF8_OFFSET is returned. |
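The startoffset condition described above can be pictured as a test that the byte at the offset is not a UTF-8 continuation byte (one of the form 10xxxxxx). The following is a minimal sketch under that reading. The name offset_at_char_start is hypothetical and not part of the PCRE API, and the range check and the treatment of an offset equal to the subject length are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the PCRE API.
   Returns 1 if offset could be the start of a UTF-8 character in subject,
   0 otherwise. length and offset are byte counts. */
int offset_at_char_start(const char *subject, int length, int offset)
{
    assert(subject != NULL && length >= 0);

    if (offset < 0 || offset > length) return 0; /* outside the subject (assumption) */
    if (offset == length) return 1;              /* end of subject: no byte to inspect */

    /* Continuation bytes have the bit pattern 10xxxxxx. */
    return (((unsigned char)subject[offset]) & 0xC0) != 0x80;
}
```

For the four-byte subject "a", C3, A9, "z", offsets 0, 1 and 3 point at character starts, while offset 2 falls in the middle of the two-byte character and is the kind of value the text says leads to PCRE_ERROR_BADUTF8_OFFSET.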
| 1244 |
.P |
.P |
| 1245 |
If you already know that your subject is valid, and you want to skip these |
If you already know that your subject is valid, and you want to skip these |
| 1246 |
checks for performance reasons, you can set the PCRE_NO_UTF8_CHECK option when |
checks for performance reasons, you can set the PCRE_NO_UTF8_CHECK option when |
| 1641 |
.\" HREF |
.\" HREF |
| 1642 |
\fBpcrepattern\fP |
\fBpcrepattern\fP |
| 1643 |
.\" |
.\" |
| 1644 |
documentation. When duplicates are present, \fBpcre_copy_named_substring()\fP |
documentation. |
| 1645 |
and \fBpcre_get_named_substring()\fP return the first substring corresponding |
.P |
| 1646 |
to the given name that is set. If none are set, an empty string is returned. |
When duplicates are present, \fBpcre_copy_named_substring()\fP and |
| 1647 |
The \fBpcre_get_stringnumber()\fP function returns one of the numbers that are |
\fBpcre_get_named_substring()\fP return the first substring corresponding to |
| 1648 |
associated with the name, but it is not defined which it is. |
the given name that is set. If none are set, PCRE_ERROR_NOSUBSTRING (-7) is |
| 1649 |
.sp |
returned; no data is returned. The \fBpcre_get_stringnumber()\fP function |
| 1650 |
|
returns one of the numbers that are associated with the name, but it is not |
| 1651 |
|
defined which it is. |
| 1652 |
|
.P |
| 1653 |
If you want to get full details of all captured substrings for a given name, |
If you want to get full details of all captured substrings for a given name, |
| 1654 |
you must use the \fBpcre_get_stringtable_entries()\fP function. The first |
you must use the \fBpcre_get_stringtable_entries()\fP function. The first |
| 1655 |
argument is the compiled pattern, and the second is the name. The third and |
argument is the compiled pattern, and the second is the name. The third and |
| 1874 |
.rs |
.rs |
| 1875 |
.sp |
.sp |
| 1876 |
.nf |
.nf |
| 1877 |
Last updated: 30 July 2007 |
Last updated: 07 August 2007 |
| 1878 |
Copyright (c) 1997-2007 University of Cambridge. |
Copyright (c) 1997-2007 University of Cambridge. |
| 1879 |
.fi |
.fi |