| 442 |
<a href="pcre.html"><b>pcre</b></a> |
<a href="pcre.html"><b>pcre</b></a> |
| 443 |
page. |
page. |
| 444 |
</P> |
</P> |
| 445 |
|
<P> |
| 446 |
|
<pre> |
| 447 |
|
PCRE_NO_UTF8_CHECK |
| 448 |
|
</PRE> |
| 449 |
|
</P> |
| 450 |
|
<P> |
| 451 |
|
When PCRE_UTF8 is set, the validity of the pattern as a UTF-8 string is |
| 452 |
|
automatically checked. If an invalid UTF-8 sequence of bytes is found, |
| 453 |
|
<b>pcre_compile()</b> returns an error. If you already know that your pattern is |
| 454 |
|
valid, and you want to skip this check for performance reasons, you can set the |
| 455 |
|
PCRE_NO_UTF8_CHECK option. When it is set, the effect of passing an invalid |
| 456 |
|
UTF-8 string as a pattern is undefined. It may cause your program to crash. |
| 457 |
|
Note that there is a similar option for suppressing the checking of subject |
| 458 |
|
strings passed to <b>pcre_exec()</b>. |
| 459 |
|
</P> |
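The paragraph above says that PCRE_NO_UTF8_CHECK is only safe when the caller already knows the string is valid UTF-8. One way to establish that is to validate the string once in application code. The following is a minimal sketch of such a check, assuming the well-formedness rules of RFC 3629; `is_valid_utf8` is a hypothetical helper, not part of PCRE, and PCRE's own internal check may accept or reject a different set of sequences.

```c
#include <stddef.h>

/* Hypothetical helper (not a PCRE function): returns 1 if the first len
   bytes of s are well-formed UTF-8 under RFC 3629 rules, 0 otherwise. */
int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t extra;       /* continuation bytes expected after the lead byte */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point allowed for this length */
        size_t k;

        if (c < 0x80) {                 /* plain ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;                   /* stray continuation or invalid lead byte */
        }

        if (len - i - 1 < extra)
            return 0;                   /* sequence is truncated */

        for (k = 1; k <= extra; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;               /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return 0;                   /* overlong encoding */
        if (cp > 0x10FFFF)
            return 0;                   /* beyond the Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* UTF-16 surrogate, not a valid scalar */

        i += extra + 1;
    }
    return 1;
}
```

If a check like this succeeds for a pattern or subject, the caller could then set PCRE_NO_UTF8_CHECK on later calls that reuse the same string, so the validation cost is paid only once. If it fails, the option should not be set.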
| 460 |
<br><a name="SEC6" href="#TOC1">STUDYING A PATTERN</a><br> |
<br><a name="SEC6" href="#TOC1">STUDYING A PATTERN</a><br> |
| 461 |
<P> |
<P> |
| 462 |
<b>pcre_extra *pcre_study(const pcre *<i>code</i>, int <i>options</i>,</b> |
<b>pcre_extra *pcre_study(const pcre *<i>code</i>, int <i>options</i>,</b> |
| 877 |
unanchored at matching time. |
unanchored at matching time. |
| 878 |
</P> |
</P> |
| 879 |
<P> |
<P> |
| 880 |
|
When PCRE_UTF8 was set at compile time, the validity of the subject as a UTF-8 |
| 881 |
|
string is automatically checked. If an invalid UTF-8 sequence of bytes is |
| 882 |
|
found, <b>pcre_exec()</b> returns the error PCRE_ERROR_BADUTF8. If you already |
| 883 |
|
know that your subject is valid, and you want to skip this check for |
| 884 |
|
performance reasons, you can set the PCRE_NO_UTF8_CHECK option when calling |
| 885 |
|
<b>pcre_exec()</b>. When this option is set, the effect of passing an invalid |
| 886 |
|
UTF-8 string as a subject is undefined. It may cause your program to crash. |
| 887 |
|
</P> |
| 888 |
|
<P> |
| 889 |
There are also three further options that can be set only at matching time: |
There are also three further options that can be set only at matching time: |
| 890 |
</P> |
</P> |
| 891 |
<P> |
<P> |
| 1130 |
use by callout functions that want to yield a distinctive error code. See the |
use by callout functions that want to yield a distinctive error code. See the |
| 1131 |
<b>pcrecallout</b> documentation for details. |
<b>pcrecallout</b> documentation for details. |
| 1132 |
</P> |
</P> |
| 1133 |
|
<P> |
| 1134 |
|
<pre> |
| 1135 |
|
PCRE_ERROR_BADUTF8 (-10) |
| 1136 |
|
</PRE> |
| 1137 |
|
</P> |
| 1138 |
|
<P> |
| 1139 |
|
A string that contains an invalid UTF-8 byte sequence was passed as a subject. |
| 1140 |
|
</P> |
| 1141 |
<br><a name="SEC11" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a><br> |
<br><a name="SEC11" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a><br> |
| 1142 |
<P> |
<P> |
| 1143 |
<b>int pcre_copy_substring(const char *<i>subject</i>, int *<i>ovector</i>,</b> |
<b>int pcre_copy_substring(const char *<i>subject</i>, int *<i>ovector</i>,</b> |
| 1289 |
appropriate. |
appropriate. |
| 1290 |
</P> |
</P> |
| 1291 |
<P> |
<P> |
| 1292 |
Last updated: 03 February 2003 |
Last updated: 20 August 2003 |
| 1293 |
<br> |
<br> |
| 1294 |
Copyright © 1997-2003 University of Cambridge. |
Copyright © 1997-2003 University of Cambridge. |