--- code/trunk/doc/html/pcreapi.html 2007/02/24 21:40:20 70
+++ code/trunk/doc/html/pcreapi.html 2007/02/24 21:40:24 71
@@ -442,6 +442,21 @@ pcre page.
+
+
+ PCRE_NO_UTF8_CHECK
+
+
+
+When PCRE_UTF8 is set, the validity of the pattern as a UTF-8 string is
+automatically checked. If an invalid UTF-8 sequence of bytes is found,
+pcre_compile() returns an error. If you already know that your pattern is
+valid, and you want to skip this check for performance reasons, you can set the
+PCRE_NO_UTF8_CHECK option. When it is set, the effect of passing an invalid
+UTF-8 string as a pattern is undefined. It may cause your program to crash.
+Note that there is a similar option for suppressing the checking of subject
+strings passed to pcre_exec().
+
pcre_extra *pcre_study(const pcre *code, int options,
@@ -862,6 +877,15 @@ unachored at matching time.
+When PCRE_UTF8 was set at compile time, the validity of the subject as a UTF-8
+string is automatically checked. If an invalid UTF-8 sequence of bytes is
+found, pcre_exec() returns the error PCRE_ERROR_BADUTF8. If you already
+know that your subject is valid, and you want to skip this check for
+performance reasons, you can set the PCRE_NO_UTF8_CHECK option when calling
+pcre_exec(). When this option is set, the effect of passing an invalid
+UTF-8 string as a subject is undefined. It may cause your program to crash.
+
+There are also three further options that can be set only at matching time:
@@ -1106,6 +1130,14 @@ use by callout functions that want to yield a distinctive error code. See the pcrecallout documentation for details.
+
+
+ PCRE_ERROR_BADUTF8 (-10)
+
+
+
+A string that contains an invalid UTF-8 byte sequence was passed as a subject.
+
int pcre_copy_substring(const char *subject, int *ovector,
@@ -1257,6 +1289,6 @@ appropriate.
-Last updated: 03 February 2003
+Last updated: 20 August 2003
Copyright © 1997-2003 University of Cambridge.
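
Note on the PCRE_NO_UTF8_CHECK text added above: the new paragraphs assume the caller already knows that the pattern or subject is valid UTF-8. One way a caller might establish that is to validate the bytes once, before setting the option. The following is only a minimal sketch of such a check. The helper name utf8_is_valid is hypothetical and not part of PCRE, and it follows the RFC 3629 rules (no overlong forms, no surrogates, nothing above U+10FFFF), which is an assumption and may differ from what pcre_compile() and pcre_exec() themselves accept.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE: returns 1 if the len bytes at s
   are well-formed UTF-8 in the RFC 3629 sense, and 0 otherwise. */
int utf8_is_valid(const char *s, size_t len)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t i = 0;

    assert(s != NULL || len == 0);

    while (i < len) {
        unsigned char c = p[i];
        size_t need;        /* continuation bytes expected after the lead byte */
        size_t k;
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point legal for this length */

        if (c < 0x80) {     /* single-byte (ASCII) character */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;       /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need)
            return 0;       /* sequence is cut off by the end of the input */
        i++;

        for (k = 0; k < need; k++, i++) {
            if ((p[i] & 0xC0) != 0x80)
                return 0;   /* expected a 10xxxxxx continuation byte */
            cp = (cp << 6) | (unsigned long)(p[i] & 0x3F);
        }

        if (cp < min)
            return 0;       /* overlong encoding */
        if (cp > 0x10FFFFUL)
            return 0;       /* beyond the Unicode range */
        if (cp >= 0xD800UL && cp <= 0xDFFFUL)
            return 0;       /* UTF-16 surrogate, not a character */
    }
    return 1;
}
```

A caller that matches the same subject many times could run a check like this once and pass PCRE_NO_UTF8_CHECK to pcre_exec() only when it returns 1. When it returns 0, leaving the option unset lets pcre_exec() report PCRE_ERROR_BADUTF8 rather than risking the undefined behaviour described above.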