--- code/trunk/doc/html/pcretest.html 2010/11/16 17:51:37 571
+++ code/trunk/doc/html/pcretest.html 2010/11/17 17:55:57 572
@@ -385,7 +385,8 @@
   \t         tab (\x09)
   \v         vertical tab (\x0b)
   \nnn       octal character (up to 3 octal digits)
-  \xhh       hexadecimal character (up to 2 hex digits)
+               always a byte unless > 255 in UTF-8 mode
+  \xhh       hexadecimal byte (up to 2 hex digits)
   \x{hh...}  hexadecimal character, any number of digits in UTF-8 mode
   \A         pass the PCRE_ANCHORED option to pcre_exec() or pcre_dfa_exec()
   \B         pass the PCRE_NOTBOL option to pcre_exec() or pcre_dfa_exec()
@@ -423,6 +424,14 @@
   \<anycrlf> pass the PCRE_NEWLINE_ANYCRLF option to pcre_exec() or pcre_dfa_exec()
   \<any>     pass the PCRE_NEWLINE_ANY option to pcre_exec() or pcre_dfa_exec()
+Note that \xhh always specifies one byte, even in UTF-8 mode; this makes it
+possible to construct invalid UTF-8 sequences for testing purposes. On the
+other hand, \x{hh} is interpreted as a UTF-8 character in UTF-8 mode,
+generating more than one byte if the value is greater than 127. When not in
+UTF-8 mode, it generates one byte for values less than 256, and causes an error
+for greater values.
+

+

The escapes that specify line ending sequences are literal strings, exactly as shown. No more than one newline setting should be present in any data line.

@@ -747,7 +756,7 @@


REVISION

-Last updated: 06 November 2010
+Last updated: 07 November 2010
Copyright © 1997-2010 University of Cambridge.
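
The note added in the second hunk distinguishes \xhh (always one byte) from \x{hh...} (a UTF-8 character in UTF-8 mode, otherwise one byte or an error). The sketch below models that rule in Python for illustration only; it is not pcretest's code. The function name `data_escape_bytes` and its parameters are hypothetical, and it relies on Python's own UTF-8 encoder, which rejects surrogates and values above 0x10FFFF, so pcretest's exact limits may differ.

```python
def data_escape_bytes(value: int, braced: bool, utf8_mode: bool) -> bytes:
    """Model the bytes a data-line escape contributes (illustrative only).

    braced=False models \\xhh, braced=True models \\x{hh...}.
    """
    if not braced:
        # \xhh takes at most two hex digits, so it is always exactly one
        # byte, even in UTF-8 mode.
        if not 0 <= value <= 0xFF:
            raise ValueError("\\xhh takes at most two hex digits")
        return bytes([value])

    if utf8_mode:
        # \x{hh...} is a character in UTF-8 mode: values above 127 are
        # encoded as more than one byte.
        return chr(value).encode("utf-8")

    # Outside UTF-8 mode, \x{hh...} is one byte below 256, otherwise an error.
    if value < 256:
        return bytes([value])
    raise ValueError("value too large for \\x{...} outside UTF-8 mode")


if __name__ == "__main__":
    # \xe9 versus \x{e9} in UTF-8 mode
    print(data_escape_bytes(0xE9, braced=False, utf8_mode=True).hex())  # e9
    print(data_escape_bytes(0xE9, braced=True, utf8_mode=True).hex())   # c3a9
```

Under this model the single byte produced by \xe9 is not a complete UTF-8 sequence by itself, which is the kind of invalid input the note says \xhh makes possible to construct for testing.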