--- code/trunk/doc/html/pcreapi.html 2012/06/02 05:56:58 974
+++ code/trunk/doc/html/pcreapi.html 2012/06/02 11:03:06 975
@@ -317,7 +317,7 @@
 strings: a single CR (carriage return) character, a single LF (linefeed)
 character, the two-character sequence CRLF, any of the three preceding, or any
 Unicode newline sequence. The Unicode newline sequences are the three just
-mentioned, plus the single characters VT (vertical tab, U+000B), FF (formfeed,
+mentioned, plus the single characters VT (vertical tab, U+000B), FF (form feed,
 U+000C), NEL (next line, U+0085), LS (line separator, U+2028), and PS
 (paragraph separator, U+2029).

@@ -641,8 +641,8 @@
   PCRE_EXTENDED
 
-If this bit is set, whitespace data characters in the pattern are totally
-ignored except when escaped or inside a character class. Whitespace does not
+If this bit is set, white space data characters in the pattern are totally
+ignored except when escaped or inside a character class. White space does not
 include the VT character (code 11). In addition, characters between an
 unescaped # outside a character class and the next newline, inclusive, are also
 ignored. This is equivalent to Perl's /x option, and it can be changed within a
@@ -659,7 +659,7 @@

 This option makes it possible to include comments inside complicated patterns.
-Note, however, that this applies only to data characters. Whitespace characters
+Note, however, that this applies only to data characters. White space characters
 may never appear within special character sequences in a pattern, for example
 within the sequence (?( that introduces a conditional subpattern.

@@ -745,7 +745,7 @@
 preceding sequences should be recognized. Setting PCRE_NEWLINE_ANY specifies
 that any Unicode newline sequence should be recognized. The Unicode newline
 sequences are the three just mentioned, plus the single characters VT (vertical
-tab, U+000B), FF (formfeed, U+000C), NEL (next line, U+0085), LS (line
+tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
 separator, U+2028), and PS (paragraph separator, U+2029). For the 8-bit
 library, the last two are recognized only in UTF-8 mode.
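
The newline set described in this hunk can be sketched as a small classifier. This is only an illustration of the listed code points, not PCRE's internal implementation; the function names `is_unicode_newline_char` and `unicode_newline_length` are hypothetical and do not exist in the PCRE API. The sketch works on decoded code points, so it does not model the 8-bit library's restriction that LS and PS are recognized only in UTF-8 mode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the PCRE API: returns true if cp is one
   of the single characters that can begin a Unicode newline sequence. */
bool is_unicode_newline_char(uint32_t cp)
{
    switch (cp) {
    case 0x000A: /* LF, linefeed */
    case 0x000B: /* VT, vertical tab */
    case 0x000C: /* FF, form feed */
    case 0x000D: /* CR, carriage return */
    case 0x0085: /* NEL, next line */
    case 0x2028: /* LS, line separator */
    case 0x2029: /* PS, paragraph separator */
        return true;
    default:
        return false;
    }
}

/* Hypothetical helper: length, in code points, of the newline sequence that
   starts at s[0], or 0 if none starts there. CRLF counts as one sequence. */
size_t unicode_newline_length(const uint32_t *s, size_t len)
{
    if (len == 0 || !is_unicode_newline_char(s[0]))
        return 0;
    if (s[0] == 0x000D && len > 1 && s[1] == 0x000A)
        return 2; /* the two-character sequence CRLF */
    return 1;
}
```

Treating CRLF as a single two-character sequence mirrors the "any Unicode newline sequence" wording above; a lone CR or a lone LF still counts as a sequence of length one.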
 

@@ -759,7 +759,7 @@

 The only time that a line break in a pattern is specially recognized when
-compiling is when PCRE_EXTENDED is set. CR and LF are whitespace characters,
+compiling is when PCRE_EXTENDED is set. CR and LF are white space characters,
 and so are ignored in this mode. Also, an unescaped # outside a character class
 indicates a comment that lasts until after the next line break sequence. In
 other circumstances, line break sequences in patterns are treated as literal
@@ -916,6 +916,7 @@
   72  too many forward references
   73  disallowed Unicode code point (>= 0xd800 && <= 0xdfff)
   74  invalid UTF-16 string (specifically UTF-16)
+  75  name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)

 The numbers 32 and 10000 in errors 48 and 49 are defaults; different values may
 be used if the limits were changed when PCRE was built.
@@ -950,7 +951,7 @@

 The second argument of pcre_study() contains option bits. There are three
-options:
+options:

   PCRE_STUDY_JIT_COMPILE
   PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE
@@ -1231,7 +1232,7 @@
 
 Return the number of characters (NB not bytes) in the longest lookbehind
 assertion in the pattern. Note that the simple assertions \b and \B require a
-one-character lookbehind. This information is useful when doing multi-segment
+one-character lookbehind. This information is useful when doing multi-segment
 matching using the partial matching facilities.
   PCRE_INFO_MINLENGTH
@@ -1506,7 +1507,7 @@
 Limiting the recursion depth limits the amount of machine stack that can be
 used, or, when PCRE has been compiled to use memory on the heap instead of the
 stack, the amount of heap memory that can be used. This limit is not relevant,
-and is ignored, when matching is done using JIT compiled code. 
+and is ignored, when matching is done using JIT compiled code.
 

 The default value for match_limit_recursion can be set when PCRE is
@@ -1689,7 +1690,7 @@
 "no match", the callouts do occur, and that items such as (*COMMIT) and (*MARK)
 are considered at every possible starting position in the subject string. If
 PCRE_NO_START_OPTIMIZE is set at compile time, it cannot be unset at matching
-time. The use of PCRE_NO_START_OPTIMIZE disables JIT execution; when it is set,
+time. The use of PCRE_NO_START_OPTIMIZE disables JIT execution; when it is set,
 matching is always done using interpretively.

@@ -2084,12 +2085,12 @@
 pcrejit documentation for more details.

-  PCRE_ERROR_BADMODE (-28)
+  PCRE_ERROR_BADMODE        (-28)
 
 This error is given if a pattern that was compiled by the 8-bit library is
 passed to a 16-bit library function, or vice versa.
-  PCRE_ERROR_BADENDIANNESS (-29)
+  PCRE_ERROR_BADENDIANNESS  (-29)
 
 This error is given if a pattern that was compiled and saved is reloaded on a
 host with different endianness. The utility function
@@ -2097,7 +2098,7 @@
 so that it runs on the new host.

-Error numbers -16 to -20 and -22 are not used by pcre_exec().
+Error numbers -16 to -20, -22, and -30 are not used by pcre_exec().


 Reason codes for invalid UTF-8 strings
@@ -2592,6 +2593,13 @@
 recursively, using private vectors for ovector and workspace. This error
 is given if the output vector is not large enough. This should be extremely
 rare, as a vector of size 1000 is used.
+
+  PCRE_ERROR_DFA_BADRESTART (-30)
+
+When pcre_dfa_exec() is called with the PCRE_DFA_RESTART option,
+some plausibility checks are made on the contents of the workspace, which
+should contain data about the previous partial match. If any of these checks
+fail, this error is given.
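
For readers who report these codes to users, the three error numbers touched by this change can be sketched as a lookup. The numeric values are the ones listed in the diff; the function `describe_pcre_error` is a hypothetical helper for illustration and is not part of PCRE. The macros are guarded on the assumption that the block may also be compiled next to a `pcre.h` that already defines them.

```c
#include <stddef.h>

/* Values as listed in the text above; guarded so that an existing pcre.h
   definition, if present, takes precedence. */
#ifndef PCRE_ERROR_BADMODE
#define PCRE_ERROR_BADMODE        (-28)
#endif
#ifndef PCRE_ERROR_BADENDIANNESS
#define PCRE_ERROR_BADENDIANNESS  (-29)
#endif
#ifndef PCRE_ERROR_DFA_BADRESTART
#define PCRE_ERROR_DFA_BADRESTART (-30)
#endif

/* Hypothetical helper, not part of PCRE: maps the three codes above to a
   short description and returns NULL for any other value. */
const char *describe_pcre_error(int rc)
{
    switch (rc) {
    case PCRE_ERROR_BADMODE:
        return "pattern compiled by the 8-bit library passed to a 16-bit "
               "function, or vice versa";
    case PCRE_ERROR_BADENDIANNESS:
        return "saved pattern reloaded on a host with different endianness";
    case PCRE_ERROR_DFA_BADRESTART:
        return "workspace failed the PCRE_DFA_RESTART plausibility checks";
    default:
        return NULL; /* not one of the codes covered by this sketch */
    }
}
```

A caller would typically pass the negative return value of a matching function to such a helper and fall back to printing the raw number when it returns NULL.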


SEE ALSO

@@ -2610,7 +2618,7 @@


REVISION

-Last updated: 14 April 2012
+Last updated: 04 May 2012
Copyright © 1997-2012 University of Cambridge.