ChangeLog for PCRE
------------------

Version 4.5 01-Dec-03
---------------------

 1. There has been some re-arrangement of the code for the match() function so
    that it can be compiled in a version that does not call itself recursively.
    Instead, it keeps those local variables that need separate instances for
    each "recursion" in a frame on the heap, and gets/frees frames whenever it
    needs to "recurse". Keeping track of where control must go is done by means
    of setjmp/longjmp. The whole thing is implemented by a set of macros that
    hide most of the details from the main code, and operates only if
    NO_RECURSE is defined while compiling pcre.c. If PCRE is built using the
    "configure" mechanism, "--disable-stack-for-recursion" turns on this way of
    operating.

    To make it easier for callers to provide specially tailored get/free
    functions for this usage, two new functions, pcre_stack_malloc and
    pcre_stack_free, are used. They are always called in strict stacking order,
    and the size of block requested is always the same.

    The PCRE_CONFIG_STACKRECURSE info parameter can be used to find out whether
    PCRE has been compiled to use the stack or the heap for recursion. The
    -C option of pcretest uses this to show which version is compiled.

    A new data escape, \S, has been added to pcretest; it causes the amounts of
    store obtained and freed by both kinds of malloc/free at match time to be
    added to the output.

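Because the get/free calls arrive in strict stacking order and always ask for
the same block size, a caller-supplied allocator can be very simple. The
following is only a sketch of that idea, not code from PCRE: the names
frame_get, frame_free, frame_drain and frame_heap_calls are hypothetical, and
the sketch assumes that every request really does have the same size.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; none of this is part of PCRE. */
typedef struct frame_node {
    struct frame_node *next;
} frame_node;

static frame_node *cached_frames = NULL; /* released frames kept for reuse */
static unsigned long heap_calls = 0;     /* number of real malloc() calls  */

/* Get a frame. Assumes every call asks for the same size. */
void *frame_get(size_t size)
{
    if (cached_frames != NULL) {
        frame_node *node = cached_frames;
        cached_frames = node->next;
        return node;
    }
    if (size < sizeof(frame_node)) size = sizeof(frame_node);
    heap_calls++;
    return malloc(size);
}

/* Release a frame by pushing it on the cache instead of calling free(). */
void frame_free(void *block)
{
    frame_node *node = block;
    if (node == NULL) return;
    node->next = cached_frames;
    cached_frames = node;
}

/* Hand every cached frame back to the heap, e.g. when matching is finished. */
void frame_drain(void)
{
    while (cached_frames != NULL) {
        frame_node *node = cached_frames;
        cached_frames = node->next;
        free(node);
    }
}

/* Report how many frames were actually obtained from malloc(). */
unsigned long frame_heap_calls(void)
{
    return heap_calls;
}
```

In a NO_RECURSE build, a caller could presumably point pcre_stack_malloc and
pcre_stack_free at functions of this shape before matching, so that repeated
"recursions" reuse frames rather than going back to the heap each time.
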
 2. Changed the locale test to use "fr_FR" instead of "fr" because that's
    what's available on my current Linux desktop machine.

 3. When matching a UTF-8 string, the test for a valid string at the start has
    been extended. If start_offset is not zero, PCRE now checks that it points
    to a byte that is the start of a UTF-8 character. If not, it returns
    PCRE_ERROR_BADUTF8_OFFSET (-11). Note: the whole string is still checked;
    this is necessary because there may be backward assertions in the pattern.
    When matching the same subject several times, it may save resources to use
    PCRE_NO_UTF8_CHECK on all but the first call if the string is long.

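The start-of-character test can be illustrated with a short sketch. This is
not PCRE's own code; the function name starts_utf8_char is hypothetical. It
relies on the fact that UTF-8 continuation bytes always have the bit pattern
10xxxxxx, so any other byte may begin a character.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if 'offset' is a plausible place to start
   matching in a UTF-8 subject of 'length' bytes, and 0 otherwise. An offset
   equal to the length (the very end of the subject) is accepted. */
int starts_utf8_char(const unsigned char *subject, size_t length, size_t offset)
{
    if (offset > length) return 0;   /* beyond the end of the subject */
    if (offset == length) return 1;  /* at the end: nothing to inspect */
    /* Continuation bytes look like 10xxxxxx and cannot start a character. */
    return (subject[offset] & 0xC0) != 0x80;
}
```

A caller that passes a non-zero start offset could run a check of this kind
itself before matching, to avoid the PCRE_ERROR_BADUTF8_OFFSET return.
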
 4. The code for checking the validity of UTF-8 strings has been tightened so
    that it rejects (a) strings containing 0xfe or 0xff bytes and (b) strings
    containing "overlong sequences".

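For readers unfamiliar with the two cases, here is a sketch of a checker that
rejects them. It is an illustration, not the code used in PCRE, and the name
utf8_is_acceptable is hypothetical. It assumes the older UTF-8 definition that
allows sequences of up to 6 bytes, and it does not check for surrogates or
other out-of-range code points.

```c
#include <stddef.h>

/* Hypothetical checker: returns 1 if the buffer is structurally valid UTF-8
   with no 0xfe/0xff bytes and no overlong sequences, and 0 otherwise. */
int utf8_is_acceptable(const unsigned char *s, size_t length)
{
    /* Smallest code point that genuinely needs n bytes, indexed by n. */
    static const unsigned long min_value[7] =
        { 0, 0, 0x80, 0x800, 0x10000, 0x200000, 0x4000000 };
    size_t i = 0;

    while (i < length) {
        unsigned char c = s[i];
        unsigned long cp;
        int n, k;

        if (c < 0x80) { i++; continue; }      /* plain ASCII byte */
        if (c >= 0xFE) return 0;              /* 0xfe and 0xff are never valid */
        if (c < 0xC0) return 0;               /* stray continuation byte */

        /* The lead byte gives the sequence length and the top payload bits. */
        if (c < 0xE0)      { n = 2; cp = c & 0x1F; }
        else if (c < 0xF0) { n = 3; cp = c & 0x0F; }
        else if (c < 0xF8) { n = 4; cp = c & 0x07; }
        else if (c < 0xFC) { n = 5; cp = c & 0x03; }
        else               { n = 6; cp = c & 0x01; }

        if ((size_t)n > length - i) return 0; /* sequence is truncated */
        for (k = 1; k < n; k++) {
            unsigned char t = s[i + k];
            if ((t & 0xC0) != 0x80) return 0; /* not a continuation byte */
            cp = (cp << 6) | (t & 0x3F);
        }
        /* Overlong: the value would have fitted in a shorter sequence. */
        if (cp < min_value[n]) return 0;
        i += (size_t)n;
    }
    return 1;
}
```

The classic overlong example is the byte pair 0xc0 0xaf, which decodes to "/"
(0x2f) even though that character fits in a single byte; the sketch rejects
it because 0x2f is below the two-byte minimum of 0x80.
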
 5. Fixed a bug (appearing twice) that I could not find any way of exploiting!
    I had written "if ((digitab[*p++] && chtab_digit) == 0)" where the "&&"
    should have been "&", but it just so happened that all the cases this let
    through by mistake were picked up later in the function.

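The difference between the two operators is easy to see in isolation. The
sketch below uses hypothetical function names and flag values rather than the
real table: with "&&" the test only asks whether both operands are non-zero,
so an entry with any flag set looks as if it has the wanted flag.

```c
/* Hypothetical illustration; 'entry' stands for a table entry and 'flag' for
   the single bit being tested. */

/* Buggy form: logical AND is 1 whenever both operands are non-zero. */
int lacks_flag_buggy(unsigned char entry, unsigned char flag)
{
    return (entry && flag) == 0;
}

/* Intended form: bitwise AND isolates the one bit of interest. */
int lacks_flag_fixed(unsigned char entry, unsigned char flag)
{
    return (entry & flag) == 0;
}
```

For an entry of 0x08 and a flag of 0x04, the fixed form reports that the flag
is missing, whereas the buggy form does not, because both values are non-zero.
The two forms agree only when the entry is zero or actually has the flag set.
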
 6. I had used a variable called "isblank", which is also the name of a C99
    function, causing some compilers to warn. To avoid this, I renamed the
    variable "blankclass".

 7. Cosmetic: (a) only output another newline at the end of pcretest if it is
    prompting; (b) run "./pcretest /dev/null" at the start of the test script
    so the version is shown; (c) stop "make test" echoing "./RunTest".

 8. Added patches from David Burgess to enable PCRE to run on EBCDIC systems.

 9. The prototype for memmove() for systems that don't have it was using
    size_t, but the header that defines size_t was included later. I've
    moved the #includes for the C headers earlier to avoid this.

10. Added some adjustments to the code to make it easier to compile on certain
    special systems:

    (a) Some "const" qualifiers were missing.
    (b) Added the macro EXPORT before all exported functions; by default this
        is defined to be empty.
    (c) Changed the dftables auxiliary program (that builds chartables.c) so
        that it reads its output file name as an argument instead of writing
        to the standard output and assuming this can be redirected.

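As an illustration of item (b), the pattern looks roughly like the sketch
below. The function example_version_major is hypothetical and only stands in
for a real exported function; what a particular system would define EXPORT to
be is left to that system's build.

```c
/* Default: EXPORT expands to nothing, so the definition is unchanged. A build
   for a special system can define EXPORT before this point instead. */
#ifndef EXPORT
#define EXPORT
#endif

/* Hypothetical exported function, standing in for a real library entry. */
EXPORT int example_version_major(void)
{
    return 4;
}
```

With the empty default, the compiler sees an ordinary function definition, so
existing builds are unaffected.
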
11. In UTF-8 mode, if a recursive reference (e.g. (?1)) followed a character
    class containing characters with values greater than 255, PCRE compilation
    went into a loop.

12. A recursive reference to a subpattern that was within another subpattern
    that had a minimum quantifier of zero caused PCRE to crash. For example,
    (x(y(?2))z)? provoked this bug with a subject that got as far as the
    recursion. If the recursively-called subpattern itself had a zero repeat,
    that was OK.

13. In pcretest, the buffer for reading a data line was set at 30K, but the
    buffer into which it was copied (for escape processing) was still set at
    1024, so long lines caused crashes.

14. A pattern such as /[ab]{1,3}+/ failed to compile, giving the error
    "internal error: code overflow...". This applied to any character class
    that was followed by a possessive quantifier.

15. Modified the Makefile to add libpcre.la as a prerequisite for
    libpcreposix.la because I was told this is needed for a parallel build to
    work.

16. If a pattern that contained .* following optional items at the start was
    studied, the wrong optimizing data was generated, leading to matching
    errors. For example, studying /[ab]*.*c/ concluded, erroneously, that any
    matching string must start with a or b or c. The correct conclusion for
    this pattern is that a match can start with any character.


Version 4.4 13-Aug-03
---------------------
