ChangeLog for PCRE
------------------

Version 4.1 12-Mar-03
---------------------

1. Compiling with gcc -pedantic found a couple of places where casts were
needed, and a string in dftables.c that was longer than standard compilers are
required to support.

2. Compiling with Sun's compiler found a few more places where the code could
be tidied up in order to avoid warnings.

3. The variables for cross-compiling were called HOST_CC and HOST_CFLAGS; the
first of these names is deprecated in the latest Autoconf in favour of the name
CC_FOR_BUILD, because "host" is typically used to mean the system on which the
compiled code will be run. I can't find a reference for HOST_CFLAGS, but by
analogy I have changed it to CFLAGS_FOR_BUILD.

4. Added -no-undefined to the linking command in the Makefile, because this is
apparently helpful for Windows. To make it work, also added "-L. -lpcre" to the
linking step for the pcreposix library.

5. PCRE was failing to diagnose the case of two named groups with the same
name.

6. A problem with one of PCRE's optimizations was discovered. PCRE remembers a
literal character that is needed in the subject for a match, and scans along to
ensure that it is present before embarking on the full matching process. This
saves time in cases of nested unlimited repeats that are never going to match.
Problem: the scan can take a lot of time if the subject is very long (e.g.
megabytes), thus penalizing straightforward matches. It is now done only if the
amount of subject to be scanned is less than 1000 bytes.

7. A lesser problem with the same optimization is that it was recording the
first character of an anchored pattern as "needed", thus provoking a search
right along the subject, even when the first match of the pattern was going to
fail. The "needed" character is now not set for anchored patterns, unless it
follows something in the pattern that is of non-fixed length. Thus, it still
fulfils its original purpose of finding quick non-matches in cases of nested
unlimited repeats, but isn't used for simple anchored patterns such as /^abc/.
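
The behaviour described in items 6 and 7 can be pictured with a short C
sketch. This is an illustration only and not the code used inside PCRE: the
names REQ_CHAR_SCAN_LIMIT, should_record_req_char and worth_full_match are
hypothetical, the limit is simply the 1000-byte figure quoted in item 6, and
the whole subject length stands in for "the amount of subject to be scanned".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; the value is the 1000-byte figure from item 6. */
#define REQ_CHAR_SCAN_LIMIT 1000

/* Item 7: decide at compile time whether to record a "needed" character.
   An anchored pattern records one only if the character follows something
   of non-fixed length. Both arguments are treated as booleans. */
static int should_record_req_char(int anchored, int follows_variable_length)
{
    return !anchored || follows_variable_length;
}

/* Item 6: cheap check made before the full matching process.
   req_char is the recorded character, or -1 if none was recorded.
   Returns 0 when a match can be ruled out, 1 when the matcher must run. */
static int worth_full_match(const char *subject, size_t length, int req_char)
{
    if (req_char < 0)
        return 1;                       /* nothing to look for */
    if (length >= REQ_CHAR_SCAN_LIMIT)
        return 1;                       /* long subject: skip the scan */
    return memchr(subject, req_char, length) != NULL;
}
```

Under these assumptions, worth_full_match("abc", 3, 'z') returns 0, so the
full matcher is skipped, whereas a 2000-byte subject always goes straight to
the full matcher whatever it contains.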


Version 4.0 17-Feb-03
---------------------

1. If a comment in an extended regex that started immediately after a meta-item
extended to the end of string, PCRE compiled incorrect data. This could lead to