ChangeLog for PCRE
------------------

Version 4.2 14-Apr-03
---------------------

1. Typo "#if SUPPORT_UTF8" instead of "#ifdef SUPPORT_UTF8" fixed.

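A minimal sketch of why the directive matters, assuming the macro is defined
without a value (the entry does not say how SUPPORT_UTF8 is defined in the
build). With an empty definition, "#if SUPPORT_UTF8" leaves "#if" with no
expression to evaluate, whereas "#ifdef" only asks whether the name is
defined. The function name utf8_supported is made up for illustration.

```c
/* Assumption for illustration: the feature macro is defined with no value. */
#define SUPPORT_UTF8

/* Returns 1 when UTF-8 support is compiled in, 0 otherwise.
   utf8_supported is a hypothetical name, not part of PCRE. */
int utf8_supported(void)
{
#ifdef SUPPORT_UTF8   /* tests only whether the macro is defined */
  return 1;
#else
  return 0;
#endif
}
```
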
2. Changes to the building process, supplied by Ronald Landheer-Cieslak
   [ON_WINDOWS]: new variable, "#" on non-Windows platforms
   [NOT_ON_WINDOWS]: new variable, "#" on Windows platforms
   [WIN_PREFIX]: new variable, "cyg" for Cygwin
   * Makefile.in: use autoconf substitution for OBJEXT, EXEEXT, BUILD_OBJEXT
   and BUILD_EXEEXT
   Note: automatic setting of the BUILD variables is not yet working
   set CPPFLAGS and BUILD_CPPFLAGS (but don't use yet) - should be used at
   compile-time but not at link-time
   [LINK]: use for linking executables only
   make different versions for Windows and non-Windows
   [LINKLIB]: new variable, copy of UNIX-style LINK, used for linking
   libraries
   [LINK_FOR_BUILD]: new variable
   [OBJEXT]: use throughout
   [EXEEXT]: use throughout
   <winshared>: new target
   <wininstall>: new target
   <dftables.o>: use native compiler
   <dftables>: use native linker
   <install>: handle Windows platform correctly
   <clean>: ditto
   <check>: ditto
   copy DLL to top builddir before testing

   As part of these changes, -no-undefined was removed again. This was reported
   to give trouble on HP-UX 11.0, so getting rid of it seems like a good idea
   in any case.

3. Some tidies to get rid of compiler warnings:

   . In the match_data structure, match_limit was an unsigned long int, whereas
     match_call_count was an int. I've made them both unsigned long ints.

   . In pcretest the fact that a const uschar * doesn't automatically cast to
     a void * provoked a warning.

   . Turning on some more compiler warnings threw up some "shadow" variables
     and a few more missing casts.

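The first two tidies can be sketched as follows. This is an illustration
only, not the PCRE source: match_data_sketch is a cut-down stand-in for the
real match_data structure, and limit_reached, make_table and free_table are
hypothetical helpers. Giving both counters the same unsigned type avoids a
signed/unsigned comparison, and the explicit (void *) cast states that
dropping the const qualifier is intentional.

```c
#include <stdlib.h>
#include <string.h>

typedef unsigned char uschar;   /* shorthand for unsigned char */

/* Hypothetical cut-down stand-in for the match_data structure. */
typedef struct {
  unsigned long int match_limit;
  unsigned long int match_call_count;  /* same type as the limit */
} match_data_sketch;

/* Counts one call; returns 1 once the count had already reached the limit. */
int limit_reached(match_data_sketch *md)
{
  return md->match_call_count++ >= md->match_limit;
}

/* Hypothetical helper: returns a heap-allocated, read-only table. */
const uschar *make_table(void)
{
  uschar *t = malloc(4);
  if (t != NULL) memcpy(t, "abc", 4);
  return t;
}

/* free() takes a plain void *, so the const pointer is cast explicitly. */
void free_table(const uschar *table)
{
  free((void *)table);
}
```
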
4. If PCRE was compiled with UTF-8 support, but called without the PCRE_UTF8
option, a class that contained a single character with a value between 128
and 255 (e.g. /[\xFF]/) caused PCRE to crash.

5. If PCRE was compiled with UTF-8 support, but called without the PCRE_UTF8
option, a class that contained several characters, but with at least one
whose value was between 128 and 255 caused PCRE to crash.


Version 4.1 12-Mar-03
---------------------

1. Compiling with gcc -pedantic found a couple of places where casts were
needed, and a string in dftables.c that was longer than standard compilers are
required to support.

2. Compiling with Sun's compiler found a few more places where the code could
be tidied up in order to avoid warnings.

3. The variables for cross-compiling were called HOST_CC and HOST_CFLAGS; the
first of these names is deprecated in the latest Autoconf in favour of the name
CC_FOR_BUILD, because "host" is typically used to mean the system on which the
compiled code will be run. I can't find a reference for HOST_CFLAGS, but by
analogy I have changed it to CFLAGS_FOR_BUILD.

4. Added -no-undefined to the linking command in the Makefile, because this is
apparently helpful for Windows. To make it work, also added "-L. -lpcre" to the
linking step for the pcreposix library.

5. PCRE was failing to diagnose the case of two named groups with the same
name.

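The kind of check involved can be sketched as below. This is not PCRE's
actual code, and first_duplicate_name is a hypothetical helper: it simply
looks for a group name that repeats an earlier one, which is the situation
that arises in a pattern such as /(?P<n>a)(?P<n>b)/.

```c
#include <string.h>

/* Returns the index of the first name that repeats an earlier one,
   or -1 if all names are distinct. Hypothetical helper for illustration. */
int first_duplicate_name(const char *const names[], int count)
{
  int i, j;
  for (i = 1; i < count; i++)
    for (j = 0; j < i; j++)
      if (strcmp(names[i], names[j]) == 0)
        return i;
  return -1;
}
```
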
6. A problem with one of PCRE's optimizations was discovered. PCRE remembers a
literal character that is needed in the subject for a match, and scans along to
ensure that it is present before embarking on the full matching process. This
saves time in cases of nested unlimited repeats that are never going to match.
Problem: the scan can take a lot of time if the subject is very long (e.g.
megabytes), thus penalizing straightforward matches. It is now done only if the
amount of subject to be scanned is less than 1000 bytes.

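A rough sketch of the bounded pre-scan described in item 6, under stated
assumptions: the names REQ_SCAN_MAX and worth_attempting are made up, the
"needed" character is passed as an int with -1 meaning "none recorded", and
the real code in PCRE may differ in detail (for example in how it handles
caseless matching). Only the 1000-byte threshold comes from the entry itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; the 1000-byte threshold is the one given in item 6. */
#define REQ_SCAN_MAX 1000

/* Returns 0 if a match can be ruled out cheaply because the needed
   character is absent, 1 if the full matching process should be tried. */
int worth_attempting(const unsigned char *start, const unsigned char *end,
                     int req_char)
{
  size_t remaining = (size_t)(end - start);

  /* No needed character recorded, or too much subject to scan cheaply. */
  if (req_char < 0 || remaining >= REQ_SCAN_MAX)
    return 1;

  return memchr(start, req_char, remaining) != NULL;
}
```

For a short subject the scan gives a quick non-match; for a subject of 1000
bytes or more the scan is skipped and the matcher runs directly.
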
7. A lesser problem with the same optimization is that it was recording the
first character of an anchored pattern as "needed", thus provoking a search
right along the subject, even when the first match of the pattern was going to
fail. The "needed" character is now not set for anchored patterns, unless it
follows something in the pattern that is of non-fixed length. Thus, it still
fulfils its original purpose of finding quick non-matches in cases of nested
unlimited repeats, but isn't used for simple anchored patterns such as /^abc/.


Version 4.0 17-Feb-03
---------------------

1. If a comment in an extended regex that started immediately after a meta-item
extended to the end of string, PCRE compiled incorrect data. This could lead to