--- code/trunk/ChangeLog 2007/07/19 10:38:20 190 +++ code/trunk/ChangeLog 2007/08/03 09:44:26 202 @@ -9,13 +9,13 @@ brackets there is not right, since it causes compilers to look for an installed pcre.h, not the version that is in the source that is being compiled (which of course may be different). I have changed it back to: - + #include "pcre.h" - - I have a vague recollection that the change was concerned with compiling in - different directories, but in the new build system, that is taken care of - by the VPATH setting the Makefile. - + + I have a vague recollection that the change was concerned with compiling in + different directories, but in the new build system, that is taken care of + by the VPATH setting the Makefile. + 2. The pattern .*$ when run in not-DOTALL UTF-8 mode with newline=any failed when the subject happened to end in the byte 0x85 (e.g. if the last character was \x{1ec5}). *Character* 0x85 is one of the "any" newline @@ -23,9 +23,47 @@ of another character. The bug was that, for an unlimited repeat of . in not-DOTALL UTF-8 mode, PCRE was advancing by bytes rather than by characters when looking for a newline. - - 3. A small performance improvement in the DOTALL UTF-8 mode .* case. + 3. A small performance improvement in the DOTALL UTF-8 mode .* case. + + 4. Debugging: adjusted the names of opcodes for different kinds of parentheses + in debug output. + + 5. Arrange to use "%I64d" instead of "%lld" and "%I64u" instead of "%llu" for + long printing in the pcrecpp unittest when running under MinGW. + + 6. ESC_K was left out of the EBCDIC table. + + 7. Change 7.0/38 introduced a new limit on the number of nested non-capturing + parentheses; I made it 1000, which seemed large enough. Unfortunately, the + limit also applies to "virtual nesting" when a pattern is recursive, and in + this case 1000 isn't so big. I have been able to remove this limit at the + expense of backing off one optimization in certain circumstances. Normally, + when pcre_exec() would call its internal match() function recursively and + immediately return the result unconditionally, it uses a "tail recursion" + feature to save stack. However, when a subpattern that can match an empty + string has an unlimited repetition quantifier, it no longer makes this + optimization. That gives it a stack frame in which to save the data for + checking that an empty string has been matched. Previously this was taken + from the 1000-entry workspace that had been reserved. So now there is no + explicit limit, but more stack is used. + + 8. Applied Daniel's patches to solve problems with the import/export magic + syntax that is required for Windows, and which was going wrong for the + pcreposix and pcrecpp parts of the library. These were overlooked when this + problem was solved for the main library. + + 9. There were some crude static tests to avoid integer overflow when computing + the size of patterns that contain repeated groups with explicit upper + limits. As the maximum quantifier is 65535, the maximum group length was + set at 30,000 so that the product of these two numbers did not overflow a + 32-bit integer. However, it turns out that people want to use groups that + are longer than 30,000 bytes (though not repeat them that many times). + Change 7.0/17 (the refactoring of the way the pattern size is computed) has + made it possible to implement the integer overflow checks in a much more + dynamic way, which I have now done. The artificial limitation on group + length has been removed - we now have only the limit on the total length of + the compiled pattern, which depends on the LINK_SIZE setting. Version 7.2 19-Jun-07