Diff of /code/trunk/ChangeLog

revision 5 by nigel, Sat Feb 24 21:38:05 2007 UTC revision 41 by nigel, Sat Feb 24 21:39:17 2007 UTC
ChangeLog for PCRE
------------------


Version 2.09 14-Sep-99
----------------------

1. Add support for the /+ modifier to perltest (to output $` like it does in
pcretest).

2. Add support for the /g modifier to perltest.

3. Fix pcretest so that it behaves even more like Perl for /g when the pattern
matches null strings.

4. Fix perltest so that it doesn't do unwanted things when fed an empty
pattern. Perl treats empty patterns specially - it reuses the most recent
pattern, which is not what we want. Replace // by /(?#)/ in order to avoid this
effect.

5. The POSIX interface was broken in that it was just handing over the POSIX
captured string vector to pcre_exec(), but (since release 2.00) PCRE has
required a bigger vector, with some working space on the end. This means that
the POSIX wrapper now has to get and free some memory, and copy the results.


Version 2.08 31-Aug-99
----------------------

1. When startoffset was not zero and the pattern began with ".*", PCRE was not
trying to match at the startoffset position, but instead was moving forward to
the next newline as if a previous match had failed.

2. pcretest was not making use of PCRE_NOTEMPTY when repeating for /g and /G,
and could get into a loop if a null string was matched other than at the start
of the subject.

3. Added definitions of PCRE_MAJOR and PCRE_MINOR to pcre.h so the version can
be distinguished at compile time, and for completeness also added PCRE_DATE.

5. Added Paul Sokolovsky's minor changes to make it easy to compile a Win32 DLL
in GnuWin32 environments.
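
Item 2 above describes the hazard in a /g-style scan: a null-string match
leaves the offset where it was, so a naive loop never ends. The following is a
minimal sketch in Python, whose re module is not PCRE and has no counterpart
to PCRE_NOTEMPTY. It therefore uses a simpler remedy (step one character past
an empty match) rather than what pcretest does. The name global_spans is
hypothetical.

```python
import re

def global_spans(pattern, subject):
    """Collect (start, end) spans of successive matches, /g style."""
    regex = re.compile(pattern)
    spans = []
    pos = 0
    while pos <= len(subject):
        m = regex.search(subject, pos)
        if m is None:
            break
        spans.append(m.span())
        # An empty match has end == start; step past it so the loop
        # cannot spin on the same offset.
        pos = m.end() + 1 if m.end() == m.start() else m.end()
    return spans

print(global_spans(r"a*", "baac"))  # [(0, 0), (1, 3), (3, 3), (4, 4)]
```

Without the one-character step, the first empty match at offset 0 would be
returned on every pass.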


Version 2.07 29-Jul-99
----------------------

1. The documentation is now supplied in plain text form and HTML as well as in
the form of man page sources.

2. C++ compilers don't like assigning (void *) values to other pointer types.
In particular this affects malloc(). Although there is no problem in Standard
C, I've put in casts to keep C++ compilers happy.

3. Typo in pcretest.c; a cast of (unsigned char *) in the POSIX regexec() call
should be (const char *).

4. If NOPOSIX is defined, pcretest.c compiles without POSIX support. This may
be useful for non-Unix systems that don't want to bother with the POSIX stuff.
However, I haven't made this a standard facility. The documentation doesn't
mention it, and the Makefile doesn't support it.

5. The Makefile now contains an "install" target, with editable destinations at
the top of the file. The pcretest program is not installed.

6. pgrep -V now gives the PCRE version number and date.

7. Fixed bug: a zero repetition after a literal string (e.g. /abcde{0}/) was
causing the entire string to be ignored, instead of just the last character.

8. If a pattern like /"([^\\"]+|\\.)*"/ is applied in the normal way to a
non-matching string, it can take a very, very long time, even for strings of
quite modest length, because of the nested recursion. PCRE now does better in
some of these cases. It does this by remembering the last required literal
character in the pattern, and pre-searching the subject to ensure it is present
before running the real match. In other words, it applies a heuristic to detect
some types of certain failure quickly, and in the above example, if presented
with a string that has no trailing " it gives "no match" very quickly.

9. A new runtime option PCRE_NOTEMPTY causes null string matches to be ignored;
other alternatives are tried instead.
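
The pre-search described in item 8 above might look roughly like this. It is a
Python sketch of the idea only, not PCRE's code: the function name and the
hand-supplied first and required arguments are assumptions (PCRE works such
characters out itself when compiling), and checking only after the first
opening character is a simplification.

```python
import re

# Pattern quoted in item 8: a double-quoted string with backslash escapes.
QUOTED = re.compile(r'"([^\\"]+|\\.)*"')

def search_with_required_char(regex, subject, first, required):
    """Run regex.search only if `required` occurs after the first `first`."""
    start = subject.find(first)
    if start < 0:
        return None   # the match cannot even begin
    if subject.find(required, start + 1) < 0:
        return None   # certain failure: skip the expensive match
    return regex.search(subject)

# No trailing quote: rejected by the cheap scan, the regex never runs.
print(search_with_required_char(QUOTED, '"' + "a" * 200, '"', '"'))  # None
```

The scan is linear in the subject length, so the certain-failure case costs
far less than letting the nested repeat explore every way of splitting the
string.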


Version 2.06 09-Jun-99
----------------------

1. Change pcretest's output for amount of store used to show just the code
space, because the remainder (the data block) varies in size between 32-bit and
64-bit systems.

2. Added an extra argument to pcre_exec() to supply an offset in the subject to
start matching at. This allows lookbehinds to work when searching for multiple
occurrences in a string.

3. Added additional options to pcretest for testing multiple occurrences:

   /+   outputs the rest of the string that follows a match
   /g   loops for multiple occurrences, using the new startoffset argument
   /G   loops for multiple occurrences by passing an incremented pointer

4. PCRE wasn't doing the "first character" optimization for patterns starting
with \b or \B, though it was doing it for other lookbehind assertions. That is,
it wasn't noticing that a match for a pattern such as /\bxyz/ has to start with
the letter 'x'. On long subject strings, this gives a significant speed-up.
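
The difference between the /g and /G styles in items 2 and 3 above can be seen
with any engine that accepts a start offset. The sketch below uses Python's re
module as a stand-in (it is not PCRE): the pos argument of search() plays the
role of startoffset, and slicing the subject plays the role of passing an
incremented pointer.

```python
import re

regex = re.compile(r"(?<=a)b")
subject = "ab"

# Like /g: pass the whole subject plus a start offset. The lookbehind can
# still inspect the 'a' that sits before the offset.
with_offset = regex.search(subject, 1)

# Like /G: pass a shortened subject. The 'a' is gone, so the lookbehind fails.
sliced = regex.search(subject[1:])

print(with_offset is not None, sliced is not None)  # True False
```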


Version 2.05 21-Apr-99
----------------------

1. Changed the type of magic_number from int to long int so that it works
properly on 16-bit systems.

2. Fixed a bug which caused patterns starting with .* not to work correctly
when the subject string contained newline characters. PCRE was assuming
anchoring for such patterns in all cases, which is not correct because .* will
not pass a newline unless PCRE_DOTALL is set. It now assumes anchoring only if
DOTALL is set at top level; otherwise it knows that patterns starting with .*
must be retried after every newline in the subject.
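
The behaviour in item 2 above is easy to observe. The following uses Python's
re module purely as an illustration of the same Perl-style semantics, with
re.DOTALL standing in for PCRE_DOTALL.

```python
import re

subject = "abc\nxyz"

# Without DOTALL, ".*" stops at the newline, so the match that succeeds
# starts just after it rather than at the start of the subject.
plain = re.search(r".*xyz", subject)

# With DOTALL, ".*" can cross the newline and the match starts at offset 0.
dotall = re.search(r".*xyz", subject, re.DOTALL)

print(plain.span(), dotall.span())  # (4, 7) (0, 7)
```

An engine that treated the first pattern as anchored would try offset 0 only
and report no match.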


Version 2.04 18-Feb-99
----------------------

1. For parenthesized subpatterns with repeats whose minimum was zero, the
computation of the store needed to hold the pattern was incorrect (too large).
If such patterns were nested a few deep, this could multiply and become a real
problem.

2. Added /M option to pcretest to show the memory requirement of a specific
pattern. Made -m a synonym of -s (which does this globally) for compatibility.

3. Subpatterns of the form (regex){n,m} (i.e. limited maximum) were being
compiled in such a way that the backtracking after subsequent failure was
pessimal. Something like (a){0,3} was compiled as (a)?(a)?(a)? instead of
((a)((a)(a)?)?)? with disastrous performance if the maximum was of any size.
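
The two expansions named in item 3 above can be generated mechanically. This
Python sketch builds both as pattern strings; the helper names are
hypothetical, and non-capturing groups are used (an assumption made here for
simplicity) where the ChangeLog example shows capturing ones.

```python
import re

def flat_optional(atom, n):
    """(a)?(a)?(a)? style: n independent optional copies."""
    return f"(?:{atom})?" * n

def nested_optional(atom, n):
    """((a)((a)(a)?)?)? style: a copy is tried only if the previous one matched."""
    out = ""
    for _ in range(n):
        out = f"(?:{atom}{out})?"
    return out

print(flat_optional("a", 3))    # (?:a)?(?:a)?(?:a)?
print(nested_optional("a", 3))  # (?:a(?:a(?:a)?)?)?
```

Both forms accept the same strings. The nested form gives up on all inner
copies as soon as one copy fails, whereas the flat form leaves each optional
copy as a separate backtracking point.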


Version 2.03 02-Feb-99
----------------------

1. Fixed typo and small mistake in man page.

2. Added 4th condition (GPL supersedes if conflict) and created separate
LICENCE file containing the conditions.

3. Updated pcretest so that patterns such as /abc\/def/ work like they do in
Perl, that is the internal \ allows the delimiter to be included in the
pattern. Locked out the use of \ as a delimiter. If \ immediately follows
the final delimiter, add \ to the end of the pattern (to test the error).

4. Added the convenience functions for extracting substrings after a successful
match. Updated pcretest to make it able to test these functions.


Version 2.02 14-Jan-99
----------------------

1. Initialized the working variables associated with each extraction so that
their saving and restoring doesn't refer to uninitialized store.

2. Put dummy code into study.c in order to trick the optimizer of the IBM C
compiler for OS/2 into generating correct code. Apparently IBM isn't going to
fix the problem.

3. Pcretest: the timing code wasn't using LOOPREPEAT for timing execution
calls, and wasn't printing the correct value for compiling calls. Increased the
default value of LOOPREPEAT, and the number of significant figures in the
times.

4. Changed "/bin/rm" in the Makefile to "-rm" so it works on Windows NT.

5. Renamed "deftables" as "dftables" to get it down to 8 characters, to avoid
a building problem on Windows NT with a FAT file system.


Version 2.01 21-Oct-98
----------------------

1. Changed the API for pcre_compile() to allow for the provision of a pointer
to character tables built by pcre_maketables() in the current locale. If NULL
is passed, the default tables are used.


Version 2.00 24-Sep-98
----------------------

1. Since the (?>) facility is in Perl 5.005, don't require PCRE_EXTRA to enable
it any more.

2. Allow quantification of (?>) groups, and make it work correctly.

3. The first character computation wasn't working for (?>) groups.

4. Correct the implementation of \Z (it is permitted to match on the \n at the
end of the subject) and add 5.005's \z, which really does match only at the
very end of the subject.

5. Remove the \X "cut" facility; Perl doesn't have it, and (?> is neater.

6. Remove the ability to specify CASELESS, MULTILINE, DOTALL, and
DOLLAR_END_ONLY at runtime, to make it possible to implement the Perl 5.005
localized options. All options to pcre_study() were also removed.

7. Add other new features from 5.005:

   (?<=            positive lookbehind
   (?<!            negative lookbehind
   (?imsx-imsx)    added the unsetting capability
                   such a setting is global if at outer level; local otherwise
   (?imsx-imsx:)   non-capturing groups with option setting
   (?(cond)re|re)  conditional pattern matching

   A backreference to itself in a repeated group matches the previous
   captured string.

8. General tidying up of studying (both automatic and via "study")
consequential on the addition of new assertions.

9. As in 5.005, unlimited repeated groups that could match an empty substring
are no longer faulted at compile time. Instead, the loop is forcibly broken at
runtime if any iteration does actually match an empty substring.

10. Include the RunTest script in the distribution.

11. Added tests from the Perl 5.005_02 distribution. This showed up a few
discrepancies, some of which were old and were also with respect to 5.004. They
have now been fixed.
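
For readers unfamiliar with the Perl 5.005 constructs listed in item 7 above,
here is a short illustration of several of them. It uses Python's re module,
which is not PCRE but accepts the same syntax for these particular features
(the scoped (?i:...) form needs Python 3.6 or later). The sample patterns and
subjects are invented for the example.

```python
import re

# (?<=...) positive lookbehind: digits that follow a dollar sign.
price = re.search(r"(?<=\$)\d+", "cost: $42")

# (?<!...) negative lookbehind: "bar" not preceded by "foo".
bare_bar = re.search(r"(?<!foo)bar", "foobar bar")

# (?i:...) option setting local to a non-capturing group.
mixed = re.fullmatch(r"(?i:abc)def", "ABCdef")
strict = re.fullmatch(r"(?i:abc)def", "ABCDEF")

# (?(1)yes|no) conditional: require ")" only if "(" was captured.
cond = re.compile(r"(\()?\d+(?(1)\))")

print(price.group(0), bare_bar.span(), mixed is not None, strict is not None)
# 42 (7, 10) True False
```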


Version 1.09 28-Apr-98
----------------------

1. A negated single character class followed by a quantifier with a minimum
value of one (e.g.  [^x]{1,6}  ) was not compiled correctly. This could lead to
program crashes, or just wrong answers. This did not apply to negated classes
containing more than one character, or to minima other than one.


Version 1.08 27-Mar-98
----------------------

1. Add PCRE_UNGREEDY to invert the greediness of quantifiers.

2. Add (?U) and (?X) to set PCRE_UNGREEDY and PCRE_EXTRA respectively. The
latter must appear before anything that relies on it in the pattern.
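
As a reminder of what "greediness" means in item 1 above, the sketch below
contrasts a greedy and an ungreedy quantifier. It is written in Python, which
has no equivalent of PCRE_UNGREEDY or (?U), so the two behaviours are spelled
out explicitly; under PCRE_UNGREEDY the meanings of the two spellings are
swapped.

```python
import re

subject = "<b>bold</b>"

greedy = re.search(r"<.*>", subject).group(0)   # default: take as much as possible
lazy = re.search(r"<.*?>", subject).group(0)    # "?" suffix: take as little as possible

print(greedy)  # <b>bold</b>
print(lazy)    # <b>
```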


Version 1.07 16-Feb-98
----------------------

1. A pattern such as /((a)*)*/ was not being diagnosed as in error (unlimited
repeat of a potentially empty string).


Version 1.06 23-Jan-98
----------------------

1. Added Markus Oberhumer's little patches for C++.

2. Literal strings longer than 255 characters were broken.


Version 1.05 23-Dec-97
----------------------

1. Negated character classes containing more than one character were failing if
PCRE_CASELESS was set at run time.


Version 1.04 19-Dec-97
----------------------

1. Corrected the man page, where some "const" qualifiers had been omitted.

2. Made debugging output print "{0,xxx}" instead of just "{,xxx}" to agree with
input syntax.

3. Fixed memory leak which occurred when a regex with back references was
matched with an offsets vector that wasn't big enough. The temporary memory
that is used in this case wasn't being freed if the match failed.

4. Tidied pcretest to ensure it frees memory that it gets.

5. Temporary memory was being obtained in the case where the passed offsets
vector was exactly big enough.

6. Corrected definition of offsetof() from change 5 below.

7. I had screwed up change 6 below and broken the rules for the use of
setjmp(). Now fixed.


Version 1.03 18-Dec-97
----------------------

1. An erroneous regex with a missing opening parenthesis was correctly
diagnosed, but PCRE attempted to access brastack[-1], which could cause crashes
on some systems.

2. Replaced offsetof(real_pcre, code) by offsetof(real_pcre, code[0]) because
it was reported that one broken compiler failed on the former because "code" is
also an independent variable.

3. The erroneous regex a[]b caused an array overrun reference.

4. A regex ending with a one-character negative class (e.g. /[^k]$/) did not
fail on data ending with that character. (It was going on too far, and checking
the next character, typically a binary zero.) This was specific to the
optimized code for single-character negative classes.

5. Added a contributed patch from the TIN world which does the following:

  + Add an undef for memmove, in case the system defines a macro for it.

  + Add a definition of offsetof(), in case there isn't one. (I don't know
    the reason behind this - offsetof() is part of the ANSI standard - but
    it does no harm).

  + Reduce the ifdef's in pcre.c using macro DPRINTF, thereby eliminating
    most of the places where whitespace preceded '#'. I have given up and
    allowed the remaining 2 cases to be at the margin.

  + Rename some variables in pcre to eliminate shadowing. This seems very
    pedantic, but does no harm, of course.

6. Moved the call to setjmp() into its own function, to get rid of warnings
from gcc -Wall, and avoided calling it at all unless PCRE_EXTRA is used.

7. Constructs such as \d{8,} were compiling into the equivalent of
\d{8}\d{0,65527} instead of \d{8}\d* which didn't make much difference to the
outcome, but in this particular case used more store than had been allocated,
which caused the bug to be discovered because it threw up an internal error.

8. The debugging code in both pcre and pcretest for outputting the compiled
form of a regex was going wrong in the case of back references followed by
curly-bracketed repeats.
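
The intended rewriting in item 7 above (an open-ended repeat becomes a fixed
repeat followed by a star) can be expressed as a one-line string
transformation. The helper below is hypothetical and works on pattern text in
Python; PCRE performs the equivalent step on its compiled form.

```python
import re

def expand_open_repeat(atom, minimum):
    """Rewrite atom{minimum,} as atom{minimum}atom* (hypothetical helper)."""
    return f"{atom}{{{minimum}}}{atom}*"

print(expand_open_repeat(r"\d", 8))  # \d{8}\d*
```

The rewritten pattern accepts exactly the strings that \d{8,} accepts, without
an explicit upper bound having to be stored.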


Version 1.02 12-Dec-97
----------------------

1. Typos in pcre.3 and comments in the source fixed.

2. Applied a contributed patch to get rid of places where it used to remove
'const' from variables, and fixed some signed/unsigned and uninitialized
variable warnings.

3. Added the "runtest" target to Makefile.

4. Set default compiler flag to -O2 rather than just -O.


Version 1.01 19-Nov-97
----------------------
