| 51 |
9. The restrictions on what a pattern can contain when partial matching is |
9. The restrictions on what a pattern can contain when partial matching is |
| 52 |
requested for pcre_exec() have been removed. All patterns can now be |
requested for pcre_exec() have been removed. All patterns can now be |
| 53 |
partially matched by this function. In addition, if there are at least two |
partially matched by this function. In addition, if there are at least two |
| 54 |
slots in the offset vector, the offsets of the first-encountered partial |
slots in the offset vector, the offset of the earliest inspected character |
| 55 |
match are set in them when PCRE_ERROR_PARTIAL is returned. |
for the match and the offset of the end of the subject are set in them when |
| 56 |
|
PCRE_ERROR_PARTIAL is returned. |
| 57 |
|
|
| 58 |
10. Partial matching has been split into two forms: PCRE_PARTIAL_SOFT, which is |
10. Partial matching has been split into two forms: PCRE_PARTIAL_SOFT, which is |
| 59 |
synonymous with PCRE_PARTIAL, for backwards compatibility, and |
synonymous with PCRE_PARTIAL, for backwards compatibility, and |
| 74 |
earlier partial match, unless partial matching was again requested. For |
earlier partial match, unless partial matching was again requested. For |
| 75 |
example, with the pattern /dog.(body)?/, the "must contain" character is |
example, with the pattern /dog.(body)?/, the "must contain" character is |
| 76 |
"g". If the first part-match was for the string "dog", restarting with |
"g". If the first part-match was for the string "dog", restarting with |
| 77 |
"sbody" failed. |
"sbody" failed. This bug has been fixed. |
| 78 |
|
|
| 79 |
13. Added a pcredemo man page, created automatically from the pcredemo.c file, |
13. The string returned by pcre_dfa_exec() after a partial match has been |
| 80 |
|
changed so that it starts at the first inspected character rather than the |
| 81 |
|
first character of the match. This makes a difference only if the pattern |
| 82 |
|
starts with a lookbehind assertion or \b or \B (\K is not supported by |
| 83 |
|
pcre_dfa_exec()). It's an incompatible change, but it makes the two |
| 84 |
|
matching functions compatible, and I think it's the right thing to do. |
| 85 |
|
|
| 86 |
|
14. Added a pcredemo man page, created automatically from the pcredemo.c file, |
| 87 |
so that the demonstration program is easily available in environments where |
so that the demonstration program is easily available in environments where |
| 88 |
PCRE has not been installed from source. |
PCRE has not been installed from source. |
| 89 |
|
|
| 90 |
14. Arranged to add -DPCRE_STATIC to cflags in libpcre.pc, libpcreposix.pc, |
15. Arranged to add -DPCRE_STATIC to cflags in libpcre.pc, libpcreposix.pc, |
| 91 |
libpcrecpp.pc and pcre-config when PCRE is not compiled as a shared |
libpcrecpp.pc and pcre-config when PCRE is not compiled as a shared |
| 92 |
library. |
library. |
| 93 |
|
|
| 94 |
15. Added REG_UNGREEDY to the pcreposix interface, at the request of a user. |
16. Added REG_UNGREEDY to the pcreposix interface, at the request of a user. |
| 95 |
It maps to PCRE_UNGREEDY. It is not, of course, POSIX-compatible, but it |
It maps to PCRE_UNGREEDY. It is not, of course, POSIX-compatible, but it |
| 96 |
is not the first non-POSIX option to be added. Clearly some people find |
is not the first non-POSIX option to be added. Clearly some people find |
| 97 |
these options useful. |
these options useful. |
| 98 |
|
|
| 99 |
16. If a caller to the POSIX matching function regexec() passes a non-zero |
17. If a caller to the POSIX matching function regexec() passes a non-zero |
| 100 |
value for \fInmatch\fP with a NULL value for \fIpmatch\fP, the value of |
value for nmatch with a NULL value for pmatch, the value of |
| 101 |
\fInmatch\fP is forced to zero. |
nmatch is forced to zero. |
| 102 |
|
|
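The nmatch/pmatch guard described in the regexec() entry can be sketched as a thin wrapper. This is only an illustration built on the system <regex.h>, not the pcreposix source; the name safe_regexec is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper (safe_regexec is not part of any library). It applies
   the guard described above before delegating to the system regexec(). */
int safe_regexec(const regex_t *preg, const char *subject,
                 size_t nmatch, regmatch_t *pmatch, int eflags)
{
    if (pmatch == NULL)
        nmatch = 0;   /* nowhere to store offsets, so request none */
    return regexec(preg, subject, nmatch, pmatch, eflags);
}
```

A caller that only wants a yes/no answer can then pass any nmatch with a NULL pmatch and no offsets are written.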
| 103 |
|
18. RunGrepTest did not have a test for the availability of the -u option of |
| 104 |
|
the diff command, as RunTest does. It now checks in the same way as |
| 105 |
|
RunTest, and also checks for the -b option. |
| 106 |
|
|
| 107 |
|
19. If an odd number of negated classes containing just a single character |
| 108 |
|
interposed, within parentheses, between a forward reference to a named |
| 109 |
|
subpattern and the definition of the subpattern, compilation crashed with |
| 110 |
|
an internal error, complaining that it could not find the referenced |
| 111 |
|
subpattern. An example of a crashing pattern is /(?&A)(([^m])(?<A>))/. |
| 112 |
|
[The bug was that it was starting one character too far in when skipping |
| 113 |
|
over the character class, thus treating the ] as data rather than |
| 114 |
|
terminating the class. This meant it could skip too much.] |
| 115 |
|
|
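The off-by-one in the character-class entry can be illustrated with a toy class skipper. This is a sketch under assumptions, not PCRE's actual code: skip_class and its start_too_far flag are hypothetical, and the only rule modelled is that a ] in first position is literal data.

```c
#include <stddef.h>

/* Hypothetical helper, not PCRE's source. Given p pointing at '[', return a
   pointer to the ']' that closes the class, or NULL if none is found.
   A non-zero start_too_far reproduces the kind of off-by-one described above. */
const char *skip_class(const char *p, int start_too_far)
{
    p++;                                    /* step over '[' */
    if (*p == '^') p++;                     /* optional negation */
    if (start_too_far && *p != '\0') p++;   /* the off-by-one */
    if (*p == ']') p++;                     /* ']' in first position is data */
    for (; *p != '\0'; p++) {
        if (*p == '\\' && p[1] != '\0') { p++; continue; }  /* escaped char */
        if (*p == ']') return p;            /* terminating ']' */
    }
    return NULL;
}
```

With the tail of the example pattern, "[^m])(?<A>))", the correct scan stops at the ] after m, while the off-by-one scan treats that ] as data and runs past the (?<A>) definition. With two such classes in a row the second ] ends the scan, which is consistent with the bug needing an odd number of them.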
| 116 |
|
20. Added PCRE_NOTEMPTY_ATSTART in order to be able to correctly implement the |
| 117 |
|
/g option in pcretest when the pattern contains \K, which makes it possible |
| 118 |
|
to have an empty string match not at the start, even when the pattern is |
| 119 |
|
anchored. Updated pcretest and pcredemo to use this option. |
| 120 |
|
|
| 121 |
|
21. If the maximum number of capturing subpatterns in a recursion was greater |
| 122 |
|
than the maximum at the outer level, the higher number was returned, but |
| 123 |
|
with unset values at the outer level. The correct (outer level) value is |
| 124 |
|
now given. |
| 125 |
|
|
| 126 |
|
22. If (*ACCEPT) appeared inside capturing parentheses, previous releases of |
| 127 |
|
PCRE did not set those parentheses (unlike Perl). I have now found a way to |
| 128 |
|
make it do so. The string so far is captured, making this feature |
| 129 |
|
compatible with Perl. |
| 130 |
|
|
| 131 |
|
23. The tests have been re-organized, adding tests 11 and 12, to make it |
| 132 |
|
possible to check the Perl 5.10 features against Perl 5.10. |
| 133 |
|
|
| 134 |
|
24. Perl 5.10 allows subroutine calls in lookbehinds, as long as the subroutine |
| 135 |
|
pattern matches a fixed length string. PCRE did not allow this; now it |
| 136 |
|
does. Neither allows recursion. |
| 137 |
|
|
| 138 |
|
25. I finally figured out how to implement a request to provide the minimum |
| 139 |
|
length of subject string that was needed in order to match a given pattern. |
| 140 |
|
(It was back references and recursion that I had previously got hung up |
| 141 |
|
on.) This code has now been added to pcre_study(); it finds a lower bound |
| 142 |
|
to the length of subject needed. It is not necessarily the greatest lower |
| 143 |
|
bound, but using it to avoid searching strings that are too short does give |
| 144 |
|
some useful speed-ups. The value is available to calling programs via |
| 145 |
|
pcre_fullinfo(). |
| 146 |
|
|
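The idea of a minimum-length lower bound can be sketched for a toy pattern subset. This is not the pcre_study() algorithm: it assumes patterns made only of single-character atoms, each optionally followed by ?, * or +, and both function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for a toy pattern subset: single-character atoms,
   each optionally followed by ?, * or +. No groups, classes or escapes. */
size_t min_length_lower_bound(const char *pattern)
{
    size_t min = 0;
    for (const char *p = pattern; *p != '\0'; p++) {
        char quant = p[1];
        if (quant == '?' || quant == '*') {
            p++;                    /* atom may match zero times */
        } else {
            min++;                  /* atom must match at least once */
            if (quant == '+') p++;  /* step over the quantifier */
        }
    }
    return min;
}

/* Returns 1 if the subject is long enough to be worth matching at all. */
int worth_matching(const char *pattern, const char *subject)
{
    return strlen(subject) >= min_length_lower_bound(pattern);
}
```

A subject shorter than the bound can be rejected without running the matcher, which is where the speed-up comes from. The bound only has to be safe, not tight.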
| 147 |
|
26. While implementing 25, I discovered to my embarrassment that pcretest had |
| 148 |
|
not been passing the result of pcre_study() to pcre_dfa_exec(), so the |
| 149 |
|
study optimizations had never been tested with that matching function. |
| 150 |
|
Oops. What is worse, even when it was passed study data, there was a bug in |
| 151 |
|
pcre_dfa_exec() that meant it never actually used it. Double oops. There |
| 152 |
|
were also very few tests of studied patterns with pcre_dfa_exec(). |
| 153 |
|
|
| 154 |
|
|
| 155 |
Version 7.9 11-Apr-09 |
Version 7.9 11-Apr-09 |