generating tables according to the current locale when PCRE is compiled. It
turns out that in some environments, 0x85 and 0xa0, which are Unicode space
characters, are recognized by isspace() and therefore were getting set in
these tables, and indeed these tables seem to approximate to ISO 8859. This
caused a problem in UTF-8 mode when pcre_study() was used to create a list
of bytes that can start a match. For \s, it was including 0x85 and 0xa0,
which of course cannot start UTF-8 characters. I have changed the code so
that only real ASCII characters (less than 128) and the correct starting
bytes for UTF-8 encodings are set for characters greater than 127 when in
UTF-8 mode. (When PCRE_UCP is set - see 9 above - the code is different
altogether.)

20. Added the /T option to pcretest so as to be able to run tests with
non-standard character tables, thus making it possible to include the tests
used for 19 above in the standard set of tests.
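
The start-byte fix described in 19 can be illustrated with a minimal sketch. This is not the PCRE source: the names make_space_class, set_start_bits, set_bit, test_bit and utf8_lead_byte are hypothetical, and the space table is hard-coded to mimic a locale in which isspace() accepts 0x85 and 0xa0. It assumes table entries are code points 0 to 255, so anything above 127 encodes as two UTF-8 bytes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; these are not PCRE functions. */

static void set_bit(uint8_t map[32], unsigned b)
{
    map[b >> 3] |= (uint8_t)(1u << (b & 7));
}

static int test_bit(const uint8_t map[32], unsigned b)
{
    return (map[b >> 3] >> (b & 7)) & 1;
}

/* First byte of the UTF-8 encoding of code point c, valid for c in 0..255.
   Code points 128..255 encode as two bytes: 0xC0 | (c >> 6), then
   0x80 | (c & 0x3F). */
static unsigned utf8_lead_byte(unsigned c)
{
    return (c < 0x80) ? c : (0xC0u | (c >> 6));
}

/* Assumed stand-in for a locale-generated \s table that also marks
   0x85 and 0xa0 as spaces. */
static void make_space_class(uint8_t class_bits[32])
{
    static const unsigned spaces[] = { 9, 10, 11, 12, 13, 32, 0x85, 0xa0 };
    size_t i;
    memset(class_bits, 0, 32);
    for (i = 0; i < sizeof spaces / sizeof spaces[0]; i++)
        set_bit(class_bits, spaces[i]);
}

/* Add to start_bits every byte that can begin a character of the class.
   In UTF-8 mode, characters above 127 contribute the lead byte of their
   encoding rather than their own value. */
static void set_start_bits(uint8_t start_bits[32],
                           const uint8_t class_bits[32], int utf8)
{
    unsigned c;
    for (c = 0; c < 256; c++) {
        if (!test_bit(class_bits, c))
            continue;
        if (!utf8 || c < 128)
            set_bit(start_bits, c);
        else
            set_bit(start_bits, utf8_lead_byte(c));
    }
}
```

With the table above and utf8 set, the start map holds the ASCII spaces plus 0xC2, the lead byte shared by U+0085 and U+00A0, and neither 0x85 nor 0xa0 appears. With utf8 clear, 0x85 and 0xa0 are set directly.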


Version 8.02 19-Mar-2010