| 117 |
Character classes |
Character classes |
| 118 |
----------------- |
----------------- |
| 119 |
|
|
| 120 |
OP_CLASS is used for a character class. It is followed by a 32-byte bit map |
OP_CLASS is used for a character class, and OP_NEGCLASS for a negated character |
| 121 |
containing a 1 bit for every character that is acceptable. The bits are counted |
class, provided there are at least two characters in the class. If there is |
| 122 |
from the least significant end of each byte. |
only one character, OP_CHARS is used for a positive class, and OP_NOT for a |
| 123 |
|
negative one. A set of repeating opcodes (OP_NOTSTAR etc.) is used for a |
| 124 |
|
repeated, negated, single-character class. |
| 125 |
|
|
| 126 |
|
Both OP_CLASS and OP_NEGCLASS are followed by a 32-byte bit map containing a 1 |
| 127 |
|
bit for every character that is acceptable. The bits are counted from the least |
| 128 |
|
significant end of each byte. The reason for having two opcodes is to cope with |
| 129 |
|
negated character classes when caseless matching is specified at run time but |
| 130 |
|
not at compile time. If it is specified at compile time, the bit map is built |
| 131 |
|
appropriately. This is the only time that a distinction is made between |
| 132 |
|
OP_CLASS and OP_NEGCLASS, when the bit map was built in a caseful manner but |
| 133 |
|
matching must be caseless. For OP_CLASS, a character matches if either of its |
| 134 |
|
cases is in the bit map, but for OP_NEGCLASS, both of them must be present. |
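The matching rule described above can be sketched in C as follows. This is
only an illustration under stated assumptions, not the code from the library
itself: the names uschar, bit_is_set, build_class_map, other_case and
class_matches are hypothetical, the map is assumed to have been built in a
caseful manner, and the other case of a character is taken from the C locale
via <ctype.h>.

```c
#include <ctype.h>
#include <string.h>

typedef unsigned char uschar;   /* hypothetical name for an unsigned byte */

/* Test the bit for character c (0..255) in a 32-byte map. Bits are counted
from the least significant end of each byte. */
static int bit_is_set(const uschar *map, int c)
{
  return (map[c / 8] & (1 << (c % 8))) != 0;
}

/* Hypothetical helper: build a caseful map from a list of characters. For a
negated class the map is inverted, so that it still holds a 1 bit for every
acceptable character. */
static void build_class_map(uschar *map, const char *chars, int negated)
{
  int i;
  memset(map, 0, 32);
  for (; *chars != 0; chars++)
    {
    int c = (uschar)*chars;
    map[c / 8] |= (uschar)(1 << (c % 8));
    }
  if (negated)
    for (i = 0; i < 32; i++) map[i] = (uschar)~map[i];
}

/* Assumption: the other case comes from the C locale's ctype functions. */
static int other_case(int c)
{
  return islower(c) ? toupper(c) : tolower(c);
}

/* Match c against a caseful map. When caseless matching is requested, a
positive class accepts c if either case is in the map, while a negated class
accepts c only if both cases are in the map. */
static int class_matches(const uschar *map, int c, int negated, int caseless)
{
  int oc;
  if (!caseless) return bit_is_set(map, c);
  oc = other_case(c);
  if (negated) return bit_is_set(map, c) && bit_is_set(map, oc);
  return bit_is_set(map, c) || bit_is_set(map, oc);
}
```

For example, with a map built for the negated class [^ab], 'A' has its own
bit set but the bit for 'a' is clear, so class_matches rejects 'A' when
matching is caseless and accepts it when matching is caseful.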
| 135 |
|
|
| 136 |
|
|
| 137 |
Back references |
Back references |
| 208 |
|
|
| 209 |
|
|
| 210 |
Philip Hazel |
Philip Hazel |
| 211 |
October 1997 |
December 1997 |