| 2 |
of PCRE's API, error diagnostics, and the compiled code of some patterns. |
of PCRE's API, error diagnostics, and the compiled code of some patterns. |
| 3 |
It also checks the non-Perl syntax that PCRE supports (Python, .NET, |
It also checks the non-Perl syntax that PCRE supports (Python, .NET, |
| 4 |
Oniguruma). Finally, there are some tests where PCRE and Perl differ, |
Oniguruma). Finally, there are some tests where PCRE and Perl differ, |
| 5 |
either because PCRE can't be compatible, or there is potential Perl |
either because PCRE can't be compatible, or there is a possible Perl |
| 6 |
bug. --/ |
bug. --/ |
| 7 |
|
|
| 8 |
/-- Originally, the Perl 5.10 things were in here too, but now I have separated |
/-- Originally, the Perl 5.10 and 5.11 things were in here too, but now I have |
| 9 |
many (most?) of them out into test 11. However, there may still be some |
separated many (most?) of them out into test 11. However, there may still |
| 10 |
that were overlooked. --/ |
be some that were overlooked. --/ |
| 11 |
|
|
| 12 |
/(a)b|/I |
/(a)b|/I |
| 13 |
|
|
| 51 |
|
|
| 52 |
/(?X)[\B]/ |
/(?X)[\B]/ |
| 53 |
|
|
| 54 |
|
/(?X)[\R]/ |
| 55 |
|
|
| 56 |
|
/(?X)[\X]/ |
| 57 |
|
|
| 58 |
|
/[\B]/BZ |
| 59 |
|
|
| 60 |
|
/[\R]/BZ |
| 61 |
|
|
| 62 |
|
/[\X]/BZ |
| 63 |
|
|
| 64 |
/[z-a]/ |
/[z-a]/ |
| 65 |
|
|
| 66 |
/^*/ |
/^*/ |
| 354 |
*** Failers |
*** Failers |
| 355 |
a |
a |
| 356 |
|
|
| 357 |
/This one is here because I think Perl 5.005_02 gets the setting of $1 wrong/I |
/This one is here because Perl behaves differently; see also the following/I |
| 358 |
|
|
| 359 |
/^(a\1?){4}$/I |
/^(a\1?){4}$/I |
| 360 |
|
aaaa |
| 361 |
aaaaaa |
aaaaaa |
| 362 |
|
|
| 363 |
|
/Perl does not fail these two for the final subjects. Neither did PCRE until/ |
| 364 |
|
/release 8.01. The problem is in backtracking into a subpattern that contains/ |
| 365 |
|
/a recursive reference to itself. PCRE has now made these into atomic patterns./ |
| 366 |
|
|
| 367 |
|
/^(xa|=?\1a){2}$/ |
| 368 |
|
xa=xaa |
| 369 |
|
** Failers |
| 370 |
|
xa=xaaa |
| 371 |
|
|
| 372 |
|
/^(xa|=?\1a)+$/ |
| 373 |
|
xa=xaa |
| 374 |
|
** Failers |
| 375 |
|
xa=xaaa |
| 376 |
|
|
| 377 |
/These are syntax tests from Perl 5.005/I |
/These are syntax tests from Perl 5.005/I |
| 378 |
|
|
| 2289 |
/a+b?(*THEN)c+(*FAIL)/C |
/a+b?(*THEN)c+(*FAIL)/C |
| 2290 |
aaabccc |
aaabccc |
| 2291 |
|
|
|
/a(*PRUNE:XXX)b/ |
|
|
|
|
| 2292 |
/a(*MARK)b/ |
/a(*MARK)b/ |
| 2293 |
|
|
| 2294 |
/(?i:A{1,}\6666666666)/ |
/(?i:A{1,}\6666666666)/ |
| 3198 |
/()i(?(1)a)/SI |
/()i(?(1)a)/SI |
| 3199 |
ia |
ia |
| 3200 |
|
|
| 3201 |
|
/(?i)a(?-i)b|c/BZ |
| 3202 |
|
XabX |
| 3203 |
|
XAbX |
| 3204 |
|
CcC |
| 3205 |
|
** Failers |
| 3206 |
|
XABX |
| 3207 |
|
|
| 3208 |
|
/(?i)a(?s)b|c/BZ |
| 3209 |
|
|
| 3210 |
|
/(?i)a(?s-i)b|c/BZ |
| 3211 |
|
|
| 3212 |
|
/^(ab(c\1)d|x){2}$/BZ |
| 3213 |
|
xabcxd |
| 3214 |
|
|
| 3215 |
|
/^(?&t)*+(?(DEFINE)(?<t>.))$/BZ |
| 3216 |
|
|
| 3217 |
|
/^(?&t)*(?(DEFINE)(?<t>.))$/BZ |
| 3218 |
|
|
| 3219 |
|
/ -- The first four of these are not in the Perl 5.10 test because Perl |
| 3220 |
|
documents that the use of \K in assertions is "not well defined". The |
| 3221 |
|
last is here because Perl gives the match as "b" rather than "ab". I |
| 3222 |
|
believe this to be a Perl bug. --/ |
| 3223 |
|
|
| 3224 |
|
/(?=a\Kb)ab/ |
| 3225 |
|
ab |
| 3226 |
|
|
| 3227 |
|
/(?!a\Kb)ac/ |
| 3228 |
|
ac |
| 3229 |
|
|
| 3230 |
|
/^abc(?<=b\Kc)d/ |
| 3231 |
|
abcd |
| 3232 |
|
|
| 3233 |
|
/^abc(?<!b\Kq)d/ |
| 3234 |
|
abcd |
| 3235 |
|
|
| 3236 |
|
/(?>a\Kb)z|(ab)/ |
| 3237 |
|
ab |
| 3238 |
|
|
| 3239 |
|
/----------------------/ |
| 3240 |
|
|
| 3241 |
|
/(?P<L1>(?P<L2>0|)|(?P>L2)(?P>L1))/ |
| 3242 |
|
|
| 3243 |
|
/abc(*MARK:)pqr/ |
| 3244 |
|
|
| 3245 |
|
/abc(*:)pqr/ |
| 3246 |
|
|
| 3247 |
|
/abc(*FAIL:123)xyz/ |
| 3248 |
|
|
| 3249 |
|
/--- This should, and does, fail. In Perl, it does not, which I think is a |
| 3250 |
|
bug because replacing the B in the pattern by (B|D) does make it fail. ---/ |
| 3251 |
|
|
| 3252 |
|
/A(*COMMIT)B/+K |
| 3253 |
|
ACABX |
| 3254 |
|
|
| 3255 |
|
/--- These should be different, but in Perl 5.11 are not, which I think |
| 3256 |
|
is a bug in Perl. ---/ |
| 3257 |
|
|
| 3258 |
|
/A(*THEN)B|A(*THEN)C/K |
| 3259 |
|
AC |
| 3260 |
|
|
| 3261 |
|
/A(*PRUNE)B|A(*PRUNE)C/K |
| 3262 |
|
AC |
| 3263 |
|
|
| 3264 |
|
/--- A whole lot of tests of verbs with arguments are here rather than in test |
| 3265 |
|
11 because Perl doesn't seem to follow its specification entirely |
| 3266 |
|
correctly. ---/ |
| 3267 |
|
|
| 3268 |
|
/--- Perl 5.11 sets $REGERROR on the AC failure case here; PCRE does not. It is |
| 3269 |
|
not clear how Perl defines "involved in the failure of the match". ---/ |
| 3270 |
|
|
| 3271 |
|
/^(A(*THEN:A)B|C(*THEN:B)D)/K |
| 3272 |
|
AB |
| 3273 |
|
CD |
| 3274 |
|
** Failers |
| 3275 |
|
AC |
| 3276 |
|
CB |
| 3277 |
|
|
| 3278 |
|
/--- Check the use of names for success and failure. PCRE doesn't show these |
| 3279 |
|
names for success, though Perl does, contrary to its spec. ---/ |
| 3280 |
|
|
| 3281 |
|
/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K |
| 3282 |
|
AB |
| 3283 |
|
CD |
| 3284 |
|
** Failers |
| 3285 |
|
AC |
| 3286 |
|
CB |
| 3287 |
|
|
| 3288 |
|
/--- An empty name does not pass back an empty string. It is the same as if no |
| 3289 |
|
name were given. ---/ |
| 3290 |
|
|
| 3291 |
|
/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K |
| 3292 |
|
AB |
| 3293 |
|
CD |
| 3294 |
|
|
| 3295 |
|
/--- PRUNE goes to next bumpalong; COMMIT does not. ---/ |
| 3296 |
|
|
| 3297 |
|
/A(*PRUNE:A)B/K |
| 3298 |
|
ACAB |
| 3299 |
|
|
| 3300 |
|
/(*MARK:A)(*PRUNE:B)(C|X)/K |
| 3301 |
|
C |
| 3302 |
|
D |
| 3303 |
|
|
| 3304 |
|
/(*MARK:A)(*THEN:B)(C|X)/K |
| 3305 |
|
C |
| 3306 |
|
D |
| 3307 |
|
|
| 3308 |
|
/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/ |
| 3309 |
|
|
| 3310 |
|
/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK |
| 3311 |
|
AAAC |
| 3312 |
|
|
| 3313 |
|
/--- Same --/ |
| 3314 |
|
|
| 3315 |
|
/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK |
| 3316 |
|
AAAC |
| 3317 |
|
|
| 3318 |
|
/--- This should fail; the SKIP advances by one, but when we get to AC, the |
| 3319 |
|
PRUNE kills it. ---/ |
| 3320 |
|
|
| 3321 |
|
/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xK |
| 3322 |
|
AAAC |
| 3323 |
|
|
| 3324 |
|
/A(*:A)A+(*SKIP)(B|Z) | AC/xK |
| 3325 |
|
AAAC |
| 3326 |
|
|
| 3327 |
|
/--- This should fail, as a null name is the same as no name ---/ |
| 3328 |
|
|
| 3329 |
|
/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK |
| 3330 |
|
AAAC |
| 3331 |
|
|
| 3332 |
|
/--- This fails in PCRE, and I think that is in accordance with Perl's |
| 3333 |
|
documentation, though in Perl it succeeds. ---/ |
| 3334 |
|
|
| 3335 |
|
/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK |
| 3336 |
|
AAAC |
| 3337 |
|
|
| 3338 |
|
/--- Mark names can be duplicated ---/ |
| 3339 |
|
|
| 3340 |
|
/A(*:A)B|X(*:A)Y/K |
| 3341 |
|
AABC |
| 3342 |
|
XXYZ |
| 3343 |
|
|
| 3344 |
|
/^A(*:A)B|^X(*:A)Y/K |
| 3345 |
|
** Failers |
| 3346 |
|
XAQQ |
| 3347 |
|
|
| 3348 |
|
/--- A check on what happens after hitting a mark and then bumping along to |
| 3349 |
|
something that does not even start. Perl reports tags after the failures here, |
| 3350 |
|
though it does not when the individual letters are made into something |
| 3351 |
|
more complicated. ---/ |
| 3352 |
|
|
| 3353 |
|
/A(*:A)B|XX(*:B)Y/K |
| 3354 |
|
AABC |
| 3355 |
|
XXYZ |
| 3356 |
|
** Failers |
| 3357 |
|
XAQQ |
| 3358 |
|
XAQQXZZ |
| 3359 |
|
AXQQQ |
| 3360 |
|
AXXQQQ |
| 3361 |
|
|
| 3362 |
|
/--- COMMIT at the start of a pattern should be the same as an anchor. Perl |
| 3363 |
|
optimizations defeat this. So does the PCRE optimization unless we disable it |
| 3364 |
|
with \Y. ---/ |
| 3365 |
|
|
| 3366 |
|
/(*COMMIT)ABC/ |
| 3367 |
|
ABCDEFG |
| 3368 |
|
** Failers |
| 3369 |
|
DEFGABC\Y |
| 3370 |
|
|
| 3371 |
|
/--- Repeat some tests with added studying. ---/ |
| 3372 |
|
|
| 3373 |
|
/A(*COMMIT)B/+KS |
| 3374 |
|
ACABX |
| 3375 |
|
|
| 3376 |
|
/A(*THEN)B|A(*THEN)C/KS |
| 3377 |
|
AC |
| 3378 |
|
|
| 3379 |
|
/A(*PRUNE)B|A(*PRUNE)C/KS |
| 3380 |
|
AC |
| 3381 |
|
|
| 3382 |
|
/^(A(*THEN:A)B|C(*THEN:B)D)/KS |
| 3383 |
|
AB |
| 3384 |
|
CD |
| 3385 |
|
** Failers |
| 3386 |
|
AC |
| 3387 |
|
CB |
| 3388 |
|
|
| 3389 |
|
/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/KS |
| 3390 |
|
AB |
| 3391 |
|
CD |
| 3392 |
|
** Failers |
| 3393 |
|
AC |
| 3394 |
|
CB |
| 3395 |
|
|
| 3396 |
|
/^(A(*PRUNE:)B|C(*PRUNE:B)D)/KS |
| 3397 |
|
AB |
| 3398 |
|
CD |
| 3399 |
|
|
| 3400 |
|
/A(*PRUNE:A)B/KS |
| 3401 |
|
ACAB |
| 3402 |
|
|
| 3403 |
|
/(*MARK:A)(*PRUNE:B)(C|X)/KS |
| 3404 |
|
C |
| 3405 |
|
D |
| 3406 |
|
|
| 3407 |
|
/(*MARK:A)(*THEN:B)(C|X)/KS |
| 3408 |
|
C |
| 3409 |
|
D |
| 3410 |
|
|
| 3411 |
|
/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xKS |
| 3412 |
|
AAAC |
| 3413 |
|
|
| 3414 |
|
/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xKS |
| 3415 |
|
AAAC |
| 3416 |
|
|
| 3417 |
|
/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xKS |
| 3418 |
|
AAAC |
| 3419 |
|
|
| 3420 |
|
/A(*:A)A+(*SKIP)(B|Z) | AC/xKS |
| 3421 |
|
AAAC |
| 3422 |
|
|
| 3423 |
|
/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xKS |
| 3424 |
|
AAAC |
| 3425 |
|
|
| 3426 |
|
/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xKS |
| 3427 |
|
AAAC |
| 3428 |
|
|
| 3429 |
|
/A(*:A)B|XX(*:B)Y/KS |
| 3430 |
|
AABC |
| 3431 |
|
XXYZ |
| 3432 |
|
** Failers |
| 3433 |
|
XAQQ |
| 3434 |
|
XAQQXZZ |
| 3435 |
|
AXQQQ |
| 3436 |
|
AXXQQQ |
| 3437 |
|
|
| 3438 |
|
/(*COMMIT)ABC/ |
| 3439 |
|
ABCDEFG |
| 3440 |
|
** Failers |
| 3441 |
|
DEFGABC\Y |
| 3442 |
|
|
| 3443 |
|
/^(ab (c+(*THEN)cd) | xyz)/x |
| 3444 |
|
abcccd |
| 3445 |
|
|
| 3446 |
|
/^(ab (c+(*PRUNE)cd) | xyz)/x |
| 3447 |
|
abcccd |
| 3448 |
|
|
| 3449 |
|
/^(ab (c+(*FAIL)cd) | xyz)/x |
| 3450 |
|
abcccd |
| 3451 |
|
|
| 3452 |
|
/--- Perl 5.11 gets some of these wrong ---/ |
| 3453 |
|
|
| 3454 |
|
/(?>.(*ACCEPT))*?5/ |
| 3455 |
|
abcde |
| 3456 |
|
|
| 3457 |
|
/(.(*ACCEPT))*?5/ |
| 3458 |
|
abcde |
| 3459 |
|
|
| 3460 |
|
/(.(*ACCEPT))5/ |
| 3461 |
|
abcde |
| 3462 |
|
|
| 3463 |
|
/(.(*ACCEPT))*5/ |
| 3464 |
|
abcde |
| 3465 |
|
|
| 3466 |
|
/A\NB./BZ |
| 3467 |
|
ACBD |
| 3468 |
|
*** Failers |
| 3469 |
|
A\nB |
| 3470 |
|
ACB\n |
| 3471 |
|
|
| 3472 |
|
/A\NB./sBZ |
| 3473 |
|
ACBD |
| 3474 |
|
ACB\n |
| 3475 |
|
*** Failers |
| 3476 |
|
A\nB |
| 3477 |
|
|
| 3478 |
|
/A\NB/<crlf> |
| 3479 |
|
A\nB |
| 3480 |
|
A\rB |
| 3481 |
|
** Failers |
| 3482 |
|
A\r\nB |
| 3483 |
|
|
| 3484 |
|
/\R+b/BZ |
| 3485 |
|
|
| 3486 |
|
/\R+\n/BZ |
| 3487 |
|
|
| 3488 |
|
/\R+\d/BZ |
| 3489 |
|
|
| 3490 |
|
/\d*\R/BZ |
| 3491 |
|
|
| 3492 |
|
/\s*\R/BZ |
| 3493 |
|
|
| 3494 |
|
/-- Perl treats this one differently, not failing the second string. I believe |
| 3495 |
|
that is a bug in Perl. --/ |
| 3496 |
|
|
| 3497 |
|
/^((abc|abcx)(*THEN)y|abcd)/ |
| 3498 |
|
abcd |
| 3499 |
|
*** Failers |
| 3500 |
|
abcxy |
| 3501 |
|
|
| 3502 |
/-- End of testinput2 --/ |
/-- End of testinput2 --/ |