Re: [pcre-dev] 'Hard' partial matching don't work with some …

Author: Philip Hazel
Date:  
To: ND
CC: Pcre-dev
Subject: Re: [pcre-dev] 'Hard' partial matching don't work with some assertions
On Sun, 29 Aug 2010, ND wrote:

> '\z' needs to know not the value of the next character but *its presence*.


Yes, and if the matching is at the end of the string, it knows there is
no presence because there are no more characters.

> And the '\b' assertion must know the next character.


Again, \b is defined to match at the end of the string if the last
character matches \w.

Both of these effects occur because the "partial" means "partial
matching", not "partial string".
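
A minimal sketch of this effect, using Python's standard `re` module as a stand-in rather than PCRE itself (the segment contents are invented for illustration). An end-of-subject assertion is evaluated against whatever subject the matcher was given, so a segment boundary looks like the end of the string:

```python
import re

# Hypothetical data: one logical string delivered as two segments.
segments = ["the cat", "alog"]
pattern = re.compile(r"\bcat\b")

# First segment alone: the trailing \b succeeds because the subject
# ends immediately after a \w character ("t").
first_only = pattern.search(segments[0])

# Joined string: "cat" is followed by "a", so the trailing \b fails
# and there is no match at all.
joined = pattern.search("".join(segments))

print(first_only is not None)  # True
print(joined is not None)      # False
```

The match reported for the first segment is correct for that subject, but it is not a match in the full string.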

> How can I check it correctly in a multisegment string?


I do not know. PCRE was not designed to work with multisegment strings.

[Idea: if you get a match and the match ends at the end of your segment,
add the next segment and try again?]
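
The bracketed idea could be sketched roughly as follows. This is an illustration only: it uses Python's `re` module instead of pcre_exec(), the function name `search_segments` is hypothetical, and it simply keeps every segment in one growing buffer rather than discarding text that can no longer matter.

```python
import re

def search_segments(pattern, segments):
    """Search a string delivered in segments.

    A match that ends exactly at the end of the data seen so far is
    not trusted while more segments remain; the next segment is
    appended and the search is repeated from scratch.
    """
    regex = re.compile(pattern)
    buffer = ""
    for i, segment in enumerate(segments):
        buffer += segment
        is_last = (i == len(segments) - 1)
        match = regex.search(buffer)
        # Accept the match if no more data can arrive, or if the match
        # stops before the end of the buffer.
        if match is not None and (is_last or match.end() < len(buffer)):
            return match
    return None

# "cat" ends the first segment, so the second segment is added first.
print(search_segments(r"\bcat\b", ["the cat", "alog"]))         # None
print(search_segments(r"\bcat\b", ["the cat", " sat"]).span())  # (4, 7)
```

Re-running the search over the whole buffer is the simplest choice, not the most efficient one; it trades repeated work for not having to reason about which earlier text is still needed.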

> Please tell me, what use was the 'hard' option intended for if it cannot
> operate on multisegment strings?


"Hard" prefers a partial match to a later, complete match (as described
in the "pcrepartial" document). "Soft" prefers a later, complete match if
it can find one.

> I think that when the 'hard' option was introduced, the behaviour of these
> lookahead assertions was not taken into consideration. An appropriate
> correction would fix the problem.


I will not change current behaviour because there may be people who are
relying on it. Compatibility is important.

In theory, perhaps, a new option could be added. However, I am now
retired (and getting old and more forgetful) and I do not think I will
have the desire, or the time, to add new features to PCRE, though I do
still mend bugs.

Presumably, the restrictions for pcre_dfa_exec() mean that you can't use
it? It is more straightforward to handle multisegment strings with
pcre_dfa_exec(), but it does behave in a very different way.

I am sorry that I have misunderstood you during this thread; I plead old
age.

Philip

--
Philip Hazel