Re: split patches working now, but it revealed a problem in the regex compilation

Dave Mitchell
March 26, 2013 11:02
On Tue, Mar 26, 2013 at 08:40:30AM +0100, demerphq wrote:
> I have pushed a new yves/revert_splitwhite branch.
> It passes all tests except for some which now fail in re/recompile.t
> As far as I can tell these failures come from a change I did to fix
> the problem that the caching logic was using the precompiled string
> (meaning without "(?...:....)" wrapper) and not checking that the
> compiled pattern and the uncompiled had the same regex flags.
> I noticed this testing the behavior of
> split $_ ? / / : ' ', $string for 1,0,1,0
> which would not recompile due to the two patterns differing only by flags.
> My fix is apparently overly pessimistic, and causes perl to recompile
> things like "\x{100}" more often than we should. This is because the
> pattern starts off as utf8-off, but ends up as utf8-on.
> At worst this means we recompile utf8 patterns more often.
> Anyway, before I TODO the failing tests I thought I should let Dave M
> know and see if he can come up with something.
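For readers unfamiliar with why the two patterns in the quoted split test must not share a compiled regex, here is a minimal sketch of how the two literal forms differ. The sample string and variable names are illustrative only and do not come from the branch or its tests.

```perl
use strict;
use warnings;

# Illustrative input: leading spaces and a doubled space between fields.
my $string = "  a  b c";

# A literal single-space string selects split's special whitespace mode:
# leading whitespace is skipped and fields are separated by runs of whitespace.
my @awk_mode = split ' ', $string;

# A regex matching one space splits at every single space, so leading and
# adjacent spaces produce empty fields.
my @regex_mode = split / /, $string;

print scalar(@awk_mode),   " fields: ", join('|', @awk_mode),   "\n";
print scalar(@regex_mode), " fields: ", join('|', @regex_mode), "\n";
```

The first call yields three fields and the second six, so a cache that treats the two patterns as identical would return wrong results for one of them.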

I'm not really an expert in this area; I just happened to notice during
the re_eval work that it was buggily caching stuff because it was just
comparing the pattern bytes, and not the utf8 flag; and further, I found
that the recompile logic didn't have any tests, so I added a basic
test file.
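To illustrate the kind of case meant here, the following is a rough sketch, not taken from re/recompile.t, of two pattern strings whose internal bytes are identical but whose utf8 flags differ. The variable names are made up for the example, and the final match results assume a perl in which the cache check takes the flag into account.

```perl
use strict;
use warnings;

my $bytes = "\xc4\x80";   # two characters (U+00C4, U+0080), utf8 flag off
my $char  = "\x{100}";    # one character (U+0100), utf8 flag on

# Internally, U+0100 is stored as the bytes C4 80, so a byte-wise comparison
# of the two buffers says "equal" even though the strings differ.
my $raw = $char;
utf8::encode($raw);       # expose the internal UTF-8 bytes of $char
my $same_bytes = ($raw eq $bytes) ? 1 : 0;

# Normalise each flag to 0/1 before comparing them.
my $flag_bytes = utf8::is_utf8($bytes) ? 1 : 0;
my $flag_char  = utf8::is_utf8($char)  ? 1 : 0;
my $same_flag  = ($flag_bytes == $flag_char) ? 1 : 0;

# Run both patterns through one match op, so the second iteration must
# recompile rather than reuse the regex compiled for the first.
my @matched;
for my $pat ($bytes, $char) {
    push @matched, ("\x{100}" =~ /$pat/ ? 1 : 0);
}

print "same bytes: $same_bytes, same utf8 flag: $same_flag, matched: @matched\n";
```

If the cache compared only the bytes, the second iteration could wrongly reuse the regex compiled for the first and fail to match; with the flag included in the comparison, only the second pattern matches.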

My general opinion is that we should favour correctness over speed; so
if your fix improves correctness, but disables caching for a few cases,
then that's fine by me.

O Unicef Clearasil!
Gibberish and Drivel!
    -- "Bored of the Rings"
