develooper Front page | perl.perl5.porters | Postings from April 2006

Re: super-linear cache

Thread Next
April 17, 2006 09:43
Re: super-linear cache
Message ID:
Dave Mitchell <> wrote:
:CACHEsayYES appears to have a typo:
:            PL_reg_poscache[0] |= (1<<POSCACHE_SUCCESS) || (1<<POSCACHE_SEEN); \
:	                                                ^^
:Surely that should be | rather than || ?
:However, changing it appears to make no change to passing tests, nor to
:the speed of executing them, so I'm a bit puzzled.

Here's a test case fixed by correcting the typo:

  perl -wle '("a" x 31) =~ /^(a*?)(?!(a{6}|a{5})*$)/; print length($1)'

which should give n=12, since 19 is the greatest number <= 31 which cannot
be expressed as 5x+6y, with (x, y) >= 0, but actually gives n=18.

Note that we get the correct answer if the alternates are reversed.
I expressed the alternates this way rather than 'aaaaaa|aaaaa' to avoid
the trie optimisation, but it appears to make no difference in either
case, so Yves has done a good job of being bug-compatible. :)

I don't really understand the -Dr output I'm getting, but I believe the
typo is causing it to do something like:
- try with n=0, fail
- detect that caching is useful
- try with n=1
  - find a success, try to record it; due to typo we set POSCACHE_SUCCESS
    but not POSCACHE_SEEN and mark the position as a success
  - find a failure, try to record it; we set POSCACHE_SEEN assuming that
    POSCACHE_SUCCESS is still false, and mark the position thinking that
    we are marking it as a failure
  - now the cache has POSCACHE_SUCCESS and POSCACHE_SEEN, so only that one
    failure is poisoning it
.. but I'm by no means certain of the exact route it is following.

In the process of investigating this, I found that it was rather random
whether we ended up with a success cache or a failure cache; I suspect
that means we'd nearly always be better off having both, but I'm not
going to attempt a patch for that right now.

I'm not sure whether the testcase is build-dependent, but we might as well
have it in.

--- regexec.c.old	Mon Apr 17 13:52:53 2006
+++ regexec.c	Mon Apr 17 17:41:43 2006
@@ -2245,7 +2245,7 @@
 #define CACHEsayYES STMT_START { \
     if (st->u.whilem.cache_offset | st->u.whilem.cache_bit) { \
 	if (!(PL_reg_poscache[0] & (1<<POSCACHE_SEEN))) \
-	    PL_reg_poscache[0] |= (1<<POSCACHE_SUCCESS) || (1<<POSCACHE_SEEN); \
+	    PL_reg_poscache[0] |= (1<<POSCACHE_SUCCESS) | (1<<POSCACHE_SEEN); \
         else if (!(PL_reg_poscache[0] & (1<<POSCACHE_SUCCESS))) { \
 	    /* cache records failure, but this is success */ \
 	    DEBUG_r( \
--- t/op/re_tests.old	Thu Apr 13 00:55:36 2006
+++ t/op/re_tests	Mon Apr 17 17:41:32 2006
@@ -961,3 +961,4 @@
 ^(.)\s+.$(?(1))	A B	y	$1	A	# [perl #37688]
 (?:r?)*?r|(.{2,4})	abcde	y	$1	abcd
 (?!)+?|(.{2,4})	abcde	y	$1	abcd
+^(a*?)(?!(a{6}|a{5})*$)	aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa	y	$+[1]	12	# super-linear cache bug may return 18

Thread Next Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About