Re: "perl: utf8.c:1997: Perl_swash_fetch: Assertion `klen <=sizeof(PL_last_swash_key)' failed." [5.12.1]

From: Chip Salzenberg
Date: November 25, 2010 19:48
Subject: Re: "perl: utf8.c:1997: Perl_swash_fetch: Assertion `klen <=sizeof(PL_last_swash_key)' failed." [5.12.1]
Message ID: AANLkTimh5+eTv=cok3K5zMk7mnEkHzXWp2A=yn2OKAD2@mail.gmail.com

On Wed, Nov 24, 2010 at 8:22 PM, karl williamson
<public@khwilliamson.com> wrote:
>
> Chip Salzenberg wrote:
>>
>> I'm experimenting with some text scanning code against a very large corpus,
>> and I've got Perl 5.12.1 dying on this assertion failure:
>>
>>   perl: utf8.c:1997: Perl_swash_fetch: Assertion `klen <= sizeof(PL_last_swash_key)' failed.
>>
>>
>
> Here's a relevant comment:
>    /* Given a UTF-X encoded char 0xAA..0xYY,0xZZ
>     * then the "swatch" is a vec() for all the chars which start
>     * with 0xAA..0xYY
>     * So the key in the hash (klen) is length of encoded char -1
>     */
>
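
To make the quoted comment concrete, here is a minimal sketch in C of the relationship it describes: the swatch key is every byte of the encoded character except the last, so klen is the encoded length minus one, and the assertion checks that this fits in the cache buffer. The names swatch_klen, key_fits and KEY_BUF_SIZE are hypothetical and are not taken from utf8.c; the buffer size of 10 is an assumed stand-in for sizeof(PL_last_swash_key), which this thread does not state.

```c
#include <stddef.h>

/* Hypothetical stand-in for sizeof(PL_last_swash_key).  The real size
 * is defined in the Perl source and is not given in this thread. */
#define KEY_BUF_SIZE 10

/* Key length for an encoded character of char_len bytes: all bytes
 * except the last one, as the utf8.c comment describes. */
static size_t swatch_klen(size_t char_len)
{
    return char_len > 0 ? char_len - 1 : 0;
}

/* Returns 1 if the key fits in the cache buffer, 0 otherwise.  This
 * mirrors the condition tested by the failing assertion. */
static int key_fits(size_t char_len)
{
    return swatch_klen(char_len) <= KEY_BUF_SIZE;
}
```

Under these assumptions, a well-formed 4-byte character gives a 3-byte key and passes easily, while a malformed lead byte that claims a 13-byte sequence would give a 12-byte key and trip the check.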

I've tracked down the string that's causing this problem.  When the
attached string has its UTF8 flag turned on and a regex is applied to
it, Perl dies with the above assertion failure.  Fortunately,
utf8::valid() returns false for it, so I have an easy way of avoiding
this particular crash.  But it's still a Perl bug that should be fixed:
assertion failures should not result from applying a regex, even to an
invalid utf8 string.

PS: The proximate source of the invalid string is Encode::Guess.  But,
well, its name does say "guess" after all.
