develooper Front page | perl.perl5.porters | Postings from July 2008

Re: [perl #57040] pos() function doesn't handle unicode well

From: andreas.koenig.7os6VVqR
Date: July 18, 2008 11:44
Subject: Re: [perl #57040] pos() function doesn't handle unicode well
Message ID: 87zlofkp3t.fsf@k75.linux.bogus

>>>>> On Thu, 17 Jul 2008 15:55:17 -0400, "Eric Brine" <ikegami@adaelis.com> said:

  > On Thu, Jul 17, 2008 at 6:42 AM, via RT Marcela Maslanova
  > <perlbug-followup@perl.org> wrote:
 >> # New Ticket Created by  Marcela Maslanova
 >> # Please include the string:  [perl #57040]
 >> # in the subject line of all future correspondence about this issue.
 >> # <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=57040 >


  > A simpler test that demonstrates the problem violently:
 >> perl -e"$_=qq{\x{2660}\t}; s/\t/       qq{\t}/ge"

 >> perl -e"$_=qq{\x{2660}\t}; s/\t/pos(); qq{\t}/ge"
  > Malformed UTF-8 character (unexpected end of string) in match position
  > at -e line 1.

This bug disappeared in bleadperl at exactly the moment this patch
went in:

Change 33580 by nicholas@nicholas-saigo on 2008/03/26 21:05:20

	The offset for pos is stored as bytes, and converted to (Unicode)
	character position when read, if needed. The code for setting pos
	inside subst was incorrectly converting to character position before
	storing the value. This code appears to have been buggy since it was
	added in 2000 in change 7562.
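
To make the byte/character distinction in that change description
concrete, here is a minimal sketch. It is not taken from the patch and
does not exercise the s///e code path that was fixed. It only shows, for
the string from the test case above, how far apart the two offsets are.
The variable names are mine, and computing the byte offset via Encode is
just an illustration of the unit that is stored internally, not how the
core does it.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+2660 (BLACK SPADE SUIT) is one character but three bytes in UTF-8.
my $str = "\x{2660}\t";

# After a successful m//g in scalar context, pos() reports the offset
# just past the match, counted in characters.
$str =~ /\t/g or die "no match";
my $char_pos = pos($str);

# The same offset counted in bytes of the UTF-8 encoding, which is the
# unit the change description says is stored internally.
my $prefix   = substr($str, 0, $char_pos);
my $byte_pos = length(encode('UTF-8', $prefix));

print "character offset: $char_pos\n";
print "byte offset:      $byte_pos\n";
```

This should print a character offset of 2 and a byte offset of 4. Storing
an already converted character offset where a byte offset is expected
(or the other way round) mixes up those two numbers, which fits the
"Malformed UTF-8 character" warning in the report.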

I think the ticket can be closed.

Thanks,
-- 
andreas
