
Re: [perl #41530] RFC: internal string upgrade latin-1 => utf8 after s/// results in illegal utf8

From: Dave Mitchell
Date: April 12, 2010 09:34
Subject: Re: [perl #41530] RFC: internal string upgrade latin-1 => utf8 after s/// results in illegal utf8
Message ID: 20100412163351.GK3792@iabyn.com

On Sun, Apr 11, 2010 at 09:07:08PM -0700, Karl Williamson via RT wrote:
> I'm preparing a patch for this bug, and I'm uncertain about the best way
> to do it.
> 
> First, the bug is caused by the code not realizing that when you have
> two strings that independently may be in utf8 or not, that there are 4
> cases to take care of.  I mention this because the error of only taking
> care of 3 of the cases occurs in other places in the code as well.
> 
> The code does not consider the possibility that the replacement string
> could be in utf8 when the source/target string isn't.  Thus 
> 
> $latin1 =~ s/latin1/utf8/;
> 
> fails.  The solution is to upgrade the variable to utf8.  My dilemma is
> whether to always do the upgrade when the replacement string is in utf8,
> or to do it only if the match succeeds.  The difference can lead to
> different results later, as if there is no upgrade, the scalar's
> characters in the 128-255 range will have different semantics than if
> the upgrade takes place.
> 
> I'm leaning towards doing the upgrade, as I think we can infer from the
> replacement string being in utf8 that the programmer intended that the
> string have Unicode semantics, even if it isn't in utf8.  Therefore,
> it's better to do the upgrade to force those semantics.
> 
> Is there a contrary opinion?

Yes!

I think it could just as validly be argued that the programmer only
intended the utf8 upgrade for the cases that matched. Which makes us even.
Then I think the tie-breaker is that we should try to be as conservative
as possible and only upgrade when we need to.
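
For readers following along, here is a minimal sketch (not part of the
original exchange) of why the choice matters. It assumes a perl run
without "use feature 'unicode_strings'" and without the /u regex
modifier; the variable names are made up for illustration. It shows that
two strings holding the same characters can behave differently depending
on the internal encoding, and then prints whether a failed s/// with a
utf8 replacement upgraded its target. That last result depends on which
policy your perl implements, so the sketch reports it rather than
assuming it.

```perl
use strict;
use warnings;

# Two copies of the same four characters; only the internal
# encoding differs after the explicit upgrade.
my $plain    = "caf\xE9";
my $upgraded = "caf\xE9";
utf8::upgrade($upgraded);

my $same_chars = ($plain eq $upgraded) ? 1 : 0;

# Without unicode_strings or /u, whether \w matches a character in
# the 128-255 range depends on the string's internal encoding.
my $plain_w    = ($plain    =~ /\w$/) ? 1 : 0;
my $upgraded_w = ($upgraded =~ /\w$/) ? 1 : 0;

printf "same characters: %d\n", $same_chars;
printf "word-char match, plain: %d, upgraded: %d\n", $plain_w, $upgraded_w;

# A substitution that does not match, with a replacement that is
# necessarily utf8 (code point above 255). Whether the target gets
# upgraded anyway is exactly the question debated in this thread.
my $target      = "caf\xE9";
my $replacement = "\x{263A}";
my $matched     = ($target =~ s/no-such-text/$replacement/) ? 1 : 0;

printf "matched: %d, target is utf8 afterwards: %d\n",
    $matched, utf8::is_utf8($target) ? 1 : 0;
```

Under the conservative policy argued for above, the last line would
report that the target is still not utf8, so a later \w test on it would
keep behaving like the "plain" case.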

-- 
In economics, the exam questions are the same every year.
They just change the answers.
