perl.perl5.porters | Postings from April 2010

Re: [perl #41530] RFC: internal string upgrade latin-1 => utf8 after s/// results in illegal utf8

Dave Mitchell
April 12, 2010 09:34
On Sun, Apr 11, 2010 at 09:07:08PM -0700, Karl Williamson via RT wrote:
> I'm preparing a patch for this bug, and I'm uncertain about the best way
> to do it.
> First, the bug is caused by the code not realizing that when you have
> two strings that independently may be in utf8 or not, there are 4
> cases to take care of.  I mention this because the error of only taking
> care of 3 of the cases occurs in other places in the code as well.
> The code does not consider the possibility that the replacement string
> could be in utf8 when the source/target string isn't.  Thus 
> $latin1 =~ s/latin1/utf8/;
> fails.  The solution is to upgrade the variable to utf8.  My dilemma is
> whether to always do the upgrade when the replacement string is in utf8,
> or to do it only if the match succeeds.  The difference can lead to
> different results later, as if there is no upgrade, the scalar's
> characters in the 128-255 range will have different semantics than if
> the upgrade takes place.
> I'm leaning towards doing the upgrade, as I think we can infer from the
> replacement string being in utf8 that the programmer intended that the
> string have Unicode semantics, even if it isn't in utf8.  Therefore,
> it's better to do the upgrade to force those semantics.
> Is there a contrary opinion?


I think it could just as validly be argued that the programmer only
intended the utf8 upgrade for the cases that matched. Which makes us even.
Then I think the tie-breaker is that we should try to be as conservative
as possible and only upgrade when we need to.
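
To make the difference between the two options concrete, here is a minimal
sketch. It does not model the real substitution code inside perl; the two
helper subs (subst_upgrade_on_match and subst_upgrade_always) are hypothetical
names invented for illustration, and the sample strings are assumptions. It
also assumes no "use feature 'unicode_strings'" and no "use locale" in scope,
so that a non-utf8 string gets native semantics for characters 128-255.

```perl
use strict;
use warnings;

# Hypothetical helper: upgrade the target only if the match succeeds
# (the conservative option).
sub subst_upgrade_on_match {
    my ($target, $pattern, $replacement) = @_;
    if ($target =~ $pattern) {
        utf8::upgrade($target) if utf8::is_utf8($replacement);
        $target =~ s/$pattern/$replacement/;
    }
    return $target;
}

# Hypothetical helper: upgrade the target whenever the replacement is
# utf8, whether or not the match succeeds.
sub subst_upgrade_always {
    my ($target, $pattern, $replacement) = @_;
    utf8::upgrade($target) if utf8::is_utf8($replacement);
    $target =~ s/$pattern/$replacement/;
    return $target;
}

my $latin1      = "caf\xE9";    # not utf8; last character is in 128-255
my $replacement = "\x{263A}";   # code point above 255, so stored as utf8

# Pattern that does not match: the two policies leave different flags.
my $no_match_cons   = subst_upgrade_on_match($latin1, qr/zzz/, $replacement);
my $no_match_always = subst_upgrade_always($latin1, qr/zzz/, $replacement);

# Pattern that matches: both policies have to upgrade.
my $match_cons   = subst_upgrade_on_match($latin1, qr/caf/, $replacement);
my $match_always = subst_upgrade_always($latin1, qr/caf/, $replacement);

for my $case (
    [ 'no match, upgrade on match', $no_match_cons ],
    [ 'no match, upgrade always',   $no_match_always ],
    [ 'match, upgrade on match',    $match_cons ],
    [ 'match, upgrade always',      $match_always ],
) {
    my ($label, $str) = @$case;
    printf "%-28s utf8=%d  last char is \\w: %d\n",
        $label,
        utf8::is_utf8($str) ? 1 : 0,
        $str =~ /\w\z/      ? 1 : 0;
}
```

Under those assumptions, the only case where the two policies should disagree
is the one where the match fails: the string that was left alone keeps native
semantics for \xE9, while the upgraded one gets Unicode semantics. That is the
"different results later" the original message worries about.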

In economics, the exam questions are the same every year.
They just change the answers.
