develooper Front page | perl.perl5.porters | Postings from June 2003

Re: gb2312.* vs euc-cn

Dan Kogai
June 29, 2003 06:30
On Sunday, June 29, 2003, at 07:09  PM, Jarkko Hietaniemi wrote:
> Because of the new "randomised hashes" the Aliases.t tests randomly
> fails once in a while (just try setting PERL_HASH_SEED to an integer
> value, in many platforms '2' seems to work to bring out the failure).
> What fails is that when looking for 'gb2312-raw' and expecting
> 'gb2312-raw', 'euc-cn' is returned instead.
> Now, however, looking at and Supported.pod, I'm confused: it
> looks as if the 'euc-cn' would be the _right_ answer (so the Aliases.t
> needs to be fixed to expect 'euc-cn' also from gb2312-raw):
>         # This fixes gb2312 vs. euc-cn confusion, practically
>         define_alias( qr/\bGB[-_ ]?2312(?:\D.*$|$)/i => '"euc-cn"' );

Your confusion is justified.  From the alias point of view, gb2312-raw
resolves to euc-cn, but gb2312-raw is not an alias, so find_encoding()
was not supposed to bother resolving aliases for it.  But I agree this
alias is not very good, so please apply the patch below and see if it
fixes the problem.  I have checked that it at least does not break the
existing perl 5.8.0.

Dan the Encode Maintainer

RCS file: lib/Encode/,v
retrieving revision 1.36
diff -u -r1.36 lib/Encode/
--- lib/Encode/ 2003/05/19 04:56:03     1.36
+++ lib/Encode/ 2003/06/29 13:25:19
@@ -204,7 +204,7 @@
         # CP936 doesn't have vendor-addon for GBK, so they're identical.
         define_alias( qr/^gbk$/i => '"cp936"');
         # This fixes gb2312 vs. euc-cn confusion, practically
-       define_alias( qr/\bGB[-_ ]?2312(?:\D.*$|$)/i => '"euc-cn"' );
+       define_alias( qr/\bGB[-_ ]?2312(?!-?raw)/i => '"euc-cn"' );
         # for Encode::JP
         define_alias( qr/\bjis$/i            => '"7bit-jis"' );
         define_alias( qr/\beuc.*jp$/i        => '"euc-jp"' );
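
To see what the patch changes, the two patterns can be compared in isolation. This is only a rough sketch: it copies the old and new regexes verbatim from the diff and runs them against a handful of sample names chosen for illustration (the names other than gb2312 and gb2312-raw are assumptions, not taken from the test suite). It does not call define_alias() or find_encoding(), so it says nothing about lookup order inside Encode itself.

```perl
use strict;
use warnings;

# The two patterns from the patch, copied verbatim.
my $old = qr/\bGB[-_ ]?2312(?:\D.*$|$)/i;   # before: anything after 2312 still matches
my $new = qr/\bGB[-_ ]?2312(?!-?raw)/i;     # after: refuses names continuing with "raw" or "-raw"

# Sample names for illustration only (hypothetical inputs).
my @names = ('gb2312', 'GB_2312-80', 'gb2312-raw', 'gb2312raw', 'gb12345-raw');

my %result;
for my $name (@names) {
    $result{$name} = {
        old => ($name =~ $old ? 1 : 0),
        new => ($name =~ $new ? 1 : 0),
    };
    printf "%-12s old=%d new=%d\n",
        $name, $result{$name}{old}, $result{$name}{new};
}
```

With the old pattern, gb2312-raw matches the alias rule and would be mapped to euc-cn; with the new pattern the negative lookahead rejects it, while plain gb2312 and GB_2312-80 still match.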
