[ID 20001230.003] UTF-8 tr still hurts
From: jhi
Date: December 30, 2000 12:57
Subject: [ID 20001230.003] UTF-8 tr still hurts
Message ID: 200012302057.WAA31168@alpha.hut.fi
This is a bug report for perl from jhi@kosh.hut.fi,
generated with the help of perlbug 1.33 running under perl v5.7.0.
-----------------------------------------------------------------
[Please enter your report here]
Just perlbugging the proposed new tr tests: though Inaba's patch
(#8267) makes the situation much better, some tr bugs still remain.
==== //depot/perl/t/op/tr.t#10 - /u/vieraat/vieraat/jhi/pp4/perl/t/op/tr.t ====
Index: perl/t/op/tr.t
--- perl/t/op/tr.t.~1~ Sat Dec 30 20:23:18 2000
+++ perl/t/op/tr.t Sat Dec 30 20:23:18 2000
@@ -5,7 +5,7 @@
@INC = '../lib';
}
-print "1..29\n";
+print "1..46\n";
$_ = "abcdefghijklmnopqrstuvwxyz";
@@ -181,3 +181,95 @@
print (($@ =~ m|^Can't modify constant item in transliteration \(tr///\)|)
? '' : 'not ', "ok 29\n");
+# v300 (0x12c) is UTF-8-encoded as 196 172 (0xc4 0xac)
+# v400 (0x190) is UTF-8-encoded as 198 144 (0xc6 0x90)
+
+# Transliterate a byte to a byte, all four ways.
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4/\xc5/;
+print "not " unless $a eq v300.197.172.300.197.172;
+print "ok 30\n";
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4/\x{c5}/;
+print "not " unless $a eq v300.197.172.300.197.172;
+print "ok 31\n";
+
+($a = v300.196.172.300.196.172) =~ tr/\x{c4}/\xc5/;
+print "not " unless $a eq v300.197.172.300.197.172;
+print "ok 32\n";
+
+($a = v300.196.172.300.196.172) =~ tr/\x{c4}/\x{c5}/;
+print "not " unless $a eq v300.197.172.300.197.172;
+print "ok 33\n";
+
+# Transliterate a byte to a wide character.
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4/\x{12d}/;
+print "not " unless $a eq v300.301.172.300.301.172;
+print "ok 34\n";
+
+# Transliterate a wide character to a byte.
+
+($a = v300.196.172.300.196.172) =~ tr/\x{12c}/\xc3/;
+print "not " unless $a eq v195.196.172.195.196.172;
+print "ok 35\n";
+
+# Transliterate a wide character to a wide character.
+
+($a = v300.196.172.300.196.172) =~ tr/\x{12c}/\x{12d}/;
+print "not " unless $a eq v301.196.172.301.196.172;
+print "ok 36\n";
+
+# Transliterate both ways.
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4\x{12c}/\x{12d}\xc3/;
+print "not " unless $a eq v195.301.172.195.301.172;
+print "ok 37\n";
+
+# Transliterate all (four) ways.
+
+($a = v300.196.172.300.196.172.400.198.144) =~
+ tr/\xac\xc4\x{12c}\x{190}/\xad\x{12d}\xc5\x{191}/;
+print "not " unless $a eq v197.301.173.197.301.173.401.198.144;
+print "ok 38\n";
+
+# Transliterate and count.
+
+print "not "
+ unless (($a = v300.196.172.300.196.172) =~ tr/\xc4/\xc5/) == 2;
+print "ok 39\n";
+
+print "not "
+ unless (($a = v300.196.172.300.196.172) =~ tr/\x{12c}/\x{12d}/) == 2;
+print "ok 40\n";
+
+# Transliterate with complement.
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4/\x{12d}/c;
+print "not " unless $a eq v301.196.301.301.196.301;
+print "ok 41\n";
+
+($a = v300.196.172.300.196.172) =~ tr/\x{12c}/\xc5/c;
+print "not " unless $a eq v300.197.197.300.197.197;
+print "ok 42\n";
+
+# Transliterate with deletion.
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4//d;
+print "not " unless $a eq v300.172.300.172;
+print "ok 43\n";
+
+($a = v300.196.172.300.196.172) =~ tr/\x{12c}//d;
+print "not " unless $a eq v196.172.196.172;
+print "ok 44\n";
+
+# Transliterate with squeeze.
+
+($a = v196.196.172.300.300.196.172) =~ tr/\xc4/\xc5/s;
+print "not " unless $a eq v197.172.300.300.197.172;
+print "ok 45\n";
+
+($a = v196.172.300.300.196.172.172) =~ tr/\x{12c}/\x{12d}/s;
+print "not " unless $a eq v196.172.301.196.172.172;
+print "ok 46\n";
+
End of Patch.
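
For anyone reading the tests above without the encoding details in mind, here is a minimal standalone sketch (not part of the patch) of the distinction tests 30-33 rely on. It assumes a perl where utf8::encode() is available (5.8 or later) and where tr handles wide characters correctly; the variable names are only illustrative.

```perl
use strict;
use warnings;

# chr(300) is a single character (0x12c); its UTF-8 encoding is the
# two bytes 0xc4 0xac, as noted in the comments at the top of the patch.
my $wide  = chr(300);
my $bytes = $wide;
utf8::encode($bytes);            # $bytes is now the 2-byte string "\xc4\xac"
printf "%vd\n", $bytes;          # prints 196.172

# Same data as the test strings: character 300, then characters 196 and 172.
my $s = join '', map { chr($_) } (300, 196, 172, 300, 196, 172);

# tr on character 0xc4 should change only the real character 196,
# never the lead byte inside the internal encoding of character 300.
(my $t = $s) =~ tr/\xc4/\xc5/;
printf "%vd\n", $t;              # prints 300.197.172.300.197.172 when tr is correct
```

The point of mixing 300 with 196 and 172 in one string is that a tr implementation working on raw bytes would see 0xc4 in both places, so a wrong result shows up immediately in the %vd output.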
[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
category=core
severity=medium
---
Site configuration information for perl v5.7.0:
Configured by jhi at Sat Dec 30 22:33:44 EET 2000.
Summary of my perl5 (revision 5.0 version 7 subversion 0) configuration:
Platform:
osname=dec_osf, osvers=4.0f, archname=alpha-dec_osf
uname='osf1 kosh.hut.fi v4.0 1229 alpha '
config_args='-des -Dusedevel -Doptimize=-g -Dccflags=-DDEBUGGING'
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=define use64bitall=define uselongdouble=undef
Compiler:
cc='cc', ccflags ='-DDEBUGGING -std -DDEBUGGING -DLANGUAGE_C',
optimize='-g',
cppflags='-DDEBUGGING -std -DDEBUGGING -DLANGUAGE_C'
ccversion='V5.9-010', gccversion='', gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=8
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, usemymalloc=y, prototype=define
Linker and Libraries:
ld='ld', ldflags =''
libpth=/usr/shlib /usr/ccs/lib /usr/lib/cmplrs/cc /usr/lib /var/shlib
libs=-lgdbm -ldbm -ldb -lm -liconv -lutil
perllibs=-lm -liconv -lutil
libc=/usr/shlib/libc.so, so=so, useshrplib=true, libperl=libperl.so
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' -Wl,-rpath,/usr/local/lib/perl5/5.7.0/alpha-dec_osf/CORE'
cccdlflags=' ', lddlflags='-shared -expect_unresolved "*" -g -msym -std'
Locally applied patches:
DEVEL8268
---
@INC for perl v5.7.0:
lib
/u/vieraat/vieraat/jhi/Perl/lib
/usr/local/lib/perl5/5.7.0/alpha-dec_osf
/usr/local/lib/perl5/5.7.0
/usr/local/lib/perl5/site_perl/5.7.0/alpha-dec_osf
/usr/local/lib/perl5/site_perl/5.7.0
/usr/local/lib/perl5/site_perl
.
---
Environment for perl v5.7.0:
HOME=/u/vieraat/vieraat/jhi
LANG=C
LANGUAGE (unset)
LC_ALL=fi_FI.ISO8859-1
LC_CTYPE=fi_FI.ISO8859-1
LD_LIBRARY_PATH=/u/vieraat/vieraat/jhi/pp4/perl
LOGDIR (unset)
PATH=/u/vieraat/vieraat/jhi/Perl/bin:/u/vieraat/vieraat/jhi/.s:/u/vieraat/vieraat/jhi/.b/OSF1:/c/bin:/p/bin:/p/adm/bin:/usr/bin:/usr/sbin:/sbin:/bin:/usr/ccs/bin:/usr/lib:/etc:/lib:/p/X6/bin:/p/X5/bin:/usr/bin/X11:/usr/lbin:/usr/sbin/acct:/usr/tcb/bin:/tcb/bin:/usr/field:/u/vieraat/vieraat/jhi
PERLLIB=/u/vieraat/vieraat/jhi/Perl/lib
PERL_BADLANG (unset)
SHELL=/bin/zsh