
[ID 20001230.003] UTF-8 tr still hurts

From: jhi
Date: December 30, 2000 12:57
Subject: [ID 20001230.003] UTF-8 tr still hurts
Message ID: 200012302057.WAA31168@alpha.hut.fi

This is a bug report for perl from jhi@kosh.hut.fi,
generated with the help of perlbug 1.33 running under perl v5.7.0.


-----------------------------------------------------------------
[Please enter your report here]

Just perlbugging the proposed new tr tests: although Inaba's patch
(#8267) makes the situation much better, some tr bugs still remain.

==== //depot/perl/t/op/tr.t#10 - /u/vieraat/vieraat/jhi/pp4/perl/t/op/tr.t ====
Index: perl/t/op/tr.t
--- perl/t/op/tr.t.~1~	Sat Dec 30 20:23:18 2000
+++ perl/t/op/tr.t	Sat Dec 30 20:23:18 2000
@@ -5,7 +5,7 @@
     @INC = '../lib';
 }
 
-print "1..29\n";
+print "1..46\n";
 
 $_ = "abcdefghijklmnopqrstuvwxyz";
 
@@ -181,3 +181,95 @@
 print (($@ =~ m|^Can't modify constant item in transliteration \(tr///\)|)
        ? '' : 'not ', "ok 29\n");
 
+# v300 (0x12c) is UTF-8-encoded as 196 172 (0xc4 0xac)
+# v400 (0x190) is UTF-8-encoded as 198 144 (0xc6 0x90)
+
+# Transliterate a byte to a byte, all four ways.
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4/\xc5/;
+print "not " unless $a eq v300.197.172.300.197.172;
+print "ok 30\n";
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4/\x{c5}/;
+print "not " unless $a eq v300.197.172.300.197.172;
+print "ok 31\n";
+
+($a = v300.196.172.300.196.172) =~ tr/\x{c4}/\xc5/;
+print "not " unless $a eq v300.197.172.300.197.172;
+print "ok 32\n";
+
+($a = v300.196.172.300.196.172) =~ tr/\x{c4}/\x{c5}/;
+print "not " unless $a eq v300.197.172.300.197.172;
+print "ok 33\n";
+
+# Transliterate a byte to a wide character.
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4/\x{12d}/;
+print "not " unless $a eq v300.301.172.300.301.172;
+print "ok 34\n";
+
+# Transliterate a wide character to a byte.
+
+($a = v300.196.172.300.196.172) =~ tr/\x{12c}/\xc3/;
+print "not " unless $a eq v195.196.172.195.196.172;
+print "ok 35\n";
+
+# Transliterate a wide character to a wide character.
+
+($a = v300.196.172.300.196.172) =~ tr/\x{12c}/\x{12d}/;
+print "not " unless $a eq v301.196.172.301.196.172;
+print "ok 36\n";
+
+# Transliterate both ways.
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4\x{12c}/\x{12d}\xc3/;
+print "not " unless $a eq v195.301.172.195.301.172;
+print "ok 37\n";
+
+# Transliterate all (four) ways.
+
+($a = v300.196.172.300.196.172.400.198.144) =~
+	tr/\xac\xc4\x{12c}\x{190}/\xad\x{12d}\xc5\x{191}/;
+print "not " unless $a eq v197.301.173.197.301.173.401.198.144;
+print "ok 38\n";
+
+# Transliterate and count.
+
+print "not "
+    unless (($a = v300.196.172.300.196.172) =~ tr/\xc4/\xc5/)       == 2;
+print "ok 39\n";
+
+print "not "
+    unless (($a = v300.196.172.300.196.172) =~ tr/\x{12c}/\x{12d}/) == 2;
+print "ok 40\n";
+
+# Transliterate with complement.
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4/\x{12d}/c;
+print "not " unless $a eq v301.196.301.301.196.301;
+print "ok 41\n";
+
+($a = v300.196.172.300.196.172) =~ tr/\x{12c}/\xc5/c;
+print "not " unless $a eq v300.197.197.300.197.197;
+print "ok 42\n";
+
+# Transliterate with deletion.
+
+($a = v300.196.172.300.196.172) =~ tr/\xc4//d;
+print "not " unless $a eq v300.172.300.172;
+print "ok 43\n";
+
+($a = v300.196.172.300.196.172) =~ tr/\x{12c}//d;
+print "not " unless $a eq v196.172.196.172;
+print "ok 44\n";
+
+# Transliterate with squeeze.
+
+($a = v196.196.172.300.300.196.172) =~ tr/\xc4/\xc5/s;
+print "not " unless $a eq v197.172.300.300.197.172;
+print "ok 45\n";
+
+($a = v196.172.300.300.196.172.172) =~ tr/\x{12c}/\x{12d}/s;
+print "not " unless $a eq v196.172.301.196.172.172;
+print "ok 46\n";
+
End of Patch.
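
The byte values quoted in the test comments (v300 as 196 172, v400 as
198 144) can be double-checked with a short script. This is only a
sketch, and it assumes a perl of 5.8 or later where utf8::encode() is
available, not the 5.7.0 under test here. The helper name utf8_bytes is
made up for illustration and is not part of t/op/tr.t.

```perl
use strict;
use warnings;

# Hypothetical helper: return the UTF-8 octets of one code point
# as a list of decimal byte values.
sub utf8_bytes {
    my ($cp) = @_;
    my $s = chr($cp);
    utf8::encode($s);    # in place: characters -> UTF-8 octets
    return map { ord } split //, $s;
}

print join(" ", utf8_bytes(300)), "\n";    # 196 172
print join(" ", utf8_bytes(400)), "\n";    # 198 144

# With character semantics, tr/\xc4/\xc5/ should touch only the
# character 0xc4, never the 0xc4 octet inside the encoding of v300.
(my $s = "\x{12c}\xc4\xac") =~ tr/\xc4/\xc5/;
print join(".", map { ord } split //, $s), "\n";    # 300.197.172
```

The last three lines mirror what tests 30 to 33 expect: the wide
character stays intact and only the standalone 196 becomes 197.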


[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=medium
---
Site configuration information for perl v5.7.0:

Configured by jhi at Sat Dec 30 22:33:44 EET 2000.

Summary of my perl5 (revision 5.0 version 7 subversion 0) configuration:
  Platform:
    osname=dec_osf, osvers=4.0f, archname=alpha-dec_osf
    uname='osf1 kosh.hut.fi v4.0 1229 alpha '
    config_args='-des -Dusedevel -Doptimize=-g -Dccflags=-DDEBUGGING'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=define uselongdouble=undef
  Compiler:
    cc='cc', ccflags ='-DDEBUGGING -std -DDEBUGGING -DLANGUAGE_C',
    optimize='-g',
    cppflags='-DDEBUGGING -std -DDEBUGGING -DLANGUAGE_C'
    ccversion='V5.9-010', gccversion='', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=8
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, usemymalloc=y, prototype=define
  Linker and Libraries:
    ld='ld', ldflags =''
    libpth=/usr/shlib /usr/ccs/lib /usr/lib/cmplrs/cc /usr/lib /var/shlib
    libs=-lgdbm -ldbm -ldb -lm -liconv -lutil
    perllibs=-lm -liconv -lutil
    libc=/usr/shlib/libc.so, so=so, useshrplib=true, libperl=libperl.so
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='  -Wl,-rpath,/usr/local/lib/perl5/5.7.0/alpha-dec_osf/CORE'
    cccdlflags=' ', lddlflags='-shared -expect_unresolved "*" -g -msym -std'

Locally applied patches:
    DEVEL8268

---
@INC for perl v5.7.0:
    lib
    /u/vieraat/vieraat/jhi/Perl/lib
    /usr/local/lib/perl5/5.7.0/alpha-dec_osf
    /usr/local/lib/perl5/5.7.0
    /usr/local/lib/perl5/site_perl/5.7.0/alpha-dec_osf
    /usr/local/lib/perl5/site_perl/5.7.0
    /usr/local/lib/perl5/site_perl
    .

---
Environment for perl v5.7.0:
    HOME=/u/vieraat/vieraat/jhi
    LANG=C
    LANGUAGE (unset)
    LC_ALL=fi_FI.ISO8859-1
    LC_CTYPE=fi_FI.ISO8859-1
    LD_LIBRARY_PATH=/u/vieraat/vieraat/jhi/pp4/perl
    LOGDIR (unset)
    PATH=/u/vieraat/vieraat/jhi/Perl/bin:/u/vieraat/vieraat/jhi/.s:/u/vieraat/vieraat/jhi/.b/OSF1:/c/bin:/p/bin:/p/adm/bin:/usr/bin:/usr/sbin:/sbin:/bin:/usr/ccs/bin:/usr/lib:/etc:/lib:/p/X6/bin:/p/X5/bin:/usr/bin/X11:/usr/lbin:/usr/sbin/acct:/usr/tcb/bin:/tcb/bin:/usr/field:/u/vieraat/vieraat/jhi
    PERLLIB=/u/vieraat/vieraat/jhi/Perl/lib
    PERL_BADLANG (unset)
    SHELL=/bin/zsh



