Re: [ID 20001114.001] use utf8;use charnames; is incorrect for \x{80}-\x{FF}

From: Nick Ing-Simmons
Date: November 14, 2000 00:33
Subject: Re: [ID 20001114.001] use utf8;use charnames; is incorrect for \x{80}-\x{FF}
Message ID: 200011140832.IAA07470@mikado.tiuk.ti.com
Andrew McNaughton <andrew@tki.org.nz> writes:
>This is a bug report for perl from andrew@tki.org.nz,
>generated with the help of perlbug 1.26 running under perl 5.006.
>
>
>-----------------------------------------------------------------
>[Please enter your report here]
>
>The following fails:
>
>use utf8;
>use charnames ':full';
>$text .= "\N{LATIN CAPITAL LETTER A WITH DIAERESIS}";
>
>
>This fails because of the final line of &charnames::charnames.  It returns an
>8 bit value.

It is an 8-bit value - that is, the Unicode code point is < 256.

The problem is not with charnames as such, but rather
the fact that perl's internal optimization of holding chars in the range
0..255 as single bytes is visible, and in particular there is as yet no way
to tell perl that you want utf8 for _output_ ("use utf8" affects literal
strings on _input_ and has one or two other "odd" effects).
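
To make the distinction concrete, here is a minimal sketch. It assumes a later perl (5.8 or newer) with the core Encode module, which is not available in the 5.6.0 of the report, so it illustrates the idea rather than offering a fix for that release. The variable names are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);   # assumption: perl 5.8+ with the core Encode module

# U+00C4 LATIN CAPITAL LETTER A WITH DIAERESIS has a code point below 256,
# so perl is free to hold it internally as a single byte.
my $char = chr(0xC4);

# Whatever the internal storage, perl sees exactly one character.
my $chars = length $char;

# Asking explicitly for UTF-8 produces the octets for output.
my $octets = encode('UTF-8', $char);
my $hex    = uc unpack('H*', $octets);

print "characters: $chars\n";
print "UTF-8 octets: $hex\n";
```

The character itself is the same in both cases; only the explicit encoding step decides which bytes are written out.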


>
>
>
>I've fixed this on my own installation like so:
>(hand written so not complete patch format)
>
>-  return chr $ord;
>+  else {
>+    use utf8;
>+    return eval sprintf('"\x{%x}"',$ord);
>+  }
>
>
>It's unfortunate that the eval is necessary.  One would hope the following
>would work:
>
>else {
>   use utf8;
>   return chr($ord);
>}
>
>It seems that the chr function has a similar bug for characters 0x80 - 0xFF
>
>
>
>Andrew McNaughton
>andrew@tki.org.nz
>
>
>
>
>
>
>
>
>
>[Please do not change anything below this line]
>-----------------------------------------------------------------
>
>---
>This perlbug was built using Perl 5.00503 - $Date: 1999/05/05 19:42:40 $
>It is being executed now by  Perl 5.006 - Wed Jun 28 23:41:06 NZST 2000.
>
>Site configuration information for perl 5.006:
>
>Configured by andrew at Wed Jun 28 23:41:06 NZST 2000.
>
>Summary of my perl5 (revision 5.0 version 6 subversion 0) configuration:
>  Platform:
>    osname=freebsd, osvers=3.4-stable, archname=i386-freebsd
>    uname='freebsd sub.internal.cwa.co.nz 3.4-stable freebsd 3.4-stable #1: tue may 16 20:48:22 nzst 2000 andrew@sub.internal.cwa.co.nz:usrsrcsyscompilesub-2000012101 i386 '
>    config_args='-de -Uuselargefiles'
>    hint=recommended, useposix=true, d_sigaction=define
>    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
>    useperlio=undef d_sfio=undef uselargefiles=undef 
>    use64bitint=undef use64bitall=undef uselongdouble=undef usesocks=undef
>  Compiler:
>    cc='cc', optimize='-O', gccversion=2.7.2.3
>    cppflags='-I/usr/local/include'
>    ccflags ='-I/usr/local/include'
>    stdchar='char', d_stdstdio=undef, usevfork=true
>    intsize=4, longsize=4, ptrsize=4, doublesize=8
>    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
>    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
>    alignbytes=4, usemymalloc=n, prototype=define
>  Linker and Libraries:
>    ld='cc', ldflags ='-Wl,-E  -L/usr/local/lib'
>    libpth=/usr/lib /usr/local/lib
>    libs=-lm -lc -lcrypt
>    libc=/usr/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
>  Dynamic Linking:
>    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
>    cccdlflags='-DPIC -fpic', lddlflags='-shared  -L/usr/local/lib'
>
>Locally applied patches:
>    
>
>---
>@INC for perl 5.006:
>    /usr/local/lib/perl5/5.6.0/i386-freebsd
>    /usr/local/lib/perl5/5.6.0
>    /usr/local/lib/perl5/site_perl/5.6.0/i386-freebsd
>    /usr/local/lib/perl5/site_perl/5.6.0
>    /usr/local/lib/perl5/site_perl/5.005/i386-freebsd
>    /usr/local/lib/perl5/site_perl/5.005
>    /usr/local/lib/perl5/site_perl
>    .
>
>---
>Environment for perl 5.006:
>    HOME=/root
>    LANG (unset)
>    LANGUAGE (unset)
>    LD_LIBRARY_PATH (unset)
>    LOGDIR (unset)
>    PATH=/home/andrew/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/bin:/usr/X11R6/bin:/usr/local/snns/bin
>    PERL_BADLANG (unset)
>    SHELL=/usr/local/bin/bash
-- 
Nick Ing-Simmons <nik@tiuk.ti.com>
Via, but not speaking for: Texas Instruments Ltd.

