[ID 20000316.001] UTF8-ness seems strange

From: Graham Barr
Date: March 16, 2000 02:53
Subject: [ID 20000316.001] UTF8-ness seems strange
Message ID: 200003161051.KAA15952@chipper.localdomain

This is a bug report for perl from gbarr@chipper.localdomain,
generated with the help of perlbug 1.27 running under perl v5.6.0.


-----------------------------------------------------------------
[Please enter your report here]

The setting of the UTF8 bit still seems strange.

Surely if a regexp is executed in a utf8 block then $1 etc
should be tagged as UTF8. Likewise, if it is executed in a block
which explicitly has use bytes, they should not be.

Also, if split is executed in a utf8 scope, should its results
not be utf8 strings?

Graham.

use Devel::Peek;

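$\ = "\n";                 # append a newline to every print
$x = v1.999;               # v-string: chr(1) . chr(999)
print length $x;           # length in characters
{
 use bytes;
 print length $x;          # length in bytes
 ($y) = ($x =~ /(.*)/s);   # capture made under use bytes
}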

print unpack("H*",$x);

print "Should be UTF8 - OK";
print Dump($x);

print "Should not be UTF8 - OK";
print Dump($y);

{
 use utf8;
 print "Should be UTF8 - BAD";
 print Dump(($x =~ /(.)/gs)[1]);
}
{
 use utf8;
 print "Should be UTF8 - BAD";
 print Dump( (split(//,$y))[1]);
}

[Please do not change anything below this line]
-----------------------------------------------------------------

---
Site configuration information for perl v5.6.0:

Configured by gbarr at Thu Mar 16 10:24:54 GMT 2000.

Summary of my perl5 (revision 5.0 version 6 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.2.13, archname=i686-linux
    uname='linux chipper 2.2.13 #1 mon nov 8 15:37:25 cet 1999 i686 unknown '
    config_args='-der -Doptimize=-g'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define 
    use64bitint=undef use64bitall=undef uselongdouble=undef usesocks=undef
  Compiler:
    cc='cc', optimize='-g', gccversion=egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
    cppflags='-DDEBUGGING -fno-strict-aliasing -I/usr/local/include'
    ccflags ='-DDEBUGGING -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    stdchar='char', d_stdstdio=define, usevfork=false
    intsize=4, longsize=4, ptrsize=4, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldbm -ldb -ldl -lm -lc -lposix -lcrypt
    libc=, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    v5.6.0-RC2

---
@INC for perl v5.6.0:
    lib
    /home/value/perl
    /usr/local/lib/perl5/5.6.0/i686-linux
    /usr/local/lib/perl5/5.6.0
    /usr/local/lib/perl5/site_perl/5.6.0/i686-linux
    /usr/local/lib/perl5/site_perl/5.6.0
    /usr/local/lib/perl5/site_perl
    .

---
Environment for perl v5.6.0:
    HOME=/home/gbarr
    LANG=POSIX
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/gbarr/bin:/usr/local/bin:/usr/bin:/usr/X11R6/bin:/bin:/usr/games/bin:/usr/games:/opt/gnome/bin:/opt/kde/bin:.
    PERL5LIB=/home/value/perl
    PERL_BADLANG (unset)
    SHELL=/usr/bin/zsh


