perl.perl5.porters | Postings from July 2008

[perl #56902] regex utf8 "uninitialized value" error

From: Ben Bullock
Date: July 13, 2008 22:37
Subject: [perl #56902] regex utf8 "uninitialized value" error
Message ID: rt-3.6.HEAD-8814-1215988216-834.56902-75-0@perl.org
# New Ticket Created by  "Ben Bullock" 
# Please include the string:  [perl #56902]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=56902 >


This is a bug report for perl from benkasminbullock@gmail.com,
generated with the help of perlbug 1.36 running under perl 5.10.0.


-----------------------------------------------------------------
[Please enter your report here]

The following script prints many spurious "uninitialized value"
warnings, depending on whether UTF-8 is switched on or off.

#!/usr/bin/perl
use warnings;
use strict;

my $regex =
"([\x{ff10}-\x{ff19}0-9]{4}|[\x{5341}\x{516d}\x{4e03}\x{4e5d}\x{4e94}\x{56db}\x{5343}\x{767e}\x{4e8c}\x{4e00}\x{516b}\x{4e09}]?\x{5343}[\x{5341}\x{516d}\x{4e03}\x{4e5d}\x{4e94}\x{56db}\x{5343}\x{767e}\x{4e8c}\x{4e00}\x{516b}\x{4e09}]*)\\s*\x{5e74}";
my $test = "ABCDEFG";
if ($test =~ /($regex)/) {
    print "m:<$1>\n";
}
__END__

If the last character ("\x{5e74}") is removed from the regexp, the
warning vanishes. But if the capturing group (...) is removed, leaving
just "\\s*\x{5e74}", the warning vanishes too. So it is not \x{5e74}
alone that triggers the warning, only \x{5e74} combined with something
else.
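
The three variants described above can be compared side by side with a
small harness such as the sketch below. It is not part of the original
script: the helper count_warnings, the variable names, and the split of
the pattern into $digits, $kanji and $group are assumptions made for
illustration. No expected counts are given, since they depend on the
perl version in use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Building blocks of the pattern from the report (names are hypothetical).
my $digits = "[\x{ff10}-\x{ff19}0-9]{4}";
my $kanji  = "[\x{5341}\x{516d}\x{4e03}\x{4e5d}\x{4e94}\x{56db}"
           . "\x{5343}\x{767e}\x{4e8c}\x{4e00}\x{516b}\x{4e09}]";
my $group  = "(${digits}|${kanji}?\x{5343}${kanji}*)";

# The full pattern and the two reduced forms discussed in the report.
my @variants = (
    [ 'full pattern',            "${group}\\s*\x{5e74}" ],
    [ 'without trailing U+5E74', "${group}\\s*"         ],
    [ 'without capturing group', "\\s*\x{5e74}"         ],
);

# Hypothetical helper: match $input against $pattern and return the
# number of warnings raised while doing so.
sub count_warnings {
    my ($pattern, $input) = @_;
    my $count = 0;
    local $SIG{__WARN__} = sub { $count++ };
    if ($input =~ /($pattern)/) {
        my $matched = $1;    # touch the capture, as the report's script does
    }
    return $count;
}

for my $variant (@variants) {
    my ($label, $pattern) = @$variant;
    printf "%-26s %d warning(s)\n", $label,
        count_warnings($pattern, "ABCDEFG");
}
```

Counting warnings through a local $SIG{__WARN__} handler keeps the
output short and makes it easy to compare perl versions.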


The above is a condensed version of the original script, which was as follows:
#!/usr/local/bin/perl -lw
use strict;
use Encode 'decode';
use Lingua::JA::FindDates 'subsjdate';
binmode STDERR,"utf8";
binmode STDOUT,"utf8";
print STDERR "first try\n";
my $test = "ABCDEFG";
print subsjdate($test);
print STDERR "now try again\n";
$test = decode ('utf8', $test);
print subsjdate($test);
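
The two calls to subsjdate receive the same characters, once before and
once after decode. A rough way to check whether the internal UTF-8 flag
differs between the two strings is sketched below. That the flag is what
matters here is an assumption, and the variable names $before and $after
are not from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode 'decode';

my $before = "ABCDEFG";
my $after  = decode('utf8', $before);

# utf8::is_utf8 reports the internal UTF-8 flag of a scalar.
printf "before decode: flag %s\n", utf8::is_utf8($before) ? "on" : "off";
printf "after decode:  flag %s\n", utf8::is_utf8($after)  ? "on" : "off";

# The two strings still hold the same characters.
print "strings compare equal\n" if $before eq $after;
```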

See also this discussion:

http://groups.google.co.jp/group/comp.lang.perl.misc/browse_frm/thread/e487e48569c928b7?hl=en#


[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=low
---
Site configuration information for perl 5.10.0:

Configured by ben at Sun Mar 23 08:50:32 JST 2008.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.22-14-generic, archname=i686-linux
    uname='linux lemon 2.6.22-14-generic #1 smp tue feb 12 07:42:25 utc 2008 i686 gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /usr/lib64
    libs=-lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.6.1.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.6.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib'

Locally applied patches:


---
@INC for perl 5.10.0:
    /usr/local/lib/perl5/5.10.0/i686-linux
    /usr/local/lib/perl5/5.10.0
    /usr/local/lib/perl5/site_perl/5.10.0/i686-linux
    /usr/local/lib/perl5/site_perl/5.10.0
    .

---
Environment for perl 5.10.0:
    HOME=/home/ben
    LANG=en_GB.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/ben/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11:/usr/games
    PERL_BADLANG (unset)
    SHELL=/bin/bash

