perl.perl5.porters | Postings from May 2003

[perl #22203] unicode regular expressions not working correctly

From: perlbug-followup
Date: May 14, 2003 16:17
Subject: [perl #22203] unicode regular expressions not working correctly
Message ID: rt-22203-57614.19.0785786693864@bugs6.perl.org

# New Ticket Created by  nlevitt@columbia.edu 
# Please include the string:  [perl #22203]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=22203 >



This is a bug report for perl from nlevitt@columbia.edu,
generated with the help of perlbug 1.34 running under perl v5.8.0.


-----------------------------------------------------------------
[Please enter your report here]


Please note that this message contains UTF-8 characters. Hopefully perlbug
handles them correctly.

The following test program:

   #!/usr/bin/perl -w
   
   my @strings = ('aaa一一a', 'aa一一a', 'a一一a', '一一a',);
   
   for my $string (@strings)
   {
       if ($string =~ /\A\p{L}{2}/) {
           print "$string starts with two letters\n";
       }
       else {
           print "$string doesn't start with two letters\n";
       }
   
       if ($string =~ /\A\p{L}{3}/) {
           print "$string starts with three letters\n";
       }
       else {
           print "$string doesn't start with three letters\n";
       }
   }

prints the following output for me:

   aaa一一a starts with two letters
   aaa一一a starts with three letters
   aa一一a starts with two letters
   aa一一a starts with three letters
   a一一a starts with two letters
   a一一a doesn't start with three letters
   一一a doesn't start with two letters
   一一a doesn't start with three letters

which is incorrect. All four strings should match both regular
expressions: each one starts with at least three letters, where by
"letter" I mean a character matching \p{L}.
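
One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the test program above has no "use utf8;" line, so perl reads the string literals as bytes rather than as characters. The sketch below runs the same check with the literals decoded as characters. The helper name starts_with_letters is hypothetical and is not part of the original report. The sketch assumes the file is saved as UTF-8 and is run on a perl with working Unicode property support.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                               # assumption: this source file is saved as UTF-8
binmode(STDOUT, ':encoding(UTF-8)');    # encode wide characters when printing

# Hypothetical helper (not from the original report):
# returns 1 if $string begins with at least $count characters matching \p{L}.
sub starts_with_letters {
    my ($string, $count) = @_;
    return $string =~ /\A\p{L}{$count}/ ? 1 : 0;
}

# Same test strings as in the report, now held as character strings.
my @strings = ('aaa一一a', 'aa一一a', 'a一一a', '一一a');

for my $string (@strings) {
    for my $count (2, 3) {
        my $verdict = starts_with_letters($string, $count)
            ? 'starts'
            : "doesn't start";
        print "$string $verdict with $count letters\n";
    }
}
```

If every line reports a match here while the original program does not, that would point at how the literals are decoded rather than at \p{L} itself. Whether this applies to the 5.8.0 build described below is not established by this sketch.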

Noah

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=high
---
Site configuration information for perl v5.8.0:

Configured by Debian Project at Mon Feb 17 13:30:42 UTC 2003.

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.4.19, archname=i386-linux-thread-multi
    uname='linux cyberhq 2.4.19 #1 smp sun aug 4 11:30:45 pdt 2002 i686 unknown unknown gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8.0 -Darchlib=/usr/lib/perl/5.8.0 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.0 -Dsitearch=/usr/local/lib/perl/5.8.0 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.0 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O3',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing'
    ccversion='', gccversion='3.2.2', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.1.so, so=so, useshrplib=true, libperl=libperl.so.5.8.0
    gnulibc_version='2.3.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    

---
@INC for perl v5.8.0:
    /etc/perl
    /usr/local/lib/perl/5.8.0
    /usr/local/share/perl/5.8.0
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.8.0
    /usr/share/perl/5.8.0
    /usr/local/lib/site_perl
    .

---
Environment for perl v5.8.0:
    HOME=/home/nlevitt
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/opt/pango-20030415/lib:/opt/xrender-20030424/lib:/opt/xft-20030424/lib:/opt/vte-20030501/lib
    LOGDIR (unset)
    PATH=/usr/src/gucharmap/gucharmap:/usr/local/bin:/usr/X11R6/bin:/bin:/usr/bin:/usr/games:/sbin:/usr/sbin:/opt/j2sdk1.4.0_02/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

