develooper Front page | perl.perl5.porters | Postings from March 2007

[perl #42167] Text::Tabs fails to expand correctly in the presence of UTF8 characters

Robin Norwood
March 29, 2007 23:58
[perl #42167] Text::Tabs fails to expand correctly in the presence of UTF8 characters
Message ID:
# New Ticket Created by  Robin Norwood 
# Please include the string:  [perl #42167]
# in the subject line of all future correspondence about this issue. 
# <URL: >

This is a bug report for perl from,
generated with the help of perlbug 1.35 running under perl v5.8.8.

It appears that Text::Tabs doesn't expand tabs properly when the tab comes after UTF8 characters.

perl -CS -MText::Tabs -e 'print expand("\taa\t.\n\t\x{010a}\x{010a}\t."), "\n"'

        aa      .
        %GĊĊ%@       .

My text editor/mailer may munge the UTF8 improperly - essentially the
line with two UTF8 characters gets an extra space before the dot when
run through Text::Tabs::expand.

This appears to be also broken in 5.9.4.

The bug is in Red Hat's bugzilla as:

Incidentally, the problem appears to stem from the pos() function not
counting UTF8 characters correctly - I haven't delved into the source
deeply enough to figure out why, though.  There is an alternative
version of expand() in the source file (after __END__) that
does not have this bug.  Since the repo browser at seems broken right
now ('no space left on device' errors - reported to the email address
listed on that page), I don't have access to the annotations/history
of that file to see why.

[Please do not change anything below this line]
This perlbug was built using Perl v5.8.8 in the Red Hat build system.
It is being executed now by Perl v5.8.8 - Tue Oct  3 11:01:05 EDT 2006.

Site configuration information for perl v5.8.8:

Configured by Red Hat, Inc. at Tue Oct  3 11:01:05 EDT 2006.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
    osname=linux, osvers=2.6.9-34.elsmp, archname=i386-linux-thread-multi
    uname='linux 2.6.9-34.elsmp #1 smp fri feb 24 16:56:28 est 2006 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -Dversion=5.8.8 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dinc_version_list=5.8.7 5.8.6 5.8.5 -Dscriptdir=/usr/bin'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='4.1.1 20060928 (Red Hat 4.1.1-28)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/, so=so, useshrplib=true,
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -L/usr/local/lib'

Locally applied patches:

@INC for perl v5.8.8:

Environment for perl v5.8.8:
    LANGUAGE (unset)
    LOGDIR (unset)
    PERL_BADLANG (unset)
    SHELL=/bin/bash Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at | Group listing | About