perl.perl5.porters | Postings from June 2004

[perl #30442] Text::ParseWords does not handle backslashed newline inside quoted text

From: Ephraim Dan
Date: June 24, 2004 01:56
Subject: [perl #30442] Text::ParseWords does not handle backslashed newline inside quoted text
Message ID: rt-3.0.9-30442-91297.4.12944162802255@perl.org
# New Ticket Created by  Ephraim Dan 
# Please include the string:  [perl #30442]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=30442 >



This is a bug report for perl from ephraim.dan@exlibris.co.il,
generated with the help of perlbug 1.34 running under perl v5.8.2.


-----------------------------------------------------------------
[Please enter your report here]

I am trying to parse a "line" (really a string) of text using
Text::ParseWords.  I have quoted fields delimited by a tab (\t)
character.  Within a quoted field, an actual linefeed (0x0a)
character can appear, escaped with a backslash, e.g.:

"field1" "field2\
still field2"   "field3"
__END__
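
For illustration, a minimal reproduction sketch of the case above. It
assumes the delimiter is passed to parse_line() as "\t" and that $keep
is false; the helper name try_parse is hypothetical and not part of
Text::ParseWords. On an affected version I would expect parse_line() to
give up at the second field, while a version that handles the escaped
linefeed should report three fields.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Hypothetical helper: split a tab-delimited line of quoted fields.
# Returns an array reference (empty if parse_line() gives up).
sub try_parse {
    my ($line) = @_;
    my @fields = parse_line("\t", 0, $line);
    return \@fields;
}

# Three quoted fields; the second contains a backslash followed by
# a real linefeed (0x0a).
my $line = qq{"field1"\t"field2\\\nstill field2"\t"field3"};

my $fields = try_parse($line);
printf "got %d field(s)\n", scalar @$fields;
```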

I looked at the source, and the main regex in parse_line() appears
to be intended to handle backslashed characters inside quoted text.
I think this should include a newline.  The problem is that the
regex uses "." without the "/s" modifier, so "\\." cannot match a
backslash followed by a newline.
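
The effect of the missing "/s" can be sketched in isolation. The snippet
below copies only the $quoted branch of the regex and anchors it with
\z for illustration; the helper name quoted_ok and its $dotall flag are
assumptions made for this sketch, not code from the module.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $text is one complete quoted field.
# $dotall selects whether the pattern is compiled with /s, which
# lets "." match a linefeed.
sub quoted_ok {
    my ($text, $dotall) = @_;
    my $re = $dotall
        ? qr/^(["'])((?:\\.|(?!\1)[^\\])*)\1\z/xs
        : qr/^(["'])((?:\\.|(?!\1)[^\\])*)\1\z/x;
    return $text =~ $re ? 1 : 0;
}

# A quoted field containing a backslash followed by a real linefeed.
my $field = qq{"field2\\\nstill field2"};

print "without /s: ", quoted_ok($field, 0), "\n";   # prints 0
print "with /s:    ", quoted_ok($field, 1), "\n";   # prints 1
```

Without /s neither alternative can consume the backslash: "\\." needs a
non-newline after it, and "[^\\]" excludes the backslash itself.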

At first, I tried changing the regex to use the "/s" modifier (in
addition to the "/x" modifier, of course).  It worked well on my
data.  It then occurred to me that the more correct solution would
be to change "." to "[\000-\377]", which the regex already uses to
match the rest of the line.
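
The same difference shows up in the unescape step. The sketch below
wraps the two substitutions in hypothetical helpers (unescape_dot and
unescape_any are names made up for this example). One caveat, stated as
an observation rather than a recommendation: "[\000-\377]" only covers
code points 0 to 255, so a character above 0xFF would not match it.

```perl
use strict;
use warnings;

# Hypothetical helper: the substitution using ".".
sub unescape_dot {
    my ($s) = @_;
    $s =~ s/\\(.)/$1/g;             # "." skips backslash + linefeed
    return $s;
}

# Hypothetical helper: the substitution using the explicit class.
sub unescape_any {
    my ($s) = @_;
    $s =~ s/\\([\000-\377])/$1/g;   # class includes linefeed (0x0a)
    return $s;
}

# Field text containing a backslash followed by a real linefeed.
my $quoted = qq{field2\\\nstill field2};

print unescape_dot($quoted) eq $quoted
    ? "dot: backslash left in place\n" : "dot: backslash removed\n";
print unescape_any($quoted) eq "field2\nstill field2"
    ? "class: backslash removed\n" : "class: backslash left in place\n";
```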

I then noticed that a plain linefeed, not backslashed, inside a
quoted field works fine, which makes me much closer to 100% sure
that this is a bug, and not somehow the intended behaviour.
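
A likely reason the plain linefeed works is that it is consumed by the
negated class "[^\\]" rather than by ".". A small sketch of that
difference, using hypothetical helper names:

```perl
use strict;
use warnings;

# Hypothetical helper: does "." (no /s) match this single character?
sub matches_dot {
    my ($char) = @_;
    return $char =~ /^.\z/ ? 1 : 0;
}

# Hypothetical helper: does the negated class match this character?
sub matches_negated_class {
    my ($char) = @_;
    return $char =~ /^[^\\]\z/ ? 1 : 0;
}

print "dot matches linefeed:     ", matches_dot("\n"), "\n";            # prints 0
print "[^\\\\] matches linefeed: ", matches_negated_class("\n"), "\n";  # prints 1
```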

Here is a patch:

*** ../app/perl/lib/5.8.2/Text/ParseWords.pm    Thu Jan  1 15:24:48 2004
--- ./lib/Text/ParseWords.pm        Wed Jun 23 14:14:43 2004
***************
*** 59,69 ****

        ($quote, $quoted, undef, $unquoted, $delim, undef) =
            $line =~ m/^(["'])                 # a $quote
!                         ((?:\\.|(?!\1)[^\\])*)    # and $quoted text
                          \1                   # followed by the same quote
                          ([\000-\377]*)       # and the rest
                       |                       # --OR--
!                        ^((?:\\.|[^\\"'])*?)    # an $unquoted text
                        (\Z(?!\n)|(?-x:$delimiter)|(?!^)(?=["']))
                                                 # plus EOL, delimiter, or quote
                        ([\000-\377]*)         # the rest
--- 59,69 ----

        ($quote, $quoted, undef, $unquoted, $delim, undef) =
            $line =~ m/^(["'])                 # a $quote
!                         ((?:\\[\000-\377]|(?!\1)[^\\])*)    # and $quoted text
                          \1                   # followed by the same quote
                          ([\000-\377]*)       # and the rest
                       |                       # --OR--
!                        ^((?:\\[\000-\377]|[^\\"'])*?)    # an $unquoted text
                        (\Z(?!\n)|(?-x:$delimiter)|(?!^)(?=["']))
                                                 # plus EOL, delimiter, or quote
                        ([\000-\377]*)         # the rest
***************
*** 76,84 ****
            $quoted = "$quote$quoted$quote";
        }
          else {
!           $unquoted =~ s/\\(.)/$1/g;
            if (defined $quote) {
!               $quoted =~ s/\\(.)/$1/g if ($quote eq '"');
                $quoted =~ s/\\([\\'])/$1/g if ( $PERL_SINGLE_QUOTE && $quote eq "'");
              }
        }
--- 76,84 ----
            $quoted = "$quote$quoted$quote";
        }
          else {
!           $unquoted =~ s/\\([\000-\377])/$1/g;
            if (defined $quote) {
!               $quoted =~ s/\\([\000-\377])/$1/g if ($quote eq '"');
                $quoted =~ s/\\([\\'])/$1/g if ( $PERL_SINGLE_QUOTE && $quote eq "'");
              }
        }


__END__

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=library
    severity=high
---
Site configuration information for perl v5.8.2:

Configured by sharvitd_s at Thu Jan  1 14:51:25 IST 2004.

Summary of my perl5 (revision 5.0 version 8 subversion 2) configuration:
  Platform:
    osname=linux, osvers=2.4.9-e.3, archname=i686-linux-64int-ld
    uname='linux rachel 2.4.9-e.3 #1 fri may 3 17:02:43 edt 2002 i686 unknown '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=undef uselongdouble=define
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='2.96 20000731 (Red Hat Linux 7.2 2.96-108.1)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.2.4.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.2.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:


---
@INC for perl v5.8.2:
    /exlibris/sfx_ver/sfx_version_3/app/perl/lib/5.8.2/i686-linux-64int-ld
    /exlibris/sfx_ver/sfx_version_3/app/perl/lib/5.8.2
    /exlibris/sfx_ver/sfx_version_3/app/perl/lib/site_perl/5.8.2/i686-linux-64int-ld
    /exlibris/sfx_ver/sfx_version_3/app/perl/lib/site_perl/5.8.2
    /exlibris/sfx_ver/sfx_version_3/app/perl/lib/site_perl
    .

---
Environment for perl v5.8.2:
    HOME=/exlibris/sfx_ver/sfx_version_3/ed_3/home
    LANG=en_US
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/exlibris/sfx_ver/sfx_version_3/app/mysql/bin:/exlibris/sfx_ver/sfx_version_3/app/perl/bin:/exlibris/sfx_ver/sfx_version_3/app/utils:/opt/IBMJava2-131/bin:/opt/IBMJava2-131/jre/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:.:/exlibris/sfx_ver/sfx_version_3/ed_3/home/bin
    PERL_BADLANG (unset)
    PERL_HOME=/exlibris/sfx_ver/sfx_version_3/app/perl/bin
    SHELL=/bin/tcsh





		