
[perl #24202] Perl 5.8.0 regexp anomaly

From: perlbug-followup
Date: October 14, 2003 17:04
Subject: [perl #24202] Perl 5.8.0 regexp anomaly
Message ID: rt-24202-66013.0.990672050783843@rt.perl.org

# New Ticket Created by  mario@anchor.sps.mot.com 
# Please include the string:  [perl #24202]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=24202 >



This is a bug report for perl from mario@hauler.sps.mot.com,
generated with the help of perlbug 1.34 running under perl v5.8.0.


-----------------------------------------------------------------
[Please enter your report here]

I have run across an odd problem with regular expressions.

In trying to summarize the problem, I believe I have found a workaround, but
the behavior is odd enough that I think it warrants reporting.

I am trying to parse a syntax like

key : value

The second field can be a :, @ or % character to indicate a string, list or
hash, and it may be repeated to indicate a multiline value.
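
To make the marker convention concrete, here is a minimal sketch of how the
second field could be interpreted. The helper name describe_marker and the
%KIND_OF table are hypothetical, introduced only for illustration; they are
not part of the script in this report. Rejecting mixed markers such as ":@"
is also an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical lookup table: first marker character => value type.
my %KIND_OF = (':' => 'string', '@' => 'list', '%' => 'hash');

# describe_marker() is an illustrative helper, not code from the report.
# It takes the marker field (for example ':' or '@@') and returns
# (type, is_multiline), or an empty list for an unknown or mixed marker.
sub describe_marker {
    my ($marker) = @_;
    my $first = substr($marker, 0, 1);
    my $kind  = $KIND_OF{$first};
    return unless defined $kind;
    # Assumption: a marker must repeat one character only (no ':@').
    return if $marker ne $first x length($marker);
    return ($kind, length($marker) > 1 ? 1 : 0);
}

for my $marker (':', '@@', '%') {
    my ($kind, $multi) = describe_marker($marker);
    printf "%-2s => %s%s\n", $marker, $kind, $multi ? ' (multiline)' : '';
}
```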

The relevant fact is that I'm using an RE to parse the line for me into $1, $2
and $3.

This regex is  /^\s*(.*?)\s*(:+|%+|@+)\s*(.*?)\s*$/  and worked fine in our
older version of perl (5.005).  Now, with both 5.8.0 on Solaris and 5.8.1
on Linux, there is some odd behavior.  Call the following file "/tmp/re":

--- cut here ---

#! /usr/bin/perl -w

use strict;

while (<>) {
  chomp;
  print(">>$_<<\n");
  if (! /^\s*(.*?)\s*(:+|%+|@+)\s*(.*?)\s*$/ ) {
    print(STDERR "Illegal key at line $.: $_.\n");
  } else {
    print("1:$1, 2:$2, 3:$3.\n");
  }
}

--- cut here ---

When I run the following from bash, I get some unexpected results:

COMMAND:

%% perl -v

This is perl, v5.8.0 built for sun4-solaris ....

%% { echo "Hello:"; echo "Hello:"; } | perl /tmp/re

RESULTS:

>>Hello:<<
1:, 2:, 3:Hello:.
>>Hello:<<
1:Hello, 2::, 3:.

With my older version, I get the expected results:

COMMAND:

%% /usr/local/bin/perl -v

This is perl, version 5.005_03 built for sun4-solaris ....

%% { echo "Hello:"; echo "Hello:"; } | /usr/bin/perl /tmp/re
>>Hello:<<
1:Hello, 2::, 3:.
>>Hello:<<
1:Hello, 2::, 3:.


So the newer version is somehow treating the exact same input differently
if it's the first it sees.  Odder still, or maybe illustrative (which is
why I'm mentioning it), here's a second program, all self-contained; call it
"/tmp/re2".

--- cut here ---

#! /usr/bin/perl -w

use strict;

sub check($) {
  $_ = shift;
  print(">>$_<<\n");
  if (! /^\s*(.*?)\s*(:+|%+|@+)\s*(.*?)\s*$/ ) {
    print(STDERR "Illegal key at line $.: $_.\n");
  } else {
    print("1:$1, 2:$2, 3:$3.\n");
  }
}

check "Hello:";
check "Hello:";

exit 0;

--- cut here ---

Commands and results:

%% perl /tmp/re2
>>Hello:<<
1:, 2:, 3:Hello:.
>>Hello:<<
1:, 2:, 3:Hello:.
%% /usr/bin/perl /tmp/re2
>>Hello:<<
1:Hello, 2::, 3:.
>>Hello:<<
1:Hello, 2::, 3:.


... so the problem seems to manifest only the first time the expression
is evaluated within a dynamic scope.  (Or maybe I'm full of
ka-ka.)

Workaround:

As I was searching the web looking for something like this bug anywhere,
I ran across the v5.8.0 delta document, which mentioned that "@foo" now
actually will always refer to array foo.  This set me thinking, so I
added some escaping to the regular expression and the odd behavior has
gone away.  I don't know whether this means it's a boneheaded user issue,
or whether there's something just weird about how patterns are expanded.

I'd argue that if it's expected behavior then Perl would respond the same way
to the first occurrence as the second, so I think this is, in fact, indicative
of a bug somewhere.
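
A hedged aside on the "@foo" note: perlvar documents @+ as a real array that
holds the end offsets of the last successful match, so its contents differ
before and after a match.  Whether that array is what gets picked up inside
(:+|%+|@+) on 5.8 is an assumption here, not a confirmed diagnosis.  The
sketch below only shows what @+ holds after a match; match_ends is a
hypothetical helper and the simplified pattern is not the one from the report.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# match_ends() is an illustrative helper: after a successful match it
# returns a copy of @+, the end offsets of the whole match and of each
# capture group. It returns an empty list when the match fails.
sub match_ends {
    my ($line) = @_;
    if ($line =~ /^(\w+)(:+)$/) {
        my @ends = @+;    # copy before any later match can change it
        return @ends;
    }
    return;
}

# For "Hello:" the whole match ends at offset 6, group 1 ("Hello") at 5,
# and group 2 (":") at 6.
print join(' ', match_ends('Hello:')), "\n";    # prints "6 5 6"
```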

Anyway, the following program works:

--- cut here ---

#! /usr/bin/perl -w

use strict;

while (<>) {
  chomp;
  print(">>$_<<\n");
  #                      v   v
  if (! /^\s*(.*?)\s*(:+|\%+|\@+)\s*(.*?)\s*$/ ) {
  #                      ^   ^
    print(STDERR "Illegal key at line $.: $_.\n");
  } else {
    print("1:$1, 2:$2, 3:$3.\n");
  }
}

--- cut here ---

This works for 5.005 as well as 5.8.0 and 5.8.1.  So I'm happy and can
go on working, but I thought I ought to report this.
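
For anyone who wants to check the escaped pattern without piping input, here
is a minimal self-contained sketch.  The wrapper name parse_line and the
sample lines other than "Hello:" are assumptions made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_line() is a hypothetical wrapper around the escaped pattern above.
# It returns (key, marker, value), or an empty list if the line is illegal.
sub parse_line {
    my ($line) = @_;
    if ($line =~ /^\s*(.*?)\s*(:+|\%+|\@+)\s*(.*?)\s*$/) {
        return ($1, $2, $3);
    }
    return;
}

# The same input is fed twice on purpose, to compare first and second use.
for my $line ('Hello:', 'Hello:', 'colors @ red green', 'opts %% a=1') {
    my ($key, $marker, $value) = parse_line($line);
    print "1:$key, 2:$marker, 3:$value.\n";
}
```

With the escapes in place, both "Hello:" lines should print
"1:Hello, 2::, 3:." regardless of which one is seen first.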

-- Mario


[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=medium
---
Site configuration information for perl v5.8.0:

Configured by mario at Tue Sep 16 01:49:49 MST 2003.

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
  Platform:
    osname=solaris, osvers=2.8, archname=sun4-solaris
    uname='sunos hauler 5.8 generic_108528-22 sun4u sparc sunw,ultra-enterprise '
    config_args='-Dprefix=/tools/LOCAL/002/SunOS_5.8 -Dcc=gcc -Uinstallusrbinperl -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-fno-strict-aliasing -I/usr/gnu/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O',
    cppflags='-fno-strict-aliasing -I/usr/gnu/include'
    ccversion='', gccversion='3.0.4', gccosandvers='solaris2.8'
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib -L/usr/gnu/lib '
    libpth=/usr/local/lib /usr/gnu/lib /usr/lib /usr/ccs/lib
    libs=-lsocket -lnsl -ldl -lm -lc
    perllibs=-lsocket -lnsl -ldl -lm -lc
    libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' -Wl,-E'
    cccdlflags='-fPIC', lddlflags=' -Wl,-E -G -L/usr/local/lib -L/usr/gnu/lib'

Locally applied patches:
    

---
@INC for perl v5.8.0:
    /tools/LOCAL/002/SunOS_5.8/lib/perl5/5.8.0/sun4-solaris
    /tools/LOCAL/002/SunOS_5.8/lib/perl5/5.8.0
    /tools/LOCAL/002/SunOS_5.8/lib/perl5/site_perl/5.8.0/sun4-solaris
    /tools/LOCAL/002/SunOS_5.8/lib/perl5/site_perl/5.8.0
    /tools/LOCAL/002/SunOS_5.8/lib/perl5/site_perl
    .

---
Environment for perl v5.8.0:
    HOME=/home/mario
    LANG (unset)
    LANGUAGE (unset)
    LC_TIME=C
    LD_LIBRARY_PATH=/usr/dt/lib:/usr/openwin/lib:/usr/gnu/lib:/usr/LOCAL/lib:/usr/local/lib:/usr/lib:/opt/SUNWapcy/lib:/opt/SUNWdat/lib:/opt/SUNWppro/lib:/opt/SUNWsdb/lib:/opt/hpnp/lib
    LOGDIR (unset)
    PATH=/home/mario/bin:/tools/wrapper/genii:/usr/gnu/bin:/tools/wrapper:/usr/condor/bin:/usr/LOCAL/bin:/usr/local/bin:/usr/dt/bin:/usr/openwin/bin:/usr/bin:/usr/ucb:/usr/sbin:/sbin:/usr/ccs/bin:/opt/SUNWdat/bin:/opt/SUNWppro/bin:/opt/SUNWrtvc/bin:/opt/hpnp/bin:.
    PERL_BADLANG (unset)
    SHELL=/usr/gnu/bin/bash



