
[perl #30434] Possible bug in Perl 5.8.x using (?x) to enable extended syntax inside qr{....}

From: Gillman John
Date: June 23, 2004 10:14
Subject: [perl #30434] Possible bug in Perl 5.8.x using (?x) to enable extended syntax inside qr{....}
Message ID: rt-3.0.9-30434-91272.10.5014806660734@perl.org

# New Ticket Created by  "Gillman John (ExStaff)" 
# Please include the string:  [perl #30434]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=30434 >


Hi there,

I have written a module that I have used without problem with Perl 5.005_02
& 5.6.1 (on Solaris 2.6 and Solaris 8 respectively). I have now built Perl
5.8.3 on Solaris 8 but the module now fails. I have tracked the failure down
to a compiled regular expression in the module which is used to validate a
Unix filesystem path and extract its elements. I am using extended syntax
so that I can comment the regular expression; extended syntax is invoked by
using (?x) at the beginning of the expression, i.e. qr{(?x)...};. This no
longer works and instead I now have to write qr{...}x; to get it to work.

I have looked at the supplied Perl 5.8.3 documentation and it seems that
(?x) should still be a valid construct.
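
As a minimal sketch of the difference between the two forms (the patterns and
the sample path are illustrative, not taken from the module): on current Perls
an inline (?x) takes effect only from the point where it appears, so any
whitespace between the opening delimiter and (?x) is treated as literal text.
Whether that is what changed between 5.6.1 and 5.8.x is an assumption that
this report does not confirm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# (?x) immediately after the opening delimiter: nothing precedes it, so
# extended syntax covers the whole pattern.
my $rx_inline = qr{(?x)
    ^ (.*/)?     # optional directory part
    ([^/]*)      # file name part
    $
};

# (?x) after a newline and indentation, as laid out in the report. The
# whitespace before (?x) is parsed while extended syntax is still off.
my $rx_leading_ws = qr{
    (?x)
    ^ (.*/)?
    ([^/]*)
    $
};

# Trailing x modifier: extended syntax applies from the very first character.
my $rx_flag = qr{
    ^ (.*/)?
    ([^/]*)
    $
}x;

# Hypothetical helper: report whether a path matches a compiled pattern.
sub matches {
    my ($rx, $path) = @_;
    return $path =~ $rx ? 1 : 0;
}

my $path = '/usr/local/bin/perl';
printf "%-30s %s\n", '(?x) right after delimiter:',
    matches($rx_inline, $path) ? 'match' : 'no match';
printf "%-30s %s\n", '(?x) after leading whitespace:',
    matches($rx_leading_ws, $path) ? 'match' : 'no match';
printf "%-30s %s\n", 'trailing x modifier:',
    matches($rx_flag, $path) ? 'match' : 'no match';
```

Running this under each Perl build in question would show whether the
placement of (?x) relative to the leading whitespace is what makes the
difference.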

Since the module is a few hundred lines long, I have extracted the regular
expression and put it in a small test script that prompts for a path,
validates it and prints out its various parts. It does this in a loop;
Ctrl-D gets you out. Below are two versions of the script, the first has the
original regular expression which now fails at Perl 5.8.x and the second has
the workround. (Please excuse the cockeyed indenting, it seems Windows
disagrees with Solaris on how many spaces make a tab :-)

*------------------------------------------------------------------------------------------*

#!/usr/local/bin/perl -w
#
# Create prompt for user and string to cover up prompt and tidy the
# screen on exit.
#
$prompt = "Please enter a path (Ctrl-D to exit) > ";
$dePrompt = "\r" . " " x (length($prompt) + 2) . "\r";
$rxPath = qr
       {
          (?x)                  # Use extended regular expression syntax to
                                #   allow comments and white space
          ^                     # Anchor pattern to beginning of string
          (?=.)                 # Zero-width look ahead assertion to ensure
                                #   that there must be at least one character
                                #   for the match to succeed
          (.*/)?                # A memory grouping (1st) for path, greedy
                                #   match of any characters up to and
                                #   including the rightmost slash (the path
                                #   part) with a quantifier of '?' (0 or 1),
                                #   i.e. there may or may not be a directory
                                #   part
          (                     # Open memory grouping (2nd) for file name
             (.*?)              # A memory grouping (3rd) for file name stub
                                #   of a non-greedy match of any character
                                #   without a quantifier since, if there is a
                                #   file name part, at least some of it will
                                #   form a stub otherwise it would be a
                                #   dot-file such as .profile
             (                  # A memory grouping (4th) for file name
                                #   extension
                (?<=[^/])       # Zero width look behind assertion such
                                #   that following pattern will only succeed
                                #   if preceded by any character other than
                                #   a slash '/'
                \.[^.]+         # A literal dot '.' followed by one or more
                                #   non-dots
             )?                 # Close memory grouping (4th) with a
                                #   quantifier of '?' (0 or 1), i.e. there may
                                #   or may not be a file name extension part
          )?                    # Close memory grouping (2nd) with a
                                #   quantifier of '?' (0 or 1), i.e. there may
                                #   or may not be a file name part
          $                     # Anchor pattern to end of string
       };
#
# The above regular expression would appear thus without extended
# syntax:-
#
#     qr{^(?=.)(.*/)?((.*?)((?<=[^/])\.[^.]+)?)?$};
#
print "Using regular expression:-\n\n$rxPath\n\n";
#
# Loop around prompting for a path. Ctrl-D quits loop.
#
while(1)
{
    print $prompt;
    last if eof STDIN;
    $_ = <STDIN>;
    chomp;
    #
    # Test path for validity against compiled regex. If valid, print out
    # the elements of the path. Otherwise, give error message.
    #
    if(/$rxPath/)
    {
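        # Groups that took no part in the match are undefined; test with
        # defined() so that a component named "0" is not lost.
        my $dirName  = defined $1 ? $1 : "";
        my $fileName = defined $2 ? $2 : "";
        my $fileStub = defined $3 ? $3 : "";
        my $fileExt  = defined $4 ? $4 : "";
        print "\n",
           "   Path supplied - $_\n",
           "  Directory name - $dirName\n",
           "       File Name - $fileName\n",
           "       File Stub - $fileStub\n",
           "  File Extension - $fileExt\n\n";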
    }
    else
    {
        print "\n*** $_ is NOT a valid path ***\n\n";
    }
}
#
# Tidy up screen.
#
print $dePrompt;

*------------------------------------------------------------------------------------------*

#!/usr/local/bin/perl -w
#
# Create prompt for user and string to cover up prompt and tidy the
# screen on exit.
#
$prompt = "Please enter a path (Ctrl-D to exit) > ";
$dePrompt = "\r" . " " x (length($prompt) + 2) . "\r";
$rxPath = qr
       {
          ^                     # Anchor pattern to beginning of string
          (?=.)                 # Zero-width look ahead assertion to ensure
                                #   that there must be at least one character
                                #   for the match to succeed
          (.*/)?                # A memory grouping (1st) for path, greedy
                                #   match of any characters up to and
                                #   including the rightmost slash (the path
                                #   part) with a quantifier of '?' (0 or 1),
                                #   i.e. there may or may not be a directory
                                #   part
          (                     # Open memory grouping (2nd) for file name
             (.*?)              # A memory grouping (3rd) for file name stub
                                #   of a non-greedy match of any character
                                #   without a quantifier since, if there is a
                                #   file name part, at least some of it will
                                #   form a stub otherwise it would be a
                                #   dot-file such as .profile
             (                  # A memory grouping (4th) for file name
                                #   extension
                (?<=[^/])       # Zero width look behind assertion such
                                #   that following pattern will only succeed
                                #   if preceded by any character other than
                                #   a slash '/'
                \.[^.]+         # A literal dot '.' followed by one or more
                                #   non-dots
             )?                 # Close memory grouping (4th) with a
                                #   quantifier of '?' (0 or 1), i.e. there may
                                #   or may not be a file name extension part
          )?                    # Close memory grouping (2nd) with a
                                #   quantifier of '?' (0 or 1), i.e. there may
                                #   or may not be a file name part
          $                     # Anchor pattern to end of string
       }x;                      # Use extended regular expression syntax to
                                #   allow comments and white space
#
# The above regular expression would appear thus without extended
# syntax:-
#
#     qr{^(?=.)(.*/)?((.*?)((?<=[^/])\.[^.]+)?)?$};
#
print "Using regular expression:-\n\n$rxPath\n\n";
#
# Loop around prompting for a path. Ctrl-D quits loop.
#
while(1)
{
    print $prompt;
    last if eof STDIN;
    $_ = <STDIN>;
    chomp;
    #
    # Test path for validity against compiled regex. If valid, print out
    # the elements of the path. Otherwise, give error message.
    #
    if(/$rxPath/)
    {
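        # Groups that took no part in the match are undefined; test with
        # defined() so that a component named "0" is not lost.
        my $dirName  = defined $1 ? $1 : "";
        my $fileName = defined $2 ? $2 : "";
        my $fileStub = defined $3 ? $3 : "";
        my $fileExt  = defined $4 ? $4 : "";
        print "\n",
           "   Path supplied - $_\n",
           "  Directory name - $dirName\n",
           "       File Name - $fileName\n",
           "       File Stub - $fileStub\n",
           "  File Extension - $fileExt\n\n";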
    }
    else
    {
        print "\n*** $_ is NOT a valid path ***\n\n";
    }
}
#
# Tidy up screen.
#
print $dePrompt;

*------------------------------------------------------------------------------------------*
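
Both scripts print the compiled pattern before prompting. As a small sketch of
how that output can be read (the helper name x_flag_is_on is made up for
illustration): a qr object stringifies with its outer flags in a leading
group, (?x-ism:...) on Perls before 5.14 and (?^x:...) from 5.14 on, whereas
an inline (?x) stays inside the pattern body and does not appear in that outer
flag list.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the x flag appears in the outer flag group
# of a stringified qr object. Handles both "(?x-ism:" and "(?^x:" styles.
sub x_flag_is_on {
    my ($rx) = @_;
    my ($on) = "$rx" =~ /\A\(\?\^?([a-z]*)/;
    return (defined($on) && index($on, 'x') >= 0) ? 1 : 0;
}

my $rx_flag   = qr{ ^ a+ $ }x;      # x supplied as a trailing modifier
my $rx_inline = qr{(?x) ^ a+ $ };   # x supplied inline

print "trailing x : $rx_flag\n";
print "inline (?x): $rx_inline\n";
print "outer x flag, trailing form: ", x_flag_is_on($rx_flag),   "\n";
print "outer x flag, inline form  : ", x_flag_is_on($rx_inline), "\n";
```

Comparing the printed form across builds is a quick way to see how each Perl
recorded the pattern, independent of whether a given path matches.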

I have wrapped the failing regular expression in an eval{} to see if there
was a problem with the compilation but there were no complaints. It just
fails to match a valid path (the regular expression makes no judgement on
which characters are valid in a path other than that '/' is what separates
directories).
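
For reference, a sketch of that kind of check (compile_checked is a
hypothetical name, not something from the module). One caveat: a pattern
written literally in the source is compiled when the script itself is
compiled, so eval{} can only trap errors for patterns built at run time, for
example from a string. A pattern that compiles cleanly but means something
other than what was intended raises no error either way.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a pattern supplied as a string. Returns the
# compiled regex and an empty error on success, or (undef, $error) on failure.
sub compile_checked {
    my ($source) = @_;
    my $rx = eval { qr/$source/ };
    return (undef, $@) unless defined $rx;
    return ($rx, '');
}

# The second pattern has an unclosed group and should fail to compile.
for my $source ('^(.*/)?([^/]*)$', '^(.*/)?([^/]*$') {
    my ($rx, $err) = compile_checked($source);
    if (defined $rx) {
        print "compiled ok: $source\n";
    }
    else {
        print "failed to compile: $source\n  $err";
    }
}
```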

I give below the outputs of "perl -V" on the systems in this office that I
have tested. I have also tested at home using Perl 5.8.4 built with gcc
3.3.2 on a Sun Ultra 1 running 64-bit Solaris 8 and the original regular
expression still fails but the workround is ok; I don't have "perl -V"
output here for that system. On a whim, I tried the test scripts on an
ActiveState build of Perl 5.8.0 running on an Intel Pentium III 500MHz box
running Windows XP Pro; that also failed on the original but was ok with the
workround.
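
When checking several builds like this, a non-interactive variant of the test
script may be easier to compare than the prompting loop. The sketch below uses
the one-line form of the pattern quoted in the scripts' comments; the
split_path helper and the sample paths are illustrative additions, and the
script prints $] so that each run records which Perl produced it.

```perl
use strict;
use warnings;

# One-line form of the pattern, as quoted in the comments of both scripts.
my $rxPath = qr{^(?=.)(.*/)?((.*?)((?<=[^/])\.[^.]+)?)?$};

# Hypothetical helper: return (directory, file name, stub, extension) for a
# valid path, with unset groups as empty strings, or an empty list otherwise.
sub split_path {
    my ($path) = @_;
    my @parts = $path =~ $rxPath or return;
    return map { defined $_ ? $_ : '' } @parts;
}

print "Perl version: $]\n";
for my $path ('/usr/local/lib/libc.so', '.profile', '/etc/', 'archive.tar.gz', '') {
    my @parts = split_path($path);
    if (@parts) {
        printf "%-24s dir=[%s] file=[%s] stub=[%s] ext=[%s]\n", $path, @parts;
    }
    else {
        printf "%-24s NOT a valid path\n", "'$path'";
    }
}
```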

*------------------------------------------------------------------------------------------*

Below is the output of "perl -V" for the system that exhibits the problem.

$ perl -V
Summary of my perl5 (revision 5.0 version 8 subversion 3) configuration:
  Platform:
    osname=solaris, osvers=2.8, archname=sun4-solaris
    uname='sunos syrah 5.8 generic_117000-03 sun4u sparc sunw,sun-fire-v250 '
    config_args='-Dcc=gcc'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-fno-strict-aliasing -I/usr/local/include -I/usr/local/BerkeleyDB.3.3/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O',
    cppflags='-fno-strict-aliasing -I/usr/local/include -I/usr/local/BerkeleyDB.3.3/include'
    ccversion='', gccversion='3.3.2', gccosandvers='solaris2.8'
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib '
    libpth=/opt/sfw/lib /usr/local/lib /usr/local/BerkeleyDB.3.3/lib /usr/ccs/lib /usr/lib
    libs=-lsocket -lnsl -lgdbm -ldb -ldl -lm -lc
    perllibs=-lsocket -lnsl -ldl -lm -lc
    libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-fPIC', lddlflags='-G -L/usr/local/lib'


Characteristics of this binary (from libperl): 
  Compile-time options: USE_LARGE_FILES
  Built under solaris
  Compiled at Apr 21 2004 18:09:42
  @INC:
    /usr/local/lib/perl5/5.8.3/sun4-solaris
    /usr/local/lib/perl5/5.8.3
    /usr/local/lib/perl5/site_perl/5.8.3/sun4-solaris
    /usr/local/lib/perl5/site_perl/5.8.3
    /usr/local/lib/perl5/site_perl
    .
$ 

*------------------------------------------------------------------------------------------*

The original regular expression works without problem on this Perl build.

$ perl -V
Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration:
  Platform:
    osname=solaris, osvers=2.8, archname=sun4-solaris
    uname='sunos agusta 5.8 generic_108528-07 sun4u sparc sunw,ultra-250 '
    config_args='-Dcc=gcc'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler:
    cc='gcc', ccflags ='-fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O',
    cppflags='-fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='2.95.3 20010315 (release)', gccosandvers='solaris2.8'
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, usemymalloc=y, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib '
    libpth=/usr/local/lib /usr/lib /usr/ccs/lib
    libs=-lsocket -lnsl -lgdbm -ldl -lm -lc
    perllibs=-lsocket -lnsl -ldl -lm -lc
    libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-fPIC', lddlflags='-G -L/usr/local/lib'


Characteristics of this binary (from libperl): 
  Compile-time options: USE_LARGE_FILES
  Built under solaris
  Compiled at Sep 26 2001 12:19:31
  @INC:
    /usr/local/lib/perl5/5.6.1/sun4-solaris
    /usr/local/lib/perl5/5.6.1
    /usr/local/lib/perl5/site_perl/5.6.1/sun4-solaris
    /usr/local/lib/perl5/site_perl/5.6.1
    /usr/local/lib/perl5/site_perl
    .
$ 

*------------------------------------------------------------------------------------------*

And it also works on this one.

$ perl -V
Summary of my perl5 (5.0 patchlevel 5 subversion 2) configuration:
  Platform:
    osname=solaris, osvers=2.6, archname=sun4-solaris
    uname='sunos aprilia 5.6 generic_105181-05 sun4u sparc sunw,ultra-30 '
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef useperlio=undef d_sfio=undef
  Compiler:
    cc='gcc', optimize='-O', gccversion=2.8.1
    cppflags='-I/usr/local/include'
    ccflags ='-I/usr/local/include'
    stdchar='unsigned char', d_stdstdio=define, usevfork=false
    intsize=4, longsize=4, ptrsize=4, doublesize=8
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    alignbytes=8, usemymalloc=y, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /usr/ccs/lib
    libs=-lsocket -lnsl -lgdbm -ldb -ldl -lm -lc -lcrypt
    libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
    cccdlflags='-fPIC', lddlflags='-G -L/usr/local/lib'


Characteristics of this binary (from libperl): 
  Built under solaris
  Compiled at Oct  7 1998 16:09:09
  @INC:
    /usr/local/lib/perl5/5.00502/sun4-solaris
    /usr/local/lib/perl5/5.00502
    /usr/local/lib/perl5/site_perl/5.005/sun4-solaris
    /usr/local/lib/perl5/site_perl/5.005
    .
$ 

*------------------------------------------------------------------------------------------*

But it fails on this Wintel box.

P:\FreeWare>\perl\bin\perl -V
Summary of my perl5 (revision 5 version 8 subversion 0) configuration:
  Platform:
    osname=MSWin32, osvers=4.0, archname=MSWin32-x86-multi-thread
    uname=''
    config_args='undef'
    hint=recommended, useposix=true, d_sigaction=undef
    usethreads=undef use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cl', ccflags ='-nologo -Gf -W3 -MD -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT  -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX',
    optimize='-MD -DNDEBUG -O1',
    cppflags='-DWIN32'
    ccversion='', gccversion='', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='link', ldflags ='-nologo -nodefaultlib -release -libpath:"p:\perl\lib\CORE"  -machine:x86'
    libpth=/lib /usr/lib /usr/local/lib "p:\perl\lib\CORE"
    libs=  oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib  netapi32.lib uuid.lib wsock32.lib mpr.lib winmm.lib  version.lib odbc32.lib odbccp32.lib msvcrt.lib
    perllibs=  oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib  netapi32.lib uuid.lib wsock32.lib mpr.lib winmm.lib  version.lib odbc32.lib odbccp32.lib msvcrt.lib
    libc=msvcrt.lib, so=dll, useshrplib=yes, libperl=perl58.lib
    gnulibc_version='undef'
  Dynamic Linking:
    dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -release -libpath:"p:\perl\lib\CORE"  -machine:x86'


Characteristics of this binary (from libperl):
  Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT PERL_IMPLICIT_SYS
  Locally applied patches:
        ActivePerl Build 804
  Built under MSWin32
  Compiled at Dec  1 2002 23:15:13
  @INC:
    P:/perl/lib
    P:/perl/site/lib
    .

P:\FreeWare>

*------------------------------------------------------------------------------------------*

It is possible that I am doing something completely dumb but I don't think
so. I hope I have given you enough information to decide whether this really
is a bug or not.


Kind regards,

John Gillman

LogicaCMG

c/o DTI
    Bay 2105
    1 Victoria St
    London
    SW1H 0ET
    United Kingdom

    +44(0)20 7215 6977

    john.gillman@dti.gsi.gov.uk
    john.gillman@logicacmg.com




