perl.perl5.porters | Postings from November 2003

[perl #24507] Bug in split, when EXPRESSION only contains PATTERN

From: Moeller Wolf-Dietrich
Date: November 17, 2003 15:47
Subject: [perl #24507] Bug in split, when EXPRESSION only contains PATTERN
Message ID: rt-24507-67552.14.3368113370123@rt.perl.org
# New Ticket Created by  Moeller Wolf-Dietrich 
# Please include the string:  [perl #24507]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=24507 >


split / /; (or split /a/; or similar) produces the empty list if
EXPRESSION consists only of PATTERN (once or multiple times).

The documentation on split says the following:
"Empty leading (or trailing) fields are produced when there are positive
width matches at the beginning (or end) of the string" and "split(/ /) will
give you as many null initial fields as there are leading spaces".
Therefore I expect that split /a/,'aa'; should produce 2 empty strings as
the result list, not the empty list. There is no mention that empty
strings are generated only if further characters not matching PATTERN
follow.
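
For comparison, here is a minimal sketch of the case in question next to two
related calls. It is not part of the test program below, and the variable
names are only illustrative. It assumes the behaviour described in perlfunc
for a negative LIMIT (trailing null fields are not stripped):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# No LIMIT: the case reported here, the result is the empty list.
my @default = split /a/, 'aa';

# Negative LIMIT: trailing empty fields are kept.
my @keep_all = split /a/, 'aa', -1;

# A trailing non-matching character: the leading empty fields appear.
my @with_tail = split /a/, 'aa ';

printf "no LIMIT, 'aa':  %d field(s)\n", scalar @default;
printf "LIMIT -1, 'aa':  %d field(s)\n", scalar @keep_all;
printf "no LIMIT, 'aa ': %d field(s)\n", scalar @with_tail;
```

With LIMIT -1 the string 'aa' yields three empty fields, while without LIMIT
it yields none; 'aa ' yields two empty fields followed by ' ', as in case 07
of the test output.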

For some test cases, please see the test program below.

This bug has existed for a long time (I think at least since early 5.6.1)
and still persists in 5.8.1.

###################### start test program ###############################
#!/usr/local/bin/perl
# test script to show split bug (description see end of program)
# Wolf-Dietrich Moeller, 2003-11-14,
# <mailto:wolf-dietrich.moeller@siemens.com>
# tested on Perl 5.8.0_806 and 5.8.1_807 Win32 ActiveState,
# also on Perl 5.6.1_633 ActiveState Win32,
# and even older Perl 5.6.1 under Apache webserver and freeBSD
# (source distribution)
# output is (command Line and CGI-script):
#######################################################
# -- split(/a/)
# 01: in=undefined  @val=0
# 02: in=''  @val=0
# 03: in='a'  @val=0
# 04: in='aa'  @val=0
# 05: in=' a'  @val=1  value(length): ' ' (1), 
# 06: in='a '  @val=2  value(length): '' (0), ' ' (1), 
# 07: in='aa '  @val=3  value(length): '' (0), '' (0), ' ' (1), 
# -- split(/ /)
# 08: in=undefined  @val=0
# 09: in=''  @val=0
# 10: in=' '  @val=0
# 11: in='  '  @val=0
# 12: in='a '  @val=1  value(length): 'a' (1), 
# 13: in=' a'  @val=2  value(length): '' (0), 'a' (1), 
# 14: in='  a'  @val=3  value(length): '' (0), '' (0), 'a' (1),
#######################################################
use strict;
binmode STDOUT;
print "Content-Type: text/plain\x0D\x0A\x0D\x0A";
#
my @val;
my $j = 0;
print "# -- split(/a/)\x0D\x0A";
for (undef,'','a','aa',' a','a ','aa ') {
 if (length(++$j) < 2) { $j = '0'.$j }
 @val = split(/a/);
 print '# ',$j,': in=',(defined($_)?'\''.$_.'\'':'undefined'),'  @val=',scalar @val;
 if (@val) {
  print '  value(length): ';
  for (@val) { print '\'',$_,'\' (',length($_),'), ' }
  }
 print "\x0D\x0A";
 }
#
print "# -- split(/ /)\x0D\x0A";
for (undef,'',' ','  ','a ',' a','  a') {
 if (length(++$j) < 2) { $j = '0'.$j }
 @val = split(/ /);
 print '# ',$j,': in=',(defined($_)?'\''.$_.'\'':'undefined'),'  @val=',scalar @val;
 if (@val) {
  print '  value(length): ';
  for (@val) { print '\'',$_,'\' (',length($_),'), ' }
  }
 print "\x0D\x0A";
 }
#
print join "\x0D\x0A",
'#',
'# error in lines 3 + 4 (and 10 + 11)',
'# there should be one or two empty strings in @val according to doc on split:',
'# "Empty leading (or trailing) fields are produced when there are positive',
'# width matches at the beginning (or end) of the string" and',
'# "split(/ /) will give you as many null initial fields as there are leading',
'# spaces". There is no mentioning that this is valid only if there are further',
'# characters not equal PATTERN following (as in line 6 + 7 and 13 + 14).',
'#';
######################## end test program ####################

----------------------------------------
Dr. Wolf-Dietrich Moeller
Siemens AG, CT IC 3, D-81730 München
Corporate Technology Department Security
Mch P, Tel. +49 89 636-53391, Fax -48000
mailto:wolf-dietrich.moeller@siemens.com
Intranet https://security.ct.siemens.de/


