[PATCH split()] split()'s unused captures should be undef, not ''

From: Jeff 'japhy/Marillion' Pinyan
Date: July 27, 2001 11:00
Subject: [PATCH split()] split()'s unused captures should be undef, not ''
Message ID: Pine.GSO.4.21.0107271358310.28213-100000@crusoe.crusoe.net

Patch below sig.

-- 
Jeff "japhy" Pinyan      japhy@pobox.com      http://www.pobox.com/~japhy/
I am Marillion, the wielder of Ringril, known as Hesinaur, the Winter-Sun.
Are you a Monk?  http://www.perlmonks.com/     http://forums.perlguru.com/
Perl Programmer at RiskMetrics Group, Inc.     http://www.riskmetrics.com/
Acacia Fraternity, Rensselaer Chapter.         Brother #734
**      Manning Publications, Co, is publishing my Perl Regex book      **

--- pp.c.old	Fri Jul 27 12:22:31 2001
+++ pp.c	Fri Jul 27 12:23:58 2001
@@ -4228,12 +4228,16 @@
 		for (i = 1; i <= rx->nparens; i++) {
 		    s = rx->startp[i] + orig;
 		    m = rx->endp[i] + orig;
-		    if (m && s) {
+
+		    /* japhy (07/27/01) -- the (m && s) test doesn't catch
+		       parens that didn't match -- they should be set to
+		       undef, not the empty string */
+		    if (m >= orig && s >= orig) {
 			dstr = NEWSV(33, m-s);
 			sv_setpvn(dstr, s, m-s);
 		    }
 		    else
-			dstr = NEWSV(33, 0);
+			dstr = &PL_sv_undef;  /* undef, not "" */
 		    if (make_mortal)
 			sv_2mortal(dstr);
 		    if (do_utf8)

--- pod/perlfunc.pod.old	Fri Jul 27 13:48:22 2001
+++ pod/perlfunc.pod	Fri Jul 27 13:52:22 2001
@@ -4481,6 +4481,10 @@
 
 produces the output 'h:i:t:h:e:r:e'.
 
+Using the empty pattern C<//> specifically matches the null string, and is
+not to be confused with the use of C<//> to mean "the last successful pattern
+match".
+
 Empty leading (or trailing) fields are produced when there positive width
 matches at the beginning (or end) of the string; a zero-width match at the
 beginning (or end) of the string does not produce an empty field.  For
@@ -4540,6 +4544,11 @@
 	#...
     }
 
+As with regular pattern matching, any capturing parentheses that are not
+matched in a C<split()> will be set to C<undef> when returned:
+
+    @fields = split /(A)|B/, "1A2B3";
+    # @fields is (1, 'A', 2, undef, 3)
 
 =item sprintf FORMAT, LIST
 

--- t/op/split.t.old	Fri Jul 27 13:46:28 2001
+++ t/op/split.t	Fri Jul 27 13:48:15 2001
@@ -5,7 +5,7 @@
     @INC = '../lib';
 }
 
-print "1..45\n";
+print "1..46\n";
 
 $FS = ':';
 
@@ -253,4 +253,15 @@
     }
     print "not " unless $r eq "he:o cruel world";
     print "ok 45\n";
+}
+
+
+{
+    # split /(A)|B/, "1B2" should return (1, undef, 2)
+    my @x = split /(A)|B/, "1B2";
+    print "not " unless
+      $x[0] eq '1' and
+      (not defined $x[1]) and
+      $x[2] eq '2';
+    print "ok 46\n";
 }
