Re: split /(..)*/, 1234567890

From: Jonathan Scott Duff
Date: May 12, 2005 10:04
Subject: Re: split /(..)*/, 1234567890
Message ID: 20050512170355.GA18077@pobox.com

On Thu, May 12, 2005 at 06:29:49PM +0200, "TSa (Thomas Sandlaß)" wrote:
> Autrijus Tang wrote:
> >I don't know, I didn't invent that! :-)
> >
> >    $ perl -le 'print join ",", split /(..)/, 123'
> >    ,12,3
> 
> Hmm,
> 
> perl -le 'print join ",", split /(..)/, 112233445566'
> ,11,,22,,33,,44,,55,,66
> 
> For longer strings it makes every other match an empty string.

Not quite. The matching parts are the strings "11", "22", "33", etc.
And since what matches is what we're splitting on, we get the empty
string between pairs of characters (including the leading empty
string). The only reason you're getting the string that was matched
in the output is because that's what you've asked split to do by
placing parens around the pattern. (Type "perldoc -f split" at your
command prompt and read all about it.)

To bring this back to perl6, Autrijus's original query was regarding

	$ pugs -e 'say join ",", split /(..)*/, 1234567890'

which currently generates the list ('','12','34','56','78','90').
In perl5 it would generate the list ('','90') because only the last
pair of characters matched is kept (such is the nature of quantifiers
applied to capturing parens). But in perl6, quantified captures put all
of the matches into an array, so that "abcdef" ~~ /(..)*/ will make
$0 = ['ab','cd','ef'].

I think that the above split should generate a list like this:

	('', [ '12','34','56','78','90'])

Or, another example:

	$ pugs -e 'say join ",", split /(<[abc]>)*/, "xabxbxbcx"'
	# ('x', ['a','b'], 'x', ['b'], 'x', ['b','c'], 'x')

But that's just MHO.

> With the "Positions between chars" interpretation the above
> string, with '.' indicating a position, is:
> 
> .1.1.2.2.3.3.4.4.5.5.6.6.
> 0 1 2 3 4 5 6 7 8 9 1 1 1
>                     0 1 2
> 
> There are two matches each at 0, 2, 4, 6, 8 and 10.
> The empty match at the end seems to be skipped because
> position 12 is after the string?

No, the empty field at the end is skipped because that's the default
behaviour of split: it preserves leading empty fields and discards
trailing empty ones.

> And for odd numbers of
> chars the next-to-last position doesn't produce an empty
> match:
> perl -le 'print join ",", split /(..)/, 11223'
> ,11,,22,3

There's an empty field between the beginning of the string and "11",
there's an empty field between the "11" and the "22", and finally
there's a field at the end containing only "3".

> Am I the only one who finds that inconsistent?

Probably.  :-)

-Scott
-- 
Jonathan Scott Duff
duff@pobox.com
