Re: split n characters into n chunks

From: Dr.Ruud
Date: October 26, 2009 06:06
Subject: Re: split n characters into n chunks
Message ID: 20091025220047.732.qmail@lists.develooper.com

Shawn H Corey wrote:
> John W. Krahn wrote:

>> $ perl -le'
>> my $word = "thequickbrown";
>> my $subsets = 3;
>> print for $word =~ /(?=(.{$subsets}))/g;
> 
> Getting up there but substr is still the fastest.

I had to set the iterations to 300_000 to get rid of the warnings
(Benchmark complains when a run is too short to time reliably).
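
As an alternative to picking an iteration count by hand, here is a minimal sketch that lets Benchmark choose it: a negative count is interpreted as a minimum number of CPU seconds per case. The two anonymous subs are stand-ins written for this illustration, not the subs from 3.pl.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $word = "thequickbrown";
my $size = 3;
my $max  = length($word) - $size;

# A negative count means "run each case for at least that many CPU
# seconds", so Benchmark works out the iteration count itself.
my $results = cmpthese( -1, {
    substr => sub { my @l = map { substr $word, $_, $size } 0 .. $max },
    unpack => sub { my @l = map { unpack "x${_}a$size", $word } 0 .. $max },
}, 'none' );    # style 'none': print nothing, just return the table rows

# Each row is an array ref of cells; the first row holds the column headers.
print join( "\t", @$_ ), "\n" for @$results;
```

The rates will differ from machine to machine; only the ratios between rows are meaningful.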


$ perl5.8.8 3.pl
             Rate  arrays   match  unpack  match3  match2 unpack2  substr
arrays   41265/s      --    -39%    -40%    -43%    -44%    -51%    -73%
match    67114/s     63%      --     -2%     -8%     -9%    -20%    -56%
unpack   68337/s     66%      2%      --     -6%     -7%    -19%    -56%
match3   72816/s     76%      8%      7%      --     -1%    -13%    -53%
match2   73350/s     78%      9%      7%      1%      --    -13%    -52%
unpack2  84034/s    104%     25%     23%     15%     15%      --    -45%
substr  153846/s    273%    129%    125%    111%    110%     83%      --
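
To read the table: each cell says how much faster (positive) or slower (negative) the row is than the column. A small sketch of that arithmetic, using three of the rates above; the helper name pct() is made up for this example.

```perl
use strict;
use warnings;

# Rates (iterations per second) copied from the table above.
my %rate = ( arrays => 41265, match => 67114, substr => 153846 );

# Percentage by which $row is faster (or slower) than $col.
sub pct {
    my ( $row, $col ) = @_;
    return sprintf "%.0f%%", 100 * ( $rate{$row} / $rate{$col} - 1 );
}

print "substr vs arrays: ", pct( 'substr', 'arrays' ), "\n";    # 273%
print "arrays vs substr: ", pct( 'arrays', 'substr' ), "\n";    # -73%
print "match  vs arrays: ", pct( 'match',  'arrays' ), "\n";    # 63%
```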



I moved some of the setup up, because I felt like it.

unpack2() has less overhead than unpack(): it skips the leading
characters with "x" instead of extracting them and throwing them away.
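
A short sketch of the two templates side by side, for one fixed offset (3, chosen just for illustration). It also shows a side difference between the formats: "A" strips trailing spaces from the extracted field, "a" returns the bytes as they are.

```perl
use strict;
use warnings;

my $word = "thequickbrown";

# via_unpack style: extract the 3-char prefix AND the chunk,
# then keep only the second field with a list slice.
my $chunk1 = ( unpack "A3A3", $word )[1];

# via_unpack2 style: skip 3 bytes, extract only the chunk.
my ($chunk2) = unpack "x3a3", $word;

print "$chunk1 $chunk2\n";    # qui qui

# "A" strips trailing spaces, "a" does not.
my ($stripped) = unpack "A3", "ab ";
my ($verbatim) = unpack "a3", "ab ";
print length($stripped), " ", length($verbatim), "\n";    # 2 3
```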

substr() mainly wins because it doesn't copy data.
(I assume it just creates an extra SvP on (a part of) it)
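
Before comparing speeds it is worth confirming that the candidates return the same list. A minimal check, written for this illustration with map instead of the push loops used in 3.pl:

```perl
use strict;
use warnings;

my $word = "thequickbrown";
my $size = 3;
my $max  = length($word) - $size;

# Sliding window via substr.
my @by_substr = map { substr $word, $_, $size } 0 .. $max;

# Sliding window via unpack: skip $_ bytes, take $size bytes.
my @by_unpack = map { unpack "x${_}a$size", $word } 0 .. $max;

# Sliding window via a capturing zero-width lookahead: the match itself
# consumes nothing, so /g advances one character at a time.
my @by_match = $word =~ /(?=(.{$size}))/g;

print "@by_substr\n";
print "all three agree\n"
    if "@by_substr" eq "@by_unpack" and "@by_substr" eq "@by_match";
```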



$ cat 3.pl
#!/usr/bin/perl -w
use strict;
$| = 1;

use Data::Dumper;

# Make Data::Dumper pretty
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent   = 1;

# Set maximum depth for Data::Dumper, zero means unlimited
$Data::Dumper::Maxdepth = 0;

sub Testing() { 0 }
use Benchmark qw(:all);

my $word = "thequickbrown";
my $size = 3;

# Build the regex sources once; print them when Testing is enabled.
my $re_match  = sprintf( ".(?=%s)", "." x ($size-1) );
my $re_match2 = sprintf( "(?=(%s))", "." x $size );
Testing and print "$_$/" for $re_match, $re_match2;
my $max = length( $word ) - $size;

if ( Testing ) {
     via_arrays();
     via_substr();
     via_unpack();
     via_match();
     via_match2();
     via_unpack2();
}
else {
   cmpthese( 300_000, {
     'arrays'  => \&via_arrays,
     'substr'  => \&via_substr,
     'unpack'  => \&via_unpack,
     'unpack2' => \&via_unpack2,
     'match'   => \&via_match,
     'match2'  => \&via_match2,
   });
}

sub via_arrays {
     my @list = ();
     my @array = split //, $word;
     push @list, join '', @array[ $_ .. $_ + $size - 1 ] for 0 .. $max;
     print Dumper \@list if Testing;
}

sub via_substr {
     my @list = ();
     push @list, substr( $word, $_, $size ) for 0 .. $max;
     print Dumper \@list if Testing;
}

sub via_unpack {
     my @list = ();
     push @list, (unpack( "A${_}A$size", $word ))[1] for 0 .. $max;
     print Dumper \@list if Testing;
}

sub via_unpack2 {
     my @list = ();
     push @list, unpack( "x${_}a$size", $word ) for  0 .. $max;
     print Dumper \@list if Testing;
}

sub via_match {
     my @list = ();
     push @list, substr( $word, $-[0], $size )
       while $word =~ /$re_match/og;
     print Dumper \@list if Testing;
}

sub via_match2 {
     my @list = ();
     push @list, $_ for $word =~ /$re_match2/og;
     print Dumper \@list if Testing;
}


-- 
Ruud
