perl.perl5.porters | Postings from February 2011

Bug with given() localization, pos() magic, and m//gc state

From: Father Chrysostomos
Date: February 20, 2011 15:45
Subject: Bug with given() localization, pos() magic, and m//gc state
Message ID: 999DBEE7-CFBD-439D-ACE8-A60F18B7D1B2@cpan.org

Tom Christiansen wrote:
> (Tested on 5.13.9.)
> 
> I can't figure out whether I've found a bug, or whether I'm being dense.
> What happens is that if you do not manually reset the pos state when you
> finally hit the end of the string, the next time through you are still at
> the end.  Well, of course.  Right.
> 
> But this happens when it is not the same variable anymore!  That is, it's 
> a new my variable that we're localizing via given().  It seems that the 
> pos() magic isn't getting reset when I feel that it should be.  Only when 
> you manually reset it do things get better, and no combination of my() 
> and local() makes any difference.

It *is* the same variable. given() re-uses the same lexical $_ (its own) each time.

given() doesn’t do any localisation or aliasing. ‘given($foo) { ++$_ }’ is a no-op.
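To illustrate the two points above without relying on given() itself, here is a minimal sketch. It assumes only core Perl; the variable names ($foo, $copy, $n, $pos_in_alias) are made up for the example. It shows that pos() belongs to a particular scalar, that copying the value does not carry pos() along, and that for() aliases $_ to the original variable rather than copying it.

```perl
use strict;
use warnings;

my $foo = "abc";
$foo =~ /\Ga/gc;                  # successful match: pos($foo) is now 1

my $copy = $foo;                  # copies the string only; $copy has no pos() yet

my $pos_in_alias;
for ($foo) {                      # for() aliases $_ to $foo itself
    $pos_in_alias = pos($_);      # so the same match position is visible here
}

my $n = 1;
for ($n) { ++$_ }                 # through the alias, this increments $n

pos($foo) = undef;                # explicit reset of the match position

printf "alias saw pos=%s, copy has pos=%s, n=%d, foo pos after reset=%s\n",
    $pos_in_alias, pos($copy) // "undef", $n, pos($foo) // "undef";
```

If a scalar is reused from one pass to the next and nothing resets its match position, the next m/\G.../gc starts from wherever the previous one stopped, which is the "stuck" behaviour in the quoted output.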

Just change given() to for() and never look back. :-)
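As a rough sketch of what that change might look like, here is a cut-down tokenizer written with for() instead of given(). It is not Tom's program: the tokenize() below is a simplified, hypothetical version that returns "Class:text" strings, handles only three of the character classes, and lumps everything else into "Other".

```perl
use strict;
use warnings;

# Hypothetical, simplified tokenizer: returns a list of "Class:text" strings.
sub tokenize {
    my $string = shift;              # a fresh scalar on every call, so no stale pos()
    my @tokens;
    for ($string) {                  # alias $_ to $string; no copy is made
        while ((pos($_) // 0) < length($_)) {
            if    (/\G(\pL+)/gc) { push @tokens, "Letters:$1" }
            elsif (/\G(\pN+)/gc) { push @tokens, "Numbers:$1" }
            elsif (/\G(\pZ+)/gc) { push @tokens, "Separators:$1" }
            elsif (/\G(.)/gcs)   { push @tokens, "Other:$1" }   # always advances by one
        }
    }
    return @tokens;
}

print join(", ", tokenize($_)), "\n" for 'One room', 'How now';
```

Because $_ is an alias for a variable that is created anew on each call, every string starts with an undefined pos(), and no manual reset is needed at the end.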

As far as I’m concerned, given() is broken by design.

> Without the manual reset of saying 
> undef pos when done (or pos = undef, or pos = 0), you get this behavior:
> 
>     Tokenizing string <One room @ $100/night>
>       pos $string is undef
>       pos $_      is undef
> 	    @??=Letters      <One>
> 	    @ 3=Separators   < >
> 	    @ 4=Letters      <room>
> 	    @ 8=Separators   < >
> 	    @ 9=Punctuation  <@>
> 	    @10=Separators   < >
> 	    @11=Symbols      <$>
> 	    @12=Numbers      <100>
> 	    @15=Punctuation  </>
> 	    @16=Letters      <night>
> 	    @21=Done
> 
>     Tokenizing string <How now... O Brown Cow?!>
>       pos $string is undef
>       pos $_      is undef
> 	    @21=Letters      <w>
> 	    @22=Punctuation  <?!>
> 	    @24=Done
> 
>     Tokenizing string <Quoth the raven, "Nevermore.">
>       pos $string is undef
>       pos $_      is undef
> 	    @24=Letters      <ore>
> 	    @27=Punctuation  <.">
> 	    @29=Done
> 
>     Tokenizing string <That's all, folks!>
>       pos $string is undef
>       pos $_      is undef
> 	    @29=Done
> 
>     Tokenizing string <FINIS>
>       pos $string is undef
>       pos $_      is undef
> 	    @29=Done
> 
> See the way the pos magic gets stuck?  if you uncomment the
> "undef pos" line below, all works fine. Am I doing something
> stupid, or is this a genuine bug?
> 
> 
> Thanks much!
> 
> --tom
> 
> #!/usr/bin/env perl
> #
> # forgiven - demo for/given bug in m/\G/c 
> #   Tom Christiansen <tchrist@perl.com>
> #   Sun Feb 20 14:14:21 MST 2011
> 
> use 5.13.0;
> use strict;
> use autodie;
> use warnings qw[ FATAL all ];
> 
> END { close STDOUT }
> $| = 1;
> 
> #################################################################
> 
> our @Lines = ( 
>     q{One room @ $100/night},
>     q{How now... O Brown Cow?!},
>     q{Quoth the raven, "Nevermore."},
>     q{That's all, folks!},
>     q{FINIS},
> );
> 
> for my $line (@Lines) {
>     tokenize($line);
> } 
> 
> exit;
> 
> #################################################################
> 
> sub tokenize {
>     my $string = shift();
>     my $mask = "%-12s <%s>\n";
>     printf "Tokenizing string <%s>\n", $string;
> 
>     ### These don't help:
>     ###     local $_;
>     ###     my    $_;
> 
>     printf "  pos \$string is %s\n", pos($string) // "undef";
>     printf "  pos \$_      is %s\n", pos          // "undef";
> 
> TOKEN: for (;;) { 
> 	 given ($string) {
> 
> use Devel::Peek; Dump $_;
> 
> 	    printf "\t\@%2s=", pos // "??";
> 
> 	    when ((pos || 0) >= length) {
> 		### XXX: uncomment this next line, and all works; WHY??
> 		### undef pos;   
> 		last TOKEN;
> 	    } 
> 
> 	    printf $mask, Letters      => $1      when  /\G(\pL+)/gc;
> 	    printf $mask, Numbers      => $1      when  /\G(\pN+)/gc;
> 	    printf $mask, Symbols      => $1      when  /\G(\pS+)/gc;
> 	    printf $mask, Punctuation  => $1      when  /\G(\pP+)/gc;
> 	    printf $mask, Separators   => $1      when  /\G(\pZ+)/gc;
> 	    printf $mask, Marks        => $1      when  /\G(\pM+)/gc;
> 	    printf $mask, Other        => $1      when  /\G(\pO+)/gc;
> 
> 	    default {
> 	      die "UNCLASSIFIED: " .
> 		substr($_, pos || 0, (length > 65) ? 65 : length);
> 	    }
>         }  
>     }     
> 
>     say "Done\n";
> } 

