perl.perl5.porters | Postings from February 2011
Bug with given() localization, pos() magic, and m//gc state
From: Father Chrysostomos
Date: February 20, 2011 15:45
Subject: Bug with given() localization, pos() magic, and m//gc state
Message ID: 999DBEE7-CFBD-439D-ACE8-A60F18B7D1B2@cpan.org

Tom Christiansen wrote:
> (Tested on 5.13.9.)
>
> I can't figure out whether I've found a bug, or whether I'm being dense.
> What happens is that if you do not manually reset the pos state when you
> finally hit the end of the string, the next time through you are still at
> the end. Well, of course. Right.
>
> But this happens when it is not the same variable anymore! That is, it's
> a new my variable that we're localizing via given(). It seems that the
> pos() magic isn't getting reset when I feel that it should be. Only when
> you manually reset it do things get better, and no combination of my()
> and local() makes any difference.

It *is* the same variable. given() re-uses the same lexical $_ (its own) each time.

given() doesn’t do any localisation or aliasing. ‘given($foo) { ++$_ }’ is a no-op.

Just change given() to for() and never look back. :-)

As far as I’m concerned, given() is broken by design.
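
A minimal sketch of that suggestion, assuming a hypothetical tokenize() that returns [category, text] pairs instead of printing them, and that uses if/elsif in place of when so it does not depend on the switch feature. Because for() aliases $_ to $string, pos() belongs to the fresh lexical created on each call and starts out undefined every time:

```perl
use strict;
use warnings;

# Hypothetical rewrite of the quoted tokenizer (not the original code):
# for() replaces given(), and tokens are returned rather than printed.
sub tokenize {
    my ($string) = @_;
    my @tokens;

    # $_ is an alias for $string here, so m//gc stores its position
    # on $string itself, which is a new variable on every call.
    for ($string) {
        while ((pos($_) // 0) < length($_)) {
            if    (/\G(\pL+)/gc) { push @tokens, [ Letters     => $1 ] }
            elsif (/\G(\pN+)/gc) { push @tokens, [ Numbers     => $1 ] }
            elsif (/\G(\pS+)/gc) { push @tokens, [ Symbols     => $1 ] }
            elsif (/\G(\pP+)/gc) { push @tokens, [ Punctuation => $1 ] }
            elsif (/\G(\pZ+)/gc) { push @tokens, [ Separators  => $1 ] }
            elsif (/\G(\pM+)/gc) { push @tokens, [ Marks       => $1 ] }
            # Catch-all: consume one character so the loop always advances.
            elsif (/\G(.)/gcs)   { push @tokens, [ Other       => $1 ] }
        }
    }
    return @tokens;
}

for my $line ('One room @ $100/night', 'FINIS') {
    print "Tokenizing string <$line>\n";
    printf "  %-12s <%s>\n", @$_ for tokenize($line);
}
```

With this shape the second string is tokenized from offset 0, with no manual pos reset needed.
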
> Without the manual reset of saying
> undef pos when done (or pos = undef, or pos = 0), you get this behavior:
>
> Tokenizing string <One room @ $100/night>
> pos $string is undef
> pos $_ is undef
> @??=Letters <One>
> @ 3=Separators < >
> @ 4=Letters <room>
> @ 8=Separators < >
> @ 9=Punctuation <@>
> @10=Separators < >
> @11=Symbols <$>
> @12=Numbers <100>
> @15=Punctuation </>
> @16=Letters <night>
> @21=Done
>
> Tokenizing string <How now... O Brown Cow?!>
> pos $string is undef
> pos $_ is undef
> @21=Letters <w>
> @22=Punctuation <?!>
> @24=Done
>
> Tokenizing string <Quoth the raven, "Nevermore.">
> pos $string is undef
> pos $_ is undef
> @24=Letters <ore>
> @27=Punctuation <.">
> @29=Done
>
> Tokenizing string <That's all, folks!>
> pos $string is undef
> pos $_ is undef
> @29=Done
>
> Tokenizing string <FINIS>
> pos $string is undef
> pos $_ is undef
> @29=Done
>
> See the way the pos magic gets stuck? if you uncomment the
> "undef pos" line below, all works fine. Am I doing something
> stupid, or is this a genuine bug?
>
>
> Thanks much!
>
> --tom
>
> #!/usr/bin/env perl
> #
> # forgiven - demo for/given bug in m/\G/c
> # Tom Christiansen <tchrist@perl.com>
> # Sun Feb 20 14:14:21 MST 2011
>
> use 5.13.0;
> use strict;
> use autodie;
> use warnings qw[ FATAL all ];
>
> END { close STDOUT }
> $| = 1;
>
> #################################################################
>
> our @Lines = (
>     q{One room @ $100/night},
>     q{How now... O Brown Cow?!},
>     q{Quoth the raven, "Nevermore."},
>     q{That's all, folks!},
>     q{FINIS},
> );
>
> for my $line (@Lines) {
>     tokenize($line);
> }
>
> exit;
>
> #################################################################
>
> sub tokenize {
>     my $string = shift();
>     my $mask = "%-12s <%s>\n";
>     printf "Tokenizing string <%s>\n", $string;
>
>     ### These don't help:
>     ### local $_;
>     ### my $_;
>
>     printf " pos \$string is %s\n", pos($string) // "undef";
>     printf " pos \$_ is %s\n", pos // "undef";
>
>     TOKEN: for (;;) {
>         given ($string) {
>
>             use Devel::Peek; Dump $_;
>
>             printf "\t\@%2s=", pos // "??";
>
>             when ((pos || 0) >= length) {
>                 ### XXX: uncomment this next line, and all works; WHY??
>                 ### undef pos;
>                 last TOKEN;
>             }
>
>             printf $mask, Letters => $1 when /\G(\pL+)/gc;
>             printf $mask, Numbers => $1 when /\G(\pN+)/gc;
>             printf $mask, Symbols => $1 when /\G(\pS+)/gc;
>             printf $mask, Punctuation => $1 when /\G(\pP+)/gc;
>             printf $mask, Separators => $1 when /\G(\pZ+)/gc;
>             printf $mask, Marks => $1 when /\G(\pM+)/gc;
>             printf $mask, Other => $1 when /\G(\pO+)/gc;
>
>             default {
>                 die "UNCLASSIFIED: " .
>                     substr($_, pos || 0, (length > 65) ? 65 : length);
>             }
>         }
>     }
>
>     say "Done\n";
> }