
Re: List in regexp

From: William Michels via perl6-users
Date: September 4, 2019 01:16
Subject: Re: List in regexp
Message ID: CAA99HCzm1gcdep5LJ8Qtgwjc_94P4ixp5SXxgQotQ6wtPej71w@mail.gmail.com

Hi,

> my $commasep ='abc,<digit>+';
abc,<digit>+

>  say 'abc' ~~ / $( $commasep.split(',') ) /;
Nil
>  say 'abc' ~~ / $( $commasep.split(',')[0] ) /;
「abc」
> say '123' ~~ / $( $commasep.split(',')[1] ) /;
Nil
>  say 'abc' ~~ / $( $commasep.split(',')[0..*] ) /;
Nil

>  say 'abc' ~~ / @( $commasep.split(',') ) /;
「abc」
>  say 'abc' ~~ / @( $commasep.split(',')[0] ) /;
「abc」
>  say '123' ~~ / @( $commasep.split(',')[1] ) /;
Nil
> say 'abc123' ~~ m:g/ <{ $commasep.split(',') }> /
(「abc」 「123」)

This is interesting. When you coerce to an Array with "@(code)", Perl 6
understands that you want a match on any one of the elements (an
alternation). But if you use a List with "$(code)", you can still pull
out each individual sub-pattern by adding a postcircumfix index.
Presumably one could loop through all the sub-patterns generated by a
List, to test them individually.
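
A minimal sketch of such a loop, offered as an illustration rather than
a tested recipe. It assumes the documented "<$variable>" form, which
interpolates a string as regex source (that form is not used in the
transcript above); the names $target, $pattern and @matched are made up
for this example.

```raku
my $commasep = 'abc,<digit>+';
my $target   = 'abc123';   # hypothetical input, not from the thread

my @matched;
for $commasep.split(',') -> $pattern {
    # <$pattern> treats the string as regex source, not as literal text
    my $m = $target ~~ / <$pattern> /;
    @matched.push: $m ?? $m.Str !! '(no match)';
    say $pattern, ' -> ', @matched.tail;
}
```

Under these assumptions I would expect each piece to match its own part
of 'abc123', including the "<digit>+" piece.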

As Yary has noted, the problem in the examples above is that the
"<digit>+" in string "commasep" does not get interpreted as a pattern.
(The last example shows a List inside a "<{code}>" block getting
matched correctly, although as Simon's code suggests, interpolation may
be problematic for characters that require escaping.)
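
To make the literal-versus-pattern distinction concrete, here is a small
sketch with a single string. It relies on the documented behaviour that
"$variable" interpolates literally while "<$variable>" interpolates as
regex source; the variable name $pattern is only illustrative.

```raku
my $pattern = '<digit>+';

# Literal interpolation: looks for the seven characters "<digit>+"
say '123' ~~ / $pattern /;

# Regex interpolation: the string is compiled as a pattern
say '123' ~~ / <$pattern> /;

# The literal form does match when the target contains that exact text
say '<digit>+' ~~ / $pattern /;
```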

> my $sepcomma ='abc,123';
abc,123
> say 'abc' ~~ / $( $sepcomma.split(',')[0] ) /;
「abc」
> say '123' ~~ / $( $sepcomma.split(',')[1] ) /;
「123」
> say 'abc123' ~~ / $( $sepcomma.split(',') ) /;
Nil
> say 'abc123' ~~ m:g/ $( $sepcomma.split(',') ) /;
()

> say 'abc' ~~ m:g/ @( $sepcomma.split(',') ) /;
(「abc」)
> say '123' ~~ m:g/ @( $sepcomma.split(',') ) /;
(「123」)
> say 'abc123' ~~ m:g/ @( $sepcomma.split(',') ) /;
(「abc」 「123」)
>  say 'abc123' ~~ m:g/ <{ $sepcomma.split(',') }> /
(「abc」 「123」)

Regardless, sticking with the List-parentheses "$(code)" syntax seems
to be problematic: I can't pull out a regex match to "<digit>+", no
matter how I alter global matching or indices. String "sepcomma" with a
literal "123" gets recognized, while string "commasep" with a
"<digit>+" does not get interpreted as a pattern.

With List-parentheses "$(code)" syntax inside a regex, List elements
appear to be interpreted literally. And at this point, it's not clear
to me that Array-parentheses "@(code)" syntax behaves any differently.
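
For comparison, a sketch using an array variable instead of a code
block. It assumes the documented pair "@variable" (alternation of
literal elements) and "<@variable>" (alternation of elements compiled
as regexes); @parts is a name introduced here for illustration only.

```raku
my $commasep = 'abc,<digit>+';
my @parts    = $commasep.split(',');

# Elements taken literally: only the text "abc" can match in 'abc123'
say 'abc123' ~~ m:g/ @parts /;

# Elements compiled as regexes: "<digit>+" now acts as a pattern
say 'abc123' ~~ m:g/ <@parts> /;
```

If that holds, the angle-bracket forms are the ones to reach for when
the list elements are meant as patterns.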

HTH, Bill.


On Tue, Sep 3, 2019 at 5:10 AM The Sidhekin <sidhekin@gmail.com> wrote:
>
>
>
> On Tue, Sep 3, 2019 at 8:37 AM yary <not.com@gmail.com> wrote:
>>>
>>> I see.  And that's discussed here (had to really look for it):
>>> https://docs.perl6.org/language/regexes#Quoted_lists_are_LTM_matches
>>> At first I was looking further down in the "Regex interpolation"
>>> section, where it's also touched on, though I kept missing it:
>>> > When an array variable is interpolated into a regex, the regex engine handles it like a | alternative of the regex elements (see the documentation on embedded lists, above).
>>
>>
>> Which brings up another area for improvement- maybe. A regexp interprets a list as alternation only when it's a literal quoted list < like this >, or embedded in the regexp as a @-sigilled variable.
>>
>> That set up an expectation for me- "oh a list in a regexp becomes an alternation of the list elements" - but interpolation meets that expectation inconsistently.
>>
>> The interpolation doc section states, for both forms of code interpolation $(code) and <{code}>, "Runs Perl 6 code inside the regex, and interpolates the stringified return value..."
>
>
>   I missed this the first time (or I would have suggested a version without the variable), but I think what you want is interpolation by way of @(code).
>
>   eirik@greencat[14:04:06]~$ perl6
> You may want to `zef install Readline` or `zef install Linenoise` or use rlwrap for a line editor
>
> To exit type 'exit' or '^D'
> > my $comma-separated='abc,<digit>+';
> abc,<digit>+
> > say 'abc' ~~ / @( $comma-separated.split(',') ) /;
> 「abc」
> > eirik@greencat[14:05:48]~$
>
>   I'd argue it's better this way, since syntax has me expecting $(...) to interpolate a scalar (even if produced from a list), and @(…) to interpolate a list.
>
>
> Eirik
