Re: pairs of separators from a string
From: William Michels via perl6-users
Date: August 22, 2021 07:37
Subject: Re: pairs of separators from a string
Message ID: CAA99HCyiOccKBptoF+qgaY2AoAuZ9bXePD-1tjLq8_UECoxqYQ@mail.gmail.com
Hi Marc (and yary)!
I'll give this a shot, focusing on bracket pairs. It's not clear from your
question whether the bracket characters in your string follow any inherent
order, or whether some might be nested within each other. Your example
starts with a lone ampersand ("&"), but other strings may not be so simple.
Here's a regex attempt (using the REPL), detecting 3 bracket pairs in your
example string:
> my $string = q!&""''(){}[]!
&""''(){}[]
> $string.comb(/ ( <:Ps> ~ <:Pe> .?) /, :global).raku.say
("()", "\{}", "[]").Seq
> $string.comb(/ ( <:Ps> ~ <:Pe> .*?) /, :global).raku.say
("()", "\{}", "[]").Seq
>
> my $string2 = q!&"a"'b'(c){d}[e]!
&"a"'b'(c){d}[e]
> $string2.comb(/ ( <:Ps> ~ <:Pe> .+?) /, :global).raku.say
("(c)", "\{d}", "[e]").Seq
>
> #try original $string with comb().join().comb() strategy:
Nil
> $string.comb.join("_").comb(/ ( <:Ps> ~ <:Pe> _ ) /, :global).raku.say
("(_)", "\{_}", "[_]").Seq
Here I'm taking advantage of:
1). the `tilde` regex construct (
https://docs.raku.org/language/regexes#Tilde_for_nesting_structures ), and
2). the Unicode 'starting-Punctuation' <:Ps> and 'ending-Punctuation' <:Pe>
character classes. See: https://stackoverflow.com/a/13535289 . That answer
is at least 8 years old, though, so maybe our Unicode experts will chime in
with a more current answer.
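To see why only three pairs are detected, it can help to print the Unicode General_Category of each character in the example string. The snippet below is a minimal sketch, not part of the REPL session above; it relies on `.uniprop`, which returns the General_Category when called without an argument.

```raku
# Print the General_Category of every distinct character in the
# example string. <:Ps> matches category Ps, <:Pe> matches category Pe.
for q!&""''(){}[]!.comb.unique -> $ch {
    say "$ch => {$ch.uniprop}";
}
```

The ampersand and both straight quote characters report `Po` (other punctuation), so neither <:Ps> nor <:Pe> can match them; only `(`, `{`, `[` report `Ps` and only `)`, `}`, `]` report `Pe`.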
You can also play around with nested brackets with satisfying results
(nesting 3-deep detected below):
> my $string4a = q![_]!
[_]
> $string4a.comb(/ [<:Ps> ~ <:Pe> .*? ]+ /, :global).raku.say
("[_]",).Seq
> my $string4b = q!([_])!
([_])
> $string4b.comb(/ [<:Ps> ~ <:Pe> _ ]+ /, :global).raku.say
("[_]",).Seq
> $string4b.comb(/ [<:Ps>**2 ~ <:Pe>**2 _ ] /, :global).raku.say
("([_])",).Seq
> my $string4c = q!{([_])}!
{([_])}
> $string4c.comb(/ [<:Ps>**3 ~ <:Pe>**3 _ ] /, :global).raku.say
("\{([_])}",).Seq
>
> $*VM
moar (2021.06)
>
Caveat: the simple regexes above do not check that the two brackets belong
together, so in malformed nested-bracket strings a <:Ps> character can be
paired with an incorrect <:Pe> character (e.g. an opening square bracket
paired with a closing curly brace). But that certainly doesn't seem to be an
insurmountable problem, given the Raku tools available.
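One possible way around that caveat is to drop the regex and track the expected closers on a stack. This is only a sketch under stated assumptions: `brackets-balanced` is a hypothetical helper name, and it knows about just the three ASCII pairs from the example, whereas <:Ps> and <:Pe> cover many more characters.

```raku
# Hypothetical helper (not from the thread): returns True only when every
# opening bracket is closed by its own partner, in the right order.
sub brackets-balanced(Str $s --> Bool) {
    # Assumption: only these three ASCII pairs matter.
    my %closer-for = '(' => ')', '[' => ']', '{' => '}';
    my $closers    = set %closer-for.values;
    my @expected;                        # stack of closers still owed

    for $s.comb -> $ch {
        if %closer-for{$ch}:exists {
            # Opening bracket: remember which closer must come later.
            @expected.push: %closer-for{$ch};
        }
        elsif $ch ∈ $closers {
            # Closing bracket: it must be the most recently owed one.
            return False unless @expected && @expected.pop eq $ch;
        }
        # Any other character (quotes, "&", "_") is ignored.
    }
    return @expected.elems == 0;         # nothing may be left open
}

say brackets-balanced('{([_])}');   # True
say brackets-balanced('[_}');       # False
```

Characters that are not brackets are skipped, so the original string with its leading ampersand and quote pairs also passes the check.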
Best Regards,
Bill.
On Sat, Aug 21, 2021 at 6:24 PM yary <not.com@gmail.com> wrote:
> This reminds me of "the comma rule" which I haven't quite internalized
> yet. This gets a little closer
>
> map { .[0,2…∞], .[1,3…∞] }, ("012345".comb,)
> (((0 2 4) (1 3 5)))
>
> and my instinct is that "map" is adding a layer you don't need or want for
> this issue, should just be sending the results of comb to a block. But I
> can't quite get the syntax right (and docs.raku.org seems down at the
> moment)
>
> I sent a variation of this as a potential question to Perl Weekly
> Challenge, maybe it will get a bunch of answers in a few weeks!
>
> -y
>
>
> On Sat, Aug 21, 2021 at 3:05 AM Marc Chantreux <eiro@phear.org> wrote:
>
>> hello,
>>
>> i would like to get the list of opening (α) and closing
>> (ω) separators from this string:
>>
>> &""''(){}[]
>>
>> too many years of perl made me think about this solution
>> or something alike but it didn't work.
>>
>> my (\α,\ω) =| map
>> { .[0,2…∞], .[1,3…∞] },
>> q&""''(){}[]&.comb;
>>
>> fixing this is important to me because it illustrates how bad i am at
>> comprehending how raku flattens things.
>>
>> regards,
>> marc
>>
>>
>>