Re: pairs of separators from a string

From: William Michels via perl6-users
Date: August 22, 2021 07:37
Subject: Re: pairs of separators from a string
Message ID: CAA99HCyiOccKBptoF+qgaY2AoAuZ9bXePD-1tjLq8_UECoxqYQ@mail.gmail.com

Hi Marc (and yary)!

I'll give this a shot, focusing on bracket pairs. It's not clear from your
question whether the bracket characters in your string follow any inherent
order, or whether some might be nested within each other. You show a lone
ampersand ("&") at the beginning of your example, but other strings may not
be so simple.

Here's a regex attempt (using the REPL), detecting 3 bracket pairs in your
example string:

> my $string = q!&""''(){}[]!
&""''(){}[]
> $string.comb(/ ( <:Ps> ~ <:Pe> .?) /, :global).raku.say
("()", "\{}", "[]").Seq
> $string.comb(/ ( <:Ps> ~ <:Pe> .*?) /, :global).raku.say
("()", "\{}", "[]").Seq
>
> my $string2 = q!&"a"'b'(c){d}[e]!
&"a"'b'(c){d}[e]
> $string2.comb(/ ( <:Ps> ~ <:Pe> .+?) /, :global).raku.say
("(c)", "\{d}", "[e]").Seq
>
> #try original $string with comb().join().comb() strategy:
Nil
> $string.comb.join("_").comb(/ ( <:Ps> ~ <:Pe> _ ) /, :global).raku.say
("(_)", "\{_}", "[_]").Seq

Here I'm taking advantage of:

1) the tilde (`~`) regex construct (
https://docs.raku.org/language/regexes#Tilde_for_nesting_structures ), and

2) the Unicode 'open punctuation' <:Ps> and 'close punctuation' <:Pe>
character classes. See https://stackoverflow.com/a/13535289; however, that
answer is at least 8 years old, so maybe our Unicode experts will chime in
with a more current answer.
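
To make the two points above concrete, here is a minimal sketch (not part of the REPL session above; the variable names $with-tilde and $without-tilde are only illustrative). It prints the Unicode General_Category of a few delimiter characters, then matches the same string with and without the tilde. I'm assuming a reasonably recent Rakudo.

```raku
# General_Category of a few delimiter characters, via Str.uniprop.
for '(', ')', '[', ']', '{', '}', '"', "'", '<' -> $c {
    say $c, ' ', $c.uniprop;
}
# ( [ { report Ps and ) ] } report Pe; " and ' report Po, and < reports Sm,
# so quotes and ASCII angle brackets are not matched by <:Ps> or <:Pe>.

# With the tilde, the closer is written right after the opener and the
# inner pattern comes last. For this input both forms match the same text.
my $with-tilde    = / <:Ps> ~ <:Pe> \w* /;
my $without-tilde = / <:Ps> \w* <:Pe> /;
say ('(abc)' ~~ $with-tilde).Str;      # (abc)
say ('(abc)' ~~ $without-tilde).Str;   # (abc)
```

This is also why the double and single quotes in the example string never show up in the results: they are punctuation, but not in the Ps or Pe categories.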

You can also play around with nested brackets with satisfying results
(nesting 3-deep detected below):

> my $string4a = q![_]!
[_]
> $string4a.comb(/  [<:Ps> ~ <:Pe>  .*? ]+  /, :global).raku.say
("[_]",).Seq
> my $string4b = q!([_])!
([_])
> $string4b.comb(/  [<:Ps> ~ <:Pe>  _ ]+  /, :global).raku.say
("[_]",).Seq
> $string4b.comb(/  [<:Ps>**2 ~ <:Pe>**2  _ ]  /, :global).raku.say
("([_])",).Seq
> my $string4c = q!{([_])}!
{([_])}
> $string4c.comb(/  [<:Ps>**3 ~ <:Pe>**3  _ ]  /, :global).raku.say
("\{([_])}",).Seq
>
> $*VM
moar (2021.06)
>
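
Since the original question asked for separate lists of openers and closers, here is one possible way to get there from the matched pairs. It's only a sketch: the names @pairs, @openers and @closers are mine, and it assumes each matched pair starts with its opener and ends with its closer (true for the regex used here).

```raku
my $string = q!&""''(){}[]!;

# Each element is one bracket pair, e.g. "()".
my @pairs = $string.comb(/ <:Ps> ~ <:Pe> .*? /);

# First character of each pair is the opener, last is the closer.
my @openers = @pairs.map(*.comb.head);
my @closers = @pairs.map(*.comb.tail);

say @openers.join(' ');   # ( { [
say @closers.join(' ');   # ) } ]
```

Note that the quote characters are left out, because this approach only sees characters in the Ps and Pe categories.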

Caveat: the simple regexes above only check that some <:Ps> character is
closed by some <:Pe> character, so in a malformed nested-bracket string an
opener can be paired with the wrong closer (e.g. an opening square bracket
paired with a closing curly brace). That doesn't seem to be an
insurmountable problem, given the Raku tools available.
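
As one possible way around that caveat, here is a rough sketch of a stack-based check. It swaps the regex approach for an explicit lookup table, so it is a different technique from the one-liners above. The helper name balanced and the hash %closer are hypothetical, and only the three ASCII bracket pairs are listed; other Ps/Pe pairs would have to be added by hand.

```raku
# Hypothetical lookup table: which closer belongs to which opener.
my %closer = '(' => ')', '[' => ']', '{' => '}';

sub balanced(Str $s --> Bool) {
    my @stack;
    for $s.comb -> $c {
        if %closer{$c}:exists {
            # Remember which closer this opener expects.
            @stack.push: %closer{$c};
        }
        elsif $c ~~ / <:Pe> / {
            # A closer must match the most recently opened bracket.
            return False unless @stack && @stack.pop eq $c;
        }
    }
    # Anything left on the stack was opened but never closed.
    return @stack.elems == 0;
}

say balanced('{([_])}');   # True
say balanced('([_)]');     # False
say balanced('[_}');       # False
```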

Best Regards,

Bill.


On Sat, Aug 21, 2021 at 6:24 PM yary <not.com@gmail.com> wrote:

> This reminds me of "the comma rule" which I haven't quite internalized
> yet. This gets a little closer
>
> map { .[0,2…∞], .[1,3…∞] }, ("012345".comb,)
> (((0 2 4) (1 3 5)))
>
> and my instinct is that "map" is adding a layer you don't need or want for
> this issue, should just be sending the results of comb to a block. But I
> can't quite get the syntax right (and docs.raku.org seems down at the
> moment)
>
> I sent a variation of this as a potential question to Perl Weekly
> Challenge, maybe it will get a bunch of answers in a few weeks!
>
> -y
>
>
> On Sat, Aug 21, 2021 at 3:05 AM Marc Chantreux <eiro@phear.org> wrote:
>
>> hello,
>>
>> i would like to get the list of opening (α) and closing
>> (ω) separators from this string:
>>
>>     &""''(){}[]
>>
>> too many years of perl made me think about this solution
>> or something alike but it didn't work.
>>
>> my (\α,\ω) =| map
>>     { .[0,2…∞], .[1,3…∞] },
>>     q&""''(){}[]&.comb;
>>
>> fixing this is important to me because it illustrate how bad i am to
>> comprehend how raku flatten things.
>>
>> regards,
>> marc
>>
>>
>>
