perl.perl6.users | Postings from May 2021
Re: how to form rules for commutative triads?
From: Bruce Gray
Date: May 30, 2021 02:51
Subject: Re: how to form rules for commutative triads?
Message ID: 466787E5-D2A9-46E8-953D-2C09DDFF905C@acm.org
> On May 29, 2021, at 5:57 PM, rir <rirans@comcast.net> wrote:
>
>
> Given:
> rule cmp_expression {
> | <str_const> <cmp_op> <identifier>
> | <num_const> <cmp_op> <identifier>
> | ...
> }
>
> What is a good, concise way to express that all the alternatives are
> commutative?
I am not at all clear on what you are asking, so if none of my ideas are helpful, please consider adding more detail.
1. I don't know of any regex construct that automatically makes this:
/foo bar baz/
mean this:
/foo bar baz | baz bar foo/
So we do not have a convenient shortcut like:
    rule cmp_expression {
        | COMMUTATIVE( <str_const> <cmp_op> <identifier> )
        | COMMUTATIVE( <num_const> <cmp_op> <identifier> )
        | COMMUTATIVE( ... )
    }
2. If the order of the operands does not matter (i.e. "are commutative", as you said),
*and* the whole set of left-operands is compatible
with the whole set of right-operands,
*and* the two sets are disjoint
(i.e. if AopB is valid then so is BopA,
but that doesn't mean AopA is valid, nor BopB),
then I would try creating rules or tokens to extract those two sets,
leaving `cmp_expression` with only two branches of alternation:
    rule cmp_operands_A {
        | <str_const>
        | <num_const>
        | ...
    }
    rule cmp_operands_B {
        | <identifier>
        | ...
    }
    rule cmp_expression {
          <cmp_operands_A> <cmp_op> <cmp_operands_B>
        | <cmp_operands_B> <cmp_op> <cmp_operands_A>
    }
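As a rough, self-contained sketch of approach 2: the grammar name `CmpTwoSets` and every token body below (`str_const`, `num_const`, `identifier`, `cmp_op`) are assumptions made up for illustration, since the thread never shows the real definitions. Only the shape of `cmp_expression` comes from the text above.

```raku
# Sketch only: the grammar name and all token bodies are hypothetical,
# not taken from the original poster's grammar.
grammar CmpTwoSets {
    rule  TOP            { <cmp_expression> }
    rule  cmp_expression {
        | <cmp_operands_A> <cmp_op> <cmp_operands_B>
        | <cmp_operands_B> <cmp_op> <cmp_operands_A>
    }
    token cmp_operands_A { <str_const> | <num_const> }   # constants only
    token cmp_operands_B { <identifier> }                # identifiers only
    token cmp_op         { '==' | '!=' | '<=' | '>=' | '<' | '>' }
    token str_const      { '"' <-["]>* '"' }             # no escape handling
    token num_const      { \d+ }                         # unsigned integers only
    token identifier     { <.alpha> \w* }
}

# Try both operand orders, plus two inputs that pair a set with itself.
for 'x == 42', '42 == x', '"abc" != name', 'x == y', '1 < 2' -> $src {
    say "$src => ", CmpTwoSets.parse($src) ?? 'match' !! 'no match';
}
```

With these assumed tokens, the first three inputs should parse in either operand order, while `x == y` and `1 < 2` should be rejected, because neither set may be compared with itself.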
3. If <cmp_operands_A> and <cmp_operands_B> are actually the exact same set,
then the "Modified quantifier" (which I think of as "Is Separated By")
will allow very concise code (after extracting the operands).
https://docs.raku.org/language/regexes#Modified_quantifier:_%,_%%
    rule cmp_operands { # ??? token instead of rule ???
        | <str_const>
        | <num_const>
        | <identifier>
        | ...
    }
    rule cmp_expression {
        <cmp_operands> ** 2 % <cmp_op>
    }
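A minimal sketch of approach 3, again with assumed token bodies: the grammar name `CmpOneSet` and the definitions of `str_const`, `num_const`, `identifier`, and `cmp_op` are hypothetical stand-ins. `cmp_operands` is written as a `token` here, which is one possible answer to the question in the comment above, not a recommendation from the thread.

```raku
# Sketch only: the grammar name and all token bodies are hypothetical.
grammar CmpOneSet {
    rule  TOP            { <cmp_expression> }
    # Exactly two operands, separated by one comparison operator.
    rule  cmp_expression { <cmp_operands> ** 2 % <cmp_op> }
    token cmp_operands   { <str_const> | <num_const> | <identifier> }
    token cmp_op         { '==' | '!=' | '<=' | '>=' | '<' | '>' }
    token str_const      { '"' <-["]>* '"' }   # no escape handling
    token num_const      { \d+ }               # unsigned integers only
    token identifier     { <.alpha> \w* }
}

for 'x == 42', '42 == x', 'x == y', 'x ==' -> $src {
    say "$src => ", CmpOneSet.parse($src) ?? 'match' !! 'no match';
}
```

Because there is only one operand set, this form also accepts same-kind pairs such as `x == y`, which is exactly the difference from approach 2. The last input is there to show that a missing second operand should fail.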
4. If none of that compresses the regex (maybe because not every <A> forms a valid pairing with *every* <B>),
I would make each BopA variant live on the same line as its AopB cousin:
    rule cmp_expression {
        | <str_const> <cmp_op> <identifier> | <identifier> <cmp_op> <str_const>
        | <num_const> <cmp_op> <identifier> | <identifier> <cmp_op> <num_const>
        | ...
    }
> I imagine that generally this is a useless question, which is
> avoided by:
>
> rule cmp_expression {
> <value_expression> <cmp_op> <value_expression>
> }
>
> but here many tokens under value_expression exist but are not well
> defined, nor known by me.
This paragraph confuses me.
I read it as a less concise version of my #3 above,
but when you say "many tokens under value_expression…by me",
it sounds like you can't/won't pursue this shortened form of the regex
because you don't actually *know* the long form of the regex yet.
If so, since #1 is not available, I would do #4 until the full details
of all the operands become clear, then try to refactor to #2 or #3.
> rir
--
Hope this helps,
Bruce Gray (Util of PerlMonks)