Re: comprehensive list of perl6 rule tokens
From: Jeff 'japhy' Pinyan
Date: May 26, 2005 16:09
Subject: Re: comprehensive list of perl6 rule tokens
Message ID: Pine.LNX.4.61.0505261855070.19144@perlmonk.org
On May 26, Patrick R. Michaud said:
> On Tue, May 24, 2005 at 08:25:03PM -0400, Jeff 'japhy' Pinyan wrote:
>> I have looked through the latest
>> revisions of Apo05 and Syn05 (from Dec 2004) and come up with the
>> following list:
>>
>> http://japhy.perlmonk.org/perl6/rules.txt
>
> I'll review the list below, but it's also worthwhile to read
>
> http://www.nntp.perl.org/group/perl.perl6.language/21120
>
> which is Larry's latest missive on character classes, and
>
> http://www.nntp.perl.org/group/perl.perl6.language/20985
>
> which describes the capturing semantics (but be sure to note
> the lengthy threads that follow concerning changes in the
> indexing from $1, $2, ... to $0, $1, ... ).
I'll check them out. Right now, I'm really only concerned with syntax
rather than implementation. Perl6::Rule::Parser will only parse the rule
into a tree structure.
> & a&b N conjunction
> &var N subroutine
>
> I'm not sure that "&var" means subroutine anymore. A05 does mention
Ok. If it goes away, I'm fine with that.
> x**{n..m} N previous atom n..m times
>
> Keeping in mind that the "n..m" can actually be any sort of closure
Yeah, I know.
> ( (x) Y capture 'x'
> ) Y must match opening '('
>
> It may be worth noting that parens not only capture, they also
> introduce a new scope for any nested subpattern and subrule captures.
Ok. I don't think that'll affect me right now.
> :ignorecase N case insensitivity :i
> :global N match globally :g
> :continue N start scanning after previous match :c
> ...etc
>
> I'm not sure these are "tokens" in the sense of "single unit of purpose"
> in your original message. I think these are all adverbs, and the "token"
> is just the initial C<:> at the beginning of a group.
I understand, but that set is particularly important to me, because as far
as I am concerned, the rule
/abc/
is the object Perl6::Rule::Parser::exact->new('abc'), whereas the rule
/:i abc/
is the object Perl6::Rule::Parser::exactf->new('abc') -- this is using
node terminology from Perl 5, where "exactf" means "exact with case
folding".
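A minimal Perl 5 sketch of that mapping follows. Only the class names "exact" and "exactf" come from the text above; the constructor body, the inheritance between the two classes, and the make_literal helper are assumptions made for illustration, not the actual Perl6::Rule::Parser code.

```perl
use strict;
use warnings;

# Hypothetical node classes; only the names 'exact' and 'exactf' are
# taken from the message.
package Perl6::Rule::Parser::exact;
sub new {
    my ($class, $text) = @_;
    return bless { text => $text }, $class;
}

package Perl6::Rule::Parser::exactf;
our @ISA = ('Perl6::Rule::Parser::exact');   # assumed: exactf reuses new()

package main;

# Hypothetical helper: pick the node class from the :i / :ignorecase flag.
sub make_literal {
    my ($text, %mod) = @_;
    my $class = $mod{ignorecase}
        ? 'Perl6::Rule::Parser::exactf'
        : 'Perl6::Rule::Parser::exact';
    return $class->new($text);
}

my $plain  = make_literal('abc');                   # /abc/
my $folded = make_literal('abc', ignorecase => 1);  # /:i abc/
```

The point of the sketch is that the modifier changes which node class is built, so the tree itself records the case-folding decision.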
> :keepall N all rules and invoked rules remember everything
>
> That's now ":parsetree" according to Damian's proposed capture rules.
Ok. I haven't seen those yet.
> <commit> N backtracking fails completely
> <cut> N remove what matched up to this point from the string
> <after P> N we must be after the pattern P
> <!after P> N we must NOT be after the pattern P
> <before P> N we must be before the pattern P
> <!before P> N we must NOT be before the pattern P
>
> As with ':words', etc., I'm not sure that these qualify as "tokens"
> when parsing the regex -- the tokens are actually "<" or "<!" and
I understand. Luckily this new syntax will enable me to abstract things
in the parser.
  my $obj = $S->object(assertion => $name, $neg);
  # where $name is the part after the < or <!
  # and $neg is a boolean denoting the presence of !
Since there are no longer different prefixes for every type of assertion, I
no longer need to make a specific class of object for each one.
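As a rough illustration of how the name and the negation flag could be pulled out of an assertion, here is a sketch in Perl 5. The parse_assertion helper and the hash it returns are hypothetical. The regex is a simplification that assumes the whole assertion has already been isolated as one string.

```perl
use strict;
use warnings;

# Hypothetical helper: split a complete <...> assertion into a negation
# flag, the assertion name, and whatever text follows the name.
sub parse_assertion {
    my ($src) = @_;
    return unless $src =~ /\A<(!?)(\w+)\s*(.*)>\z/s;
    return { neg => ($1 ? 1 : 0), name => $2, arg => $3 };
}

my $a1 = parse_assertion('<!before P>');   # neg 1, name 'before', arg 'P'
my $a2 = parse_assertion('<after P>');     # neg 0, name 'after',  arg 'P'
my $a3 = parse_assertion('<commit>');      # neg 0, name 'commit', arg ''
```

The resulting name and flag are the two values that the object() call above expects as $name and $neg.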
> <?ws> N match whitespace by :w rules
> <?sp> N match a space character (chr 32 ONLY)
>
> Here the token is "<?", indicating a non-capturing subrule.
Right.
> <$rule> N indirect rule
> <::$rulename> N indirect symbolic rule
> <@rules> N like '@rules'
> <%rules> N like '%rules'
> <{ code }> N code produces a rule
> <&foo()> N subroutine returns rule
> <( code )> N code must return true or backtracking ensues
>
> Here the leading tokens are actually "<$", "<::$", "<@", "<%", "<{", "<&",
> and "<(", and I suspect we have "<?$", "<?::$", "<?@", and "<!$", "<!::$",
> "<!@", etc. counterparts.
Per your second message, <!@rules> would mean <!before <@rules>>, right?
> Of course, one could claim that these are
> really separated as in "<", "?", and "$" tokens, but PGE's parser currently
> treats them as a unit to make it easier to jump directly into the correct
> handler for what follows.
Yes, so does mine. :)
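For what it's worth, treating the leading tokens as units can be sketched with a longest-match dispatch table in Perl 5. The token strings are the ones quoted in this thread; the handler labels and the leading_token helper are hypothetical placeholders, not names from PGE or Perl6::Rule::Parser.

```perl
use strict;
use warnings;

# Leading tokens from the quoted list, mapped to made-up handler labels.
my %handler = (
    '<$'   => 'indirect_rule',
    '<::$' => 'symbolic_rule',
    '<@'   => 'array_rules',
    '<%'   => 'hash_rules',
    '<{'   => 'code_rule',
    '<&'   => 'sub_rule',
    '<('   => 'code_assertion',
    '<?'   => 'noncapturing_subrule',
    '<!'   => 'negated_assertion',
    '<['   => 'char_class',
    '<+'   => 'char_class_add',
    '<-'   => 'char_class_subtract',
    '<'    => 'subrule',
);

# Longer tokens go first in the alternation so '<::$' wins over plain '<'.
my @tokens  = sort { length($b) <=> length($a) or $a cmp $b } keys %handler;
my $lead_re = join '|', map { quotemeta($_) } @tokens;

# Return the leading token and its handler label, or nothing on no match.
sub leading_token {
    my ($src) = @_;
    return unless $src =~ /\A($lead_re)/;
    return ($1, $handler{$1});
}
```

In this sketch a combined form such as <!@rules> stops at '<!', and the negated-assertion handler would then have to look at the '@' itself. Adding '<!@' and friends to the table would be the other way to do it.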
> <[a-z]> N character class
> <+alpha> N character class
> <-[a-z]> N complemented character class
>
> The tokens for character class manipulation are currently "<[", "<+",
> and "<-", although that's not officially documented in A05 or S05 yet.
> Also, ranges are now <[a..z]> -- an unescaped hyphen appearing in an
> enumerated character class generates a warning.
>
> <+\w-[0-9]> N character class "arithmetic"
>
> I'm not sure that it's been decided/documented that \w, \s, etc.
> can appear in character class arithmetic (although it seems like it
> should).
The new character class idiom is going to confuse me for a while. I'll
have to read the above URL in which Larry sheds light.
> <prop:X> N Unicode property match
> <-prop:X> N complemented Unicode property match
>
> Here "prop" is just a subrule (or character class) similar to
> <+alpha>, <+digit>, etc. Also, note that <prop:X> is a capturing
> subrule, while <+prop:X> would be a character class match (and presumably
> not capture).
I think I'll wait to handle Unicode properties until a syntax has been
agreed upon... <prop:X>, <X>, <prop(X)>, etc.
> <rule> N match rule (and capture to $rule)
> <?rule> N match rule (don't capture)
> <<rule>> N match rule (don't capture)
>
> Do we still have the <<rule>> syntax, or was that abandoned in
> favor of <?rule> ? (I know there are still some remnants of <<...>>
> in S05 and A05, but I'm not sure they're intentional.)
I saw <<...>> in A/S 05, but if they're accidental, then I just won't deal
with it.
And, what's the deal with <RULE> capturing? Does that mean I have to
write <?digit> everywhere instead of <digit> unless I want a capture? Eh,
I guess \d exists for that reason...
>> Thanks for your help. Unless you're difficult.
>
> "You're welcome" unless $Pm ~~ /<?difficult>/;
Difficulty nonexistent.
--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
http://japhy.perlmonk.org/ % have long ago been overpaid?
http://www.perlmonks.org/ % -- Meister Eckhart