perl.perl6.compiler | Postings from January 2005

Re: Let the hacking commence!

Luke Blanshard
January 8, 2005 02:56
Luke Palmer wrote:
> This list is for people interested in building the Perl 6 compiler.  Now
> you have your first real task!  
> We have to make a formal grammar for Perl 6.  Perl 6 is a huge language,
> so the task seems better done incrementally by the community...
> Send patches to this list.

OK, I'll bite.  In contrast to Luke's 50-thousand-foot level, I'm
diving down into the goriest of details.  At the end of this message
is a rule for whitespace within Perl code, and supporting rules for
comments and pod.

I'm not posting this as a diff, because I have the faint suspicion
that others might have been hacking on this file offline.  But I
gather these rules should go in the "TOKENS" section.

[By the way, shouldn't this grammar be called "Perl" rather than
"Perl6::Grammar"?  Also, is this file now available in some repository

I'd like reviewers to pay special attention to the pod stuff.  It's
not clear to me what the precise rules are or should be for blank
lines preceding pod commands.  I got from S02 the idea that we should
allow standalone =begin/=end sections (and that they should nest).
But does the =end line have to be preceded by a blank line?  As far as
I can tell, the =begin line does not.  In the interest of symmetry, I
have written the rules to not require a blank line before the closing
=end either.  Even though this appears to violate the usual rules for

(Another guy called) Luke


# Whitespace definition for Perl code.
rule ws() {
       # Case 1: Unicode space characters, comments, or POD blocks, or
       # any combination thereof.
     [ \s | «comment» | «pod» ]+

       # Case 2: We're looking at a non-word-constituent or EOF,
       # meaning zero-width counts as whitespace.
   | <before \W> | $

       # Case 3: We must be looking at a word constituent.  We match
       # whitespace at BOF or after a non-word-constituent.
   | ^ | <after \W>
}

# Comment definition for Perl code.
rule comment() {
       # A hash ("#"), then everything through the next newline or EOF.
     <'#'> .*? [ \n | $ ]
}

# A POD block, as extended for P6.  This is a =begin/=end pair, a =for
# paragraph, or a standard =<anything>/=cut block.
rule pod() {
       # Case 1: a =begin/=end block, in its own rule so it can
       # recurse.
     «pod_begin_end_block»

       # Case 2: a =for paragraph.  "=for" at BOL, plus any space
       # character, starts it, and the first blank line (or EOF) ends
       # it.
   | ^^=for \s :: .*? [ \n \h* \n | $ ]

       # Case 3: any arbitrary POD block.  Starts with "=" at BOL,
       # followed by a letter, ends with "=cut" at BOL or at EOF.
   | ^^=<+<alpha>> :: .*? [ \n =cut [ \s | $ ] | $ ]
}

# A (recursive) =begin/=end POD block.
rule pod_begin_end_block() {
       # Starts with "=begin" at BOL, followed by an optional name
       # which we save to match with the corresponding "=end".
     ^^=begin [ \h+ $<name> := (\S+) | \h* \n ]

       # Next comes any number of single characters or nested =begin/
       # =end blocks -- but the smallest number that will match...
     [ . | «pod_begin_end_block» ]*?

       # "=end" at BOL followed by the name saved above, or
       # followed by nothing if there wasn't one.  If we make it to EOF
       # without finding the "=end" line, we blow up.
     [
       ^^=end [ <( $<name> )> :: \h+ $<name> | <null> ] \h* [ \n | $ ]
     |
       $ <commit> { fail "Unterminated =begin/=end block" }
     ]
}
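
For reviewers who want to poke at the =begin/=end behaviour without a rule engine, here is a rough line-based sketch in Python of what the pod_begin_end_block rule seems intended to do. It is not a translation of the rule. The function name skip_pod_block and the two regexes are made up for illustration, and it assumes that a nested =begin line always opens an inner block that must close before the outer =end is considered. As in the rule, no blank line is required before =end, an =end only closes a block whose name (or lack of one) matches, and reaching EOF without a match is an error.

```python
import re

# Hypothetical helpers, not part of the posted grammar.
# "=begin" at start of line, then either horizontal space plus a name
# (rest of the line ignored) or nothing but horizontal space.
BEGIN = re.compile(r"^=begin(?:[ \t]+(\S+).*|[ \t]*)$")
# "=end" at start of line, optionally followed by a name.
END = re.compile(r"^=end(?:[ \t]+(\S+))?[ \t]*$")


def skip_pod_block(lines, start):
    """Return the index of the line after the block opened at lines[start].

    Raises ValueError if lines[start] is not a =begin line or if the
    block is never closed by a matching =end line.
    """
    opened = BEGIN.match(lines[start])
    if opened is None:
        raise ValueError("not a =begin line")
    name = opened.group(1)  # None when the block is unnamed

    i = start + 1
    while i < len(lines):
        if BEGIN.match(lines[i]):
            # Assumption: a nested =begin is skipped as a whole block.
            i = skip_pod_block(lines, i)
            continue
        closed = END.match(lines[i])
        if closed and closed.group(1) == name:
            return i + 1
        i += 1  # ordinary content, including non-matching =end lines
    raise ValueError("Unterminated =begin/=end block")
```

Called on the lines of a file (for example the result of str.splitlines()), it returns the index where ordinary Perl code resumes, so a caller could treat everything in between as whitespace.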
