perl.perl6.compiler

Re: Let the hacking commence!

From: Luke Blanshard
Date: January 8, 2005 02:56
Subject: Re: Let the hacking commence!
Message ID: 20050108015729.25501.qmail@lists.develooper.com

Luke Palmer wrote:
> This list is for people interested in building the Perl 6 compiler.  Now
> you have your first real task!  
> 
> We have to make a formal grammar for Perl 6.  Perl 6 is a huge language,
> so the task seems better done incrementally by the community...
> 
> Send patches to this list.

OK, I'll bite.  In contrast to Luke's 50-thousand-foot level, I'm
diving down into the goriest of details.  At the end of this message
is a rule for whitespace within Perl code, and supporting rules for
comments and pod.

I'm not posting this as a diff, because I have the faint suspicion
that others might have been hacking on this file offline.  But I
gather these rules should go in the "TOKENS" section.

[By the way, shouldn't this grammar be called "Perl" rather than
"Perl6::Grammar"?  Also, is this file now available in some repository
somewhere?]

I'd like reviewers to pay special attention to the pod stuff.  It's
not clear to me what the precise rules are or should be for blank
lines preceding pod commands.  I got from S02 the idea that we should
allow standalone =begin/=end sections (and that they should nest).
But does the =end line have to be preceded by a blank line?  As far as
I can tell, the =begin line does not.  In the interest of symmetry, I
have written the rules not to require a blank line before the closing
=end either, even though this appears to violate the usual rules for
pod.
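
To make the =begin/=end question concrete, here is a minimal sketch of
the behaviour described above. It is written in Python rather than as a
Perl 6 rule, it works line by line instead of character by character,
and the names BEGIN, END and find_block_end are hypothetical helpers
that are not part of the posted grammar. It assumes that blocks nest,
that a named =begin is closed only by an =end carrying the same name,
and that neither directive needs a preceding blank line.

```python
import re

# Hypothetical helpers, not part of the posted grammar.
# "=begin" at the start of a line, then either a name or nothing else.
BEGIN = re.compile(r"^=begin(?:[ \t]+(\S+)|[ \t]*$)")
# "=end" at the start of a line, an optional name, then end of line.
END = re.compile(r"^=end(?:[ \t]+(\S+))?[ \t]*$")


def find_block_end(lines, start=0):
    """Return the index of the line after the =end that closes lines[start].

    lines[start] must be a =begin line. Nested =begin/=end pairs are
    skipped, and no blank line is required before either directive.
    """
    opened = BEGIN.match(lines[start])
    if opened is None:
        raise ValueError("not a =begin line")
    stack = [opened.group(1)]  # names of open blocks (None if unnamed)
    for i in range(start + 1, len(lines)):
        begin = BEGIN.match(lines[i])
        if begin:
            stack.append(begin.group(1))
            continue
        end = END.match(lines[i])
        # An =end whose name does not match the innermost open block is
        # treated as ordinary text inside the block.
        if end and end.group(1) == stack[-1]:
            stack.pop()
            if not stack:
                return i + 1
    raise SyntaxError("Unterminated =begin/=end block")
```

Under these assumptions, an =end that immediately follows a line of
text closes its block, which is the symmetric reading; requiring a
blank line first would need one extra check on lines[i - 1].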


(Another guy called) Luke


====================================

# Whitespace definition for Perl code.
rule ws() {
       # Case 1: Unicode space characters, comments, or POD blocks, or
       # any combination thereof.
     [ \s | «comment» | «pod» ]+

       # Case 2: We're looking at a non-word-constituent or EOF,
       # meaning zero-width counts as whitespace.
   | <before \W> | $

       # Case 3: We must be looking at a word constituent.  We match
       # whitespace at BOF or after a non-word-constituent.
   | ^ | <after \W>
}

# Comment definition for Perl code.
rule comment() {
       # A hash ("#"), then everything through the next newline or EOF.
     <'#'> .*? [ \n | $ ]
}

# A POD block, as extended for P6.  This is a =begin/=end pair, a =for
# paragraph, or a standard =<anything>/=cut block.
rule pod() {
       # Case 1: a =begin/=end block, in its own rule so it can
       # recurse.
     «pod_begin_end_block»

       # Case 2: a =for paragraph.  "=for" at BOL, plus any space
       # character, starts it, and the first blank line (or EOF) ends
       # it.
   | ^^=for \s :: .*? [ \n \h* \n | $ ]

       # Case 3: any arbitrary POD block.  Starts with "=" at BOL,
       # followed by a letter, ends with "=cut" at BOL or at EOF.
   | ^^=<+<alpha>> :: .*? [ \n =cut [ \s | $ ] | $ ]
}

# A (recursive) =begin/=end POD block.
rule pod_begin_end_block() {
       # Starts with "=begin" at BOL, followed by an optional name
       # which we save to match with the corresponding "=end".
     ^^=begin [ \h+ $<name> := (\S+) | \h* \n ]

       # Next comes any number of single characters or nested =begin/
       # =end blocks -- but the smallest number that will match...
     [ . | «pod_begin_end_block» ]*?

       # ...an "=end" at BOL followed by the name saved above, or
       # followed by nothing if there wasn't one.  If we make it to EOF
       # without finding the "=end" line, we blow up.
     [
       ^^=end [ <( $<name> )> :: \h+ $<name> | <null> ] \h* [ \n | $ ]
     |
       $ <commit> { fail "Unterminated =begin/=end block" }
     ]
}
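
For reviewers who want to poke at the three cases of the ws rule
without a working Perl 6 rule engine, the following is a rough Python
approximation. It is a sketch only: match_ws, SPACE_OR_COMMENT and WORD
are hypothetical names, Python's \s and \w stand in for the Perl 6
character classes, and POD blocks are left out to keep it short.

```python
import re

# Assumption: Python's \s and \w approximate the Perl 6 classes.
# A run of space characters and "#" comments (POD is omitted here).
SPACE_OR_COMMENT = re.compile(r"(?:\s|#[^\n]*(?:\n|$))+")
WORD = re.compile(r"\w")


def match_ws(text, pos):
    """Return the position after whitespace at pos, or None if ws fails."""
    # Case 1: a run of space characters and comments.
    run = SPACE_OR_COMMENT.match(text, pos)
    if run:
        return run.end()
    # Case 2: at EOF or before a non-word character, zero width matches.
    if pos == len(text) or not WORD.match(text, pos):
        return pos
    # Case 3: before a word character, zero width matches only at BOF
    # or after a non-word character.
    if pos == 0 or not WORD.match(text, pos - 1):
        return pos
    return None
```

With this sketch, match_ws("foo+bar", 3) succeeds with zero width,
while match_ws("foobar", 3) fails because the position sits between two
word characters.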

