Re: comprehensive list of perl6 rule tokens

From: Jeff 'japhy' Pinyan
Date: May 27, 2005 21:58
Subject: Re: comprehensive list of perl6 rule tokens
Message ID: Pine.LNX.4.61.0505280039300.14490@perlmonk.org

Regarding http://www.nntp.perl.org/group/perl.perl6.language/21120,
which discusses character class syntax in Perl 6, I have some comments.

First, I've been very interested in seeing proper set notation for
character classes in Perl 5.  I was pretty vocal about it during TPC in
2002, I think, and have since added some features, now in Perl 5, that
let you define your own Unicode properties using not only +, -, and !
but & as well.

If we want to treat character classes as sets, then we should try to use 
notation that reads properly.  I don't see how '+' and '|' are any 
different in this case: <+Foo +Bar> and <Foo | Bar> should produce the 
same results always.  I suppose the + is helpful in distinguishing a 
character class assertion from any other, though.  To *complement* a 
character class, I think the character ~ is appropriate.  Intersection 
should be done with &.  Subtraction can be provided with -, although it's 
really just shorthand:  A - B is A & ~B.  Still, I suppose Huffman 
coding tells us we should provide the - sign.

Here are some examples, then:

   <+alpha -vowels>       all alphabetic characters except vowels
   <+alpha & ~vowels>     same thing
   <[a..z] -[aeiou]>      all characters 'a' through 'z' minus vowels
   <[a..z] & ~[aeiou]>    same thing
   <~(X & Y) | Z>         all characters not in X-and-Y, or in Z
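
To make the set reading concrete, here is a minimal sketch in Python (not Perl 6 syntax) that models each class as a plain set of characters.  It assumes a small universe, printable ASCII, so that complement is well defined.  The names UNIVERSE, ALPHA, VOWELS, LOWER and complement are made up for illustration, and X, Y and Z are arbitrary stand-ins.

```python
import string

# Assumed universe: printable ASCII. A real implementation would range over Unicode.
UNIVERSE = frozenset(string.printable)

ALPHA = frozenset(string.ascii_letters)    # stands in for <alpha>
VOWELS = frozenset("aeiouAEIOU")           # stands in for a user-defined <vowels>
LOWER = frozenset(string.ascii_lowercase)  # stands in for [a..z]


def complement(s):
    """Unary ~ : everything in the universe that is not in s."""
    return UNIVERSE - s


# <+alpha -vowels> and <+alpha & ~vowels> select the same characters.
assert ALPHA - VOWELS == ALPHA & complement(VOWELS)

# <[a..z] -[aeiou]> and <[a..z] & ~[aeiou]> likewise.
assert LOWER - frozenset("aeiou") == LOWER & complement(frozenset("aeiou"))

# <+Foo +Bar> and <Foo | Bar>: both are plain union.
X, Y, Z = frozenset("abc"), frozenset("bcd"), frozenset("xyz")
union_plus = frozenset().union(X, Y)
assert union_plus == X | Y

# <~(X & Y) | Z>, read as (~(X & Y)) | Z to match the gloss above.
result = complement(X & Y) | Z
```

Subtraction never needs its own primitive here: it falls out of intersection and complement, which is the point of calling - a shorthand.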

The last example shows <~, which is currently unclaimed as far as 
assertions go.  Since I'm advocating the removal of unary - in 
character classes (to be replaced by ~), I think this would be ok.  The 
allowance for a unary + in character classes has already been justified.

For the people who are really going to use it, the notation won't be 
foreign.  And I'd expect most people who'd use it would actually abstract 
a good portion of it away into their own property definitions, so that

   <~(X & Y) | Z>

would actually just be

   <+My_XYZ_Property>

which would be defined elsewhere.
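
As a rough illustration of that kind of abstraction, the following Python sketch registers a named property once and then tests characters against it by name.  Everything in it is hypothetical: PROPERTIES, define_property and matches are invented for the example, the universe is assumed to be printable ASCII, and X, Y and Z are arbitrary sets.

```python
import string

UNIVERSE = frozenset(string.printable)  # assumed universe: printable ASCII

# Hypothetical registry mapping property names to character sets.
PROPERTIES = {
    "X": frozenset("abc"),
    "Y": frozenset("bcd"),
    "Z": frozenset("xyz"),
}


def define_property(name, chars):
    """Register a named class so later classes can refer to it by name."""
    PROPERTIES[name] = frozenset(chars)


# Define once, "elsewhere": ~(X & Y) | Z
define_property(
    "My_XYZ_Property",
    (UNIVERSE - (PROPERTIES["X"] & PROPERTIES["Y"])) | PROPERTIES["Z"],
)


def matches(name, ch):
    """Rough analogue of testing one character against <+name>."""
    return ch in PROPERTIES[name]
```

Callers then only ever see the name, for example matches("My_XYZ_Property", ch), and the set expression lives in one place.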

What say you?

-- 
Jeff "japhy" Pinyan         %  How can we ever be the sold short or
RPI Acacia Brother #734     %  the cheated, we who for every service
http://japhy.perlmonk.org/  %  have long ago been overpaid?
http://www.perlmonks.org/   %    -- Meister Eckhart
