language-lawyers' rules for Perl identifiers

From:
Tom Christiansen
Date:
August 13, 2011 08:29
Subject:
language-lawyers' rules for Perl identifiers
Message ID:
13642.1313248255@chthon
The 5.14.0 perldelta reads:

   Change in the parsing of identifiers

   Characters outside the Unicode "XIDStart" set are no
   longer allowed at the beginning of an identifier.  This
   means that certain accents and marks that normally follow
   an alphabetic character may no longer be the first
   character of an identifier.

Should we say something further about which version of the Unicode Standard we're talking about?

Is the following true, accurate, and comprehensive (that is, have I forgotten
anything?), and does it meet the conformance requirements outlined further below?

    Apart from punctuation variables of length one, Perl identifiers must
    begin with a character that has either the Unicode XID_Start (XIDS)
    property or the Unicode General_Category property value
    Connector_Punctuation (GC=Pc).  Any following characters must have the
    Unicode XID_Continue (XIDC) property.  GC=Pc, XIDS, and XIDC are
    defined according to whichever version of the Unicode Standard is
    currently supported by Perl, which as of this writing is Unicode 6.0.0.
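
To make the draft wording concrete, here is a minimal sketch that tests
strings against the rule exactly as drafted above.  It is not a description
of what the perl tokenizer actually does; the helper name fits_draft_rule is
made up for illustration, and one-character punctuation variables and
package separators are deliberately left out of scope.

```perl
use strict;
use warnings;

binmode(STDOUT, ":encoding(UTF-8)");

# Hypothetical helper (not part of perl itself): returns 1 if $name fits the
# draft rule quoted above, 0 otherwise.  The property definitions come from
# whichever Unicode version the running perl supports.
sub fits_draft_rule {
    my ($name) = @_;
    return $name =~ /\A [\p{XIDS}\p{Gc=Pc}] \p{XIDC}* \z/x ? 1 : 0;
}

my @candidates = (
    "sail",              # starts with a letter (XIDS)
    "_sail",             # starts with LOW LINE, which is GC=Pc
    "\N{U+E9}lan",       # starts with a non-ASCII letter (XIDS)
    "9lives",            # a digit is XIDC but not XIDS
    "\N{U+301}clair",    # a combining accent is XIDC but not XIDS
    "main::sail",        # the colon is not XIDC; separators are out of scope
);

for my $name (@candidates) {
    printf "%-12s %s\n", $name,
        fits_draft_rule($name) ? "fits" : "does not fit";
}
```

The first three candidates fit the draft rule and the last three do not.
The fifth one is the case the perldelta entry describes: a mark that
normally follows an alphabetic character cannot come first.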

I write that in light of what Chapter 3 (Conformance) of The Unicode Standard (TUS) says about such references:

    References to Unicode Character Properties

    Properties and property values have defined names and abbreviations, such as

	  Property:       General_Category (gc)
	  Property Value: Uppercase_Letter (Lu)

    To reference a given property and property value, these aliases are used, as in this example:

	  The property value Uppercase_Letter from the General_Category property,
	  as specified in Version 6.0.0 of the Unicode Standard.

    Then cite that version of the standard, using the standard citation format that is provided
    for each version of the Unicode Standard.

    When referencing multi-word properties or property values, it is permissible to omit the
    underscores in these aliases or to replace them by spaces.

    When referencing a Unicode character property, it is customary to prepend the word
    "Unicode" to the name of the property, unless it is clear from context that the Unicode
    Standard is the source of the specification.

That doesn't talk about $main::sail or $main'sail, but the package separator
is a different matter.  

--tom
