develooper Front page | perl.perl6.language | Postings from March 2009

about the Str type and Unicode

Darren Duncan
March 12, 2009 17:28
I have a quick question about the Str type, described in Synopsis 2:

   Str     Perl string (finite sequence of Unicode characters)

Specifically, and partly in the interest of future-proofing, is there support in
Str for representing codepoint numbers that are beyond the range currently
described in the Unicode spec; e.g., can someone validly say "\x[263a123456789]"
and pass that around as a Str value?

Or would there potentially be language constraints to prevent such from 

I think it would be useful for the above to be allowed, so that one could still
encode future larger codepoints under an older Perl that doesn't attribute any
meaning to them and just falls back to treating the Str as a generic string of
integers, which is what happens by default when you don't have special character
tables handy, AFAIK.

That's not to say you can't also have a stricter subtype defined, e.g. Uni5_1Str,
which includes just the characters defined by Unicode version 5.1, for cases
where people want that.
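
To make the lax-versus-strict distinction concrete, here is a rough sketch in
Python rather than Perl 6. Everything in it is hypothetical: the function names
are invented for illustration, and the strict check is a simplification that
only tests membership in the current Unicode codespace (0 to 0x10FFFF, minus
surrogates). A real Uni5_1Str would also need the version 5.1 character tables
to reject unassigned codepoints.

```python
# Hypothetical sketch, not Perl 6 behavior: a lax "text" value is modeled as a
# sequence of non-negative integers, and a stricter check accepts only values
# inside the current Unicode codespace.

UNICODE_MAX = 0x10FFFF  # highest codepoint in the current Unicode codespace


def is_lax_text(codepoints):
    """Lax rule: any finite sequence of non-negative integers."""
    return all(isinstance(cp, int) and cp >= 0 for cp in codepoints)


def is_strict_text(codepoints):
    """Stricter rule: every value must be a Unicode scalar value."""
    return all(
        isinstance(cp, int)
        and 0 <= cp <= UNICODE_MAX
        and not (0xD800 <= cp <= 0xDFFF)  # surrogates are not scalar values
        for cp in codepoints
    )


smiley = [0x263A]           # an assigned character inside the codespace
future = [0x263A123456789]  # the out-of-range codepoint from the question
```

Under these assumptions both values pass the lax check, but only the first
passes the strict one, which is the relationship between a lax Str and a
stricter subtype described above.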

So if Perl's Str is lax in this way, I think it should be documented somewhere
that a Str may contain a sequence of potential, and not just actual, Unicode
characters.  Or if that is already documented, please say where.

And I want to emphasize that I'm not proposing changing the logical/conceptual 
meaning of Str, it is still defined as a string of characters, not as a string 
of integers.

One reason I'm asking is that I wanted to make the Text type of my Muldis D
language support arbitrarily large codepoints, partly for future-proofing, and
I'm hoping to be able to say that, when mapping the language to Perl 6, any
Text value can be represented simply by a Perl 6 Str value.  But if Perl 6's Str
isn't likely to be that flexible, then I'd like to know for my planning purposes.

Thank you. -- Darren Duncan
