develooper Front page | perl.perl5.porters | Postings from February 2008

Re: use encoding 'utf8' bug for Latin-1 range

Juerd Waalboer
February 27, 2008 02:30
Glenn Linderman wrote 2008-02-27 1:50 (-0800):
> * Deprecate "use encoding".


> * Deprecate non-ASCII characters in Perl 5.12 source code unless a 
> source encoding is specified.  Make UTF-8, rather than ASCII, the 
> default source encoding for Perl 5.14.

I wouldn't object, but I would prefer that Perl 5.12 already interpret
source code as UTF-8 whenever it is in fact valid UTF-8. For backwards
compatibility, it could fall back to latin1, silently or with a warning.
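
A minimal sketch of that fallback idea, applied to a plain byte string
rather than to the Perl parser itself. The helper name decode_source is
hypothetical and only for illustration; Encode's decode and FB_CROAK are
real, but nothing here reflects how perl actually reads source files.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: treat the bytes as UTF-8 when they are valid
# UTF-8, otherwise fall back to latin1 (ISO-8859-1).
sub decode_source {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK argument may modify its input
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
    return ($text, 'UTF-8') if defined $text;
    warn "not valid UTF-8, falling back to latin1\n";
    return (decode('ISO-8859-1', $bytes), 'ISO-8859-1');
}

my ($t1, $e1) = decode_source("caf\xC3\xA9");    # valid UTF-8 bytes
my ($t2, $e2) = decode_source("caf\xE9");        # latin1 bytes, invalid as UTF-8

printf "%s: %d characters, last is U+%04X\n", $e1, length($t1), ord(substr($t1, -1));
printf "%s: %d characters, last is U+%04X\n", $e2, length($t2), ord(substr($t2, -1));
```

Both inputs end up as the same four-character string; only the route
differs. The warn line corresponds to the "warning" variant and could
simply be dropped for the silent one.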

> * Implement a pragma to apply Unicode semantics to all character 
> operations

Disagreed. This should be the default mode of operation, not enabled by
a pragma. However, a pragma would still be better than the current

An in-between solution would be to hijack "use feature" for this, and
have it automatically enabled with "use 5.12;".
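
For readers who want to see what "Unicode semantics" is about, here is a
small sketch of the inconsistency under discussion. It assumes a plain
perl with no pragma in effect that changes string semantics (no locale,
no encoding pragma): the same character can behave differently in a
regex depending on how the string happens to be stored internally.

```perl
use strict;
use warnings;

my $native   = "\xE9";       # U+00E9, stored internally as a single byte
my $upgraded = "\xE9";
utf8::upgrade($upgraded);    # same character, stored internally as UTF-8

# The two strings compare equal, character for character.
my $same = $native eq $upgraded ? 1 : 0;

# Yet \w can give different answers for them.
my $w_native   = $native   =~ /\w/ ? 1 : 0;
my $w_upgraded = $upgraded =~ /\w/ ? 1 : 0;

print "equal: $same, \\w on native: $w_native, \\w on upgraded: $w_upgraded\n";
```

With Unicode semantics applied to all character operations, both matches
would give the same answer regardless of internal storage.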

> * Implement a pragma to specify a source charset/encoding.

Only if anyone feels like implementing it. I think it would be a waste
of energy and time.

> It would translate all \x codes via the source encoding

\x is documented to use character numbers, so this should say "source
charset". There is currently no way to request a certain non-Unicode

> a new syntax qu (like qq) but is interpreted as UTF-8

Silly idea, because all the operators would need a u variant, which
leads to an explosion of new operators.

> * Under these pragmas, chr/ord would always deal in decoded numbers for 
> characters (utf8 characters).

I hope you mean Unicode characters. UTF-8 does not have characters; it
has byte sequences that *represent* characters. Especially in *decoded*
form, there no longer are bytes.
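
A short sketch of the distinction, using the Encode module. The choice
of U+20AC (EURO SIGN) is just an example: chr and ord work with the
character number, while the UTF-8 form is a separate sequence of bytes
that only represents that character.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $char  = chr(0x20AC);               # one character: EURO SIGN
my $bytes = encode('UTF-8', $char);    # its UTF-8 representation, as bytes

my $char_len    = length($char);                   # counted in characters
my $byte_len    = length($bytes);                  # counted in bytes
my @byte_values = map { ord } split //, $bytes;    # the individual byte values
my $roundtrip   = ord(decode('UTF-8', $bytes));    # back to the character number

printf "character: U+%04X, length %d\n", ord($char), $char_len;
printf "UTF-8 bytes: %s, length %d\n",
    join(' ', map { sprintf '%02X', $_ } @byte_values), $byte_len;
printf "decoded again: U+%04X\n", $roundtrip;
```

After decoding, ord sees 0x20AC again, not any of the three byte values.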

Met vriendelijke groet,  Kind regards,  Korajn salutojn,

  Juerd Waalboer:  Perl hacker
  Convolution:     ICT solutions and consultancy
