Re: Future Perl development

Gerard Goossen
February 4, 2007 10:12
On Sun, Feb 04, 2007 at 08:43:15PM +0900, SADAHIRO Tomoyuki wrote:
> > > At least his idea should not work on EBCDIC platforms like IBM z/OS.
> >  
> > I choose to make UTF-8 the encoding used for strings (some people would say 
> > this is the internal encoding and thus should not matter).
> > Support for EBCDIC would be in the form that input/output will be converted to EBCDIC.
> There are many parts of perl internal code that assume
> the unicode encoding should have same octet representations as those
> of the native encoding (ASCII to UTF-8 or EBCDIC to UTF-EBCDIC).
> For example '\n' in C on EBCDIC platforms is LF in UTF-EBCDIC as well,
> that is the internal assumption, while that is not LF in UTF-8.
> Your idea requires such conversion at all parts, not only codes for
> executions but also the parser and the lexer.
> Just input/output conversion must not be enough.

You convinced me, so I have to restore utfebcdic.h (and I probably broke EBCDIC
support in a few more places).

Is there some way to fake EBCDIC? Or some other way to test it?

But some other things are probably going to change on EBCDIC platforms,
like C<ord('A') == 65>, i.e. C<ord> returns the Unicode codepoint, and also \x{41}
would be an 'A'. Does that sound okay?
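
To make the difference concrete, here is a minimal sketch, not taken from the patch. It assumes an ASCII-based perl with the core Encode module, and it picks cp1047 as the EBCDIC code page purely for illustration. It prints the Unicode codepoint of 'A' next to the octet that 'A' occupies in that code page.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Unicode codepoint of 'A'; under the proposal C<ord('A')> would
# return this value on every platform.
my $codepoint = 0x41;

# Octet used for 'A' in the EBCDIC code page cp1047 (an assumption:
# chosen here only as a stand-in for a z/OS native encoding).
my $ebcdic_octet = ord(encode('cp1047', chr($codepoint)));

printf "codepoint: %d, cp1047 octet: %d (0x%02X)\n",
    $codepoint, $ebcdic_octet, $ebcdic_octet;
```

On an ASCII platform this should report 65 for the codepoint and 193 (0xC1) for the cp1047 octet.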

If we make \x{?..}? really insert codepoints, and not sometimes bytes, we
need an escape sequence for bytes. In my patch I used \x.. to do that
and only \x{..} to insert a codepoint, but I am not very happy about that.
Maybe \x[..]? Other suggestions?
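
As a rough illustration of why a codepoint escape and a byte escape are different things, the sketch below uses only stock Perl 5 and the core Encode module on an ASCII platform. It does not use the escapes from the patch (those are not part of stock Perl). It shows one codepoint, U+00E9, next to the two octets that represent it in UTF-8.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character: codepoint U+00E9.
my $char = "\x{e9}";

# The same character as a UTF-8 octet string: two bytes.
my $octets = encode('UTF-8', $char);

# Hex dump of the octets, e.g. "C3 A9".
my $hex = join ' ', map { sprintf '%02X', ord } split //, $octets;

printf "characters: %d, octets: %d (%s)\n",
    length($char), length($octets), $hex;
```

A byte escape would have to be able to produce either of those two octets on its own, which a codepoint escape cannot do once strings are always UTF-8.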

Gerard Goossen