develooper Front page | perl.perl5.porters | Postings from November 1999

Re: EBCDIC/Unicode??

From: Peter Prymmer
Date: November 30, 1999 14:25
Subject: Re: EBCDIC/Unicode??
Message ID: 199911302224.OAA03267@brio.forte.com

In Message-ID: <86256838.00594278.00@smtp.sears.com> on Mon, 29 Nov 1999,
Geoffrey Rommel asked:

> I noticed some comments in MJD's digest about problems with EBCDIC and Unicode.
> This could have implications for my Convert::IBM390 module. Could you folks
> point me to a good place to read about Unicode support in Perl so that I can
> think about how I should handle this? Thanks.

Some aspects of Perl's Unicode support have yet to be (pod-)documented.
Some matters were supposed to be discussed on the perl-unicode@perl.org
mailing list, which has an archive at:

  http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/

On the perl-mvs list (open to all perl && EBCDIC users no matter what OS they 
are running) I would not mind discussing a possible way to turn the 
utf8.{c,pm} files into utfebcdic.{c,pm} equivalents so that perhaps we might 
use C<use utfebcdic;> as an internal representation similar to the way ASCII 
machines can use the C<use utf8;> pragma to turn on UTF-8 internal 
representation.  UTF-EBCDIC is discussed at:

   http://www.unicode.org/unicode/reports/tr16/

which was just updated 11-nov.  For now the pragma C<use utf8;> in a perl
regression test on an EBCDIC machine such as OS/390 just messes stuff up,
since there is no mapping at all.  I should mention that for the short term 
my tuit supply is short and I'd really like to get dynamic loading working 
on OS/390, so I may not be able to help with a utfebcdic pragma right away.  
The perl-mvs list archive is at:

  http://www.xray.mpe.mpg.de/mailing-lists/perl-mvs/

I note that in http://www.perl.com/pub/1999/11/p5pdigest/THISWEEK-19991114.html
Mark-Jason Dominus wrote:

   Under EBCDIC, however, 0x21 is a capital letter O. (I think.) 

Not quite.  All upper (and lower) case Latin letters, along with the Arabic
numerals, have the high bit set and hence are all over 0x7f.  See any of:

   http://www.best.com/~pvhp/os390/doc/perlebcdic.pod
   http://www.best.com/~pvhp/os390/doc/perlebcdic.pod.txt

for explicit mention of three EBCDIC code pages that Perl OSes use
and that are compatible with mappings to 8859-1 ISO-Latin.  Under those
three code pages 0x21 maps to C1 control character 2, which is a non-printing
character that the Unicode people did not even want to name.
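
For anyone who wants to check this without an EBCDIC machine at hand, here
is a minimal sketch in Python rather than Perl. It assumes that Python's
built-in "cp037" codec is a fair stand-in for code page 0037, one of the
code pages in question; the helper name ebcdic_byte is made up for
illustration and is not part of any module mentioned above.

```python
import string

# Assumption: Python's built-in "cp037" codec stands in for EBCDIC
# code page 0037.
CODEC = "cp037"


def ebcdic_byte(ch: str) -> int:
    """Return the single EBCDIC byte that encodes the character ch."""
    return ch.encode(CODEC)[0]


# Map every Latin letter and Arabic numeral to its EBCDIC code.
alnum = string.ascii_letters + string.digits
codes = {ch: ebcdic_byte(ch) for ch in alnum}

# True when every letter and digit has the high bit set (code > 0x7f).
all_high = all(code > 0x7F for code in codes.values())

# Decode byte 0x21 to see which character it really maps to.
decoded = bytes([0x21]).decode(CODEC)

print("all letters and digits above 0x7f:", all_high)
print("0x21 decodes to U+%04X" % ord(decoded))
print("capital O is 0x%02X" % codes["O"])
```

Under that assumption, capital O comes out at 0xD6, nowhere near 0x21.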

Peter Prymmer
