Re: EBCDIC/Unicode??
From: Peter Prymmer
Date: November 30, 1999 14:25
Subject: Re: EBCDIC/Unicode??
Message ID: 199911302224.OAA03267@brio.forte.com
In Message-ID: <86256838.00594278.00@smtp.sears.com> on Mon, 29 Nov 1999,
Geoffrey Rommel asked:
> I noticed some comments in MJD's digest about problems with EBCDIC and Unicode.
> This could have implications for my Convert::IBM390 module. Could you folks
> point me to a good place to read about Unicode support in Perl so that I can
> think about how I should handle this? Thanks.
Some aspects of Perl's Unicode support have yet to be (pod-)documented.
Some matters were supposed to be discussed on the perl-unicode@perl.org
mailing list, which has an archive at:
http://www.xray.mpe.mpg.de/mailing-lists/perl-unicode/
On the perl-mvs list (open to all Perl and EBCDIC users, no matter what OS
they are running) I would not mind discussing a possible way to turn the
utf8.{c,pm} files into utfebcdic.{c,pm} equivalents, so that
C<use utfebcdic;> might select a UTF-EBCDIC internal representation in the
same way that C<use utf8;> turns on the UTF-8 internal representation on
ASCII machines. UTF-EBCDIC is discussed at:
http://www.unicode.org/unicode/reports/tr16/
which was just updated on 11 November. For now, the pragma C<use utf8;> in a
Perl regression test on an EBCDIC machine such as OS/390 just messes things
up, since there is no mapping at all. I should mention that my tuit supply is
short for the time being, and I'd really like to get dynamic loading working
on OS/390 first, so I may not be able to help with a utfebcdic pragma right
away.
The perl-mvs list archive is at:
http://www.xray.mpe.mpg.de/mailing-lists/perl-mvs/
I note that in http://www.perl.com/pub/1999/11/p5pdigest/THISWEEK-19991114.html
Mark-Jason Dominus wrote:
> Under EBCDIC, however, 0x21 is a capital letter O. (I think.)
Not quite. All upper- and lowercase Latin letters, along with the Arabic
numerals, have the high bit set and hence are all above 0x7f. See any of:
http://www.best.com/~pvhp/os390/doc/perlebcdic.pod
http://www.best.com/~pvhp/os390/doc/perlebcdic.pod.txt
for explicit mention of three EBCDIC code pages that Perl's EBCDIC platforms
use and that are compatible with mappings to ISO 8859-1 (ISO-Latin-1). Under
those three code pages 0x21 maps to the second C1 control character, a
non-printing character that the Unicode people did not even want to name.
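
Both claims above can be spot-checked without an EBCDIC machine. The sketch below is only an illustration: it assumes that Python's standard `cp037` codec (IBM code page 0037) is a fair stand-in for one of the three code pages, and the helper name `ebcdic_byte` is made up for the example. It checks that no Latin letter or digit sits at or below 0x7f, and shows where EBCDIC 0x21 lands in Latin-1/Unicode.

```python
import string
import unicodedata

# Assumption: Python's built-in "cp037" codec stands in for one of the
# EBCDIC code pages discussed in this message.
CODEC = "cp037"


def ebcdic_byte(ch):
    """Return the single EBCDIC byte value that encodes the character ch."""
    return ch.encode(CODEC)[0]


# Claim 1: every Latin letter and Arabic numeral has the high bit set.
alnum = string.ascii_letters + string.digits
low = [ch for ch in alnum if ebcdic_byte(ch) <= 0x7F]
print("letters/digits at or below 0x7f:", low)

# Claim 2: EBCDIC 0x21 is a C1 control character, not a capital letter O.
decoded = bytes([0x21]).decode(CODEC)
print("EBCDIC 0x21 -> U+%04X, category %s"
      % (ord(decoded), unicodedata.category(decoded)))
print("capital letter O -> EBCDIC 0x%02X" % ebcdic_byte("O"))
```

The category printed for the decoded character is the Unicode general category; `Cc` means a control character. Other code pages may place individual characters differently, so the same check would need to be repeated per code page.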
Peter Prymmer