[PATCH: perl@7483] add three new EBCDIC Encode-ings && many tests
From:
Peter Prymmer
Date:
October 30, 2000 16:57
Subject:
[PATCH: perl@7483] add three new EBCDIC Encode-ings && many tests
Message ID:
Pine.OSF.4.10.10010301643590.67363-100000@aspara.forte.com
The enclosed patch adds three new Encode .enc files and lists them in
the MANIFEST. I have also taken the liberty of adding tests that map many
codepoints from three ASCII-based code pages to the three EBCDIC code pages,
plus round-trip tests from iso8859-1 to each of the three EBCDIC code pages
and back for all 256 codepoints 0..255. Owing to the
"2 tests per codepoint" manner in which the new encode tests are written
(using functional rather than OO calls), this patch boosts the overall test
count by quite a lot, going from, e.g.:
All tests successful.
u=2.17 s=1.22 cu=232.84 cs=40.27 scripts=257 tests=12550
Whereas after this patch:
All tests successful.
u=2.3 s=1.19 cu=237.71 cs=42 scripts=257 tests=15202
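The jump in the test count can be checked against the plan line in the t/lib/encode.t hunk of this patch. The following is only a rough sketch of that arithmetic; the array names and contents are copied from the patch, while the variables $mapping_tests, $roundtrip_tests and $added are made up here for illustration:

```perl
use strict;
use warnings;

# Lists as they appear in the patched t/lib/encode.t.
my @character_set = ('0'..'9', 'A'..'Z', 'a'..'z');   # 62 characters
my @source        = qw(ascii iso8859-1 cp1250);
my @destiny       = qw(cp1047 cp37 posix-bc);
my @ebcdic_sets   = qw(cp1047 cp37 posix-bc);

# Two ok() calls per character for every source/destination pair.
my $mapping_tests   = 2 * @source * @destiny * @character_set;
# Two ok() calls per code point 0..255 for every EBCDIC set.
my $roundtrip_tests = 2 * @ebcdic_sets * 256;
my $added           = $mapping_tests + $roundtrip_tests;

printf "%d + %d = %d new tests\n", $mapping_tests, $roundtrip_tests, $added;
printf "%d + %d = %d tests overall\n", 12550, $added, 12550 + $added;
```

This should print 1116 + 1536 = 2652 new tests and 12550 + 2652 = 15202 overall, which agrees with the two summaries quoted above.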
So if folks would rather have those rewritten, perhaps to compare the
whole 256-byte string via eq, I could rewrite them. I like being able to
pinpoint translation troubles this way, though :-)
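To illustrate the trade-off, here is a minimal sketch (not code from the patch) that does the single whole-string eq comparison first and only walks the individual code points when that comparison fails. The helper mismatches() and the toy converters $swap, $fold and $id are hypothetical stand-ins for the from_to() calls in the test; they are used here so the example runs without any encoding tables:

```perl
use strict;
use warnings;

# Return the code points that do not survive $there followed by $back.
sub mismatches {
    my ($there, $back) = @_;
    my $orig = join '', map { chr($_) } 0 .. 255;
    my $copy = $back->($there->($orig));
    return () if $copy eq $orig;              # the single whole-string test
    # Only on failure: pinpoint which code points were mangled.
    return grep { substr($copy, $_, 1) ne chr($_) } 0 .. 255;
}

# Toy converter: swapping bytes 0x15 and 0x25 is reversible ...
my $swap = sub { (my $s = shift) =~ tr/\x15\x25/\x25\x15/; $s };
# ... while folding 0x25 onto 0x15 loses information.
my $fold = sub { (my $s = shift) =~ tr/\x25/\x15/; $s };
my $id   = sub { $_[0] };

my @good = mismatches($swap, $swap);
my @bad  = mismatches($fold, $id);
print "swap: ", scalar(@good), " mismatches\n";
print "fold: mismatches at @bad\n";
```

With the eq form a broken table costs one failing test instead of many, but the failure message alone no longer says which code point went wrong unless something like the grep above is added.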
Files affected:
MANIFEST # mention three new files
ext/Encode/Encode/cp1047.enc # new file
ext/Encode/Encode/cp37.enc # new file
ext/Encode/Encode/posix-bc.enc # new file
t/lib/encode.t # many (2652) new tests
Note that the issue of whether 0000 or FFFF or GGGG is a good
non-character indicator does not quite arise with these tables, since the
single 0000 at the start is the genuine NUL character in all three. Many thanks to
Philip Newton, Nick Ing-Simmons, and Mark Leisher. I've not yet
had a chance to debug the breakage of the t/lib/encode.t tests
on an EBCDIC platform.
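For readers who have not looked inside a .enc file, the sketch below decodes one table row. It assumes the layout that the tables in this patch appear to follow: each row holds sixteen 4-hex-digit Unicode code points, one per consecutive byte value. The row is the first table row of cp1047.enc copied from the patch; parse_row() is a hypothetical helper, not part of Encode:

```perl
use strict;
use warnings;

# First table row of cp1047.enc as given in the patch (bytes 0x00..0x0F).
my $row = '0000000100020003009C00090086007F0097008D008E000B000C000D000E000F';

# Split a row into its 4-hex-digit entries and convert each to a number.
sub parse_row {
    my ($line) = @_;
    return map { hex($_) } $line =~ /([0-9A-Fa-f]{4})/g;
}

my @cp = parse_row($row);
printf "entries in row: %d\n", scalar @cp;
printf "byte 0x%02X -> U+%04X\n", $_, $cp[$_] for 0, 4, 15;
```

Under that assumption byte 0x00 decodes to U+0000, so the leading 0000 is an ordinary mapping rather than a "no such character" marker.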
Here you go:
diff -ruN perl.7483/MANIFEST perl/MANIFEST
--- perl.7483/MANIFEST Sat Oct 28 18:34:36 2000
+++ perl/MANIFEST Mon Oct 30 11:19:53 2000
@@ -195,6 +195,7 @@
ext/Encode/Todo Encode extension
ext/Encode/Encode/ascii.enc Encoding tables
ext/Encode/Encode/big5.enc Encoding tables
+ext/Encode/Encode/cp1047.enc Encoding tables
ext/Encode/Encode/cp1250.enc Encoding tables
ext/Encode/Encode/cp1251.enc Encoding tables
ext/Encode/Encode/cp1252.enc Encoding tables
@@ -204,6 +205,7 @@
ext/Encode/Encode/cp1256.enc Encoding tables
ext/Encode/Encode/cp1257.enc Encoding tables
ext/Encode/Encode/cp1258.enc Encoding tables
+ext/Encode/Encode/cp37.enc Encoding tables
ext/Encode/Encode/cp437.enc Encoding tables
ext/Encode/Encode/cp737.enc Encoding tables
ext/Encode/Encode/cp775.enc Encoding tables
@@ -260,6 +262,7 @@
ext/Encode/Encode/macThai.enc Encoding tables
ext/Encode/Encode/macTurkish.enc Encoding tables
ext/Encode/Encode/macUkraine.enc Encoding tables
+ext/Encode/Encode/posix-bc.enc Encoding tables
ext/Encode/Encode/shiftjis.enc Encoding tables
ext/Encode/Encode/symbol.enc Encoding tables
ext/Errno/ChangeLog Errno perl module change log
diff -ruN perl.7483/MANIFEST.new perl/MANIFEST.new
--- perl.7483/MANIFEST.new Sat Oct 28 18:34:36 2000
+++ perl/MANIFEST.new Mon Oct 30 11:19:53 2000
@@ -195,6 +195,7 @@
ext/Encode/Todo Encode extension
ext/Encode/Encode/ascii.enc Encoding tables
ext/Encode/Encode/big5.enc Encoding tables
+ext/Encode/Encode/cp1047.enc Encoding tables
ext/Encode/Encode/cp1250.enc Encoding tables
ext/Encode/Encode/cp1251.enc Encoding tables
ext/Encode/Encode/cp1252.enc Encoding tables
@@ -204,6 +205,7 @@
ext/Encode/Encode/cp1256.enc Encoding tables
ext/Encode/Encode/cp1257.enc Encoding tables
ext/Encode/Encode/cp1258.enc Encoding tables
+ext/Encode/Encode/cp37.enc Encoding tables
ext/Encode/Encode/cp437.enc Encoding tables
ext/Encode/Encode/cp737.enc Encoding tables
ext/Encode/Encode/cp775.enc Encoding tables
@@ -260,6 +262,7 @@
ext/Encode/Encode/macThai.enc Encoding tables
ext/Encode/Encode/macTurkish.enc Encoding tables
ext/Encode/Encode/macUkraine.enc Encoding tables
+ext/Encode/Encode/posix-bc.enc Encoding tables
ext/Encode/Encode/shiftjis.enc Encoding tables
ext/Encode/Encode/symbol.enc Encoding tables
ext/Errno/ChangeLog Errno perl module change log
diff -ruN perl.7483/ext/Encode/Encode/cp1047.enc perl/ext/Encode/Encode/cp1047.enc
--- perl.7483/ext/Encode/Encode/cp1047.enc Wed Dec 31 16:00:00 1969
+++ perl/ext/Encode/Encode/cp1047.enc Mon Oct 30 10:59:50 2000
@@ -0,0 +1,20 @@
+# Encoding file: cp1047 (EBCDIC), single-byte
+S
+003F 0 1
+00
+0000000100020003009C00090086007F0097008D008E000B000C000D000E000F
+0010001100120013009D000A00080087001800190092008F001C001D001E001F
+0080008100820083008400850017001B00880089008A008B008C000500060007
+0090009100160093009400950096000400980099009A009B00140015009E001A
+002000A000E200E400E000E100E300E500E700F100A2002E003C0028002B007C
+002600E900EA00EB00E800ED00EE00EF00EC00DF00210024002A0029003B005E
+002D002F00C200C400C000C100C300C500C700D100A6002C0025005F003E003F
+00F800C900CA00CB00C800CD00CE00CF00CC0060003A002300400027003D0022
+00D800610062006300640065006600670068006900AB00BB00F000FD00FE00B1
+00B0006A006B006C006D006E006F00700071007200AA00BA00E600B800C600A4
+00B5007E0073007400750076007700780079007A00A100BF00D0005B00DE00AE
+00AC00A300A500B700A900A700B600BC00BD00BE00DD00A800AF005D00B400D7
+007B00410042004300440045004600470048004900AD00F400F600F200F300F5
+007D004A004B004C004D004E004F00500051005200B900FB00FC00F900FA00FF
+005C00F70053005400550056005700580059005A00B200D400D600D200D300D5
+003000310032003300340035003600370038003900B300DB00DC00D900DA009F
diff -ruN perl.7483/ext/Encode/Encode/cp37.enc perl/ext/Encode/Encode/cp37.enc
--- perl.7483/ext/Encode/Encode/cp37.enc Wed Dec 31 16:00:00 1969
+++ perl/ext/Encode/Encode/cp37.enc Mon Oct 30 10:59:50 2000
@@ -0,0 +1,20 @@
+# Encoding file: cp37 (EBCDIC), single-byte
+S
+003F 0 1
+00
+0000000100020003009C00090086007F0097008D008E000B000C000D000E000F
+0010001100120013009D008500080087001800190092008F001C001D001E001F
+00800081008200830084000A0017001B00880089008A008B008C000500060007
+0090009100160093009400950096000400980099009A009B00140015009E001A
+002000A000E200E400E000E100E300E500E700F100A2002E003C0028002B007C
+002600E900EA00EB00E800ED00EE00EF00EC00DF00210024002A0029003B00AC
+002D002F00C200C400C000C100C300C500C700D100A6002C0025005F003E003F
+00F800C900CA00CB00C800CD00CE00CF00CC0060003A002300400027003D0022
+00D800610062006300640065006600670068006900AB00BB00F000FD00FE00B1
+00B0006A006B006C006D006E006F00700071007200AA00BA00E600B800C600A4
+00B5007E0073007400750076007700780079007A00A100BF00D000DD00DE00AE
+005E00A300A500B700A900A700B600BC00BD00BE005B005D00AF00A800B400D7
+007B00410042004300440045004600470048004900AD00F400F600F200F300F5
+007D004A004B004C004D004E004F00500051005200B900FB00FC00F900FA00FF
+005C00F70053005400550056005700580059005A00B200D400D600D200D300D5
+003000310032003300340035003600370038003900B300DB00DC00D900DA009F
diff -ruN perl.7483/ext/Encode/Encode/posix-bc.enc perl/ext/Encode/Encode/posix-bc.enc
--- perl.7483/ext/Encode/Encode/posix-bc.enc Wed Dec 31 16:00:00 1969
+++ perl/ext/Encode/Encode/posix-bc.enc Mon Oct 30 10:59:50 2000
@@ -0,0 +1,20 @@
+# Encoding file: posix-bc (EBCDIC), single-byte
+S
+003F 0 1
+00
+0000000100020003009C00090086007F0097008D008E000B000C000D000E000F
+0010001100120013009D000A00080087001800190092008F001C001D001E001F
+0080008100820083008400850017001B00880089008A008B008C000500060007
+0090009100160093009400950096000400980099009A009B00140015009E001A
+002000A000E200E400E000E100E300E500E700F10060002E003C0028002B007C
+002600E900EA00EB00E800ED00EE00EF00EC00DF00210024002A0029003B009F
+002D002F00C200C400C000C100C300C500C700D1005E002C0025005F003E003F
+00F800C900CA00CB00C800CD00CE00CF00CC00A8003A002300400027003D0022
+00D800610062006300640065006600670068006900AB00BB00F000FD00FE00B1
+00B0006A006B006C006D006E006F00700071007200AA00BA00E600B800C600A4
+00B500AF0073007400750076007700780079007A00A100BF00D000DD00DE00AE
+00A200A300A500B700A900A700B600BC00BD00BE00AC005B005C005D00B400D7
+00F900410042004300440045004600470048004900AD00F400F600F200F300F5
+00A6004A004B004C004D004E004F00500051005200B900FB00FC00DB00FA00FF
+00D900F70053005400550056005700580059005A00B200D400D600D200D300D5
+003000310032003300340035003600370038003900B3007B00DC007D00DA007E
diff -ruN perl.7483/t/lib/encode.t perl/t/lib/encode.t
--- perl.7483/t/lib/encode.t Sun Oct 22 11:38:09 2000
+++ perl/t/lib/encode.t Mon Oct 30 11:28:25 2000
@@ -12,7 +12,11 @@
use charnames qw(greek);
my @encodings = grep(/iso8859/,Encode::encodings());
my $n = 2;
-plan test => 13+$n*@encodings;
+my @character_set = ('0'..'9', 'A'..'Z', 'a'..'z');
+my @source = qw(ascii iso8859-1 cp1250);
+my @destiny = qw(cp1047 cp37 posix-bc);
+my @ebcdic_sets = qw(cp1047 cp37 posix-bc);
+plan test => 13+$n*@encodings + 2*@source*@destiny*@character_set + 2*@ebcdic_sets*256;
my $str = join('',map(chr($_),0x20..0x7E));
my $cpy = $str;
ok(length($str),from_to($cpy,'iso8859-1','Unicode'),"Length Wrong");
@@ -27,7 +31,7 @@
my $sym = Encode->getEncoding('symbol');
my $uni = $sym->toUnicode('a');
-ok("\N{alpha}",substr($uni,0,1),"alpha does not map so symbol 'a'");
+ok("\N{alpha}",substr($uni,0,1),"alpha does not map to symbol 'a'");
$str = $sym->fromUnicode("\N{Beta}");
ok("B",substr($str,0,1),"Symbol 'B' does not map to Beta");
@@ -41,3 +45,49 @@
ok($cpy,$str,"$enc mangled translating to Unicode and back");
}
+# On ASCII based machines see if we can map several codepoints from
+# three distinct ASCII sets to three distinct EBCDIC coded character sets.
+# On EBCDIC machines see if we can map from three EBCDIC sets to three
+# distinct ASCII sets.
+
+my @expectation = (240..249, 193..201,209..217,226..233, 129..137,145..153,162..169);
+if (ord('A') != 65) {
+ my @temp = @destiny;
+ @destiny = @source;
+ @source = @temp;
+ undef(@temp);
+ @expectation = (48..57, 65..90, 97..122);
+}
+
+foreach my $to (@destiny)
+ {
+ foreach my $from (@source)
+ {
+ my @expected = @expectation;
+ foreach my $chr (@character_set)
+ {
+ my $native_chr = $chr;
+ my $cpy = $chr;
+ my $rc = from_to($cpy,$from,$to);
+ ok(1,$rc,"Could not translate from $from to $to");
+ ok(ord($cpy),shift(@expected),"mangled translating $native_chr from $from to $to");
+ }
+ }
+ }
+
+# On either ASCII or EBCDIC machines ensure we can take the full one
+# byte repertoire to EBCDIC sets and back.
+
+my $enc_as = 'iso8859-1';
+foreach my $enc_eb (@ebcdic_sets)
+ {
+ foreach my $ord (0..255)
+ {
+ $str = chr($ord);
+ my $rc = from_to($str,$enc_as,$enc_eb);
+ $rc += from_to($str,$enc_eb,$enc_as);
+ ok($rc,2,"return code for $ord $enc_as -> $enc_eb -> $enc_as was not obtained");
+ ok($ord,ord($str),"$enc_as mangled translating $ord to $enc_eb and back");
+ }
+ }
+
End of Patch.
Peter Prymmer