perl.perl5.porters | Postings from June 2001

Re: [PATCH] Encode.pm to use escape-sequence encoding

From: SADAHIRO Tomoyuki
Date: June 30, 2001 05:38
Subject: Re: [PATCH] Encode.pm to use escape-sequence encoding
Message ID: 20010630213554.F67A.BQW10602@nifty.com

Hello. 
Here is a patch for ext/Encode/Encode/.

On Sat, 30 Jun 2001 07:33:37 +0900
SADAHIRO Tomoyuki <BQW10602@nifty.com> wrote:
> Known problems:
> 
> (1)
>  At present, compiled encodings
> (ASCII, ISO-8859-*, etc.) are not available
> for the code extension of escape-sequence encoding.
This problem is fixed by the patch below.

> (2) Encodings with SINGLE SHIFTs (SS2, SS3)
>   are not available.
This problem still remains.

> Modification:
> 
> (1) iso2022-jp.enc and iso2022-kr.enc may contain
> the GR characters ("\xA0" .. "\xFF").
> 
> According to RFC1554 (ISO-2022-JP-2) and
> RFC1557 (Korean Character Encoding for Internet Messages),
> they must be in 7 bit format.
> 
> So, the following files are added. 
>   7bit.enc (ASCII, not including ESC, SI, SO)
>   7bit-jis.enc
>   7bit-kana.enc
>   7bit-kr.enc
> (these names might not be very good;
>  please comment and/or suggest better ones)
7bit.enc is now unnecessary; it is replaced by ascii.enc.

> (2) A new parameter, 'standard'. It names the
> escape sequence that is omitted at the beginning of the string
> and added at the end of the string if necessary
> (but not always: if the last character is ASCII,
> the final \x1b(B is not appended).
This parameter is now unnecessary and has been removed.
(The escape sequence that appears first in the *.enc file
is regarded as the 'standard'.)
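
To illustrate the "first escape sequence is the standard" rule, here is
a minimal sketch. It is not the loader in Encode/Tcl.pm:
read_escape_table() is a hypothetical helper, and treating "{}" as an
empty value and expanding a literal "\x1b" into ESC are assumptions
about the file format. The sample lines are the patched 7bit-jis.enc.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Encode::Tcl): collect the designation
# lines of an escape-sequence *.enc file, keeping them in file order.
sub read_escape_table {
    my (@lines) = @_;
    my (%meta, @ctl, %tbl);
    for my $line (@lines) {
        next if $line =~ /^\s*(?:#|$)/;           # skip comments and blank lines
        my ($key, $val) = split ' ', $line, 2;    # field name, then the rest
        next unless defined $val;
        $val =~ s/\s+\z//;
        # Assumption: escapes are written literally as "\x1b" in the file.
        $val =~ s/\\x([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        if ($key =~ /^(?:name|init|final)\z/) {
            $meta{$key} = $val eq '{}' ? '' : $val;   # assumption: {} means empty
            next;
        }
        push @ctl, $val;      # escape sequences, in file order
        $tbl{$val} = $key;    # escape sequence -> sub-encoding name
    }
    return (\%meta, \@ctl, \%tbl);
}

my @lines = split /\n/, <<'END';
name		7bit-jis
init		{}
final		{}
ascii		\x1b(B
ascii		\x1b(J
7bit-kana	\x1b(I
jis0208		\x1b$B
jis0208		\x1b$@
END

my ($meta, $ctl, $tbl) = read_escape_table(@lines);
my $std = $ctl->[0];    # the first escape sequence listed is the 'standard'
```

With these lines, $std is ESC ( B, so ASCII is the set that is assumed
at the start of a string without an explicit escape sequence.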

==============
diff -ruN Encode.orig/7bit-jis.enc Encode/7bit-jis.enc
--- Encode.orig/7bit-jis.enc	Sat Jun 30 05:55:08 2001
+++ Encode/7bit-jis.enc	Sat Jun 30 20:55:16 2001
@@ -3,9 +3,8 @@
 name		7bit-jis
 init		{}
 final		{}
-standard	\x1b(B
-7bit		\x1b(B
-7bit		\x1b(J
+ascii		\x1b(B
+ascii		\x1b(J
 7bit-kana	\x1b(I
 jis0208		\x1b$B
 jis0208		\x1b$@
diff -ruN Encode.orig/7bit-kana.enc Encode/7bit-kana.enc
--- Encode.orig/7bit-kana.enc	Sat Jun 30 07:21:10 2001
+++ Encode/7bit-kana.enc	Sat Jun 30 13:53:18 2001
@@ -3,7 +3,7 @@
 0025 0 1
 00
 0000000100020003000400050006000700080009000A000B000C000D00000000
-0010001100120013001400150016001700180019001A001B001C001D0000001F
+0010001100120013001400150016001700180019001A0000001C001D001E001F
 0000FF61FF62FF63FF64FF65FF66FF67FF68FF69FF6AFF6BFF6CFF6DFF6EFF6F
 FF70FF71FF72FF73FF74FF75FF76FF77FF78FF79FF7AFF7BFF7CFF7DFF7EFF7F
 FF80FF81FF82FF83FF84FF85FF86FF87FF88FF89FF8AFF8BFF8CFF8DFF8EFF8F
diff -ruN Encode.orig/7bit-kr.enc Encode/7bit-kr.enc
--- Encode.orig/7bit-kr.enc	Sat Jun 30 05:54:52 2001
+++ Encode/7bit-kr.enc	Sat Jun 30 20:55:44 2001
@@ -3,5 +3,5 @@
 name		7bit-kr
 init		\x1b$)C
 final		{}
-7bit		\x0f
+ascii		\x0f
 ksc5601		\x0e
Only in Encode.orig: 7bit.enc
diff -ruN Encode.orig/Tcl.pm Encode/Tcl.pm
--- Encode.orig/Tcl.pm	Sat Jun 30 07:27:46 2001
+++ Encode/Tcl.pm	Sat Jun 30 20:55:00 2001
@@ -174,7 +174,7 @@
  my ($obj,$str,$chk) = @_;
  my $rep   = $obj->{'Rep'};
  my $touni = $obj->{'ToUni'};
- my $uni   = '';
+ my $uni;
  while (length($str))
   {
    my $ch = ord(substr($str,0,1,''));
@@ -204,9 +204,9 @@
 {
  my ($obj,$uni,$chk) = @_;
  my $fmuni = $obj->{'FmUni'};
- my $str   = '';
  my $def   = $obj->{'Def'};
  my $rep   = $obj->{'Rep'};
+ my $str;
  while (length($uni))
   {
    my $ch = substr($uni,0,1,'');
@@ -257,7 +257,7 @@
  my $fin = $obj->{'final'};
  my $std = $ctl->[0];
  my $cur = $std;
- my $uni   = '';
+ my $uni;
  while (length($str)){
    my $uch = substr($str,0,1,'');
    if($uch eq "\e"){
@@ -272,6 +272,10 @@
     $cur = $uch and next;
    }
    my $x;
+   if(ref($tbl->{$cur}) eq 'Encode::XS'){
+     $uni .= $tbl->{$cur}->decode($uch);
+     next;
+   }
    my $ch = ord($uch);
    my $rep   = $tbl->{$cur}->{'Rep'};
    my $touni = $tbl->{$cur}->{'ToUni'};
@@ -302,17 +306,22 @@
  my $ctl = $obj->{'Ctl'};
  my $ini = $obj->{'init'};
  my $fin = $obj->{'final'};
- my $std = $obj->{'standard'} || '';
+ my $std = $ctl->[0];
  my $str = $ini;
  my $pre = $std;
  my $cur = $pre;
 
  while (length($uni)){
   my $ch = chr(ord(substr($uni,0,1,'')));
-  my $x  = $tbl->{$pre}->{FmUni}->{$ch};
+  my $x  = ref($tbl->{$pre}) eq 'Encode::XS'
+	? $tbl->{$pre}->encode($ch,1)
+	: $tbl->{$pre}->{FmUni}->{$ch};
+
   unless(defined $x){
    foreach my $esc (@$ctl){
-    $x = $tbl->{$esc}->{FmUni}->{$ch};
+    $x = ref($tbl->{$esc}) eq 'Encode::XS'
+	? $tbl->{$esc}->encode($ch,1)
+	: $tbl->{$esc}->{FmUni}->{$ch};
     $cur = $esc and last if defined $x;
    }
   }
@@ -324,6 +333,12 @@
     $pre = $std;
     next;
    }
+  if(ref($tbl->{$cur}) eq 'Encode::XS'){
+   $str .= $cur unless $cur eq $pre;
+   $str .= $x; # "DEF" is lost
+   $pre = $cur;
+   next;
+  }
   my $def = $tbl->{$cur}->{'Def'};
   my $rep = $tbl->{$cur}->{'Rep'};
   unless (defined $x){
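
For readers who want the idea without the surrounding module, the
dispatch added above can be sketched roughly as follows. This is a
simplified illustration, not the patched code: Demo::Compiled is a
hypothetical stand-in for a compiled encoding object (the patch tests
for the class 'Encode::XS'), its encode() is assumed to return undef
for an unmapped character, and 'init', 'final', 'Def' and 'Rep' are
left out.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a compiled encoding object.
package Demo::Compiled;
sub new    { my ($class, %table) = @_; return bless { table => {%table} }, $class }
sub encode { my ($self, $ch) = @_; return $self->{table}{$ch} }   # undef if unmapped

package main;

# Look up one character in a sub-encoding, whichever kind it is:
# a compiled object (method call) or a table-based hash (FmUni lookup).
sub lookup {
    my ($enc, $ch) = @_;
    return ref($enc) eq 'Demo::Compiled'
        ? $enc->encode($ch)
        : $enc->{FmUni}{$ch};
}

# Encode $uni, emitting an escape sequence only when the sub-encoding changes.
sub encode_with_escapes {
    my ($tbl, $ctl, $uni) = @_;
    my $std = $ctl->[0];    # first escape sequence is the 'standard'
    my $pre = $std;
    my $str = '';
    for my $ch (split //, $uni) {
        # Try the current sub-encoding first, then the others in order.
        my ($cur) = grep { defined lookup($tbl->{$_}, $ch) } ($pre, @$ctl);
        next unless defined $cur;            # unmappable: dropped in this sketch
        $str .= $cur unless $cur eq $pre;
        $str .= lookup($tbl->{$cur}, $ch);
        $pre = $cur;
    }
    $str .= $std unless $pre eq $std;        # switch back to the standard set
    return $str;
}

my %tbl = (
    "\e(B" => Demo::Compiled->new(map { (chr($_), chr($_)) } 0x20 .. 0x7E),
    "\e(I" => { FmUni => { "\x{FF71}" => "\x31" } },   # 0x31 <-> U+FF71, as in 7bit-kana.enc
);
my @ctl = ("\e(B", "\e(I");
my $out = encode_with_escapes(\%tbl, \@ctl, "A\x{FF71}B");
```

Here $out is "A", ESC ( I, "1", ESC ( B, "B": no escape sequence is
written before the leading ASCII character, and none is appended at the
end because the last character is already in the standard set.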



regards,
SADAHIRO Tomoyuki
E-mail: bqw10602@nifty.com
URL: http://homepage1.nifty.com/nomenclator/perl/

