develooper Front page | perl.i18n | Postings from November 2011

Re: GB2312 Encoding and File Names

From: --[ UxBoD ]--
Date: November 21, 2011 03:44
Subject: Re: GB2312 Encoding and File Names
Message ID: b077fdb6-a017-40b7-9dc0-743ffb343a5d@office.splatnix.net

Just a follow-up asking for some help with this problem. I appear to be able to decode Simplified Chinese okay, but Traditional Chinese is somewhat more difficult. I have the file name MIME entity:

=?gb2312?B?MzYw0MLOxbzgsuItMTItMDEtQ2hpIFNpbXAudHh0?=

which should decode to:

DPM2007exchange電郵與郵箱修復.zip

but when I try to decode that name in Perl it comes out as:

DPM2007exchange���]�c�]箱修��.zip

I have installed the Encode::HanExtra module, but even with that the name still does not display correctly. Am I missing some other module?
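
One thing that may be worth checking, sketched here with core modules only: the "Chi Trad" encoded-word in the quoted message below contains byte pairs such as C2 84, B1 4F and 9C 79, which fall outside the EUC-CN range that Perl's Encode normally uses for the label gb2312, but inside GBK (cp936). The helper decode_gb_word is hypothetical (it is not part of MIME::Parser, MIME::EncWords or Encode), the leading "=" on the second word is assumed to have been lost in pasting, and reading gb2312 as cp936 is an assumption about what the sending client did, not something confirmed in this thread.

```perl
use strict;
use warnings;
use Encode qw(decode);
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: decode a single Base64 RFC 2047 encoded-word,
# reading the charset label "gb2312" as GBK (cp936).
sub decode_gb_word {
    my ($word) = @_;

    # Split "=?charset?B?payload?=" into its parts; anything else is
    # returned unchanged.
    my ($charset, $payload) = $word =~ /^=\?([^?]+)\?[Bb]\?([^?]*)\?=$/
        or return $word;

    my $bytes = decode_base64($payload);

    # Assumption: data labelled gb2312 may really be GBK, a superset
    # that also covers many Traditional Chinese characters.
    $charset = 'cp936' if lc($charset) eq 'gb2312';

    # FB_CROAK makes a wrong charset guess die instead of silently
    # producing replacement characters.
    return decode($charset, $bytes, Encode::FB_CROAK());
}

binmode STDOUT, ':encoding(UTF-8)';

my $simp = decode_gb_word('=?gb2312?B?MzYw0MLOxbzgsuItMTItMDEtQ2hpIFNpbXAudHh0?=');
my $trad = decode_gb_word('=?gb2312?B?MzYw0MLChLFPnHktMTItMDEtQ2hpIFRyYWQudHh0?=');

print "$simp\n$trad\n";
```

If the second call dies when 'cp936' is swapped back to 'euc-cn', that would suggest the charset label, rather than a missing module, is the problem. The decoded strings are Perl character strings, so they still need to be encoded (for example to UTF-8) before being used as file names.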
--
Thanks, Phil

----- Original Message -----
> Hello all,
>
> I do hope I am in the right place for some help! I am working on a
> project that requires email attachments to be extracted to the file
> system. All was working great until one of our kind testers tried
> with normal and simplified Chinese; where I ended up with files of
> the name ?????.txt.
>
> Am using the module MIME::Parser to extract the files and after some
> great help from the developer I have realized that one needs to
> override a method in MIME::Parser::Filer so that the correct file
> names are generated.
>
> One of the attachments in the test email is shown below:
>
> 360新闻监测-12-01-Chi Simp.txt
>
> I have tried to use MIME::EncWords and MIME::Charset to extract the
> correct name from the MIME entity using:
>
> my $fname = decode_mimewords($head->recommended_filename);
>
> but this still does not work :( so I tried to compare what the file
> name looks like with LANG set with and without UTF8
>
> With LANG en_GB.UTF8
>
> 360新闻监测-12-01-Chi Simp.txt
>
> With LANG en_GB
>
> 360�?��?��??��?-12-01-Chi Simp.txt
>
> Now this is what happens when I extract the file with my new method:
>
> With LANG en_GB
>
> 360���ż���-12-01-Chi Simp.txt
>
> With LANG en_GB.UTF8
>
> 360???ż???-12-01-Chi Simp.txt
>
> The MIME file name appears as
> =?gb2312?B?MzYw0MLChLFPnHktMTItMDEtQ2hpIFRyYWQudHh0?=
>
> This is not my area of expertise, so I am reaching out to you for some
> help. How can one extract the file name from an email and have it
> reflect its real Chinese name? Hope this makes sense!
> --
> Thanks, Phil
>
