perl.i18n | Postings from November 2011

Re: GB2312 Encoding and File Names

From: --[ UxBoD ]--
Date: November 22, 2011 00:28
Subject: Re: GB2312 Encoding and File Names
Message ID: dffec579-cd02-444c-828f-5d6fef408c71@office.splatnix.net

Resolved by encoding to UTF-8 once the decoding had been completed:

my $fname = encode('utf8', decode('MIME-EncWords', $head->recommended_filename));
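
A minimal sketch of why the extra encode step makes the names comparable, assuming the file system stores names as UTF-8 bytes (typical on Linux, but not stated in this thread). The temporary directory and the variable names are illustrative only; the decoded string is copied from the Data::Dumper output quoted further down. The sketch uses the strict 'UTF-8' encoding name, which produces the same bytes as 'utf8' for this input.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempdir);

# Character string, as produced by decoding the MIME header.
my $decoded = "DPM2007exchange\x{96fb}\x{90f5}\x{8207}\x{90f5}"
            . "\x{7bb1}\x{4fee}\x{5fa9}.zip";

# Assumption: names on disk are UTF-8 byte strings.
my $on_disk = encode('UTF-8', $decoded);

# Create a file under that byte name in a scratch directory.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/$on_disk" or die "create failed: $!";
close $fh;

# readdir returns raw bytes, not decoded characters.
opendir my $dh, $dir or die "opendir failed: $!";
my @names = grep { !/^\.\.?$/ } readdir $dh;
closedir $dh;

# Count matches against the encoded bytes and the character string.
my $bytes_match = grep { $_ eq $on_disk } @names;
my $chars_match = grep { $_ eq $decoded } @names;

printf "encoded bytes match: %d, character string matches: %d\n",
    $bytes_match, $chars_match;
```

The directory entry compares equal only to the encoded byte string; comparing it with the decoded character string fails, which is the mismatch described in the quoted message.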
--
Thanks, Phil

----- Original Message -----
> With some help from the PerlMonks board I have decoded the file name
> correctly, but when you dump it, it does not match the physical file
> name as stored in the file system, i.e.:
>
> MIME Header :
> =?gb2312?B?RFBNMjAwN2V4Y2hhbmdl64rgXcVj4F3P5NDej80uemlw?=
> Decoded     : DPM2007exchange電郵與郵箱修復.zip
> $VAR1 =
> "DPM2007exchange\x{96fb}\x{90f5}\x{8207}\x{90f5}\x{7bb1}\x{4fee}\x{5fa9}.zip";
>
> so when one tries to compare it to what is read from a directory
> listing, the two cannot be matched :( How do I get the decoded name
> to be as it is meant to be, as shown above?
> --
> Thanks, Phil
>
> ----- Original Message -----
> > Just a follow up for some help on this problem. I appear to be able
> > to decode Simplified Chinese okay but Traditional Chinese is somewhat
> > more difficult. I have the file name MIME entity:
> >
> > =?gb2312?B?MzYw0MLOxbzgsuItMTItMDEtQ2hpIFNpbXAudHh0?=
> >
> > which should decode to:
> >
> > DPM2007exchange電郵與郵箱修復.zip
> >
> > but when I try and decode that name in Perl it comes out as:
> >
> > DPM2007exchange���]�c�]箱修��.zip
> >
> > I have installed the Encode::HanExtra module but even with that it
> > is still not showing correctly. Am I missing some other type of
> > module?
> > --
> > Thanks, Phil
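
One possible explanation, offered as an assumption rather than something confirmed in this thread: the header is labelled gb2312, but the Traditional characters fall outside GB2312, and the payload bytes look like GBK (cp936), which is a superset of it. The sketch below takes the DPM2007exchange encoded-word as it appears elsewhere in this thread, decodes the Base64 payload by hand with the core Encode and MIME::Base64 modules, and substitutes gbk for the declared charset. The %charset_override table is hypothetical and not part of any module.

```perl
use strict;
use warnings;
use Encode qw(decode);
use MIME::Base64 qw(decode_base64);

# Encoded-word copied from the thread; declared charset is gb2312.
my $word = '=?gb2312?B?RFBNMjAwN2V4Y2hhbmdl64rgXcVj4F3P5NDej80uemlw?=';

# Split the encoded-word into charset, encoding flag and payload.
my ($charset, $flag, $payload) =
    $word =~ /^=\?([^?]+)\?([Bb])\?([^?]*)\?=$/
    or die "not a Base64 encoded-word";

my $bytes = decode_base64($payload);

# Hypothetical override table: treat gb2312 labels as GBK.
my %charset_override = (gb2312 => 'gbk');
my $effective = $charset_override{lc $charset} // $charset;

# Decode once as labelled and once with the override, for comparison.
my $as_labelled = decode($charset,   $bytes);
my $name        = decode($effective, $bytes);

binmode STDOUT, ':encoding(UTF-8)';
print "as $charset: $as_labelled\n";
print "as $effective: $name\n";
```

Under this assumption the GBK result matches the name the attachment was expected to have, while the strict gb2312 result contains replacement characters where the Traditional characters should be.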
> >
> > ----- Original Message -----
> > > Hello all,
> > >
> > > I do hope I am in the right place for some help! I am working on a
> > > project that requires email attachments to be extracted to the file
> > > system. All was working great until one of our kind testers tried
> > > with normal and simplified Chinese, where I ended up with files of
> > > the name ?????.txt.
> > >
> > > I am using the module MIME::Parser to extract the files, and after
> > > some great help from the developer I have realized that one needs
> > > to override a method in MIME::Parser::Filer so that the correct
> > > file names are generated.
> > >
> > > One of the attachments in the test email is shown below:
> > >
> > > 360新闻监测-12-01-Chi Simp.txt
> > >
> > > I have tried to use MIME::EncWords and MIME::Charset to extract the
> > > correct name from the MIME entity using:
> > >
> > > my $fname = decode_mimewords($head->recommended_filename);
> > >
> > > but this still does not work :( so I tried to compare what the file
> > > name looks like with LANG set with and without UTF8:
> > >
> > > With LANG en_GB.UTF8
> > >
> > > 360新闻监测-12-01-Chi Simp.txt
> > >
> > > With LANG en_GB
> > >
> > > 360�?��?��??��?-12-01-Chi Simp.txt
> > >
> > > Now this is what happens when I extract the file with my new method:
> > >
> > > With LANG en_GB
> > >
> > > 360���ż���-12-01-Chi Simp.txt
> > >
> > > With LANG en_GB.UTF8
> > >
> > > 360???ż???-12-01-Chi Simp.txt
> > >
> > > The MIME file name appears as
> > > =?gb2312?B?MzYw0MLChLFPnHktMTItMDEtQ2hpIFRyYWQudHh0?=
> > >
> > > This is not my area of expertise, so I am reaching out to you for
> > > some help. How can one extract the file name from an email and have
> > > it reflect its real Chinese name? Hope this makes sense!
> > > --
> > > Thanks, Phil
> > >
> >
>
