Re: The "Unicode Bug"
From: Mons Anderson
Date: October 17, 2011 07:33
Subject: Re: The "Unicode Bug"
Message ID: 201110171833.56223.inthrax@gmail.com
On Monday 17 October 2011 17:43:13 Tom Christiansen wrote:
> > Ok, I'll try to explain.
> > I write xml parser.
> > It should parse byte streams.
>
> Ah, so you panic on code points that are larger than 255? That's not
> very friendly. However can you know how to interpret those bytes?
>
> --tom
No, I don't panic.
Everything works correctly with both flagged and unflagged scalars containing
Unicode characters greater than 255.
The problem, from Eric's point of view, is with downgraded scalars holding values
in the range 80-ff (7f and below are plain ASCII and are stored the same either way).
When those are upgraded, everything is OK, since their byte buffer is then a
correct UTF-8 sequence for the corresponding characters.
Because a downgraded \xb2 (\262, without the flag) compares equal in Perl to an
upgraded \xb2 (\302\262 plus the UTF8 flag internally), Eric says that the parser
must handle them equally.
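
To make the two buffers concrete, here is a rough sketch of the byte sequences involved. It is written in Python rather than Perl, purely as an illustration: Latin-1 encoding stands in for a downgraded scalar's buffer and UTF-8 encoding for an upgraded one. The names CHAR, downgraded, upgraded and octal are made up for this example and are not part of any parser or of Perl itself.

```python
# Illustration only: Python bytes stand in for Perl's two internal storages.
CHAR = "\u00b2"  # SUPERSCRIPT TWO, the \xb2 character from the message

# Assumption: Latin-1 models the downgraded buffer (one byte per character),
# UTF-8 models the upgraded buffer (what Perl holds when the UTF8 flag is set).
downgraded = CHAR.encode("latin-1")
upgraded = CHAR.encode("utf-8")


def octal(buf: bytes) -> str:
    """Render a byte buffer as backslash-octal escapes, as in the message."""
    return "".join(f"\\{b:03o}" for b in buf)


print(octal(downgraded))  # \262
print(octal(upgraded))    # \302\262

# Different bytes, same character: this is why the two scalars compare equal.
assert downgraded.decode("latin-1") == upgraded.decode("utf-8") == CHAR
```

A parser that looks only at the raw buffer sees one byte in the first case and two in the second, even though both scalars hold the same single character.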
--
Vladimir Perepelitsa aka Mons Anderson
<inthrax@gmail.com> / #99779956