
[perl #80190] Length-caching bug in utf8::decode

From: Father Chrysostomos via RT
Date: September 28, 2012 09:49
Subject: [perl #80190] Length-caching bug in utf8::decode
Message ID: rt-3.6.HEAD-11172-1348850975-442.80190-15-0@perl.org

On Sat Mar 19 12:45:44 2011, davem wrote:
> On Wed, Jan 26, 2011 at 08:50:12PM +1100, Tony Cook wrote:
> > On Mon, Jan 10, 2011 at 11:54:01PM +1100, Tony Cook wrote:
> > > Actually, looking at how this is all implemented, the solution could
> > > be as simple as:
> > > 
> > >   SvSETMAGIC(sv);
> > > 
> > > since Perl_magic_setutf8() clears the saved length and pos cache.
> > > 
> > > Putting that in XS_utf8_decode() in universal.c would be safest in
> > > terms of the least side-effects on other code, but may leave other
> > > code that calls sv_utf8_decode() with the same problem.
> > 
> > Unfortunately while adding SvSETMAGIC() to the XS fixes the saved
> > length problem, several utf8 taint tests fail.
> 
> I've now fixed it (and pos issues too) with this commit;
> 
> commit 75da9d4c616bae3e6791af93d2ced52dc8080f06
> Author:     David Mitchell <davem@iabyn.com>
> AuthorDate: Sat Mar 19 19:26:49 2011 +0000
> Commit:     David Mitchell <davem@iabyn.com>
> CommitDate: Sat Mar 19 19:41:55 2011 +0000
> 
>     reset pos and utf8 cache when de/encoding utf8 str
>     
>     When using
>         utf8::upgrade
>         utf8::downgrade
>         utf8::encode
>         utf8::decode
>     or the underlying C-level functions
>         sv_utf8_upgrade_flags_grow
>         sv_utf8_downgrade
>         sv_utf8_encode
>         sv_utf8_decode
>     and
>         sv_recode_to_utf8
>     
>     update the position of the pos magic, if any, and clear the utf8
>     length/position-mapping cache.
>     
>     This fixes [perl #80190].
> 
> M       lib/utf8.t
> M       sv.c

The problem with that approach is that it still leaves us with
utf8::decode not calling STORE on tied variables.
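
For anyone who wants to check the tied-variable behaviour on their own perl, here is a minimal sketch. The CountingScalar class and its counters are hypothetical names made up for this illustration; only tie and utf8::decode are real interfaces. Whether STORE is called depends on the perl version, so the script just reports what it observes.

```perl
use strict;
use warnings;

# Hypothetical tie class (illustration only): counts FETCH and STORE calls.
package CountingScalar;

sub TIESCALAR {
    my ($class, $value) = @_;
    return bless { value => $value, fetches => 0, stores => 0 }, $class;
}

sub FETCH {
    my ($self) = @_;
    $self->{fetches}++;
    return $self->{value};
}

sub STORE {
    my ($self, $new) = @_;
    $self->{stores}++;
    $self->{value} = $new;
}

package main;

# "\xc3\xa9" is the UTF-8 byte sequence for U+00E9.
my $obj = tie my $scalar, 'CountingScalar', "\xc3\xa9";

utf8::decode($scalar);

# No expected counts are given here: they differ between perl versions.
printf "FETCH calls: %d, STORE calls: %d\n", $obj->{fetches}, $obj->{stores};
```

If the STORE count stays at zero, the decoded value never reached the tied object, which is the problem described above.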

And letting pos() survive a modification to the scalar would make
utf8::decode an exception: other modifications reset pos().
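
To make the pos() point concrete, here is a small sketch comparing an ordinary assignment with utf8::decode. The variable names are made up for the illustration. What pos() returns after utf8::decode depends on the perl version, so the script prints it rather than assuming a value.

```perl
use strict;
use warnings;

# Two UTF-8 encoded U+00E9 characters (4 bytes) followed by "abc".
my $bytes = "\xc3\xa9\xc3\xa9abc";
pos($bytes) = 4;                  # byte offset just past the two characters
my $before = pos($bytes);

# An ordinary assignment resets pos() on the modified scalar.
my $other = "aaaa";
pos($other) = 2;
$other = "bbbb";
my $after_assign = pos($other);   # undef

# utf8::decode also modifies the scalar; what happens to pos() is the
# question raised in this thread, so just report it.
utf8::decode($bytes);
my $after_decode = pos($bytes);
my $length       = length($bytes);

printf "pos before decode: %d\n", $before;
printf "pos after assignment: %s\n",
    defined $after_assign ? $after_assign : 'undef';
printf "pos after utf8::decode: %s\n",
    defined $after_decode ? $after_decode : 'undef';
printf "length after utf8::decode: %d\n", $length;
```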

I think a more correct solution is to preserve taint magic explicitly
around a call to SvSETMAGIC.

I will do that soon if nobody objects.

-- 

Father Chrysostomos


---
via perlbug:  queue: perl5 status: resolved
https://rt.perl.org:443/rt3/Ticket/Display.html?id=80190
