On Wed, Jan 16, 2013 at 10:36:12AM +1100, Tony Cook wrote: > On Tue, Jan 15, 2013 at 01:24:29PM -0800, bulk88 wrote: > > I recently found a bug in a core module, where a loop was done over a > > string from SvPVutf8(sv, len) to SvEND(sv). The sv is a RO literal SV. > > So it looped over uninitialized memory and sometimes segvs. > > > > http://perl5.git.perl.org/perl.git/blob/a8c6ff7b8e8c6037333c21f9b3f6b38b9278df4f:/sv.c#l3060 > > > > A mortal copy was made of the RO SV was made and returned as the result > > of SvPVutf8. > > > > All that SvPVutf8's docs say is "Like C<SvPV>, but converts sv to utf8 > > first if necessary." > > > > SvPV says "The SV may cache the stringified version becoming C<SvPOK>." > > So that is the disclaimer that SvPVX after SvPV is not the same pointer > > as returned from SvPV. I think the docs can be improved for SvPVutf8 and > > maybe SvPV to make it more obvious that these 2 do not create POK SVs. I > > don't have any proposed wording so no patch is included. > > > > This ticket might be related to > > http://perl5.git.perl.org/perl.git/commit/fe46cbda823c09f80e4bc48dd93fafb26cc805f6 > > and https://rt.perl.org/rt3/Ticket/Display.html?id=108994 . > > Perhaps: > > Like C<SvPV>, but returns always returns a utf8 string. > > char* SvPVutf8(SV* sv, STRLEN len) > > This may return a pointer to a temporary instead of C<SvPVX(sv)>. > > The other SvPVutf8* macros should be updated too. Perhaps we just need to clarify SvPV and rely on the fact that all the others say "SvPVfoo is just like SvPV, except for foo". I'd suggest adding the following to SvPV: Note that there is no guarantee that that the return value of C<SvPV()> is equal to C<SvPVX(sv)> (or even that C<SvPVX(sv)> contains valid data), due to the way things like overloading and Copy-on-write are handled. In these cases, the return value may point to a temporary buffer or similar. If you absolutely need the SvPVX field to be valid (for example, if you intend to write to it), then see C<SvPV_force>. And for SvPV_force, which currently says: Like C<SvPV> but will force the SV into containing a string (C<SvPOK>), and only a string (C<SvPOK_only>), by hook or by crook. You want force if you are going to update the C<SvPVX> directly. Processes get magic. I'd suggest s/You want force/You need force/ (to make it stronger), and add: Note that coercing an arbitrary scalar into a plain PV will potentially strip useful data from it. For example if the SV was C<SvROK>, then referent will have its reference count decremented, and the SV itself may be converted to an C<SvPOK> scalar with a string buffer containing value such as C<"ARRAY(0x1234)">. Then make the SvPV_force_nomg() doc refer to SvPV_force() rather than duplicating all the boiler-plate just added. Note that updating SvPV* docs like this will also help with the goal of improving the docs vis-a-vis preparing for re-enabling COW. -- A major Starfleet emergency breaks out near the Enterprise, but fortunately some other ships in the area are able to deal with it to everyone's satisfaction. -- Things That Never Happen in "Star Trek" #13Thread Previous | Thread Next