
Re: find_encoding("UTF-16BE")->encode("abc") does not NUL-terminate

Gisle Aas
October 8, 2003 02:19
"Nick Ing-Simmons" <> writes:

> Gisle Aas <> writes:
> >
> >The following invariant should always hold if SvPOK(sv):
> >
> >   - SvCUR(sv) < SvLEN(sv)
> Unless SvLEN(sv) == 0

Oh, I forgot that special case.  It would be really cool if we had
code/documentation that specified exactly what the invariants of the
perl structures should be.

> >   - *SvEND(sv) == '\0'
> >
> >The perl core ensures that 
> Experience with Tk says that is not always true, it is mostly true,
> and I can't give a core case where it isn't true, but it has happened.
> I seem to recall magical $1 used to point into the original/saved PV
> and not be copied to add the NUL.
> The SvLEN(sv) == 0 case (SvPV is owned by someone else) is made considerably 
> less useful if code expects terminating '\0' - the alien SvPV may not 
> have that property, but doing a new malloc and a mega-byte copy just to 
> add the '\0' is a pain.

This is a good point.  Is the SvLEN(sv) == 0 hack used much?

But this also makes perl really unsafe.  You will easily get a
segfault if you call any system call builtins with that string.  Perl
could be made to do the copy if such an SV was passed to such a
function or we could add a flag bit (if there are any left) that can
be set when you are able to guarantee NUL-termination for strings
with SvLEN(sv) == 0.
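
To make the two options concrete, here is a rough sketch in plain C.
It does not use perl's headers: toy_sv, its fields, the nul_ok flag and
both function names are made up for illustration and only model
SvPVX/SvCUR/SvLEN.  It assumes that a buffer with len != 0 already
satisfies the invariants discussed above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a string SV. NOT perl's real structure. */
typedef struct {
    char  *pv;     /* string buffer (models SvPVX) */
    size_t cur;    /* bytes in use (models SvCUR) */
    size_t len;    /* bytes allocated (models SvLEN); 0 = buffer owned elsewhere */
    int    nul_ok; /* hypothetical flag: owner guarantees pv[cur] == '\0' */
} toy_sv;

/* Check the invariants as stated in this thread; returns 1 if they hold.
 * For len == 0 nothing is checked, since reading pv[cur] might touch
 * memory that does not belong to the string. */
static int toy_sv_invariants_hold(const toy_sv *sv)
{
    if (sv->len == 0)
        return 1;
    return sv->cur < sv->len && sv->pv[sv->cur] == '\0';
}

/* Return a pointer that is safe to hand to a C-string API.
 * A copy is made only for a foreign buffer (len == 0) that is not
 * flagged as NUL-terminated. *to_free receives the copy, which the
 * caller must free(), or NULL if no copy was needed.
 * Returns NULL if the allocation fails. */
static const char *toy_sv_cstring(const toy_sv *sv, char **to_free)
{
    char *copy;

    assert(to_free != NULL);
    *to_free = NULL;

    if (sv->len != 0 || sv->nul_ok)
        return sv->pv;            /* assumed to be terminated already */

    copy = malloc(sv->cur + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, sv->pv, sv->cur);
    copy[sv->cur] = '\0';
    *to_free = copy;
    return copy;
}
```

With this shape only the callers that really need a C string pay for
the copy, and a flagged foreign buffer never pays at all.  A toy_sv that
covers the first three bytes of "abcdef" with len == 0 and nul_ok == 0
would come back from toy_sv_cstring() as a fresh copy holding "abc".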

> There are very few spots in the core which _need_ the NUL

I think there are quite a few, but they don't show because there are
almost never any strings that are missing it.  Do you think we should
redefine the invariant to not need the terminating NUL and fix all
code that depends on it?

