
Re: find_encoding("UTF-16BE")->encode("abc") does not NUL-terminate

From: Nick Ing-Simmons
Date: October 8, 2003 01:51
Subject: Re: find_encoding("UTF-16BE")->encode("abc") does not NUL-terminate
Message ID: 20031008085140.2598.2@llama.elixent.com

Gisle Aas <gisle@ActiveState.com> writes:
>
>The following invariant should always hold if SvPOK(sv):
>
>   - SvCUR(sv) < SvLEN(sv)

Unless SvLEN(sv) == 0

>   - *SvEND(sv) == '\0'
>
>The perl core ensures that 

Experience with Tk says that is not always true. It is mostly true,
and I can't give a core case where it isn't, but it has happened.
I seem to recall that magical $1 used to point into the original/saved PV
and was not copied to add the NUL.

The SvLEN(sv) == 0 case (the SvPV is owned by someone else) is made
considerably less useful if code expects a terminating '\0'. The alien
SvPV may not have that property, and doing a new malloc and a megabyte
copy just to add the '\0' is a pain. There are very few spots in the core
which _need_ the NUL.
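
To make the trade-off concrete, here is a minimal sketch in plain C. It
does not use the perl API: model_sv, model_invariant_holds and
model_c_string are hypothetical names, and the struct only imitates the
SvPV/SvCUR/SvLEN trio, with len == 0 standing for a borrowed buffer. The
copy happens only when the terminating NUL cannot be relied on.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a string buffer; this is NOT perl's SV.
 * len == 0 means the bytes are borrowed from someone else. */
typedef struct {
    char  *pv;   /* start of the bytes */
    size_t cur;  /* bytes in use */
    size_t len;  /* bytes allocated, or 0 if pv is not owned */
} model_sv;

/* Returns 1 when the buffer is owned and a NUL follows the data. */
static int model_invariant_holds(const model_sv *sv)
{
    if (sv->len == 0)
        return 0;   /* borrowed: nothing past cur may be assumed */
    return sv->cur < sv->len && sv->pv[sv->cur] == '\0';
}

/* Returns a pointer usable as a C string, or NULL if malloc fails.
 * Copies only when the invariant fails; *to_free receives the copy
 * (or NULL when no copy was needed) and must be freed by the caller. */
static const char *model_c_string(const model_sv *sv, char **to_free)
{
    char *copy;

    *to_free = NULL;
    if (model_invariant_holds(sv))
        return sv->pv;

    copy = malloc(sv->cur + 1);
    if (copy == NULL)
        return NULL;
    if (sv->cur > 0)
        memcpy(copy, sv->pv, sv->cur);
    copy[sv->cur] = '\0';
    *to_free = copy;
    return copy;
}
```

Code that takes a pointer plus a length (memcpy, fwrite and friends)
never needs model_c_string at all, so it avoids the copy for borrowed
buffers.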

>
>> After all, null-termination itself is moot w/ UTF-(16|32)(BE|LE)?.
>
>True, but when they are stuffed in perl strings they should still be
>made safe for external APIs that expect NUL-termination.

Ah, but if the external API has a 32-bit wchar_t and expects NUL-termination,
then perl must ensure that all SvPVs have FOUR trailing 0 octets...
(Been burnt here when rendering "wide" strings...)
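
A rough illustration of the terminator-width problem, again in plain C
rather than the perl API. The helper names copy_with_terminator and
unit_strnlen are made up for this sketch, and the unit size (1, 2 or 4
octets) is passed in by the caller as an assumption about what the
external API expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Copies nbytes octets and appends `unit` zero octets, where unit is
 * the width of one code unit (1, 2 or 4). Caller frees the result. */
static unsigned char *copy_with_terminator(const unsigned char *data,
                                           size_t nbytes, size_t unit)
{
    unsigned char *out = malloc(nbytes + unit);
    if (out == NULL)
        return NULL;
    if (nbytes > 0)
        memcpy(out, data, nbytes);
    memset(out + nbytes, 0, unit);
    return out;
}

/* Counts code units before the first all-zero unit, reading at most
 * max_units units. Returns max_units if no terminator is found. */
static size_t unit_strnlen(const unsigned char *buf, size_t unit,
                           size_t max_units)
{
    size_t i, j;

    for (i = 0; i < max_units; i++) {
        j = 0;
        while (j < unit && buf[i * unit + j] == 0)
            j++;
        if (j == unit)
            return i;   /* every octet of this unit is zero */
    }
    return max_units;
}
```

For "abc" in UTF-16BE (octets 00 61 00 62 00 63), a single trailing 0
octet is not a terminator for a 16-bit reader, and the leading 00 already
stops any octet-wise strlen at length zero. Padding with `unit` zero
octets gives the wide reader something it can actually find.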


