perl.perl5.porters | Postings from March 2021

Re: SvPVutf8 validity

Felipe Gasper
March 22, 2021 10:53

> On Mar 21, 2021, at 11:20 PM, Tony Cook wrote:
> On Sun, Mar 21, 2021 at 11:02:25PM -0400, Felipe Gasper wrote:
>> Hello,
>> Does SvPVutf8 have the same UTF-8 validity problems as Encode::encode_utf8()?
> It returns the internal UTF-8 encoding, which can include surrogates,
> etc.
> If that's not what concerns you, please be more specific.

Yeah, that’s it: SvPVutf8 returns Perl’s internal “lax” UTF-8 rather than official, valid UTF-8. So SvPVutf8 will happily encode code points that UTF-8 forbids, e.g., the surrogate "\x{d800}".

It sounds, then, like XS modules that speak UTF-8 to external libraries should normally pass SvPVutf8’s output through is_strict_utf8_string() or a variant? So this would be a (documentation-worthy?) caveat of using SvPVutf8.
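To make concrete what such a check would reject, here is a minimal standalone sketch in plain C. It is not Perl's implementation and does not use the Perl headers; the name strict_utf8_ok is hypothetical. It assumes "strict" means what perlapi describes for is_strict_utf8_string(): well-formed UTF-8 with no surrogates, no noncharacters, and nothing above U+10FFFF.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT part of the Perl API.
 * Returns true only if buf[0..len) is well-formed UTF-8 that contains
 * no surrogates, no noncharacters, and no code point above U+10FFFF. */
static bool strict_utf8_ok(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char lead = buf[i];
        uint32_t cp;    /* decoded code point */
        uint32_t min;   /* smallest code point allowed for this length */
        size_t need;    /* number of continuation bytes expected */

        if (lead < 0x80)                { cp = lead;        need = 0; min = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; need = 1; min = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; need = 2; min = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; need = 3; min = 0x10000; }
        else return false;              /* stray continuation or invalid lead byte */

        if (need > len - i - 1)
            return false;               /* sequence truncated at end of buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80)
                return false;           /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min)                       return false;  /* overlong encoding */
        if (cp > 0x10FFFF)                  return false;  /* above Unicode */
        if (cp >= 0xD800 && cp <= 0xDFFF)   return false;  /* surrogate */
        if ((cp >= 0xFDD0 && cp <= 0xFDEF) ||
            (cp & 0xFFFE) == 0xFFFE)        return false;  /* noncharacter */

        i += need + 1;
    }
    return true;
}
```

In an XS module the equivalent step would be to take the buffer and length that SvPVutf8 hands back and refuse (or croak) when the strictness check fails, before passing the bytes to the external library.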
