perl.perl5.porters | Postings from September 2000

Re: unicode support and perl

Nicholas Clark
September 15, 2000 09:03
On Fri, Sep 15, 2000 at 05:52:32PM +0200, Marc Lehmann wrote:
> On Fri, Sep 15, 2000 at 04:38:45PM +0100, Nicholas Clark wrote:
> > is hidden from the user (nearly all the time except in cases like syscall())
> > it might be sensible to have an interface like SvIOK, SvPOK such that
> > "binary" and "string" were two distinct (convertible) types which might
> > happen to be implemented inside the SV * as a buffer pointer and a flag?
> AFAICS this is *exactly* how it is currently implemented. Please also note
> that the internal representation of scalars is quite an important issue
> for users ("0 but true"? use integer?).
> The single important difference is that you cannot tell utf8 strings from
> normal strings (in general), and that only the user ultimately knows which
> datatype the scalar has.
> > And that C and XS code uses one or other set of macros depending on whether
> > it wants binary (octet) data, or string data?
> It's not that easy. There are two different types of strings as well, and
> converting between them cannot be automatic. Maybe I was too drastic when
> proposing "binary" and "string" datatypes. What I meant to say was that

I didn't immediately realise that you were, but I appear to be contemplating it.

> perl cannot tell binary/string from each other (because it is a policy
> decision by the user) and thus the internal representation of "PV" scalars
> must be known.

Ah, sorry. I wasn't clear.
I was contemplating obsoleting "PV" scalars and having "BV" and "SV" scalars
(except that it can't be SV for string and CV is code and aargh what letter?)

so that there are binary and string datatypes in the API, they are distinct
and conversion of some sort is defined between them, but the implementation
of perl is free to store them in whatever way it chooses inside the struct
as long as all the macros work. But I don't see how the PV macros can be kept
working, as they'd need to do implicit conversions with the same
insufficient data that is currently exposed as SvUTF8_on and friends.
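
To make the shape of the idea concrete, here is a rough sketch in C. It is
only an illustration under my own assumptions: every name in it (scalar_t,
binary_new, binary_to_string_latin1 and so on) is hypothetical and none of
it is perl's actual API. It shows two distinct kinds of scalar sharing one
buffer-plus-flag struct, accessors that refuse the wrong kind, and a
conversion where the caller has to name the encoding rather than having it
guessed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types: a buffer pointer plus a flag, as in the discussion. */
typedef enum { KIND_BINARY, KIND_STRING } kind_t;

typedef struct {
    unsigned char *buf;  /* octets; for KIND_STRING these are UTF-8 (an assumption) */
    size_t len;          /* number of octets in buf */
    kind_t kind;         /* which API the scalar belongs to */
} scalar_t;

/* Allocate a scalar of the given kind holding a copy of src. */
scalar_t *scalar_new(kind_t kind, const unsigned char *src, size_t len) {
    scalar_t *s = malloc(sizeof *s);
    if (!s) return NULL;
    s->buf = malloc(len ? len : 1);
    if (!s->buf) { free(s); return NULL; }
    if (len) memcpy(s->buf, src, len);
    s->len = len;
    s->kind = kind;
    return s;
}

void scalar_free(scalar_t *s) {
    if (s) { free(s->buf); free(s); }
}

/* Binary scalars are just octets; no interpretation is attached. */
scalar_t *binary_new(const void *octets, size_t len) {
    return scalar_new(KIND_BINARY, octets, len);
}

/* Binary-side accessor: only valid on binary scalars. */
size_t binary_length_octets(const scalar_t *b) {
    assert(b->kind == KIND_BINARY);
    return b->len;
}

/* Explicit conversion: the caller states that the octets are Latin-1.
 * Each octet becomes one character, stored internally as UTF-8. */
scalar_t *binary_to_string_latin1(const scalar_t *b) {
    assert(b->kind == KIND_BINARY);
    unsigned char *tmp = malloc(b->len * 2 + 1);
    if (!tmp) return NULL;
    size_t n = 0;
    for (size_t i = 0; i < b->len; i++) {
        unsigned char c = b->buf[i];
        if (c < 0x80) {
            tmp[n++] = c;
        } else {
            tmp[n++] = (unsigned char)(0xC0 | (c >> 6));
            tmp[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    scalar_t *s = scalar_new(KIND_STRING, tmp, n);
    free(tmp);
    return s;
}

/* String-side accessor: counts characters, not octets. */
size_t string_length_chars(const scalar_t *s) {
    assert(s->kind == KIND_STRING);
    size_t n = 0;
    for (size_t i = 0; i < s->len; i++)
        if ((s->buf[i] & 0xC0) != 0x80)  /* skip UTF-8 continuation octets */
            n++;
    return n;
}
```

The point of the sketch is that the storage is an implementation detail:
code using the string accessors never needs to know that UTF-8 sits in the
buffer, and code wanting octets has to ask for them through the binary side.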

Nicholas Clark
