
vstrings and Unicode datapoints

From: John Peacock
Date: August 15, 2001 12:13
Subject: vstrings and Unicode datapoints
Message ID: 3B7AC881.EAC4DA64@rowman.com

I have version 2 of my vstring code ready to go, but I have discovered a 
problem serious enough to derail all of my work.  :~( 

The current vstring code will turn 

	v10.0.2

into the U8 string 

	\012\0\2

(the bytes 10, 0, and 2, written in octal).  The clever ones out there
will have already noticed that anything treating this as a C string 
will take the \0 byte as the end of the string, so the \2 will vanish 
into the ether.  Other than that, my code works great! :O
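
A minimal sketch of the truncation, assuming the vstring is handed to
ordinary C string functions.  The names vs_10_0_2 and visible_length are
made up for illustration and are not taken from the vstring patch.

```c
#include <stddef.h>
#include <string.h>

/* v10.0.2 as raw bytes, followed by the usual terminating NUL. */
static const char vs_10_0_2[] = { 10, 0, 2, 0 };

/* Length as C string functions see it: counting stops at the first NUL,
 * which here is the middle component of the version, not the terminator. */
size_t visible_length(const char *s)
{
    return strlen(s);
}
```

Called on vs_10_0_2, visible_length sees only the leading \012 byte, even
though the buffer holds three version components.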

My question to the list, and the Unicode-aware in particular, is how do
I encode \0 inside a U8 string so that I can tell it apart from the 
terminator later?  I need to make sure that

	v10.0.2 < v10.1.0

using as simple a means as possible (strcmp had come to mind), so
I cannot just store \0 as something else (like 0xFFFF), since that 
would sort a zero sub-version after every other value.
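
For what it is worth, strcmp does get that particular pair right, because
the first differing byte is reached exactly at the \0.  It goes wrong for
versions that differ only after a zero component, such as v10.0.2 and
v10.0.3.  The sketch below shows this next to a length-aware comparison
with memcmp, which assumes the byte length is tracked separately from the
buffer.  All names here are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Raw vstring bytes, each followed by a terminating NUL. */
static const char vs_a[] = { 10, 0, 2, 0 };   /* v10.0.2 */
static const char vs_b[] = { 10, 0, 3, 0 };   /* v10.0.3 */
static const char vs_c[] = { 10, 1, 0, 0 };   /* v10.1.0 */

/* Naive comparison: stops at the first NUL in either string. */
int compare_as_c_strings(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Length-aware comparison: compares the shared prefix byte by byte,
 * then treats the shorter version as the smaller one. */
int compare_with_length(const char *a, size_t alen,
                        const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int r = memcmp(a, b, n);
    if (r != 0)
        return r;
    return (alen > blen) - (alen < blen);
}
```

With these buffers, compare_as_c_strings reports vs_a and vs_b as equal,
while compare_with_length with a length of 3 orders them correctly.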

What do I do here?  Should I just automatically add 1 to each
sub-version and patch sprintf("%vd") to undo my damage?  Should I wade
even deeper and make any Unicode strings end in 0xFFFF instead of 
0x0000 (a much more fundamental change, and one I am not worthy to
complete)?  Is there some better way to proceed here?
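
A rough sketch of the add-1 idea, under the simplifying assumption that
every sub-version fits in a single byte and is at most 254.  Real vstring
components can be larger and would be stored as multi-byte characters, so
this only illustrates the ordering argument.  encode_plus_one and
decode_component are hypothetical helpers, not existing Perl functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Store each component plus one, so the body never contains a NUL byte.
 * out must have room for n + 1 bytes. */
void encode_plus_one(const unsigned char *parts, size_t n, char *out)
{
    for (size_t i = 0; i < n; i++) {
        assert(parts[i] < 255);          /* sum must still fit in a byte */
        out[i] = (char)(parts[i] + 1);
    }
    out[n] = '\0';                       /* now the only NUL is the end */
}

/* Inverse step, which a patched "%vd" formatter would have to apply. */
unsigned char decode_component(char stored)
{
    return (unsigned char)((unsigned char)stored - 1);
}
```

Because every component is shifted by the same amount, the relative order
is unchanged, and strcmp can compare the encoded strings directly.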

To quote my favorite philosopher:

BUCKAROO BANZAI:
        No, no, no, no.  Don't tug on that.  You never know what it 
        might be attached to.

John

--
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4720 Boston Way
Lanham, MD 20706
301-459-3366 x.5010
fax 301-429-5747
