perl.perl5.porters | Postings from November 2000

question about retlen in utf8.c:Perl_utf8_to_uv()

From: Peter Prymmer
Date: November 28, 2000 17:38
Subject: question about retlen in utf8.c:Perl_utf8_to_uv()
Message ID: Pine.OSF.4.10.10011281732350.216089-100000@aspara.forte.com

Hi,

While goofing around with the functions in utf8.c I noticed
that the apidoc entry for Perl_utf8_to_uv gives the wrong type
for one of the args (retlen).  Since on several of the machines
here a size_t type is unsigned (unsigned long on VMS and OSF,
unsigned int on OS/390) I thought that storing -1 through retlen
was not OK.  Hence I came up with this diff *BUT QUESTIONS REMAIN -
THIS IS NOT A PATCH*

--- utf8.c.orig	Tue Nov 28 14:57:40 2000
+++ utf8.c	Tue Nov 28 16:57:56 2000
@@ -171,7 +171,7 @@
 }
 
 /*
-=for apidoc Am|U8* s|utf8_to_uv|STRLEN curlen|I32 *retlen|U32 flags
+=for apidoc Am|U8* s|utf8_to_uv|STRLEN curlen|STRLEN *retlen|U32 flags
 
 Returns the character value of the first character in the string C<s>
 which is assumed to be in UTF8 encoding and no longer than C<curlen>;
@@ -181,8 +181,9 @@
 If C<s> does not point to a well-formed UTF8 character, the behaviour
 is dependent on the value of C<flags>: if it contains UTF8_CHECK_ONLY,
 it is assumed that the caller will raise a warning, and this function
-will set C<retlen> to C<-1> and return.  The C<flags> can also contain
-various flags to allow deviations from the strict UTF-8 encoding.
+will set C<retlen> to C<0> and return.  The C<flags> can also contain
+various flags to allow deviations from the strict UTF-8 encoding 
+(see F<utf8.h>).
 
 =cut */
 
@@ -323,7 +324,7 @@
 
     if (flags & UTF8_CHECK_ONLY) {
 	if (retlen)
-	    *retlen = -1;
+	    *retlen = 0;
 	return 0;
     }
 
End of diff - not a patch.
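
To make the concern behind the change concrete: when -1 is stored into an
unsigned length, it does not stay negative but converts to the largest value
the type can hold.  The following is only a minimal sketch in plain C using
size_t; store_failure_len() and is_sentinel() are hypothetical names made up
for illustration and are not part of utf8.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the UTF8_CHECK_ONLY failure path before the
 * change: it stores -1 through a pointer to an unsigned length. */
void store_failure_len(size_t *retlen)
{
    assert(retlen != NULL);
    *retlen = -1;   /* converts to SIZE_MAX; the stored value is not negative */
}

/* Hypothetical caller-side test: the only reliable way to recognise the
 * old sentinel is to compare against the same converted value. */
int is_sentinel(size_t len)
{
    return len == (size_t)-1;
}
```

A caller that treats the stored value as a byte count (for example by adding
it to a pointer or passing it to an allocator) without first testing for the
sentinel would be working with SIZE_MAX, which is what makes the -1 convention
awkward for an unsigned type.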

With that in place both OSF and VMS fail t/warnings test 410.

Here, e.g. is the test run on OSF/1 V 4.0 D:

# From pragma/warn/universal
ok 409
# From pragma/warn/utf8
Out of memory during "large" request for 134221824 bytes, total sbrk() is 68380.
Exit 12

Does this look like OK behavior?
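
One general hazard when changing a failure sentinel, offered as a sketch and
not as a diagnosis of the failure above: any caller still written against the
old convention stops recognising the failure.  Everything below is
hypothetical; toy_decode() is a toy that accepts only single-byte (ASCII)
characters and is not Perl_utf8_to_uv(), and the two caller checks are
invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy decoder: accepts only bytes below 0x80.  On anything
 * else it reports failure through retlen using the sentinel passed in
 * (0 for the new convention, (size_t)-1 for the old one) and returns 0. */
unsigned long toy_decode(const unsigned char *s, size_t curlen,
                         size_t *retlen, size_t fail_sentinel)
{
    if (curlen > 0 && s[0] < 0x80) {
        if (retlen)
            *retlen = 1;            /* one byte consumed */
        return s[0];
    }
    if (retlen)
        *retlen = fail_sentinel;    /* malformed or empty input */
    return 0;
}

/* Caller check written against the old -1 convention. */
int caller_saw_failure_old(size_t retlen)
{
    return retlen == (size_t)-1;
}

/* Caller check written against the new 0 convention. */
int caller_saw_failure_new(size_t retlen)
{
    return retlen == 0;
}
```

With the 0 sentinel, the old-style check reports success on malformed input,
and a loop that advances by the stored length without checking it would not
move forward at all.  Whether any caller of Perl_utf8_to_uv() actually does
either of these is not established here.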

I also have a little program that calls uv_to_utf8() and then utf8_to_uv();
it runs into a great deal of access violation trouble with or
without this modification to Perl_utf8_to_uv(), but I'll not include that
just yet.

Peter Prymmer



