Hi,
Which Parrot strings are supposed to be false in a boolean context?
For instance, is "\x{FF10}" (FULLWIDTH DIGIT ZERO) false?
docs/strings.pod says[1] a string is false if it "consists of one
digit character whose numeric value (as decided by its character type)
is zero".
However, string.c says[2] 'A string is true if it is equal to anything
but "" and "0"' - implying that "\x{FF10}" is true. But then it
calls s->type->get_digit, and strangely enough, chartypes/unicode.c
has a FIXME comment which implies[3] that unicode_get_digit(U+FF10)
should return 0.
Allowing things like "\x{FF10}" to be false sounds like a bit of a
nightmare to me. There are already over 20 forms of zero in Unicode
3.1; if the next version of Unicode adds another one at, say, U+33333,
does the next version of Parrot change to treat "\x{33333}" as
a false string?
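
To make the two readings concrete, here is a minimal sketch in C. It is
not Parrot code: the function names, the representation of a string as an
array of already-decoded code points, and the small (deliberately
incomplete) table of zero digits are all my own assumptions for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not Parrot's STRING: a string is an array of
 * decoded code points plus a length. NULL stands for "not yet allocated". */

/* Reading A (the string.c comment): only "" and ASCII "0" are false. */
int bool_ascii_only(const unsigned long *cp, size_t len) {
    if (cp == NULL || len == 0)
        return 0;
    if (len == 1 && cp[0] == 0x0030)   /* U+0030 DIGIT ZERO */
        return 0;
    return 1;
}

/* Illustrative sample only; Unicode defines many more zero digits. */
static const unsigned long zero_digits[] = {
    0x0030, /* DIGIT ZERO */
    0x0660, /* ARABIC-INDIC DIGIT ZERO */
    0x0966, /* DEVANAGARI DIGIT ZERO */
    0xFF10  /* FULLWIDTH DIGIT ZERO */
};

/* Reading B (docs/strings.pod): a single digit character whose numeric
 * value is zero is false, whatever script it comes from. */
int bool_any_zero_digit(const unsigned long *cp, size_t len) {
    size_t i;
    if (cp == NULL || len == 0)
        return 0;
    if (len == 1) {
        for (i = 0; i < sizeof zero_digits / sizeof zero_digits[0]; i++) {
            if (cp[0] == zero_digits[i])
                return 0;
        }
    }
    return 1;
}

/* Usage: the two readings agree on "0" and "1" but not on "\x{FF10}". */
void check_examples(void) {
    const unsigned long ascii_zero[]     = { 0x0030 };
    const unsigned long ascii_one[]      = { 0x0031 };
    const unsigned long fullwidth_zero[] = { 0xFF10 };

    assert(bool_ascii_only(ascii_zero, 1) == 0);
    assert(bool_any_zero_digit(ascii_zero, 1) == 0);
    assert(bool_ascii_only(ascii_one, 1) == 1);
    assert(bool_any_zero_digit(ascii_one, 1) == 1);
    assert(bool_ascii_only(fullwidth_zero, 1) == 1);     /* true under A  */
    assert(bool_any_zero_digit(fullwidth_zero, 1) == 0); /* false under B */
}
```

Under reading B the result for a given string depends on the contents of
the zero-digit table, so it can change whenever that table grows; under
reading A it cannot.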
Thanks,
--
David
[1] docs/strings.pod:
> To test a string for truth, use:
>
> BOOLVAL string_bool(struct Parrot_Interp *, STRING* s);
>
> A string is false if it
>
> o is not yet allocated
> o has zero length
> o consists of one digit character whose numeric value (as
> decided by its character type) is zero.
>
> Otherwise the string will be true.
[2] string.c:
> /* A string is "true" if it is equal to anything but "" and "0" */
> BOOLVAL string_bool (const STRING* s) {
[...]
>     if (len == 1) {
>         UINTVAL c = s->encoding->decode(s->bufstart);
>         if (s->type->is_digit(c) && s->type->get_digit(c) == 0) {
>             return 0;
>         }
>     }
>
>     return 1; /* it must be true */
> }
[3] chartypes/unicode.c:
> static BOOLVAL
> unicode_is_digit(UINTVAL c) {
>     return (BOOLVAL)(isdigit(c) ? 1 : 0); /* FIXME - Other code points are also digits */
> }
>
> static INTVAL
> unicode_get_digit(UINTVAL c) {
>     return c - '0'; /* FIXME - many more digits than this... */
> }