develooper Front page | perl.perl5.porters | Postings from November 2014

Re: [perl #122853] Guarantee 0-9, A-Z, a-z character classes

Karl Williamson
November 14, 2014 05:29
On 10/30/2014 09:24 AM, Father Chrysostomos via RT wrote:
> On Thu Oct 30 06:08:46 2014, jhi wrote:
>>> [\x{04}-\N{U+09}]
>> I think people who ask for weird things like this should be expecting
>> weird results.  In other words, I wouldn't feel bad outlawing them.
>> The start of the range says "the 0x4 in native", the end of the range
>> is "the U+09, in Unicode".  It makes no sense.  If they wanted
>> native-native, they can write that.  If they wanted Unicode-Unicode,
>> they can write that.
> As a native ASCII speaker, I might not understand the native/Unicode distinction.
>> Similarly, think of ranges like [A-z] (that's upper-A-to-lower-z), or
>> [0-z] (zero-to-lower-z).  Just think in ASCII.  Should these mean
>> 0x41-0x7a, and 0x30-0x7a?  If so, they *will* contain the [[\\\]^_`] in
>> the first case, and the [:;<=>?@\[\\\]^_`] in the second.
> Perl lets people do stupid things.  That is one of its strengths.

And it is one of its weaknesses.  I believe this is a big part of the 
reason that Perl has the reputation of being just for toy programs, and 
not for production use.

If we as a project really thought that not warning for stupid things is 
the right thing for production code, we wouldn't compile perl itself 
with -Wall, and even -Wextra.  No, we want all the warnings the compiler 
reasonably can give us, even if some of them are bogus.

The discipline of software engineering is to try to get the best code 
with the fewest bugs with the least effort.  The rule of thumb I was 
taught was (and may still be) that an error detected at a given stage in 
a product life-cycle is an order of magnitude more expensive to fix than 
one found at the immediately prior stage.  As a developer with a long 
todo list, I want the compiler to tell me that I'm doing something iffy, 
along with a way to suppress the warning if I decide to do it anyway, 
perhaps because the compiler is wrong.  But the compiler should err 
on the side of more warnings, rather than fewer.

I don't think in terms of ASCII in such ranges, except for a-z, A-Z, and 
0-9.  I don't believe that most programmers do either.  An example is 
the recent introduction of [A-z] into blead.  It was a typo, rather than 
the intent of the programmer.
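
To make the [A-z] case concrete, here is a minimal sketch of what such a typo silently matches.  The helper name matched_chars is made up for this illustration and is not anything in the perl core, and the result shown assumes an ASCII platform; on EBCDIC the set would differ, which is part of what this ticket is about.

```perl
use strict;
use warnings;

# Hypothetical helper: return the characters in the range 0x00..0x7F
# that a bracketed character class body (e.g. 'A-z') matches.
sub matched_chars {
    my ($class) = @_;
    my $re = qr/\A[$class]\z/;
    return join '', grep { $_ =~ $re } map { chr } 0x00 .. 0x7F;
}

my $typo     = matched_chars('A-z');      # what was typed
my $intended = matched_chars('A-Za-z');   # what was presumably meant

# Characters the typo accepts that the intended class does not.
my $extra = join '', grep { index($intended, $_) < 0 } split //, $typo;

print "extra: $extra\n";   # on an ASCII platform: extra: [\]^_`
```

The six extra characters are exactly the code points between 'Z' (0x5A) and 'a' (0x61), so a pattern meant to accept only letters also accepts them without any complaint.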
