Re: PATCH [perl #59342] chr(0400) =~ /\400/ fails for >= 400

From: Glenn Linderman
Date: November 13, 2008 09:31
Subject: Re: PATCH [perl #59342] chr(0400) =~ /\400/ fails for >= 400
Message ID: 491C644D.9090403@NevCal.com

On approximately 11/12/2008 11:30 PM, came the following characters from 
the keyboard of demerphq:

> First please separate what Glenn said from what Rafael and I said,
> which is that it might be a good idea to deprecate octal IN REGULAR
> EXPRESSIONS.
> 
> I spoke perhaps more harshly than I meant originally, which is what
> kicked this off. I should have said "strongly discouraged" and not
> "deprecated".
> 
> Obviously from a back compat viewpoint we can't actually remove octal
> completely FROM THE REGEX ENGINE. At the very least there is a large
> amount of code that either generates octal sequences or contains them
> IN REGULAR EXPRESSIONS.
> 
> But we sure can say in the docs that "it is recommended that you do not
> use octal in regular expressions in new code as it is ambiguous as to
> how they will be interpreted, especially low value octal (excepting
> \0) can easily be mistaken for a backreference".


So providing an alternate octal syntax, such as \o{n}, might be a nice 
way of encouraging people to avoid the ambiguity, while giving those who 
like octal notation a better way to use it.  Merely recommending against 
the current octal notation forces people to convert bases, which may not 
be a pleasant choice for them.
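
For concreteness, here is a minimal sketch of the ambiguity under 
discussion.  The variable names are mine, chosen only for illustration; 
the point is that the same \10 means two different things depending on 
how many capture groups the pattern has.

```perl
use strict;
use warnings;

# One capture group: \10 cannot refer to a tenth group, so it is read as
# octal 010 (backspace, chr(8)).
my $one_group = "a\x08" =~ /(a)\10/ ? 1 : 0;

# Ten capture groups: the same \10 is now a backreference to group 10 ("j").
my $ten = qr/(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)\10/;
my $as_backref = "abcdefghijj"    =~ $ten ? 1 : 0;
my $as_octal   = "abcdefghij\x08" =~ $ten ? 1 : 0;

print "one group,  \\10 matches backspace: $one_group\n";   # 1
print "ten groups, \\10 matches 'j' again: $as_backref\n";  # 1
print "ten groups, \\10 matches backspace: $as_octal\n";    # 0
```

Nothing in the text of \10 itself tells the reader which of the two 
meanings applies; one has to count the parentheses.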


> Oh c'mon! You of all people must know a whole whack of ways to count
> them. You don't have to include them all in a mail. Gmail didn't even
> let me see the full list. The list also is a bit off-topic* as very
> few of those are actually in regular expressions, and amusingly the
> second item in your list isn't octal. Illustrating the problem nicely.


Yes, that is illustrative.  And that one is not the only one in Tom's 
list that is a backref, either.


> * Glenn changed the topic of this subthread somewhat by taking an idea
> and seeing how far he could run with it. But the original topic was
> octal IN REGULAR EXPRESSIONS, so let's keep it on that subject.


Exactly.  Exploring the boundaries of an idea can be educational, which 
can help make better decisions.

I remain a proponent of adding \o{n} and 0onnnn notations to perl, for 
three reasons: they add capability to octal notation for people who like 
and use octal, and for the few situations where octal is more 
interpretable than hex; they add consistency to the language (compared 
to hex and binary notations); and they would allow coders to remove 
ambiguity from regex notation.

Clearly, deprecating or removing the existing octal syntax, which is 
ambiguous within regex notation, whether in perl as a whole or only 
within regexes, would force some people to change code if they choose to 
upgrade to that version of perl.

As Chip mentioned off-line, perhaps I am looking for a "use strict" 
option that would allow people to choose to be forced to change such 
code.  Or a "use re" option that would do similarly.  Or a "use re" 
option that would warn only when the notation actually is ambiguous, 
i.e. count the captures, and warn about any \n notation where n is 
within the number of captures.  (Of course, \8 and \9 are not ambiguous, 
since 8 and 9 are not octal digits, but \10 could be, etc.)  Documenting 
such an option right along with the regex syntax, and highlighting the 
ambiguity, could help convince people to either use a new octal 
notation, or use the ambiguity detector, or both.
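
A rough sketch of what such an ambiguity detector might look like, 
written as an ordinary function over a pattern string rather than as a 
"use re" option.  The name ambiguous_escapes is hypothetical, and the 
capture counting is deliberately naive: it counts every plain "(" in the 
whole pattern and ignores character classes, named captures and 
comments.  It is not how perl itself parses a regex.

```perl
use strict;
use warnings;

# Hypothetical helper: list the \N escapes in a pattern string that a
# reader could take for either a backreference or an octal escape.
sub ambiguous_escapes {
    my ($pattern) = @_;
    my $groups = 0;
    my @escapes;

    # Walk the pattern one token at a time so that escaped characters
    # such as \( and \\ are skipped rather than miscounted.
    while ($pattern =~ /\G(?: \\([1-9][0-9]*) | \\. | (\()(?!\?) | . )/gcsx) {
        if    (defined $1) { push @escapes, $1 }   # \N with N >= 1
        elsif (defined $2) { $groups++ }           # plain capturing "("
    }

    # Flag \N only if N is written entirely in octal digits (so \8 and
    # \9 are never flagged) and N is within the number of captures.
    return map  { "\\$_" }
           grep { !/[89]/ && $_ <= $groups } @escapes;
}

# Two captures: flags \2 and \1, but not \8 or \10.
print "possibly ambiguous: $_\n"
    for ambiguous_escapes('(\w)(\w)\2\1 \8 \10');
```

Counting all captures in the pattern, rather than only those opened 
before the escape, is a simplification made for this sketch.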

Saying "don't do that" without offering a palatable alternative isn't 
always very effective.  With an alternative syntax, I'm sure Tom could 
write a regex in octal notation to convert the existing octal notation 
in existing regexes in existing documentation to the new notation.  But 
watch out for those backrefs, Tom!  :)


-- 
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
