Re: RFC: Perl manual pages -- update to follow the perlstyle.pod guidelines
From: Tom Christiansen
Date: April 2, 2010 13:59
Subject: Re: RFC: Perl manual pages -- update to follow the perlstyle.pod guidelines
Message ID: 16844.1270241942@chthon
In-Reply-To: Message from Jari Aalto <jari.aalto@cante.net>
of "Fri, 02 Apr 2010 21:30:56 +0300."
<87eiix7llr.fsf@jondo.cante.net>
Most of what you wrote is just fine. But there's a nasty myth that sorely
needs busting, or as Larry might phrase it, "in a rather dispassionate sort
of way, to put a bullet through its head."
> Let me tell you how I approach the documentation: With the eyes like how
> a starter perl programmer would read them.
That seems reasonable.
> From this perspective I tend
> to favor practices which:
> - Are safer than average (in this case I refer to: 'and', 'or' ops)
Not only are the spelt-out logical operators no safer than the punctuation
versions (they may even be worse), they're also much harder to read.
Let me briefly explain what I mean.
First, regarding "safety", you wouldn't believe how many times I've had to
correct naïve neophytes who've mistakenly swapped in the spelt-out versions
because they've been deceived by the myth that && and || and ! are just a
trio of nastily unsafe operators from old-fashioned perl4 or C code that
have no place in perl5.
$a = $b || $c; # works: parses as $a = ($b || $c)
$a = $b or $c; # FAILS: parses as ($a = $b) or $c
There is nothing to be gained from the spelt-out versions by programmers
who clearly delimit function arguments with parentheses -- which all should.
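
To make the difference concrete, here is a minimal sketch of the two
assignments above. The variable names and the values 0 and 42 are made up
for illustration; only the precedence behaviour is the point.

```perl
use strict;
use warnings;

# Hypothetical values: a false first choice and a true fallback.
my ($first, $fallback) = (0, 42);

# || binds tighter than =, so the fallback is part of what gets assigned.
my $with_pipes = $first || $fallback;

# "or" binds looser than =, so the statement below parses as
#     ($with_or = $first) or $fallback;
# and the fallback is evaluated and then thrown away.
my $with_or;
{
    no warnings 'void';    # perl would otherwise warn about the discarded value
    $with_or = $first or $fallback;
}

# Explicit parentheses make "or" behave, but then it buys nothing over ||.
my $with_parens = ($first or $fallback);

print "||:        $with_pipes\n";     # 42
print "or:        $with_or\n";        # 0
print "(... or):  $with_parens\n";    # 42
```

Note that with warnings enabled perl does flag the bad line, which is
about the only thing standing between a beginner and a silent bug.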
Second, being paren-stingy is dangerous in several ways. It's harder
to read, it's harder on beginners, and it's harder to maintain.
Omitting parentheses wherever one can creates a nightmare for whoever has
to maintain that code. True, anyone with a smattering of algebra will
likely understand the varying sorts of precedence and associativity involved
in expressions like
$n = 3*$x**2 + 17*$x + 32;
but even that is a bit questionable. Worse, many expressions in Perl are
fundamentally ambiguous otherwise. How do these parse:
[1] $x = fn -$y;
[2] $p = fn1 fn2 -$y, $z;
[3] fn1 $y, fn2 $x, fn3 fn4 fn5 $x, 1 while f5 $x && f5 $z;
Did you guess this way:
[1a] $delta = time() - $then;
or did you guess this way:
[1b] $size = abs(-$value);
And did you guess this:
[2a] $p4 = atan2(abs($y), $x);
or did you guess
[2b] $ok = STDERR->print(-$y, $x);
And did you happen to guess
[3a] push(@a, splice(@b, int(rand(scalar(@b))), 1))
         while scalar(@b) && scalar(@c);
or did you happen to guess one of the many, *many* other possible parses?
The answer is #1, you *do* not know and #2, you *cannot* know: an answer
that's all three of unkind, unsafe, and unmanageable. Perl has more
built-in functions than any beginner shall ever know the prototypes to,
and that isn't even counting user-defined functions. The same Perl
expression parses completely differently depending on visible prototypes
and dative-slot disambiguation. Given that, how are they supposed to know
whether it's fn(), fn($), fn($$), fn(@), or whatever if you don't use
parentheses to tell them? They cannot. By forcing people to guess, you
guarantee that someday, somewhere, someone will guess *wrong*. That's not
planned obsolescence: it's planned disaster.
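
Here is a small sketch of how the visible prototype alone changes the
parse of otherwise identical text. The sub names (nullary, unary,
takes_one, takes_list) are hypothetical stand-ins for the fn's above, not
anything from the core.

```perl
use strict;
use warnings;

# Hypothetical subs standing in for "fn"; only the prototypes matter.
sub nullary ()  { return 100 }          # takes nothing, like time
sub unary   ($) { return $_[0] * 2 }    # takes one scalar, like abs

my $y = 5;
my $n = nullary -$y;    # parses as nullary() - $y
my $u = unary   -$y;    # parses as unary(-$y)

# Same body, different prototypes: each returns how many args it received.
sub takes_one  ($) { return scalar @_ }
sub takes_list (@) { return scalar @_ }

my @one  = (takes_one  10, 20, 30);    # (takes_one(10), 20, 30)
my @list = (takes_list 10, 20, 30);    # (takes_list(10, 20, 30))

print "nullary -\$y = $n\n";      # 95
print "unary -\$y   = $u\n";      # -10
print "takes_one   : @one\n";     # 1 20 30
print "takes_list  : @list\n";    # 3
```

A reader who cannot see those sub declarations has no way to tell which
of these parses applies; with parentheses there is nothing to look up.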
Next is the matter of readability. That's what punctuation is there for.
Consider what happens to the previous paragraph without punctuation:
THE ANSWER IS 1 YOU DO NOT KNOW AND 2 YOU CANNOT KNOW AN ANSWER THAT
S ALL THREE OF UNKIND UNSAFE AND UNMANAGEABLE PERL HAS MORE BUILT
IN FUNCTIONS THAN ANY BEGINNER SHALL EVER KNOW THE PROTOTYPES TO
AND THAT ISN T EVEN COUNTING USER DEFINED FUNCTIONS THE SAME PERL
EXPRESSION PARSES COMPLETELY DIFFERENTLY DEPENDING ON VISIBLE
PROTOTYPES AND DATIVE SLOT DISAMBIGUATION GIVEN THAT HOW ARE THEY
SUPPOSED TO KNOW WHETHER IT S FN NOUGHT FN SCALAR FN SCALAR SCALAR FN
LIST OR WHATEVER IF YOU DON T USE PARENTHESES TO TELL THEM THEY CANNOT
BY FORCING PEOPLE TO GUESS YOU GUARANTEE THAT SOMEDAY SOMEWHERE SOMEONE
WILL GUESS WRONG THAT S NOT PLANNED OBSOLESCENCE IT S PLANNED DISASTER
There are now many misparses you'll get garden-pathed down in that
paragraph. Without the punctuated version to disambiguate alternate
parses, you would have had a much harder time making sense of it.
If that was too easy in its unpunctuated version, try this:
SENATVSPOPVLVSQVEROMANVSIMPERATORICAESARIDIVINERVAEFILIONERVAE
TRAIANOAVGVSTOGERMANICODACICOPONTIFICIMAXIMOTRIBVNICIAPOTESTATE
XVIIIMPERATORIVICONSVLIVIPATRIPATRIAEADDECLARANDVMQVANTAEALTITVDINIS
'Nuff said.
Now let's consider
[3] fn1 $y, fn2 $x, fn3 fn4 fn5 $x, 1 while f5 $x && f5 $z;
again. The one thing you can say about that mess is that at least
you can distinguish the nouns from the non-nouns. Why? Because
the nouns are still marked with Perl's characteristic prefix
I'm-a-noun marker, the $ sigil.
[3x] fn1 y pois fn2 x pois fn3 fn4 fn5 x pois baz quand f5 x et f5 z;
Version [3] scans a lot closer to [3x] than it does to [3a]. You can
scream messy-messy all you want about [3a], but it enjoys one sublime
property the others lack: anybody looking at it knows precisely which
arguments go with which function because it is not amenable to parsing
ambiguities reliant on how the arguments clump together. Surely this
is a property crucial to writing readable code!
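
For what it's worth, [3a] runs as written once the arrays exist. The
following sketch supplies made-up contents for @b and @c purely so that
the statement has something to chew on.

```perl
use strict;
use warnings;

# Assumed data for illustration only; [3a] itself is unchanged.
my @a;
my @b = (1 .. 5);
my @c = ('anything');

# Move one randomly chosen element from @b to @a until @b runs dry
# (or @c is empty). Every argument visibly belongs to one function.
push(@a, splice(@b, int(rand(scalar(@b))), 1))
    while scalar(@b) && scalar(@c);

print "moved ", scalar(@a), " elements: @a\n";
print "left in \@b: ", scalar(@b), "\n";
```

The order of @a varies from run to run, but nobody has to guess which
call receives the trailing 1.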
Punctuation guides eye and mind to trace the syntax of the expression.
Without these clues, it all runs together like so much porridge. Line [3]
is a lot clearer than [3x] because of the former's punctuation. Without
[3] to rely on, how quickly could you figure out what [3x] is doing?
You might say well, it wouldn't be as quick, and you would be right, but
remember we've already proven that one cannot reliably infer [3a] from [3]
anyway, so the answer is infinitely long. Apart from that slightly
pathological answer, can't you see how much easier it is to read && for a
logical conjunction than it is to read et to mean one? The reason for that
is the same as with the I'm-a-noun sigil: & does not look like a noun or
any keyword. Its texture is fundamentally different from any unmarked ident.
Coding elements like commas, parentheses, braces, and all the punctuational
operators stand out and apart from the rest of the code in a way that's
crucial for quickly and correctly apprehending the underlying *syntax*,
which even the Greeks knew meant "arranged together".
Text without distinctive syntactic markers is hard to arrange together in
your head, whether it's program text or natural-language text, and for many
reasons punctuation works much better for this than does non-punctuation.
That's why
while crunch func alpha and func beta func gamma
begin
    store next sum 1 last
end
does not arrange itself into meaningful togetherness with anything
like the speed with which this does:
while ( crunch(func($alpha)) && func($beta) && func($gamma) ) {
    $next = 1 + $last;
}
Omitting parens can in very limited circumstances improve things, and you
gave good examples of some of them. But far more often doing so will harm
legibility. Much the way begin and end are no win over { and }, only now
even worse because they abut and mimic identifiers, these spelt-out forms
you like, "and" and "or" and "not", are not the easy guidepoints of the
mind that are the tried and true &&, ||, and !.
Punctuation is your mind's friend: Eschew it at your own peril, but lead
not the little children into that desolate sea of flavorless porridge full
of silent syntactic snafus. It doesn't help anyone. It hurts them.
That's enough for now. The rest of what you wrote is plenty reasonable.
I just want to asphyxiate this myth that the spelt-out boolops are somehow
more readable or safer than the originals, when in fact quite often they're
precisely the opposite of those things. I know I'm not alone in this view.
--tom
--
Even if you aren't in doubt, consider the mental welfare of the
person who has to maintain the code after you, and who will probably
put parens in the wrong place. --Larry Wall