develooper Front page | perl.perl5.porters | Postings from August 2008

Re: [PATCH] Add open "|-" and open "-|" to perlopentut

From:
Tom Christiansen
Date:
August 25, 2008 01:36
Subject:
Re: [PATCH] Add open "|-" and open "-|" to perlopentut
Message ID:
25235.1219653324@chthon
In-Reply-To: Message from Shlomi Fish <shlomif@iglu.org.il> 
   of "Mon, 25 Aug 2008 06:30:21 +0300." <200808250630.21539.shlomif@iglu.org.il> 

> On Monday 25 August 2008, Eric Wilhelm wrote:
>> # from Shlomi Fish
>>
>> # on Sunday 24 August 2008 13:28:
>> >This patch documents open "|-" and open "-|" in perlopentut.
>>
>> Perhaps it should follow the practice of single quoting non-interpolated
>> strings like '|-' and 'fortune'?  

> Fixed in my copy.

>> Also, the filehandle in a simple scalar $pipe does not need to be
>> wrapped in a block when used as an argument to print.

>But:

> {{{{{{{{{{
> print $pipe "Hi there!\n"
> }}}}}}}}}}

> is not recommended by PBP and I agree that the $pipe does not really
> stand out there.

While PBP provides much good advice, some does not stand up so well to close
scrutiny.  However, I believe that this one does.  It is easier to explain
that these all work the same when you use a code block in the dative slot:

    print { $fh     } @data;

    print { $fh[$i] } @data;

    print { $ok
                ? *STDOUT 
                : *STDERR 
          } @data;

In contrast, it is not so easy to explain that only the
first of these works, or why:

    print $fh @data;
    print $fh[$i] @data;
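
A minimal sketch of that difference, assuming in-memory filehandles as
stand-ins for real files (the variable names here are made up for the
demonstration):

```perl
use strict;
use warnings;

# In-memory filehandles stand in for real files (hypothetical demo data).
my ($first, $second) = ("", "");
open(my $fh0, ">", \$first)  or die "can't open in-memory file: $!";
open(my $fh1, ">", \$second) or die "can't open in-memory file: $!";

my @fh   = ($fh0, $fh1);
my @data = ("alpha\n", "beta\n");
my $i    = 1;

print $fh0 @data;          # a simple scalar works without a block
print { $fh[$i] } @data;   # an array element needs the block

close($_) or die "close failed: $!" for @fh;
```

Both buffers end up holding the same two lines; only the block form lets
the array element serve as the filehandle.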

It also avoids the problem of explaining the missing comma, as 
folks are used to using close braces without a comma after them:

    @ordered_nums = sort {   $b <=> $a  } @random_nums; 
    @big_nums     = grep { length > 6   } @random_nums; 
    @cubic_nums   = map  { $_ * $_ * $_ } @random_nums;

Perhaps it is a personal idiosyncrasy, but alone of all functions, I do not
feel compelled to wrap everything in parens using list ops whose object is
clearly delimited by squirrel braces, but with others, I do.  For example:

    @big_nums     = grep(length() > 6, @random_nums);
    @cubic_nums   = map($_ ** 3, @random_nums);

But I rather uncordially mislike those latter forms, for sound and 
simple reasons I shall explore below.

It's the same way with other dative slots.  This summoning:

    speak { summon Wizard "Gandalf" } "friend";

risks no daemonic misparsing.  And while it's a bit more
legible than 

    Wizard->summon("Gandalf")->speak("friend")

the arrow notation is much better suited to left-to-right mental parsing
than the inside-outness of trying to tell the difference between

    ${$new[0]}   
     $$new[0]

Which is why we don't all write these, and wonder what x is:

    1.  $$x 
    2.  ${$x}[0] 
    3.  %$x 
    4.  ${$x[0]} 
    5.  @{$x[0]} 
    6.  %{$x{"java"}} 
    7.  @{$x}{"perl", "c"} 
    8.  ${${$x{"perl"}}[0]}{"rules"} 

Can you tell what "x" is in all those?  Sometimes it's easy, and
sometimes, well, its easiness is a matter open to debate.

    1. x is a scalar holding a scalar ref.
    2. x is a scalar holding an array ref, whose 0th elt we desire.
    3. x is a scalar holding a hash ref.
    4. x is an array whose 0th element holds a scalar ref.
    5. x is an array whose 0th element holds an array ref.
    6. x is a hash, the value of whose "java" element holds a hash ref.
    7. x is a scalar holding a hash ref, the values of whose "perl" 
       and "c" elements we desire.
    8. x is a hash, the value of whose "perl" key holds an array ref
       whose 0th element holds a hash ref, the value of whose "rules" 
       key we desire.

Isn't it much easier to read those as:

    1. ${$x}
    2. $x->[0]
    3. %{$x}
    4. ${ $x[0] } 
    5. @{ $x[0] } 
    6. %{ $x{java} } 
    7. @{$x}{"perl", "c"}
    8. $x{perl}[0]{rules}
    or $x{perl}->[0]->{rules}

Well, ok, not the 7th one.  But the point is that the left-to-right arrows
do help; if you can use them instead of inside-out processing, it's going to 
be a lighter cognitive load because you don't have to keep as deep a stack on 
your parser.  You can just LALR(1) or so and be done with it.
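
For what it's worth, here is a small sketch of the eighth case, assuming a
made-up %x with a single entry, showing that the inside-out and arrow
spellings reach the same value:

```perl
use strict;
use warnings;

# Hypothetical data: a hash whose "perl" value is an array ref
# whose 0th element is a hash ref.
my %x = ( perl => [ { rules => "yes" } ] );

my $inside_out = ${ ${ $x{"perl"} }[0] }{"rules"};   # read from the middle outward
my $arrows     = $x{perl}->[0]->{rules};             # read left to right
my $implicit   = $x{perl}[0]{rules};                 # arrows between subscripts are optional

print "$inside_out $arrows $implicit\n";
```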

If you have to go inside-out, then braces usually help, but sometimes (as
in the first set) they can hinder.  But putting them around the object when
you're using dative syntax

    print { STDERR } "panic string\n";

isn't such a bad thing.

Then again, I'm one of *those*, you know, those who think it's 
ok (and know why it's *un*ambiguous) to write:

    $obj = new ElvenRing::
                name    => "Narya",
                owner   => "Gandalf",
                domain  => "fire",
                stone   => "ruby";

Even so, I don't cavalierly write 

    @objs = new ElvenRing 3;

let alone the even-worse

    $obj = new ElvenRing;

Because I know the :: makes a critically disambiguating difference, 
one that guarantees you need never wonder whether there's a new() or
ElvenRing() function that's visible.  Others use this as a bogeyman to frighten
people from the dative, but just use args and/or :: and you'll be fine.
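
A sketch of the :: form in action, assuming a throwaway ElvenRing class
invented here purely for illustration.  It relies on Perl's "indirect"
feature being enabled, which is the default unless a newer feature bundle
such as "use v5.36" switches it off:

```perl
use strict;
use warnings;

# Hypothetical minimal class, just enough to receive the constructor call.
package ElvenRing;
sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

package main;

# Dative (indirect-object) call; the trailing :: marks ElvenRing
# as a class name rather than a possible function call.
my $dative = new ElvenRing::
                name  => "Narya",
                owner => "Gandalf";

# The same call in arrow notation.
my $arrow = ElvenRing->new(name => "Narya", owner => "Gandalf");

print ref($dative), " ", $dative->{name}, "\n";
```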

>> The error message could probably stand to contain 
>> $! as well.

> Fixed in my copy.

>> I find the md5sum example not very illustrative (because md5sum will
>> just print the result to stdout.)  What about simply restating the
>> aforementioned lpr and netstat examples?

> I converted the '|-' example to "lpr".

Nice to see better quotes.

>> Finally, should these examples be included *before* the
>> references to IPC::Open2 and perlipc?

> Don't know.

No.

> New patch attached.

How delightful: politically-correct Bowdlerization in programming 
languages' documentation.  Must we fiddle about with t's that are 
already crossed whilst Rome is burning?

For the record, and if I were to fix it back up so it actually reads as I
write (and would see others write) again, I'd certainly make doubly assured
always to use double-quotes in all places other than those scant few that
have just cause to see interpolation suppressed.

The two-fold reasoning for this is simple, succinct, and satisfying.  

The first part resides in surface syntax; the second, in deeper semantics.
Together, they compose a strong reason to do other than you intend.

#1: Using a double-quote (chr#34, "), no reader will ever wonder whether,
    given the current font and point size--and unbespectacled eyes leaning
    back--they're really looking at a single-quote (chr#39, ') or a
    back-quote (chr#96, `).  Everyone has come upon this scenario, and either
    guessed, scratched 'is head, or leant forward to peer ever-closer at the
    ambiguous glyph at 3 in the morning.

    Code that forces the reader to resort to such silliness is inevitably
    harder to read, understand, debug, maintain, and support.  It slips 
    by without you noticing, and the differences are profound.  It makes
    absolutely no sense whatsoever to impose a needlessly error-prone
    practice when you've no need to.

    Proof: Which one of these two equivalent expressions is quicker to 
           parse, and why?

                1) $Widget'sales
                2) $Widget::sales

    Answer: #2, because the :: provides far more space than a simple tick,
            is easier for the reader to see, and is thus less likely to be
            confused for something it's not.  (Rather easier to get into
            "strings", too.)  Small matter though it is, in writing maintainable
            code, little things like this do add up.  Or would you see the
            perl4 style resurrected?  I think not.

In a Huffman sort of way, we reserve the simplest, easiest-to-type, and
easiest-to-miss or misread, markings for those operations that are the 
most frequently used, the most fundamental to the language.  But 
the suppression of due interpolation is *not* the more common of the two
operations, a fact not an opinion, and one which I shall prove in #2.

#2: Perl's normal and customary strings are all interpolative.  To
    suppress interpolation requires special action.  To allow it to occur
    does not. This is not a string-substitution, macro-processing language,
    which is one way of saying that if you want cpp or m4, you know where
    to find it.  Rather, it is one where a variable represents a value
    *right* *now*, not later.  In fact, to do otherwise violates the 
    principle of least surprise.  

    If you were to come upon a function called as 

        $acme = zenith($values);

    and told it was defined as 

        $acme = distill($this > $that ? $this : $that, @those);

    wouldn't you wonder what happened when?  (See List::Util)

    That's why we write

        @hits = grep { /somepat/ } @targets;

    instead of 

        @hits = grep(/somepat/, @targets);

    and even more 

        @cubes = map { $_ ** 3 } @simples;

    over 

        @cubes = map($_ ** 3, @simples);

    Because the braces remind the reader that unlike a proper 
    function, such as

        $y = f($x)

    or 
        $tan = sin($x)/cos($x);

    the grep() function does not receive the evaluated results of its first
    argument.  Rather, it is passed in *UN*evaluated, or, if you would,
    uninterpolated.  This is so abnormal as to confuse the reader; hence
    the preferred alternate notation.

    [There's also the evil engendered by those writers who force their
     readers to wonder whether you get ($that, @those) due to lost souls
     enamoured of writing the perplexing and error-prone paren-free
     syntactic version:

        @these = distill $this > $that ? $this : $that, @those;

     over 

        @these = distill {$this > $that ? $this : $that} @those;

     but some other time.
    ]

    But Perl interpolates values in many places far more basic than this
    slightly esoteric nonsense.  Because of this, special action on the user's
    part is required to suppress this fundamental aspect of Perl's nature.

    Consider execution-quotes; that is, the interpolated commands provided
    by grave accent marks.

        $answer = `cmd $arg1 $arg2`;

    Tell me please: who expands $arg1 and $arg2?  Is it the shell,
    or is it Perl?  Of course the answer is Perl.  And unlike the shell,
    no amount of funny mixed quoting is going to change this.  
    You cannot write

        $answer = `cmd '$arg1' '$arg2'`;

    although you may write

        $answer = `cmd \$arg1 \$arg2`;

    You may also write

        $answer = qx'cmd $arg1 $arg2';

    Aren't those odd-looking?  That's because they run contrary
    to Perl nature.

    Let us continue.

    Perhaps you consider it a matter of esoterica, but some of us
    still write

	#include <stdio.h>

    and expect it to behave differently than 

	#include "myhack.h"

    behaves.  In the same manner, we like to write

	@his_dots = <~joe/.??*>;

    or even (alas, there's an FD<DH leak here)

	open(my $exrc, "<", <~joe/.exrc>) || die....

    Well, that brings us to an interesting observation:

	$suffix = "jpg";
	@pix = <*.$suffix>;

    Did that interpolate?  Why of course it did!  That's because Perl 
    is *by nature* interpolative.  I don't use quotes, so of course it
    expands.  (Just please don't make me explain the difference between
    <$var> and <${var}>; the shame!)

    Back to something easier.  If you write 

        $s = lc(<<EOS) . "and Frederick\n";
            [heredoc text]
        EOS

    then that heredoc text is subject to interpolation.

    This is just as occurs if you write:

        lc($heredoc_text) 

    The function does not receive the variable's name or address, but its
    contents.  That's because this is Perl's nature.  In all but
    extremely fringe cases, it makes nary a weaselly whit of difference
    to write that as 

        lc("$heredoc_text") 

    and so too makes it no difference to write

        $s = lc(<<"EOS") . "and Frederick\n";
            [heredoc text]
        EOS

    That's because Perl is interpolative *BY*NATURE*.

    Now, if you don't care for that, it's up to you to do something
    special.  At your whim, you may write 

        lc(\$heredoc_text) 

    or, if you prefer, you may write:

        lc('$heredoc_text') 

    And so too with the heredoc, for

        $s = lc(<<\EOS) . "and Frederick\n";
            [heredoc text]
        EOS

    here, too, means the same as 

        $s = lc(<<'EOS') . "and Frederick\n";
            [heredoc text]
        EOS
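
    A small sketch of the four heredoc spellings side by side, assuming
    a made-up $name variable as the thing to interpolate:

```perl
use strict;
use warnings;

my $name = "Frederick";   # hypothetical value to interpolate

# Unquoted and double-quoted terminators both interpolate.
my $bare = <<EOS;
Hello, $name
EOS

my $double = <<"EOS";
Hello, $name
EOS

# Single-quoted and backslashed terminators both suppress interpolation.
my $single = <<'EOS';
Hello, $name
EOS

my $backslash = <<\EOS;
Hello, $name
EOS

print $bare, $double, $single, $backslash;
```

    The first two come out as "Hello, Frederick"; the last two keep the
    literal text $name.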

    And this is just the tip of the iceberg, for the interpolative nature
    of Perl is a leitmotiv appearing in many scenes, wherein we use either
    single-quotes or backslashes (or sometimes either) to stop the normal
    thing from happening.

    Let us next consider pattern matching.

        $match_success = ($variable =~ /pattern/);
        $match_success = ($variable =~ 'pattern');
        $match_success = ($variable =~ /$pat_rx/);
        $match_success = ($variable =~ "$pat_rx");
        $match_success = ($variable =~  $pat_rx );

    are all the same, given that $pat_rx holds the same thing as 
    pattern.  To get something else, you must work much harder.

        $match_success = ($variable =~ m'pattern');
        $match_success = ($variable =~ m'$pat_rx');
        $match_success = ($variable =~  '$pat_rx');
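
    A runnable sketch of both groups, assuming a sample string and
    pattern made up for the occasion:

```perl
use strict;
use warnings;

my $variable = "a pattern in a haystack";   # hypothetical sample text
my $pat_rx   = "pattern";

# The interpolating forms: each one looks for the text "pattern".
my $m1 = ($variable =~ /pattern/)  ? 1 : 0;
my $m2 = ($variable =~ 'pattern')  ? 1 : 0;
my $m3 = ($variable =~ /$pat_rx/)  ? 1 : 0;
my $m4 = ($variable =~ "$pat_rx")  ? 1 : 0;
my $m5 = ($variable =~  $pat_rx )  ? 1 : 0;

# The suppressed forms: $pat_rx is never expanded, so the regex is an
# end-of-string anchor followed by the literal text "pat_rx".
my $s1 = ($variable =~ m'pattern') ? 1 : 0;   # no variables, still matches
my $s2 = ($variable =~ m'$pat_rx') ? 1 : 0;   # cannot match
my $s3 = ($variable =~  '$pat_rx') ? 1 : 0;   # cannot match

print "$m1$m2$m3$m4$m5 $s1$s2$s3\n";
```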

    I could show a similar demonstration with the applied 
    pattern-matching that occurs with s/// (vs s'''),
    with split(), and with grep(), but I bet you're starting
    to catch on.

    But we're not at all done.

    Perl's interpolative nature is hardly limited to scalars
    and to strings.

    Consider:

        fn(@alpha, @beta, @gamma);

    Those all interpolate.  True, they'd interpolate differently
    written with stringification:

        fn("@alpha", "@beta", "@gamma");

    but interpolate they still would.  And what do we do
    if we don't care for this?  The same as always: *special*
    processing.

        fn('@alpha', '@beta', '@gamma');
    or
        fn(\@alpha,  \@beta,  \@gamma);
    or even 
        sub fn;
        fn qw(@alpha  @beta    @gamma);
    or for that matter, the simpler and more useful:
        sub fn;
        fn \(@alpha,  @beta,   @gamma);
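
A sketch of what fn() actually receives in each case, assuming a stand-in
fn() that merely counts its arguments and three small arrays invented for
the demonstration:

```perl
use strict;
use warnings;

# Hypothetical stand-in for fn(): reports how many arguments it received.
sub fn { return scalar @_ }

my @alpha = (1, 2);
my @beta  = (3, 4, 5);
my @gamma = (6);

my $flattened   = fn(@alpha, @beta, @gamma);          # arrays flatten into one list
my $stringified = fn("@alpha", "@beta", "@gamma");    # one joined string per array
my $literal     = fn('@alpha', '@beta', '@gamma');    # three uninterpolated strings
my $by_ref      = fn(\@alpha, \@beta, \@gamma);       # three array references
my $ref_list    = fn(\(@alpha, @beta, @gamma));       # same three references

print "$flattened $stringified $literal $by_ref $ref_list\n";
```

Only the first call sees the six elements; every other spelling hands
over exactly three arguments.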

Are you yet convinced that Perl is interpolative by nature, that *not*
interpolating is a signal of something out of the ordinary occurring?
(Yes, qw// *is* an exception; so is q// -- but in a way, it's not.)

If not, then I give you one further scenario whereby the obsequious fawning
over 'asdf' instead of the more legible and more intuitive "asdf" will
cause trouble for any beginner, or maintainer, or software.

Consider the humble shell programmer, writing:

    % echo -n '\t' | wc -c
           1 
    % echo -n "\t" | wc -c
           1 

Or the C programmer writing 

    putchar('\t'); putchar('\n');

vs the one writing 

    puts("\t");

gets a chr#9 ("\cI") emitted, plus a newline.

As you see, both write a single character #9 to stdout, followed by a
newline.  There are no slackbatches slinking about.

And so we have a culture--no, strike that--we have *two* cultures, where
they are expecting \t, and \n too, to mean a single character.

If you follow normal Perl, simply letting Perl do what it does, then all
will be well: they'll get what they're expecting. But if you do *not*, if
you blindly follow some disjointed tune of a PC-mad[*] Pied Piper calling out
his rats upon an out-of-tune instrument and so blindly writing '\t' where
the more natural "\t" is needed, you'll not rid your town of rats, but
infest it with the same, as many will stumble, and some will fall.

And *woe* be unto he who must explain why 

    $tab = '\t';
    $got_tab = ($var =~ /$tab/);

(with or without the slashes) should work equally well as 

    $tab = "\t";
    $got_tab = ($var =~ /$tab/);

while the not so dissimilar

    $tab = "\t";
    $tab_ix = substr($big_string, $tab);

works, yet the wholly analogous

    $tab = '\t';
    $tab_ix = substr($big_string, $tab);

fails!
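
A small sketch of that asymmetry, assuming a made-up $big_string and using
index() for the plain-string lookup:

```perl
use strict;
use warnings;

my $big_string = "name\tvalue";   # hypothetical sample with one real tab

my $raw_tab  = '\t';   # two characters: a backslash and a "t"
my $real_tab = "\t";   # one character: chr(9)

# In a pattern, both work: the regex engine reads \t as a tab.
my $rx_raw  = ($big_string =~ /$raw_tab/)  ? 1 : 0;
my $rx_real = ($big_string =~ /$real_tab/) ? 1 : 0;

# In a plain string search, only the real tab is found.
my $ix_real = index($big_string, $real_tab);
my $ix_raw  = index($big_string, $raw_tab);

print "$rx_raw $rx_real $ix_real $ix_raw\n";   # prints "1 1 4 -1"
```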

My strong advice to use double-quotes in all situations save those rare
ones where you wish to counter Perl's own nature is one that
derives in *no* *part* from mere whim or personal preference.  

Rather, this is my *well-considered* position.  By that I mean that I've
thought deeply about it -- a *lot*, and for many, many years.  

And while I have also conferred with Larry about it, who was in agreement
and supplied much of the argument I've outlined above, I refuse to resort to
a blind appeal to authority for justification the way you have done, as
I prefer reason that can be made clear to all.  If I cannot persuade you
that my reasons are sensible, then I cannot, and that's all I ask.  

And so I am sticking to it, until and unless some stronger argument to 
the contrary has presented itself to me, addressing these many issues in a
convincing and logical fashion.

To date, none have.

I believe that I have solidly demonstrated not only is blindly writing
single quotes as a *default* a poor decision *typographically*, something
as self-apparent as anything can be, but far more importantly, I have also
shown that in turning by default to suppressive quoting, one must do
continual battle, swimming upstream as it were, against Perl's fundamental,
underlying interpolative nature, a leitmotiv that courses through Perl
again and again and again.

To battle against that nature is not only more bother than sense, it also
risks causing needless, irritating, and perhaps even dangerous confusion
amongst writers and maintainers of Perl code.

So I needs must ask: *why* are you fighting Perl?  Cui bono?

--tom

[*]: "politically correct"
     "programically correct"
     "politically *and* programically correct"
     "the way the right suppresses the left"
     "where one group silences or enforces its speech-will on another"
     "all of the above"



