perl.perl6.users | Postings from November 2018

Diamond <> or fileinput-like input handling (was Re: what type$in,$out and $err is)

From: Trey Harris
Date: November 5, 2018 23:37
Subject: Diamond <> or fileinput-like input handling (was Re: what type$in,$out and $err is)
Message ID: CALKJ+EtBF8_3PfvFLXLoDcN0Z-psbwHLCn=asn=0NCiWcbO-Kw@mail.gmail.com

On Mon, Nov 5, 2018 at 11:54 AM Ralph Mellor <ralphdjmellor@gmail.com> wrote:

> On Sun, Oct 28, 2018 at 7:26 PM Xiao Yafeng <xyf.xiao@gmail.com> wrote:
>
>> Besides, just curious, why choose '_' as default it looks strange....
>
> Turns out it's deprecated in 6.d:
>
> https://marketing.perl6.org/id/1541379592/pdf_digital

Is it deprecated for Proc objects too, in addition to &open as mentioned in
the ChangeLog <https://github.com/perl6/6.d-prep/blob/master/ChangeLog.md>?

They mean different things (that simply happen to semantically overlap
frequently):

   - In &open <https://docs.perl6.org/routine/open>, use of '-' indicates $*IN
   or $*OUT for read-only or write-only uses respectively.
   - In Proc <https://docs.perl6.org/type/Proc>, the default of '-'
   indicates standard POSIX-like practice of inheriting stdin, stdout and
   stderr from the parent, no matter how the parent (or its parent(s)) have
   duped or redirected them.

Judging from the respective code, I think it is not deprecated for Proc; as
the ChangeLog states, the deprecation applies to &open only.

This passage
<https://docs.perl6.org/language/variables#Argument_related_variables> from
the built-in variables doc is interesting:

Argument related variables

   - $*ARGFILES An IO::ArgFiles <https://docs.perl6.org/type/IO::ArgFiles>
   (an empty subclass of IO::CatHandle) that uses @*ARGS as source files,
   if it contains any files, or $*IN otherwise. When $*IN is used, its
   :nl-in, :chomp, :encoding, and :bin will be set on the IO::ArgFiles
   <https://docs.perl6.org/type/IO::ArgFiles> object.

As of 6.d language, $*ARGFILES *inside* sub MAIN
<https://docs.perl6.org/language/functions#sub_MAIN> is always set to $*IN,
even when @*ARGS is not empty.

   - @*ARGS Arguments from the command line.

This is getting at the issue with replicating the common “while diamond”
operation (originally just while (<>)) in Perl 5, which allowed easy
replication of Unix/POSIX default line-oriented file-handling behavior:
namely, concatenating the lines of each filename given in the optional
positionals, unless the filename was the single hyphen "-", in which case
lines from standard input were read instead; when the positionals were
empty, an implicit "-" was assumed.

(If you haven’t played with this behavior directly, it’s very easy to just
use cat or grep to observe this behavior; with no files, stdin is used;
with files, stdin is usually not used, but - alone can be used to switch to
stdin after reading named files. And not just “after”—it’s permissible,
although a bit unusual, to put - between file names, in which case lines
from the named files preceding the hyphen are processed, then lines from
stdin are processed until stdin closes, then files named after the hyphen
are processed. It’s even allowable to include the single hyphen more than
once, which forces reopening of stdin for another round of input until it’s
closed again. This behavior is ubiquitous in the Unix world; Perl 5 had it
built-in, Python has its fileinput module
<https://docs.python.org/3/library/fileinput.html>, and so on.)
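
To make that convention concrete, here is a minimal sketch in Python (chosen
only because fileinput was just mentioned; it is not a claim about how cat,
Perl 5, or fileinput are actually implemented). The function name
diamond_lines and its stdin parameter are hypothetical, introduced purely for
illustration:

```python
import sys
from typing import Iterable, Iterator, Optional, TextIO


def diamond_lines(args: Iterable[str],
                  stdin: Optional[TextIO] = None) -> Iterator[str]:
    """Yield lines from each argument in order; a lone "-" means stdin.

    Hypothetical helper sketching the Unix convention described above.
    """
    stdin = sys.stdin if stdin is None else stdin
    # With no positionals at all, behave as if a single "-" had been given.
    names = list(args) or ["-"]
    for name in names:
        if name == "-":
            # Read standard input until it reports end-of-file.
            yield from stdin
        else:
            # Ordinary file name: open it, stream its lines, then close it.
            with open(name) as fh:
                yield from fh
```

Called as diamond_lines(sys.argv[1:]), this reads the named files in order,
switches to stdin wherever a lone - appears, and falls back to stdin alone
when no arguments are given. A repeated - simply iterates stdin again, which
may yield further lines only if stdin can still deliver them (for example, a
terminal after an end-of-file keystroke).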

How to replicate this behavior in Raku without writing all the
argument-handling and opening/closing logic yourself is now… unclear, at
best, and the facility may simply be missing.

In this passage <https://docs.perl6.org/language/5to6-nutshell#Loops> from
the Perl 5-to-6 guide (which, as an ancillary help document, does not have
any authority for language specification), we read:

Note that reading line-by-line from a filehandle has changed.

In Perl 5, it was done in a while loop using the diamond operator. Using for
instead of while was a common bug, because the for causes the whole file to
be sucked in at once, swamping the program’s memory usage.

In Perl 6, for statement is *lazy*, so we read line-by-line in a for loop
using the .lines method.

while (<IN_FH>)  { } # Perl 5

for $IN_FH.lines { } # Perl 6

$IN_FH is, I think, given in the latter example in a hand-wavy way. The
thing that comes *closest* to cat, grep, Perl 5’s diamond, or Python’s
fileinput is—I believe—the unadorned routine lines()
<https://docs.perl6.org/routine/lines>:

for lines() -> $line { ... }

but it has the nonstandard behavior (which may be a bug) of only observing
the first hyphen found in the argument list and ignoring any successive
ones (at least in 6.c). Also, while I’m having trouble getting a 6.d build
to work for me right now, I think Raku will give a deprecation notice when
it encounters that first hyphen, so to get the same behavior you’ll again
need to handle all hyphens on the command line yourself.

Just fyi — It’s entirely possible I’m wrong here, and just haven’t found
the right bit of the docs, or that once I get 6.d built I’ll find that
$*ARGFILES.lines — which would *seem* to me to be the right thing to use —
works for this purpose.

I’m not clear why its behavior is changed in sub MAIN—obviously, you can
handle the positionals yourself in the MAIN signature, but you lose the
special stdin handling.

Trey
