
Re: Raku version of "The top 10 tricks of Perl one-liners" ?!?

From: Timo Paulssen
Date: July 23, 2020 00:01
Subject: Re: Raku version of "The top 10 tricks of Perl one-liners" ?!?
Message ID: f70c65b3-a1c4-6105-8b61-b8fcb457bb40@wakelift.de

Try it with a very full folder, though. I would expect the majority of
the time to be spent on setup; actually going through the lines themselves
isn't very slow.

On 22/07/2020 22:31, Aureliano Guedes wrote:
> That is a little bit disappointing:
>
> $ time ls -l | perl -lane 'print "$F[7] $F[1]"' > /dev/null
>
> real 0m0.008s
> user 0m0.013s
> sys 0m0.000s
>
> $ time ls -l | raku -ne 'say "$_[7] $_[1]" given .words' > /dev/null
> Use of Nil in string context
>   in block  at -e line 1
>
> real 0m0.302s
> user 0m0.370s
> sys 0m0.060s 
>
>
> The delay is so long that I wouldn't use that in a very full folder.
> Perhaps it will be improved (I hope).
>
> On Wed, Jul 22, 2020 at 4:21 PM Larry Wall <larry@wall.org> wrote:
>
>     On Sun, Jul 19, 2020 at 09:38:31PM -0700, William Michels via perl6-users wrote:
>     : Hello,
>     :
>     : I ran across this 2010 Perl(5) article on the Oracle Linux Blog:
>     :
>     : "The top 10 tricks of Perl one-liners"
>     :
>     : https://blogs.oracle.com/linux/the-top-10-tricks-of-perl-one-liners-v2
>     :
>     : Q1. Now that it's a decade later--and Raku (née Perl6) has hit the
>     : scene--can someone translate the 'top ten tricks' in the blog article
>     : above into Raku?
>     :
>     : Q2. Are many of the ten Perl(5) one-liner 'tricks' unnecessary in Raku
>     : (better defaults, more regularized regexes, etc.)?
>     :
>     : Best, Bill.
>
>     Yes, and yes.  :-)
>
>     More specifically, here's my take.
>
>     >   Trick #1: -l
>     >
>     >        Smart newline processing. Normally, perl hands you entire lines,
>     >        including a trailing newline. With -l, it will strip the trailing
>     >        newline off of any lines read, and automatically add a newline to
>     >        anything you print (including via -p).
>     >
>     >        Suppose I wanted to strip trailing whitespace from a file. I might
>     >        naïvely try something like
>     >
>     >        perl -pe 's/\s*$//'
>     >
>     >        The problem, however, is that the line ends with "\n", which is
>     >        whitespace, and so that snippet will also remove all newlines from
>     >        my file! -l solves the problem, by pulling off the newline before
>     >        handing my script the line, and then tacking a new one on afterwards:
>     >
>     >        perl -lpe 's/\s*$//'
>
>     This trick is not needed in Raku, since newlines are stripped by default.
>     Also, there are .trim methods that you can use instead of a regex.
>
>     >    Trick #2: -0
>     >
>     >        Occasionally, it's useful to run a script over an entire file,
>     >        or over larger chunks at once. -0 makes -n and -p feed you chunks
>     >        split on NULL bytes instead of newlines. This is often useful for,
>     >        e.g. processing the output of find -print0. Furthermore, perl -0777
>     >        makes perl not do any splitting, and pass entire files to your script
>     >        in $_.
>     >
>     >        find . -name '*~' -print0 | perl -0ne unlink
>     >
>     >        Could be used to delete all ~-files in a directory tree, without
>     >        having to remember how xargs works.
>
>     The key word above is "occasionally", so most of these seldom-used
>     switches are gone. Also, most of their functions are really easy to do
>     from inside the language. So these days dividing a file by null chars
>     would typically be handled with:
>
>         for slurp.split("\0") { ... }
>
>     >    Trick #3: -i
>     >
>     >        -i tells perl to operate on files in-place. If you use -n or -p with
>     >        -i, and you pass perl filenames on the command-line, perl will run
>     >        your script on those files, and then replace their contents with the
>     >        output. -i optionally accepts a backup suffix as argument; Perl will
>     >        write backup copies of edited files to names with that suffix added.
>     >
>     >        perl -i.bak -ne 'print unless /^#/' script.sh
>     >
>     >        Would strip all whole-line comments from script.sh, but leave a copy
>     >        of the original in script.sh.bak.
>
>     I'm not aware of a direct replacement for this in Raku.  Perl has to be
>     better at something...
>
>     >    Trick #4: The .. operator
>     >
>     >        Perl's .. operator is a stateful operator -- it remembers state
>     >        between evaluations. As long as its left operand is false, it returns
>     >        false; Once the left hand returns true, it starts evaluating the
>     >        right-hand operand until that becomes true, at which point, on
>     >        the next iteration it resets to false and starts testing the other
>     >        operand again.
>     >
>     >        What does that mean in practice? It's a range operator: It can be
>     >        easily used to act on a range of lines in a file. For instance,
>     >        I can extract all GPG public keys from a file using:
>     >
>     >        perl -ne 'print if /-----BEGIN PGP PUBLIC KEY BLOCK-----/../-----END PGP PUBLIC KEY BLOCK-----/' FILE
>
>     The scalar .. operator in Perl translates to the ff operator in Raku.
>     It's slightly less magical, however, insofar as it won't treat bare
>     numbers as line numbers in the input.
>
>     >    Trick #5: -a
>     >
>     >        -a turns on autosplit mode – perl will automatically split input
>     >        lines on whitespace into the @F array. If you ever run into any advice
>     >        that accidentally escaped from 1980 telling you to use awk because
>     >        it automatically splits lines into fields, this is how you use perl
>     >        to do the same thing without learning another, even worse, language.
>     >
>     >        As an example, you could print a list of files along with their link
>     >        counts using
>     >
>     >        ls -l | perl -lane 'print "$F[7] $F[1]"'
>
>     This feature was always a bit suspect because it hard-wired a particular
>     name.  You don't even need a weird name in Raku:
>
>          ls -l | raku -ne 'say "$_[7] $_[1]" given .words'
>
>     >    Trick #6: -F
>     >
>     >        -F is used in conjunction with -a, to choose the delimiter on
>     >        which to split lines. To print every user in /etc/passwd (which is
>     >        colon-separated with the user in the first column), we could do:
>     >
>     >        perl -F: -lane 'print $F[0]' /etc/passwd
>
>     Again, we felt this switch wasn't really pulling its weight, so we pulled it
>     in favor of explicit split or comb:
>
>          raku -ne 'say $_[0] given .split(":")' /etc/passwd
>
>     >    Trick #7: \K
>     >
>     >        \K is undoubtedly my favorite little-known-feature of Perl regular
>     >        expressions. If \K appears in a regex, it causes the regex matcher to
>     >        drop everything before that point from the internal record of "Which
>     >        string did this regex match?". This is most useful in conjunction
>     >        with s///, where it gives you a simple way to match a long expression,
>     >        but only replace a suffix of it.
>     >
>     >        Suppose I want to replace the From: field in an email. We could
>     >        write something like
>     >
>     >        perl -lape 's/(^From:).*/$1 Nelson Elhage <nelhage\@ksplice.com>/'
>     >
>     >        But having to parenthesize the right bit and include the $1 is
>     >        annoying and error-prone. We can simplify the regex by using \K to
>     >        tell perl we won't want to replace the start of the match:
>     >
>     >        perl -lape 's/^From:\K.*/ Nelson Elhage <nelhage\@ksplice.com>/'
>
>     Perl's \K \k becomes <( )> in Raku.  Note that there are other regex
>     changes as well, and that in the replacement it's not necessary to
>     escape the @ in the absence of brackets:
>
>          raku -pe 's/ ^ "From:" <(.*)> / Nelson Elhage <nelhage@ksplice.com>/'
>
>     The )> is not required to balance there, but helps clarify the
>     intention.  If you do have a quoting problem in the replacement, you
>     can use the assignment form with any other form of quoting instead:
>
>          raku -pe 's[ ^ "From:" <(.*)> ] = Q[Nelson Elhage <nelhage@ksplice.com>]'
>
>     >    Trick #8: $ENV{}
>     >
>     >        When you're writing a one-liner using -e in the shell, you generally
>     >        want to quote it with ', so that dollar signs inside the one-liner
>     >        aren't expanded by the shell. But that makes it annoying to use a '
>     >        inside your one-liner, since you can't escape a single quote inside
>     >        of single quotes, in the shell.
>     >
>     >        Let's suppose we wanted to print the username of anyone in /etc/passwd
>     >        whose name included an apostrophe. One option would be to use a
>     >        standard shell-quoting trick to include the ':
>     >
>     >        perl -F: -lane 'print $F[0] if $F[4] =~ /'"'"'/' /etc/passwd
>     >
>     >        But counting apostrophes and backslashes gets old fast. A better
>     >        option, in my opinion, is to use the environment to pass the regex
>     >        into perl, which lets you dodge a layer of parsing entirely:
>     >
>     >        env re="'" perl -F: -lane 'print $F[0] if $F[4] =~ /$ENV{re}/' /etc/passwd
>     >
>     >        We use the env command to place the regex in a variable called re,
>     >        which we can then refer to from the perl script through the %ENV
>     >        hash. This way is slightly longer, but I find the savings in counting
>     >        backslashes or quotes to be worth it, especially if you need to end
>     >        up embedding strings with more than a single metacharacter.
>
>     This is rather Unix-centric on the face of it, since on Windows you'd
>     have to use outer "" quoting instead.  But you can certainly use the
>     same trick with Raku, provided you spell ENV right:
>
>          env re="'" raku -ne '(say .[0] if .[4] ~~ /<{ %*ENV<re> }>/) given .split(":")' /etc/passwd
>
>     It probably won't be very efficient though, and doesn't do a thing
>     for readability. Much easier to use a character name:
>
>          raku -ne '(say .[0] if .[4] ~~ /\c[APOSTROPHE]/) given .split(":")' /etc/passwd
>
>     You could backport that trick to Perl using \N{} too, I guess.
>
>     >    Trick #9: BEGIN and END
>     >
>     >        BEGIN { ... } and END { ... } let you put code that gets run entirely
>     >        before or after the loop over the lines.
>     >
>     >        For example, I could sum the values in the second column of a CSV
>     >        file using:
>     >
>     >        perl -F, -lane '$t += $F[1]; END { print $t }'
>
>     Same trick, except you can omit the brackets:
>
>          raku -ne 'my $t += .[1] given .split(","); END say $t'
>
>     Note the 'my' is required because strict is the default.
>
>     >    Trick #10: -MRegexp::Common
>     >
>     >        Using -M on the command line tells perl to load the given module
>     >        before running your code. There are thousands of modules available
>     >        on CPAN, numerous of them potentially useful in one-liners, but
>     >        one of my favorite for one-liner use is Regexp::Common, which, as
>     >        its name suggests, contains regular expressions to match numerous
>     >        commonly-used pieces of data.
>     >
>     >        The full set of regexes available in Regexp::Common is available in
>     >        its documentation, but here's an example of where I might use it:
>     >
>     >        Neither the ifconfig nor the ip tool that is supposed to replace it
>     >        provide, as far as I know, an easy way of extracting information for
>     >        use by scripts. The ifdata program provides such an interface, but
>     >        isn't installed everywhere. Using perl and Regexp::Common, however,
>     >        we can do a pretty decent job of extracting an IP from ip's output:
>     >
>     >        ip address list eth0 | \
>     >          perl -MRegexp::Common -lne 'print $1 if /($RE{net}{IPv4})/'
>
>     I don't know if there's anything quite comparable.  And who's to say
>     what's "common" anymore...   Certainly we have -M.  But Raku's regex
>     and grammars are so much more powerful that these things are likely to
>     be kept in more specific Grammar modules anyway, or just hand-rolled for
>     the purpose on the spot.
>
>     >    ~nelhage
>
>     Larry
>
>
>
> -- 
> Aureliano Guedes
> skype: aureliano.guedes
> contato:  (11) 94292-6110
> whatsapp +5511942926110
