Front page | perl.perl6.users |
Postings from July 2020
Re: Raku version of "The top 10 tricks of Perl one-liners" ?!?
From: Timo Paulssen
Date: July 23, 2020 00:01
Subject: Re: Raku version of "The top 10 tricks of Perl one-liners" ?!?
Message ID: f70c65b3-a1c4-6105-8b61-b8fcb457bb40@wakelift.de
Try it with a very full folder, though. I would expect the majority of
the time to be spent on startup; actually going through the lines
themselves isn't very slow.
On 22/07/2020 22:31, Aureliano Guedes wrote:
> That is a little bit disappointing:
>
> $ time ls -l | perl -lane 'print "$F[7] $F[1]"' > /dev/null
>
> real 0m0.008s
> user 0m0.013s
> sys 0m0.000s
>
> $ time ls -l | raku -ne 'say "$_[7] $_[1]" given .words' > /dev/null
> Use of Nil in string context
> in block at -e line 1
>
> real 0m0.302s
> user 0m0.370s
> sys 0m0.060s
>
>
> The delay is so long that I wouldn't use that in a very full folder.
> I know it will probably be improved (I hope).
>
> On Wed, Jul 22, 2020 at 4:21 PM Larry Wall <larry@wall.org> wrote:
>
> On Sun, Jul 19, 2020 at 09:38:31PM -0700, William Michels via
> perl6-users wrote:
> : Hello,
> :
> : I ran across this 2010 Perl(5) article on the Oracle Linux Blog:
> :
> : "The top 10 tricks of Perl one-liners"
> :
> : https://blogs.oracle.com/linux/the-top-10-tricks-of-perl-one-liners-v2
> :
> : Q1. Now that it's a decade later--and Raku (née Perl6) has hit the
> : scene--can someone translate the 'top ten tricks' in the blog article
> : above into Raku?
> :
> : Q2. Are many of the ten Perl(5) one-liner 'tricks' unnecessary in Raku
> : (better defaults, more regularized regexes, etc.)?
> :
> : Best, Bill.
>
> Yes, and yes. :-)
>
> More specifically, here's my take.
>
> > Trick #1: -l
> >
> > Smart newline processing. Normally, perl hands you entire lines,
> > including a trailing newline. With -l, it will strip the trailing
> > newline off of any lines read, and automatically add a newline to
> > anything you print (including via -p).
> >
> > Suppose I wanted to strip trailing whitespace from a file. I might
> > naïvely try something like
> >
> > perl -pe 's/\s*$//'
> >
> > The problem, however, is that the line ends with "\n", which is
> > whitespace, and so that snippet will also remove all newlines from
> > my file! -l solves the problem, by pulling off the newline before
> > handing my script the line, and then tacking a new one on afterwards:
> >
> > perl -lpe 's/\s*$//'
>
> This trick is not needed in Raku, since newlines are stripped by default.
> Also, there are .trim methods that you can use instead of regex.
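
A minimal sketch of the .trim approach mentioned above, using made-up sample lines rather than a real file. It only assumes the built-in Str method .trim-trailing, which removes whitespace at the end of a string and leaves the start alone.

```raku
# Sample lines with trailing whitespace (made up for illustration).
my @lines = "alpha  ", "beta\t", "  gamma";

# .trim-trailing strips whitespace from the end only; the leading
# spaces on the third line are kept.
my @cleaned = @lines.map(*.trim-trailing);

.say for @cleaned;

# On the command line the same idea would look roughly like:
#   raku -pe '$_ = .trim-trailing' FILE
```
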
>
> > Trick #2: -0
> >
> > Occasionally, it's useful to run a script over an entire file,
> > or over larger chunks at once. -0 makes -n and -p feed you chunks
> > split on NULL bytes instead of newlines. This is often useful for,
> > e.g. processing the output of find -print0. Furthermore, perl -0777
> > makes perl not do any splitting, and pass entire files to your script
> > in $_.
> >
> > find . -name '*~' -print0 | perl -0ne unlink
> >
> > Could be used to delete all ~-files in a directory tree, without
> > having to remember how xargs works.
>
> The key word above is "occasionally", so most of these seldom-used
> switches are gone. Also, most of their functions are really easy to do
> from inside the language. So these days dividing a file by null chars
> would typically be handled with:
>
> for slurp.split("\0") { ... }
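
To make the split-on-NUL idea concrete, here is a small sketch that works on an in-memory string instead of standard input. The sample names are invented, and the snippet only prints them; what to do with each chunk is left open, as in the original.

```raku
# Stand-in for what `find -print0` would write to standard input
# (file names are hypothetical).
my $input = "a.txt~\0dir/b.txt~\0";

# :skip-empty drops the empty string left after the final NUL.
my @names = $input.split("\0", :skip-empty);

say "would process: $_" for @names;

# Reading from standard input instead, the loop would be roughly:
#   for slurp.split("\0", :skip-empty) { ... }
```
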
>
> > Trick #3: -i
> >
> > -i tells perl to operate on files in-place. If you use -n or -p with
> > -i, and you pass perl filenames on the command-line, perl will run
> > your script on those files, and then replace their contents with the
> > output. -i optionally accepts a backup suffix as argument; Perl will
> > write backup copies of edited files to names with that suffix added.
> >
> > perl -i.bak -ne 'print unless /^#/' script.sh
> >
> > Would strip all whole-line comments from script.sh, but leave a copy
> > of the original in script.sh.bak.
>
> I'm not aware of a direct replacement for this in Raku. Perl has to be
> better at something...
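
There is no switch for it, but the effect can be approximated by hand. The following is only a sketch under assumptions: edit-in-place is a hypothetical helper, not a built-in, and it reads the whole file into memory before writing, so it suits small files. It uses the IO::Path methods lines, copy and spurt, and a throwaway file in the temp directory.

```raku
# Hypothetical helper, not part of Raku: rewrite a file in place,
# keeping only the lines that &keep accepts, and leave a backup copy.
sub edit-in-place(IO::Path $file, &keep, Str :$suffix = ".bak") {
    my @lines = $file.lines;               # read everything first
    $file.copy(($file ~ $suffix).IO);      # e.g. script.sh -> script.sh.bak
    $file.spurt(@lines.grep(&keep).map(* ~ "\n").join);
}

# Demo on a scratch file (name invented for the example).
my $script = $*TMPDIR.add("edit-in-place-demo.sh");
$script.spurt("# a comment\necho hello\n# another\necho bye\n");

edit-in-place($script, { !.starts-with("#") });
print $script.slurp;
```
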
>
> > Trick #4: The .. operator
> >
> > Perl's .. operator is a stateful operator -- it remembers state
> > between evaluations. As long as its left operand is false, it returns
> > false; Once the left hand returns true, it starts evaluating the
> > right-hand operand until that becomes true, at which point, on
> > the next iteration it resets to false and starts testing the other
> > operand again.
> >
> > What does that mean in practice? It's a range operator: It can be
> > easily used to act on a range of lines in a file. For instance,
> > I can extract all GPG public keys from a file using:
> >
> > perl -ne 'print if /-----BEGIN PGP PUBLIC KEY BLOCK-----/../-----END PGP PUBLIC KEY BLOCK-----/' FILE
>
> The scalar .. operator in Perl translates to the ff operator in Raku.
> It's slightly less magical, however, insofar as it won't treat bare
> numbers as line numbers in the input.
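
As an illustration of ff, here is a sketch over a few invented lines. Both operands are matched against $_, and the lines from the first match through the closing match (inclusive) are kept.

```raku
# Made-up input; the key material is a placeholder.
my @lines =
    "some preamble",
    "-----BEGIN PGP PUBLIC KEY BLOCK-----",
    "mQENBFexampledata",
    "-----END PGP PUBLIC KEY BLOCK-----",
    "some trailer";

# ff turns true when the left regex matches $_ and stays true
# up to and including the line where the right regex matches.
my @block = gather for @lines {
    take $_ if / "BEGIN PGP PUBLIC KEY BLOCK" / ff / "END PGP PUBLIC KEY BLOCK" /;
}

.say for @block;

# As a one-liner this would be roughly:
#   raku -ne '.say if /"BEGIN PGP PUBLIC KEY BLOCK"/ ff /"END PGP PUBLIC KEY BLOCK"/' FILE
```
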
>
> > Trick #5: -a
> >
> > -a turns on autosplit mode – perl will automatically split input
> > lines on whitespace into the @F array. If you ever run into any advice
> > that accidentally escaped from 1980 telling you to use awk because
> > it automatically splits lines into fields, this is how you use perl
> > to do the same thing without learning another, even worse, language.
> >
> > As an example, you could print a list of files along with their link
> > counts using
> >
> > ls -l | perl -lane 'print "$F[7] $F[1]"'
>
> This feature was always a bit suspect because it hard-wired a particular
> name. You don't even need a weird name in Raku:
>
> ls -l | raku -ne 'say "$_[7] $_[1]" given .words'
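
The "Use of Nil in string context" warning reported further up most likely comes from lines with fewer than eight fields, such as the "total" line that ls -l prints first. A sketch of one way to guard against that, on an invented listing (which column lands at index 7 depends on your ls):

```raku
# Invented `ls -l` style output, including the short "total" line.
my @listing =
    "total 8",
    "-rw-r--r-- 1 user group 120 Jul 22 10:15 notes.txt",
    "drwxr-xr-x 2 user group 4096 Jul 22 10:15 src";

my @out = gather for @listing {
    # .List makes the fields reusable for both the count and the lookups.
    given .words.List {
        # Skip lines that are too short instead of interpolating Nil.
        take "$_[7] $_[1]" if .elems > 7;
    }
}

.say for @out;
```
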
>
> > Trick #6: -F
> >
> > -F is used in conjunction with -a, to choose the delimiter on
> > which to split lines. To print every user in /etc/passwd (which is
> > colon-separated with the user in the first column), we could do:
> >
> > perl -F: -lane 'print $F[0]' /etc/passwd
>
> Again, we felt this switch wasn't really pulling its weight, so we pulled it
> in favor of explicit split or comb:
>
> raku -ne 'say $_[0] given .split(":")' /etc/passwd
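
For comparison, a short sketch of the two approaches on a single invented passwd-style line. split names the delimiter, while comb names what a field looks like. One difference to keep in mind: with the comb pattern used here, empty fields produce no match, so later indexes can shift.

```raku
# Invented passwd-style line.
my $line = "root:x:0:0:root:/root:/bin/bash";

my $by-split = $line.split(":")[0];          # split on the delimiter
my $by-comb  = $line.comb(/ <-[:]>+ /)[0];   # collect runs of non-colons

say $by-split;
say $by-comb;
```
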
>
> > Trick #7: \K
> >
> > \K is undoubtedly my favorite little-known-feature of Perl regular
> > expressions. If \K appears in a regex, it causes the regex matcher to
> > drop everything before that point from the internal record of "Which
> > string did this regex match?". This is most useful in conjunction
> > with s///, where it gives you a simple way to match a long expression,
> > but only replace a suffix of it.
> >
> > Suppose I want to replace the From: field in an email. We could
> > write something like
> >
> > perl -lape 's/(^From:).*/$1 Nelson Elhage <nelhage\@ksplice.com>/'
> >
> > But having to parenthesize the right bit and include the $1 is
> > annoying and error-prone. We can simplify the regex by using \K to
> > tell perl we won't want to replace the start of the match:
> >
> > perl -lape 's/^From:\K.*/ Nelson Elhage <nelhage\@ksplice.com>/'
>
> Perl's \K \k becomes <( )> in Raku. Note that there are other regex
> changes as well, and that in the replacement it's not necessary to
> escape the @ in the absence of brackets:
>
> raku -pe 's/ ^ "From:" <(.*)> / Nelson Elhage <nelhage@ksplice.com>/'
>
> The )> is not required to balance there, but helps clarify the
> intention. If you do have a quoting problem in the replacement, you can
> use the assignment form with any other form of quoting instead:
>
> raku -pe 's[ ^ "From:" <(.*)> ] = Q[Nelson Elhage <nelhage@ksplice.com>]'
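
A small sketch of the capture markers on an in-memory string, with an invented header and a plain replacement so that no quoting questions arise. Only the part between <( and )> counts as the match, so the "From:" prefix survives the substitution.

```raku
# Invented header line; single quotes avoid any interpolation.
my $header = 'From: Old Name <old@example.com>';

# "From:" must be present, but only what sits between <( and )> is replaced.
$header ~~ s/ ^ "From:" <( .* )> / New Name/;

say $header;
```
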
>
> > Trick #8: $ENV{}
> >
> > When you're writing a one-liner using -e in the shell, you generally
> > want to quote it with ', so that dollar signs inside the one-liner
> > aren't expanded by the shell. But that makes it annoying to use a '
> > inside your one-liner, since you can't escape a single quote inside
> > of single quotes, in the shell.
> >
> > Let's suppose we wanted to print the username of anyone in /etc/passwd
> > whose name included an apostrophe. One option would be to use a
> > standard shell-quoting trick to include the ':
> >
> > perl -F: -lane 'print $F[0] if $F[4] =~ /'"'"'/' /etc/passwd
> >
> > But counting apostrophes and backslashes gets old fast. A better
> > option, in my opinion, is to use the environment to pass the regex
> > into perl, which lets you dodge a layer of parsing entirely:
> >
> > env re="'" perl -F: -lane 'print $F[0] if $F[4] =~ /$ENV{re}/' /etc/passwd
> >
> > We use the env command to place the regex in a variable called re,
> > which we can then refer to from the perl script through the %ENV
> > hash. This way is slightly longer, but I find the savings in counting
> > backslashes or quotes to be worth it, especially if you need to end
> > up embedding strings with more than a single metacharacter.
>
> This is rather Unix-centric on the face of it, since on Windows you'd
> have to use outer "" quoting instead. But you can certainly use the
> same trick with Raku, provided you spell ENV right:
>
> env re="'" raku -ne '(say .[0] if .[4] ~~ /<{ %*ENV<re> }>/) given .split(":")' /etc/passwd
>
> It probably won't be very efficient though, and doesn't do a thing for
> readability. Much easier to use a character name:
>
> raku -ne '(say .[0] if .[4] ~~ /\c[APOSTROPHE]/) given .split(":")' /etc/passwd
>
> You could backport that trick to Perl using \N{} too, I guess.
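
If the value from the environment is meant as plain text rather than as a regex, one option is to copy it into a scalar first. This is a sketch with a different meaning from the <{ ... }> form: a scalar interpolated into a Raku regex is matched as a literal string, so metacharacters in it lose their special meaning. The names below are invented, and the fallback value only exists so the snippet runs without the variable being set.

```raku
# Normally set from the shell, e.g.  env re="'" raku ...
# The fallback is only there so the sketch runs on its own.
my $re = %*ENV<re> // "'";

# Invented full-name fields.
my @names = "Miles O'Brien", "Jean-Luc Picard";

# $re is matched literally here, not compiled as a regex.
my @hits = @names.grep(/ $re /);
.say for @hits;
```
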
>
> > Trick #9: BEGIN and END
> >
> > BEGIN { ... } and END { ... } let you put code that gets run entirely
> > before or after the loop over the lines.
> >
> > For example, I could sum the values in the second column of a CSV
> > file using:
> >
> > perl -F, -lane '$t += $F[1]; END { print $t }'
>
> Same trick, except you can omit the brackets:
>
> raku -ne 'my $t += .[1] given .split(","); END say $t'
>
> Note the 'my' is required because strict is the default.
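
For this particular job there is also a way that needs neither an accumulator nor END: map each line to its second field and sum the result. This is a different technique from the one quoted above, sketched here on invented rows.

```raku
# Invented CSV rows.
my @rows = "a,3,x", "b,4,y", "c,5,z";

# Take the second field of each row; .sum adds them numerically.
my $total = @rows.map({ .split(",")[1] }).sum;

say $total;

# Reading from a file or standard input, roughly:
#   raku -e 'say lines.map({ .split(",")[1] }).sum' FILE
```
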
>
> > Trick #10: -MRegexp::Common
> >
> > Using -M on the command line tells perl to load the given module
> > before running your code. There are thousands of modules available
> > on CPAN, numerous of them potentially useful in one-liners, but
> > one of my favorite for one-liner use is Regexp::Common, which, as
> > its name suggests, contains regular expressions to match numerous
> > commonly-used pieces of data.
> >
> > The full set of regexes available in Regexp::Common is available in
> > its documentation, but here's an example of where I might use it:
> >
> > Neither the ifconfig nor the ip tool that is supposed to replace it
> > provide, as far as I know, an easy way of extracting information for
> > use by scripts. The ifdata program provides such an interface, but
> > isn't installed everywhere. Using perl and Regexp::Common, however,
> > we can do a pretty decent job of extracting an IP from ip's output:
> >
> > ip address list eth0 | \
> > perl -MRegexp::Common -lne 'print $1 if /($RE{net}{IPv4})/'
>
> I don't know if there's anything quite comparable. And who's to say
> what's "common" anymore... Certainly we have -M. But Raku's regex
> and grammars are so much more powerful that these things are likely to
> be kept in more specific Grammar modules anyway, or just hand-rolled for
> the purpose on the spot.
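
As an example of hand-rolling, here is a rough sketch of a dotted-quad pattern built from two named regexes. The names octet and ipv4 are made up for this example, the input line is invented, and the pattern is deliberately loose: it does not check that each number is at most 255.

```raku
# Hypothetical named regexes; loose on purpose (999 would also pass).
my regex octet { \d ** 1..3 }
my regex ipv4  { <octet> ** 4 % "." }   # four octets separated by dots

# Invented line in the style of `ip address list` output.
my $line = "    inet 192.168.1.10/24 brd 192.168.1.255 scope global eth0";

my $ip = ($line ~~ / <ipv4> /) ?? ~$<ipv4> !! "";
say $ip;
```
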
>
> > ~nelhage
>
> Larry
>
>
>
> --
> Aureliano Guedes