Pre-RFC: s/.../.../gg Really globally substitute
From:
James Raspass
Date:
June 29, 2022 13:25
Subject:
Pre-RFC: s/.../.../gg Really globally substitute
Message ID:
CAP4Ky=8RSo3mpcQCyZQyHmdAuzQYVD=sojkU=55s813_88PCvQ@mail.gmail.com
Note this idea came out of a code golf discussion, but I feel it has
merit in verbose normal code too, so I'm posting it here. Programming
Perl contains the following excerpt:
---
When a global substitution just isn’t global enough
Occasionally, you can’t just use a /g to get all the changes to occur,
either because the substitutions overlap or have to happen right to left,
or because you need the length of $` to change between matches. You can
usually do what you want by calling s/// repeatedly. However, you want the
loop to stop when the s/// finally fails, so you have to put it into the
conditional, which leaves nothing to do in the main part of the loop. So we
just write a 1, which is a rather boring thing to do, but bored is the best
you can hope for sometimes. Here are some examples that use a few more of
those odd regex beasties that keep popping up:
# put commas in the right places in an integer
1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/;
# expand tabs to 8-column spacing
1 while s/\t+/" " x (length($&)*8 - length($`)%8)/e;
# remove (nested (even deeply nested (like this))) remarks
1 while s/\([^()]*\)//g;
# remove duplicate words (and triplicate (and quadruplicate...))
1 while s/\b(\w+) \1\b/$1/gi;
---
I feel like we could replace this pattern with a dedicated flag that
is clearer and potentially faster, /gg, e.g.
s/(\d)(\d\d\d)(?!\d)/$1,$2/gg;
One thing I like about this proposal is it would allow us to use /r
without needing to add extra variables and copies, e.g.
say $cost =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/ggr;
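For comparison, here is a rough sketch of what the same thing takes today,
assuming /gg means "repeat until the substitution fails". The helper name
commify is hypothetical and only there for illustration; the point is the
explicit copy that /r would otherwise make unnecessary.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: emulates the proposed s/.../.../ggr with today's Perl.
sub commify {
    my ($n) = @_;                                # explicit copy that /r would avoid
    1 while $n =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/;   # repeat until the substitution fails
    return $n;
}

my $cost = 1234567;
say commify($cost);   # prints 1,234,567
say $cost;            # prints 1234567 (original is untouched)
```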
Doubling up a flag to mean "the same, but more" has precedent in /aa and
/xx, and I guess /ee, but unlike /ee I don't think we should target an
explicit number of iterations: it's either once or as many times as
possible. Obviously this construct has the potential to loop forever, but
so does the postfix while.
Things to consider:
- Should we return the total number of replacements when not using /r?
- Should it apply to m// too or just s/// like /e?
- Should it go breadth or depth first?
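On the first question, one possible reading of "total number of
replacements" can be sketched with today's idiom by summing the per-pass
counts that s///g already returns. This is only an illustration of that one
interpretation, not a claim about how /gg would be implemented; the variable
names are made up for the example.

```perl
use strict;
use warnings;

my $text  = "keep (drop (nested (deeply))) this";
my $total = 0;

# s///g returns the number of substitutions made in one pass (false if none),
# so summing the per-pass counts gives a total across all passes.
while (my $n = $text =~ s/\([^()]*\)//g) {
    $total += $n;
}

print "$total\n";   # prints 3: one innermost pair removed per pass
```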
Thoughts?