develooper Front page | perl.perl5.porters | Postings from June 2022

Re: Pre-RFC: s/.../.../gg Really globally substitute

Scott Baker
June 29, 2022 15:14
This feels pretty corner-casey to me. I've been writing Perl for 15+
years now and I haven't run into this yet.

It's an interesting idea, for sure, but I don't know if it's worth the
time/effort it would take to implement. Right now, I think we have
bigger fish to fry.

- Scott

On 6/29/22 06:25, James Raspass wrote:
> Note this idea came out of a code golf discussion, but I feel it has
> merit in verbose normal code too, so I'm posting it here. Programming
> Perl contains the following excerpt:
> ---
> When a global substitution just isn’t global enough
> Occasionally, you can’t just use a /g to get all the changes to occur,
> either because the substitutions overlap or have to happen right to
> left, or because you need the length of $` to change between matches.
> You can usually do what you want by calling s/// repeatedly. However,
> you want the loop to stop when the s/// finally fails, so you have to
> put it into the conditional, which leaves nothing to do in the main
> part of the loop. So we just write a 1, which is a rather boring thing
> to do, but bored is the best you can hope for sometimes. Here are some
> examples that use a few more of those odd regex beasties that keep
> popping up:
> # put commas in the right places in an integer
> 1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/;
> # expand tabs to 8-column spacing
> 1 while s/\t+/" " x (length($&)*8 - length($`)%8)/e;
> # remove (nested (even deeply nested (like this))) remarks
> 1 while s/\([^()]*\)//g;
> # remove duplicate words (and triplicate (and quadruplicate...))
> 1 while s/\b(\w+) \1\b/$1/gi;
> ---
> I feel like we could replace this pattern with a dedicated flag that
> is clearer and potentially faster, /gg, e.g.
> s/(\d)(\d\d\d)(?!\d)/$1,$2/gg;
> One thing I like about this proposal is it would allow us to use /r
> without needing to add extra variables and copies, e.g.
> say $cost =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/ggr;
> Doubling up a flag to mean "the same, but more" makes sense when you
> consider /aa or /xx, or I guess /ee. Unlike /ee, though, I don't think
> we should target an explicit number of iterations: just once, or as
> many as possible. Obviously this construct has the potential to loop
> forever, but so does the postfix while.
> Things to consider:
>   - Should we return the total number of replacements when not using /r?
>   - Should it apply to m// too or just s/// like /e?
>   - Should it go breadth or depth first?
> Thoughts?
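
For reference, here is a minimal sketch of what the proposed /gg and /ggr would stand in for in today's Perl. The helper names commify and strip_remarks are hypothetical, made up for illustration, and returning the total replacement count is only one of the options raised in the list above, not settled behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the proposal): repeat a substitution on a
# copy until it stops matching, roughly what s/.../.../ggr is meant to do.
sub commify {
    my ($n) = @_;                                # copy, so the caller's value is untouched
    1 while $n =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/;   # the "1 while" idiom from the excerpt
    return $n;
}

# Hypothetical helper: strip nested parenthesised remarks and report the total
# number of replacements across all passes (one possible return value for /gg).
sub strip_remarks {
    my ($s) = @_;
    my $total = 0;
    while (my $n = ($s =~ s/\([^()]*\)//g)) {    # s///g returns the count, or false
        $total += $n;
    }
    return ($s, $total);
}

# A single /g pass is not enough here: the pattern matches only once in the
# original string, and /g never rescans the text it has already produced.
(my $once = "1234567") =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/g;

print "$once\n";                 # 1234,567
print commify(1234567), "\n";    # 1,234,567
my ($text, $count) = strip_remarks("a (b (c) d) e (f)");
print "$count\n";                # 3
```

The copy made in commify is the extra variable that /ggr would make unnecessary.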
