
Re: Pre-RFC: s/.../.../gg Really globally substitute

From: perl5
Date: June 29, 2022 15:42
Subject: Re: Pre-RFC: s/.../.../gg Really globally substitute
Message ID: 20220629174211.7adc6cd3@pc09

On Wed, 29 Jun 2022 14:25:37 +0100, James Raspass <jraspass@gmail.com> wrote:

> Note this idea came out of a code golf discussion but I feel it has
> merit in verbose normal code too so I'm posting it here. Programming
> Perl contains the following excerpt:

I like it. I don't know if this is the best way to implement it,
but I've wanted this occasionally, as it reads much cleaner than
the `1 while` idiom.

I do agree, however, with the people who see no value in it, in that
it is not on my most-wanted feature list. I also foresee problems when
explaining s///egg (the combination with /e).
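
For reference, a minimal sketch of what the /e case looks like today with
the `1 while` idiom, using the tab-expansion example from the quoted excerpt.
The sample string and the variable name $line are my own assumptions, chosen
only for illustration:

```perl
use strict;
use warnings;

# Illustrative input (an assumption, not from the original post).
my $line = "ab\tcd\te";

# Each pass expands only the first run of tabs. The loop re-runs the
# match so that $` reflects the already-expanded prefix on the next pass.
1 while $line =~ s/\t+/" " x (length($&) * 8 - length($`) % 8)/e;

print "[$line]\n";
```

A /g here would not help, because the replacement depends on the length
of $` after the earlier tabs have been expanded.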

> ---
> When a global substitution just isn’t global enough
> 
> Occasionally, you can’t just use a /g to get all the changes to occur, either
> because the substitutions overlap or have to happen right to left, or because
> you need the length of $` to change between matches. You can usually do what
> you want by calling s/// repeatedly. However, you want the loop to stop when
> the s/// finally fails, so you have to put it into the conditional, which
> leaves nothing to do in the main part of the loop. So we just write a 1, which
> is a rather boring thing to do, but bored is the best you can hope for sometimes.
> Here are some examples that use a few more of those odd regex beasties that
> keep popping up:
> 
> # put commas in the right places in an integer
> 1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/;
> 
> # expand tabs to 8-column spacing
> 1 while s/\t+/" " x (length($&)*8 - length($`)%8)/e;
> 
> # remove (nested (even deeply nested (like this))) remarks
> 1 while s/\([^()]*\)//g;
> 
> # remove duplicate words (and triplicate (and quadruplicate...))
> 1 while s/\b(\w+) \1\b/$1/gi;
> ---
> 
> I feel like we could replace this pattern with a dedicated flag that is clearer
> and potentially faster, /gg, e.g.
> 
> s/(\d)(\d\d\d)(?!\d)/$1,$2/gg;
> 
> One thing I like about this proposal is it would allow us to use /r without
> needing to add extra variables and copies, e.g.
> 
> say $cost =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/ggr;
> 
> Doubling up a flag to mean "that, but more" makes sense when you consider /aa
> or /xx, or I guess /ee, but unlike /ee I don't think we should target an
> explicit number of iterations, just once, or as many as possible. Obviously
> this construct has the potential to loop forever, but so does the postfix while.
> 
> Things to consider:
>  - Should we return the total number of replacements when not using /r?
>  - Should it apply to m// too or just s/// like /e?
>  - Should it go breadth or depth first?
> 
> Thoughts?
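
To make the /r and return-value questions above concrete, here is a rough
sketch of how the proposed behaviour can be approximated today. The helper
name subst_all is hypothetical (it is not a core function), and summing the
per-pass counts is only one possible answer to the "total number of
replacements" question:

```perl
use strict;
use warnings;

# Hypothetical helper: repeat a substitution until it stops matching.
# Works on a copy (like /r) and also reports the total replacement count.
sub subst_all {
    my ($str, $code) = @_;
    my $total = 0;
    for ($str) {                       # alias $_ to our private copy
        while (my $n = $code->()) {    # $code runs s/// on $_ and returns its count
            $total += $n;
        }
    }
    return wantarray ? ($str, $total) : $str;
}

my ($out, $count) = subst_all('1234567', sub { s/(\d)(\d\d\d)(?!\d)/$1,$2/ });
print "$out ($count replacements)\n";
```

Like the postfix while, this loops forever if the substitution never stops
matching.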


-- 
H.Merijn Brand  https://tux.nl   Perl Monger   http://amsterdam.pm.org/
using perl5.00307 .. 5.35        porting perl5 on HP-UX, AIX, and Linux
https://tux.nl/email.html http://qa.perl.org https://www.test-smoke.org
                           


