Re: Breaking up regcomp.c and maybe regexec.c

From: Karl Williamson
Date: November 10, 2022 17:02
Subject: Re: Breaking up regcomp.c and maybe regexec.c
Message ID: 0f3adfdb-c9f1-3c34-b4fd-631458cc4901@khwilliamson.com

On 11/10/22 07:25, demerphq wrote:
> 
> 
> On Thu, 10 Nov 2022, 13:31 Paul "LeoNerd" Evans, <leonerd@leonerd.org.uk>
> wrote:
> 
>     On Thu, 10 Nov 2022 10:22:54 +0100
>     demerphq <demerphq@gmail.com> wrote:
> 
>      > The regex engine is a huge amount of code divided into several
>      > conceptual components along with tons of ancillary code, all crammed
>      > basically into two files. regcomp.c has 26k lines, and regexec.c has
>      > 11k lines.
>      >
>      > The nature of some of the code somewhat biases towards large
>      > functions, but nevertheless we have several large functions in
>      > both of the files, arguably code that could be separated into
>      > more reasonably sized compilation units.
>     ...
>      > Does anybody have any thoughts on this? Relatively few people work on
>      > the regex code, does anybody have any strong feelings about this? Any
>      > guidance to provide?
> 
>     I think overall the idea sounds good. I recently split up `op.c`
>     because it had become rather large, and took out the peephole optimiser
>     into its own new `peep.c` file. Sounds like you could do similar for
>     similar reasons.
> 
> 
> Thanks, I'll check that out to validate my assumptions.
> 
> 
> 
>     As to naming, I wonder if we want to consider having things like
>     regtrie.c, regpeep.c, etc...
> 
> 
> that was the direction I had planned for
> 
>     Or whether it might make more sense to
>     move all the regexp engine stuff into a re/ subdirectory?
> 
> 
> Hadn't even considered that...
> 
>     Would that
>     even work with our build system, or do all the files have to be flat at
>     the toplevel?
> 
> 
> These kind of questions are why I asked for guidance... :-)
> 
> Yves
> 

I have no strong feelings about this either way, so I would support 
whoever wants to try doing it.  It might also lead to being able to 
separate the regex optimizer from the rest of the compiling code, for 
example.  That has been in perltodo since before I joined.

The Levenshtein distance code is taken from CPAN.  I hope I gave credit.
