On Thu, 10 Nov 2022 10:22:54 +0100
demerphq <demerphq@gmail.com> wrote:

> The regex engine is a huge amount of code divided into several
> conceptual components along with tons of ancillary code, all crammed
> basically into two files. regcomp.c has 26k lines, and regexec.c has
> 11k lines.
>
> The nature of some of the code somewhat biases towards large
> functions, but nevertheless we have several large functions in both of
> the files, arguably code that could be separated into more reasonably
> sized compilation units.
...
> Does anybody have any thoughts on this? Relatively few people work on
> the regex code, does anybody have any strong feelings about this? Any
> guidance to provide?

I think overall the idea sounds good. I recently split up `op.c`
because it had become rather large, and took out the peephole optimiser
into its own new `peep.c` file. Sounds like you could do similar for
similar reasons.

As to naming, I wonder if we want to consider having things like
regtrie.c, regpeep.c, etc... Or whether it might make more sense to
move all the regexp engine stuff into a re/ subdirectory? Would that
even work with our build system, or do all the files have to be flat at
the toplevel?

-- 
Paul "LeoNerd" Evans

leonerd@leonerd.org.uk      |  https://metacpan.org/author/PEVANS
http://www.leonerd.org.uk/  |  https://www.tindie.com/stores/leonerd/