On 12/20/2017 09:42 AM, Paul "LeoNerd" Evans wrote:
> On Tue, 19 Dec 2017 19:16:24 -0700
> Karl Williamson <public@khwilliamson.com> wrote:
>
>> 3) Is there enough usage of quantified [[:ascii:]] in the wild to
>> justify doing this optimization? (I was surprised to see only 132
>> CPAN modules have plain :ascii: (this grep also would catch negation))
>
> Perhaps they use
>
>   [\x00-\x7F]
>
> or something similar?

Good point, there are a bunch more modules that use this. Doing so
means their code is not portable to EBCDIC machines. Perhaps that's
why I didn't think to optimize these, on ASCII machines, into
[[:ascii:]], though it's easy to do so, and I will.

> I'd imagine looking for one of those would be
> much shorter too, as you can AND with 0x80808080 (or 64bit equivalent)
> and get 4 (8) chars at once.
>

That's what I meant by vectorization, or word-at-a-time operations,
and that's what I just added to core. In fact this will work on any
exact pattern whose length evenly divides the word size. The regex
engine could be changed to take advantage of this in several places.
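
For illustration, here is a minimal sketch in C of the word-at-a-time check discussed above. It is not the code that went into core: the name all_ascii is hypothetical, and the sketch assumes an ASCII platform and a 64-bit word. It ANDs eight bytes at a time against a mask with 0x80 in every byte, then finishes any leftover tail byte by byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the function in perl core): returns 1 if every
 * byte in s[0..len) is below 0x80, otherwise 0.  Assumes an ASCII platform. */
static int all_ascii(const unsigned char *s, size_t len)
{
    /* 0x80 repeated in each byte of a 64-bit word */
    const uint64_t high_bits = UINT64_C(0x8080808080808080);
    size_t i = 0;

    /* Whole words first; memcpy sidesteps unaligned-access problems */
    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, s + i, sizeof word);
        if (word & high_bits)
            return 0;   /* at least one byte has its high bit set */
    }

    /* Remaining tail (fewer than 8 bytes), one byte at a time */
    for (; i < len; i++) {
        if (s[i] & 0x80)
            return 0;
    }
    return 1;
}
```

Because the test only asks whether any high bit is set, the byte order of the word does not matter here. On a 32-bit build the same idea applies with uint32_t and the 0x80808080 mask mentioned in the quoted message.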