perl.perl5.porters | Postings from October 2014

Re: Bringing the regex compiler into the current millenium.

From: Christian Millour
Date: October 24, 2014 00:22
Subject: Re: Bringing the regex compiler into the current millenium.
Message ID: 54499BB5.6090005@abtela.com

On 23/10/2014 21:54, demerphq wrote:
> On 23 October 2014 18:05, Christian Millour <cm.perl@abtela.com
> <mailto:cm.perl@abtela.com>> wrote:
>
>     On 23/10/2014 10:15, demerphq wrote:
>
>         I welcome any interest in this project. Please let me know if
>         you have any thoughts or wish to contribute.
>
>
>     Possibly not a welcome thought:
>
>     If at all possible, would you consider adding a maxlen attribute
>     that would hold the maximum length of a match? There is already a
>     minlen attribute for optimization (don't try to match if the string
>     is too short). The maxlen would help, when working on chunked input,
>     to gather enough input to ensure a complete match (especially when
>     the regexp contains an optional trailer and thus might succeed on
>     incomplete input).
>
>     This is a low-priority request. minlen can be accessed using
>     Regexp::MinLength, and I have been able to compute maxlen for simple
>     cases using Regexp::Parser. I have little trust in my efforts there
>     though and would love to see it done right.
>
>
>
> I added support for maxlen earlier this year as part of working toward
> making $/ support regexes (pretty much the same use case you mention).
> We now set a flag if the regex is potentially unbounded
> (RXf_UNBOUNDED_QUANTIFIER_SEEN); if not, we calculate the maxlen. It should
> be in Perl 5.19.9 and later. (maxlen is meaningless when
> RXf_UNBOUNDED_QUANTIFIER_SEEN is set).
>
> $ ./perl -Ilib -Mre=Debug,OPTIMISE,DUMP,FLAGS -e'/fo+o/'
> Compiling REx "fo+o"
> first:>  1: EXACT <f> (3) [ ]
> Peep>  1: EXACT <f> (3) [ SCF_DO_SUBSTR SCF_DO_STCLASS_AND SCF_DO_STCLASS SCF_WHILEM_VISITED_POS ]
>    join>  1: EXACT <f> (3)
> Peep>  3: PLUS (6) [ SCF_DO_SUBSTR SCF_WHILEM_VISITED_POS ]
>    Peep>  4: EXACT <o> (0) [ SCF_DO_SUBSTR SCF_WHILEM_VISITED_POS ]
>      join>  4: EXACT <o> (0)
> Peep>  6: EXACT <o> (8) [ SCF_DO_SUBSTR SCF_WHILEM_VISITED_POS ]
>    join>  6: EXACT <o> (8)
> minlen: 3 r->minlen:0 maxlen:0
> Final program:
>     1: EXACT <f> (3)
>     3: PLUS (6)
>     4:   EXACT <o> (0)
>     6: EXACT <o> (8)
>     8: END (0)
> anchored "fo" at 0 floating "oo" at 1..9223372036854775807 (checking floating) minlen 3
> r->extflags: UNBOUNDED_QUANTIFIER_SEEN USE_INTUIT_NOML USE_INTUIT_ML
> r->intflags: [none-set]
> Freeing REx: "fo+o"
>
> $ ./perl -Ilib -Mre=Debug,OPTIMISE,DUMP,FLAGS -e'/foo/'
> Compiling REx "foo"
> first:>  1: EXACT <foo> (3) [ ]
> Peep>  1: EXACT <foo> (3) [ SCF_DO_SUBSTR SCF_DO_STCLASS_AND SCF_DO_STCLASS SCF_WHILEM_VISITED_POS ]
>    join>  1: EXACT <foo> (3)
> minlen: 3 r->minlen:0 maxlen:3
> Final program:
>     1: EXACT <foo> (3)
>     3: END (0)
> anchored "foo" at 0 (checking anchored isall) minlen 3
> r->extflags: CHECK_ALL USE_INTUIT_NOML USE_INTUIT_ML
> r->intflags: [none-set]
> Freeing REx: "foo"
>
>
> (some of that output is specific to blead, the relevant parts are in 5.20).
>

Ooooh, well done, I missed that... I guess if I had tried to *edit* my
months-stale regcomp.c buffer, in a blead git clone that has been pulled
many times since, Emacs would have told me that the file had changed quite
a lot since I first loaded it. As I had only been able to stare at it in
befuddled awe, it never got the opportunity ;-)

Unless it is still incomplete, I believe this added functionality would be
well worth an entry in perldelta.
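
For the record, here is roughly how I picture the chunked-input use case
from my original request. This is only a minimal sketch: match_chunked is
a hypothetical helper of mine, and $maxlen is passed in by hand (I am
assuming nothing about how, or whether, the engine's maxlen can be read
from Perl code). It also ignores lookaheads and end anchors, which can
depend on input beyond the match itself.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first match of $re in a stream that
# arrives in chunks. $maxlen is supplied by the caller (an assumption;
# this sketch does not obtain it from the regex engine).
sub match_chunked {
    my ($re, $maxlen, @chunks) = @_;
    my $buf = '';
    while (1) {
        if ($buf =~ $re) {
            my ($start, $end) = ($-[0], $+[0]);
            # A match starting at $start can reach at most $start + $maxlen.
            # Accept it once the buffer extends that far, or at end of input.
            return substr($buf, $start, $end - $start)
                if length($buf) - $start >= $maxlen || !@chunks;
        }
        elsif (!@chunks) {
            return;    # input exhausted without a match
        }
        $buf .= shift @chunks;    # gather more input and retry
    }
}

# "foo" with an optional trailer "bar": the longest possible match is 6.
print match_chunked(qr/foo(?:bar)?/, 6, "xxfo", "ob", "ar yy"), "\n";
# prints "foobar"
```

Without the maxlen test, the loop would stop after the second chunk and
report "foo", which is exactly the premature success on incomplete input
that I was worried about.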

> cheers,
> Yves

Cheers indeed! Have a virtual beverage of your choice on me (and a real
one, or a few, if we ever meet). I am afraid that will be the full extent
of my contribution, as the awe is not in the least abating while I study
the latest version...


