develooper Front page | perl.perl5.porters | Postings from October 2014

Re: Bringing the regex compiler into the current millenium.

From: demerphq
Date: October 23, 2014 19:54
Subject: Re: Bringing the regex compiler into the current millenium.
Message ID: CANgJU+XsRaJHV1-uzxqM0FUhdU6otZHXKMQHf9agjP206hMfNw@mail.gmail.com

On 23 October 2014 18:05, Christian Millour <cm.perl@abtela.com> wrote:

> On 23/10/2014 10:15, demerphq wrote:
>
>> I welcome any interest in this project. Please let me know if you have
>> any thoughts or wish to contribute.
>>
>
> Possibly not a welcome thought:
>
> If at all possible, would you consider adding a maxlen attribute that
> would hold the maximum length of a match? There is already a minlen
> attribute for optimization (don't try to match if the string is too short).
> The maxlen would help, when working on chunked input, to gather enough
> input to ensure a complete match (especially when the regexp contains
> an optional trailer and thus might succeed on incomplete input).
>
> This is a low-priority request. minlen can be accessed using
> Regexp::MinLength, and I have been able to compute maxlen for simple cases
> using Regexp::Parser. I have little trust in my efforts there though and
> would love to see it done right.
>


I added support for maxlen earlier this year as part of working toward
making $/ support regexes (pretty much the same use case you mention). We
now set a flag when the regex can match a string of unbounded length
(RXf_UNBOUNDED_QUANTIFIER, shown as UNBOUNDED_QUANTIFIER_SEEN in
r->extflags); otherwise we calculate maxlen. It should be in Perl 5.19.9
and later. (maxlen is meaningless when RXf_UNBOUNDED_QUANTIFIER is set, so
ignore the maxlen:0 in the first dump below.)

$ ./perl -Ilib -Mre=Debug,OPTIMISE,DUMP,FLAGS -e'/fo+o/'
Compiling REx "fo+o"
first:>  1: EXACT <f> (3) [ ]
Peep>  1: EXACT <f> (3) [ SCF_DO_SUBSTR SCF_DO_STCLASS_AND SCF_DO_STCLASS SCF_WHILEM_VISITED_POS ]
  join>  1: EXACT <f> (3)
Peep>  3: PLUS (6) [ SCF_DO_SUBSTR SCF_WHILEM_VISITED_POS ]
  Peep>  4: EXACT <o> (0) [ SCF_DO_SUBSTR SCF_WHILEM_VISITED_POS ]
    join>  4: EXACT <o> (0)
Peep>  6: EXACT <o> (8) [ SCF_DO_SUBSTR SCF_WHILEM_VISITED_POS ]
  join>  6: EXACT <o> (8)
minlen: 3 r->minlen:0 maxlen:0
Final program:
   1: EXACT <f> (3)
   3: PLUS (6)
   4:   EXACT <o> (0)
   6: EXACT <o> (8)
   8: END (0)
anchored "fo" at 0 floating "oo" at 1..9223372036854775807 (checking floating) minlen 3
r->extflags: UNBOUNDED_QUANTIFIER_SEEN USE_INTUIT_NOML USE_INTUIT_ML
r->intflags: [none-set]
Freeing REx: "fo+o"

$ ./perl -Ilib -Mre=Debug,OPTIMISE,DUMP,FLAGS -e'/foo/'
Compiling REx "foo"
first:>  1: EXACT <foo> (3) [ ]
Peep>  1: EXACT <foo> (3) [ SCF_DO_SUBSTR SCF_DO_STCLASS_AND SCF_DO_STCLASS SCF_WHILEM_VISITED_POS ]
  join>  1: EXACT <foo> (3)
minlen: 3 r->minlen:0 maxlen:3
Final program:
   1: EXACT <foo> (3)
   3: END (0)
anchored "foo" at 0 (checking anchored isall) minlen 3
r->extflags: CHECK_ALL USE_INTUIT_NOML USE_INTUIT_ML
r->intflags: [none-set]
Freeing REx: "foo"


(Some of that output is specific to blead; the relevant parts are in 5.20.)
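
For the chunked-input use case discussed above, here is a rough sketch of how a known maxlen might be used to decide when a match can be trusted. Everything in it is an assumption for illustration: next_match is a hypothetical helper, and $maxlen is passed in by hand rather than read from the regex engine. It also assumes a bounded pattern with no lookahead and no end-of-string anchors.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of perl: returns the next match read from
# $fh, or nothing at end of input. $maxlen is supplied by the caller.
sub next_match {
    my ($fh, $re, $maxlen, $bufref, $chunk_size) = @_;
    $chunk_size ||= 4096;
    my $eof = 0;
    while (1) {
        if ($$bufref =~ $re) {
            my ($start, $end) = ($-[0], $+[0]);
            # Accept the match only once the buffer holds at least $maxlen
            # characters from its start, or the input is exhausted. Before
            # that, more input could still lengthen or displace the match.
            if ($eof || length($$bufref) - $start >= $maxlen) {
                my $match = substr($$bufref, $start, $end - $start);
                substr($$bufref, 0, $end) = '';   # consume through the match
                return $match;
            }
        }
        return if $eof;
        # Append the next chunk to the end of the buffer.
        my $n = read($fh, $$bufref, $chunk_size, length $$bufref);
        die "read failed: $!" unless defined $n;
        $eof = 1 if $n == 0;
    }
}

# Usage: /foo(?:bar)?/ has maxlen 6. With 5-character chunks the first
# chunk "xxfoo" already matches "foo", but the helper waits for more input.
my $input = "xxfoobarxx";
open my $in, '<', \$input or die "open failed: $!";
my $buffer = '';
my $found = next_match($in, qr/foo(?:bar)?/, 6, \$buffer, 5);
print "matched: $found\n";
```

The buffer is never trimmed while no match is found, so a real implementation would probably also want to discard input that can no longer take part in a match.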

cheers,
Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
