develooper Front page | perl.perl5.porters | Postings from December 2014

Re: All I want for Christmas is: Streaming Regexps

From: demerphq
Date: December 25, 2014 16:01
Subject: Re: All I want for Christmas is: Streaming Regexps
Message ID: CANgJU+UsKA7ZZiaNpa09miSGtXyLi35=1eFo1fioRcfo0J4bcw@mail.gmail.com

On 25 December 2014 at 16:02, Paul "LeoNerd" Evans
<leonerd@leonerd.org.uk> wrote:
> Given a string, $str, and a regexp, $re, I know that the regexp does
> not currently match:
>
>   ok( not $str =~ $re );
>
> However, Perl currently has no way to let me distinguish between two
> very different cases:
>
>   1) $str contains characters that cause $re not to match
>   2) $str does not contain enough characters to cause $re to match
>
> For example, consider
>
>   $str = '"here is';
>   $re  = qr/^"[^"]+"/;
>
> Currently $str does not match $re, but that's only because of a lack of
> characters; if we were to supply more characters, such as from
> read()ing more from a file or IO handle, we might find that the regexp
> now matches. Alternatively, given
>
>   $str = 'This will never';
>   $re  = qr/^\d+/;
>
> It is immediately clear that $str can never match $re, no matter how
> much more input we read and append to $str.
>
> I have occasionally observed cases in Parser::MGC where being able to
> make this distinction would be really useful - right now it has a
> partial attempt at a lazy-streaming mode but that can only operate on
> whole blocks separated by ignorable whitespace. It would be really nice
> if Parser::MGC could drive a lazy socket read() or similar, to continue
> reading input until it matched an entire AST-driven document, of
> whatever syntax was being parsed.
>
> -----
>
> In summary:
>
>   I'd like a way to know whether a regexp failed to match because it
>   ran out of input but was happy up to that point, or because it found
>   bad characters that no amount of additional input will fix.

We have most of this already. I did it so I could implement regexp $/,
but I have not had time to complete it.

I'd guess it's a day or two of tinkering to get this done in a way you
could use it.
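
Until something like that exists, the distinction can only be
approximated by hand. The following is a minimal sketch, not the
unfinished core feature: classify() is a hypothetical helper, and the
second "prefix" pattern is an assumption that has to be written
manually for each regexp so that it accepts exactly the strings that
could still grow into a match.

```perl
use strict;
use warnings;

# Hypothetical helper, not a core API: classify $str as 'match',
# 'incomplete' (more input might help) or 'fail' (more input cannot help).
# $prefix_re is hand-written and must accept exactly the strings that
# could still be extended into a match of $full_re.
sub classify {
    my ($str, $full_re, $prefix_re) = @_;
    return 'match'      if $str =~ $full_re;
    return 'incomplete' if $str =~ $prefix_re;
    return 'fail';
}

# For qr/^"[^"]+"/ the extendable strings are: the empty string, or a
# quote followed by any number of non-quote characters.
my $quoted        = qr/^"[^"]+"/;
my $quoted_prefix = qr/\A(?:"[^"]*)?\z/;

# For qr/^\d+/ only the empty string can still be extended into a match.
my $digits        = qr/^\d+/;
my $digits_prefix = qr/\A\z/;

print classify('"here is',        $quoted, $quoted_prefix), "\n";
print classify('"here is"',       $quoted, $quoted_prefix), "\n";
print classify('This will never', $digits, $digits_prefix), "\n";
```

Run against the two examples quoted above, this reports the first as
incomplete and the second as a failure. Deriving the prefix pattern
automatically from an arbitrary regexp is exactly the part that needs
support from the engine.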

Yves


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"


