develooper Front page | perl.perl5.porters | Postings from August 2013

Re: Perl 5.18 and Regexp::Grammars

From: Dave Mitchell
Date: August 13, 2013 13:52
Subject: Re: Perl 5.18 and Regexp::Grammars
Message ID: 20130813135208.GO2177@iabyn.com

On Mon, Aug 12, 2013 at 08:04:17PM +0200, Aristotle Pagaltzis wrote:
> * demerphq <demerphq@gmail.com> [2013-08-12 15:40]:
> > I think he means something like this:
> >
> > source( $quote, $start_delim1, $end_delim1, $modifiers, $text1,
> > $start_delim2, $end_delim2, $text2 );


Ah I see.  Yes, I had envisioned something like that.

The one major way I suspect I may need to deviate from that is that I
would expect the 'source' method to be called twice for tr/// and s///,
once for the src pattern and once for the replacement (presumably with
an indication of whether it's the first or second), so there wouldn't be
foo2 args.

I don't know whether that would be a big limitation. It implies that any
modifications to 'src' in s/src/dst/ and tr/src/dst/ have to be done
before the code has a chance to see 'dst'.

Doing it otherwise would, I suspect, be a lot harder. There is currently a
function, S_scan_str(), which basically extracts a single-quotish value
while processing just \-delimiter escapes; this string may be left as-is,
or passed on for double-quotish parsing. Making S_scan_str() check for
an overloaded source method and calling it before returning is
(relatively) simple. Something like s/// causes S_scan_str() to be called
twice. Somehow saving the results of two calls to S_scan_str() then
calling source() once would, I suspect, be trickier.
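
A rough sketch of that two-call flow, in C since S_scan_str() lives in
perl's C tokenizer. Nothing here is real perl API: source_hook, scan_part(),
scan_subst(), log_hook() and struct call_log are all hypothetical names, and
the scanner is a simplified stand-in that only handles a construct whose
parts share one delimiter (as in s/src/dst/), not bracketed forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical hook signature (an assumption, not a perl API):
 * part is 1 for the pattern and 2 for the replacement. The hook may
 * rewrite text in place, staying within cap bytes. */
typedef void (*source_hook)(int part, char *text, size_t cap, void *ctx);

/* Simplified stand-in for one S_scan_str() call. s points at the opening
 * delimiter; the body is copied to out up to the next unescaped term,
 * dropping a backslash only when it escapes term. The hook runs just
 * before returning, so it sees this one part and nothing after it.
 * Returns a pointer to the closing delimiter, or NULL on error. */
const char *scan_part(const char *s, char term, int part,
                      char *out, size_t cap, source_hook hook, void *ctx)
{
    size_t n = 0;

    if (cap == 0 || *s != term)
        return NULL;
    for (s++; *s != '\0' && *s != term; ) {
        if (*s == '\\' && s[1] != '\0') {
            if (s[1] == term) {
                s++;                    /* drop backslash, keep delimiter */
            } else {
                if (n + 1 >= cap) return NULL;
                out[n++] = *s++;        /* keep backslash; next char below */
            }
        }
        if (n + 1 >= cap) return NULL;
        out[n++] = *s++;
    }
    if (*s != term)
        return NULL;                    /* unterminated */
    out[n] = '\0';
    if (hook)
        hook(part, out, cap, ctx);
    return s;
}

/* Scan "/src/dst/" (the text after the leading s or tr): two scans,
 * hence two hook calls. Returns 0 on success, -1 on error. */
int scan_subst(const char *s, char *src, char *dst, size_t cap,
               source_hook hook, void *ctx)
{
    const char *mid;

    assert(s != NULL && src != NULL && dst != NULL);
    mid = scan_part(s, *s, 1, src, cap, hook, ctx);
    if (mid == NULL)
        return -1;
    /* The pattern's closing delimiter doubles as the replacement's opener. */
    return scan_part(mid, *s, 2, dst, cap, hook, ctx) ? 0 : -1;
}

/* Example hook: appends "part:text;" to a log so call order is visible. */
struct call_log { char buf[128]; };

void log_hook(int part, char *text, size_t cap, void *ctx)
{
    struct call_log *log = ctx;
    size_t used = strlen(log->buf);

    (void)cap;                          /* this hook does not rewrite text */
    snprintf(log->buf + used, sizeof log->buf - used, "%d:%s;", part, text);
}
```

In this model the hook for part 1 has already returned by the time the
replacement text is scanned, which is the limitation described above: any
rewriting of 'src' happens before 'dst' is visible.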

In terms of other semantics, I'm assuming that the value returned by
source() will be immediately stringified then used in place of the
original text (so if source() returns an overloaded or regex object, it
won't stay that way).

Exactly how \ and escaped delimiters are handled needs to be decided, but
I suspect that that will be largely determined by how S_scan_str() already
operates.

So in something like

    'ab\\c\'d'

I'd expect source() to be called with (amongst other things), a string
consisting of the 7 chars

    a  b  \  \  c  '  d

(Note that in single-quoted strings, double-\ stripping is done later).
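
The two stages can be modelled roughly as below. This is a sketch under
assumptions, not perl's actual code: scan_single_quotish() and
strip_double_backslash() are hypothetical helpers that mimic the behaviour
described here (only \-delimiter handled during the scan, double-\ collapsed
in a later pass).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the scan stage (not perl's real S_scan_str()):
 * src[0] must be the delimiter term. The body is copied to out, and a
 * backslash is removed only when it escapes term. Returns the number of
 * chars written, or (size_t)-1 if unterminated or out is too small. */
size_t scan_single_quotish(const char *src, char term, char *out, size_t outsz)
{
    size_t n = 0;
    const char *s = src;

    assert(src != NULL && out != NULL);
    if (outsz == 0 || *s != term)
        return (size_t)-1;
    s++;                                /* skip the opening delimiter */

    while (*s != '\0' && *s != term) {
        if (*s == '\\' && s[1] != '\0') {
            if (s[1] == term) {
                s++;                    /* drop backslash, keep delimiter */
            } else {
                if (n + 1 >= outsz) return (size_t)-1;
                out[n++] = *s++;        /* keep backslash; next char below */
            }
        }
        if (n + 1 >= outsz) return (size_t)-1;
        out[n++] = *s++;
    }
    if (*s != term)
        return (size_t)-1;              /* unterminated */
    out[n] = '\0';
    return n;
}

/* Hypothetical model of the later stage for single-quoted strings:
 * collapse each double backslash to a single one, in place. Any other
 * backslash is left alone. Returns the new length. */
size_t strip_double_backslash(char *buf)
{
    char *r = buf, *w = buf;

    while (*r != '\0') {
        if (r[0] == '\\' && r[1] == '\\')
            r++;                        /* skip the first of the pair */
        *w++ = *r++;
    }
    *w = '\0';
    return (size_t)(w - buf);
}
```

Fed the source text 'ab\\c\'d', the first function yields the 7 chars
a b \ \ c ' d, and the second pass reduces them to the 6 chars a b \ c ' d.
Under this model, a source() that wants a literal \ in the final string
would have to return \\.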

But I think largely the details don't matter as long as they are clearly
documented (like that source() needs to return a double-\ if it is to be
interpreted as a \ ).

So, apart from the devil in the details, does anyone have any strong
feelings for or against this proposal?

-- 
Little fly, thy summer's play my thoughtless hand
has terminated with extreme prejudice.
        (with apologies to William Blake)
