
Re: Ruminating RFC 93- alphabet-blind pattern matching

From: Joseph F. Ryan
Date: April 4, 2003 14:43
Subject: Re: Ruminating RFC 93- alphabet-blind pattern matching
Message ID: 3E8E0A88.3020905@osu.edu

Yary Hluchan wrote:

>>making *productions* of strings/sounds/whatever that could possibly
>>match the regular expression?
>>
>>
>>>Correct me if I am wrong, but isn't this the :any switch of apoc 5?
>>>http://www.perl.com/pub/a/2002/06/26/synopsis5.html
>>>
>
>Not really, unless the input string is infinite!
>


Well, that's just in the general-purpose case, right?  That's because
a regex like /a*/ can match the empty string, so it matches all of:

'w'
'qsdf'
'i bet you didn't wnt this to mtch'

So, you're going to need some sort of controlled input to a regex match
with the :any switch for it to work right.
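
As a quick illustration of why uncontrolled input is a problem, here is a
small sketch in Python rather than Perl 6. Python's `re` module is only a
stand-in and is not something the thread uses. It shows that `a*` succeeds
on each of the three strings above by matching zero characters at offset 0.

```python
import re

# The three example strings; none of them contains the letter 'a'.
samples = ["w", "qsdf", "i bet you didn't wnt this to mtch"]

# re.search succeeds on every one, because a* may match zero characters.
# Each result is therefore the empty string found at offset 0.
results = [re.search(r"a*", s) for s in samples]
matched = [(m.group(0), m.start()) for m in results]

print(matched)
```

Every entry is an empty match, which is why asking for "all matches" only
becomes interesting once the input is built from the regex itself.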

Here's my approach to the problem: generate a possible string that
could match every atom in the regex individually, and then generate
matches for the whole regex off of that.  I liked Luke's approach
of stapling methods onto the Rx classes, so I used an approach that
made use of that idea.  I completed each of the needed rules, since
the methods in my example are pretty simple (they probably would be
in Luke's example too, but I just wanted to be sure I wasn't missing
anything).

    use List::Permutations <<permutations>>; # Perl 5's name.

    sub generate (rx $spec, Int $limit) {
        my $string = $spec.generate_match (&propagate, $limit);
        $string =~ m:any/ (<$spec>) { yield $1 } /;
     
        my sub propagate ($atom) {
            given ($atom) {
                when Perl::sv_literal {
                    $string ~= $_.literal()
                }
                when Perl::Rx {
                    $string ~= .generate_match (&propagate, $limit)
                        if .can('generate_match')
                }
            }
        }
    }


    Perl::Rx::Atom::generate_match (&p, $limit) {
        return &p.($.atom)
    }
    Perl::Rx::Zerowidth::generate_match (&p, $limit) {
        return &p.($.atom)
    }
    Perl::Rx::Meta::generate_match (&p, $limit) {
        return join '', $.possible
    }
    Perl::Rx::Oneof::generate_match (&p, $limit) {
        return join '', $.possible
    }
    Perl::Rx::Charclass::generate_match (&p, $limit) {
        return join '', $.possible
    }

    Perl::Rx::Sequence::generate_match (&p, $limit) {
        my $string;
        $string ~= &p.($_) for $.atoms;
        return $string;
    }

    Perl::Rx::Alternation::generate_match (&p, $limit) {
        my $string;
        $string ~= &p.($_) for $.branches;
        return $string;
    }

    Perl::Rx::Modifier::generate_match (&p, $limit) {
        my $string;
        $string ~= &p.($_) for $.atoms;
        # is $self ($.) still the topic here?  or is the last
        # member of $.atoms?
        return $self.mod.transform($string);
    }

    Perl::Rx::Modifier::repeat (&p, $limit) {
        my $string = join '', map { join '', $_ }
            permutations (split //, &p.($.atom)) xx ($.max // $limit);
        return $string;
    }

So, given a call like:

    generate (/(A*B*(C*|Z+))/, 4);
   
The C<$string> variable in the 2nd line of C<generate> would become:

    AAAABBBBCCCCZZZZ
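
The way that string is built up can be sketched in Python. This is a rough
model under stated assumptions, not the Perl 6 classes above: the class
names `Literal`, `Sequence`, `Alternation` and `Repeat` and the field
`max_count` are hypothetical, and `Repeat` simply repeats its atom's
string instead of going through `permutations`, which gives the same
result for single-character atoms.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Literal:
    text: str

    def generate_match(self, limit: int) -> str:
        # A literal can only ever contribute itself.
        return self.text


@dataclass
class Sequence:
    atoms: List[object]

    def generate_match(self, limit: int) -> str:
        # Concatenate what each atom could contribute, in order.
        return "".join(a.generate_match(limit) for a in self.atoms)


@dataclass
class Alternation:
    branches: List[object]

    def generate_match(self, limit: int) -> str:
        # Any branch might be taken, so include material for all of them.
        return "".join(b.generate_match(limit) for b in self.branches)


@dataclass
class Repeat:
    atom: object
    max_count: Optional[int] = None  # None models an unbounded quantifier

    def generate_match(self, limit: int) -> str:
        # Unbounded quantifiers are capped at `limit` copies.
        count = self.max_count if self.max_count is not None else limit
        return self.atom.generate_match(limit) * count


# Hand-built tree for /(A*B*(C*|Z+))/; no regex parser is involved.
spec = Sequence([
    Repeat(Literal("A")),
    Repeat(Literal("B")),
    Alternation([Repeat(Literal("C")), Repeat(Literal("Z"))]),
])

pool = spec.generate_match(4)
print(pool)
```

With a limit of 4, each starred or plussed atom contributes four copies,
and the sequence and alternation nodes just concatenate their children.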

And the :any switch takes care of the rest. (-:
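
What "the rest" amounts to can be approximated in Python. The sketch below
assumes that :any yields every substring of the input that the pattern
matches in full; that reading is an assumption about the switch, and
Python's `re` module is only a stand-in for the Perl 6 rule engine. The
input string is hard-coded to the example value above.

```python
import re

pool = "AAAABBBBCCCCZZZZ"
pattern = re.compile(r"A*B*(?:C*|Z+)")

# Emulate an "all matches" search: try every (start, end) window of the
# pool and keep the windows that the pattern matches completely.
matches = set()
for start in range(len(pool) + 1):
    for end in range(start, len(pool) + 1):
        if pattern.fullmatch(pool, start, end):
            matches.add(pool[start:end])

# List the productions shortest first, then alphabetically.
for s in sorted(matches, key=lambda m: (len(m), m)):
    print(repr(s))
```

Under this substring-based reading, only strings that appear contiguously
in the generated input can be produced. For instance, "AZ" matches the
regex but never shows up in the output, because the B and C runs sit
between the A run and the Z run.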


Joseph F. Ryan
ryan.311@osu.edu

