perl.fwp | Postings from March 2005

Re: Self-recognizing programs and regular expressions

Karsten Sperling
March 8, 2005 11:18
Here's one without \Q, which I'm fairly certain is score 1:


The trick is that \2 is the string "\2?\" (I'm not escaping \ with \\
here for clarity), so the regex \2?\* matches "*" as well as "\2?\*".
That gives us a regex that matches either a meta-character or the
source of that same regex (non-meta-characters aren't a problem
because they match themselves, and so already match the regex that
matches them).
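
A minimal sketch of that trick in isolation, not the golfed program itself.
The helper fragment_matches and the idea of parking the "\2?\" text after a
newline are my own assumptions, made only so the lookahead has something to
capture when the candidate is a bare "*":

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original post: true if $fragment,
# used as a regex, matches exactly $candidate once \2 holds "\2?\".
sub fragment_matches {
    my ($fragment, $candidate) = @_;
    my $marker  = '\2?\\';                 # the 4 characters \2?\
    my $subject = "$candidate\n$marker";   # marker parked after a newline
    # The lookahead captures the marker into \2; /s lets .* cross the newline.
    return $subject =~ /(^(?=.*(\\2\?\\)))${fragment}\n/s ? 1 : 0;
}

for my $candidate ('*', '\2?\*', '\*', 'x') {
    printf "%-6s %s\n", $candidate,
        fragment_matches('\2?\*', $candidate) ? 'matches' : 'no match';
}
```

The first two candidates should match and the last two should not: the
optional backreference either consumes a literal "\2?\" or is skipped, and
"\*" then has to see a real "*".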

My first attempt was therefore this (whitespace for clarity):


The regex (^(?=.*(\\2\?\\))) sets up \2; I'll call it R. The next part
is the "escaped" version of R, which I'll call S. S matches the string
R, as well as S itself. So S{2} matches RS. (In fact S also includes
the "(" that comes after R, but that's not important here.)
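
To make the R/S relationship concrete, here is a rough sketch. It assumes
that "escaped" means putting "\2?\" in front of every non-word character of
R, and it leaves out the extra "(" mentioned above. The helper s_matches and
the marker text appended after a newline are hypothetical scaffolding, not
part of the original code:

```perl
use strict;
use warnings;

# R as quoted in the post; a quoted heredoc keeps backslashes literal.
chomp(my $R = <<'EOF');
(^(?=.*(\\2\?\\)))
EOF

# Assumed escaping rule: put \2?\ in front of every non-word character.
(my $S = $R) =~ s/(\W)/\\2?\\$1/g;

# Hypothetical helper: does R followed by S match exactly $candidate?
sub s_matches {
    my ($candidate) = @_;
    # Park the text \2?\ after a newline so R's lookahead can capture it.
    my $subject = $candidate . "\n" . '\2?\\';
    return $subject =~ /${R}${S}\n/s ? 1 : 0;
}

print "S matches R:    ", s_matches($R), "\n";
print "S matches S:    ", s_matches($S), "\n";
print "S matches junk: ", s_matches('junk'), "\n";
```

Under those assumptions S accepts both the text of R and its own text, which
is the property the construction relies on.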

Unfortunately, S{2} also matches SS, RR, and any other string derived
from SS by removing some or all instances of "\2?\". That's because
the "\2?\" in front of each meta-char is matched optionally.
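
The over-matching is easy to see on a two-character toy case. This is only
an illustration: the fragment below is the escaped form of the string "*+"
rather than the real S, and fragment_matches is a hypothetical helper that
supplies \2 through a lookahead:

```perl
use strict;
use warnings;

# Hypothetical helper: true if $fragment matches exactly $candidate
# once \2 holds the text \2?\ (captured from a marker after a newline).
sub fragment_matches {
    my ($fragment, $candidate) = @_;
    my $subject = $candidate . "\n" . '\2?\\';
    return $subject =~ /(^(?=.*(\\2\?\\)))${fragment}\n/s ? 1 : 0;
}

my $escaped = '\2?\*\2?\+';   # escaped form of the two characters *+

# Plain form, fully escaped form, and the two hybrids in between.
for my $candidate ('*+', '\2?\*\2?\+', '\2?\*+', '*\2?\+') {
    printf "%-12s %s\n", $candidate,
        fragment_matches($escaped, $candidate) ? 'matches' : 'no match';
}
```

All four candidates should match, because each "\2?\" can be dropped
independently of the others. That is why extra checks are needed to pin
down the one intended string.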

Using (S){2}.{11}\z makes sure the second match of S goes all the way
through the string to within 11 chars of the end. (11 being the length
of "){2}.{11}\z"). R is extended to verify that these are in fact the
last 11 chars of the string. Additionally, we verify that the string
has the right length: (?=.{338}\)\{2}\.\{11}\\z\z)
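
A quick sanity check of the numbers, plus a scaled-down version of the
length test. The 5 below is a stand-in for 338 so the example stays short,
and has_expected_shape is a hypothetical name:

```perl
use strict;
use warnings;

# The literal tail that the post says ends the program text.
my $tail = '){2}.{11}\z';
print length($tail), "\n";    # 11, as stated in the post

# Scaled-down stand-in for (?=.{338}\)\{2}\.\{11}\\z\z): 5 replaces 338.
# It asserts: exactly 5 characters, then the literal tail, then end of string.
sub has_expected_shape {
    my ($string) = @_;
    return $string =~ /^(?=.{5}\)\{2}\.\{11}\\z\z)/ ? 1 : 0;
}

print has_expected_shape('abcde' . $tail), "\n";         # right length
print has_expected_shape('abcd' . $tail), "\n";          # one char short
print has_expected_shape('abcde' . $tail . 'x'), "\n";   # trailing junk
```

Only the first string should pass. Since the lookahead sits at the start of
the string, the real check would fix the total length at 338 + 11 = 349
characters.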

It's still possible to shift "\2?\" strings from the second part of
the string to the first though, leaving the length unchanged. To
prevent this, a check is added to R to verify that the opening '(' at
the start of S (the 9th in the final code) is at the right position:

Here's the whole code again:

my $x = <<'EOF'; $x =~ s/\n//g; print "OK" if $x =~ /$x/;

- Karsten
