On Wed, Oct 28, 2009 at 3:42 PM, John <john.imrie@vodafoneemail.co.uk> wrote:
> Now I don't see alphanumeric defined anywhere but I also don't see how it
> can be forced to match 灞
>
It already does
$ perl -v
This is perl, v5.8.8 built for i486-linux-gnu-thread-multi
...
$ perl -le'print chr(28766) =~ /^\w\z/ || 0'
1
> Furthermore, taint washing is carried out by regexes, and extending the
> semantics of \w \d and \s could allow tainted data to be cleaned where it
> should not.
>
If you're using \w to filter out Chinese characters, you're already failing.
> What do you think extending \w \d and \s will do?
>
There's been no discussion of expanding them. The problem is that what they
match varies depending on Perl internals:
$ perl -le'
$s1 = "\xC2";
$s2 = "\x{2660}";
for ($s1, $s2, $s1.$s2) {
print /\w/ || 0;
}
'
0
0
1
If there's no \w in s1 or in s2, why does their concatenation have one?