perl.perl5.porters | Postings from October 2009

Re: Rule 1 has been invoked [Re: What should \s \w \d match in 5.12?]

From: Abigail
Date: October 30, 2009 04:42
Subject: Re: Rule 1 has been invoked [Re: What should \s \w \d match in 5.12?]
Message ID: 20091030114157.GA13777@almanda

On Fri, Oct 30, 2009 at 11:30:23AM +0000, David Cantrell wrote:
> On Thu, Oct 29, 2009 at 11:18:05PM +0100, Abigail wrote:
> 
> > Noone is suggesting to have \d *start* matching non-ASCII digits.
> 
> perldoc perlunicode can be read to indicate that \d does exactly that.
> 
> Under "speed", it says:
> 
> " As an example, the Unicode properties (character classes) like
>   "\p{Nd}" are known to be quite a bit slower (5-20 times) than
>   their simpler counterparts like "\d" (then again, there 268
>   Unicode characters matching "Nd" compared with the 10 ASCII
>   characters matching "d"). "
> 
> which looks like saying that \d matches [0123456789].  But it also says
> elsewhere that it does match funny forn characters.


Taken on its own, the quote does suggest that. But it becomes somewhat
clearer if you take the sentence preceding your quote into account:

    In general, operations with UTF-8 encoded strings are still slower.

The point isn't so much \p{Nd} vs. \d as matching against non-UTF-8
encoded strings vs. matching against UTF-8 encoded strings.
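
To make the distinction concrete, here is a minimal sketch that times the
same \d match against one string held in two internal representations.
The variable names and the test string are made up for illustration, and
the timings (including which side comes out faster) will depend on the
perl version and build, so no particular ratio is implied.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same characters, two internal representations.
my $bytes = ("abc" x 1000) . "7";   # native (non-UTF-8) representation
my $utf8  = $bytes;
utf8::upgrade($utf8);               # same content, UTF-8 encoded internally

# The strings compare equal; only the internal encoding flag differs.
print "equal:     ", ($bytes eq $utf8 ? "yes" : "no"), "\n";
print "utf8 flag: ", (utf8::is_utf8($bytes) ? 1 : 0),
      " vs ",        (utf8::is_utf8($utf8)  ? 1 : 0), "\n";

# Time the same pattern against each representation (about 1 CPU second each).
cmpthese(-1, {
    non_utf8 => sub { my $n = () = $bytes =~ /\d/g },
    utf8     => sub { my $n = () = $utf8  =~ /\d/g },
});
```

Because the pattern and the characters are identical on both sides, any
difference cmpthese reports comes from the string encoding rather than
from \d vs. \p{Nd}.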


Abigail
