perl.perl6.internals | Postings from June 2001

RE: More character matching bits

From: Hong Zhang
Date: June 12, 2001 17:42
Subject: RE: More character matching bits
Message ID: 400CE9390E334A4393CEECDD6863120A289EF8@ussccm003.corp.palm.com

We should let an external collator handle all these fancy features.
People can always normalize, canonicalize, or otherwise transform the
text and send the resulting text or binary data to the regex. All the
features we are arguing about here can easily be done by a customized
collator.
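
A minimal sketch of that two-step idea, written in Python purely for
illustration (it is not Perl's regex API). The helper names `normalize`
and `search_normalized` are hypothetical, and the choice of NFC plus case
folding is an assumption about what "normalize/canonicalize" might mean
in practice. The text is normalized first, then a plain regex runs over
the result; the pattern is assumed to be written in normalized form
already.

```python
import re
import unicodedata


def normalize(text):
    """First pass (assumed policy): NFC composition, then case folding."""
    return unicodedata.normalize("NFC", text).casefold()


def search_normalized(pattern, text):
    """Second pass: an ordinary regex search over the normalized text.

    The pattern itself is not transformed, so the caller is expected to
    write it in the same normalized form.
    """
    return re.search(pattern, normalize(text))


composed = "caf\u00e9"        # 'é' as a single code point
decomposed = "CAFE\u0301"     # 'E' followed by a combining acute accent

# Without the normalization pass the two spellings do not match.
raw = re.search(composed, decomposed + " au lait")

# With it, the regex engine needs no Unicode-specific knowledge.
hit = search_normalized(composed, decomposed + " au lait")
```

Here the regex engine only ever compares code points; everything
language-specific lives in the pass that runs before it.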

Do NOT expect the Perl regex engine to be a linguist that understands
every language in the world and can match my name in English and
Chinese :-) (Of course, that would be a useful feature for me.)

Please note that regex matching is O(n) at best; adding an external
collator will make it O(2n), that is, one extra pass over the text.
Putting fancy Unicode features into the regex will not make it any
faster.

My recommendation is to keep the regex locale-independent and to
have some API for handling locale-specific features, though I am
not sure what the best way to do this is.
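
One possible shape for such an API, sketched in Python as an
illustration only and not as a proposal from this thread: the matcher
stays locale-independent and the caller passes in a locale-specific
folding function. Both `turkish_fold` and `locale_search` are
hypothetical names, and the Turkish rule shown (dotted and dotless i) is
deliberately simplified.

```python
import re


def turkish_fold(text):
    """Hypothetical locale-specific folding for Turkish.

    Maps capital dotted I (U+0130) to 'i' and plain capital 'I' to
    dotless i (U+0131) before lowercasing everything else.
    """
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


def locale_search(pattern, text, fold=str.casefold):
    """Locale-independent regex search over text prepared by `fold`.

    The regex engine itself knows nothing about locales; all
    locale-specific behaviour is supplied by the caller through `fold`.
    """
    return re.search(pattern, fold(text))


# Default folding lowercases 'I' to 'i', so a pattern using the Turkish
# dotless i does not match.
default_hit = locale_search("\u0131sparta", "ISPARTA")

# With the Turkish folding plugged in, the same regex call matches.
turkish_hit = locale_search("\u0131sparta", "ISPARTA", fold=turkish_fold)
```

The design choice here is that swapping locales means swapping the
`fold` argument, while the pattern syntax and the matcher stay the same.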

Hong
