Re: More character matching bits

From: Simon Cozens
Date: June 15, 2001 15:28
Subject: Re: More character matching bits
Message ID: 20010615232824.B11733@deep-dark-truthful-mirror.pmb.ox.ac.uk

On Fri, Jun 15, 2001 at 11:50:49AM -0400, Dan Sugalski wrote:
> Unless I'm missing something (Simon? Hong?) Japanese (and potentially all 
> the languages that use the Han characters) can interpret a particular 
> character as either a number or not a number, depending on context.

Uh, don't think so, no. The numerals are, ooh, let's see:
U+4E00, U+4E8C, U+4E09, U+56DB, U+4E94, U+4E03, U+516B, U+5341, U+5343,
U+4E07 and two more I can't find. The rest aren't (usually) treated as
numbers, no. It's certainly not the case that a given character is both
non-number and number.
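
As a rough illustration of the point above, here is a minimal sketch in Python (not the Perl under discussion) that asks the Unicode database for the numeric value of each code point listed in the message. The helper name numeric_value is hypothetical, and the two numerals the author could not find are deliberately left out. It only shows that the database assigns these characters a fixed numeric value regardless of context; it says nothing about how any Perl regex engine would treat them.

```python
import unicodedata

# Code points exactly as listed in the message; the two unlisted ones are omitted.
LISTED = [0x4E00, 0x4E8C, 0x4E09, 0x56DB, 0x4E94,
          0x4E03, 0x516B, 0x5341, 0x5343, 0x4E07]


def numeric_value(ch):
    """Return the Unicode numeric value of ch, or None if it has none.

    Hypothetical helper for illustration only.
    """
    return unicodedata.numeric(ch, None)


# Map each listed code point to whatever value the Unicode database records.
values = {f"U+{cp:04X}": numeric_value(chr(cp)) for cp in LISTED}

for name, value in values.items():
    print(name, value)
```

A character outside the list, such as U+5C71 (mountain), has no numeric value in the database, so numeric_value returns None for it.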

> >module Locale::Hawaiian;
> >use re 'class (\w => [aeiouâêîôûhklmnpw`])';
> >...
> 
> Sure. I expect Damian will write us something that lets you specify them 
> upside-down in Klingon or something by the time this is done. :)

This is handy, but this means the regexp engine needs to be *VERY* dynamic
at runtime.
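
To make the runtime concern concrete, here is a small sketch in Python rather than Perl of what a locale-supplied word class implies: the set of "word" characters is only known when the program runs, so the pattern has to be built and compiled at that point. The function name make_word_matcher and the Hawaiian character set (with long vowels assumed to be written with circumflexes) are illustrative assumptions, not part of any proposed Perl 6 interface.

```python
import re


def make_word_matcher(word_chars):
    """Compile a pattern matching one or more characters from word_chars.

    Hypothetical helper: the class is assembled from data supplied at
    runtime, which is the kind of dynamism the message is worried about.
    """
    # Escape each character so that none is read as class syntax.
    char_class = "".join(re.escape(c) for c in word_chars)
    return re.compile("[" + char_class + "]+")


# Assumed Hawaiian "word" characters, for illustration only.
HAWAIIAN = "aeiouâêîôûhklmnpw`"
hawaiian_word = make_word_matcher(HAWAIIAN)

print(bool(hawaiian_word.fullmatch("aloha")))    # every letter is in the class
print(bool(hawaiian_word.fullmatch("hawai`i")))  # the backtick counts as a letter
print(bool(hawaiian_word.fullmatch("perl")))     # 'r' is not in the class
```

A different locale module would call make_word_matcher with a different string, and the engine cannot know which one until the module is loaded.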

-- 
When your hammer is C++, everything begins to look like a thumb. 
    -- Steve Haflich, comp.lang.c++
