At 01:14 PM 06-11-2001 -0700, Russ Allbery wrote:
>Dan Sugalski <dan@sidhe.org> writes:
> > At 01:05 PM 6/11/2001 -0700, Russ Allbery wrote:
> >> Dan Sugalski <dan@sidhe.org> writes:
>
> >>> Should perl's regexes and other character comparison bits have an
> >>> option to consider different characters for the same thing as
> >>> identical beasts? I'm thinking in particular of the Katakana/Hiragana
> >>> bits of japanese, but other languages may have the same concepts.
>
> >> I think canonicalization gets you that if that's what you want.
>
> > I don't think canonicalization should do this. (I really hope not) This
> > isn't really a canonicalization matter--words written with one character
> > set aren't (AFAIK) the same as words written with the other, and which
> > alphabet you use matters. (Which sort of argues against being able to do
> > this, I suppose...)
>
>I guess I don't know what the definition of "the same thing" you're using
>here is.

I thought Dan was talking about something equivalent to the m//i
functionality. Would it, or should it, be possible to tell m// to treat
Katakana characters as the same as hiragana characters, in much the same
way as m//i treats UPPERCASE the same as lowercase? Canonicalization won't
get you that.

My feeling is that the hooks should be there, but the specific equivalence
mappings should be in the library, not the core.
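
To make the m//i analogy concrete, here is a minimal sketch of what such a library-level equivalence mapping might look like. It is written in Python rather than Perl purely for illustration, and the names KANA_FOLD, fold_kana and kana_insensitive_search are hypothetical, not part of any existing core or library. It assumes that folding only the basic Katakana range U+30A1..U+30F6 onto Hiragana is good enough; a real mapping would probably need to handle more (half-width forms, iteration marks and so on).

```python
import re
import unicodedata

# Hypothetical mapping table: Katakana U+30A1..U+30F6 sit exactly 0x60
# above their Hiragana counterparts U+3041..U+3096.
KANA_FOLD = {cp: cp - 0x60 for cp in range(0x30A1, 0x30F7)}


def fold_kana(text: str) -> str:
    """Fold Katakana onto Hiragana, much as case folding maps upper to lower."""
    return text.translate(KANA_FOLD)


def kana_insensitive_search(pattern: str, text: str):
    """Search for a literal pattern, ignoring the Katakana/Hiragana distinction.

    Both sides are folded first, in the same way a case-insensitive match
    can be emulated by lowercasing the pattern and the subject.
    """
    return re.search(re.escape(fold_kana(pattern)), fold_kana(text))


hiragana = "すし"
katakana = "スシ"

# Unicode normalization leaves the two spellings distinct ...
normalized_equal = unicodedata.normalize("NFKC", katakana) == hiragana

# ... while the explicit fold treats them as equivalent.
folded_match = kana_insensitive_search(hiragana, "スシが好き")
```

Because the fold maps one code point to one code point, match offsets in the folded string are also valid in the original string. Keeping the table outside the matcher is the point of the sketch: the matcher only needs a hook for "fold both sides first", and the mapping itself can live in a library.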