develooper Front page | perl.perl5.porters | Postings from November 2011

Re: RFC: fc keyword API

From: Brian Fraser
Date: November 25, 2011 22:19
Subject: Re: RFC: fc keyword API
Message ID: CA+nL+naZ_Hk+LYT7ZDT-TCbv52sEi9X1-Ds6qSMAoyxcHtG9Bg@mail.gmail.com

On Sat, Nov 26, 2011 at 3:44 AM, David Nicol <davidnicol@gmail.com> wrote:

> why not have two keywords, instead of one keyword that does either of
> two different things based on a lexical (or is it latebound?) hint?
>

Mostly because, like the other casing functions, fc() has a corresponding
double-quotish string escape, \F. If there were a second keyword, say,
fc_simple, it would either need an escape of its own or be the exception to
the rule. I don't think either of those is desirable.
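
For illustration, a minimal sketch of that pairing, assuming fc() and \F
behave as they eventually shipped in perl 5.16 behind the 'fc' feature
(nothing here is settled API as of this thread; the sample strings and the
folds_equal helper are made up):

```perl
use v5.16;      # assumption: perl 5.16+, where this bundle enables 'fc' and 'unicode_strings'
use warnings;

my $word = "STRASSE";

# Each casing function has a matching double-quotish escape.
my %pairs = (
    lc => [ lc($word), "\L$word\E" ],
    uc => [ uc($word), "\U$word\E" ],
    fc => [ fc($word), "\F$word\E" ],
);

for my $name (sort keys %pairs) {
    my ($from_function, $from_escape) = @{ $pairs{$name} };
    say "$name: ", ($from_function eq $from_escape ? "same" : "different");
}

# Hypothetical helper: compare two strings by their full case foldings.
sub folds_equal {
    my ($left, $right) = @_;
    return fc($left) eq fc($right);
}

# Full folding maps U+00DF (sharp s) to "ss", so these compare equal.
say folds_equal("Stra\x{DF}e", "STRASSE") ? "equal" : "not equal";
```

A hypothetical fc_simple would have no row of its own in that table unless
it got an escape too.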

Also because a function that did simple folding and nothing else would be
rather worthless. : P


> The consensus on fc may have been reached before the complexities were
> understood.
>

Well, no consensus yet, other than fc() doing full folding by default,
since that's in the new Camel and the standard. Thus, this thread : )


> and have another way
> available to do simple folding -- possibly something that isn't in
> core.


I'm neither in favor of nor against either of those options. My reaction to
both would be "oh, that's cool". So yeah.


> Does full folding normalize the different ways to spell letters
> with marks over them too? Even if it doesn't, the C<fc> operator
> should, in my opinion.
>

It doesn't, but that matches what the standard says full folding should do.
Eventually, once the regex engine implements canonical equivalents (UTS #18
2.1), fc() could be expanded to do that, but that's getting ahead of
ourselves.

Hm, looks like I forgot to mention the relevant parts of the standard in my
previous mail, which are 3.13, Default Case Operations; 4.2,
Case-Normative; and 5.18, Case Mappings. Apologies for that.
3.13 in particular has definitions for default caseless matching, which
would be fc; canonical caseless matching, which would require normalization
to NFD before and after fc and is what you have in mind; and compatibility
caseless matching, which is... uh, a mouthful.
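
To make the first two definitions concrete, here is a rough sketch. It
assumes fc() as it later shipped in perl 5.16 and uses NFD from
Unicode::Normalize; the two helper names are hypothetical, and this is only
my reading of 3.13, not a proposed implementation:

```perl
use v5.16;      # assumption: perl 5.16+, which enables 'fc' and 'unicode_strings'
use warnings;
use Unicode::Normalize qw(NFD);

# Default caseless matching: compare the full case foldings as they are.
sub default_caseless_eq {
    my ($left, $right) = @_;
    return fc($left) eq fc($right);
}

# Canonical caseless matching: normalize to NFD before and after folding.
sub canonical_caseless_eq {
    my ($left, $right) = @_;
    return NFD( fc( NFD($left) ) ) eq NFD( fc( NFD($right) ) );
}

my $precomposed = "\x{C9}";      # LATIN CAPITAL LETTER E WITH ACUTE
my $decomposed  = "e\x{301}";    # "e" followed by COMBINING ACUTE ACCENT

say "default:   ",
    ( default_caseless_eq($precomposed, $decomposed) ? "match" : "no match" );
say "canonical: ",
    ( canonical_caseless_eq($precomposed, $decomposed) ? "match" : "no match" );
```

The default comparison should report no match and the canonical one a
match, since only the latter sees through the two spellings of the accented
letter.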

I recall asking Karl about those last two, and he was of a mind to let
Unicode::Casing take care of them for the time being.

