
Re: PATCH: partial [perl #58182]: regex case-sensitive matching now utf8ness independent

From: Juerd Waalboer
Date: December 10, 2009 23:52
Subject: Re: PATCH: partial [perl #58182]: regex case-sensitive matching now utf8ness independent
Message ID: 20091210130056.GD2041@c4.tnx.nl

demerphq wrote on 2009-12-10 13:23 (+0100):
> And, [[:word:]] is spelled [[:alnum:]].

juerd@lanova:~$ perl -le'print "foo" =~ /[[:word:]]/'
1

See perlre

> You cannot have both the current behaviour and non buggy implementation.

Fully agreed. That's certainly not what I'm after, either.

> Simply put I consider that:
> [^STUFF] matching the same code points as [STUFF] to be an irrefutable
> and overwhelming reason why the current behavior of POSIX charclass
> cannot be preserved.

What exactly do you mean by "current behaviour"?

To fix the issue that code points 128..255 are included or excluded
depending on the string's internal encoding, there are two options:

- Ignore anything above 127.
- Provide full Unicode semantics.

The first, ASCII-only, would be a mistake.

Perhaps there is other current behaviour that I am not aware of.
-- 
Met vriendelijke groet,  Kind regards,  Korajn salutojn,

  Juerd Waalboer:  Perl hacker  <#####@juerd.nl>  <http://juerd.nl/sig>
  Convolution:     ICT solutions and consultancy <sales@convolution.nl>

