perl.perl5.porters | Postings from December 2009

Re: PATCH: partial [perl #58182]: regex case-sensitive matching now utf8ness independent

From: Juerd Waalboer
Date: December 10, 2009 04:13
Subject: Re: PATCH: partial [perl #58182]: regex case-sensitive matching now utf8ness independent
Message ID: 20091209192327.GL2148@c4.tnx.nl

karl williamson wrote 2009-12-09 12:11 (-0700):
> Since Yves is incommunicado, I took what he had done before Larry's veto
> and extended and modified it, adding an intermediate way.  What that
> means is that anything that looks like [[:xxx:]] will match only in the
> ASCII range, or in the current locale, if set.  I never heard any
> controversy about that part of the proposal, and it makes sense to me  
> that a Posix construct should act like the Posix definition says to.

These "posix" constructs have for a long time been documented as
*equivalent* to \d, \s and \w, with two remarks: [[:space:]] also
includes \cK and [[:word:]] doesn't even exist in POSIX.

Changing them is as bad as changing the metacharacters. Changing them to
break the equivalency might even be worse.

Also, note that perlre calls this "POSIX character class **syntax**"
(emphasis mine).

An even stronger argument is that perlre defines equivalence with
\p{...}, and explicitly mentions that these are Unicode constructs.
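
To see what a given perl actually does with these three spellings, here
is a minimal sketch. It is an illustration only, not part of any patch in
this thread: class_matches is a hypothetical helper name, and the sample
characters are my own choice. It deliberately shows no expected output,
because the \cK result in particular depends on the perl version.

```perl
use strict;
use warnings;

# Hypothetical helper (not from perlre or from the patch): for a single
# character, report which of the constructs discussed above match it.
sub class_matches {
    my ($char) = @_;
    return {
        '\d'          => ($char =~ /\A\d\z/          ? 1 : 0),
        '[[:digit:]]' => ($char =~ /\A[[:digit:]]\z/ ? 1 : 0),
        '\p{IsDigit}' => ($char =~ /\A\p{IsDigit}\z/ ? 1 : 0),
        '\s'          => ($char =~ /\A\s\z/          ? 1 : 0),
        '[[:space:]]' => ($char =~ /\A[[:space:]]\z/ ? 1 : 0),
    };
}

# Sample characters: an ASCII digit, ARABIC-INDIC DIGIT ONE (U+0661),
# and \cK (vertical tab), the character singled out above.
for my $char ("7", "\x{0661}", "\cK") {
    my $result = class_matches($char);
    printf "U+%04X:", ord $char;
    printf "  %s=%d", $_, $result->{$_} for sort keys %$result;
    print "\n";
}
```

If the documented equivalence holds on the perl at hand, the \d,
[[:digit:]] and \p{IsDigit} columns agree on every row.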
-- 
Met vriendelijke groet,  Kind regards,  Korajn salutojn,

  Juerd Waalboer:  Perl hacker  <#####@juerd.nl>  <http://juerd.nl/sig>
  Convolution:     ICT solutions and consultancy <sales@convolution.nl>

