
[PATCH] Change meaning of \v, \V, and add \h, \H to match Perl6, add \R to match PCRE and unicode tr18

From: demerphq
Date: April 22, 2007 14:35
Subject: [PATCH] Change meaning of \v, \V, and add \h, \H to match Perl6, add \R to match PCRE and unicode tr18
Message ID: 9b18b3110704221434g43457742p28cab00289f83639@mail.gmail.com

The attached patch changes \v and \V (as requested by Larry a while
back) to match vertical whitespace (or not), adds support for \h and
\H to match horizontal whitespace (or not), and adds \R, which matches
line endings (the same as (?>\x0D\x0A|\v) would match).
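
A minimal sketch of how the new escapes are meant to behave, assuming a
perl with the patch applied (or any later perl that ships \h, \v and \R
with these semantics). The sample strings and variable names are
illustrative only and are not taken from the patch or its tests:

```perl
use strict;
use warnings;

# Illustrative input mixing CRLF, LF and VT (\x0B) as line breaks.
my $text = "one\r\ntwo\nthree\x{0B}four";

# \R matches a generic line ending; CRLF is consumed as one unit,
# so no empty field appears between "one" and "two".
my @lines = split /\R/, $text;

# Count matches by forcing list context on a /g match.
my $h_count = () = "a \tb"  =~ /\h/g;  # space and tab are horizontal
my $v_count = () = "a\r\nb" =~ /\v/g;  # CR and LF match \v separately
my $r_count = () = "a\r\nb" =~ /\R/g;  # \R sees CRLF as one line ending

print join("|", @lines), "\n";
print "h=$h_count v=$v_count R=$r_count\n";
```

The difference between the last two counts is the reason \R is written
as (?>\x0D\x0A|\v) rather than plain \v: the CRLF pair has to be tried
first and not split by backtracking.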

\R is currently in PCRE 7 and is suggested in the unicode spec under
http://unicode.org/unicode/reports/tr18/#Line_Boundaries.

\v, \V, \h and \H are from Perl6.

Note that none of them change behaviour under use locale, and they use
the same semantics whether the string is utf8 or latin1.
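
As a rough illustration of the encoding point, the sketch below checks
the same characters before and after utf8::upgrade. It assumes a perl
where \h and \v behave as described here, and the choice of \xA0
(NO-BREAK SPACE) and \x85 (NEXT LINE) as test characters is mine, not
something taken from the patch:

```perl
use strict;
use warnings;

# Two high-bit Latin-1 characters: NO-BREAK SPACE and NEXT LINE.
my $nbsp_latin1 = "\xA0";
my $nel_latin1  = "\x85";

# Copies of the same characters, stored internally as utf8.
my $nbsp_utf8 = $nbsp_latin1;
my $nel_utf8  = $nel_latin1;
utf8::upgrade($nbsp_utf8);
utf8::upgrade($nel_utf8);

# Record whether each form matches; the internal encoding should
# make no difference to the result.
my @h_results = map { /\A\h\z/ ? 1 : 0 } ($nbsp_latin1, $nbsp_utf8);
my @v_results = map { /\A\v\z/ ? 1 : 0 } ($nel_latin1,  $nel_utf8);

print "h: @h_results\n";
print "v: @v_results\n";
```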

As part of this patch a new header file has been created,
regcharclass.h, which is generated by a script I have written. I'm not
sure what I should do with the script... Obviously it should be added
to core, but the question is where, and how it should be used to
regenerate the header. Maybe it should be added to regcomp.pl with a
config file holding the data it uses? Or should it just go in util or
unicore or something? I doubt it will change much as new versions of
the Unicode standard come out, but it's possible...

A work in progress version of the script is attached along with the patch.

Cheers,
Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
