[perl #123946] assert in /\p^ /
From:
Hugo van der Sanden
Date:
February 27, 2015 01:52
Subject:
[perl #123946] assert in /\p^ /
Message ID:
rt-4.0.18-13737-1425001940-592.123946-75-0@perl.org
# New Ticket Created by Hugo van der Sanden
# Please include the string: [perl #123946]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/Ticket/Display.html?id=123946 >
AFL (<http://lcamtuf.coredump.cx/afl/>) finds this:
% ./perl -Ilib -ce '/\p^ /'
Use of uninitialized value $table in concatenation (.) or string at lib/utf8_heavy.pl line 398.
Use of uninitialized value $_[0] in substitution (s///) at lib/utf8_heavy.pl line 23.
Use of uninitialized value $loose in pattern match (m//) at lib/utf8_heavy.pl line 25.
Use of uninitialized value $table in concatenation (.) or string at lib/utf8_heavy.pl line 409.
perl: sv.c:11436: Perl_sv_vcatpvfn_flags: Assertion `(IV)elen >= 0' failed.
Aborted (core dumped)
%
(I haven't yet looked at the utf8_heavy warnings, only the coredump.)
I'm not sure whether (for example) /\p^L/ is intended to be supported, but the length handling in regcomp.c is suspect for this case. After \p, if we see '{' we set the length (UV n) to the number of characters up to the matching close brace; otherwise we set it to 1 (regcomp.c:14177). If we then see '^', we set a flag and decrement n, then skip any following whitespace, decrementing n again for each character skipped. If we then hit an error, we can end up passing a negative n (wrapped to a large unsigned value) as the length.
My guess is we want to support /\p^L/ but not /\p^ L/; the diff below is a start towards that, but it's not sufficient - I think we need to move the parsing out of the !SIZE_ONLY guard, or we can't be sure to continue at the right point.
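The wraparound described above can be reproduced in isolation. The following is only a minimal sketch of the bookkeeping, not the regcomp.c code: the function name model_len and its parameters are hypothetical, size_t stands in for UV, and the "guarded" mode mirrors the idea of only adjusting n when the property name was braced.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical model of the name-length bookkeeping after \p or \P.
 *   p       - points at the text following the 'p' (and the '{', if any)
 *   n       - length computed so far: distance to '}' if braced, else 1
 *   braced  - nonzero if the property name was written as \p{...}
 *   guarded - nonzero to adjust n only in the braced case
 * Returns the length that would later be used for the property name. */
static size_t model_len(const char *p, size_t n, int braced, int guarded)
{
    if (*p == '^') {
        p++;
        if (!guarded || braced) {
            n--;                    /* unbraced: 1 becomes 0 */
            while (isspace((unsigned char)*p)) {
                p++;
                n--;                /* unbraced: 0 wraps to SIZE_MAX */
            }
        }
    }
    return n;
}
```

With the unguarded behaviour, model_len("^ ", 1, 0, 0) returns (size_t)-1, the kind of value that an assertion like (IV)elen >= 0 would reject once it is reinterpreted as signed. With the guard, the same input leaves n at 1, while a braced name such as "^ L}" with n = 3 still shrinks to 1. This only models the length; it does not address where parsing should resume.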
Hugo
--- a/regcomp.c
+++ b/regcomp.c
@@ -14122,6 +14122,7 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp,
case 'P':
{
char *e;
+ int braced = 1;
/* We will handle any undefined properties ourselves */
U8 swash_init_flags = _CORE_SWASH_INIT_RETURN_IF_UNDEF
@@ -14149,6 +14150,7 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp,
else {
e = RExC_parse;
n = 1;
+ braced = 0;
}
if (!SIZE_ONLY) {
SV* invlist;
@@ -14156,17 +14158,19 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp
if (UCHARAT(RExC_parse) == '^') {
RExC_parse++;
- n--;
+ if (braced) n--;
/* toggle. (The rhs xor gets the single bit that
* differs between P and p; the other xor inverts just
* that bit) */
value ^= 'P' ^ 'p';
+ if (braced) {
while (isSPACE(*RExC_parse)) {
RExC_parse++;
n--;
}
}
+ }
/* Try to get the definition of the property into
* <invlist>. If /i is in effect, the effective property
* will have its name be <__NAME_i>. The design is