
[perl #123946] assert in /\p^ /

From: Hugo van der Sanden
Date: February 27, 2015 01:52
Subject: [perl #123946] assert in /\p^ /
Message ID: rt-4.0.18-13737-1425001940-592.123946-75-0@perl.org
# New Ticket Created by  Hugo van der Sanden 
# Please include the string:  [perl #123946]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org/Ticket/Display.html?id=123946 >


AFL (<http://lcamtuf.coredump.cx/afl/>) finds this:

% ./perl -Ilib -ce '/\p^ /'
Use of uninitialized value $table in concatenation (.) or string at lib/utf8_heavy.pl line 398.
Use of uninitialized value $_[0] in substitution (s///) at lib/utf8_heavy.pl line 23.
Use of uninitialized value $loose in pattern match (m//) at lib/utf8_heavy.pl line 25.
Use of uninitialized value $table in concatenation (.) or string at lib/utf8_heavy.pl line 409.
perl: sv.c:11436: Perl_sv_vcatpvfn_flags: Assertion `(IV)elen >= 0' failed.
Aborted (core dumped)
% 

(I haven't yet looked at the utf8_heavy warnings, only the coredump.)

I'm not sure whether (for example) /\p^L/ is intended to be supported, but the length handling in regcomp.c is suspect for this case. After \p, if we see '{' we set the length (UV n) to the number of characters up to the matching close brace; otherwise we set it to 1 (regcomp.c:14177). If we then see '^', we set a flag and decrement n, then skip past any following whitespace, decrementing n again for each character skipped. In the unbraced case that takes n below zero, and since n is unsigned it wraps to a very large value. If we then get an error, we can end up passing that wrapped n as the length.

My guess is we want to support /\p^L/ but not /\p^ L/. The diff below is a start towards that, but it's not sufficient: I think we need to move the parsing out of the !SIZE_ONLY guard, or we can't be sure to continue at the right point.
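
As a rough illustration of the length accounting the diff aims for, the sketch below models only the `braced` guard. The function name and the `name` out-parameter are hypothetical, size_t stands in for UV, and it deliberately ignores the question of where parsing resumes, which is the part the diff does not yet solve.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the guarded accounting. `s` points just after
 * "\p". On return *name points at the property name and the result is
 * its length; *name is NULL if a '{' has no matching '}'. */
static size_t prop_name_len(const char *s, const char **name)
{
    size_t n;
    int braced = 1;

    assert(s != NULL && name != NULL);
    if (*s == '{') {
        const char *e = strchr(s, '}');
        if (e == NULL) {            /* unterminated brace: report failure */
            *name = NULL;
            return 0;
        }
        s++;
        n = (size_t)(e - s);        /* characters up to the close brace */
    }
    else {
        n = 1;
        braced = 0;
    }

    if (*s == '^') {
        s++;
        if (braced) {
            n--;                    /* '^' was counted inside the braces */
            while (isspace((unsigned char)*s)) {
                s++;
                n--;                /* so was each skipped space */
            }
        }
    }
    *name = s;
    return n;
}
```

Under this model "^L" yields the one-character name "L", "^ " yields a one-character name consisting of the space (so the lookup fails with a sane length instead of a wrapped one), and "{^ L}" yields "L" with length 1.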

Hugo
--- a/regcomp.c
+++ b/regcomp.c
@@ -14122,6 +14122,7 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, 
            case 'P':
                {
                char *e;
+                int braced = 1;
 
                 /* We will handle any undefined properties ourselves */
                 U8 swash_init_flags = _CORE_SWASH_INIT_RETURN_IF_UNDEF
@@ -14149,6 +14150,7 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp, 
                else {
                    e = RExC_parse;
                    n = 1;
+                    braced = 0;
                }
                if (!SIZE_ONLY) {
                     SV* invlist;
@@ -14156,17 +14158,19 @@ S_regclass(pTHX_ RExC_state_t *pRExC_state, I32 *flagp
 
                    if (UCHARAT(RExC_parse) == '^') {
                        RExC_parse++;
-                        n--;
+                        if (braced) n--;
                          /* toggle.  (The rhs xor gets the single bit that
                           * differs between P and p; the other xor inverts just
                           * that bit) */
                         value ^= 'P' ^ 'p';
 
+                        if (braced) {
                             while (isSPACE(*RExC_parse)) {
                                 RExC_parse++;
                                 n--;
                             }
                         }
+                   }
                     /* Try to get the definition of the property into
                      * <invlist>.  If /i is in effect, the effective property
                      * will have its name be <__NAME_i>.  The design is

