develooper Front page | perl.perl5.changes | Postings from March 2019

[perl.git] branch blead updated. v5.29.8-46-gb37d10f658

From:
Dave Mitchell
Date:
March 7, 2019 11:45
Subject:
[perl.git] branch blead updated. v5.29.8-46-gb37d10f658
Message ID:
E1h1rSn-0005oY-Ru@git.dc.perl.space
In perl.git, the branch blead has been updated

<https://perl5.git.perl.org/perl.git/commitdiff/b37d10f658c300104241001e3f5de1f31d62b22f?hp=c11f6329a2001a507b929b443e7512970fe62202>

- Log -----------------------------------------------------------------
commit b37d10f658c300104241001e3f5de1f31d62b22f
Author: David Mitchell <davem@iabyn.com>
Date:   Thu Mar 7 10:23:04 2019 +0000

    fix CvFILE() leak in Perl_newATTRSUB_x()
    
    When overwriting cv with the contents of PL_compcv, it was checking the
    CvDYNFILE(cv) flag (to see if CvFILE(cv) needed freeing) *after*
    overwriting cv's flags with PL_compcv's flag.

commit f60d17f54c2572c253cccc4479a32236201866b2
Author: David Mitchell <davem@iabyn.com>
Date:   Thu Mar 7 10:21:25 2019 +0000

    Improve description of CVf_DYNFILE flag

commit bfefaec2e78fd51f89738f65efad8ffaf0017da3
Author: David Mitchell <davem@iabyn.com>
Date:   Thu Mar 7 08:42:59 2019 +0000

    add comments above Perl_newSTUB()

commit 7d79ca0952ed6929075e5a3a9d5d24ce212eb6e2
Author: David Mitchell <davem@iabyn.com>
Date:   Wed Mar 6 16:13:45 2019 +0000

    fix leak in regex re-entrant capture buffer
    
    When a regex is about to be executed, if it is the same regex as
    PL_curpm, then its 'offs' capture-indices buffer is saved in a local var
    pointer and a new one allocated. At the end of execution, depending on
    success or failure, the old buffer is restored and the new freed, or
    vice versa.
    
    However, if the regex dies during execution, e.g. /(?{ die })/
    then the old buffer will leak.
    
    So use SAVEFREEPV() on the old buffer, and change the 'restore on
    failure' behaviour - always free the old buffer and keep the new
    buffer, and instead copy the old indices to the new buffer.

commit 495a482db9ba8e2fe6ca17ef4c9b0b1dca65d7fd
Author: David Mitchell <davem@iabyn.com>
Date:   Wed Mar 6 10:36:23 2019 +0000

    fix leak in /[(?{]/
    
    This pattern is correctly interpreted by the parser as not containing
    any code blocks, e.g. (?{...}). It's then passed to the regex compiler,
    which thinks it may after all contain a code block not seen before (e.g.
    interpolated in at runtime). So it evals the code qr'[(?{]' to compile
    any code blocks.  Again the parser doesn't see any code blocks, so the
    regex compiler realises it was wrong, and attempts to free the hidden
    anon CV associated with compiling a qr// (this CV would take ownership
    of any found code blocks, but is empty apart from a single OP_CONST
    containing the text of regex).
    
    This freeing of the CV was going wrong, resulting in the op slab(s)
    associated with the anon CV leaking.
    
    This was because cv_forget_slab(PL_compcv) was being called, which
    converts a compiling CV into a compiled  CV, where CvSTART() no longer
    points to the op slab, and instead the slab can only be accessed
    indirectly via the ops in CvROOT().
    
    Then when the CV is freed, because it is no longer marked as SvSLABBED,
    the freeing code assumes that any associated ops are attached via
    SvROOT() - but they haven't been yet - they're still sitting on the
    yyparse stack. So they leak.
    
    The solution seems to be a simple as removing the call to
    cv_forget_slab().
    
    I sort of understood this as I wrote this commit message, but it's
    fading already. Don't ask me to explain this in a week's time, let alone
    next year.

commit 49c01b24867571217e880f4de5d82ed1d3b09dc6
Author: David Mitchell <davem@iabyn.com>
Date:   Tue Mar 5 10:24:30 2019 +0000

    docs for op slab functions
    
    Add some basic code comments at the top of each function associated
    with allocating and freeing OP slabs.

-----------------------------------------------------------------------

Summary of changes:
 cv.h       |  2 +-
 op.c       | 43 ++++++++++++++++++++++++++++++++++++++++---
 regexec.c  | 39 +++++++++++++++++----------------------
 t/re/pat.t | 26 +++++++++++++++++++++++++-
 4 files changed, 83 insertions(+), 27 deletions(-)

diff --git a/cv.h b/cv.h
index dac83fa873..d50e320f16 100644
--- a/cv.h
+++ b/cv.h
@@ -129,7 +129,7 @@ See L<perlguts/Autoloading with XSUBs>.
 #ifdef PERL_CORE
 # define CVf_SLABBED	0x0800	/* Holds refcount on op slab  */
 #endif
-#define CVf_DYNFILE	0x1000	/* The filename isn't static  */
+#define CVf_DYNFILE	0x1000	/* The filename is malloced  */
 #define CVf_AUTOLOAD	0x2000	/* SvPVX contains AUTOLOADed sub name  */
 #define CVf_HASEVAL	0x4000	/* contains string eval  */
 #define CVf_NAMED	0x8000  /* Has a name HEK */
diff --git a/op.c b/op.c
index 75d25f3e7d..40bc2ef84e 100644
--- a/op.c
+++ b/op.c
@@ -246,6 +246,8 @@ S_prune_chain_head(OP** op_p)
 #define SIZE_TO_PSIZE(x)	(((x) + sizeof(I32 *) - 1)/sizeof(I32 *))
 #define DIFF(o,p)		((size_t)((I32 **)(p) - (I32**)(o)))
 
+/* malloc a new op slab (suitable for attaching to PL_compcv) */
+
 static OPSLAB *
 S_new_slab(pTHX_ size_t sz)
 {
@@ -277,6 +279,12 @@ S_new_slab(pTHX_ size_t sz)
 	PerlIO_printf(Perl_debug_log, "%s", SvPVx_nolen(Perl_mess args)) \
     )
 
+/* Returns a sz-sized block of memory (suitable for holding an op) from
+ * a free slot in the chain of op slabs attached to PL_compcv.
+ * Allocates a new slab if necessary.
+ * if PL_compcv isn't compiling, malloc() instead.
+ */
+
 void *
 Perl_Slab_Alloc(pTHX_ size_t sz)
 {
@@ -447,6 +455,11 @@ S_pp_freed(pTHX)
 }
 #endif
 
+
+/* Return the block of memory used by an op to the free list of
+ * the OP slab associated with that op.
+ */
+
 void
 Perl_Slab_Free(pTHX_ void *op)
 {
@@ -488,6 +501,16 @@ Perl_opslab_free_nopad(pTHX_ OPSLAB *slab)
     if (havepad) LEAVE;
 }
 
+/* Free a chain of OP slabs. Should only be called after all ops contained
+ * in it have been freed. At this point, its reference count should be 1,
+ * because OpslabREFCNT_dec() skips doing rc-- when it detects that rc == 1,
+ * and just directly calls opslab_free().
+ * (Note that the reference count which PL_compcv held on the slab should
+ * have been removed once compilation of the sub was complete).
+ *
+ *
+ */
+
 void
 Perl_opslab_free(pTHX_ OPSLAB *slab)
 {
@@ -515,6 +538,10 @@ Perl_opslab_free(pTHX_ OPSLAB *slab)
     } while (slab);
 }
 
+/* like opslab_free(), but first calls op_free() on any ops in the slab
+ * not marked as OP_FREED
+ */
+
 void
 Perl_opslab_force_free(pTHX_ OPSLAB *slab)
 {
@@ -7085,8 +7112,13 @@ Perl_pmruntime(pTHX_ OP *o, OP *expr, OP *repl, UV flags, I32 floor)
 #  endif
 		}
 #endif
-		/* But we know that one op is using this CV's slab. */
-		cv_forget_slab(PL_compcv);
+                /* This LEAVE_SCOPE will restore PL_compcv to point to the
+                 * outer CV (the one whose slab holds the pm op). The
+                 * inner CV (which holds expr) will be freed later, once
+                 * all the entries on the parse stack have been popped on
+                 * return from this function. Which is why its safe to
+                 * call op_free(expr) below.
+                 */
 		LEAVE_SCOPE(floor);
 		pm->op_pmflags &= ~PMf_HAS_CV;
 	    }
@@ -10169,6 +10201,7 @@ Perl_newATTRSUB_x(pTHX_ I32 floor, OP *o, OP *proto, OP *attrs,
     if (cv) {				/* must reuse cv if autoloaded */
 	/* transfer PL_compcv to cv */
 	if (block) {
+            bool free_file = CvFILE(cv) && CvDYNFILE(cv);
 	    cv_flags_t existing_builtin_attrs = CvFLAGS(cv) & CVf_BUILTIN_ATTRS;
 	    PADLIST *const temp_av = CvPADLIST(cv);
 	    CV *const temp_cv = CvOUTSIDE(cv);
@@ -10206,7 +10239,7 @@ Perl_newATTRSUB_x(pTHX_ I32 floor, OP *o, OP *proto, OP *attrs,
 	    CvFLAGS(PL_compcv) &= ~(CVf_SLABBED|CVf_WEAKOUTSIDE);
 	    CvFLAGS(PL_compcv) |= other_flags;
 
-	    if (CvFILE(cv) && CvDYNFILE(cv)) {
+	    if (free_file) {
 		Safefree(CvFILE(cv));
             }
 	    CvFILE_set_from_cop(cv, PL_curcop);
@@ -10802,6 +10835,10 @@ Perl_newXS_len_flags(pTHX_ const char *name, STRLEN len,
     return cv;
 }
 
+/* Add a stub CV to a typeglob.
+ * This is the implementation of a forward declaration, 'sub foo';'
+ */
+
 CV *
 Perl_newSTUB(pTHX_ GV *gv, bool fake)
 {
diff --git a/regexec.c b/regexec.c
index e00583aece..bb3a630e87 100644
--- a/regexec.c
+++ b/regexec.c
@@ -3424,7 +3424,8 @@ Perl_regexec_flags(pTHX_ REGEXP * const rx, char *stringarg, char *strend,
            we switch it back; otherwise we leave it swapped.
         */
         swap = prog->offs;
-        /* do we need a save destructor here for eval dies? */
+        /* avoid leak if we die, or clean up anyway if match completes */
+        SAVEFREEPV(swap);
         Newxz(prog->offs, (prog->nparens + 1), regexp_paren_pair);
         DEBUG_BUFFERS_r(Perl_re_exec_indentf( aTHX_
 	    "rex=0x%" UVxf " saving  offs: orig=0x%" UVxf " new=0x%" UVxf "\n",
@@ -3809,17 +3810,6 @@ Perl_regexec_flags(pTHX_ REGEXP * const rx, char *stringarg, char *strend,
         goto phooey;
     }
 
-    DEBUG_BUFFERS_r(
-	if (swap)
-            Perl_re_exec_indentf( aTHX_
-		"rex=0x%" UVxf " freeing offs: 0x%" UVxf "\n",
-		0,
-                PTR2UV(prog),
-		PTR2UV(swap)
-	    );
-    );
-    Safefree(swap);
-
     /* clean up; this will trigger destructors that will free all slabs
      * above the current one, and cleanup the regmatch_info_aux
      * and regmatch_info_aux_eval sructs */
@@ -3841,24 +3831,29 @@ Perl_regexec_flags(pTHX_ REGEXP * const rx, char *stringarg, char *strend,
     DEBUG_EXECUTE_r(Perl_re_printf( aTHX_  "%sMatch failed%s\n",
 			  PL_colors[4], PL_colors[5]));
 
-    /* clean up; this will trigger destructors that will free all slabs
-     * above the current one, and cleanup the regmatch_info_aux
-     * and regmatch_info_aux_eval sructs */
-
-    LEAVE_SCOPE(oldsave);
-
     if (swap) {
-        /* we failed :-( roll it back */
+        /* we failed :-( roll it back.
+         * Since the swap buffer will be freed on scope exit which follows
+         * shortly, restore the old captures by copying 'swap's original
+         * data to the new offs buffer
+         */
         DEBUG_BUFFERS_r(Perl_re_exec_indentf( aTHX_
-	    "rex=0x%" UVxf " rolling back offs: freeing=0x%" UVxf " restoring=0x%" UVxf "\n",
+	    "rex=0x%" UVxf " rolling back offs: 0x%" UVxf " will be freed; restoring data to =0x%" UVxf "\n",
 	    0,
             PTR2UV(prog),
 	    PTR2UV(prog->offs),
 	    PTR2UV(swap)
 	));
-        Safefree(prog->offs);
-        prog->offs = swap;
+
+        Copy(swap, prog->offs, prog->nparens + 1, regexp_paren_pair);
     }
+
+    /* clean up; this will trigger destructors that will free all slabs
+     * above the current one, and cleanup the regmatch_info_aux
+     * and regmatch_info_aux_eval sructs */
+
+    LEAVE_SCOPE(oldsave);
+
     return 0;
 }
 
diff --git a/t/re/pat.t b/t/re/pat.t
index 8e9ad812f3..c3e4521131 100644
--- a/t/re/pat.t
+++ b/t/re/pat.t
@@ -23,7 +23,7 @@ BEGIN {
     skip_all('no re module') unless defined &DynaLoader::boot_DynaLoader;
     skip_all_without_unicode_tables();
 
-plan tests => 852;  # Update this when adding/deleting tests.
+plan tests => 853;  # Update this when adding/deleting tests.
 
 run_tests() unless caller;
 
@@ -1885,6 +1885,26 @@ EOF_CODE
             like($got[5],qr/Error: Infinite recursion via empty pattern/,
            "empty pattern in regex codeblock: produced the right exception message" );
         }
+
+    # This test is based on the one directly above, which happened to
+    # leak. Repeat the test, but stripped down to the bare essentials
+    # of the leak, which is to die while executing a regex which is
+    # already the current regex, thus causing the saved outer set of
+    # capture offsets to leak. The test itself doesn't do anything
+    # except sit around hoping not to be triggered by ASan
+    {
+        eval {
+            my $s = "abcd";
+            $s =~ m{([abcd]) (?{ die if $1 eq 'd'; })}gx;
+            $s =~ //g;
+            $s =~ //g;
+            $s =~ //g;
+        };
+        pass("call to current regex doesn't leak");
+    }
+
+
+
     {
         # [perl #130495] /x comment skipping stopped a byte short, leading
         # to assertion failure or 'malformed utf-8 character" warning
@@ -1928,6 +1948,10 @@ EOP
     }
     {
         # buffer overflow
+
+        # This test also used to leak - fixed by the commit which added
+        # this line.
+
         fresh_perl_is("BEGIN{\$^H=0x200000}\ns/[(?{//xx",
                       "Unmatched [ in regex; marked by <-- HERE in m/[ <-- HERE (?{/ at (eval 1) line 1.\n",
                       {}, "buffer overflow for regexp component");

-- 
Perl5 Master Repository



nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About