Front page | perl.perl5.porters |
Postings from July 2012
[perl #114272] \V and \H skip too far and match incorrectly
From:
Nicholas Clark
Date:
July 26, 2012 05:34
Subject:
[perl #114272] \V and \H skip too far and match incorrectly
Message ID:
rt-3.6.HEAD-11172-1343306025-88.114272-75-0@perl.org
# New Ticket Created by Nicholas Clark
# Please include the string: [perl #114272]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=114272 >
Building with -faddress-sanitizer (on OS X) happens to reveal a bug in perl
which is concealed on most other platforms due to how memory allocation is
rounded up.
The bug itself is twofold. Firstly a wrong result:
$ ./miniperl -le '$_ = "\xFFn"; print /\Vn/ ? "y" : "n"'
n
$ ./miniperl -le '$_ = "\xFFn"; utf8::upgrade($_); print /\Vn/ ? "y" : "n"'
y
(it should be the same, independent of internal representation)
and secondly reading from unallocated memory:
$ valgrind ./miniperl -le '$_ = "\xFFn"; print /\Vn/ ? "y" : "n"'
==17838== Memcheck, a memory error detector
==17838== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==17838== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==17838== Command: ./miniperl -le $_\ =\ "\\xFFn";\ print\ /\\Vn/\ ?\ "y"\ :\ "n"
==17838==
==17838== Invalid read of size 1
==17838== at 0x68EADA: S_regmatch (regexec.c:5852)
==17838== by 0x67FFEC: S_regtry (regexec.c:2729)
==17838== by 0x67DF72: Perl_regexec_flags (regexec.c:2393)
==17838== by 0x5632B5: Perl_pp_match (pp_hot.c:1378)
==17838== by 0x508817: Perl_runops_debug (dump.c:2129)
==17838== by 0x40B3FA: S_run_body (perl.c:2409)
==17838== by 0x40A754: perl_run (perl.c:2326)
==17838== by 0x4481BB: main (miniperlmain.c:119)
==17838== Address 0x5c8d20d is 5 bytes after a block of size 8 alloc'd
==17838== at 0x4C24B4A: malloc (vg_replace_malloc.c:263)
==17838== by 0x509110: Perl_safesysmalloc (util.c:100)
==17838== by 0x576CC0: Perl_sv_grow (sv.c:1520)
==17838== by 0x594C39: Perl_sv_setsv_flags (sv.c:4239)
==17838== by 0x55A8D6: Perl_pp_sassign (pp_hot.c:212)
==17838== by 0x508817: Perl_runops_debug (dump.c:2129)
==17838== by 0x40B3FA: S_run_body (perl.c:2409)
==17838== by 0x40A754: perl_run (perl.c:2326)
==17838== by 0x4481BB: main (miniperlmain.c:119)
==17838==
n
==17838==
==17838== HEAP SUMMARY:
==17838== in use at exit: 107,704 bytes in 509 blocks
==17838== total heap usage: 625 allocs, 116 frees, 133,452 bytes allocated
==17838==
==17838== LEAK SUMMARY:
==17838== definitely lost: 0 bytes in 0 blocks
==17838== indirectly lost: 0 bytes in 0 blocks
==17838== possibly lost: 0 bytes in 0 blocks
==17838== still reachable: 107,704 bytes in 509 blocks
==17838== suppressed: 0 bytes in 0 blocks
==17838== Rerun with --leak-check=full to see details of leaked memory
==17838==
==17838== For counts of detected and suppressed errors, rerun with: -v
==17838== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
however, to reveal that, I have to hack sv.h and handy.h to disable initial
over-allocation:
$ git diff handy.h sv.c
diff --git a/handy.h b/handy.h
index 2205742..629b540 100644
--- a/handy.h
+++ b/handy.h
@@ -1143,7 +1143,11 @@ PoisonWith(0xEF) for catching access to freed memory.
(void)(sizeof(t) > 1 && ((MEM_SIZE)(n)+0.0) > MEM_SIZE_MAX/sizeof(t) &&
#define MEM_WRAP_CHECK_(n,t) MEM_WRAP_CHECK(n,t),
+#if 1
#define PERL_STRLEN_ROUNDUP(n) ((void)(((n) > MEM_SIZE_MAX - 2 * PERL_STRLEN_RO
+#else
+#define PERL_STRLEN_ROUNDUP(n) (n)
+#endif
#else
diff --git a/sv.c b/sv.c
index 2d5444e..3261894 100644
--- a/sv.c
+++ b/sv.c
@@ -1506,10 +1506,10 @@ Perl_sv_grow(pTHX_ register SV *const sv, register STRLE
s = SvPVX_mutable(sv);
if (newlen > SvLEN(sv)) { /* need more room? */
- STRLEN minlen = SvCUR(sv);
+ /* STRLEN minlen = SvCUR(sv);
minlen += (minlen >> PERL_STRLEN_EXPAND_SHIFT) + 10;
if (newlen < minlen)
- newlen = minlen;
+ newlen = minlen;*/
#ifndef Perl_safesysmalloc_size
newlen = PERL_STRLEN_ROUNDUP(newlen);
#endif
(changing the #if 1 to #if 0 demonstrates that the read beyond end bug is
also present for chr 254)
The error is in this code in S_regmatch:
#define CASE_CLASS(nAmE) \
case nAmE: \
if (locinput >= PL_regeol) \
sayNO; \
if ((n=is_##nAmE(locinput,utf8_target))) { \
locinput += n; \
nextchr = UCHARAT(locinput); \
} else \
sayNO; \
break; \
case N##nAmE: \
if (locinput >= PL_regeol) \
sayNO; \
if ((n=is_##nAmE(locinput,utf8_target))) { \
sayNO; \
} else { \
locinput += UTF8SKIP(locinput); \
nextchr = UCHARAT(locinput); \
} \
break
CASE_CLASS(VERTWS);
CASE_CLASS(HORIZWS);
#undef CASE_CLASS
which is using UTF8SKIP unconditionally, when (if I understand it correctly)
it *should* only be using them if utf8_target is true.
The code in question was added by commit e1d1eefb8c88e0dc in April 2007, so
it's been in everything since 5.010
Nicholas Clark
-
[perl #114272] \V and \H skip too far and match incorrectly
by Nicholas Clark