Front page | perl.perl5.porters |
Postings from June 2017
From: Dave Mitchell
June 5, 2017 10:27
Message ID: 20170605102702.GG3083@iabyn.com
I've just pushed a branch, smoke-me/davem/sprintf, which contains
104 commits that heavily rework the Perl_sv_vcatpvfn_flags() function.
This sprawling 1700-line function provides the low-level sprintf
functionality to all the various other sprintf-ish perl and C functions in
core. It has grown piecewise over the years and has become a bit of a
mess. In particular it was hard to determine whether any buffer overflows
My branch, which I intend to merge into blead in a few days time, does
Size modifiers with integer Inf/Nan, e.g. sprintf("%hi", Inf)
used to print spurious warnings.
The very-special-cased "%.NNNg" code has been removed, and is now
handled by the special-cased "...%.NNNg..." code instead, which makes
it about 4% slower, but handles unicode radix points better.
"%n" with a missing arg now croaks with a specific error message
(previously it croaked about modifying a read-only variable).
sprintf("%p", 0+Inf) now prints the address of an SV, not the literal
string "Inf" (or "NaN").
It now warns on a missing explicit width or vector arg, e.g. "%*2$d"
Widths and precisions specified via an argument used to wrap; for
example, this used to output " abc"; it now croaks with "Integer
overflow in format string":
$w = 0x100000005; printf "%*s", $w, "abc";
Illegal formats didn't work well with utf8. It now resets to the
character following the '%' and continues from there.
The locale-specific radix point code now uses
SvCUR(PL_numeric_radix_sv) rather than SvLEN(PL_numeric_radix_sv) for
buffer size allocation. Also, several places weren't taking into
account the length of the radix point string when calculating buffer
sizes; they were only being saved by the general 'add 40' fudge
Buffer sizing and length calculation overflow/wraps:
The code has been thoroughly audited to look for any issues where the
format and/or args are malicious, and some possible (although not
exploitable) wraps have been fixed.
int's and I32's etc have been replaced with STRLEN where appropriate.
Tests for overflow/wrap have been added to the test suite.
The code has been cleaned up a lot and better commented, so it will be
easier to avoid wrapping in future.
The special-cased "...%.0f..." is now detected and handled earlier in
the loop, so is about 35% faster.
It now sets and restores locale only once per function call, not
multiple times per format element.
Generally the code has been heavily tweaked.
On average, the newly-added benchmarks are about 10-15% faster; the
mixed-utf8 benchmark is about about 80% faster.
The HAS_LDBL_SPRINTF_BUG code that worked around a 2002 Irix 6
bug has been removed.
The 300-line block of code which handles %a hex floating-point formats
has been moved into a separate static function.
The 'svmax' arg to the sv_vcatpvfn() family has been changed from I32
to STRLEN and renamed to 'sv_count'.
A "Redundant argument in printf" warning is no longer emitted after an
invalid % format element has been encountered. This makes the code
simpler, and since you get an "Invalid conversion in printf" warning
anyway, does no harm (and in fact may be less confusing).
Several locals vars have been eliminated and the scope of many others
has been reduced.
"Procrastination grows to fill the available time"
-- Mitchell's corollary to Parkinson's Law
by Dave Mitchell