perl.perl5.porters | Postings from January 2017

test randomization (Re: slowness of ext/XS-APItest/t/handy.t,utf8.t)

From: Craig A. Berry
Date: January 24, 2017 17:11
Subject: test randomization (Re: slowness of ext/XS-APItest/t/handy.t,utf8.t)
Message ID: CA+vYcVx+ytFi9z2aXKzmWvVMmkj8KvDexPb5Y8EQtPnO4KAsvQ@mail.gmail.com
On Mon, Jan 23, 2017 at 4:04 AM, demerphq <demerphq@gmail.com> wrote:
> On 21 January 2017 at 15:37, Craig A. Berry <craig.a.berry@gmail.com> wrote:

> I think we do fixed subset testing more often than people realize, and
> that sample testing would test more stuff, and would make sure bizarre
> stuff gets tested.

But it can only have value if people know that it's being done and
know how to reproduce failures.  That includes ensuring that any
automated processes people run can capture and report the
randomly-generated inputs that triggered a failure and would be
needed to reproduce it.
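The usual way to make that work is to seed the generator from an environment variable when one is set, and always print the seed in the test output so smoker logs capture it. A minimal sketch in Python (the pattern is language-agnostic; the `TEST_RANDOM_SEED` name is hypothetical, not an existing knob of the perl test suite):

```python
import os
import random
import time

# Hypothetical variable name: the point is only that a later run must be
# able to replay an earlier run's randomness exactly.
SEED_ENV = "TEST_RANDOM_SEED"

def get_reproducible_rng():
    """Seed an RNG from the environment if set, otherwise from the clock,
    and always report the seed so a failure log contains it."""
    seed = os.environ.get(SEED_ENV)
    if seed is None:
        seed = str(int(time.time()))
    # TAP-style diagnostic line, so any automated smoker captures the
    # seed alongside the failure it needs to reproduce.
    print(f"# random seed: {seed} (set {SEED_ENV}={seed} to reproduce)")
    return random.Random(int(seed))

rng = get_reproducible_rng()
inputs = [rng.randint(0, 0x10FFFF) for _ in range(5)]
```

Rerunning with the reported seed exported regenerates the identical input list, which is what turns a one-off random failure into a repeatable one.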

You're basically talking about fuzzing by another name, but with a
home-grown random input generator rather than AFL or something, and
looking for incorrect results rather than invalid memory accesses.
When someone using a fuzzer reports a bug, we don't add the millions
of inputs they ran to the test suite; we add the one that triggered a
bug.  I don't see why that should be different for any other kind of
testing with random inputs.
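That practice of pinning only the triggering input can be sketched like this (illustrative Python; the toy validity check and the specific failing code point are made up for the example, not taken from the handy.t/utf8.t tests):

```python
# Once a fuzzer or randomized run finds a failing input, that one input
# is recorded in the suite verbatim; the millions of other random inputs
# that passed are not.
def is_valid_codepoint(cp):
    # Toy stand-in for the routine under test (e.g. a UTF-8 sanity check):
    # reject out-of-range values and UTF-16 surrogates.
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)

# Fixed regression cases: one hypothetical fuzzer find plus a boundary.
REGRESSION_CASES = [
    (0xD800, False),   # surrogate flagged by a hypothetical randomized run
    (0x10FFFF, True),  # highest valid code point, kept as a boundary check
]

for cp, expected in REGRESSION_CASES:
    assert is_valid_codepoint(cp) == expected, f"U+{cp:04X}"
print("ok")
```

The suite then deterministically re-checks exactly the inputs known to have mattered, rather than re-rolling the dice on every run.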

> To a certain extent I think this is an XY problem.
>
> Your underlying concern seems to be about how the end user with
> minimal information will respond to a failing test. Since a randomized
> test file might not be perceived as a repeatable failure by the end
> user, they may incorrectly respond to the failure by ignoring the
> error, or by not trusting Perl's test suite.

> So I am trying to suggest that we address *that* problem rather than
> try to forbid test randomization.

Setting expectations would help, but it's still a major sea change
that I don't think has been adequately discussed (sorry if it was and
I wasn't paying attention).  Things like BBC reports, CPAN smokes, and
even basic bisecting depend on everything being the same except the
one thing you want to vary.  Randomizing input data in tests takes
away people's choice about what gets varied.

I've always thought the purpose of the test suite was to validate that
things known to be good are still good with a different
platform/toolchain/configuration/version, etc.  There is nothing wrong
with exhaustively hunting down things whose goodness is not known, but
the core test suite, which is included with the release tarball to
certify the release, seems an odd place for that.


