On Tue, Jan 24, 2017 at 11:10:50AM -0600, Craig A. Berry wrote:
> Setting expectations would help, but it's still a major sea change
> that I don't think has been adequately discussed (sorry if it was and
> I wasn't paying attention).  Things like BBC reports, CPAN smokes, and
> even basic bisecting depend on everything being the same except the
> one thing you want to vary.  Randomizing input data in tests takes
> away people's choice about what gets varied.
>
> I've always thought the purpose of the test suite was to validate that
> things known to be good are still good with a different
> platform/toolchain/configuration/version, etc.  There is nothing wrong
> with exhaustively hunting down things whose goodness is not known, but
> the core test suite, which is included with the release tarball to
> certify the release, seems an odd place for that.

I think random subset selection in the test suite should be a method of
last resort. I believe we do it in one or two places already with Unicode
stuff, since it would take far too long to test every permutation in that
case.

When we do this, I think ideally:

1) the test space should be logically divided into N subsets, and one of
   those subsets is randomly chosen for testing - i.e. there is only a
   single random number generated at the start of the test suite, and
   that selects which set of tests to run - so no doing a 1000-times
   loop and for each iteration choosing a random character and testing
   it.
2) if at least one test fails, then the random number chosen should be
   displayed on stderr (e.g. using diag()) so that it can be seen in
   smoke reports etc. It should also be reported to stdout always;
3) there should be a way to run a test script with a specified random
   number (e.g. via an environment variable);
4) N should be small enough (e.g. 10, 20, 30...) that all permutations
   are likely to have been tried by at least one smoker after a small
   number of days, so we don't get a sudden failure 6 months down the
   line.
5) there should be a method (e.g. via an env var) to make all test
   scripts in the test suite run all tests rather than a random subset.

(Now Karl's going to point out how Unicode makes that nice simple scheme
impractical... ;-)

-- 
"Procrastination grows to fill the available time"
    -- Mitchell's corollary to Parkinson's Law