
Re: [perl #132800] lib/unicore/mktables takes too long

February 2, 2018 09:24
On 2 February 2018 at 06:36, Karl Williamson wrote:
> On 02/01/2018 06:20 PM, James E Keenan via RT wrote:
>> On Thu, 01 Feb 2018 21:21:59 GMT, randir wrote:
>>> This is a bug report for perl from,
>>> generated with the help of perlbug 1.41 running under perl 5.27.9.
>>> -----------------------------------------------------------------
>>> [Please describe your issue here]
>>> High-parallel builds of perl are currently stranded by the
>>> lib/unicore/mktables step. It takes ~60% of the total build time:
>>> git clean -dfx && ./Configure -de -Dusedevel && time make -j20 test_prep
>>> make -j20 test_prep  162.93s user 9.08s system 670% cpu 25.648 total
>>> ./miniperl -Ilib lib/unicore/mktables -C lib/unicore -P pod -maketest -p -w  15.82s user 0.23s system 99% cpu 16.057 total
>>> If I include the Configure step in the total time measurement, it is
>>> still ~27%:
>>> bash -c './Configure -de -Dusedevel && make -j20 test_prep'  185.76s user 14.36s system 340% cpu 58.733 total
>>> This happened only during this development cycle; this step has been
>>> taking around 12-16 seconds on 5.20-5.26, growing by roughly a second
>>> each release. Can this be addressed in some way? This makes bisecting
>>> through 5.27.x and, later, all future perls, much slower.
>> I don't find compelling evidence of this trend.  I built perl at v5.24.0,
>> v5.26.0 and HEAD twice in each of three environments.  See the attachment
>> for results.  I focus on 'real' time because with bisection the clock time
>> measures the time a human is waiting for results.  The data is obviously not
>> statistically significant, but my impressions are that:
>> * there is more variation in timings between different 'make' runs for the
>> same Perl version than there is between different Perl versions;
>> * if one machine is inherently faster than another (typically, more
>> cores), then the percentage of total clock time taken up by ./Configure is
>> greater on that machine than on the other ('make' flies like a rocket on
>> dromedary, where 'nproc --all' returns 24);
>> * if you're running Porting/ with the '--module' switch, the
>> total time spent in building and testing prerequisites swamps that taken
>> during 'make'.
> I also haven't anecdotally noticed any marked decrease in speed.  I looked
> through the list of commits in 5.27 to see what might be causing it, and
> nothing stands out.  Each release of Unicode brings more data that has to be
> processed, including whole new files and properties. There was some of that
> in Unicode 10.
> Note that my goal is to make mktables not run very frequently.  It really
> only needs to be run when a yearly new Unicode release comes along, or it
> itself changes.  Although, suppose there is an underlying bug in the core
> that causes it to generate defective tables, and that gets fixed.  Then we
> would need to run it, but we wouldn't know that we need to.  So it does get
> complicated.  There in fact may still be a bug which affects the outputs,
> but it generally only causes comment lines to get garbled, so there's no
> real harm.  I actually haven't noticed this bug in a long time, and I don't
> know if it still happens.  (Once Tux and I were trying to debug a mktables
> problem.  It was slow going, and then he rebased and the problem was
> suddenly gone.  Independently Dave Mitchell had fixed a problem in blead,
> and that was our underlying cause.  So this scenario has happened.)
> And I have a question for you git people.  I was looking at the blame
> history
> It includes
> in the list of recent changes.  But I don't see that that commit actually
> modified mktables.  What's going on?

Personally I have never understood why we treat the output of mktables
any differently than we treat the output of, say,
Porting/ or Porting/

IMO we should just check the output files into the repo, and only run
mktables when the input data is updated. I see no point in
regenerating the tables every time someone does a clean build.
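
The "only run mktables when the input data is updated" idea can be pictured as a checksum stamp: record a digest of the inputs after a successful run, and skip the run while the digest is unchanged. What follows is a minimal shell sketch under stated assumptions, not a description of how the perl build or mktables itself decides when to run. The function names (inputs_digest, needs_regen, regen_if_stale), the stamp file, and the assumption that the inputs live in a directory of their own, separate from the generated outputs and from the stamp, are all hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: rerun a generator only when its inputs change.

# inputs_digest DIR: print one digest covering every file under DIR.
# The per-file cksum lines are sorted so the result does not depend on
# the order in which find happens to visit the files.
inputs_digest() {
    (cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort) | cksum
}

# needs_regen DIR STAMP: succeed (exit 0) when STAMP is missing or no
# longer matches the current digest of DIR.
needs_regen() {
    if [ ! -f "$2" ]; then
        return 0
    fi
    [ "$(inputs_digest "$1")" != "$(cat "$2")" ]
}

# regen_if_stale DIR STAMP COMMAND...: run COMMAND only when the inputs
# changed, and record the new digest only if COMMAND succeeded.
regen_if_stale() {
    input_dir=$1
    stamp=$2
    shift 2
    if needs_regen "$input_dir" "$stamp"; then
        "$@" && inputs_digest "$input_dir" > "$stamp"
    fi
}
```

A call would then look like `regen_if_stale some-input-dir some-stamp-file ./generator args`, where all three names are placeholders. Note that this only detects changes in the input files; it does not cover the case raised earlier in the thread, where a fix in the core changes what the generator would produce from the same inputs.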


perl -Mre=debug -e "/just|another|perl|hacker/"
