
[perl #130643] run loop optimization

From: Hugo van der Sanden
Date: January 25, 2017 14:23
Subject: [perl #130643] run loop optimization
Message ID: rt-4.0.24-25901-1485354206-914.130643-75-0@perl.org

# New Ticket Created by  Hugo van der Sanden 
# Please include the string:  [perl #130643]
# in the subject line of all future correspondence about this issue. 
# <URL: https://rt.perl.org/Ticket/Display.html?id=130643 >


This is a bug report for perl from hv@crypt.org,
generated with the help of perlbug 1.40 running under perl 5.25.10.

-----------------------------------------------------------------
[Please describe your issue here]

Back in [perl #73480] I noticed that my gcc at the time (4.4.2) was failing
to achieve a simple optimization, and we estimated that tricking gcc into
achieving it via 339aac22c2 made perl programs built with it about 1% faster
overall. For some reason there was no discussion then of the effect this
might have on other compilers.

Since then gcc has moved on, and my current install (4.8.4) does not need
the trickery. In the meantime, eb578fdb55 implemented a blanket removal
of all register declarations, as a result of which the trickery could
potentially result in a similar or greater slowdown with any compiler
that failed to see it could be elided.

So I'm not sure what the best state for the runloop would be right now,
but I suspect we should just remove the extra variable:

--- a/run.c
+++ b/run.c
@@ -36,10 +36,9 @@
 int
 Perl_runops_standard(pTHX)
 {
-    OP *op = PL_op;
-    PERL_DTRACE_PROBE_OP(op);
-    while ((PL_op = op = op->op_ppaddr(aTHX))) {
-	 PERL_DTRACE_PROBE_OP(op);
+    PERL_DTRACE_PROBE_OP(PL_op);
+    while ((PL_op = PL_op->op_ppaddr(aTHX))) {
+	 PERL_DTRACE_PROBE_OP(PL_op);
     }
     PERL_ASYNC_CHECK();
 

Locally, a build with `./Configure -des -Doptimize='-g -O6'` now makes
my main loop in Perl_runops_standard simply:
loop:
	callq  *16(%rax)	      # op_ppaddr
	test   %rax, %rax
	mov    %rax, 0x344913(%rip)   # PL_op
	jne    loop
.. both with and without the change. With threads in the mix I get (again
with or without the patch):
loop:
	mov    %rbx, %rdi
	callq  *16(%rax)
	test   %rax, %rax
	mov    %rax, 8(%rbx)
	jne    loop
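
For readers who want to see where the extra argument and the store through
a pointer come from, here is a minimal standalone sketch of the threaded
shape of the loop. It is not perl source: `Interp`, `cur_op`, `ops_run`,
`pp_step` and `run_threaded_chain` are hypothetical stand-ins, and the field
layout is only an assumption chosen so that the current-op slot sits at
offset 8 and `op_ppaddr` at offset 16 on an LP64 platform, as the
disassembly above suggests.

```c
#include <assert.h>
#include <stddef.h>

typedef struct interp Interp;   /* hypothetical stand-in for the interpreter */
typedef struct op OP;

struct op {
    OP *op_next;                 /* next op in execution order */
    OP *op_sibparent;            /* placeholder, keeps op_ppaddr at offset 16 */
    OP *(*op_ppaddr)(Interp *);  /* function implementing this op */
};

struct interp {
    void *stack_sp;   /* placeholder first field (assumed layout) */
    OP   *cur_op;     /* plays the role of PL_op in a threaded build */
    long  ops_run;    /* counter so the loop has an observable effect */
};

/* Toy op implementation: count the op and hand back the next one. */
static OP *pp_step(Interp *my)
{
    my->ops_run++;
    return my->cur_op->op_next;
}

/* The patched loop shape, with the interpreter passed explicitly.
 * The DTrace probes are left out of this sketch. */
int runops_threaded(Interp *my)
{
    assert(my != NULL && my->cur_op != NULL);
    while ((my->cur_op = my->cur_op->op_ppaddr(my))) {
        /* nothing to do: each op returns its successor */
    }
    return 0;
}

/* Build a chain of three ops, run it, and report how many ops ran. */
long run_threaded_chain(void)
{
    OP ops[3];
    Interp my = { NULL, NULL, 0 };
    for (int i = 0; i < 3; i++) {
        ops[i].op_next = (i < 2) ? &ops[i + 1] : NULL;
        ops[i].op_sibparent = NULL;
        ops[i].op_ppaddr = pp_step;
    }
    my.cur_op = &ops[0];
    runops_threaded(&my);
    return my.ops_run;
}
```

Compiling this with optimization and disassembling `runops_threaded` should
give a loop of roughly the same shape as the one above, though the exact
registers and instruction order will depend on the compiler.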

With -O0 there's unsurprisingly a noticeable improvement with the patch,
but I guess that's only really of interest as a clue to what older or less
developed compilers might be doing.

It'd be useful for people with different compilers or significantly different
platforms to do similar tests.
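
To make such tests easier to run outside a full perl build, here is a
minimal standalone sketch of the two loop shapes for the non-threaded case.
It is an illustration only: the `OP` struct is cut down to three fields, the
DTrace probes are omitted, and `pp_step`, `ops_run`, `runops_with_local`,
`runops_plain` and `run_chain` are hypothetical names that do not exist in
perl.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for perl's OP; not the real definition. */
typedef struct op OP;
struct op {
    OP *op_next;             /* next op in execution order */
    OP *op_sibparent;        /* placeholder, keeps op_ppaddr at offset 16 */
    OP *(*op_ppaddr)(void);  /* function implementing this op */
};

OP *PL_op;             /* global current op, as in a non-threaded build */
static long ops_run;   /* counter so the loops have an observable effect */

/* Toy op implementation: count the op and hand back the next one. */
static OP *pp_step(void)
{
    ops_run++;
    return PL_op->op_next;
}

/* Shape before the patch: an extra local shadows PL_op. */
int runops_with_local(void)
{
    OP *op = PL_op;
    assert(op != NULL);
    while ((PL_op = op = op->op_ppaddr())) {
        /* nothing to do: each op returns its successor */
    }
    return 0;
}

/* Shape after the patch: PL_op is used directly. */
int runops_plain(void)
{
    assert(PL_op != NULL);
    while ((PL_op = PL_op->op_ppaddr())) {
        /* nothing to do: each op returns its successor */
    }
    return 0;
}

/* Build a chain of three ops, run it through the chosen loop,
 * and report how many ops ran. */
long run_chain(int use_local)
{
    static OP ops[3];
    for (int i = 0; i < 3; i++) {
        ops[i].op_next = (i < 2) ? &ops[i + 1] : NULL;
        ops[i].op_sibparent = NULL;
        ops[i].op_ppaddr = pp_step;
    }
    ops_run = 0;
    PL_op = &ops[0];
    if (use_local)
        runops_with_local();
    else
        runops_plain();
    return ops_run;
}
```

Building this file at a few optimization levels and comparing the
disassembly of `runops_with_local` and `runops_plain` should show whether a
given compiler elides the extra variable. Because `PL_op` is a global and
the op functions are called through a pointer, the compiler cannot assume
`PL_op` is unchanged across the call, which is presumably why the local
copy helped older compilers.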

(Meta: should we add a 'severity=optimization'?)

[Please do not change anything below this line]
-----------------------------------------------------------------
---
Flags:
    category=core
    severity=Wishlist
---
Site configuration information for perl 5.25.10:

Configured by hv at Mon Jan 23 20:29:32 GMT 2017.

Summary of my perl5 (revision 5 version 25 subversion 10) configuration:
  Commit id: e18c4116c82b2027a1e5d4e6b9a7214d60779053
  Platform:
    osname=linux
    osvers=3.13.0-101-generic
    archname=x86_64-linux
    uname='linux shad2 3.13.0-101-generic #148-ubuntu smp thu oct 20 22:08:32 utc 2016 x86_64 x86_64 x86_64 gnulinux '
    config_args='-des -Dcc=gcc -Dprefix=/opt/blead-d0 -Doptimize=-g -O0 -DDEBUGGING -Dusedevel -Uversiononly'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    bincompat5005=undef
  Compiler:
    cc='gcc'
    ccflags ='-fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
    optimize='-g -O0'
    cppflags='-fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion=''
    gccversion='4.8.4'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='gcc'
    ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /lib64 /usr/lib64
    libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
    libc=libc-2.19.so
    so=so
    useshrplib=false
    libperl=libperl.a
    gnulibc_version='2.19'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E'
    cccdlflags='-fPIC'
    lddlflags='-shared -g -O0 -L/usr/local/lib -fstack-protector'


---
@INC for perl 5.25.10:
    /opt/blead-d0/lib/perl5/site_perl/5.25.10/x86_64-linux
    /opt/blead-d0/lib/perl5/site_perl/5.25.10
    /opt/blead-d0/lib/perl5/5.25.10/x86_64-linux
    /opt/blead-d0/lib/perl5/5.25.10
    /opt/blead-d0/lib/perl5/site_perl/5.25.8
    /opt/blead-d0/lib/perl5/site_perl

---
Environment for perl 5.25.10:
    HOME=/home/hv
    LANG=C
    LANGUAGE=en_GB:en
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=<elided>
    PERL_BADLANG (unset)
    SHELL=/bin/bash

