Re: PL_sv_objcount
From:
bulk88
Date:
February 28, 2013 14:50
Subject:
Re: PL_sv_objcount
Message ID:
BLU0-SMTP76D6BD392D155A6611CEF2DFFE0@phx.gbl
Nicholas Clark wrote:
> On Thu, Feb 28, 2013 at 02:21:23AM -0500, bulk88 wrote:
>> Steffen Mueller wrote:
>>> On 02/27/2013 08:11 PM, Nicholas Clark wrote:
>>>> So yes, good catch. It's going to be non-zero for any meaningful program.
>>> Alas, removing it (see branch smueller smueller/killsvobjcount2), makes
>>> the following test in t/op/stash.t fail. My hunch is that it's
>>> uncovering another bug?
>> I did some tests, and one liners, print()s and system()s do not use the
>> " if (PL_sv_objcount) {" code path. I would leave it alone for
>> performance reasons.
>
> That's the wrong trade off.
>
> For one liners, runtime speed is not going to be as important as start up
> time. For longer running programs, runtime is what matters.
Shutdown time is part of startup time. So it does affect one liners.
Running sv_clean_objs when PL_sv_objcount == 0 takes 1/10th to 1/100th
of a ms, or roughly 200,000 to 20,000 instructions (2 GHz CPU at about one
instruction per cycle; the instruction count is 100% useless, but it makes
a point about numbers).
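As a quick check of that conversion, here is a minimal sketch in C. The function name ms_to_cycles is made up for illustration, and it assumes a fixed clock rate with no frequency scaling.

```c
#include <assert.h>

/* Hypothetical helper: convert a duration in milliseconds to clock
 * cycles at a fixed clock rate given in GHz.
 * 1 GHz is 1e9 cycles per second, i.e. 1e6 cycles per millisecond. */
double ms_to_cycles(double ms, double ghz)
{
    assert(ms >= 0.0 && ghz > 0.0);
    return ms * ghz * 1e6;
}
```

Calling ms_to_cycles(0.1, 2.0) and ms_to_cycles(0.01, 2.0) gives the upper and lower bounds quoted above.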
<sarcasm> I already know everyone here uses mod_perl, so why care about
one liners? They aren't real Perl users. Real Perl users lease another
blade when they need more performance or don't use Perl.</sarcasm>
Anyway, let's figure out why it took a measurable amount of CPU time to
do sv_clean_objs when PL_sv_objcount == 0. I added a counter to S_visit:
_________________________________________________
STATIC I32
S_visit(pTHX_ SVFUNC_t f, const U32 flags, const U32 mask)
{
    dVAR;
    SV* sva;
    I32 visited = 0;
    U32 count = 0;

    PERL_ARGS_ASSERT_VISIT;

    for (sva = PL_sv_arenaroot; sva; sva = MUTABLE_SV(SvANY(sva))) {
        const SV * const svend = &sva[SvREFCNT(sva)];
        SV* sv;
        for (sv = sva + 1; sv < svend; ++sv) {
            count++;
            if (SvTYPE(sv) != (svtype)SVTYPEMASK
                && (sv->sv_flags & mask) == flags
                && SvREFCNT(sv))
            {
                (FCALL)(aTHX_ sv);
                ++visited;
            }
        }
    }
    Perl_warn(aTHX_ "S_visit %u\n", count);
    return visited;
}
__________________________________________________
Running perl -e "print 'japh'"
produced:
_________________________________________________
S_visit 508
S_visit 508
S_visit 508
S_visit 508
_________________________________________________
So for the simplest of one liners, the loop body ran about 2000 times
(4 calls x 508 SV slots = 2032) unnecessarily. The check is from 1994.
Perl ran on a Pentium 1 back then. Someone (Larry?) thought it was a good
idea in 1994 with a 486 or P1. It still is a good idea in 2013. A faster
CPU isn't an excuse to use it up.
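To make the point about the scan concrete, here is a stripped-down model of that loop in C. It is only a sketch: struct slot, SLOT_OBJECT and scan_slots are hypothetical stand-ins, not Perl's real SV head or arena layout. The full pass over every slot happens even when nothing matches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SV head: only the fields the scan reads. */
struct slot {
    unsigned flags;   /* SLOT_OBJECT marks a blessed value */
    unsigned refcnt;  /* 0 means the slot is free */
};

#define SLOT_OBJECT 0x1u

struct scan_result {
    size_t examined;  /* slots the loop touched */
    size_t matched;   /* slots that were live objects */
};

/* Walk every slot, the way an arena sweep would, and count both how
 * many slots were looked at and how many actually needed work. */
struct scan_result scan_slots(const struct slot *slots, size_t n)
{
    struct scan_result r = {0, 0};
    size_t i;

    assert(slots != NULL || n == 0);
    for (i = 0; i < n; ++i) {
        ++r.examined;
        if ((slots[i].flags & SLOT_OBJECT) && slots[i].refcnt)
            ++r.matched;
    }
    return r;
}
```

With 508 slots and no objects, examined comes back as 508 and matched as 0, which mirrors the "S_visit 508" lines above: all of the scanning work, none of the cleanup.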
>
> Removing PL_sv_objcount eliminates a small cost for every object creation
> and destruction. This will scale (roughly) linearly with program runtime.
> So longer programs are really going to feel the benefit. Whereas for
> *trivial* one liners (the only things that will never create objects), the
> extra cost of always calling sv_clean_objs() is
>
> a) going to be swamped by the startup and parsing time anyway
> b) isn't going to be large because there simply aren't going to be that many
> scalars allocated at all.
The cost of the ++/-- code is unmeasurable. The CPU can execute it out of
order, overlapped with the instructions ahead of and behind it, since it
does not depend on any previous calculation or data. Regarding "nano
optimization", I'll point out this post
https://rt.perl.org/rt3/Public/Bug/Display.html?id=116443#txn-1185031
and
https://rt.perl.org/rt3/Public/Bug/Display.html?id=116443#txn-1184961 ,
I won't make a comment on those 2 posts until more people respond in
this thread.
Here are some Win32 VC 32 bit -O1 asm dumps
S_curse
6514: if (SvTYPE(sv) != SVt_PVIO)
28089934 80 7B 08 0F cmp byte ptr [ebx+8],0Fh
28089938 59 pop ecx
28089939 74 06 je $L69352+30h (28089941h)
6515: --PL_sv_objcount;/* XXX Might want something more general */
2808993B FF 8E F4 02 00 00 dec dword ptr [esi+2F4h]
6516: }
6517: return TRUE;
28089941 B0 01 mov al,1
Perl_sv_bless
9615: if (SvOBJECT(tmpRef)) {
2808CEF3 85 C7 test edi,eax
2808CEF5 74 15 je $L59457+44h (2808CF0Ch)
9616: if (SvTYPE(tmpRef) != SVt_PVIO)
2808CEF7 3C 0F cmp al,0Fh
2808CEF9 74 06 je $L59457+39h (2808CF01h)
9617: --PL_sv_objcount;
2808CEFB FF 8B F4 02 00 00 dec dword ptr [ebx+2F4h]
9618: SvREFCNT_dec(SvSTASH(tmpRef));
2808CF01 8B 06 mov eax,dword ptr [esi]
2808CF03 8B 08 mov ecx,dword ptr [eax]
2808CF05 53 push ebx
2808CF06 E8 E5 7A FB FF call S_SvREFCNT_dec (280449F0h)
2808CF0B 59 pop ecx
9619: }
9620: }
////////////the SvOBJECT_on() should have been reordered by the compiler
to happen AFTER the type check, but Visual C 2003 never reorders
statements (2008 in x64 mode does). I don't feel like writing a patch over
1 instruction on 1 compiler/////////
9621: SvOBJECT_on(tmpRef);
2808CF0C 09 7E 08 or dword ptr [esi+8],edi
9622: if (SvTYPE(tmpRef) != SVt_PVIO)
2808CF0F 80 7E 08 0F cmp byte ptr [esi+8],0Fh
2808CF13 74 06 je $L59457+53h (2808CF1Bh)
9623: ++PL_sv_objcount;
2808CF15 FF 83 F4 02 00 00 inc dword ptr [ebx+2F4h]
9624: SvUPGRADE(tmpRef, SVt_PVMG);
2808CF1B 8B 46 08 mov eax,dword ptr [esi+8]
S_sv_dup_common (and if you don't use ithreads or pseudo-fork, don't you
dare look at this :-)
12359: if (SvOBJECT(dstr) && SvTYPE(dstr) != SVt_PVIO)
2809012A 8B 46 08 mov eax,dword ptr [esi+8]
2809012D A9 00 00 10 00 test eax,100000h
28090132 74 0D je $L70889+58h (28090141h)
28090134 3C 0F cmp al,0Fh
28090136 74 09 je $L70889+58h (28090141h)
12360: ++PL_sv_objcount;
28090138 8B 45 08 mov eax,dword ptr [my_perl]
2809013B FF 80 F4 02 00 00 inc dword ptr [eax+2F4h]
12361:
12362: return dstr;
28090141 8B C6 mov eax,esi
28090143 5B pop ebx
28090144 5F pop edi
28090145 5E pop esi
12363: }
28090146 C9 leave
28090147 C3 ret
The profiling code is Win32 only. It includes an adjustment for the
overhead of reading the timer.
_____________________________________________________________
    /* Tell PerlIO we are about to tear things apart in case
       we have layers which are using resources that should
       be cleaned up now.
     */
    PerlIO_destruct(aTHX);

    if (PL_sv_objcount) {
        LARGE_INTEGER my_pre;
        LARGE_INTEGER my_beg;
        LARGE_INTEGER my_end;
        LARGE_INTEGER my_freq;
        /*
         * Try to destruct global references. We do this first so that the
         * destructors and destructees still exist. Some sv's might remain.
         * Non-referenced objects are on their own.
         */
        Perl_warn(aTHX_ "PL_sv_objcount is %u\n", PL_sv_objcount);
        QueryPerformanceFrequency(&my_freq);
        QueryPerformanceCounter(&my_pre);
        QueryPerformanceCounter(&my_beg);
        sv_clean_objs();
        QueryPerformanceCounter(&my_end);
        Perl_warn(aTHX_ "if true time=%f\n",
            (((double)(my_end.QuadPart-my_beg.QuadPart
                       -(my_beg.QuadPart-my_pre.QuadPart))
              /(double)my_freq.QuadPart)));
        PL_sv_objcount = 0;
    }
    else
    {
        LARGE_INTEGER my_pre;
        LARGE_INTEGER my_beg;
        LARGE_INTEGER my_end;
        LARGE_INTEGER my_freq;
        /*
         * Try to destruct global references. We do this first so that the
         * destructors and destructees still exist. Some sv's might remain.
         * Non-referenced objects are on their own.
         */
        Perl_warn(aTHX_ "PL_sv_objcount is %u\n", PL_sv_objcount);
        QueryPerformanceFrequency(&my_freq);
        QueryPerformanceCounter(&my_pre);
        QueryPerformanceCounter(&my_beg);
        sv_clean_objs();
        QueryPerformanceCounter(&my_end);
        Perl_warn(aTHX_ "if false time=%f\n",
            (((double)(my_end.QuadPart-my_beg.QuadPart
                       -(my_beg.QuadPart-my_pre.QuadPart))
              /(double)my_freq.QuadPart)));
        PL_sv_objcount = 0;
    }

    /* unhook hooks which will soon be, or use, destroyed data */
    SvREFCNT_dec(PL_warnhook);
    PL_warnhook = NULL;
    SvREFCNT_dec(PL_diehook);
    PL_diehook = NULL;
___________________________________________________________________________
"time=" is in seconds. Attached is the raw data. There are no S_visit
stats in it, since collecting them would triple to quadruple the 0.011 ms
floor to 0.035-0.045 ms. The raw data with S_visit will probably have to
go in another post because of the attachment size. The workload is
"perl harness base/*.t comp/*.t cmd/*.t io/*.t op/*.t pragma/*.t" for
both raw data files.
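For anyone reading the timing expression in the profiling code, the overhead adjustment boils down to the arithmetic below. This is a portable sketch only: adjusted_seconds is a hypothetical helper, and plain integers stand in for the LARGE_INTEGER counter readings.

```c
#include <assert.h>

/* pre, beg and end are raw counter readings; freq is ticks per second.
 * Two back-to-back reads (pre, beg) estimate what one timer read costs,
 * and that estimate is subtracted from the measured interval. */
double adjusted_seconds(long long pre, long long beg,
                        long long end, long long freq)
{
    long long overhead = beg - pre;   /* cost of one timer read */
    long long measured = end - beg;   /* the work plus one timer read */

    assert(freq > 0);
    return (double)(measured - overhead) / (double)freq;
}
```

With readings of 100, 110 and 1120 at 1000 ticks per second, the overhead is 10 ticks, the measured interval is 1010 ticks, and the adjusted result is exactly 1 second.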