
Re: PL_sv_objcount

From: bulk88
Date: February 28, 2013 14:50
Subject: Re: PL_sv_objcount
Message ID: BLU0-SMTP76D6BD392D155A6611CEF2DFFE0@phx.gbl

Nicholas Clark wrote:
> On Thu, Feb 28, 2013 at 02:21:23AM -0500, bulk88 wrote:
>> Steffen Mueller wrote:
>>> On 02/27/2013 08:11 PM, Nicholas Clark wrote:
>>>> So yes, good catch. It's going to be non-zero for any meaningful program.
>>> Alas, removing it (see branch smueller smueller/killsvobjcount2), makes 
>>> the following test in t/op/stash.t fail. My hunch is that it's 
>>> uncovering another bug?
>> I did some tests, and one liners, print()s and system()s do not use the 
>> "    if (PL_sv_objcount) {" code path. I would leave it alone for 
>> performance reasons.
> 
> That's the wrong trade off.
> 
> For one liners, runtime speed is not going to be as important as start up
> time. For longer running programs, runtime is what matters.

Shutdown time is overhead in the same way startup time is, so it does 
affect one-liners. Running sv_clean_objs when PL_sv_objcount == 0 takes 
1/10th to 1/100th of a ms, or roughly 200,000 to 20,000 clock cycles on 
a 2 GHz CPU (a raw cycle count is close to useless as a metric, but it 
makes a point about the scale). 
<sarcasm> I already know everyone here uses mod_perl, so why care about 
one-liners? They aren't real Perl users. Real Perl users lease another 
blade when they need more performance, or don't use Perl.</sarcasm>
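
As a rough check on the scale: converting an elapsed time to clock cycles is just time multiplied by clock rate. At 2 GHz, 0.1 ms works out to 200,000 cycles and 0.01 ms to 20,000. A minimal sketch of that arithmetic follows; the function name is made up for illustration and is not part of perl.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: elapsed microseconds -> clock cycles at a given
 * clock rate in MHz. 1 us at 1 MHz is exactly 1 cycle, so integer math
 * is enough. */
uint64_t cycles_for(uint64_t elapsed_us, uint64_t clock_mhz)
{
    assert(clock_mhz > 0);
    return elapsed_us * clock_mhz;
}
```

For example, cycles_for(100, 2000) covers the 0.1 ms case and cycles_for(10, 2000) the 0.01 ms case.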

Anyway, let's figure out why it took a measurable amount of CPU time to 
run sv_clean_objs when PL_sv_objcount == 0. Adding a counter to S_visit:
_________________________________________________
STATIC I32
S_visit(pTHX_ SVFUNC_t f, const U32 flags, const U32 mask)
{
    dVAR;
    SV* sva;
    I32 visited = 0;
    U32 count = 0;    /* added: slots examined, matching or not */

    PERL_ARGS_ASSERT_VISIT;

    for (sva = PL_sv_arenaroot; sva; sva = MUTABLE_SV(SvANY(sva))) {
        const SV * const svend = &sva[SvREFCNT(sva)];
        SV* sv;
        for (sv = sva + 1; sv < svend; ++sv) {
            count++;
            if (SvTYPE(sv) != (svtype)SVTYPEMASK
                && (sv->sv_flags & mask) == flags
                && SvREFCNT(sv))
            {
                (FCALL)(aTHX_ sv);
                ++visited;
            }
        }
    }
    Perl_warn(aTHX_ "S_visit %u\n", count);    /* added: report per call */
    return visited;
}
__________________________________________________


Running perl -e "print 'japh'" produced:
_________________________________________________
S_visit 508
S_visit 508
S_visit 508
S_visit 508
_________________________________________________

So for the simplest of one-liners the loop body ran about 2000 times 
(4 x 508 = 2032) unnecessarily. The code is from 1994, when Perl ran on 
a 486 or a Pentium 1. Someone (Larry?) thought the PL_sv_objcount check 
was a good idea back then. It still is a good idea in 2013. A faster CPU 
isn't an excuse to use it up.
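
To make the 4 x 508 figure concrete, here is a simplified model of the arena walk. It is only a sketch: the slot struct, the SLOT_OBJECT value, and both function names are assumptions for illustration and do not match perl's real SV layout or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an SV head: only what the walk reads. */
typedef struct slot {
    uint32_t flags;
    uint32_t refcnt;
} slot;

#define SLOT_OBJECT 0x00100000u  /* assumed "is an object" flag bit */

/* Examine every slot and count those that are live and whose masked
 * flags match. *examined is incremented once per slot looked at. */
size_t visit(const slot *arena, size_t n, uint32_t flags, uint32_t mask,
             size_t *examined)
{
    size_t visited = 0;
    assert(arena != NULL && examined != NULL);
    for (size_t i = 0; i < n; i++) {
        ++*examined;
        if ((arena[i].flags & mask) == flags && arena[i].refcnt)
            ++visited;
    }
    return visited;
}

/* Sweep an arena of n live, non-object slots `passes` times and return
 * the total number of slots examined. With no objects present, every
 * one of those iterations is wasted work. */
size_t wasted_iterations(size_t n, int passes)
{
    size_t examined = 0;
    slot *arena;

    assert(n > 0);
    arena = calloc(n, sizeof *arena);
    assert(arena != NULL);
    for (size_t i = 0; i < n; i++)
        arena[i].refcnt = 1;          /* live, but not an object */
    for (int p = 0; p < passes; p++) {
        size_t hits = visit(arena, n, SLOT_OBJECT, SLOT_OBJECT, &examined);
        assert(hits == 0);            /* nothing to clean up */
        (void)hits;
    }
    free(arena);
    return examined;
}
```

Calling wasted_iterations(508, 4) models the one-liner above: four sweeps over 508 slots, none of which finds an object.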

> 
> Removing PL_sv_objcount eliminates a small cost for every object creation
> and destruction. This will scale (roughly) linearly with program runtime.
> So longer programs are really going to feel the benefit. Whereas for
> *trivial* one liners (the only things that will never create objects), the
> extra cost of always calling sv_clean_objs() is
> 
> a) going to be swamped by the startup and parsing time anyway
> b) isn't going to be large because there simply aren't going to be that many
>    scalars allocated at all.

The cost of the ++/-- code is unmeasurable. The CPU can execute it out 
of order relative to the instructions ahead of and behind it, since it 
does not depend on any previous calculation or data. Regarding "nano 
optimization", I'll point out these two posts: 
https://rt.perl.org/rt3/Public/Bug/Display.html?id=116443#txn-1185031 
and 
https://rt.perl.org/rt3/Public/Bug/Display.html?id=116443#txn-1184961 
I won't comment on those 2 posts until more people respond in 
this thread.
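
The trade-off under discussion can be sketched as follows: one increment or decrement per bless/curse, in exchange for skipping the whole sweep at exit when the counter is zero. This is a toy model with made-up names (interp, on_bless, on_curse, destruct), not perl's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interpreter state. */
typedef struct interp {
    size_t objcount;   /* live blessed values, playing the role of PL_sv_objcount */
    size_t sweeps;     /* how many full arena sweeps were performed */
} interp;

void on_bless(interp *my) { assert(my != NULL); ++my->objcount; }

void on_curse(interp *my)
{
    assert(my != NULL && my->objcount > 0);
    --my->objcount;
}

/* Stand-in for sv_clean_objs(): here it only records that a sweep ran. */
void clean_objs(interp *my) { ++my->sweeps; }

/* The guarded shutdown path: sweep only if any object is still alive. */
void destruct(interp *my)
{
    assert(my != NULL);
    if (my->objcount) {
        clean_objs(my);
        my->objcount = 0;
    }
}

/* Bless `blessed` values, curse `cursed` of them, then shut down.
 * Returns the number of sweeps that shutdown performed. */
size_t sweeps_at_exit(size_t blessed, size_t cursed)
{
    interp my = { 0, 0 };
    assert(cursed <= blessed);
    for (size_t i = 0; i < blessed; i++) on_bless(&my);
    for (size_t i = 0; i < cursed; i++)  on_curse(&my);
    destruct(&my);
    return my.sweeps;
}
```

A program that never blesses anything, or that has freed every object before exit, skips the sweep entirely. Removing the counter means destruct() always calls clean_objs().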

Here are some asm dumps from a 32-bit Win32 Visual C build at -O1:

S_curse
   6514: 	if (SvTYPE(sv) != SVt_PVIO)
28089934 80 7B 08 0F      cmp         byte ptr [ebx+8],0Fh
28089938 59               pop         ecx
28089939 74 06            je          $L69352+30h (28089941h)
   6515: 	    --PL_sv_objcount;/* XXX Might want something more general */
2808993B FF 8E F4 02 00 00 dec         dword ptr [esi+2F4h]
   6516:     }
   6517:     return TRUE;
28089941 B0 01            mov         al,1



Perl_sv_bless
   9615: 	if (SvOBJECT(tmpRef)) {
2808CEF3 85 C7            test        edi,eax
2808CEF5 74 15            je          $L59457+44h (2808CF0Ch)
   9616: 	    if (SvTYPE(tmpRef) != SVt_PVIO)
2808CEF7 3C 0F            cmp         al,0Fh
2808CEF9 74 06            je          $L59457+39h (2808CF01h)
   9617: 		--PL_sv_objcount;
2808CEFB FF 8B F4 02 00 00 dec         dword ptr [ebx+2F4h]
   9618: 	    SvREFCNT_dec(SvSTASH(tmpRef));
2808CF01 8B 06            mov         eax,dword ptr [esi]
2808CF03 8B 08            mov         ecx,dword ptr [eax]
2808CF05 53               push        ebx
2808CF06 E8 E5 7A FB FF   call        S_SvREFCNT_dec (280449F0h)
2808CF0B 59               pop         ecx
   9619: 	}
   9620:     }
//////////// The SvOBJECT_on() should have been reordered by the compiler 
to happen AFTER the type check, but Visual C 2003 never reorders 
statements (2008 in x64 mode does). I don't feel like writing a patch 
over 1 instruction on 1 compiler. /////////
   9621:     SvOBJECT_on(tmpRef);
2808CF0C 09 7E 08         or          dword ptr [esi+8],edi
   9622:     if (SvTYPE(tmpRef) != SVt_PVIO)
2808CF0F 80 7E 08 0F      cmp         byte ptr [esi+8],0Fh
2808CF13 74 06            je          $L59457+53h (2808CF1Bh)
   9623: 	++PL_sv_objcount;
2808CF15 FF 83 F4 02 00 00 inc         dword ptr [ebx+2F4h]
   9624:     SvUPGRADE(tmpRef, SVt_PVMG);
2808CF1B 8B 46 08         mov         eax,dword ptr [esi+8]


S_sv_dup_common (and if you don't use ithreads or pseudo-fork, don't you 
dare look at this :-)
  12359:     if (SvOBJECT(dstr) && SvTYPE(dstr) != SVt_PVIO)
2809012A 8B 46 08         mov         eax,dword ptr [esi+8]
2809012D A9 00 00 10 00   test        eax,100000h
28090132 74 0D            je          $L70889+58h (28090141h)
28090134 3C 0F            cmp         al,0Fh
28090136 74 09            je          $L70889+58h (28090141h)
  12360: 	++PL_sv_objcount;
28090138 8B 45 08         mov         eax,dword ptr [my_perl]
2809013B FF 80 F4 02 00 00 inc         dword ptr [eax+2F4h]
  12361:
  12362:     return dstr;
28090141 8B C6            mov         eax,esi
28090143 5B               pop         ebx
28090144 5F               pop         edi
28090145 5E               pop         esi
  12363:  }
28090146 C9               leave
28090147 C3               ret




The profiling code is Win32 only. It includes an adjustment for the 
overhead of reading the timer.
_____________________________________________________________
     /* Tell PerlIO we are about to tear things apart in case
        we have layers which are using resources that should
        be cleaned up now.
      */

     PerlIO_destruct(aTHX);

     if (PL_sv_objcount) {
         LARGE_INTEGER my_pre;
         LARGE_INTEGER my_beg;
         LARGE_INTEGER my_end;
         LARGE_INTEGER my_freq;
         /*
          * Try to destruct global references.  We do this first so that the
          * destructors and destructees still exist.  Some sv's might remain.
          * Non-referenced objects are on their own.
          */
         Perl_warn(aTHX_ "PL_sv_objcount is %u\n", PL_sv_objcount);
         QueryPerformanceFrequency(&my_freq);
         /* two back-to-back reads: (my_beg - my_pre) is the timer overhead */
         QueryPerformanceCounter(&my_pre);
         QueryPerformanceCounter(&my_beg);
         sv_clean_objs();
         QueryPerformanceCounter(&my_end);
         Perl_warn(aTHX_ "if true time=%f\n",
             (((double)(my_end.QuadPart - my_beg.QuadPart
                        - (my_beg.QuadPart - my_pre.QuadPart))
               / (double)my_freq.QuadPart)));
         PL_sv_objcount = 0;
     }
     else {
         /* same measurement, forced to run even though the count is 0 */
         LARGE_INTEGER my_pre;
         LARGE_INTEGER my_beg;
         LARGE_INTEGER my_end;
         LARGE_INTEGER my_freq;
         /*
          * Try to destruct global references.  We do this first so that the
          * destructors and destructees still exist.  Some sv's might remain.
          * Non-referenced objects are on their own.
          */
         Perl_warn(aTHX_ "PL_sv_objcount is %u\n", PL_sv_objcount);
         QueryPerformanceFrequency(&my_freq);
         QueryPerformanceCounter(&my_pre);
         QueryPerformanceCounter(&my_beg);
         sv_clean_objs();
         QueryPerformanceCounter(&my_end);
         Perl_warn(aTHX_ "if false time=%f\n",
             (((double)(my_end.QuadPart - my_beg.QuadPart
                        - (my_beg.QuadPart - my_pre.QuadPart))
               / (double)my_freq.QuadPart)));
         PL_sv_objcount = 0;
     }

     /* unhook hooks which will soon be, or use, destroyed data */
     SvREFCNT_dec(PL_warnhook);
     PL_warnhook = NULL;
     SvREFCNT_dec(PL_diehook);
     PL_diehook = NULL;

___________________________________________________________________________

The "time=" value is in seconds. The raw data is attached. It contains 
no S_visit stats, since printing them would triple or quadruple the 
0.011 ms floor to 0.035-0.045 ms. The raw data with S_visit will 
probably have to go in another post because of the attachment size. The 
workload is "perl harness  base/*.t comp/*.t cmd/*.t io/*.t op/*.t 
pragma/*.t" for both raw data files.
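
For anyone who wants to repeat the measurement off Win32, the same overhead-adjusted timing can be sketched in portable C. This is an assumption-laden stand-in, not the code used for the attached data: it swaps QueryPerformanceCounter for C11 timespec_get (wall clock, not guaranteed monotonic), and the helper names are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current time in nanoseconds, via C11 timespec_get. */
int64_t now_ns(void)
{
    struct timespec ts;
    int ok = timespec_get(&ts, TIME_UTC);
    assert(ok == TIME_UTC);
    (void)ok;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Same adjustment as in the profiling code: pre and beg are two
 * back-to-back timer reads, so (beg - pre) estimates the cost of one
 * read and is subtracted from the measured interval (end - beg). */
int64_t adjusted_ns(int64_t pre, int64_t beg, int64_t end)
{
    return (end - beg) - (beg - pre);
}

/* Time one call of fn and return the adjusted result in seconds.
 * The result can come out slightly negative for very cheap calls. */
double time_call_seconds(void (*fn)(void))
{
    int64_t pre, beg, end;

    assert(fn != NULL);
    pre = now_ns();
    beg = now_ns();
    fn();
    end = now_ns();
    return (double)adjusted_ns(pre, beg, end) / 1e9;
}
```

With timer reads at 100, 110 and 510 ns, adjusted_ns reports 390 ns: a 400 ns interval minus the 10 ns it cost to read the timer.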
