Re: COW in CORE and SvFAKE in older Perl

From: Aaron Sherman
Date: April 30, 2003 23:31
Subject: Re: COW in CORE and SvFAKE in older Perl
Message ID: 1051770700.1480.379.camel@localhost.localdomain

On Wed, 2003-04-30 at 17:49, Nicholas Clark wrote:
> On Tue, Apr 29, 2003 at 06:29:32PM -0400, Aaron Sherman wrote:
> > On Tue, 2003-04-29 at 16:12, Nicholas Clark wrote:
> 
> > some interesting efficiency tricks which are very useful, but result in
> > massive memory duplication (which is why a 250Kb message is by default
> > the largest that it will process, and why SA is a useless tool for most
> > virus-checking).
> 
> Ah. I'm not sure which situations copy on write will give you a win over
> simply passing a scalar reference and dereferencing it as needed. There
> doesn't seem to be any copying for that for the simple test that I did:

The problem in SA is that it does something like this (heavily
paraphrased into pseudo-code):

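sub processmessage {
	my $self = shift;
	# Reference to the full message text (the first copy).
	my $msgref = $self->{message};
	# Splitting builds a second copy of the text, one scalar per line.
	$self->{msg_as_array} = [split /\r?\n/, $$msgref];
	$self->process();
}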

There is a good reason for this, and the performance benefits are
large, but it is a big hit on memory.

Later on many more scalars will be created, but with just that chunk of
code, at a bare minimum (i.e. ignoring the SV and AV overhead), you're
forcing two copies of the message to exist in core at the same time. If
that copy could be copy-on-write, SA's memory footprint for large
messages would immediately (and permanently, since it never writes to
either copy) drop to a little over half of its current usage, and the
default size limit for processed messages could double to 0.5Mb (still
small, but twice as good! ;)
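
To make the "two copies" arithmetic concrete, here is a rough sketch (not
SA's actual code). The message contents and sizes are made up for
illustration. It only counts string payload bytes, ignoring SV and AV
overhead as above, and does not measure real process memory.

```perl
use strict;
use warnings;

# Hypothetical message: 1000 lines of 60 characters, CRLF-terminated.
my $message = join '', map { ('x' x 60) . "\r\n" } 1 .. 1000;
my $msgref  = \$message;

# Same split as in the pseudo-code above.
my @lines = split /\r?\n/, $$msgref;

# Sum the payload bytes now held by the array elements.
my $array_bytes = 0;
$array_bytes += length $_ for @lines;

my $orig_bytes = length $$msgref;

printf "original: %d bytes, array: %d bytes in %d lines\n",
    $orig_bytes, $array_bytes, scalar @lines;
```

The array holds everything except the line terminators, so while both
the original scalar and the array are alive the payload is held roughly
twice.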

> There's a -DC debugging flag to show what the copy on write code is doing.
> It actually copies all strings, however short. There is probably scope for
> tuning this to only copy long things. Also, currently the basic copy code
> doesn't upgrade anything aggressively. It can only copy on write if the
> source is at least PVIV already (with no IV or NV slot in use)

It sounds like you're saying I'm hoping in vain, since the above code
is certainly not allocating PVIVs... perhaps split// is one of those
cases where the optimization should be forced...?
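
One way to check what split// actually hands back is to ask the core B
module for the body type of each element. This is only a sketch:
sv_class is a hypothetical helper name, the sample message is made up,
and the reported types (and the copy-on-write rules themselves) may
differ between Perl versions.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: return the B class name for a scalar's body
# type, e.g. "B::PV" or "B::PVIV".
sub sv_class {
    my ($ref) = @_;
    return ref B::svref_2object($ref);
}

my $message = "first line\r\nsecond line\r\n";
my @lines   = split /\r?\n/, $message;

# $_ is aliased to each array element, so \$_ refers to the element itself.
printf "%-12s %s\n", $_, sv_class(\$_) for @lines;
```

If the elements show up as plain PVs rather than PVIVs, they would not
meet the "at least PVIV" condition quoted above.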

> I don't know. There are no good benchmarks to test this kind of thing.

Heehee, welcome to SpamAssassin, the Perl benchmark from hell! ;-)

