develooper Front page | perl.perl5.porters | Postings from August 2008

Re: Fighting the Good Fight against spam deluge (was: Senatorial (Senescent?) reflective pause)

From:
Tom Christiansen
Date:
August 1, 2008 08:04
Subject:
Re: Fighting the Good Fight against spam deluge (was: Senatorial (Senescent?) reflective pause)
Message ID:
22174.1217603039@chthon
In-Reply-To: Message from jm@jmason.org (Justin Mason) 
   of "Fri, 01 Aug 2008 11:03:59 BST." <20080801100359.32DEE30C0B1@jmason.org> 

>Tim Bunce writes:

>> On Thu, Jul 31, 2008 at 04:15:49PM -0600, Tom Christiansen wrote:
>>> 
>>> I do have another Perl-spin bug/problem related, but I am pretty sure 
>>> this is some pessimal combo of input data and processing code.  Here's
>>> an example of it:
>>> 
>>>       UID   PID  PPID CPU PRI NI   VSZ   RSS WCHAN   STAT TT       TIME COMMAND
>>> 	0  2463  3795   0   2 20 44920 10020 poll   I     ??   14:42.02 perl5.10.0: spamd child (perl5.10.0)
>>> 
>>> Nearly 15 minutes CPU time to process one message?!

>Note that spamd reuses forked children, so that process has probably
>processed more than 1 msg.

Oh, so that might explain pid 3795 then.  Its WCHAN is poll, meaning I
presume, select, waiting on servicing another request from its SA daemon
master.  (The blacklisters' polls are trailing their log files.)

Still, that's a lot of RSS, eh?

  UID   PID  PPID CPU PRI NI   VSZ   RSS WCHAN   STAT TT       TIME COMMAND
    0  3795     1   0   2 20 30640  2616 poll   Is    ??    1:51.43 perl5.10.0: /usr/local/bin/spamd -m 8 -d -p 4321 (perl5.10.0)
  101 12053 21910   0   2 20   380   732 fifow  I     ??    0:00.19 perl: sigrand (perl)
    0 12686  3795   0   2 20 39012 26880 poll   I     ??    6:30.87 perl5.10.0: spamd child (perl5.10.0)
    0 21364  3795   0   2 20 33364  1100 poll   I     ??    0:29.48 perl5.10.0: spamd child (perl5.10.0)
  101 14990  6699   0  10 20   172     4 -      T     p0    0:00.04 /usr/bin/perl /home/tchrist/scripts/mail ja
    0  1663  3969   0   2 20  2540  1152 poll   I     p1    0:14.71 /usr/local/bin/perl /home/tchrist/scripts/blacklist-imposters-smtp /var/log/maillog.0.gz /var/log/maillog.1.gz /var/log/maillog.2.gz /var/log/maillog.3.gz /var/log/maillog.4.gz /var/log/maillog.5.gz /var/log/maillog.6.gz /var/log/maillog.7.gz
    0 12917  3969   0   2 20  2524  1160 poll   I     p1    1:38.32 /usr/local/bin/perl /home/tchrist/scripts/blacklist-imposters-spamd /var/log/daemon.0.gz /var/log/daemon.1.gz /var/log/daemon.2.gz /var/log/daemon.3.gz /var/log/daemon.4.gz /var/log/daemon.5.gz /var/log/daemon.6.gz /var/log/daemon.7.gz

Oh, no that's not it.  Grepping on spamd yields:

UID   PID  PPID CPU PRI NI   VSZ   RSS WCHAN   STAT TT       TIME COMMAND
   62 16243     1  22  10 20  9800  1648 nanosl Is    ??   39:19.71 spamd: (pf <spamd-white> update) (spamd)
   62  8872 16243   0   2 20 15872  9504 select I     ??   22:11.38 /usr/libexec/spamd -v -5 -g
   62 24330 16243   0  -6 20  8492   368 piperd I     ??    0:22.01 spamd: (/var/db/spamd update) (spamd)
    0  3795     1   0   2 20 30640  2616 poll   Is    ??    1:51.66 perl5.10.0: /usr/local/bin/spamd -m 8 -d -p 4321 (perl5.10.0)
    0 12686  3795  13   2 20 39348 28172 poll   I     ??    6:39.63 perl5.10.0: spamd child (perl5.10.0)
    0 21364  3795   0   2 20 33364  1100 poll   I     ??    0:29.48 perl5.10.0: spamd child (perl5.10.0)
  101  7458  6699   0  28 20  1524     4 -      RV    p0    0:00.00 grep spamd (tcsh)
    0 12917  3969   0   2 20  2524  1160 poll   I     p1    1:38.48 /usr/local/bin/perl /home/tchrist/scripts/blacklist-imposters-spamd /var/log/daemon.0.gz /var/log/daemon.1.gz /var/log/daemon.2.gz /var/log/daemon.3.gz /var/log/daemon.4.gz /var/log/daemon.5.gz /var/log/daemon.6.gz /var/log/daemon.7.gz

There are different WCHANs for poll and select.

And yes, those are two very different flavors of spamd.  Just 
wanted to check to see for sure whom to SIGkick, howsoever 
harmless SIGUSR2 might (or mightn't) be.

>> My usual first approach is something like
>> 
>>  $SIG{USR2} = \&Carp::cluck; # or \&Carp::confess;

Great idea, Tim--thanks!

> spamd handily already does that. ;)  Send it a SIGUSR2 and it'll
> dump a backtrace, and the current message's Message-ID.

Gosh, *two* steps of me, you are!

> If you can capture a message that causes one of the standard rules to go
> exponential, please open a bug at http://issues.apache.org/SpamAssassin/
> and we'll track down the regexp and fix it.  (If it's a perl bug, we'll
> then open it in _that_ bug tracker ;)

Got it!

Aug  1 09:01:50 chthon spamd[12686]: spamd: caught SIGUSR2 - dumping backtrace. most recent message: <33416880.1217602525815.JavaMail.bulky@10.8.7.84> 
Aug  1 09:01:51 chthon spamd[12686]:  at /usr/local/bin/spamd line 2428 
Aug  1 09:01:51 chthon spamd[12686]:  main::backtrace_handler('USR2') called at /usr/local/lib/perl5/site_perl/5.10.0/Mail/SpamAssassin/SpamdForkScaling.pm line 659 
Aug  1 09:01:51 chthon spamd[12686]:  eval {...} called at /usr/local/lib/perl5/site_perl/5.10.0/Mail/SpamAssassin/SpamdForkScaling.pm line 659 
Aug  1 09:01:51 chthon spamd[12686]:  Mail::SpamAssassin::SpamdForkScaling::sysread_with_timeout('Mail::SpamAssassin::SpamdForkScaling=HASH(0x3c039488)', 'IO::Socket=GLOB(0x3d119514)', 'SCALAR(0x3e35b668)', 6, 300) called at /usr/local/lib/perl5/site_perl/5.10.0/Mail/SpamAssassin/SpamdForkScaling.pm line 587 
Aug  1 09:01:52 chthon spamd[12686]:  Mail::SpamAssassin::SpamdForkScaling::wait_for_orders('Mail::SpamAssassin::SpamdForkScaling=HASH(0x3c039488)') called at /usr/local/bin/spamd line 1060 
Aug  1 09:01:52 chthon spamd[12686]:  main::spawn() called at /usr/local/lib/perl5/site_perl/5.10.0/Mail/SpamAssassin/SpamdForkScaling.pm line 823 
Aug  1 09:01:52 chthon spamd[12686]:  Mail::SpamAssassin::SpamdForkScaling::need_to_add_server('Mail::SpamAssassin::SpamdForkScaling=HASH(0x3c039488)', 0) called at /usr/local/lib/perl5/site_perl/5.10.0/Mail/SpamAssassin/SpamdForkScaling.pm line 808 
Aug  1 09:01:52 chthon spamd[12686]:  Mail::SpamAssassin::SpamdForkScaling::adapt_num_children('Mail::SpamAssassin::SpamdForkScaling=HASH(0x3c039488)') called at /usr/local/lib/perl5/site_perl/5.10.0/Mail/SpamAssassin/SpamdForkScaling.pm line 388 
Aug  1 09:01:53 chthon spamd[12686]:  Mail::SpamAssassin::SpamdForkScaling::main_server_poll('Mail::SpamAssassin::SpamdForkScaling=HASH(0x3c039488)', 2) called at /usr/local/bin/spamd line 974 

Very nice, and very clever.  And no exponential pattern.  Yet. :-)

But I'll know what to keep my eye on now, and how. 

Thanks to everybody!

--tom



nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About