[SA-exim] Monitored Greylisting?

Jay Milk

2006-12-04 16:22:25 UTC

All,

I know this isn't 100% on-topic for this list, but I'm at a loss as to
where else to ask my questions. I also believe that this does fit into
the sa-exim/greylisting world.

I've been watching the sa-exim project for a while. I'm the sysadmin
of a dedicated server running cPanel/WHM (which in turn uses Exim 4.x).
I currently have MailScanner installed, and I keep an eye on the
mail queue using MailWatch. While tagging spam has been mostly
successful for a while, I'd still like to reject it on ingress.
However, we've seen a marked increase in spam recently, specifically in
"good" spam. This is spam that eludes SA quite well -- it appears to
come from many different relays, in different formats, and usually
includes an obfuscated image carrying the spam message, with random
prose below. It defeats SA rules and the Bayes filter very well. I also
have a few honeypots set up -- email addresses which are silently
advertised (or easily guessed), and whose mail goes directly into
sa-learn as spam.

On an average day, my server processes ~1,500 messages, of which > 75%
are spam. Even with a well-trained database, more than 50 spam messages
still get through each day. I get fewer than five false positives in a
week.

All this said, I don't think sa-exim will do my server much good.
High-scoring spam (>25) is already discarded, and with the quality of
spam improving, SA seems to be missing a lot. However, if I could set
up greylisting in a way that's workable for my server (and my
user-base), I think I could improve the user-experience greatly.

Here are my thoughts --
1. I'd like to keep two whitelists, one with from-email/to-email pairs,
and another with from-email/to-domain pairs. I have the know-how to
extract these from the mailscanner-log and populate sql-tables -- I'd
basically add each address my users have sent mail *to* every 10 minutes
or so. Emails that are on either of these whitelists would be delivered
without further delay.
2. I'd like to keep "business hours" for each domain. I see that the
majority of spam is actually coming in outside of business hours, so the
greylisting could be somewhat more aggressive outside of business hours.
3. Incoming messages which don't match either whitelist will be
greylisted -- this is where the monitoring comes in: I could monitor
the greylist database and, for each address pair, decide whether to
allow the message the next time it comes in, or whether to reject (550)
the next connection attempt.
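
To make points 1-3 concrete, here is a minimal sketch of the decision
logic in Python. It is only an illustration under stated assumptions,
not a working Exim integration: the class, field names, delay values,
and the "accept"/"defer"/"reject" labels are all hypothetical, the
in-memory sets stand in for the SQL tables, and business hours are
checked by time of day only. It also assumes that a pair nobody has
reviewed yet falls back to ordinary time-based greylisting.

```python
from dataclasses import dataclass, field
from datetime import datetime, time, timedelta


@dataclass
class GreylistPolicy:
    # Hypothetical in-memory stand-ins for the SQL tables described above.
    pair_whitelist: set = field(default_factory=set)    # (from_email, to_email)
    domain_whitelist: set = field(default_factory=set)  # (from_email, to_domain)
    business_hours: dict = field(default_factory=dict)  # to_domain -> (open, close)
    verdicts: dict = field(default_factory=dict)        # pair -> "allow" or "reject"
    first_seen: dict = field(default_factory=dict)      # pair -> datetime of first attempt
    delay_in_hours: int = 300    # seconds; assumed value
    delay_off_hours: int = 1800  # seconds; assumed, more aggressive

    def in_business_hours(self, domain, now):
        hours = self.business_hours.get(domain)
        if hours is None:
            # Assumption: domains without configured hours get the shorter delay.
            return True
        opens, closes = hours
        return opens <= now.time() < closes

    def decide(self, sender, recipient, now):
        sender, recipient = sender.lower(), recipient.lower()
        to_domain = recipient.rpartition("@")[2]
        pair = (sender, recipient)

        # Point 1: either whitelist delivers without delay.
        if pair in self.pair_whitelist or (sender, to_domain) in self.domain_whitelist:
            return "accept"

        # Point 3: a verdict set through the monitoring tool wins next.
        verdict = self.verdicts.get(pair)
        if verdict == "allow":
            return "accept"
        if verdict == "reject":
            return "reject"  # permanent failure (550)

        # Otherwise greylist; point 2 picks the delay for the current time.
        first = self.first_seen.setdefault(pair, now)
        if self.in_business_hours(to_domain, now):
            delay = self.delay_in_hours
        else:
            delay = self.delay_off_hours
        if now - first >= timedelta(seconds=delay):
            return "accept"
        return "defer"  # temporary 4xx failure; sender should retry later
```

A monitoring front-end would only need to write "allow" or "reject"
into the verdicts table for a given pair; the mail server side would
then call something like decide() for every sender/recipient pair at
RCPT time.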

I have the expertise to write PHP scripts and work with MySQL databases
in order to implement this monitoring system. However, I have *no clue*
when it comes to Exim ACLs or other configs, and I'm deathly afraid to
recompile Exim -- I can't afford to break anything, as I don't have
enough expertise to troubleshoot and fix this animal.

If anyone can help with the Exim integration on this, I'd be more than
glad to modify MailWatch for greylist monitoring, auto-whitelisting,
etc. Of course, the result of any of this work would be fully open source.

Thanks,
-- JM