Discussion:
[SA-exim] Monitored Greylisting?
Jay Milk
2006-12-04 16:22:25 UTC
All,

I know this isn't 100% on-topic for this list, but I'm at a loss as to
where else to ask my questions. I also believe that this does fit into
the sa-exim/greylisting world.

I've been watching the sa-exim project for a while. I'm the sys-admin
of a dedicated server running cPanel/WHM (which in turn uses Exim 4.x).
I currently have MailScanner installed, and I keep an eye on the
mail queue using MailWatch. While tagging spam has been mostly
successful for a while, I'd still like to reject it on ingress.
However, we've seen a marked increase in spam recently, specifically in
"good" spam. This is spam that eludes SA quite well -- it appears to
come from many different relays, in different formats, and usually
includes an obfuscated image carrying the spam message, with random prose
below it. It defeats SA rules and the Bayes filter very well. I also have
a few honey-pots set up -- email addresses which are silently advertised
(or easily guessed), and go directly into sa-learn for spam.

On an average day, my server processes ~1,500 messages, of which > 75%
are spam. Even with a well-trained database, I get over 50 missed
spam-messages each day. I get less than five false positives in a week.

All this said, I don't think sa-exim will do my server much good.
High-scoring spam (>25) is already discarded, and with the quality of
spam improving, sa seems to be missing a lot. However, if I could set
up greylisting in a way that's workable for my server (and my
user-base), I think I could improve the user-experience greatly.

Here are my thoughts --
1. I'd like to keep two whitelists, one with from-email/to-email pairs,
and another with from-email/to-domain pairs. I have the know-how to
extract these from the mailscanner-log and populate sql-tables -- I'd
basically add each address my users have sent mail *to* every 10 minutes
or so. Emails that are on either of these whitelists would be delivered
without further delay.
2. I'd like to keep "business hours" for each domain. I see that the
majority of spam is actually coming in outside of business hours, so the
greylisting could be somewhat more aggressive outside of business hours.
3. Incoming messages which don't match either whitelist will be
greylisted -- this is where the monitoring comes in: I could monitor
the greylist database and for each address-pair decide whether to allow
the message next time it comes in, or whether to reject (550) the next
connect-attempt.
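
The three points above can be sketched as a single decision function. This is only a minimal illustration of the logic in Python, not Exim configuration: the in-memory sets and dicts stand in for the SQL tables, and every name, address, and delay value (5 minutes inside business hours, 30 minutes outside) is an assumption made up for the example. It also assumes that a pair with no manual decision falls back to ordinary greylisting, i.e. it is accepted once the sender retries after the delay.

```python
from datetime import datetime, time, timedelta

# Hypothetical in-memory stand-ins for the SQL tables described above.
pair_whitelist = {("alice@example.org", "bob@example.com")}   # (from-email, to-email)
domain_whitelist = {("carol@example.org", "example.com")}     # (from-email, to-domain)
business_hours = {"example.com": (time(8, 0), time(18, 0))}   # per recipient domain
decisions = {}    # (from-email, to-email) -> "allow" or "reject", set by the admin
first_seen = {}   # (from-email, to-email) -> time of the first delivery attempt


def greylist_delay(to_domain, now):
    """Assumed policy: short delay in business hours, longer outside them."""
    start, end = business_hours.get(to_domain, (time(0, 0), time(0, 0)))
    in_hours = start <= now.time() < end
    return timedelta(minutes=5 if in_hours else 30)


def check(sender, recipient, now):
    """Return "accept", "defer" (temporary 4xx) or "reject" (550)."""
    # Lowercasing both addresses is a simplification for matching.
    sender, recipient = sender.lower(), recipient.lower()
    to_domain = recipient.rpartition("@")[2]
    pair = (sender, recipient)

    # 1. Either whitelist delivers without delay.
    if pair in pair_whitelist or (sender, to_domain) in domain_whitelist:
        return "accept"

    # 3. A manual decision made while monitoring overrides the timer.
    verdict = decisions.get(pair)
    if verdict == "allow":
        return "accept"
    if verdict == "reject":
        return "reject"

    # 2. Otherwise greylist, more aggressively outside business hours.
    seen = first_seen.setdefault(pair, now)
    if now - seen >= greylist_delay(to_domain, now):
        return "accept"
    return "defer"
```

For example, an unknown sender writing to bob@example.com at 10:00 is deferred and then accepted on a retry five or more minutes later, unless the pair has been marked "reject" in the meantime.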

I have the expertise to write php-scripts and work with mysql databases,
in order to implement this monitoring system. However, I have *no clue*
when it comes to exim ACL or other configs, and I'm deathly afraid to
recompile exim -- I can't afford to break anything, as I don't have
enough expertise to trouble-shoot and fix this animal.

If anyone can help with the exim-integration on this, I'd be more than
glad to modify mailwatch for greylist monitoring, autowhitelisting,
etc. Of course, the result of any of this work would be fully open source.

Thanks,
-- JM
Michael Heiming
2006-12-11 07:15:31 UTC
Post by Jay Milk
On an average day, my server processes ~1,500 messages, of which > 75%
are spam. Even with a well-trained database, I get over 50 missed
spam-messages each day. I get less than five false positives in a week.
Sounds good. With the recent rise in spam, I sometimes get rid of that
amount of ratware in a couple of minutes. Still, spam is >90% of traffic,
and on secondary MX systems even >99%. I'd suggest taking a close look at
the anti-spam possibilities Exim has to offer.

Of course you could look into FuzzyOCR against GIF spam, which can be
used with recent SA versions, but all this stuff is pretty expensive to
run if you are flooded with spam 24/7. With your small spam volume,
though, it might not be a big problem at all.

You can control every step of an SMTP connection with Exim and delay
suspicious hosts for the smallest mistake. Be very picky. Much ratware
can be provoked into an outright SMTP protocol violation this way, or it
simply goes away. There are quite a few configuration examples
available; STFW.
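
To make the idea concrete, here is a rough sketch of "delay for every small mistake", written in Python purely to show the logic; it is not Exim ACL syntax. The particular checks, the hostname, and the delay values are assumptions chosen for illustration, and a real setup would express the equivalent tests in Exim's own configuration.

```python
import re
from dataclasses import dataclass

OUR_HOSTNAME = "mail.example.com"  # hypothetical name of the receiving server


@dataclass
class Session:
    """Facts observed about one incoming SMTP connection (illustrative only)."""
    helo: str
    spoke_before_banner: bool = False  # client sent data before our greeting


def mistakes(session):
    """List the small protocol mistakes this client made."""
    found = []
    helo = session.helo.strip().lower()
    if session.spoke_before_banner:
        found.append("sent data before the greeting banner")
    if helo == OUR_HOSTNAME:
        found.append("HELO claims to be this server")
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", helo):
        found.append("HELO is a bare IP address without brackets")
    elif "." not in helo and not helo.startswith("["):
        found.append("HELO is not a fully qualified name")
    return found


def delay_seconds(session, per_mistake=15, cap=60):
    """Assumed policy: add a fixed delay per mistake, up to a cap."""
    return min(cap, per_mistake * len(mistakes(session)))
```

A client that behaves gets no delay, while one that says `HELO localhost` and talks before the banner would wait 30 seconds under these assumed values. Impatient ratware often disconnects or breaks protocol during such a delay, which is the effect described above.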

This way you don't need to fire up SA as often, which saves resources:
SA tends to use quite a lot of RAM, which limits the number of spamd
processes you can run in parallel.

Good luck

Michael Heiming
--
