Monday, August 30, 2004

more statistics: SPAM


spam_stat_Aug
Originally uploaded by s_hare.
You may think I'm nuts. That's fine with me. But you should know that I did not count my mail by hand, and seeing the result, I think it was worth writing the little script that made this plot. What we see here is the total amount of mail I received in August; the red part is the SPAM it contained, which was successfully filtered out so that I never had to look at it. A fraction still gets through the filters, about two to ten messages a day, but those are marked as doubtful. I am shocked by the amount of SPAM and by how large a share of my mail it makes up. If this were my mailbox at home, I'd be furious.
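For anyone who wants to do the same kind of count, here is a minimal sketch in Python. It is not the script that made the plot. It assumes the filters tag spam with an `X-Spam-Flag: YES` header (SpamAssassin) or an `X-Bogosity` header starting with `Spam` or `Yes` (bogofilter); check what your own filters actually write. The function names `is_spam`, `daily_counts` and `spam_fraction` are made up for this example.

```python
from collections import Counter
from email import message_from_string
from email.utils import parsedate_to_datetime


def is_spam(msg):
    """Assumption: spam is recognisable by one of these two headers."""
    flag = (msg.get("X-Spam-Flag") or "").strip().upper()
    bogosity = (msg.get("X-Bogosity") or "").strip().lower()
    return flag == "YES" or bogosity.startswith(("spam", "yes"))


def daily_counts(raw_messages):
    """Count all mail and spam per calendar day of the Date header."""
    total, spam = Counter(), Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        date_header = msg.get("Date")
        if date_header is None:
            continue  # no date, so it cannot be assigned to a day
        try:
            day = parsedate_to_datetime(date_header).date()
        except (TypeError, ValueError):
            continue  # unparsable date, skip the message
        total[day] += 1
        if is_spam(msg):
            spam[day] += 1
    return total, spam


def spam_fraction(total, spam):
    """Share of spam in all counted mail, 0.0 if nothing was counted."""
    n = sum(total.values())
    return sum(spam.values()) / n if n else 0.0
```

Feed `daily_counts` the raw text of each message (for example from an mbox file or a Maildir), then plot `total` and `spam` per day as stacked bars.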

The technical bit: I use procmail on the mail server to send each mail through SpamAssassin and then through bogofilter. It is set up so that bogofilter learns from SpamAssassin. Moreover, when bogofilter is in doubt, it gets feedback from me. The goodlist is 7.4 MB and the spamlist 4.3 MB. This setup saved me from seeing 3.3 MB of SPAM in August. Not a single mail was lost to a false alarm. I keep my fingers crossed.
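The flow of such a two-stage setup can be sketched in a few lines of Python. This is only one possible reading of the chain, not the actual procmail configuration: a rule-based verdict (standing in for SpamAssassin) trains a learning filter (standing in for bogofilter), and the learning filter asks for feedback when it is unsure. `ToyBayesFilter`, `route` and `ask_user` are invented for this illustration, and the word counting is a toy, not bogofilter's real statistics.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBayesFilter:
    """Hypothetical stand-in for a learning filter (not bogofilter itself)."""
    spam_words: dict = field(default_factory=dict)
    ham_words: dict = field(default_factory=dict)

    def learn(self, text, is_spam):
        # Add every word of the message to the matching word list.
        table = self.spam_words if is_spam else self.ham_words
        for word in text.lower().split():
            table[word] = table.get(word, 0) + 1

    def classify(self, text):
        # Three-way verdict: "spam", "ham" or "unsure".
        words = text.lower().split()
        s = sum(self.spam_words.get(w, 0) for w in words)
        h = sum(self.ham_words.get(w, 0) for w in words)
        if s > 0 and s > 2 * h:
            return "spam"
        if h > 0 and h > 2 * s:
            return "ham"
        return "unsure"


def route(text, sa_says_spam, learner, ask_user):
    """Return "spam" or "inbox" for one message, training as we go."""
    if sa_says_spam:
        # Assumed reading: the first filter's spam verdict trains the second.
        learner.learn(text, is_spam=True)
        return "spam"
    verdict = learner.classify(text)
    if verdict == "unsure":
        # Doubtful mail: a human decides, and the answer is learned.
        user_says_spam = ask_user(text)
        learner.learn(text, is_spam=user_says_spam)
        return "spam" if user_says_spam else "inbox"
    return "spam" if verdict == "spam" else "inbox"
```

In a real setup the feedback on doubtful mail would arrive later, once it has been read; the `ask_user` callback compresses that step so the sketch stays self-contained.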