It seems I am still having problems with spam on this site. Instead of getting tons of spam comments posted, I now get false positives, that is, legitimate content marked as spam.
First off, I wish to apologize to the people who haven't seen their comments published immediately. Next, I will work on improving this and explain the current situation. Here are the statistics that the spam module gives me:
- prevented comment spam: 100 (27%, aka true positives)
- marked comment as spam: 242 (65%, aka true positives)
- manually marked comment as spam: 3 (0.8%, aka false negatives)
- marked comment as not spam: 17 (5%, aka true negatives)
- manually marked comment as not spam: 8 (2.2%, aka false positives)
- total: 370 comments
Another way to put it:
- spam: 93%, 99% of which is automatically detected
- ham: 7%, 32% of which is marked as spam
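For the curious, here is a minimal sketch of how those two summary lines follow from the raw counts. The dictionary keys and the `summarize` function are names I made up for illustration; they are not part of the spam module.

```python
# Counts copied from the spam module statistics above.
# The key names are hypothetical labels, not identifiers from the module.
counts = {
    "prevented": 100,    # spam blocked outright (true positives)
    "auto_spam": 242,    # automatically marked as spam (true positives)
    "manual_spam": 3,    # marked as spam by hand (false negatives)
    "auto_ham": 17,      # automatically marked as not spam (true negatives)
    "manual_ham": 8,     # marked as not spam by hand (false positives)
}


def summarize(c):
    """Return the spam/ham shares and detection ratios as fractions."""
    total = sum(c.values())
    spam = c["prevented"] + c["auto_spam"] + c["manual_spam"]
    ham = c["auto_ham"] + c["manual_ham"]
    return {
        "total": total,
        "spam_share": spam / total,
        # Share of spam caught without manual intervention.
        "spam_detected": (c["prevented"] + c["auto_spam"]) / spam,
        "ham_share": ham / total,
        # Share of legitimate comments wrongly flagged as spam.
        "ham_flagged": c["manual_ham"] / ham,
    }


summary = summarize(counts)
for name, value in summary.items():
    if name == "total":
        print(f"{name}: {value}")
    else:
        print(f"{name}: {value:.1%}")
```

The last ratio is the one that hurts: it only counts the legitimate comments I rescued by hand, so it is the false positive rate as seen from the moderation queue.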
In other words, the filter works very well at marking spam, so well in fact that it also marks non-spammy comments as spam, which is a shame because it breaks the interactivity of the site. Basically, if you post a legitimate comment, you have about one chance in three of not seeing it published, which leads people to try submitting it three times, which actually gets them blocked even more.
And this is only the spam module filters: the CAPTCHA module tells me it has blocked 26,535 attempts (over 26 thousand), which is amazing.
So overall, the solutions I put in place a year ago work, somehow, but are too aggressive. I have therefore bumped the spam threshold from 65% to 80%; hopefully that will reduce the false positives.
I have also updated the Honeypot module: there have been many releases since 1.5, the issue I reported has been fixed, and it is now in use on Drupal.org after some tweaking by our venerable webmaster killes.
If this fails, I will look again at other solutions, like the blogspam plugin that relies on the community-run blogspam.net site which powers, amongst other things, Debian Administration and Ikiwiki sites. Back then, I was also considering the Bad Behavior and Block Anonymous links modules.
(Note to self: I have also removed around 150 comments sitting in the spam module moderation queue, which the module didn't seem to remove as it should have.)