<?xml version="1.0"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
>

<channel>
<title>Decaflon Thread: Writing An Effective Spam Filter</title>
<link>http://decaflon.com/notes/</link>
<description>Decaflon Thread: Writing An Effective Spam Filter</description>
<language>en</language>
<pubDate>Fri, 09 Jan 2009 05:12:23 +0000</pubDate>

<item>
<title>Writing An Effective Spam Filter</title>
<link>http://decaflon.com/programming/notes/14024/p/1/#response-115593</link>
<pubDate>Tue, 06 May 2008 20:39:02</pubDate>
<dc:creator>corenominal</dc:creator>
<guid isPermaLink="false">115593</guid>
<description>&lt;p&gt;@Ozone42: A quote from the Wikipedia article you linked to: &lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Today, Sisyphean can be used as an adjective meaning that an activity is unending and/or repetitive. It could also be used to refer to tasks that are pointless and unrewarding.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;I agree that this could become an unending activity, but I am not sure about the &quot;pointless and unrewarding&quot; part. Personally I find this type of task thoroughly interesting and rewarding. I have really enjoyed coding up my spamsnake and I am looking forward to continued tinkering :)&lt;/p&gt;
&lt;p&gt;I also wonder about some parts of the points system [&lt;em&gt;discussed in the linked article&lt;/em&gt;]. Having read the entire post and the comments, I am pretty sure that some of the flags/rules used are compound, so while they may appear odd on their own, they probably work well in conjunction with other rules.
&lt;/p&gt;</description>
</item>
<item>
<title>Writing An Effective Spam Filter</title>
<link>http://decaflon.com/programming/notes/14024/p/1/#response-115589</link>
<pubDate>Tue, 06 May 2008 19:34:59</pubDate>
<dc:creator>Ozone42</dc:creator>
<guid isPermaLink="false">115589</guid>
<description>&lt;p&gt;&lt;a href='http://en.wikipedia.org/wiki/Sisyphus'&gt;Sisyphus comes to mind&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The point system you linked to isn't bad, but I wonder about the -10 points for &quot;Interesting.&quot;  Perhaps that would be fair if it made up the majority of the comment.&lt;/p&gt;
&lt;p&gt;I think the most important thing to keep in mind is not to make it hard on the real people commenting.  If you catch 90% of the spam and it's still easy to leave a comment that doesn't get flagged, I think you've succeeded.  If you catch 100% of the spam over a week, but have 2 false positives, I think that's a failure.  Then again, it really depends on the level of traffic and spam we're talking about here.
&lt;/p&gt;</description>
</item>
<item>
<title>Writing An Effective Spam Filter</title>
<link>http://decaflon.com/programming/notes/14024/p/1/#response-115587</link>
<pubDate>Tue, 06 May 2008 19:07:07</pubDate>
<dc:creator>corenominal</dc:creator>
<guid isPermaLink="false">115587</guid>
<description>&lt;p&gt;&lt;em&gt;This is my first &lt;a href=&quot;http://chawlk.com&quot;&gt;Chawlk&lt;/a&gt; note. I only registered as a new Chawlk user this morning, and to be honest, I was not sure that I would be overly interested in the site/service; however, there seems to be a good mix of users and content on the site and it occurred to me that the &lt;a href=&quot;http://chawlk.com/notes/&quot;&gt;Chawk notes&lt;/a&gt; service might be a good place to post Dear Lazy Web type posts?! So here goes...&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Dear Lazy Web&lt;/p&gt;
&lt;p&gt;I am currently writing a new spam filter for the user comments system on my website [&lt;em&gt;&lt;a href=&quot;http://crunchbang.org&quot;&gt;crunchbang.org&lt;/a&gt;&lt;/em&gt;]. I am following the &lt;a href=&quot;http://snook.ca/archives/other/effective_blog_comment_spam_blocker/&quot;&gt;same principles as described by Mr Snook&lt;/a&gt;. Do you have any experience of writing/creating something similar, and if so, do you have any tips for effectively separating the spam from the ham?&lt;/p&gt;
&lt;p&gt;Best Regards,&lt;/p&gt;
&lt;p&gt;Philip
&lt;/p&gt;</description>
</item>

</channel>
</rss>

