Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S266412AbUAVStN (ORCPT ); Thu, 22 Jan 2004 13:49:13 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S266413AbUAVStN (ORCPT ); Thu, 22 Jan 2004 13:49:13 -0500
Received: from wsip-68-99-153-203.ri.ri.cox.net ([68.99.153.203]:31680 "EHLO blue-labs.org")
	by vger.kernel.org with ESMTP id S266412AbUAVStJ (ORCPT );
	Thu, 22 Jan 2004 13:49:09 -0500
Message-ID: <40101B1E.3030908@blue-labs.org>
Date: Thu, 22 Jan 2004 13:49:02 -0500
From: David Ford
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7a) Gecko/20040121
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: David Lang
CC: Jes Sorensen , Zan Lynx , Andreas Jellinghaus , Linux Kernel Mailing List
Subject: Re: [OT] Confirmation Spam Blocking was: List 'linux-dvb' closed to public posts
References: <20040121194315.GE9327@redhat.com>
	<1074717499.18964.9.camel@localhost.localdomain>
	<20040121211550.GK9327@redhat.com>
	<20040121213027.GN23765@srv-lnx2600.matchmail.com>
	<1074731162.25704.10.camel@localhost.localdomain>
	<401000C1.9010901@blue-labs.org>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1731
Lines: 45

I've been amusing myself once or twice a week by studying some of these emails. Because they use common words, just like your email below, the Bayesian score comes out far too low (earning a negative point value in SA).

The problem is that "properly trained" is too fluid. It'd be far more achievable if I only talked geek, or only talked automotive, or only talked medical. However, my "vocabulary" is far too varied to train a Bayesian filter that the use of medical terms, computer terms, or any given topic is taboo. It cuts the gray area far too close to the middle of the road and thus makes marking the email as probable spam useless.
All I'm doing now is wasting CPU, because in the end I'm doing the job of dealing with the spam myself.

Yes, I did see this. I'm not that spiteful, and I actively pay attention to my queue when having this type of correspondence.

David

David Lang wrote:

>On Thu, 22 Jan 2004, David Ford wrote:
>
>>Considering that Bayesian filters are useless against the new spam that
>>is proliferating these days, that's laughable. Spam now comes with a
>>good 5-10K of random dictionary words.
>>
>so we need to extend the Bayesian filters to deal with multi-word combos,
>how much legit mail has those dictionary words in it? properly trained
>their presence should help identify the spam.
>
>not that you will ever see this (other than through the list) as I won't
>respond to your confirmation message.
>
>David Lang
>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/