Great Circle Associates List-Managers
(October 2002)
 


Subject: Re: ISPs Wrongly Block 1 in 8 Messages for Spam
From: "Michael C. Berch" <mcb @ postmodern . com>
Date: Thu, 31 Oct 2002 14:57:00 -0800
To: List Managers List <list-managers @ greatcircle . com>
In-reply-to: <200210311444.g9VEiWU31167@mail.rev.net>

I am a user, tester, and occasional developer of SpamAssassin.  There's 
a lot of semi-accurate information about SA floating around the Net.

First of all, SA does not use a single, specific, or "arbitrary" method
of identifying spam. Instead, it is an open platform with a number of
different techniques plugged into it (and it is extensible if you want
to write or add your own). SA techniques include:

* Internal pattern-matching on header & body parts
* Second-order (syntactic) analysis of patterns
* Analysis of embedded code (JavaScript, HTML, etc.)
* Automatic (feedback-based) whitelist and blacklist processing
* Use of external blocking lists like MAPS RBL, DUL, Osirusoft, ORDB.org,
SpamCop, RFCI, et al.
* Use of Vipul's Razor (known spam database)
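
To make the "open platform" idea concrete, here is a minimal Python
sketch of independent tests plugged into one registry. It is not SA's
actual code (SA itself is written in Perl); the rule names, the
patterns, and the rule() and run_rules() helpers are all hypothetical,
made up for illustration only.

```python
import re
from email import message_from_string

# Hypothetical registry: each entry is (rule name, test function).
RULES = []

def rule(name):
    """Decorator that plugs a test function into the registry."""
    def register(func):
        RULES.append((name, func))
        return func
    return register

@rule("SUBJECT_ALL_CAPS")
def subject_all_caps(msg):
    # Header pattern test: a longish subject written entirely in capitals.
    letters = [c for c in msg.get("Subject", "") if c.isalpha()]
    return len(letters) >= 8 and all(c.isupper() for c in letters)

@rule("BODY_CLICK_HERE")
def body_click_here(msg):
    # Body pattern test (assumes a simple, non-multipart message).
    return re.search(r"click\s+here", msg.get_payload(), re.IGNORECASE) is not None

@rule("HTML_HAS_SCRIPT")
def html_has_script(msg):
    # Crude embedded-code test: any <script> tag in the body.
    return "<script" in msg.get_payload().lower()

def run_rules(raw_message):
    """Return the names of all registered rules that fire on the message."""
    msg = message_from_string(raw_message)
    return [name for name, test in RULES if test(msg)]
```

Adding a new technique in this sketch means writing one more decorated
function; nothing else has to change.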

Future plans include hooks for Bayesian Filtering.

The rulesets are repeatedly run against a set of spam, nonspam, and
mixed message corpora, and a genetic algorithm uses the results to
determine the scores.  The scores are absolutely not "arbitrary" in the
sense of a person deciding that a particular word or phrase is "spam"
or not.
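
As a rough illustration of fitting scores to a corpus rather than
picking them by hand, here is a toy genetic algorithm in Python. It is
a sketch under stated assumptions, not SA's real tuning code: the rule
names, the six-message corpus, the 5.0 threshold, and the population
and mutation settings are all invented for the example.

```python
import random

RULE_NAMES = ["SUBJECT_ALL_CAPS", "BODY_CLICK_HERE", "HTML_HAS_SCRIPT", "IN_BLOCKLIST"]
THRESHOLD = 5.0  # assumed cutoff: total score >= THRESHOLD means "spam"

# Toy hand-labeled corpus: (rules that fired, is_spam). Invented data.
CORPUS = [
    ({"SUBJECT_ALL_CAPS", "BODY_CLICK_HERE"}, True),
    ({"HTML_HAS_SCRIPT", "IN_BLOCKLIST"}, True),
    ({"BODY_CLICK_HERE", "IN_BLOCKLIST"}, True),
    ({"SUBJECT_ALL_CAPS"}, False),
    ({"BODY_CLICK_HERE"}, False),
    (set(), False),
]

def fitness(scores, corpus):
    """Count messages classified correctly with these per-rule scores."""
    table = dict(zip(RULE_NAMES, scores))
    correct = 0
    for hits, is_spam in corpus:
        total = sum(table[name] for name in hits)
        correct += (total >= THRESHOLD) == is_spam
    return correct

def evolve_scores(corpus, generations=60, pop_size=30, seed=0):
    """Evolve one score per rule; returns the fittest score list found."""
    rng = random.Random(seed)
    n = len(RULE_NAMES)
    population = [[rng.uniform(0.0, 5.0) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda s: fitness(s, corpus), reverse=True)
        parents = population[: pop_size // 2]   # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n)                # mutate one score slightly
            child[i] = max(0.0, child[i] + rng.gauss(0.0, 0.5))
            children.append(child)
        population = parents + children
    return max(population, key=lambda s: fitness(s, corpus))
```

The point of the sketch is only that the scores come out of measured
performance on labeled mail, with no person assigning a weight to any
particular word or phrase.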

The reason that SA works so well -- and I believe that it's the best at
what it does -- is that there is no one "best" way to identify spam.
There are multiple techniques with varying degrees of success, and if
you combine them all, and allow a self-correcting feedback technique to
determine the score (likelihood of a message being spam), you get a very
high degree of success.  On top of that, any user can override various
rules and scores to meet his/her individual needs.
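
The combination step and the per-user overrides can be sketched in a
few lines of Python. Again this is an illustration and not SA's
implementation: the rule names, the default scores, the 5.0 threshold,
and the classify() helper are hypothetical.

```python
# Hypothetical site-wide defaults; real scores would come from tuning.
DEFAULT_SCORES = {"SUBJECT_ALL_CAPS": 1.5, "BODY_CLICK_HERE": 2.0, "IN_BLOCKLIST": 3.0}
DEFAULT_THRESHOLD = 5.0

def classify(hits, user_scores=None, threshold=DEFAULT_THRESHOLD):
    """Sum the scores of the rules that fired; user scores override defaults."""
    scores = {**DEFAULT_SCORES, **(user_scores or {})}
    total = sum(scores.get(name, 0.0) for name in hits)
    return total, total >= threshold

hits = ["BODY_CLICK_HERE", "IN_BLOCKLIST"]
print(classify(hits))                                      # site defaults
print(classify(hits, user_scores={"IN_BLOCKLIST": 0.0}))   # user disables one rule
```

No single rule decides the outcome here; a user who distrusts one test
can zero its score or move the threshold without touching the others.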

-- 
Michael C. Berch
mcb@postmodern.com





