SpamAssassin is a mail server application used to identify and tag spam. It uses a series of rules to analyze each piece of incoming email and then assign a score to that mail based on its characteristics. The higher the score assigned, the more likely that mail is spam. Once a certain threshold is reached for a given piece of mail, SpamAssassin tags the mail as being probably spam.
SpamAssassin handles spam tagging by adding several headers to each piece of mail identified as spam, and by prepending a special identifying string to the subject line of each such message. The headers and subject line can then be used, if you wish, by your mail client to sort probable spam for special handling.
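The sorting step a mail client performs on those headers can be sketched in a few lines of Python. This is only an illustration using the standard `email` module: it assumes the tagged message carries an `X-Spam-Flag: YES` header, and the folder names `Junk` and `Inbox` and both function names are hypothetical.

```python
from email import message_from_string


def is_tagged_spam(raw_message: str) -> bool:
    """Return True if the message carries an X-Spam-Flag header set to YES."""
    msg = message_from_string(raw_message)
    # Header lookup is case-insensitive; a missing header yields the default "".
    flag = msg.get("X-Spam-Flag", "")
    return flag.strip().upper() == "YES"


def choose_folder(raw_message: str) -> str:
    """Pick a destination folder (hypothetical names) for one raw message."""
    return "Junk" if is_tagged_spam(raw_message) else "Inbox"


# Two made-up messages: one tagged by the filter, one untouched.
tagged = "X-Spam-Flag: YES\nSubject: Cheap offer\n\nBuy now\n"
clean = "Subject: Meeting notes\n\nSee attached.\n"

print(choose_folder(tagged))
print(choose_folder(clean))
```

Most mail clients let you build the same rule through their filter settings, so a script like this is only needed when you process a mailbox yourself.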
SpamAssassin is a powerful tool, but it judges mail by its contents, so it is not infallible. It will probably identify and mark 80 to 95% of spam correctly. However, some legitimate mail is likely to be tagged as probable spam, and some spam is likely to slide through untagged because it is written to look like regular email.
To find out whether a particular message has been tagged as spam, look at the full header of the email. Most email programs hide most of the header by default, so you must display it first. Here are the instructions for displaying the headers in some of the more popular programs:
Outlook Express: Right-click on the mail to get a drop-down menu. Select "Properties." Select the "Details" tab and the full header appears in the window. You can then click on the "Message Source" button to see the whole message source in a new window.
Mac OS X Mail: Select "View > Message > Raw Source" and you will see the whole message source right in the message field.
Eudora: With the mail open in Eudora, click on the "BlahBlahBlah" (sic) button in the toolbar.
Netscape Mail (3.0): Open the "Options" menu and select "Show All Headers."
Netscape Mail (4.0+): Open the "View" menu and select "Headers."
In the full header, you will see "X-Spam-Status: Yes" (or "No"), the number of spam points ("hits") the message received, the number of points required for it to be tagged as spam, and the names of the rules it matched. Something like this:
X-Spam-Status: Yes, hits= 5.9 required= 4.0 tests= HTML_FONT_FACE_BAD, HTML_IMAGE_RATIO_02, HTML_MESSAGE, IP_WHITELIST,
MIME_BASE64_TEXT, TVD_SPACE_RATIO autolearn= disabled version= 3.002005
X-Spam-Flag: YES
X-Spam-Level: *****
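If you want to read these values in a script rather than by eye, the X-Spam-Status value can be split into its parts. The following is a minimal sketch, not SpamAssassin's own parser: the function name `parse_spam_status` is hypothetical, and it assumes the layout shown above (a Yes/No verdict followed by `key=value` pairs, possibly with spaces after the `=`). Some installations write `score=` instead of `hits=`, so the sketch accepts both.

```python
import re

NUMBER = r"(-?\d+(?:\.\d+)?)"


def parse_spam_status(value: str) -> dict:
    """Split an X-Spam-Status header value into verdict, scores and rule names."""
    # Unfold continuation lines and collapse runs of whitespace.
    text = " ".join(value.split())
    verdict, _, rest = text.partition(",")

    hits = re.search(r"\b(?:hits|score)=\s*" + NUMBER, rest)
    required = re.search(r"\brequired=\s*" + NUMBER, rest)
    # The rule list runs until the next "key=" pair or the end of the value.
    tests = re.search(r"\btests=\s*(.*?)(?=\s+\w+=|$)", rest)

    names = tests.group(1).split(",") if tests else []
    return {
        "is_spam": verdict.strip().lower() == "yes",
        "hits": float(hits.group(1)) if hits else None,
        "required": float(required.group(1)) if required else None,
        "tests": [name.strip() for name in names if name.strip()],
    }


# The header value from the example above, including its line break.
sample = (
    "Yes, hits= 5.9 required= 4.0 tests= HTML_FONT_FACE_BAD, "
    "HTML_IMAGE_RATIO_02, HTML_MESSAGE, IP_WHITELIST,\n"
    " MIME_BASE64_TEXT, TVD_SPACE_RATIO autolearn= disabled version= 3.002005"
)

status = parse_spam_status(sample)
print(status["is_spam"], status["hits"], status["required"])
print(status["tests"])
```

Here the message received 5.9 points against a required 4.0, which is why the verdict is "Yes".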
This is an explanation of some of the rules:
| Test name | Score | Area | Description |
| SUBJ_ILLEGAL_CHARS | 0.10 | Subject | Subject has too many raw illegal characters |
| HTML_IMAGE_ONLY_20 | 0.64 | Body | HTML: images with 1600-2000 bytes of words |
| HTML_MESSAGE | 0.00 | Body | HTML included in message |
| MIME_HTML_ONLY | 0.00 | Body | Message only has text/html MIME parts |
| HTML_MIME_NO_HTML_TAG | 0.51 | Body | HTML-only message, but there is no HTML tag |
| HTML_SHORT_LINK_IMG_3 | 0.52 | Body | HTML is very short with a linked image |
| PLING_PLING | 0.46 | Subject | Subject has lots of exclamation marks |
| MANY_EXCLAMATIONS | 0.00 | Subject | Subject has many exclamations |
| DATE_IN_PAST_06_12 | 0.75 | Date | Date is 6 to 12 hours before Received: date |
| HTML_TAG_EXIST_TBODY | 0.13 | Body | HTML has "tbody" tag |
| HTML_IMAGE_RATIO_02 | 0.19 | Body | HTML has a low ratio of text to image area |
| BAYES_60 | 3.5 | Body | Bayesian spam probability is 60 to 80% |
| MIME_BASE64_TEXT | 1.7 | Body | Message text disguised using base64 encoding |
You can see the full list here.
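To make the scoring idea concrete, here is a small sketch of how rule scores add up to a verdict. It is an illustration only, not SpamAssassin's actual code: the scores are copied from a few rows of the table, the threshold of 4.0 is taken from the "required" value in the sample header, and the names `RULE_SCORES`, `REQUIRED` and `score_message` are hypothetical. Real installations can assign different scores and thresholds.

```python
# Scores copied from the table above; rules not listed count as 0 in this sketch.
RULE_SCORES = {
    "SUBJ_ILLEGAL_CHARS": 0.10,
    "HTML_MESSAGE": 0.00,
    "PLING_PLING": 0.46,
    "HTML_IMAGE_RATIO_02": 0.19,
    "BAYES_60": 3.5,
    "MIME_BASE64_TEXT": 1.7,
}

# Threshold assumed from the "required= 4.0" value in the sample header.
REQUIRED = 4.0


def score_message(matched_rules, required=REQUIRED):
    """Sum the scores of the matched rules and compare the total to the threshold."""
    total = round(sum(RULE_SCORES.get(rule, 0.0) for rule in matched_rules), 2)
    return total, total >= required


# A message that looks like base64-disguised HTML with a high Bayesian probability.
suspicious = ["BAYES_60", "MIME_BASE64_TEXT", "HTML_MESSAGE"]
# A message that is merely a bit loud in its subject line.
noisy = ["PLING_PLING", "SUBJ_ILLEGAL_CHARS", "HTML_MESSAGE"]

print(score_message(suspicious))
print(score_message(noisy))
```

The first message collects enough points from two heavy rules to cross the threshold, while the second matches only low-scoring rules and stays well below it.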