EmailDiscussions.com  

Old 28 Dec 2012, 08:06 PM   #1
andrearp
Member
 
Join Date: Aug 2005
Posts: 35
Where's my Spam Bayes DB?

Hi,

I've been using Fastmail for years and I'm completely satisfied with the service.

However, something happened to my Spam Bayes database: I'm quite sure I **WAS** "using" the "personal" Db as it had a lot of "learned" messages, and there's a threshold of messages to begin using the "personal" Db instead of the global. This is an automatic process.

Something strange happened, I think, a couple of weeks ago. While quickly reviewing the Spam folder, I noticed a few messages down there... that shouldn't have been there as they all belonged to folders where similar messages had been filed for years, and those folders had spam learning enabled. Then for a few days I noticed some spam messages not caught in the spam filters.

NOW I was playing around with the settings and I see that my Bayes Db contains just a few spam samples, while many more non-spam samples have been learned from the non-spam learning folders.

I didn't clear my Db using the clear button, I'm quite sure... so ... has anybody seen a similar behaviour?

Old 28 Dec 2012, 09:32 PM   #2
janusz
The "e" in e-mail
 
Join Date: Feb 2006
Location: EU
Posts: 4,944
It seems that something similar happened in the past. I'd submit a support ticket; maybe your database can be restored.
Old 29 Dec 2012, 04:19 AM   #3
n5bb
Intergalactic Postmaster
 
Join Date: May 2004
Location: Irving, Texas
Posts: 8,927
Searching for user or global BAYES tags in messages

It's easy to determine if Bayes filtering changed over time. When you look at the full headers of a message (such as by reading the raw message contents), you will see the following header near the top (usually about the 6th header): X-Spam-hits: (spam indicator list returned by SpamAssassin)
There are usually two entries in the spam hit list which start with BAYES:
  • BAYES_XX (where XX is a two-digit number): This is followed with a spam value which is currently as low as -1.9 for BAYES_00 (low likelihood of spam) and up to 3.5 for BAYES_99 (high likelihood of spam).
  • BAYES_USED (followed by user, global, or global_overload): Indicates whether the BAYES_XX was derived from your enabled user database or the global database. If the tag is global_overload, the global database was used due to transient system overload.
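As a rough illustration of the two BAYES entries described above, here is a minimal Python sketch that pulls them out of an X-Spam-hits value copied from a message's raw headers. It assumes the entries are comma-separated and that each one is a tag followed by a single value; that layout, the function name parse_bayes, and the sample string are assumptions for illustration, not something confirmed by Fastmail.

```python
import re


def parse_bayes(spam_hits):
    """Return (bayes_tag, bayes_score, database) from an X-Spam-hits value.

    Assumes comma-separated "NAME value" entries (an assumption about the
    header layout). Any item that is not found is returned as None.
    """
    tag = score = database = None
    for entry in spam_hits.split(","):
        parts = entry.split()
        if len(parts) != 2:
            continue  # skip anything that is not a "NAME value" pair
        name, value = parts
        if name == "BAYES_USED":
            database = value  # user, global, or global_overload
        elif re.fullmatch(r"BAYES_\d\d", name):
            tag, score = name, float(value)
    return tag, score, database


# Made-up sample value, for illustration only.
sample = "BAYES_00 -1.9, BAYES_USED user, SPF_PASS -0.001"
print(parse_bayes(sample))  # ('BAYES_00', -1.9, 'user')
```

If the real header uses a different separator, only the split(",") line should need to change.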
You can search for the user or global database tag in the new AJAX interface as follows:
  • Log into the normal new interface.
  • In the Search mail box in the upper right enter the following (you can cut/paste if you wish). This will find all messages received after Aug 1 2012 which used the global database. You can change the date as needed. The case is important (AND must be capitalized as shown).
    Code:
    after:2012.08.01 AND header:"X-Spam-hits: global"
  • After the search results are shown (for the current folder), you can change the search folder or search in all mail folders.
  • You can search for all messages (of any date) which used the user database with the following search.
    Code:
    header:"X-Spam-hits: user"
  • Please note that messages from people in your online Fastmail address book will contain the header X-Spam-known-sender: yes and this forces the X-Spam-score header value to 0.0 (not spam). So the user or global Bayes database is only used if the sender address is not in your online address book.
  • The searches shown above also work in the classic interface.
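If you want to repeat these searches for several dates, the search strings can be generated rather than typed. The Python sketch below only builds the text to paste into the Search mail box, using the syntax shown above; the function name bayes_search is hypothetical and nothing here talks to Fastmail.

```python
from datetime import date
from typing import Optional


def bayes_search(database: str, after: Optional[date] = None) -> str:
    """Build a search string for messages scored with the given Bayes database."""
    if database not in ("user", "global"):
        raise ValueError("database must be 'user' or 'global'")
    query = f'header:"X-Spam-hits: {database}"'
    if after is not None:
        # The date is written as YYYY.MM.DD and AND must be capitalized.
        query = f"after:{after:%Y.%m.%d} AND {query}"
    return query


print(bayes_search("global", date(2012, 8, 1)))
print(bayes_search("user"))
```

The two calls print the same two search strings given in the Code blocks above.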
Bill

Last edited by n5bb : 30 Dec 2012 at 12:07 AM. Reason: Corrected BAYES_USED tags and added global_overload
Old 29 Dec 2012, 04:31 AM   #4
bramhall
Essential Contributor
 
Join Date: Jun 2006
Posts: 369
Quote:
Originally Posted by n5bb View Post
It's easy to determine if Bayes filtering changed over time.
Fine, but why should it change over time? The only two reasons I can think of are (a) a Bayes database crash or (b) the user inadvertently resetting the database.
Old 29 Dec 2012, 05:02 AM   #5
n5bb
Intergalactic Postmaster
 
Join Date: May 2004
Location: Irving, Texas
Posts: 8,927
It has not changed for my account. If the original poster searches their headers as I have suggested, we will be working with a verified issue starting on a certain date, rather than guesses.
Old 29 Dec 2012, 03:03 PM   #6
andrearp
Member
 
Join Date: Aug 2005
Posts: 35
Thanks for the suggestion. I took one of the folders, and searched the strings.

The results are interesting.

Of the (around) 8000 messages, approx. 2000 have a "global" tag, more than 4000 have "user", and some 2000 are really old (received before June 2007) and have BAYES_00.

The strange thing is that I discovered I've been switched to "global" and then back to "user" more than once. Now I vaguely remember that a couple of times, when I went to the Spam folder, I noticed an increase in non-spam messages landing there; I selected them all and marked them as non-spam. I don't remember ever resetting the DB, but I've read about a couple of Bayes DB incidents.

It seems the last event happened this August, but I may not have noticed it or may not remember the right date (it seemed to me that it happened more recently).

Consider that I CAREFULLY keep my address book updated, with 3300 contacts with multiple emails each, and contacts have a -30 spam score in the headers. So when I'm switched to "global", I don't see many legitimate messages going to the spam folder; I usually notice a few more "localized" spam messages getting through, probably because the global filter has not collected many "Italian sourced" spam samples.

So when this happened it usually caused just a few badly managed newsletters and some device alerts (ouch! I added them to the address book!) to end up in the Spam folder.

However, this is a bit worrisome!
Old 30 Dec 2012, 12:11 AM   #7
n5bb
Intergalactic Postmaster
 
Join Date: May 2004
Location: Irving, Texas
Posts: 8,927
To my knowledge, the operating system upgrade about two years ago (Dec 2010) mentioned in the link from janusz in this thread was the last large system user Bayes database failure. Other things you may see include:
  • Large messages (>1.1 M or so) are not processed by the SpamAssassin system, so the X-Spam-score and X-Spam-hits headers are not available. Your Sieve rules script should be executed for large messages, but any rules requiring the X-Spam-score header won't work.
  • Messages you have sent (normally filed into the Sent Items folder, which is renamed Sent in the new interface) or moved to your account via an IMAP client will not have X-Spam-score or X-Spam-hits headers added by Fastmail.
  • If the spam processing system becomes overloaded, the system may use the global Bayes database rather than the user database. In this case the X-Spam-hits BAYES_USED tag will change to global_overload.
My earlier post had correct search syntax, but my description of the BAYES_USED tags was incorrect. I have now corrected my earlier post and added global_overload.
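To tie these cases together, here is a hedged Python sketch of how one might label a single message from its headers: address book sender, not scanned at all (large, sent, or moved in over IMAP), or scored with a particular Bayes database. The function name bayes_source, the label strings, the plain dict of headers with exact-case keys, and the comma-separated layout of X-Spam-hits are all assumptions made for illustration.

```python
def bayes_source(headers):
    """Label how a message was handled, based on its X-Spam-* headers.

    headers: dict mapping header name to value (exact-case keys assumed).
    """
    # Senders in the online address book bypass the Bayes result.
    if headers.get("X-Spam-known-sender", "").strip().lower() == "yes":
        return "known-sender"
    hits = headers.get("X-Spam-hits")
    if hits is None:
        # Large messages, sent mail, and IMAP-moved mail carry no spam headers.
        return "not-scanned"
    for entry in hits.split(","):
        parts = entry.split()
        if len(parts) == 2 and parts[0] == "BAYES_USED":
            return parts[1]  # user, global, or global_overload
    return "unknown"


# Made-up header values, for illustration only.
print(bayes_source({"X-Spam-hits": "BAYES_00 -1.9, BAYES_USED global_overload"}))
print(bayes_source({}))
```

The first call prints global_overload and the second prints not-scanned.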

Bill
Old 30 Dec 2012, 03:34 AM   #8
andrearp
Member
 
Join Date: Aug 2005
Posts: 35
It doesn't seem to be a message size or system overload issue. Messages are "marked" as "global" or "user" filtered in very large blocks, just as if global had been used from date a to date b, then user from date b to date c, and so on.
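For anyone who wants to check for the same block pattern in their own mail, a small Python sketch follows. It assumes you have already collected a (received date, database) pair for each message, for example by exporting headers; the function name database_runs and the sample data are made up for illustration.

```python
from datetime import date
from itertools import groupby


def database_runs(messages):
    """Group messages into consecutive date runs that used the same database.

    messages: iterable of (received_date, database) pairs, in any order.
    Returns a list of (database, first_date, last_date, message_count).
    """
    ordered = sorted(messages, key=lambda m: m[0])
    runs = []
    # groupby merges only adjacent items, so each switch starts a new run.
    for database, group in groupby(ordered, key=lambda m: m[1]):
        dates = [received for received, _ in group]
        runs.append((database, dates[0], dates[-1], len(dates)))
    return runs


# Made-up sample data, for illustration only.
sample = [
    (date(2012, 8, 3), "global"),
    (date(2012, 7, 1), "user"),
    (date(2012, 7, 9), "user"),
    (date(2012, 8, 20), "global"),
    (date(2012, 9, 2), "user"),
]
for run in database_runs(sample):
    print(run)
```

A few long runs would point to a database switch on specific dates, while many short alternating runs would look more like transient overload.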

However, I DID feel something was wrong when I found messages (just a few, to tell the truth) in the wrong place. But again, since the address book rule always applies, this was not a big problem for me.

Anyway, global or user based, the filtering is excellent. Now that I'm in global mode, I do see a few messages slip through the filter and get to my folders, but I've started training the db again.
Old 30 Dec 2012, 07:48 AM   #9
n5bb
Intergalactic Postmaster
 
Join Date: May 2004
Location: Irving, Texas
Posts: 8,927
You can train your user Bayes database on existing messages. So it should be easy to get the non-spam messages. If you don't empty your Spam (Junk Mail) very often, you might have enough there also. I have my Junk Mail folder set to auto-delete after 31 days, and currently 200 spam messages have accumulated over that interval. I have my spam filter set to auto-discard at a spam level greater than 5, so if I remove that setting it will accumulate spam faster.

Bill
Old 30 Dec 2012, 09:37 AM   #10
ChinaLamb
The "e" in e-mail
 
Join Date: Dec 2004
Location: a virtually impossible but finitely improbable position
Posts: 2,320
Quote:
Originally Posted by n5bb View Post
You can train your user Bayes database on existing messages. So it should be easy to get the non-spam messages. If you don't empty your Spam (Junk Mail) very often, you might have enough there also. I have my Junk Mail folder set to auto-delete after 31 days, and currently 200 spam messages have accumulated over that interval. I have my spam filter set to auto-discard at a spam level greater than 5, so if I remove that setting it will accumulate spam faster.

Bill
I have mine set a bit higher for auto discard. Had some legit messages come through at 9 and 10. Turns out people sending email from other countries easily get flagged as spam (their ip address perhaps?). Other languages also seem to get flagged easily. Turned off auto discard for a while and saw some messages come through with a spam score of over 100! That was interesting.
Old 30 Dec 2012, 09:50 AM   #11
n5bb
Intergalactic Postmaster
 
Join Date: May 2004
Location: Irving, Texas
Posts: 8,927
I only have my discard setting set that low because I own my own domain and also use a hobby forwarding service, both of which get a large amount of spam. I was tired of my Junk Mail box getting filled with several messages an hour. When you realize that the vast majority of spam connections are refused by the Fastmail incoming server without being accepted, it's amazing how much spam is being sent.
All times are GMT +9. The time now is 02:21 AM.

 

Copyright EmailDiscussions.com 1998-2022. All Rights Reserved. Privacy Policy