EmailDiscussions.com - View Single Post - Looking for opinions on discarding "spam"

jhollington · 9 Nov 2018, 09:43 PM

Quote:

Originally Posted by xyzzy

Ok, now I'm a bit confused and need clarification. Folder settings have a Spam Learning and Auto-purge. So according to what you are saying Spam Learning should or should not be set to get my filters to learn what is spam in the spam folder (and again, when)? I would think you would want to set Auto-purge to clean all the stuff out after a specified time. Are my filters learning at purge time then too?

No problem. Let me see if I can break it down in point form...

Spam Learning: When a folder — any folder — is set to "Learn as Spam," then the contents of that folder are scanned daily and added to the Bayes database as examples of spam messages. There used to be a way in the older UI to see the exact time when each folder was last scanned, but I don't think that's there anymore.

Auto-purge: This feature simply removes messages automatically after a certain interval, either all messages or only unpinned (unflagged) ones. However, this is where the Spam folder is a special case: not because of anything magical about auto-purge, but simply because a message is added to the Bayes database at the moment it's deleted from the Spam folder, whether manually or automatically.

Quote:

It's the timing that's got me confused. The system spam filters flag stuff as spam and put it into the spam folder on the way in. So how does doing anything with my settings affect or correct possible mistakes other than flagging the mistakes as not spam? The more I talk about this the more I am getting confused. Didn't think it was so complicated.

It's all your fault!

Haha, let me try and explain the process in hopes that it will provide some clarity (and not more confusion)...

There are two stages to spam handling in Fastmail — spam detection and spam training.

Spam detection is simply the process by which Fastmail identifies incoming messages as spam. This is done using a number of factors, one of which is your personal Bayes database. Messages that are detected as spam by Fastmail are put into the Spam folder, but they are NOT added to the Bayes database right away. This is to prevent the Bayes database from getting cluttered with false positives in the event that Fastmail gets a message wrong, which of course it does from time to time, especially when you're first training it.

Spam training is the process by which Fastmail learns what is spam and what is not. There are three ways that you can teach Fastmail that a message is spam:

1. Manually mark a message as spam by clicking the "Report Spam" button. This moves it to the Spam folder and updates your personal Bayes spam database right away.
2. Move the message to a folder that has been set to "learn as spam." New messages in these folders are added to your personal Bayes database once per day (so you should leave messages in a "learn as spam" folder for at least 24 hours).
3. Delete a message from the Fastmail Spam folder, either manually or using auto-purge. These messages are added to your personal Bayes spam database right away.

The key point is that messages in the Spam folder are NOT automatically trained as spam until they're deleted. This gives you ample opportunity to check and clean up the spam folder without worrying about legitimate messages getting misidentified as spam and messing with future spam detection. Of course, you can set the Spam folder to "learn as spam" in the same way as any other folder, but Fastmail recommends against doing this, and suggests that only folders that you manually move email to should be set to learn spam.

In short, messages are only added to your personal Bayes spam database if they are manually reported as spam, left in a folder that's been set to "learn as spam" for at least 24 hours, or deleted from the system "Spam" folder.
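If it helps to see those rules as logic, here is a toy Python model of them. To be clear, this is only a sketch of the behaviour described in this post, not Fastmail's actual code: the `Mailbox` class, its method names, and the "Junk-Train" folder are all made up for illustration, and messages are represented by simple ID strings.

```python
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    """Toy model of the training rules in this post (hypothetical, not Fastmail's code)."""
    learn_as_spam_folders: set = field(default_factory=set)  # folders set to "learn as spam"
    folders: dict = field(default_factory=dict)              # folder name -> list of message ids
    bayes_spam: set = field(default_factory=set)             # message ids trained as spam

    def deliver(self, msg, detected_as_spam):
        # Detection only files the message; nothing is trained here.
        folder = "Spam" if detected_as_spam else "Inbox"
        self.folders.setdefault(folder, []).append(msg)

    def report_spam(self, msg):
        # "Report Spam" button: move to Spam and train right away.
        self._remove(msg)
        self.folders.setdefault("Spam", []).append(msg)
        self.bayes_spam.add(msg)

    def move(self, msg, folder):
        # Moving a message never trains by itself.
        self._remove(msg)
        self.folders.setdefault(folder, []).append(msg)

    def delete(self, msg):
        # Deleting from Spam (manually or via auto-purge) trains as spam.
        if self._remove(msg) == "Spam":
            self.bayes_spam.add(msg)

    def daily_scan(self):
        # Once a day, everything sitting in a "learn as spam" folder is trained.
        for name in self.learn_as_spam_folders:
            self.bayes_spam.update(self.folders.get(name, []))

    def _remove(self, msg):
        # Take the message out of whichever folder holds it; return that folder's name.
        for name, msgs in self.folders.items():
            if msg in msgs:
                msgs.remove(msg)
                return name
        return None


box = Mailbox(learn_as_spam_folders={"Junk-Train"})
box.deliver("m1", detected_as_spam=True)  # sits in Spam, untrained
box.delete("m1")                          # now it is trained as spam
```

Notice that a false positive rescued from Spam (moved to the Inbox first) and deleted later never touches `bayes_spam`, which is the whole point of delaying training until deletion.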

It's also important to understand that spam detection actually works with two lists: one for spam messages and one for not-spam ("ham") messages. This is why false positives are a problem. If you have your default Spam folder also being used to learn spam, then every message in there will be added to the spam list, even if it shouldn't be. Later marking a legitimate message as "not spam" doesn't remove it from the "spam" list, but simply adds it to the "ham" list, so you end up with two conflicting entries: one that says the message is spam and one that says it's not.
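A rough way to picture the conflict, assuming the two lists behave like the per-word counts that Bayesian filters typically keep (I don't know the exact internals, so treat the names and the simple ratio below as illustration only):

```python
from collections import Counter

# Hypothetical stand-ins for the "spam" and "ham" lists.
spam_tokens = Counter()
ham_tokens = Counter()


def learn(counter, text):
    # Record every word of the message in the given list.
    counter.update(text.lower().split())


def spam_ratio(token):
    # Fraction of sightings of this word that were in spam; None if never seen.
    s, h = spam_tokens[token], ham_tokens[token]
    if s + h == 0:
        return None
    return s / (s + h)


msg = "your invoice is attached"
learn(spam_tokens, msg)  # a false positive gets learned from the Spam folder
learn(ham_tokens, msg)   # clicking "Not spam" later only adds it to ham

print(spam_ratio("invoice"))  # 0.5
```

The "Not spam" click did not undo the earlier training; it just balanced it out, so the word "invoice" ends up telling the filter nothing either way.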

Note that if you're only using the Fastmail web interface, I can't see any reason to even concern yourself with the folder learning features. These are primarily for users of IMAP clients like Apple Mail and Thunderbird where there's no direct button to report spam to Fastmail. If you're using the web interface, the very best way to report a spam message is to simply click "Report Spam" and be done with it (and vice versa with the "Not spam" button for messages that are not spam).