EmailDiscussions.com  

Old 9 Nov 2018, 04:07 PM   #16
xyzzy
Essential Contributor
 
Join Date: May 2018
Posts: 474
Quote:
Originally Posted by jhollington View Post
Actually, while you're correct that messages are identified as spam at the point of delivery, it is the point of deletion at which spam messages are trained as spam (in terms of the Bayes database). This is to prevent false positives that would come from simply training based on the spam folder (although you can set your standard junk mail folder to train all messages that are in it as spam, this isn't the default setting, and in fact FastMail specifically recommends that you don't do this).
- - -
So unless you actually set your Junk Mail folder to be a source for learning spam, the Bayes database only gets updated when a message is specifically marked as spam or when it's deleted from this folder, either manually or as part of an auto-purge rule.
Ok, now I'm a bit confused and need clarification. Folder settings have a Spam Learning and Auto-purge. So according to what you are saying Spam Learning should or should not be set to get my filters to learn what is spam in the spam folder (and again, when)? I would think you would want to set Auto-purge to clean all the stuff out after a specified time. Are my filters learning at purge time then too?

These settings are in all folders. But it looks like the Spam folder (and subfolders under it) are treated differently in that you can flag their contents as "not spam" while in all other folders you can flag stuff as spam and it will move to the spam folder. Training seems clearer for non-spam folders but can you define the exact effects of the Spam Learning and Auto-purge for both spam and non-spam folders?

It's the timing that's got me confused. The system spam filters flag stuff as spam and put it into the spam folder on the way in. So how does doing anything with my settings help correct possible mistakes, other than flagging the mistakes as not spam? The more I talk about this the more I am getting confused. Didn't think it was so complicated. It's all your fault!
Old 9 Nov 2018, 10:13 PM   #17
jhollington
Essential Contributor
 
Join Date: Apr 2008
Posts: 371
Quote:
Originally Posted by NumberSix View Post
Holy moly! This possibly explains my recent dissatisfaction with the spam learning... I haven't emptied my Spam folder for quite a long time. Thanks for the explanation!
Heh, no problem. It's an easy trap to fall into, since most email clients/systems kind of work the opposite way. I also think that maybe once upon a time Fastmail actually did set the spam folder to auto-learn, but I can't remember for sure. I know that I personally had it set that way for a very long time, until I realized that I was doing my Bayes database a disservice.

Quote:
Originally Posted by n5bb View Post
You can set the properties for a folder so that unpinned messages in that folder are auto-purged (permanently deleted) after a chosen time interval. So you could set the auto-purge delay to 7 days and know that your personal Bayes database would be updated, giving you a week to check for false positives.
This is exactly what I do, since I primarily use an IMAP client. I avoid manually deleting anything from the spam folder, and just let Fastmail do its thing. After seven days, whatever is left in there gets learned as spam.
Old 9 Nov 2018, 10:43 PM   #18
jhollington
Essential Contributor
 
Join Date: Apr 2008
Posts: 371
Quote:
Originally Posted by xyzzy View Post
Ok, now I'm a bit confused and need clarification. Folder settings have a Spam Learning and Auto-purge. So according to what you are saying Spam Learning should or should not be set to get my filters to learn what is spam in the spam folder (and again, when)? I would think you would want to set Auto-purge to clean all the stuff out after a specified time. Are my filters learning at purge time then too?
No problem. Let me see if I can break it down in point form...

Spam Learning: When a folder — any folder — is set to "Learn as Spam," then the contents of that folder are scanned daily and added to the Bayes database as examples of spam messages. There used to be a way in the older UI to see the exact time when each folder was last scanned, but I don't think that's there anymore.

Auto-purge: This feature simply removes messages automatically after a certain interval: either all messages, or only unpinned (unflagged) messages. However, this is where the Spam folder is a special case, not because of anything specifically magical about auto-purge, but simply because it's at the time of deletion from the Spam folder (either manually or automatically) that a message is added to the Bayes database.

Quote:
It's the timing that's got me confused. The system spam filters flag stuff as spam and put it into the spam folder on the way in. So how does doing anything with my settings help correct possible mistakes, other than flagging the mistakes as not spam? The more I talk about this the more I am getting confused. Didn't think it was so complicated. It's all your fault!
Haha, let me try and explain the process in hopes that it will provide some clarity (and not more confusion)...

There are two stages to spam handling in Fastmail — spam detection and spam training.

Spam detection is simply the process by which Fastmail identifies incoming messages as spam. This is done using a number of factors, one of which is your personal Bayes database. Messages that are detected as spam by Fastmail are put into the Spam folder, but they are NOT added to the Bayes database right away. This is to prevent the Bayes database from getting cluttered with false positives in the event that Fastmail gets a message wrong, which of course it does from time to time, especially when you're first training it.

Spam training is the process by which Fastmail learns what is spam and what is not. There are three ways that you can teach Fastmail that a message is spam:
  1. Manually mark a message as spam by clicking the "Report Spam" button. This moves it to the spam folder and updates your personal Bayes spam database right away.
  2. Move the message to a folder that has been set to "learn as spam." New messages in these folders are added to your personal Bayes database once per day (so you should leave messages in a "learn as spam" folder for at least 24 hours).
  3. Delete a message from the Fastmail spam folder — either manually or using auto-purge. These messages are added to your personal Bayes spam database right away.
The key point is that messages in the Spam folder are NOT automatically trained as spam until they're deleted. This gives you ample opportunity to check and clean up the spam folder without worrying about legitimate messages getting misidentified as spam and messing with future spam detection. Of course, you can set the Spam folder to "learn as spam" in the same way as any other folder, but Fastmail recommends against doing this, and suggests that only folders that you manually move email to should be set to learn spam.

In short, messages are only added to your personal Bayes spam database if they are manually reported as spam, left in a folder that's been set to "learn as spam" for at least 24 hours, or deleted from the system "Spam" folder.
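
The three rules summarized above can be written down as a small decision function. This is only a sketch of the behavior as described in this post, not Fastmail's actual implementation; the `Event` and `Folder` names and fields are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Event(Enum):
    """Things that can happen to a message (hypothetical names)."""
    DELIVERED_TO_SPAM = "filed into Spam by the filter on arrival"
    REPORTED_SPAM = "user clicked Report Spam"
    DAILY_SCAN = "daily scan of the folder the message sits in"
    DELETED = "deleted manually or by auto-purge"


@dataclass
class Folder:
    name: str
    is_system_spam: bool = False  # the built-in Spam folder
    learn_as_spam: bool = False   # the per-folder "learn as spam" setting


def trains_as_spam(event: Event, folder: Folder) -> bool:
    """Return True if, per the rules in this post, the event adds the
    message to the personal Bayes spam list."""
    if event is Event.REPORTED_SPAM:
        return True                   # rule 1: trained right away
    if event is Event.DAILY_SCAN:
        return folder.learn_as_spam   # rule 2: only in learning folders
    if event is Event.DELETED:
        return folder.is_system_spam  # rule 3: deletion from Spam
    return False                      # delivery to Spam alone never trains
```

With a default `Folder("Spam", is_system_spam=True)`, delivery and the daily scan both return `False`, and only deletion returns `True`, which is the window you get for rescuing false positives.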

It's also important to understand that spam detection actually works with two lists: one list for spam messages and one for not-spam ("ham") messages. This is why false positives are a problem: if you have your default Spam folder also being used to learn spam, then every message in there will be added to the spam list, even if it shouldn't be. Later marking a legitimate message as "not spam" doesn't remove it from the "spam" list, but simply adds it to the "ham" list, so you end up with two conflicting entries, one that says that message is spam and one that says it's not.
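
To make the conflicting-entries problem concrete, here is a toy model of the two lists. It assumes a word-count store, which is how Bayes filters commonly work, but how Fastmail actually stores its data isn't stated anywhere in this thread; the `TwoListStore` class is purely illustrative.

```python
from collections import Counter


class TwoListStore:
    """Toy stand-in for a Bayes database: one word count per list."""

    def __init__(self) -> None:
        self.spam = Counter()
        self.ham = Counter()

    def learn_spam(self, text: str) -> None:
        self.spam.update(set(text.lower().split()))

    def learn_ham(self, text: str) -> None:
        # Adds to the ham list only; nothing is removed from the spam list.
        self.ham.update(set(text.lower().split()))


store = TwoListStore()
newsletter = "weekly gardening newsletter"
store.learn_spam(newsletter)  # false positive swept up by a learn-as-spam Spam folder
store.learn_ham(newsletter)   # user later clicks "Not spam"

# Words that now count as evidence on both sides.
conflicting = sorted(word for word in store.spam if word in store.ham)
print(conflicting)  # ['gardening', 'newsletter', 'weekly']
```

Had the Spam folder not been set to learn, only the `learn_ham` call would have happened and the spam list would have stayed clean.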

Note that if you're only using the Fastmail web interface, I can't see any reason to even concern yourself with the folder learning features. These are primarily for users of IMAP clients like Apple Mail and Thunderbird where there's no direct button to report spam to Fastmail. If you're using the web interface, the very best way to report a spam message is to simply click "Report spam" and be done with it (and vice-versa with the "Not spam" button for messages that are not spam).
Old 10 Nov 2018, 12:14 AM   #19
n5bb
Intergalactic Postmaster
 
Join Date: May 2004
Location: Irving, Texas
Posts: 8,917
Quote:
Originally Posted by jhollington View Post
...Note that if you're only using the Fastmail web interface, I can't see any reason to even concern yourself with the folder learning features. These are primarily for users of IMAP clients like Apple Mail and Thunderbird where there's no direct button to report spam to Fastmail. If you're using the web interface, the very best way to report a spam message is to simply click "Report spam" and be done with it (and vice-versa with the "Not spam" button for messages that are not spam).
That’s a great post! The only thing I might add concerning the personal Bayes system is that you should also add some ham (non-spam) to your personal Bayes database. Manually reporting false spam positives using the “Not spam” button (as described above) is critical, since you want to train the Bayes database correctly when a message is improperly classified as spam. But you also need other good messages for the Bayes database. For example, you need at least 200 non-spam messages reported before the personal Bayes filter is activated. So be sure that the Archive folder is set to train as non-spam. I also use some folders with filing rules which ensure that only ham reaches those folders.

The Bayes spam score is only one component of the spam score which is used to determine when to file messages into your Spam folder. It’s also possible for the address book whitelisting to be disabled if a message has been forwarded or for some other reason DMARC or other authentication measures fail. So it’s still possible for ham to appear in your Spam folder, especially if you reduce the threshold for the spam filter below the default value. So be sure to check your spam folder for obvious ham before discarding messages from it.
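
A rough sketch of how those pieces might fit together. The only number taken from this post is the 200-message minimum; the additive scoring, the function names, and all the scores and thresholds in the usage note are assumptions for illustration, not Fastmail's real formula.

```python
MIN_HAM_FOR_BAYES = 200  # minimum non-spam reports mentioned in this post


def bayes_active(ham_reported: int) -> bool:
    """The personal Bayes filter only kicks in once enough ham is reported."""
    return ham_reported >= MIN_HAM_FOR_BAYES


def files_to_spam(other_score: float, bayes_score: float,
                  ham_reported: int, threshold: float) -> bool:
    """Hypothetical: add the Bayes component to the other checks and
    compare the total with the user's spam threshold."""
    total = other_score
    if bayes_active(ham_reported):
        total += bayes_score
    return total >= threshold
```

For example, with made-up values `other_score=3.0` and `bayes_score=2.5`, a threshold of 5.0 files the message to Spam only if the Bayes filter is active, while lowering the threshold to 3.0 files it either way. That is why a lower threshold makes it more likely that ham lands in the Spam folder.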

Bill
Old 10 Nov 2018, 02:42 AM   #20
jhollington
Essential Contributor
 
Join Date: Apr 2008
Posts: 371
Quote:
Originally Posted by n5bb View Post
That’s a great post! The only thing I might add concerning the personal Bayes system is that you should also add some ham (non-spam) to your personal Bayes database. Manually reporting false spam positives using the “Not spam” button (as described above) is critical, since you want to train the Bayes database correctly when a message is improperly classified as spam. But you also need other good messages for the Bayes database. For example, you need at least 200 non-spam messages reported before the personal Bayes filter is activated. So be sure that the Archive folder is set to train as non-spam. I also use some folders with filing rules which ensure that only ham reaches those folders.
Thanks

I also keep the Archive folder marked to learn "not spam" but I don't think that's necessary either unless you're using an IMAP client (and maybe not even then). According to Fastmail's page on Improving spam protection (emphasis mine):

Quote:
Everybody's spam is different. When you report spam that's slipped through our filters, or non-spam that we've mistakenly classified, we feed this information into a database that's tuned just for you. We also automatically train this with spam you've deleted permanently from your spam folder, and non-spam you've moved to your Archive folder or replied to.
So it appears that anything that's moved to the archive folder is automatically trained as non-spam, although it's less clear if this is the case when archiving messages in an IMAP client... I know somebody from Fastmail once said that messages deleted from the spam folder in an IMAP client aren't learned as spam, so I wouldn't be surprised if IMAP moves don't get tracked in the same way either, as it's probably something tied into the web client actions on the front-end rather than monitoring changes in the back-end message store.

I also have a "HAM" folder on my account with a seven-day purge for training non-spam that I don't otherwise want to keep. Things like newsletters and notifications (which I still want to land in my inbox but usually delete after reading) get dropped into the "HAM" folder, where they'll be learned as non-spam and then automatically deleted afterward.
Old 10 Nov 2018, 07:05 AM   #21
NumberSix
Cornerstone of the Community
 
Join Date: Jan 2003
Location: The Village
Posts: 599
Quote:
Originally Posted by n5bb View Post
You can set the properties for a folder so that unpinned messages in that folder are auto-purged (permanently deleted) after a chosen time interval
Yep, did that
Old 10 Nov 2018, 07:57 AM   #22
gardenweed
Cornerstone of the Community
 
Join Date: Jun 2008
Location: Perth
Posts: 664
Quote:
Originally Posted by jhollington View Post
....

I also have a "HAM" folder on my account with a seven-day purge for training non-spam that I don't otherwise want to keep. Things like newsletters and notifications (which I still want to land in my inbox but usually delete after reading) get dropped into the "HAM" folder, where they'll be learned as non-spam and then automatically deleted afterward.
Interesting. I just use the Trash folder for that purpose. It is set to learn non-spam.
My logic is that if I have simply moved the email to Trash, rather than marked it as spam, then it is Ham.
Old 10 Nov 2018, 10:16 AM   #23
jhollington
Essential Contributor
 
Join Date: Apr 2008
Posts: 371
Quote:
Originally Posted by gardenweed View Post
Interesting. I just use the Trash folder for that purpose. It is set to learn non-spam.
My logic is that if I have simply moved the email to Trash, rather than marked it as spam, then it is Ham.
Yeah, that thought occurred to me too as I was writing the last post, but to me there just seems to be something wrong with assuming that all trash is not spam

But I can see how that would work, other than the psychological aspect, as long as you don't ever manually delete anything from the spam folder in an IMAP client (in which case, that would end up in the trash), or of course forget to report spam messages as such, simply deleting them instead.
Old 11 Nov 2018, 01:46 PM   #24
xyzzy
Essential Contributor
 
Join Date: May 2018
Posts: 474
jhollington
Thanks for the post. It, in conjunction with the "Improving spam protection" Fastmail doc, helped clear this stuff up for me (I hope).

With respect to the spam learning folder setting the Fastmail doc specifically talks about that in relation to email clients using IMAP.

Quote:
There's no mechanism in the IMAP protocol for hooking into our spam reporting system directly. However, you can nominate special folders in your account which we'll scan once a day to learn spam/non-spam.
I assume that was its original intent.
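
For IMAP users who want to script this, filing a message into a learning folder is an ordinary copy-and-delete using Python's standard `imaplib`. This is a minimal sketch under assumptions: the host, credentials, UID, and the "LearnSpam" folder name are all placeholders, and the target folder must already exist and be set to learn spam in the folder settings.

```python
import imaplib


def move_to_learning_folder(conn, uid: str, source: str, target: str) -> None:
    """Copy one message (by UID) into `target`, then delete it from `source`."""
    conn.select(source)
    status, _ = conn.uid("COPY", uid, target)
    if status != "OK":
        raise RuntimeError(f"COPY to {target!r} failed")
    conn.uid("STORE", uid, "+FLAGS", r"(\Deleted)")
    # Note: EXPUNGE removes every message flagged \Deleted in `source`,
    # not just this one.
    conn.expunge()


def main() -> None:
    # Placeholder host, login, UID and folder name; substitute your own.
    conn = imaplib.IMAP4_SSL("imap.example.com")
    conn.login("user@example.com", "app-password")
    try:
        move_to_learning_folder(conn, "1234", "INBOX", "LearnSpam")
    finally:
        conn.logout()
```

Since the folder is only scanned once a day, the message needs to sit in the learning folder for at least 24 hours before it is purged or moved again.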

Two final (I hope) questions:
  1. When I was experimenting with the sieve filter (now disabled, knowing what I know now) I created some sub-folders under Spam, i.e., Spam was the parent. These folders appear to behave just like the Spam parent, i.e., you can mark stuff in them as "not spam", as opposed to "regular" folders where you mark stuff as spam. So while I may not need this any more, it seems you can get the same spam training behavior from folders which are the children of the Spam folder. Just thought I'd mention this in case my assumption is wrong.
  2. Somewhat off topic - When I first saw the reference to "ham" in this thread I just thought it was a typo but never questioned it. In your post and the ones following it I see it's not a typo. I never heard that term before (not counting the meat), so what does "ham" stand for? What's its origin?
Old 11 Nov 2018, 09:00 PM   #25
BritTim
The "e" in e-mail
 
Join Date: May 2003
Location: mostly in Thailand
Posts: 3,084
Quote:
what does "ham" stand for? What's its origin?
If you know exactly what spam is, you should be able to figure it out. Spam is a kind of low quality canned pork meat product. It was used for spam after a Monty Python sketch where someone trying to tell someone something normally was drowned out by shouts of 'SPAM SPAM SPAM!!!'. Ham is usually a higher quality pork meat product. The use of spam (for fake email) led to ham (for real email). It is a kind of joke.

Last edited by BritTim : 11 Nov 2018 at 09:10 PM. Reason: add Monty Python reference
Old 11 Nov 2018, 09:22 PM   #26
Berenburger
The "e" in e-mail
 
Join Date: Sep 2004
Location: The Netherlands
Posts: 2,898
Quote:
Originally Posted by BritTim View Post
If you know exactly what spam is, you should be able to figure it out. Spam is a kind of low quality canned pork meat product. It was used for spam after a Monty Python sketch where someone trying to tell someone something normally was drowned out by shouts of 'SPAM SPAM SPAM!!!'. Ham is usually a higher quality pork meat product. The use of spam (for fake email) led to ham (for real email). It is a kind of joke.
Originally spam is canned ham.
https://en.wikipedia.org/wiki/Spam_(food)?wprov=sfti1
Old 12 Nov 2018, 06:10 AM   #27
xyzzy
Essential Contributor
 
Join Date: May 2018
Posts: 474
Oh, that "ham". Gotcha. Thanks.
Old 12 Nov 2018, 07:59 AM   #28
n5bb
Intergalactic Postmaster
 
Join Date: May 2004
Location: Irving, Texas
Posts: 8,917
For the origin of the term “spam” with regard to email, see:
https://www.templetons.com/brad/spamterm.html

The term “ham” is commonly used for non-spam. I’m an amateur radio operator, and we are also called “hams”. This radio related use of “ham” has a long history but the source of that term has been disputed for over 50 years.

Bill
All times are GMT +9. The time now is 12:17 AM.

 
