EmailDiscussions.com  

11 Mar 2014, 01:37 AM   #1
kballard
Junior Member
 
Join Date: Mar 2014
Posts: 1
Spam questions

I have two questions about spam filtering in FastMail.

The first has to do with auto-flagging. As per FastMail's suggestion, I have my INBOX set to automatically mark messages as "Not Spam", and my Junk folder set to flag messages as Spam. My question is, if a spam mail gets through my filter and lands in my INBOX, and then I subsequently move it to Junk, does the initial "Not Spam" flag hurt the spam filtering, or does the subsequent Spam flag completely undo anything the "Not Spam" flag did?

My second question is about one variety of spam that still regularly makes it through the spam filter into my inbox, despite my having flagged almost 1000 spam messages at this point. These messages are all written in a CJK language (either Chinese or Japanese) and they all have an attachment. I don't know why they keep getting through when almost everything else is now caught. Does anyone have suggestions for how to deal with particular messages like this? I had a similar issue with Russian email that I solved with a Sieve rule that looks like `header :is ["X-Spam-charsets"] "from='koi8-r', subject='koi8-r', plain='koi8-r'"`, but these CJK messages all have a utf-8 charset. The most recent one has an X-Spam-hits entry `LANGUAGES ja.shift-jis`, but others that I looked at don't have this at all.

I'm hoping that eventually SpamAssassin will figure out how to filter these on its own, but I'm worried about how long it's taken so far. I'm also worried that the answer to my first question is that the auto-flagging will actually hurt the spam filtering of any messages that make it into my INBOX, and if so that would make it even harder for SpamAssassin to learn to flag these CJK messages.

11 Mar 2014, 07:11 AM   #2
BritTim
The "e" in e-mail
 
Join Date: May 2003
Location: mostly in Thailand
Posts: 3,093
Welcome to the forums!

I expect Bill will chime in with a fuller and better exposition on this subject later.

Meanwhile, here is an answer to your first question. Assuming you check email promptly (say, at least twice a day), spam that briefly sits in the inbox should not be a big issue for Bayes training. Training is not a realtime process: the folder is checked only once a day for new ham. You can see the approximate time of day this occurs by looking at the details of the folder from the Folders screen (look for last scan).

On the second issue, maybe you can approach this a slightly different way. Look carefully at the spam headers of the offending emails. (First, do verify that BAYES_USED is set to user.) Next, look at the total X-Spam-score. If the score is close to the spam threshold, you can possibly just be a little more aggressive with your filtering. The default settings are quite conservative. Especially if you whitelist your known correspondents by adding them to your contacts, you can use lower thresholds pretty safely.

11 Mar 2014, 08:59 AM   #3
DrStrabismus
The "e" in e-mail
 
Join Date: May 2002
Posts: 2,804
SpamAssassin can tell whether an email has already been learned, so it trains, ignores, or untrains and retrains accordingly. It does have a problem with identifying languages in Unicode, but if Bayes is catching them I would suggest filtering on the Bayes result.
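
A minimal sketch of what filtering on the Bayes result could look like in Sieve. It assumes the Bayes verdict shows up in X-Spam-hits under a standard SpamAssassin rule name such as BAYES_99 (check the headers of one of the offending messages to confirm), and the folder name is only a placeholder:

```sieve
require ["fileinto"];

# Assumption: X-Spam-hits lists the SpamAssassin rules that fired, and the
# Bayes bucket appears there by name. BAYES_99 means the classifier puts the
# spam probability at 99% or higher (":contains" also matches BAYES_999).
# "INBOX.Junk Mail" is a placeholder; use your own spam folder's name.
if allof(
    header :contains "X-Spam-hits" "BAYES_99",
    not header :contains "X-Spam-known-sender" "yes"
) {
    fileinto "INBOX.Junk Mail";
    stop;
}
```

This only helps if Bayes is already scoring these messages highly while the total score stays under the threshold.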

11 Mar 2014, 10:57 AM   #4
n5bb
Intergalactic Postmaster
 
Join Date: May 2004
Location: Irving, Texas
Posts: 8,927
Backscatter?

Yes, you can always correct your spam marking and the last change will be used for future filtering.

If all of those foreign language messages include an attachment, then I suspect that they are backscatter: failed delivery notices caused by some spammer(s) using your address as the sender in spam sent to email services in Japan or some other Far East country. If the spammer is sending to nonexistent addresses, the receiving email system will return a message to the Return-Path address (often with an attachment containing the original email). So if spammers are spoofing your address in the Return-Path, you will receive large numbers of bounces.

If these are truly backscatter, then they aren't normal spam but instead a side effect of spam. Fastmail attempts to discover these and deal with them as shown on the Advanced>Spam/Virus Protection - Custom screen (Backscatter Action). But I'm pretty sure that Fastmail is only searching for English (or maybe a few European languages) to find backscatter, so your backscatter may not be detected properly by the Fastmail backscatter filter.

You might look at the X-Spam-source headers and see if there are certain common Host, Country, FromHeader, or MailFrom header values only found on these messages. Also look at the X-Mail-from and even the From headers, since it's possible these messages may be bounces from only a few email systems.

If you find some header values that are always showing up in these undesired messages, it's possible to filter these messages. Just post here and someone will help you create a filter. I recommend using the filter to file into a new folder so you know how well the new rules are working and can find any false positives.
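
As a starting point, here is a minimal sketch of such a trial filter in Sieve. The `Country='JP'` value and the folder name are hypothetical, and the `name='value'` layout of X-Spam-source is an assumption modeled on the X-Spam-charsets value quoted in the first post, so check it against your own headers:

```sieve
require ["fileinto"];

# Hypothetical values: replace Country='JP' and the folder name with
# whatever actually recurs in the X-Spam-source header of your messages.
if allof(
    header :contains "X-Spam-source" "Country='JP'",
    not header :contains "X-Spam-known-sender" "yes"
) {
    # File rather than discard, so false positives can be reviewed.
    fileinto "INBOX.Backscatter-test";
    stop;
}
```

Once the folder has collected only unwanted mail for a while, the `fileinto` can be pointed at the spam folder instead.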

Good luck at reducing the number of these messages. If they are (as I suspect) foreign language backscatter, you may have a hard time filtering them if they are coming from many different email systems in various countries.

Bill

13 Mar 2014, 11:48 PM   #5
Glendon CDN
Member
 
Join Date: Feb 2004
Location: Markham, ON Canada
Posts: 80
On Advanced Settings my Bayes Status shows as "Global". How/where do I change it to "user" (as suggested above)? The Global "button" is not clickable.

14 Mar 2014, 12:56 AM   #6
DrStrabismus
The "e" in e-mail
 
Join Date: May 2002
Posts: 2,804
You have to train a minimum number of spam and ham messages (200 of each, I think) before your account switches to using your per-user database.

14 Mar 2014, 01:57 AM   #7
Ksmith
Junior Member
 
Join Date: Jan 2014
Posts: 7
So if you look at the headers and see lots of ".UK" domain spam, how do you create a filter to add 5 points to any email with a UK domain?

14 Mar 2014, 07:40 AM   #8
lane
Cornerstone of the Community
 
Join Date: Dec 2005
Location: Kars, NB, Canada
Posts: 702
Quote:
Originally Posted by Ksmith
So if you look at the headers and see lots of ".UK" domain spam, how do you create a filter to add 5 points to any email with a UK domain?
I don't think you can. But you can write a rule (filter) to move it to your spam folder or discard it when the spam score is 5 points lower than usually necessary to move it there. For example, I get a large amount of spam (about 150 per day) to one particular address at my domain, call it trouble@mydomain.com. So the rule for me is entered under Discarding Emails:

Message with: Advanced
That: NA
The text: allof(header :contains "X-Delivered-to" "trouble@mydomain.com", header :value "ge" :comparator "i;ascii-numeric" ["X-Spam-score"] ["3"], not header :contains ["X-Spam-known-sender"] "yes" )

This discards all email to that address if the spam score is 3 or more, unless it is from a known sender. My usual discard level is 10, but this rule makes it 3 just for that one particular address. You could do the same, either as a discard or file to the spam folder.
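
Applied to the ".UK" question, the same idea might look like the sketch below as a standalone Sieve script. The threshold of 3 and the folder name are placeholders, and the extra test for a leading minus sign is my own precaution: the `i;ascii-numeric` comparator only understands strings that start with a digit and treats anything else as infinitely large, so a negative score would otherwise pass the "ge" test.

```sieve
require ["relational", "comparator-i;ascii-numeric", "fileinto"];

# Treat mail from .uk senders as spam at a score of 3 or more instead of
# the usual threshold, unless the sender is a known sender.
if allof(
    address :domain :matches "From" "*.uk",
    # Skip negative scores; i;ascii-numeric would read "-1.2" as infinity.
    not header :matches "X-Spam-score" "-*",
    # Only the leading whole-number part of the score is compared.
    header :value "ge" :comparator "i;ascii-numeric" "X-Spam-score" "3",
    not header :contains "X-Spam-known-sender" "yes"
) {
    # Placeholder folder name; swap "fileinto ..." for "discard" if preferred.
    fileinto "INBOX.Junk Mail";
    stop;
}
```

Filing to a folder first is the safer choice until you have seen what the rule catches.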

All times are GMT +9.