EmailDiscussions.com  
Old 5th July 2003, 05:32 PM   #1
robmueller
Intergalactic Postmaster
 
Join Date: Oct 2001
Location: Melbourne, Australia
Posts: 6,082

Representative of:
Fastmail.FM
SpamAssassin - good news/bad news

Ok, good news first: I've found the problem that was probably causing some of the low scores. Most of the RBL lists weren't working but now should be.

Second good news, while I was there, I changed the code to always add the X-Spam-hits header if you have filtering enabled, whether it hits your threshold or not. Should help track down issues more easily in the future.
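As a rough illustration of that change, here is a minimal Python sketch. It is not FastMail's actual code: the `tag_message` helper, the header value format, and the "greater than or equal" threshold comparison are all assumptions made for the example.

```python
from email.message import EmailMessage


def tag_message(msg: EmailMessage, score: float, threshold: float) -> EmailMessage:
    """Hypothetical helper: always record the score, flag spam only at the threshold."""
    # Always add the score header, whether or not the threshold is reached.
    msg["X-Spam-hits"] = f"{score:.1f}"  # value format is an assumption
    # Only mark the message as spam when the score reaches the user's threshold
    # (using >= here is an assumption about what "hits your threshold" means).
    if score >= threshold:
        msg["X-Spam"] = "spam"
    return msg


low = tag_message(EmailMessage(), score=1.2, threshold=5.0)
high = tag_message(EmailMessage(), score=7.8, threshold=5.0)
```

With this shape, a low-scoring message still carries its score for later troubleshooting, while only the high-scoring one gets the `X-Spam: spam` mark.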

Third good news. I hadn't realised, but SA 2.55 by default now uses a Bayes type database. At the moment it's using a single global one for all users, which it builds itself based on its normal rule-checking procedures. I know that technically the strongest point of a Bayes database is that it's supposed to tune itself to each user's spam/ham emails, but I think a global one should be a helpful start for catching common spam characteristics not normally seen by the standard SA rules. It would be nice to build a set of trusted users who could report email to add to this database, though there's work there on creating the interface...

Now the bad news. Due to a mix-up during testing, there were about 15 minutes during which all incoming email that was being spam checked was marked as "X-Spam: spam", regardless of what SpamAssassin said! I'm going to check the log and inform all affected users. For the moment, you might want to check your spam folders a bit more carefully than usual. Sorry.

Rob
Old 5th July 2003, 05:49 PM   #2
Terry
Master of the @
 
Join Date: Jul 2002
Location: A.U
Posts: 1,614
Thanks Rob....

My mailbox did not work for about 20 mins. I thought it was my local ISP, as it said "dns fail 22 33 70 443"? Anyway, it's working now and most of my lost mail has arrived... I hope.
Old 5th July 2003, 08:26 PM   #3
jeffdesilva
Member
 
Join Date: Sep 2002
Posts: 58
Re: SpamAssassin - good news/bad news

Quote:
Originally posted by robmueller


Second good news, while I was there, I changed the code to always add the X-Spam-hits header if you have filtering enabled, whether it hits your threshold or not. Should help track down issues more easily in the future.



Rob
This is an excellent idea.
Old 5th July 2003, 08:31 PM   #4
eggman
Essential Contributor
 
Join Date: Jun 2002
Location: AU
Posts: 465
I agree - this is a great idea Rob.
Old 5th July 2003, 09:09 PM   #5
fmfan
Master of the @
 
Join Date: Jul 2002
Location: TX US
Posts: 1,297
Quote:
Originally posted by robmueller
I hadn't realised, but SA 2.55 by default now uses a Bayes type database.
That's partly why I asked this question on 24th June 2003, 06:44 PM:
"Wonder what 2.55 will mean when in use, compared to 2.43?"

Now I know a bit more.
Old 5th July 2003, 09:51 PM   #6
vidvandre
Cornerstone of the Community
 
Join Date: Dec 2002
Location: Sørumsand(!), Norway
Posts: 625
Re: SpamAssassin - good news/bad news

Quote:
Originally posted by robmueller
Most of the RBL lists weren't working but now should be.
May explain some of the apparent spam that's been delivered to my inbox... Good!
Quote:
Originally posted by robmueller
I changed the code to always add the X-Spam-hits header
Great, that's very useful! Thanks...
Quote:
Originally posted by robmueller
It would be nice to build a set of trusted users who could report email to add to this database
You can have my 70+ daily spam anytime!
Old 5th July 2003, 10:43 PM   #7
DrStrabismus
The "e" in e-mail
 
Join Date: May 2002
Posts: 2,645
Is 2.55 already active?
Old 5th July 2003, 10:52 PM   #8
Heartz
Essential Contributor
 
Join Date: Feb 2002
Location: Selangor, Malaysia
Posts: 454
Yeap, I'd volunteer too. I get heaps from my two ISP email accounts, which have been in operation since 1996. In the old days, I used to love passing out my email address to anybody who asked, and now I get like 30-40 daily.

Just say where, Rob. I'd help out any way I can.

Die spammers DIE!!
Old 5th July 2003, 11:40 PM   #9
DrStrabismus
The "e" in e-mail
 
Join Date: May 2002
Posts: 2,645
I'd be wary about giving a large amount of personal spam for use in a public spam corpus, without contributing a similar amount of ham. Your own spam is likely to generate tokens that relate specifically to you.

I would have thought that the best thing to do would be to create some secret spamtrap accounts (ideally including some external pop-link sources) and automate the process, so you are always using up-to-date spam.
Old 6th July 2003, 04:51 AM   #10
paleolith
Essential Contributor
 
Join Date: Mar 2002
Location: Florida
Posts: 369
Re: SpamAssassin - good news/bad news

Quote:
Originally posted by robmueller
It would be nice to build a set of trusted users who could report email to add to this database, though there's work there on creating the interface...
I have an archive of over 17,000 spams received in the past two years that I could donate to the cause. It's pretty clean -- or perhaps I should say very dirty -- as I've been careful to double-check all filtered email. And I've been archiving all my valid email for over 12 years, so I have both parts of the needed archive -- roughly 11,000 in the same period, but that's in and out, so the incoming is less. However, my archive excludes mailing list traffic, which might be a major concern.

At the very least, it would be an opportunity to test on a fairly large scale how well the technique works on a site basis as opposed to an individual basis.

Some people may be assuming that you are only asking for the spam. Everyone needs to understand that the Bayesian filter method depends as much on having a corpus of valid email as on having a spam corpus. The technique, in a nutshell, involves picking the words out of a new email which best discriminate -- in BOTH directions -- between spam and non-spam. It's not clear how well this would work with only one side of the statistics available!
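To make the two-sided point concrete, here is a small Python sketch of a simplified per-token estimate. It is only an illustration under stated assumptions, not SpamAssassin's actual Bayes implementation: the function names, the whitespace tokenizer, the 0.01/0.99 clamping, and the tiny example corpora are all made up for the example.

```python
from collections import Counter


def token_spamminess(spam_docs, ham_docs):
    """Estimate how spam-like each token is from two corpora (simplified)."""
    spam_counts = Counter(t for d in spam_docs for t in set(d.lower().split()))
    ham_counts = Counter(t for d in ham_docs for t in set(d.lower().split()))
    n_spam, n_ham = max(len(spam_docs), 1), max(len(ham_docs), 1)
    probs = {}
    for tok in set(spam_counts) | set(ham_counts):
        s = spam_counts[tok] / n_spam  # fraction of spam messages containing tok
        h = ham_counts[tok] / n_ham    # fraction of valid messages containing tok
        # Clamp so that no single token is treated as absolute proof.
        probs[tok] = min(0.99, max(0.01, s / (s + h)))
    return probs


def most_discriminating(text, probs, n=3):
    """Pick the n known tokens whose estimate is furthest from the neutral 0.5."""
    known = {t for t in text.lower().split() if t in probs}
    return sorted(known, key=lambda t: (-abs(probs[t] - 0.5), t))[:n]


# Made-up corpora: "edward" appears in both spam and valid mail.
spam = ["cheap pills online", "cheap loans for edward"]
ham = ["meeting notes for edward", "lunch meeting tomorrow"]

balanced = token_spamminess(spam, ham)
spam_only = token_spamminess(spam, [])
top = most_discriminating("cheap meeting for edward", balanced, n=2)
```

With both corpora, "edward" comes out neutral and the strongest signals are "cheap" (spam side) and "meeting" (valid side). With only the spam corpus, every token the filter has seen, including "edward", looks maximally spammy, because nothing is available to pull it back toward neutral.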

Personally, I'm willing to contribute valid email as well as spam. I trust you to set it up in such a way as to keep it private during processing -- after all, I already trust you to store my email unencrypted on your servers. But everyone volunteering needs to understand that the technique -- at least so far as it's been validated up to now -- needs a complete archive of both spam and valid email. Partial archives, either unbalanced or just because the user (like the majority) doesn't keep all email, could unbalance the statistics in serious and unexpected, very possibly detrimental, ways.

Edward
Old 6th July 2003, 07:43 AM   #11
fmfan
Master of the @
 
Join Date: Jul 2002
Location: TX US
Posts: 1,297
Quote:
Originally posted by paleolith
-- needs a complete archive of both spam and valid email. Partial archives ...
could unbalance the statistics in ... unexpected, possibly detrimental, ways.
That's what Paul Graham says...


Spam Resources


June 24, 2003 ... Bill Gates
Toward a Spam-Free Future


"Premature optimization is the root of all evil (or at least most of it) in programming."

- Donald Knuth

...

Last edited by fmfan : 6th July 2003 at 02:07 PM.
Old 6th July 2003, 08:20 AM   #12
sjk
Master of the @
 
Join Date: May 2002
Location: Hawaii
Posts: 1,974
Re: SpamAssassin - good news/bad news

Quote:
Originally posted by robmueller

Ok, good news first, ...

Second good news, ...

Third good news. ...

Now the bad news. ...
And only forum readers are worthy of receiving this news?
Old 6th July 2003, 05:53 PM   #13
bitequator
The "e" in e-mail
 
Join Date: Apr 2003
Location: USA
Posts: 2,978
I love it, this is so awesome. 2.55, and X-Spam-hits across the board! Now I have no need to keep threshold at 2 (plus it will also mark msgs scoring <2 now).

Is Jeremy pleased?
Old 7th July 2003, 04:18 AM   #14
DrStrabismus
The "e" in e-mail
 
Join Date: May 2002
Posts: 2,645
Quote:
Originally posted by DrStrabismus
Is 2.55 already active?
I guess the answer is: yes and no.

I'm seeing the new Bayes hits on my direct spam, but not on pop-link spam.
Old 7th July 2003, 09:52 PM   #15
AndrewL
Junior Member
 
Join Date: Feb 2003
Location: Cambridge, UK
Posts: 27
Like DrStrabismus I see the 'X-Spam-hits' header added to all e-mail delivered direct via a FastMail domain, but not to e-mail fetched indirectly via a POP link.

Rob, will the POP link fetching system be upgraded to the new version of SpamAssassin?

Best regards,


Andrew
Copyright EmailDiscussions.com 1998-2010. All Rights Reserved