16 Aug 2003, 06:14 AM   #1
robmueller
Intergalactic Postmaster
 
Join Date: Oct 2001
Location: Melbourne, Australia
Posts: 6,102

Representative of:
Fastmail.FM
Quick situation summary

Server/network down time:

It looks like it was a power issue. We basically have to believe what NYI says since we weren't actually there, though we do have to question why the backup generators clearly hadn't been tested regularly. More details are in this thread.

http://www.emaildiscussions.com/...threadid=14962

Naturally we don't want to jump to immediate conclusions, but clearly we'll be making some hard decisions over the next couple of weeks to determine our future relationship with NYI.

Original choice of NYI:

At the time we moved from RackSpace, we sent out RFPs to over a dozen providers. One of the people we sent it to even wrote back saying it was "the most comprehensive RFP he'd ever seen". The two providers that clearly stood out were NYI and Peer 1 networks, and we went with NYI because of their better "on paper" connectivity, support and reputation. For instance, they survived 9/11 with no downtime. Also important to us, they had an excellent track record on abuse handling. They'd previously been selected as SpamCop's host.

Backup MX overload:

In summary, the smtp.eu server was overloaded. It had never before been asked to handle the full brunt of ALL email for FastMail for several hours. Unbeknownst to us, our smtp.eu administrator had turned on sender address verification to reduce spam sent through the secondary MX server. Unfortunately this hadn't been stress tested, and it turned out to be rather resource intensive, which was part of the reason the machine became overloaded.

That's why we set up a new server in Texas ASAP to handle backup email, and we changed smtp.eu to point to it as soon as we could get it ready.

Now that things are back to normal, we've made the Texas server the smtp.us2 server, and smtp.eu is now a backup to the backup. This means we now have three mail servers in completely separate geographic locations.
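
For anyone wondering how a "backup to the backup" works in practice: sending servers sort a domain's MX records by preference number and try the lowest one first, falling back to the next only when a host can't be reached. Here is a rough Python sketch of that ordering. The host names and preference values are made up for illustration and are not our real DNS records, and the reachability check is a stand-in for an actual SMTP connection attempt.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical MX records as (preference, host). Lower preference is tried first.
# These names and numbers are illustrative only, not real DNS data.
MX_RECORDS: List[Tuple[int, str]] = [
    (30, "smtp-eu.example"),   # backup to the backup
    (10, "primary.example"),   # main mail servers
    (20, "smtp-us2.example"),  # first backup
]


def delivery_order(records: List[Tuple[int, str]]) -> List[str]:
    """Return hosts in the order a sending server would try them."""
    return [host for _pref, host in sorted(records)]


def pick_mx(
    records: List[Tuple[int, str]],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first reachable host, or None if every host is down.

    is_reachable is a placeholder for a real connection attempt.
    """
    for host in delivery_order(records):
        if is_reachable(host):
            return host
    # Nothing reachable: the sender keeps the message queued and retries later.
    return None
```

With the primary down, `pick_mx(MX_RECORDS, lambda h: h != "primary.example")` picks the first backup, so that one machine takes all the incoming mail for as long as the outage lasts.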

Current situation:

All our mail queues have now been cleared, so any mail we had queued has been delivered. Some external sites may still have email for FastMail in their deferred queues, so more email may come in over the next 12 hours or so.

All web/imap/pop/etc services seem to be running normally right now.

Rob