EmailDiscussions.com


robmueller 16 Aug 2003 06:14 AM

Quick situation summary
 
Server/network down time:

Looks like it was a power issue. Basically we have to believe what they say since we weren't actually there, though we have to question why the backup generators clearly hadn't been tested regularly. More details in this thread.

http://www.emaildiscussions.com/...threadid=14962

Naturally we don't want to jump to immediate conclusions, but clearly we'll be making some hard decisions over the next couple of weeks to determine our future relationship with NYI.

Original choice of NYI:

At the time we moved from RackSpace, we sent out RFPs to over a dozen providers. One of the people we sent it to even wrote back saying it was "the most comprehensive RFP he'd ever seen". The two providers that clearly stood out were NYI and Peer 1 networks, and we went with NYI because of their better "on paper" connectivity, support and reputation. For instance, they survived 9/11 with no downtime. Also important to us, they had an excellent track record on abuse handling. They'd previously been selected as SpamCop's host.

Backup MX overload:

In summary, the smtp.eu server was overloaded. It had never before been asked to handle the full brunt of ALL email for FastMail for several hours. Our smtp.eu administrator had, unbeknownst to us, turned on sender address verification to reduce spam sent through the secondary MX server. Unfortunately this hadn't been stress tested, and it turned out to be rather resource intensive, which was part of the reason the machine became overloaded.

That's why we set up a new server in Texas ASAP to handle backup email, and we changed smtp.eu to point to it as soon as we could get it ready.

Now that things are back to normal, we've made the Texas server the smtp.us2 server, and smtp.eu is now a backup to the backup. This means we now have three completely geographically separate mail servers.
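For readers unfamiliar with how a sending server chooses between several mail servers, here is a minimal sketch in Python. It is an illustration only: the preference numbers are invented, the full hostname smtp.eu.fastmail.fm is assumed from the short name "smtp.eu", and nothing here reflects FastMail's actual DNS records.

```python
# Hypothetical MX records as (preference, hostname) pairs.
# The preference values are invented for illustration; lower is tried first.
MX_RECORDS = [
    (30, "smtp.eu.fastmail.fm"),   # assumed full name of "smtp.eu"
    (10, "smtp.us.fastmail.fm"),
    (20, "smtp.us2.fastmail.fm"),
]


def delivery_order(records):
    """Return hostnames in the order a sending server would try them."""
    return [host for _, host in sorted(records)]


def first_reachable(records, is_up):
    """Return the first host, in preference order, that is_up() accepts.

    Returns None when no listed server can be reached.
    """
    for host in delivery_order(records):
        if is_up(host):
            return host
    return None


if __name__ == "__main__":
    # Simulate the primary being unreachable.
    primary_down = lambda host: host != "smtp.us.fastmail.fm"
    print(delivery_order(MX_RECORDS))
    print(first_reachable(MX_RECORDS, primary_down))
```

With the primary marked unreachable, the sketch falls through to the next server in preference order, which is the role the backup servers play during an outage.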

Current situation:

All our mail queues have now been cleared, so any mail we had queued has been delivered. Some external sites may still have email for FastMail in their deferred queue, so more email may arrive over the next 12 hours or so.

All web/imap/pop/etc services seem to be running normally right now.

Rob

kander 16 Aug 2003 06:17 AM

Thanks for the update, Rob! As always your communication with the users is greatly appreciated!

--K

bitequator 16 Aug 2003 06:50 AM

Great news! It'll be interesting to see what NYI's new state-of-the-art coloc will look like... will you guys give them the benefit of the doubt and wait for news of this new data center?

Incidentally, are you able to give us more details on the hosting of the Texas server?

binary 16 Aug 2003 06:13 PM

Thanks for the update Rob.

I've been away from a computer for the last 48 hours so I haven't tried to access FM whilst it was down.

Will incoming mail have been lost, bounced back to sender or delivered into my FM mailbox?

It's a rudimentary question that isn't answered by the weblog entry, nor your forum post.

Jeremy Howard 17 Aug 2003 06:24 AM

Incoming mail will have been delivered to your inbox, except in rare cases where the sending server was configured to not queue mail, and failed to reach our backup server. In this case, the sender would get an undeliverable message notification.
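The three outcomes described above can be sketched roughly as follows. This is a simplified model, not how any particular mail server is implemented; the function name, the status strings, and the host names used in the demo are all hypothetical.

```python
def delivery_outcome(mx_hosts, is_up, sender_queues):
    """Model one delivery attempt from a sending server.

    mx_hosts      -- receiving servers in the order the sender tries them
    is_up         -- callable returning True if a host accepts connections
    sender_queues -- True if the sending server keeps undelivered mail
                     and retries later (the usual configuration)
    """
    for host in mx_hosts:
        if is_up(host):
            # Accepted by the primary or a backup; it reaches the inbox
            # once the storage server is back.
            return "accepted by " + host
    if sender_queues:
        # Nothing reachable, but the sender holds the message and retries.
        return "deferred, will retry"
    # Nothing reachable and no queue: the sender is told delivery failed.
    return "bounced to sender"


if __name__ == "__main__":
    hosts = ["primary.example", "backup.example"]  # hypothetical names
    nothing_up = lambda host: False
    print(delivery_outcome(hosts, nothing_up, sender_queues=True))
    print(delivery_outcome(hosts, nothing_up, sender_queues=False))
```

Only the last case, a sender that does not queue and cannot reach any server, produces an undeliverable notification.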

aminm 17 Aug 2003 06:31 AM

Backup web servers
 
Rob wrote:

"Backup MX overload:

In summary, the smtp.eu server was overloaded. It had never before been asked to handle the full brunt of ALL email for FastMail for several hours. Our smtp.eu administrator had, unbeknownst to us, turned on sender address verification to reduce spam sent through the secondary MX server. Unfortunately this hadn't been stress tested, and it turned out to be rather resource intensive, which was part of the reason the machine became overloaded."

This still doesn't explain why I could not get to the www.fastmail.fm homepage for the duration of the NYC outage. Don't you have backup web servers in place?

Amin

Jeremy Howard 17 Aug 2003 06:34 AM

For you to read your mail, the server that actually stores your e-mail must be up. The backup servers are for queuing, and name services.

Shelded 17 Aug 2003 07:10 AM

Re: Quick situation summary
 
Quote:

Originally posted by robmueller
Now that things are back to normal, we've made the Texas server the smtp.us2 server, and smtp.eu is now a backup to the backup. This means we now have three completely geographically separate mail servers.


I'm not a guru at this, but I did a tracert on smtp.us.fastmail.fm and smtp.us2.fastmail.fm which are 66.111.4.3 and 66.111.4.20. They have the same router immediately preceding them, that is, 66.111.15.204. I don't see how they're not sitting in the same facility instead of one in Texas and the other in NYC. What does this mean?
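The comparison described above, checking whether two traces share the router just before the destination, can be sketched in a few lines of Python. Only the final two hops quoted in the post are used; the earlier hops are not given, and the helper names are made up for this example. A shared last router suggests, but does not prove, that two hosts sit on the same network.

```python
def last_router(hops):
    """Return the hop immediately before the destination, or None."""
    return hops[-2] if len(hops) >= 2 else None


def share_last_router(trace_a, trace_b):
    """True if both traces pass through the same final router."""
    a, b = last_router(trace_a), last_router(trace_b)
    return a is not None and a == b


if __name__ == "__main__":
    # Final two hops only, as reported in the post above.
    trace_us = ["66.111.15.204", "66.111.4.3"]    # smtp.us.fastmail.fm
    trace_us2 = ["66.111.15.204", "66.111.4.20"]  # smtp.us2.fastmail.fm
    print(share_last_router(trace_us, trace_us2))
```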

aminm 17 Aug 2003 11:25 AM

No redundancy?
 
Quote:

Originally posted by Jeremy Howard
For you to read your mail, the server that actually stores your e-mail must be up. The backup servers are for queuing, and name services.

This is not good. There is no redundancy in your system then. You have a single point of failure. Are you telling me that whenever New York is down I can't get to the FastMail UI no matter how many backup servers you have on other continents?

Edwin 17 Aug 2003 12:19 PM

Re: No redundancy?
 
Quote:

Originally posted by aminm
This is not good. There is no redundancy in your system then. You have a single point of failure. Are you telling me that whenever New York is down I can't get to the FastMail UI no matter how many backup servers you have on other continents?

You phrase this like it's a regular occurrence! To the best of my knowledge, this is the first time ever that NY has been down.

biffbulkie 17 Aug 2003 12:35 PM

Re: Re: No redundancy?
 
Quote:

Originally posted by Edwin
You phrase this like it's a regular occurrence! To the best of my knowledge, this is the first time ever that NY has been down.

Actually, it's happened twice before.

;)

Jeremy Howard 17 Aug 2003 02:47 PM

Re: Re: Quick situation summary
 
Quote:

Originally posted by shelded
I'm not a guru at this, but I did a tracert on smtp.us.fastmail.fm and smtp.us2.fastmail.fm which are 66.111.4.3 and 66.111.4.20. They have the same router immediately preceding them, that is, 66.111.15.204. I don't see how they're not sitting in the same facility instead of one in Texas and the other in NYC. What does this mean?

It's back to the way it was for a few days until we finish setting up monitoring systems on the new server. We rushed it into service during the outage since it was needed, but we need to do more work to make it robust before it is permanently in production.

aminm 17 Aug 2003 03:02 PM

Re: Re: No redundancy?
 
Quote:

Originally posted by Edwin
You phrase this like it's a regular occurrence! To the best of my knowledge, this is the first time ever that NY has been down.

NYI faults aside, this episode made me realize that fastmail.fm does not seem to have backup web servers set up in a different geographical location. If the New York web servers are toasted (for any reason, be it an outage, a software glitch, whatever), users will seemingly be unable to access the web front-end. I would love to hear that I am wrong.

hadaso 17 Aug 2003 03:23 PM

Re: Re: Re: No redundancy?
 
Does anyone know any email provider that has backup webmail/pop/imap access to all user email at a different geographic location?

How many email providers provide backup mail servers (i.e., SMTP) in different geographic locations?

Shelded 17 Aug 2003 04:26 PM

Does anyone know half as much about any other mail provider as you do about Fastmail or Runbox? We're pretty nosey about this stuff but has Hotmail ever told you anything like what Fastmail tells you?

Hotmail supposedly has 5,000 servers (I think Edwin worked that out once) so I suppose they have the ability to provide my data from more than one place. But there have been two occasions I recall when they were unavailable to thousands of users -- remember the time they had a primary go down and it turned out that the backup data was junk? That was a self-inflicted problem. Another time they corrupted a bunch of people's passports or something of the sort. Those errors took more than a full day each to resolve. So if Microsoft can have a failure, why can't Fastmail? :)


All times are GMT +9.

