EmailDiscussions.com  

25th May 2005, 05:14 PM   #1
none4u
Junior Member
 
Join Date: May 2005
Posts: 15
Fastmail Down - 1:13am PST 5/25/2005

Fastmail is currently down and along with it went my business website.
25th May 2005, 05:21 PM   #2
Heartz
Essential Contributor
 
Join Date: Feb 2002
Location: Selangor, Malaysia
Posts: 455
Currently facing problems too. It says contacting server and then stops there. Ironically, http://admin.fastem.com is having the same problem.
25th May 2005, 05:51 PM   #3
brong
The "e" in e-mail
 
Join Date: Jul 2004
Location: Oslo, Norway
Posts: 2,380

Representative of:
Fastmail.fm
Our shiny new RAID array seems to have taken out the server with the filestorage on it, and everything else is timing out trying to contact it.

We're working on bringing machines back to life now.

Bron.
25th May 2005, 05:59 PM   #4
none4u
Junior Member
 
Join Date: May 2005
Posts: 15
Is there any ETA for when it will be back up? Should I pause my Google AdWords? I hate to pause them because it can cause me to lose rank.
25th May 2005, 06:00 PM   #5
kander
The "e" in e-mail
 
Join Date: Jan 2002
Location: The Netherlands
Posts: 4,110
Seems to be back up now.
25th May 2005, 06:12 PM   #6
brong
The "e" in e-mail
 
Join Date: Jul 2004
Location: Oslo, Norway
Posts: 2,380

Representative of:
Fastmail.fm
yeah, it's all back up.

I'll post more after I've got the kids in bed.
25th May 2005, 06:22 PM   #7
Terry
Master of the @
 
Join Date: Jul 2002
Location: A.U
Posts: 1,980
I do hope they are queueing the mail, as I'm missing one.
25th May 2005, 09:08 PM   #8
brong
The "e" in e-mail
 
Join Date: Jul 2004
Location: Oslo, Norway
Posts: 2,380

Representative of:
Fastmail.fm
I've posted a comment to the status blog and linked to this thread. I promised I'd get into more technical details here.

We've purchased two new 4TB SATA arrays. Our current storage was filling up as people started using their increased quotas. There's only space in our current cabinets to add one of them, so we connected it to the same machine as the current backups and filestorage, planning to move backups across to the new disk, freeing up the rest of the older SATA unit for the increased filestorage we'll get once DAV/FTP is rolled out.

The first problem was that the Linux driver for our SCSI card only supports disks up to 2TB. We worked around that by splitting the array (something that's very easy to do in its config menu), and by upgrading the kernel to support LVM so we could rejoin the pieces into one big disk!
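
As a rough illustration of the workaround (not from the post itself): a 2TB ceiling often comes from a 32-bit sector address combined with 512-byte sectors. Whether that is the exact cause in this particular driver is an assumption, and the function names below are hypothetical. The sketch computes that ceiling, splits a 4TB array into pieces that fit under it, and shows how a linear (concatenated) volume maps an offset back to a piece, which is the general idea behind rejoining the pieces with LVM.

```python
SECTOR_BYTES = 512   # assumed sector size
LBA_BITS = 32        # assumed width of the sector address in the driver


def max_device_bytes(lba_bits=LBA_BITS, sector_bytes=SECTOR_BYTES):
    """Largest device addressable with the given sector address width."""
    return (2 ** lba_bits) * sector_bytes


def split_array(total_bytes, limit_bytes):
    """Split an array into the fewest near-equal pieces, each <= limit_bytes."""
    pieces = -(-total_bytes // limit_bytes)  # ceiling division
    base, extra = divmod(total_bytes, pieces)
    return [base + (1 if i < extra else 0) for i in range(pieces)]


def locate(logical_offset, piece_sizes):
    """Map an offset in the concatenated volume to (piece index, offset in piece)."""
    for index, size in enumerate(piece_sizes):
        if logical_offset < size:
            return index, logical_offset
        logical_offset -= size
    raise ValueError("offset beyond end of volume")


limit = max_device_bytes()                 # 2199023255552 bytes (2 TiB)
pieces = split_array(4 * 10**12, limit)    # two pieces of 2 * 10**12 bytes
print(limit, pieces, locate(2 * 10**12 + 5, pieces))
```

Real LVM works in fixed-size extents rather than raw byte offsets, so this only shows the addressing idea, not the on-disk layout.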

The only problem is that the internal RAID array in the box uses a driver which is broken in newer Linux kernels. We found a patch which claims to fix it and applied it, but you saw what happened: the internal RAID disappeared from the system and everything went pear-shaped!

We rebooted the server and it came up fine, then switched back to the last known stable version of the kernel on that box, adding the LVM support back in. I forgot to add the SCSI "check all LUNs" support (I claim holding a crying baby at the time as the excuse for my mistake), so it took yet another reboot to add that back in as well.

At this point all the disks were back, and the other problem surfaced. The web servers mount the SATA arrays over NFS to provide filestorage access. They started freezing up waiting for responses from the NFS server, which then sent all the other servers into a spiral waiting for services. I saw load averages over 1000 on one of our frontend boxes!
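
To see how a stalled backend can produce a number that large, here is a toy simulation (an illustration only; the rates and stall times are made up, not FastMail's figures). On Linux, tasks blocked in uninterruptible I/O wait count toward the load average, so requests piling up behind an unresponsive NFS server show up as load even though the CPU is idle.

```python
def simulate_backlog(arrivals_per_sec, completions_per_sec,
                     stall_start, stall_end, duration):
    """Return the number of in-flight requests at the end of each second.

    While the backend is stalled (stall_start <= second < stall_end),
    new requests keep arriving but none complete.
    """
    in_flight = 0
    history = []
    for second in range(duration):
        in_flight += arrivals_per_sec
        stalled = stall_start <= second < stall_end
        if not stalled:
            in_flight -= min(in_flight, completions_per_sec)
        history.append(in_flight)
    return history


# Hypothetical numbers: 20 requests/s arriving, 50/s capacity, 60 s stall.
history = simulate_backlog(20, 50, stall_start=10, stall_end=70, duration=120)
print(max(history))   # peak backlog during the stall
print(history[-1])    # backlog once the stall has drained
```

Even a modest request rate builds a four-digit backlog within a minute, and the backlog takes a while to drain after the backend recovers, which fits with having to restart services or whole servers to get back under control.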

We restarted either services (if the boxes weren't too overloaded by then) or entire servers until we got everything back under control.

The NFS server machine has always been very reliable, though we have had issues with a SCSI cable in the past, and now these problems from adding the additional device. We're really hoping the kernel downgrade has fixed the problem for now, and we have another step available after this (add a separate SCSI card and put the new array on a different channel). We'd prefer to separate things more, but we don't have additional space available in the cabinets yet.

Our apologies to everyone who was affected by this,

Bron.
26th May 2005, 01:11 AM   #9
Aleks
Member
 
Join Date: Mar 2005
Location: New York, USA
Posts: 81
I have 2 questions:

1. Was there any email loss during this downtime (bounced, deleted, undelivered, etc)?
2. How many developers does FastMail have? It seems like brong is the only one who's fixing bugs and adding features... Maybe you guys need to hire some additional help to speed things up.
26th May 2005, 08:16 AM   #10
brong
The "e" in e-mail
 
Join Date: Jul 2004
Location: Oslo, Norway
Posts: 2,380

Representative of:
Fastmail.fm
I'm the only one who's being chatty on the forum anyway!

On the issue of lost email - absolutely not. This is one very nice thing about email (and the Postfix server we use takes great pains to be reliable) - if the receiving host can't guarantee that it has a copy then it doesn't tell the sending host that reception succeeded, and so it will be queued and tried again.
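
The store-and-forward behaviour described above can be sketched in a few lines (a simplified model, not Postfix's actual code; the class and function names are hypothetical). The receiver answers with SMTP reply code 250 only once it has stored the message, and with a temporary failure code such as 451 otherwise. The sender keeps anything that was not acknowledged with 250 in its queue and retries later.

```python
class Receiver:
    """Toy receiving host: acknowledges a message only after storing it."""

    def __init__(self):
        self.storage_ok = True
        self.mailbox = []

    def deliver(self, message):
        if not self.storage_ok:
            return 451  # temporary failure: sender should try again later
        self.mailbox.append(message)
        return 250      # message accepted and stored


def run_sender(queue, receiver):
    """One delivery pass; returns the messages still waiting in the queue."""
    remaining = []
    for message in queue:
        if receiver.deliver(message) != 250:
            remaining.append(message)
    return remaining


receiver = Receiver()
receiver.storage_ok = False                         # receiver cannot store mail
queue = run_sender(["msg-1", "msg-2"], receiver)    # nothing is acknowledged
receiver.storage_ok = True                          # receiver recovers
queue = run_sender(queue, receiver)                 # retry succeeds
print(queue, receiver.mailbox)
```

Because responsibility for a message only passes to the receiver on a 250 reply, a receiving-side outage delays mail rather than losing it, as long as the sender's queue lifetime outlasts the outage.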

We have lost mail in the past for short periods, but in each case it's been due to misconfiguration rather than server failure.

Besides, in this case the problem, while serious, affected the NFS infrastructure only, so it took down the web servers and the frontend proxy servers (web servers waiting for disk responses from NFS, and frontend servers overloaded queueing requests for the web servers!). The database and MX servers, which are the only ones mail reception depends on, were all fine. They wouldn't have even noticed!
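
The dependency picture in that paragraph can be written down as a small graph (a sketch based only on the relationships named in this thread; the real topology surely has more pieces, and the "web interface" and "mail reception" labels are added here for illustration). Walking the graph from the failed component shows which services go down with it.

```python
# Who depends on whom, as described in the thread (simplified).
DEPENDS_ON = {
    "nfs": [],
    "web": ["nfs"],
    "frontend proxy": ["web"],
    "web interface": ["frontend proxy"],   # illustrative label
    "database": [],
    "mx": [],
    "mail reception": ["mx", "database"],  # illustrative label
}


def affected_by(failed, depends_on):
    """Return every service that directly or indirectly depends on `failed`."""
    down = {failed}
    changed = True
    while changed:
        changed = False
        for service, deps in depends_on.items():
            if service not in down and any(dep in down for dep in deps):
                down.add(service)
                changed = True
    return down - {failed}


print(sorted(affected_by("nfs", DEPENDS_ON)))
```

With NFS as the failed node, the web and frontend tiers are affected while mail reception is not, matching the account above.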

Bron.
26th May 2005, 11:54 AM   #11
Aleks
Member
 
Join Date: Mar 2005
Location: New York, USA
Posts: 81
That's good news, Bron. Keep up the good work.
All times are GMT +9.

 

Copyright EmailDiscussions.com 1998-2013. All Rights Reserved