EmailDiscussions.com  


FastMail Forum All posts relating to FastMail.FM should go here: suggestions, comments, requests for help, complaints, technical issues etc.

Old 15 Aug 2004, 05:01 PM   #1
Jeremy Howard
Ultimate Contributor
 
Join Date: Sep 2001
Location: Australia
Posts: 11,501
Outage complete; Kernel problems resolved

OK, the outage is complete now. Very sorry about that - the problem was that the database server did not exit cleanly when we rebooted. It's never happened before - we're going to talk to the database developer to make sure we understand what happened and stop it from happening again.

We now have an updated version of the Linux kernel in place which has a fix for the problem that has caused numerous outages over the last few weeks. The problem was caused by a 'deadlock', where two processes would try to access the same thing at the same time, lock each other out in the process, and get stuck. The cause of the deadlock has been identified by the Linux kernel developers and has been resolved (well, since we can't replicate the problem, we can't be 100% sure - but the logic of the bug-fix looks sound, and it passes our stress tests).
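
As a rough illustration of the general pattern described above (not the actual kernel bug, whose details are not given in this thread), here is a minimal user-space Python sketch. All names in it (demonstrate_deadlock, lock_a, lock_b, the workers "A" and "B") are hypothetical. Two threads each take one lock and then ask for the lock the other is holding; the timeout is an assumption added only so the example can report the stall and exit instead of hanging forever.

```python
import threading

def demonstrate_deadlock(timeout=0.2):
    """Run two threads that take two locks in opposite order.

    Returns the names of the threads that could not get their second lock.
    """
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    # The barrier is used twice: first so both threads hold their first
    # lock before asking for the second, then so neither releases its
    # first lock until both attempts have finished.
    barrier = threading.Barrier(2)
    stuck = []

    def worker(name, first, second):
        with first:
            barrier.wait()
            # A real deadlock would block here forever; the timeout only
            # exists so this demonstration can report and exit.
            if second.acquire(timeout=timeout):
                second.release()
            else:
                stuck.append(name)
            barrier.wait()

    threads = [
        threading.Thread(target=worker, args=("A", lock_a, lock_b)),
        threading.Thread(target=worker, args=("B", lock_b, lock_a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(stuck)

if __name__ == "__main__":
    print("Threads that got stuck:", demonstrate_deadlock())
```

Each thread ends up waiting on a lock the other will not give up, which is the "lock each other out and get stuck" situation. The usual remedy for this pattern is to make every party take the locks in the same order.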

Old 15 Aug 2004, 05:13 PM   #2
memac
Senior Member
 
Join Date: May 2002
Posts: 196
Thanks much for the update and for getting things working. I think I can speak for the majority of users when I say we appreciate your efforts.
Old 15 Aug 2004, 05:24 PM   #3
Jeremy Howard
Ultimate Contributor
 
Join Date: Sep 2001
Location: Australia
Posts: 11,501
Well it's really the Linux kernel guys who deserve our thanks. No-one else had managed to trigger this particular bug - these are the problems we have by being the biggest IMAP specialist! Despite this, one very helpful kernel hacker (Chris Mason - thanks Chris!) spent a lot of time over the last couple of weeks working closely with us diagnosing the problem and coming up with a fix.
Old 15 Aug 2004, 06:04 PM   #4
CML209
Essential Contributor
 
Join Date: Feb 2004
Posts: 328
"well, since we can't replicate the problem, we can't be 100% sure"

No need to replicate. Next week it will be something else. Sorry for my cynicism, but I have spent all night waiting to download messages so that I could keep them offline, meaning I may as well treat this as a pop account. I really do appreciate that someone has to bust tail to fix these things as quickly as possible, but what about the customer?
I've already switched my renewal to manual. It will take something positive from FM to keep me now. More space is not the answer. I have been marking my calendar since the major problems started in January. Back then it was foolish to think of switching when money was spent and you had 9 months to fix it. Now, I am two months away and already migrating elsewhere.
Please give me a good reason to stay here. I love the features....... WHEN they work.
Nighty night.
CMLane
Old 15 Aug 2004, 06:17 PM   #5
Jeremy Howard
Ultimate Contributor
 
Join Date: Sep 2001
Location: Australia
Posts: 11,501
CMLane, all but one of the problems in the last 2 months (including the reason we had to reboot today) have been the 'stuck in D state' problem, which this kernel upgrade is designed to address.
Old 15 Aug 2004, 11:09 PM   #6
CML209
Essential Contributor
 
Join Date: Feb 2004
Posts: 328
I look at it as one problem: Downtime. That has been a problem going back to January. My question/request remains the same. I would like a good reason to stick around.
CMLane
Old 15 Aug 2004, 11:44 PM   #7
mlevin
Cornerstone of the Community
 
Join Date: Oct 2001
Location: Somerville, MA, USA
Posts: 656
Re: Outage complete; Kernel problems resolved

Quote:
Originally posted by Jeremy Howard
OK, the outage is complete now. Very sorry about that - the problem was that the database server did not exit cleanly when we rebooted. It's never happened before - we're going to talk to the database developer to make sure we understand what happened and stop it from happening again.
Jeremy -- thank you for the full update. Despite anything anyone says, and I know I am one who usually complains during outages, you DO always give a full and honest explanation of the issue.

After almost every outage we usually get, along with the explanation, a reassurance that this should never happen again and that the problem is now on the list of things that are being monitored (or something to that effect).

I don't mean this to sound complain-y -- honestly, I'm just curious -- is it really the case that most problems, once you catch them and figure them out, really don't recur and that each issue that pops up really is unique? If so, why do you think so many novel issues appear from time to time? Once things are going, shouldn't they just keep going? Once a system works, it generally keeps working until/unless you change some setting somewhere. (For example, whenever you call tech support, usually the first thing out of the user's mouth is, "but it's been working fine up until now!" and the first thing out of the technician's mouth is "well, did you change any settings or add anything?")

Again, I mean this as a question, not as an attack. WHY problems occur with the frequency they do is perhaps a different discussion for a different time -- but I have nothing but praise for WHAT you do and HOW you handle it when they DO occur.

- ML
Old 16 Aug 2004, 12:39 AM   #8
rakhesh
Cornerstone of the Community
 
Join Date: Apr 2002
Location: Muscat, Oman
Posts: 551
Thanks for that explanation, J. It's appreciated.
Old 16 Aug 2004, 12:51 AM   #9
CML209
Essential Contributor
 
Join Date: Feb 2004
Posts: 328
...... and another major outage gets smoothed over as if nothing happened. I guess I should take that as an answer in itself. cml
p.s. lemmings? Hamelin? or is it 1984 all over again?
Old 16 Aug 2004, 02:20 AM   #10
Starion
Essential Contributor
 
Join Date: Aug 2004
Location: Washington D.C.
Posts: 240
Quote:
Originally posted by CML209
...... and another major outage gets smoothed over as if nothing happened.
Outage? What outage? You're right CML209, I didn't notice anything.

Quote:
Originally posted by Jeremy Howard
Well it's really the Linux kernel guys who deserve our thanks. Despite this, one very helpful kernel hacker (Chris Mason - thanks Chris!)...
Thanks to Fastmail and the Linux technical support personnel for fixing the problem! Thanks Chris Mason!
Old 16 Aug 2004, 03:18 AM   #11
CML209
Essential Contributor
 
Join Date: Feb 2004
Posts: 328
Yep, whole threads have been deleted; the coverup goes on; keep marching lockstep.
Old 16 Aug 2004, 03:28 AM   #12
kchess79
Master of the @
 
Join Date: Apr 2002
Location: Chicagoland
Posts: 1,142
Threads about this are being deleted? Are they just repeat threads of the same topic, or are any threads talking about the last outage being deleted? I guess I need to check the "New posts"...I had all my mail at Mailsnare, and forwarded to Yahoo Plus, so when the outage started, I just went back to checking mail at Mailsnare.
Old 16 Aug 2004, 03:31 AM   #13
ReuvenNY
 Moderator 
 
Join Date: Mar 2002
Location: New York
Posts: 4,259
No threads on that subject have been deleted. Actually, no threads are deleted. They are merged with similar threads. Sorry, but the paranoia here is unjustified.
Old 16 Aug 2004, 03:35 AM   #14
kchess79
Master of the @
 
Join Date: Apr 2002
Location: Chicagoland
Posts: 1,142
Quote:
Originally posted by ReuvenNY
No threads on that subject have been deleted. Actually, no threads are deleted. They are merged with similar threads. Sorry, but the paranoia here is unjustified.
Thanks for clearing that up! I thought that might have been the case--either that, or deleting multiple threads about the same topic. Either way, that's different from all-out deleting anything that might sound less than approving.
Old 16 Aug 2004, 03:37 AM   #15
CML209
Essential Contributor
 
Join Date: Feb 2004
Posts: 328
Yes, I started a thread about this outage, though it was not technical. What really makes me mad is that all the technobabble does some of us no good. We would like replies to our concerns. In another thread someone posted a log timeline, showing how the 5-minute outage turned into 3.5 hours. That post is gone. Believe it or not, I did/do care about FM, but unless they address some customer satisfaction issues, I will be gone, and I am guessing others are. FM needs to be aware of all of the posts of people migrating important contacts to other email addresses.
I am NOT here to tout another service. This will be the last paid service I ever use if I don't renew. My posts have been sincere, regardless of their tone.
When I have questioned missing emails, I get moderators saying they don't believe me and that never happens.
FM needs to see the posts, good and bad. None of mine are nasty. They range from being supportive to terse, but never totally rude, save in the case of the moderator implying that I was a liar.
CMLane
All times are GMT +9. The time now is 09:01 AM.

 

Copyright EmailDiscussions.com 1998-2022. All Rights Reserved. Privacy Policy