FastMail Forum All posts relating to FastMail.FM should go here: suggestions, comments, requests for help, complaints, technical issues etc. |
15 Aug 2004, 05:01 PM | #1 |
Ultimate Contributor
Join Date: Sep 2001
Location: Australia
Posts: 11,501
|
Outage complete; Kernel problems resolved
OK, the outage is complete now. Very sorry about that - the problem was that the database server did not exit cleanly when we rebooted. It's never happened before - we're going to talk to the database developer to make sure we understand what happened and stop it from happening again.
We now have an updated version of the Linux kernel in place, which includes a fix for the problem that has caused numerous outages over the last few weeks. The problem was a deadlock, where two processes try to access the same thing at the same time, lock each other out in the process, and get stuck. The cause of the deadlock has been identified by the Linux kernel developers and has been resolved (well, since we can't replicate the problem, we can't be 100% sure - but the logic of the bug-fix looks sound, and it passes our stress tests). |
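For readers unfamiliar with the term, here is a minimal user-space sketch of the general deadlock pattern described above. It is only an illustration, not the kernel bug itself: the names `take_both` and `run`, the two locks, and the 0.2 second timeout are all assumptions made up for this example. Two threads each hold one lock and then ask for the other's; a timeout is used so the demonstration reports the stall instead of hanging forever.

```python
import threading

def take_both(first, second, results, name, rendezvous=None):
    """Hold `first`, then try to take `second`, giving up after a short wait."""
    with first:
        if rendezvous is not None:
            rendezvous.wait()  # both workers now hold their first lock
        # A real deadlock would block here forever; the timeout lets us observe it.
        got_second = second.acquire(timeout=0.2)
        if rendezvous is not None:
            rendezvous.wait()  # keep holding `first` until both attempts have ended
        if got_second:
            second.release()
        results[name] = got_second

def run(opposite_order):
    """Run two workers; return whether each one managed to get its second lock."""
    lock_a, lock_b = threading.Lock(), threading.Lock()
    results = {}
    # The barrier is only needed to force the unlucky timing in the opposite-order case.
    rendezvous = threading.Barrier(2) if opposite_order else None
    second_worker_locks = (lock_b, lock_a) if opposite_order else (lock_a, lock_b)
    workers = [
        threading.Thread(target=take_both,
                         args=(lock_a, lock_b, results, "t1", rendezvous)),
        threading.Thread(target=take_both,
                         args=(*second_worker_locks, results, "t2", rendezvous)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results

if __name__ == "__main__":
    print("opposite order:", run(True))
    print("same order:    ", run(False))
```

When the two workers take the locks in opposite orders, neither can get its second lock while the other still holds it, so both attempts time out. When both take the locks in the same order, one simply waits for the other to finish and both succeed. Agreeing on a single lock order is one common way to avoid this class of problem; whether the kernel fix works this way is not something this thread states.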
15 Aug 2004, 05:13 PM | #2 |
Senior Member
Join Date: May 2002
Posts: 196
|
Thanks much for the update and for getting things working. I think I can speak for the majority of users when I say we appreciate your efforts.
|
15 Aug 2004, 05:24 PM | #3 |
Ultimate Contributor
Join Date: Sep 2001
Location: Australia
Posts: 11,501
|
Well, it's really the Linux kernel guys who deserve our thanks. No-one else had managed to trigger this particular bug - these are the problems we have by being the biggest IMAP specialist! Despite this, one very helpful kernel hacker (Chris Mason - thanks Chris!) spent a lot of time over the last couple of weeks working closely with us diagnosing the problem and coming up with a fix.
|
15 Aug 2004, 06:04 PM | #4 |
Essential Contributor
Join Date: Feb 2004
Posts: 328
|
"well, since we can't replicate the problem, we can't be 100% sure"
No need to replicate. Next week it will be something else. Sorry for my cynicism, but I have spent all night waiting to download messages so that I could keep them offline, meaning I may as well treat this as a POP account. I really do appreciate that someone has to bust tail to fix these things as quickly as possible, but what about the customer? I've already switched my renewal to manual. It will take something positive from FM to keep me now. More space is not the answer. I have been marking my calendar since the major problems started in January. Back then it was foolish to think of switching when money was spent and you had 9 months to fix it. Now, I am two months away and already migrating elsewhere. Please give me a good reason to stay here. I love the features... WHEN they work. Nighty night. CMLane |
15 Aug 2004, 06:17 PM | #5 |
Ultimate Contributor
Join Date: Sep 2001
Location: Australia
Posts: 11,501
|
CMLane, all but one of the problems in the last 2 months (including the reason we had to reboot today) have been the 'stuck in D state' problem, which this kernel upgrade is designed to address.
|
15 Aug 2004, 11:09 PM | #6 |
Essential Contributor
Join Date: Feb 2004
Posts: 328
|
I look at it as one problem: Downtime. That has been a problem going back to January. My question/request remains the same. I would like a good reason to stick around.
CMLane |
15 Aug 2004, 11:44 PM | #7 | |
Cornerstone of the Community
Join Date: Oct 2001
Location: Somerville, MA, USA
Posts: 656
|
Re: Outage complete; Kernel problems resolved
Quote:
After almost every outage we usually get, along with the explanation, a reassurance that this should never happen again and that the problem is now on the list of things that are being monitored (or something to that effect). I don't mean this to sound complain-y -- honestly, I'm just curious -- is it really the case that most problems, once you catch them and figure them out, really don't recur and that each issue that pops up really is unique? If so, why do you think so many novel issues appear from time to time? Once things are going, shouldn't they just keep going? Once a system works, it generally keeps working until/unless you change some setting somewhere. (For example, whenever you call tech support, usually the first thing out of the user's mouth is, "but it's been working fine up until now!" and the first thing out of the technician's mouth is "well, did you change any settings or add anything?") Again, I mean this as a question, not as an attack. WHY problems occur with the frequency they do is perhaps a different discussion for a different time -- but I have nothing but praise for WHAT you do and HOW you handle it when they DO occur. - ML |
|
16 Aug 2004, 12:39 AM | #8 |
Cornerstone of the Community
Join Date: Apr 2002
Location: Muscat, Oman
Posts: 551
|
Thanks for that explanation, J. It's appreciated.
|
16 Aug 2004, 12:51 AM | #9 |
Essential Contributor
Join Date: Feb 2004
Posts: 328
|
...... and another major outage gets smoothed over as if nothing happened. I guess I should take that as an answer in itself. cml
p.s. lemmings? Hamelin? or is it 1984 all over again? |
16 Aug 2004, 02:20 AM | #10 | ||
Essential Contributor
Join Date: Aug 2004
Location: Washington D.C.
Posts: 240
|
Quote:
Quote:
|
||
16 Aug 2004, 03:18 AM | #11 |
Essential Contributor
Join Date: Feb 2004
Posts: 328
|
Yep, whole threads have been deleted; the coverup goes on; keep marching lockstep.
|
16 Aug 2004, 03:28 AM | #12 |
Master of the @
Join Date: Apr 2002
Location: Chicagoland
Posts: 1,142
|
Threads about this are being deleted? Are they just repeat threads of the same topic, or are any threads talking about the last outage being deleted? I guess I need to check the "New posts"...I had all my mail at Mailsnare, and forwarded to Yahoo Plus, so when the outage started, I just went back to checking mail at Mailsnare.
|
16 Aug 2004, 03:31 AM | #13 |
Moderator
Join Date: Mar 2002
Location: New York
Posts: 4,259
|
No threads on that subject have been deleted. Actually, no threads are deleted. They are merged with similar threads. Sorry, but the paranoia here is unjustified.
|
16 Aug 2004, 03:35 AM | #14 | |
Master of the @
Join Date: Apr 2002
Location: Chicagoland
Posts: 1,142
|
Quote:
|
|
16 Aug 2004, 03:37 AM | #15 |
Essential Contributor
Join Date: Feb 2004
Posts: 328
|
Yes, I started a thread about this outage, though it was not technical. What really makes me mad is that all the technobabble does some of us no good. We would like replies to our concerns. In another thread someone posted a log timeline, showing how the 5-minute outage turned into 3.5 hours. That post is gone. Believe it or not, I did/do care about FM, but unless they address some customer satisfaction issues, I will be gone, and I am guessing others are. FM needs to be aware of all of the posts of people migrating important contacts to other email addresses.
I am NOT here to tout another service. This will be the last paid service I ever use if I don't renew. My posts have been sincere, regardless of their tone. When I have questioned missing emails, I get moderators saying they don't believe me and that never happens. FM needs to see the posts, good and bad. None of mine are nasty. They range from being supportive to terse, but never totally rude, save in the case of the moderator implying that I was a liar. CMLane |