Kernel panic - help is needed
AlexBarakov
Patron Provider, Veteran
Around 40 minutes ago, my Seattle2 node hit a kernel panic, making it completely unresponsive.
However, some time ago I saw on this forum a simple script that reboots the node in case of a kernel panic, which I can't seem to find anymore. If anyone has it, I would be really thankful :P
Comments
Well it would have to be somewhat responsive, but if you're really feeling brave...
echo 1 > /proc/sys/kernel/sysrq    # enable the magic SysRq interface
echo b > /proc/sysrq-trigger       # 'b' reboots immediately, with no sync and no clean unmount
^ Don't do that by the way :P
http://www.cyberciti.biz/tips/reboot-linux-box-after-a-kernel-panic.html
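For anyone searching later: as far as I remember, that link boils down to a single sysctl rather than an actual script (the 10 is just an example delay in seconds):

echo "kernel.panic = 10" >> /etc/sysctl.conf    # reboot automatically 10 seconds after a panic
sysctl -p                                       # apply it now without rebooting
# optionally treat kernel oopses as full panics too:
# echo "kernel.panic_on_oops = 1" >> /etc/sysctl.conf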
Ah well, things got majorly fucked up and I am seeking help ASAP. If anyone has a management company they can recommend, please PM me. The system is no longer booting, with a kernel panic I've never seen before, and the worst part is that I need to leave the city in 1.5 hours for approximately 6-7 hours. So if anyone can recommend a reliable management company to take a look at it while I am away and coordinate its actions via email, please PM me the name / contact.
For anyone interested, below is a screenshot from my KVM console: http://i.imgur.com/IumC8.png
Can you shift-pageup on KVM?
"EXT3-fs: unable to read superblock" means, if it's trying to mount the correct block device, that the filesystem has disappeared/corrupted. Meaning that either:
-the system is trying to mount the wrong block device for some reason
or
-your software raid has de-activated itself
or
-your data is gone
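A quick way to narrow down which of those it is (assuming the root filesystem lives on /dev/md0; substitute whatever device the panic output actually names):

cat /proc/mdstat               # is the md array assembled and active at all?
blkid /dev/md0                 # does the kernel still see a filesystem signature?
dumpe2fs -h /dev/md0 | head    # can the ext3 superblock actually be read?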
I get that feeling as well, but I noticed the mdadm notices just above it; there might be some clues in there.
That particular server is indeed running software RAID.
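If the array just de-activated itself, something like this might bring it back without touching the data (device names are guesses; take the real ones from /proc/mdstat):

mdadm --detail /dev/md0                  # overall array state plus the status of each member
mdadm --examine /dev/sda1 /dev/sdb1      # read the RAID superblock on each member disk
mdadm --assemble --scan                  # try to re-assemble arrays known to mdadm.conf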
I hope your data is not gone.
@Alex_LiquidHost
Steven at rack911.com
If you can get to talk to him, he will probably solve it as best as it can get.
I am talking with him right now. Hopefully he will take the job in the next 30 minutes, before I need to get out of the office. Such bad timing.
Yea sounds like your data is fucked.
well, with mdadm errors like that run a fsck, and then see how it goes.
@Jacob Dude.. Grammar...
I know at my DC, if this happens, they will hook the drive up to a SATA-to-USB adapter and then connect it to one of my other servers, so I can pull the data off.
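Once the disk shows up on the other box (say as /dev/sdc1, a made-up name; check dmesg for the real one), the usual routine is to image it first and only ever poke at the copy:

dd if=/dev/sdc1 of=/backup/node.img bs=64K conv=noerror,sync    # raw copy; keep going past read errors, pad bad blocks
mount -o ro,loop /backup/node.img /mnt/recovery                 # mount the image read-only and pull the data off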
Dude, ........................puffff.
Tell him I said hi
On a more serious note: if he can't fix your issue, I'm sure that no one can.
@Alex_LiquidHost Seattle2 node is not Slave node, is it?
They are not the same node as far as I know.
That's the worst thing you could do at this point. Don't take this bad advice.
Steven is working on it. I think that even if he is not able to restore it to a working state, at least I would be able to get the data.
@Corey Judging from Damian's post, if the RAID is corrupt or damaged, then running fsck is a 50/50 chance, as it always is with an fsck.
I don't see where @Damian said that. If these issues exist, running fsck is a 100% chance you will screw your RAID up.
The only time you run an fsck is
1) After you replace any bad disks.
2) If the system says 'Inconsistent data on partition X'.
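Even in those two cases, a dry run on the unmounted device first costs nothing. A sketch, assuming ext3 on /dev/md0:

fsck -n /dev/md0               # -n answers "no" to every repair prompt: reports damage, writes nothing
e2fsck -n -b 32768 /dev/md0    # try a backup superblock if the primary is unreadable (mke2fs -n lists the locations)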
fsck is not suitable at all in this situation. A bunch of other things had to be done before an actual fsck could be initiated at the end of the procedure. Once they finish working on it, I can update this thread if anyone is interested in what was done.
Interested @Alex_liquidhost.
@Alex_LiquidHost please share; send me an email at [email protected].
Why via email? I think the entire community could benefit from the info, that is, if Alex has no issues with that.
Yes, please update this thread if it is okay with you.
Hi Alex, thanks for doing your best to nurse this node back to health; glad you found someone trustworthy to help you with that.
As a customer with a VPS on this node, I would definitely be interested to hear updates as you get them -- what Steven did to fix it, and what the root cause was. It can be a learning experience for all of us. Thanks for the transparency!
Root cause - most likely a failed RAID array.
@Alex_LiquidHost
fixable?