Kernel panic - help is needed
AlexBarakov
Patron Provider, Veteran
Around 40 minutes ago, my Seattle2 node hit a kernel panic, making it completely unresponsive.
However, some time ago I saw on this forum a simple script that reboots the node in case of a kernel panic, which I can't seem to find anymore. If anyone has it, I would be really thankful :P
Comments
Well it would have to be somewhat responsive, but if you're really feeling brave...
echo 1 > /proc/sys/kernel/sysrq    # enable the magic SysRq interface
echo b > /proc/sysrq-trigger       # 'b' reboots immediately, with no sync and no clean unmount
^ Don't do that by the way :P
http://www.cyberciti.biz/tips/reboot-linux-box-after-a-kernel-panic.html
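For anyone searching later: as far as I remember, that link boils down to a single sysctl rather than an actual script (the 10 is just an example delay in seconds):

echo "kernel.panic = 10" >> /etc/sysctl.conf    # reboot automatically 10 seconds after a panic
sysctl -p                                       # apply it now without rebooting
# optionally treat kernel oopses as full panics too:
# echo "kernel.panic_on_oops = 1" >> /etc/sysctl.conf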
Ah well, things got majorly fucked up and I am seeking help ASAP. If anyone has a management company they can recommend, please PM me. The system is no longer booting, with a kernel panic I've never seen before, and the worst part is that I need to leave the city in 1.5 hours for approximately 6-7 hours. So if anyone can recommend a reliable management company to take a look at it while I am away and coordinate its actions via email, please PM me the name / contact.
For anyone interested, below is a screenshot from my KVM console: http://i.imgur.com/IumC8.png
Can you shift-pageup on KVM?
"EXT3-fs: unable to read superblock" means, if it's trying to mount the correct block device, that the filesystem has disappeared/corrupted. Meaning that either:
-the system is trying to mount the wrong block device for some reason
or
-your software raid has de-activated itself
or
-your data is gone
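A quick way to narrow down which of those it is (assuming the root filesystem lives on /dev/md0; substitute whatever device the panic output actually names):

cat /proc/mdstat               # is the md array assembled and active at all?
blkid /dev/md0                 # does the kernel still see a filesystem signature?
dumpe2fs -h /dev/md0 | head    # can the ext3 superblock actually be read?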
I get that feeling as well, but I noticed the mdadm notices just above it; there might be some clues in there.
That particular server is indeed running software RAID.
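If the array just de-activated itself, something like this might bring it back without touching the data (device names are guesses; take the real ones from /proc/mdstat):

mdadm --detail /dev/md0                  # overall array state plus the status of each member
mdadm --examine /dev/sda1 /dev/sdb1      # read the RAID superblock on each member disk
mdadm --assemble --scan                  # try to re-assemble arrays known to mdadm.conf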
I hope your data is not gone.
@Alex_LiquidHost
Steven at rack911.com
If you can get to talk to him, he will probably solve it as best as it can get.
I am talking with him right now. Hopefully he will take the job in the next 30 minutes, before I need to get out of the office. Such bad timing.
Yea sounds like your data is fucked.
well, with mdadm errors like that run a fsck, and then see how it goes.
@Jacob Dude.. Grammar...
I know at my DC, if this happens, they will hook the drive up to a SATA-to-USB adapter and then connect it to one of my other servers, so I can pull the data off.
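Once the disk shows up on the other box (say as /dev/sdc1, a made-up name; check dmesg for the real one), the usual routine is to image it first and only ever poke at the copy:

dd if=/dev/sdc1 of=/backup/node.img bs=64K conv=noerror,sync    # raw copy; keep going past read errors, pad bad blocks
mount -o ro,loop /backup/node.img /mnt/recovery                 # mount the image read-only and pull the data off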
Dude, ........................puffff.
Tell him I said hi
On a more serious note: if he can't fix your issue, I'm sure that no one can.
@Alex_LiquidHost Seattle2 node is not Slave node, is it?
They are not the same node as far as I know.
That's the worst thing you could do at this point. Don't take this bad advice.
Steven is working on it. I think that even if he is not able to restore it to a working state, at least I would be able to get the data.
@Corey Judging from Damian's post, if the RAID is corrupt or damaged, then running fsck is a 50/50 chance, as it always is with an fsck.
I don't see where @Damian said that. If these issues exist, running fsck is a 100% chance you will screw your RAID up.
The only time you run an fsck is
1) After you replace any bad disks.
2) If the system says 'Inconsistent data on partition X'.
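Even in those two cases, a dry run on the unmounted device first costs nothing. A sketch, assuming ext3 on /dev/md0:

fsck -n /dev/md0               # -n answers "no" to every repair prompt: reports damage, writes nothing
e2fsck -n -b 32768 /dev/md0    # try a backup superblock if the primary is unreadable (mke2fs -n lists the locations)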
fsck is not suitable at all in this situation. A bunch of other things had to be done before an actual fsck could be initiated at the end of the procedure. Once they finish working on it, I can update this thread if anyone is interested in what was done.
Interested @Alex_liquidhost.
@Alex_LiquidHost please share; send me an email at [email protected].
Why via email? I think the entire community could benefit from the info, that is, if Alex has no issues with that.
Yes, please update this thread if it is okay with you.
Hi Alex, thanks for doing your best to nurse this node back to health; glad you found someone trustworthy to help you with that.
As a customer with a VPS on this node, I would definitely be interested to hear updates as you get them -- what Steven did to fix it, and what the root cause was. It can be a learning experience for all of us. Thanks for the transparency!
Root cause - most likely a failed RAID array.
@Alex_LiquidHost
fixable?