Debian 9 KVM VPS - Filesystem became "read-only". What to do?

Amitz · October 2017

Dear fellows,

I have a KVM VPS that I use for backups. It ran fine for many moons on Debian 9 and caused no headache at all. I logged in today and had to find out that the filesystem became read-only several days ago. Aargh.

I wonder: Is this something that I can fix or is this something with the host system and will I have to open a ticket to get it repaired?

Thanks for your insights in advance!

rm_ · October 2017

Look into dmesg for reasons. Typically it's either some FS error, or host temporarily denying all writes to the block device (if they used thin LVM and oversold the storage). If it's the former, you will need to reboot into some sort of a rescue system and run fsck, if the latter, just reboot normally.
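A rough sketch of what "look into dmesg" could mean in practice, assuming an ext4 filesystem. The function name filter_fs_errors and the pattern list are only illustrative assumptions, not an exhaustive catalogue of kernel messages:

```shell
# Keep only kernel log lines that commonly accompany a forced read-only remount.
# Reads log text on stdin, so it can be fed from dmesg or from a saved log file.
# Assumption: the patterns below are a small, hand-picked sample for ext4 and
# block-layer errors; other filesystems log different messages.
filter_fs_errors() {
    grep -E -i 'EXT4-fs error|Remounting filesystem read-only|I/O error|Aborting journal'
}

# Example use on a live system (not run here):
#   dmesg | filter_fs_errors
```

If this prints nothing, keep in mind that the kernel ring buffer has a fixed size, so older messages may already have been overwritten.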

Amitz said: will I have to open a ticket to get it repaired?

If it is an unmanaged service, you are expected to fix this kind of issue yourself.

ehab · October 2017

reboot and check fs
did you run out of space?

mksh · October 2017

As others said, check dmesg to see what happened and run fsck if needed. If you are positive that you can just ignore whatever caused the filesystem to be mounted read-only, you could also just do mount -o remount,rw /mountpoint to make it writable again.
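Before remounting, it may help to see which mount points are actually read-only. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, options); the function name list_ro_mounts is made up for illustration:

```shell
# Print every mount point whose option list contains the bare option "ro".
# Takes an optional path to a mounts table in /proc/mounts format
# (defaults to /proc/mounts itself).
list_ro_mounts() {
    # Field 4 is the comma-separated option list, field 2 is the mount point.
    # The regex matches "ro" only as a whole option, so "errors=remount-ro"
    # on a read-write mount is not reported.
    awk '$4 ~ /(^|,)ro(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}
```

Some pseudo filesystems can legitimately show up as read-only, so only the entries for real disks are interesting. For one of those, the remount would then be the command from above, e.g. mount -o remount,rw /mountpoint.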

Amitz · October 2017

Thank you, guys! I will do a cociu and see what the fsck will bring. dmesg shows no traces. Would hate it if I had to set up the server again... Again, thanks!

Neoon · October 2017

Well, rebooting the VPS while the filesystem is read-only is a bad idea, you can lose data this way.

WSS · October 2017

The fact that it's a KVM generally means that it's going to have a "real" console to it. Do not run fsck while it is a live (mounted) filesystem. I assume it's ext4, in which case you should be able to /mostly/ recover gracefully.

Go back into grub, hit 'e', then add "single" to the kernel command line (if systemd still supports that), then hit F10. It'll boot into single user mode without remounting the filesystems. Do a fsck on / and see what it has to say. If it doesn't say something super-duper nasty or suggest that you have hundreds of missing inodes, go ahead and say 'y' to repair, then when it's done, sync;sync;sync&&reboot

angstrom · October 2017

@Amitz said:
Thank you, guys! I will do a cociu and see what the fsck will bring. dmesg shows no traces. Would hate it if I had to set up the server again... Again, thanks!

Running fsck is unavoidable in this situation; I only hope that your disk is (much) smaller than cociu's. :-)

If you first run fsck with the flag -N, you can safely see what it would do (maybe also with the flag -V for more info).

Sometimes, this problem indicates that the hard disk is beginning to fail.

Maounique · October 2017

angstrom said: Sometimes, this problem indicates that the hard disk is beginning to fail.

Rarely, and mostly on physical machines.
This can be a lot of things, from a hiccup of the storage on the node to major corruption. In general, though, the nodes have RAID storage, which usually either fails completely or fails transparently for the user, who sees at most some slowdown. On hundreds of nodes we have a disk failure every week or so; RAID corruption we had just 4 times in 6 years, and in 2 of those we managed to recover all data.
In SAN storage cases, though, the node might lose contact with the SAN for various reasons, from unplugging the wrong cord to fabric malfunction, and even brief incidents of that kind can create a problem like this.
Also, there is overselling, but I wouldn't know about that...

raindog308 · October 2017

Dude, when I emailed you the link to tentacle-highlights.tar.gz, I specifically told you not to untar it in the root dir...

JoeMerit · October 2017

Did you ask the provider what happened? It is doubtful that it was limited to just your KVM.

jlay · October 2017

Usually happens when the underlying storage (e.g. iSCSI or AoE) experiences a blip. The kernel doesn't get a timely response and remounts the FS read-only. On the root volume your only real option is to reboot, I believe. Auxiliary filesystems can be remounted with 'mount -o remount,rw /filesystem'.

Shazan · October 2017

Although doing it in single user mode is definitely safer, I am not sure you are forced to reboot to fsck a read only volume, even if it is root. I've done it a few times and it worked without any issues.

The file system doesn't write to it, so it shouldn't cause any inconsistencies if fsck modifies it. When it finishes, you could try to remount it read-write and/or reboot.

WSS · October 2017

@Shazan said:
Although doing it in single user mode is definitely safer, I am not sure you are forced to reboot to fsck a read only volume, even if it is root. I've done it a few times and it worked without any issues.

The file system doesn't write to it, so it shouldn't cause any inconsistencies if fsck modifies it. When it finishes, you could try to remount it read-write and/or reboot.

Before systemd, you could call 'init S' to drop into single user mode. Now it's 'systemctl isolate rescue.target'

I personally only do such things after a reboot (after an initial fsck check tells me just how pooched things are) as I don't trust the filesystem to be safe until it can cleanly-as-possible replace the journal.

Falzo · October 2017

I'd say first thing to do if you have valuable data on it: backup/rsync everything out.

if you can create or export a disk image with the provider, maybe do that too.

only after that mess around ;-)

Amitz · October 2017

Thank you, guys!

I started the VPS in rescue mode, let fsck do its job, rebooted, and everything seems okay for the moment. I have also contacted the provider (netcup). They told me that they have checked the host and that everything is okay on their side.

Support answered on a Sunday morning, not bad. No idea whether they really checked the host node, but still. I will keep an eye on the issue over the next few days. Let's see if it becomes read-only again. I would then probably just re-install the thing. What would you do in case the problem arises a second time?

Enjoy your weekend
Amitz

JoeMerit · October 2017

A second time? cancel!

Amitz · October 2017

JoeMerit said: A second time? cancel!

Are you implying that the "read-only" issue is caused by something that my provider - and only my provider - is responsible for?

cociu · October 2017

Amitz said: I will do a cociu and see what the fsck will bring

100 EUR first, please! I hope you have at least 500 TB in this KVM; it will take around 3 weeks, tested and confirmed.

angstrom · October 2017

@Amitz said:

JoeMerit said: A second time? cancel!

Are you implying that the "read-only" issue is caused by something that my provider - and only my provider - is responsible for?

Yes! Blame it on your provider! ;-)

On a more serious note, if I may say so, you haven't given enough info for any of us to have a clue about what exactly went wrong. If it was simply a rare hiccup, then you should be okay now.

By the way, was this a fresh install of Debian 9 a few months ago, or was it a dist-upgrade from Debian 8? If the latter, was Debian 8 a fresh install or a dist-upgrade from Debian 7?

Amitz · October 2017

I will try to provide more in-depth information if it happens again!
That was a fresh Debian 9 install, a few months ago - no upgrade from a previous version.

raindog308 · October 2017

centos thx?

lion · October 2017

@raindog308 said:
centos thx?

debian, thx

angstrom · October 2017

@raindog308 said:
centos thx?

Live free or die.

Oh, wait, that won't help us choose between Debian and CentOS.

Maounique · October 2017

angstrom said: that won't help us choose between Debian and CentOS.

Actually it will, at least for some

angstrom · October 2017

@Maounique said:

angstrom said: that won't help us choose between Debian and CentOS.

Actually it will, at least for some

So much the better. :-)

bsdguy · October 2017

@Amitz said:
Support answered on a Sunday morning, not bad.

From what I know netcup isn't shitty at all; they are not at all a small operation.

Let's see if it becomes read-only again. I would then probably just re-install the thing. What would you do in case the problem arises a second time?

With a cheap low end provider and ovz I'd cancel and go elsewhere. With a better provider like netcup, however, I'd seriously investigate what could be wrong on my side, too.

Btw, debian was my usual linux distro for years (oh well, for the few cases I used linux; but I frequently advised clients to go linux at least in their backends). After they went the systemd route I forgot about them. systemd just isn't acceptable and if I had a problem on a debian vps systemd would always be my first suspicion. If poettering happened to be killed they should write "finally a problem solved" on his tombstone.

angstrom · October 2017

@angstrom said:

@Maounique said:

angstrom said: that won't help us choose between Debian and CentOS.

Actually it will, at least for some

So much the better. :-)

If time-tested stability is the criterion, one might argue that CentOS 6 or 7 is a better/safer choice than Debian 9.

At the same time, I don't have the impression that the problem that @Amitz experienced with Debian 9 is a common issue, judging by the usual mailing lists, etc.

raindog308 · October 2017

@bsdguy said:
I frequently advised clients to go linux at least in their backends).

"Take your penguin and shove it up your backend!"

raindog308 · October 2017

angstrom said: If time-tested stability is the criterion, one might argue that CentOS 6 or 7 is a better/safer choice than Debian 9.

I was really just teasing @Amitz as I think he's the origin of "debian thx" :-)
