Well, that's not good. Why did my VPS eat itself?

raindog308 Administrator, Veteran

I had some good news and some good news and some bad news today.

The good news is that I found out why one of my web apps isn't working.

The other good news is that I found my backups are solid as I was able to restore it elsewhere.

The bad news:

So this VPS (Deb 11) has been running for a while and the only things on it are some password-protected web apps. It only accepts ssh keys, and while I'm sure the FSB could break into it, it really didn't have much of an attack surface, but anything's possible I guess.

I'm kind of curious what would have broken it...? I'm assuming something caused a reboot at which point whatever was broken in grub became apparent.

Off to rescue mode...


Comments

  • yoursunny Member, IPv6 Advocate

    Your self-cannibalism has been doubled.

  • FrankZ Veteran

    @raindog308 said: I'm kind of curious what would have broken it...? I'm assuming something caused a reboot at which point whatever was broken in grub became apparent.

    In rescue mode I would check for other disk corruption.
    The reason for your failure to boot may be as simple as the provider having a disk or RAID controller failing on the node, as opposed to you being hacked.

  • Quite a few Google hits on this one and it was in my search history too (!). Are you using unattended upgrades? Seems like a grub update failed. If you want to troubleshoot, you could try:

    grub-install --debug

    ...after restoring to one of your pre-bad-news backups :smile:

  • jackb Member, Host Rep
    edited April 2022

    @FrankZ said:
    In rescue mode I would check for other disk corruption.
    The reason for your failure to boot may be as simple as the provider having a disk or RAID controller failing on the node, as opposed to you being hacked.

    While worth checking, this one usually isn't related to being hacked or to data loss, drive failures, etc.

    Usually it's an OS update where grub failed to reinstall correctly. Regenerating the initramfs and reinstalling grub from a live CD should resolve the issue in most cases.

    I've seen a few customers get this upgrading from Debian 10 to 11.

  • rm_ IPv6 Advocate, Veteran

    @raindog308 said: I'm kind of curious what would have broken it...?

    Did you try Googling for the error message? https://www.google.com/search?q=symbol+grub_calloc+not+found
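
For context on that error: `symbol grub_calloc not found` usually means the GRUB boot code on disk and the modules under /boot/grub come from different versions, which fits jackb's point about grub failing to reinstall during an update. One commonly reported cause on Debian is that the `grub-pc/install_devices` debconf setting points at a device that no longer exists, so upgrades refresh the modules but never rewrite the boot sector. A rough sketch for checking that follows. The function name `check_install_devices` is made up for illustration, and nothing in this thread confirms this was the cause on raindog308's VPS.

```shell
#!/bin/sh
# Reads `debconf-show grub-pc` output on stdin and reports, for each device
# listed in grub-pc/install_devices, whether that path currently exists.
# check_install_devices is a hypothetical helper name, not a Debian tool.
check_install_devices() {
    # Keep only the install_devices value, split it on commas, one path per line.
    sed -n 's/^[* ]*grub-pc\/install_devices: *//p' | tr ',' '\n' |
    while read -r dev; do
        # Skip empty entries (for example when no device is configured).
        [ -z "$dev" ] && continue
        if [ -e "$dev" ]; then
            echo "ok: $dev"
        else
            echo "missing: $dev"
        fi
    done
}

# Usage (as root, on the installed system or inside a chroot of it):
#   debconf-show grub-pc | check_install_devices
```

If a device comes back as missing, `dpkg-reconfigure grub-pc` lets you pick the disk GRUB should actually be installed to.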

  • Maounique Host Rep, Veteran

    @jackb said: I've seen a few customers get this upgrading from Debian 10 to 11.

    Not only that exact upgrade, though the message is specific. There are many more situations like this with GRUB or other updates failing; I had to rescue data for a few people on Xen-PV and OVZ lately due to botched updates. Maybe the overall quality of distros is slipping: compared to last year, I think incidents like this have tripled.

  • jsg Member, Resident Benchmarker

    Oh, the pleasures of grub ...

  • raindog308 Administrator, Veteran

    Unfortunately, booting rescue and then doing a grub-install didn't resolve the issue, even though grub said it was successful.

    Unfortunately this is solus so getting in and out of rescue is hit or miss, and if something doesn't work right there's a long timeout. Probably time to trash and move.

  • HostSlick Member, Patron Provider
    edited April 2022

    Is it a KVM VPS by any chance?
    Maybe FileStorage based (QCOW / RAW)??

  • raindog308 Administrator, Veteran

    @HostSlick said:
    Is it a KVM VPS by any chance?

    Indeed it is, though I installed from a template.

  • Fucking grub. Probably the biggest cause for me of dying servers over the years.

  • jackb Member, Host Rep
    edited April 2022

    @raindog308 said:
    Unfortunately, booting rescue and then doing a grub-install didn't resolve the issue, even though grub said it was successful.

    Did you chroot into the normal system (grub-install from rescue without chrooting will fail to boot) and also regenerate initramfs?

    I'd recommend using the grml live CD rather than solus's rescue system. grml has a shortcut command, grml-chroot, which saves having to manually do the various mounts you need when doing this sort of thing.
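
For anyone without grml handy, the manual version of what jackb describes looks roughly like the sketch below. Assumptions not from the thread: the root filesystem is on /dev/vda1 with no separate /boot, the VPS boots via BIOS with grub-pc, and the target is mounted at /mnt. The `run` helper and the `DRY_RUN` switch are only there for illustration, so the script prints the commands by default instead of touching the disk.

```shell
#!/bin/sh
# Sketch of a manual chroot repair from a rescue or live environment.
# ASSUMPTIONS: root fs on /dev/vda1, no separate /boot, BIOS boot (grub-pc).
ROOT_PART="${ROOT_PART:-/dev/vda1}"
BOOT_DISK="${BOOT_DISK:-/dev/vda}"
TARGET="${TARGET:-/mnt}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repair_grub() {
    # Mount the installed system and the pseudo-filesystems grub-install needs.
    run mount "$ROOT_PART" "$TARGET"
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$TARGET/$fs"
    done
    # Inside the chroot: rebuild initramfs, rewrite the boot sector, regenerate grub.cfg.
    run chroot "$TARGET" update-initramfs -u -k all
    run chroot "$TARGET" grub-install "$BOOT_DISK"
    run chroot "$TARGET" update-grub
    # Unmount in reverse order.
    for fs in sys proc dev; do
        run umount "$TARGET/$fs"
    done
    run umount "$TARGET"
}

repair_grub
```

Run it as-is to review the command list, then rerun with `DRY_RUN=0` as root once the device names match your VPS.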

  • Prime404 Member
    edited April 2022

    @raindog308 said:
    Unfortunately, booting rescue and then doing a grub-install didn't resolve the issue, even though grub said it was successful.

    This issue mostly occurs when a system has been upgraded from an earlier version of, e.g., Debian.

    What you may wanna try is the following commands from a chrooted environment and fingers crossed it solves the problem:
    apt-get remove 'grub*'
    apt-get install grub-pc
    grub-install /dev/vda

  • @Prime404 said:

    @raindog308 said:
    Unfortunately, booting rescue and then doing a grub-install didn't resolve the issue, even though grub said it was successful.

    This issue mostly occurs when a system has been upgraded from an earlier version of, e.g., Debian.

    What you may wanna try is the following commands from a chrooted environment and fingers crossed it solves the problem:
    ```
    apt-get remove 'grub*'
    apt-get install grub-pc
    grub-install /dev/vda
    ```

    Yes, cloud-init exacerbates this issue, by auto-upgrading all your packages.
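
If you want to see whether cloud-init is set to upgrade packages on a given template, something like this sketch can help. It assumes the config lives under /etc/cloud (the usual location) and only matches a literal `package_upgrade: true` line; the function name `find_cloud_upgrades` is made up for illustration.

```shell
#!/bin/sh
# Lists cloud-init config lines that enable package upgrades.
# find_cloud_upgrades is a hypothetical helper; the optional argument
# overrides the config directory (default: /etc/cloud).
find_cloud_upgrades() {
    dir="${1:-/etc/cloud}"
    # -r: recurse, -n: show line numbers, -E: extended regex.
    # Exit status is non-zero when nothing matches.
    grep -rnE '^[[:space:]]*package_upgrade[[:space:]]*:[[:space:]]*true' "$dir" 2>/dev/null
}

# Usage:
#   find_cloud_upgrades || echo "no package_upgrade: true found"
```

No output means this sketch found nothing, but it does not rule out upgrades triggered some other way, such as unattended upgrades or user-data passed in by the provider.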
