Is this a French thing or just OVHCloud shenanigans?

  1. You have Proxmox 7.4 running on an OVHCloud dedicated server.
  2. Suddenly your server crashes one night and you get an error on boot: Unable to locate IOAPIC for GSI -1
  3. You boot into rescue and check lsblk

Everything looks fine. You can mount/chroot the data and the LVM with:

ls /dev/mapper/*

mount /dev/mapper/lvm-data /mnt

and the system with:

mount /dev/mapper/controller /mnt

Okay, proceed to unmount. You can take a backup if needed.
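
For anyone following along, a backup from rescue mode could look roughly like this (the mount point and the remote host below are placeholders, not from my actual setup):

mount -o ro /dev/mapper/lvm-data /mnt
rsync -aAXH --numeric-ids /mnt/ user@backuphost:/backups/ovh-server/   # copy everything off-server
umount /mnt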

Proceed to open a ticket:

Hi, my server crashed recently, and I get this error on boot: Unable to locate IOAPIC for GSI -1

I have checked that both disks show up and mount normally in rescue mode.

I have important data on the server, so I'm only asking what this error is, how to solve it, and how to boot back into the current system.

The server is running default Proxmox VE 7.4 template.

OVHCloud asks permission to check the issue via ticket.

No issue with the hardware... Okay, back to square one: reboot into rescue and start taking backups.

All partitions gone after OVH intervention.

Did they just nuke every fucking thing there was, when I specifically asked them to just check it? Or was there some kind of French-English language barrier? Or am I just missing something here, and all the data is fine even though all partitions are now gone from lsblk?


Comments

  • stefeman Member
    edited September 2023

    I'm not even mad. This is so dumb it's almost amazing. Question is, can this be salvaged? Any advice is appreciated, lmao.

  • stefeman Member
    edited September 2023

    Or maybe I'm just a fucking idiot myself. lsmod shows an empty response.

    Something is wrong with this rescue mode.

    Trying to enable mdraid, LVM, and device-mapper with modprobe next.
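
    Roughly what I'm about to try (the module names are the usual generic ones, assuming the rescue kernel actually ships them):

    modprobe dm_mod    # device-mapper, which LVM sits on top of
    modprobe raid0     # mdraid personalities used by this box
    modprobe raid1
    vgchange -ay       # then activate any LVM volume groups that get detected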

  • This is an unmanaged dedicated server; they do not boot into your OS to check stuff. The most they'll do is check the hardware from rescue mode.

  • jackb Member, Host Rep

    pvscan
    vgscan
    lvscan
    mdadm --assemble --scan

    These are four commands you'll probably want to look up if you're not already familiar with them.
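
    A rough sequence from rescue once the modules are loaded (nothing OVH-specific here, just the generic tools):

    mdadm --assemble --scan   # assemble any software RAID arrays found in the superblocks
    pvscan                    # look for LVM physical volumes
    vgscan                    # look for volume groups
    lvscan                    # list logical volumes and whether they are active
    vgchange -ay              # activate the volume groups so /dev/mapper/* shows up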

  • stefeman Member
    edited September 2023

    They had an outdated rescue image loaded after the intervention, without the correct modules. Everything shows up again now.

    I will take backups of the qcow2 disk image and then try to figure out what's wrong with the boot error: Unable to locate IOAPIC for GSI -1

  • @jackb said:
    pvscan
    vgscan
    lvscan
    mdadm --assemble --scan

    These are four commands you'll probably want to look up if you're not already familiar with them.

    Any ideas how to proceed? xD

  • jackb Member, Host Rep

    @stefeman said:
    Any ideas how to proceed? xD

    Have you found your root and boot partitions?

    If so, mount root to /mnt and boot to /mnt/boot, then follow the link below to chroot into your system.

    https://superuser.com/a/417004

    Then reinstall grub following the instructions for your particular OS; the exact steps typically vary per distro.
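
    A minimal sketch of that, assuming the root filesystem turns out to be /dev/md126 and /boot is /dev/md127 (adjust to whatever lsblk actually shows on your box):

    mount /dev/md126 /mnt          # root filesystem
    mount /dev/md127 /mnt/boot     # boot partition
    mount -t proc /proc /mnt/proc
    mount --rbind /sys /mnt/sys
    mount --rbind /dev /mnt/dev
    chroot /mnt /bin/bash          # now grub can be reinstalled from inside the system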

  • mount /dev/md126 /mnt

    ls /mnt
    bin dev home lib32 libx32 media opt root sbin sys usr
    boot etc lib lib64 lost+found mnt proc run srv tmp var

  • cat /mnt/etc/fstab
    UUID=766bffc2-b1f4-440c-8d38-525cc3f739cd / ext4 defaults 0 1
    UUID=941fa5a7-979d-4486-bb9b-823f313c64f4 /boot ext4 defaults 0 0
    LABEL=EFI_SYSPART /boot/efi vfat defaults 0 1
    UUID=95d9af38-6faf-4b2d-99f9-46a3ce548de1 /var/lib/vz ext4 defaults 0 0
    UUID=8d444ae4-9813-46ca-a816-3e387f5c09d7 swap swap defaults 0 0
    UUID=682703b2-a54d-46ea-bf14-13905848b9ed swap swap defaults 0 0

  • stefeman Member
    edited September 2023

    mount /dev/md127 /mnt/boot
    mount /dev/nvme1n1p1 /mnt/boot/efi

    This should be it for mounting the boot partitions.

  • No errors so far:

    mount -t proc /proc /mnt/proc/
    mount --rbind /sys /mnt/sys/
    mount --rbind /dev /mnt/dev/

    Also no errors so far.

    chroot /mnt

    Now I'm in the chroot.

  • Did you recently update/change the Linux kernel? It seems to be a kernel issue.

    Thanked by 1stefeman
  • Maounique Host Rep, Veteran

    I would be very careful about tickets mentioning disks, if I were you. They might swap the disks first and check them later, if that.
    You never know how they will react; the randomness is very high with those people.

    Thanked by 1stefeman
  • stefeman Member
    edited September 2023

    @Val said:
    Did you recently update/change the Linux kernel? It seems to be a kernel issue.

    Possibly. If it was via a normal update/upgrade, how can I check this? (see the sketch at the end of this post)

    Also,

    ls /etc/pve/

    is empty..

    ls /var/lib/pve-cluster/
    config.db config.db-shm config.db-wal
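
    A quick way to check for a recent kernel change from the chroot, assuming the stock Debian apt/dpkg logs are still in place (nothing Proxmox-specific beyond the pve-kernel package name):

    grep -i 'pve-kernel\|linux-image' /var/log/dpkg.log           # recent kernel package installs/upgrades
    zgrep -i 'pve-kernel' /var/log/apt/history.log* 2>/dev/null   # apt history, including rotated logs
    ls -lt /boot/vmlinuz-*                                        # which kernel images exist and when they were written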

  • jackb Member, Host Rep

    @stefeman said:
    No errors so far:

    mount -t proc /proc /mnt/proc/
    mount --rbind /sys /mnt/sys/
    mount --rbind /dev /mnt/dev/

    Also no errors so far.

    chroot /mnt

    now im at chroot

    Perfect. Reinstall grub and try a reboot.
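
    For a UEFI Debian/Proxmox install that usually boils down to something like this from inside the chroot (the ESP path comes from your fstab; the bootloader id is an assumption):

    grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=proxmox
    update-grub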

    Thanked by 1stefeman
  • @jackb said:

    @stefeman said:
    No errors so far:

    mount -t proc /proc /mnt/proc/
    mount --rbind /sys /mnt/sys/
    mount --rbind /dev /mnt/dev/

    Also no errors so far.

    chroot /mnt

    now im at chroot

    Perfect. Reinstall grub and try a reboot.

    root@rescue-customer-ca:/# update-grub
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-5.15.108-1-pve
    Found initrd image: /boot/initrd.img-5.15.108-1-pve
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    Warning: os-prober will not be executed to detect other bootable partitions.
    Systems on them will not be added to the GRUB boot configuration.
    Check GRUB_DISABLE_OS_PROBER documentation entry.
    Adding boot menu entry for UEFI Firmware Settings ...
    done

  • stefeman Member
    edited September 2023

    Could you tell me the exact commands I should try?

    I tried:

    adding "noapic" to the GRUB_CMDLINE_LINUX_DEFAULT variable in /etc/default/grub to disable the APIC

    But I get the above stuff.

    root@rescue-customer-ca:/# update-grub
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-5.15.108-1-pve
    Found initrd image: /boot/initrd.img-5.15.108-1-pve
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    Warning: os-prober will not be executed to detect other bootable partitions.
    Systems on them will not be added to the GRUB boot configuration.
    Check GRUB_DISABLE_OS_PROBER documentation entry.
    Adding boot menu entry for UEFI Firmware Settings ...
    done
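
    For reference, the change in /etc/default/grub is just this one line (assuming the default value was "quiet"), followed by update-grub to regenerate the config:

    GRUB_CMDLINE_LINUX_DEFAULT="quiet noapic"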

  • -rw------- 1 root root 40K Sep 20 12:18 /var/lib/pve-cluster/config.db

    I probably have my configs right here, since /etc/pve is empty.

  • stefeman Member
    edited September 2023

    @stefeman said:
    May you tell me the exact commands I could try?

    I tried:

    add "noapic" on the variable GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub to disable APIC

    But I get the above stuff.

    root@rescue-customer-ca:/# update-grub
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-5.15.108-1-pve
    Found initrd image: /boot/initrd.img-5.15.108-1-pve
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    Warning: os-prober will not be executed to detect other bootable partitions.
    Systems on them will not be added to the GRUB boot configuration.
    Check GRUB_DISABLE_OS_PROBER documentation entry.
    Adding boot menu entry for UEFI Firmware Settings ...
    done

    I assume this works as intended.

    Will try to reboot now.

    The config files should be automatically mounted by pve-cluster at boot.

  • stefeman Member
    edited September 2023

    Currently it boots this far and gets stuck:

    The cursor dot at the bottom keeps flashing though, and I don't have RAID6, so I wonder why it's saying that.

    I also tried removing the noapic grub command, but the old GSI error comes back.

  • @stefeman said:
    mount /dev/md127 /mnt/boot
    mount /dev/nvme1n1p1 /mnt/boot/efi

    This should be it for mounting the boot partitions.

    I wonder if the mount /dev/nvme1n1p1 /mnt/boot/efi command was something I was not supposed to run, as the /dev/md127 (/boot) partition already contained an efi folder inside.

  • jackb Member, Host Rep
    edited September 2023

    @stefeman said:

    @jackb said:

    @stefeman said:
    No errors so far:

    mount -t proc /proc /mnt/proc/
    mount --rbind /sys /mnt/sys/
    mount --rbind /dev /mnt/dev/

    Also no errors so far.

    chroot /mnt

    now im at chroot

    Perfect. Reinstall grub and try a reboot.

    root@rescue-customer-ca:/# update-grub
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-5.15.108-1-pve
    Found initrd image: /boot/initrd.img-5.15.108-1-pve
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..

    I've had this error before from degraded/unsynced mdadm arrays. Check the contents of /proc/mdstat and see if your arrays are intact.

  • @jackb said:

    @stefeman said:

    @jackb said:

    @stefeman said:
    No errors so far:

    mount -t proc /proc /mnt/proc/
    mount --rbind /sys /mnt/sys/
    mount --rbind /dev /mnt/dev/

    Also no errors so far.

    chroot /mnt

    now im at chroot

    Perfect. Reinstall grub and try a reboot.

    root@rescue-customer-ca:/# update-grub
    Generating grub configuration file ...
    Found linux image: /boot/vmlinuz-5.15.108-1-pve
    Found initrd image: /boot/initrd.img-5.15.108-1-pve
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..

    I've got this error before from degraded/unsynced mdadm arrays. Check the contents of /proc/mdstat and see if your arrays are intact.

    root@rescue:~# cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
    md126 : active raid1 nvme0n1p2[0]
          1046528 blocks super 1.2 [2/1] [U_]
    
    md127 : active raid0 nvme0n1p3[0] nvme1n1p3[1]
          20953088 blocks super 1.2 512k chunks
    
  • jackb Member, Host Rep

    @stefeman said:
    root@rescue:~# cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
    md126 : active raid1 nvme0n1p2[0]
    1046528 blocks super 1.2 [2/1] [U_]

    md127 : active raid0 nvme0n1p3[0] nvme1n1p3[1]
    20953088 blocks super 1.2 512k chunks

    Sync the degraded arrays and then retry the grub reinstall from chroot as before.

    Thanked by 1stefeman
  • @jackb said:

    @stefeman said:
    root@rescue:~# cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
    md126 : active raid1 nvme0n1p2[0]
    1046528 blocks super 1.2 [2/1] [U_]

    md127 : active raid0 nvme0n1p3[0] nvme1n1p3[1]
    20953088 blocks super 1.2 512k chunks

    Sync the degraded arrays and then retry the grub reinstall from chroot as before.

    Any idea how? xD

  • jackb Member, Host Rep
    edited September 2023

    @stefeman said:
    Any idea how? xD

    You need to find which missing partition belongs in /dev/md126. Then mdadm --manage /dev/md126 --add <partition>

    I'd hazard a guess at nvme1n1p2, but you should confirm the size matches nvme0n1p2 / that the partition table of nvme0n1 matches nvme1n1.
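
    A rough sequence, assuming nvme1n1p2 really is the missing mirror half (check the layouts first):

    lsblk -o NAME,SIZE,TYPE /dev/nvme0n1 /dev/nvme1n1   # confirm the two partition layouts match
    mdadm --manage /dev/md126 --add /dev/nvme1n1p2      # re-add the missing member
    cat /proc/mdstat                                    # then watch the resync progress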

    Thanked by 1stefeman
  • Or just back up and reinstall?
    Have you also checked the state of both disks with smartctl?
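
    For NVMe drives that check would look something like this (assuming smartmontools, and optionally nvme-cli, are available in the rescue image):

    smartctl -a /dev/nvme0n1        # full SMART/health report for the first drive
    smartctl -a /dev/nvme1n1        # and for the second
    nvme smart-log /dev/nvme0n1     # alternative view via nvme-cli, if installed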

  • jackb Member, Host Rep

    @Val said:
    Or just backup and reinstall?
    Have you checked with smartctl the state of both disks as well?

    Honestly this sort of problem is recoverable most of the time. The procedure to recover it is worth learning for anyone using a bare metal system.

    Thanked by 2Val stefeman
  • @jackb said:

    @Val said:
    Or just backup and reinstall?
    Have you checked with smartctl the state of both disks as well?

    Honestly this sort of problem is recoverable most of the time. The procedure to recover it is worth learning for anyone using a bare metal system.

    I rebooted into recovery one final time; it seems like the lsblk output is different every time.

    cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
    md126 : active raid1 nvme0n1p2[0]
          1046528 blocks super 1.2 [2/1] [U_]
    
    md127 : active raid0 nvme0n1p3[0] nvme1n1p3[1]
          20953088 blocks super 1.2 512k chunks
    
    unused devices: <none>
    

    Would mdadm --manage /dev/md126 --add <partition> still apply?

    And thanks to everyone for the support so far.

  • I'm fairly sure that right now, md126 is the boot array and it needs to be on both drives.
