Comments
I've been using a 5950X for some time and never had issues. Also, does it reboot or just freeze? If it freezes, make sure to check that it's not related to the NVMe drive.
Do check the IPMI logs; it may be a power issue or a loose power cable.
Also try the default 4.x kernel that comes with AlmaLinux and see whether the issue persists.
There was a similar topic here: https://lowendtalk.com/discussion/185740/hetzner-ax101-ax102-spontaneously-reboots
I've seen that topic, but nothing much was solved there. Yes, I added this, but it didn't help: GRUB_CMDLINE_LINUX_DEFAULT="consoleblank=0 nomodeset noapic pci=assign-busses apicmaintimer idle=poll reboot=cold,hard".
Premium water cooling is used and that didn't help either. There are no network issues at all.
Have you updated GRUB afterwards?
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX=GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/almalinux-swap rd.lvm.lv=almalinux/root rd.lvm.lv=almalinux/swap rhgb quiet consoleblank=0 nomodeset noapi$
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true
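After regenerating the config and rebooting, it can help to confirm that the running kernel actually received the new parameters. The following is a minimal sketch; `has_kernel_arg` is a hypothetical helper name, and it assumes the running kernel's command line is readable from `/proc/cmdline`.

```shell
# Return success if the given kernel argument appears on the command line.
# $1: argument to look for, e.g. "noapic" or "idle=poll"
# $2: file holding the command line (defaults to the running kernel's)
has_kernel_arg() {
    arg="$1"
    cmdline_file="${2:-/proc/cmdline}"
    # The command line is a space-separated list of tokens.
    for token in $(cat "$cmdline_file"); do
        [ "$token" = "$arg" ] && return 0
    done
    return 1
}

# Example use after a reboot:
#   has_kernel_arg noapic && echo "noapic is active" || echo "noapic is missing"
```

If a parameter is missing, the edit did not reach the boot entry. Since the config above sets GRUB_ENABLE_BLSCFG=true, the boot entries are BLS snippets, and depending on the release grub2-mkconfig may not rewrite their arguments. On RHEL-family systems, `grubby --update-kernel=ALL --args="..."` is another way to add arguments; whether it is needed here is an assumption, not something the thread confirms.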
Oh, it seems ChatGPT didn't help me much, and I've now realised I wrote it wrong. Can anyone advise? Here's what I originally had:
GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/almalinux-swap rd.lvm.lv=almalinux/root rd.lvm.lv=almalinux/swap rhgb quiet"
How do I add these lines correctly?
GRUB_CMDLINE_LINUX_DEFAULT="consoleblank=0 nomodeset noapic pci=assign-busses apicmaintimer idle=poll reboot=cold,hard"
Am I right in thinking it should be like this?
GRUB_CMDLINE_LINUX="consoleblank=0 nomodeset noapic pci=assign-busses apicmaintimer idle=poll reboot=cold,hard=crashkernel=auto resume=/dev/mapper/almalinux-swap rd.lvm.lv=almalinux/root rd.lvm.lv=almalinux/swap rhgb quiet"
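A minimal sketch of how the two sets of parameters could be combined, assuming the goal is simply to append the extra parameters to the original line. Kernel parameters are separated by spaces inside a single pair of quotes, so nothing is joined with `=`; the `hard=crashkernel=auto` in the line above would fuse two parameters into one. Whether these parameters actually stop the reboots is not established in this thread.

```shell
# /etc/default/grub (read as shell variable assignments by grub2-mkconfig)
# Original parameters first, extra parameters appended, all space-separated.
GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/almalinux-swap rd.lvm.lv=almalinux/root rd.lvm.lv=almalinux/swap rhgb quiet consoleblank=0 nomodeset noapic pci=assign-busses apicmaintimer idle=poll reboot=cold,hard"
```

After editing, regenerate the config with grub2-mkconfig -o /boot/grub2/grub.cfg and reboot.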
I would suggest running these tests to diagnose hardware issues:
https://help.ovhcloud.com/csm/en-dedicated-servers-hardware-diagnostics?id=kb_article_view&sysparm_article=KB0043506
Maybe start with the memory test. In my experience, it's one of the most common causes of random reboots.
My bet is on the NVMe as well.
Do your memory modules have heat sinks? If so, they also require airflow. ECC memory would save you a week of diagnostic work here.
Is it a Gigabyte motherboard? If so, it's a known issue. Nothing will really help; it's random.
It's highly unlikely that the issue is related to memory errors.
I've had a similar issue on a different CPU and motherboard, and it turned out to be the PSU. The machine would lose power completely, but only for a very brief moment, and then boot back up so quickly that it looked like it had just rebooted.
Why?
I would think it has something to do with hardware if there are no relevant logs leading up to the reboot.
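One way to test the "no logs before the reboot" observation is to check whether each boot was preceded by a recorded shutdown. The sketch below assumes the input has the shape of `last -x reboot shutdown` output: newest entry first, with `reboot` or `shutdown` in the first column. `count_unclean_boots` is a hypothetical helper name, not an existing tool.

```shell
# Count boots that were not preceded by a recorded shutdown.
# Reads `last -x reboot shutdown` style lines on stdin (newest first).
# Two consecutive "reboot" records mean the older session ended without
# a clean shutdown (power loss, hard reset, or a crash).
count_unclean_boots() {
    awk '
        $1 == "reboot"   { if (prev == "reboot") unclean++; prev = "reboot" }
        $1 == "shutdown" { prev = "shutdown" }
        END { print unclean + 0 }
    '
}

# Example use on a live system:
#   last -x reboot shutdown | count_unclean_boots
```

A non-zero count fits a power loss or hard reset rather than a reboot started by the OS. If the journal is persistent, `journalctl -b -1 -e` shows the last messages logged before the most recent reboot.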
Just personal experience. We've built hundreds of Ryzen machines, AM4 and AM5, on server boards with non-ECC RAM. We've never had bad memory cause an issue. Bad motherboards on these Gigabyte / ASRock Rack builds are very common and often cause issues exactly like what OP is describing.
The problem was related to CPU virtualisation. One hosting provider gave us a command that solved it, but unfortunately they asked us not to publish the solution. Oh, that competition.
Actually it's all AMD's fault. How could they not anticipate that people would run virtual machines on this processor?
Likely not competition, I'd guess it's an issue on their side if they don't want it published.
I and many others run lots of virtual machines on AMD Ryzen, never a problem.
This is a problem with the latest Ryzen series and it is very prevalent. If you run KVM machines on a dedicated server from Hetzner with a Ryzen 7950X, they will crash.
Weird, never had a problem, but I don't run anything on Hetzner.
I have 3 7950XD servers from Hetzner running KVM VPS and they haven't crashed a single time.
@Dessgun is the crash related to nested virtualization?
Yes, partially
I have a 7950X3D. I also experience random host reboots, mostly when I play certain games in the VM. It only happens if nested virtualization is enabled and the Win11 guest enables virtualization-based security. If I disable nested virtualization, the host is stable.
The motherboard is an ASUS TUF X670E-Plus with 2x32GB ECC RAM.
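For anyone who wants to check the nested virtualization setting described above, here is a minimal sketch. It assumes an AMD host using the `kvm_amd` module, which exposes the setting at `/sys/module/kvm_amd/parameters/nested`. `nested_state` is a hypothetical helper name. This only reflects the one report above; it is not the undisclosed fix mentioned earlier in the thread.

```shell
# Print whether nested virtualization is enabled for KVM.
# $1: path to the module parameter (defaults to kvm_amd's)
nested_state() {
    param_file="${1:-/sys/module/kvm_amd/parameters/nested}"
    if [ ! -r "$param_file" ]; then
        echo "unknown"
        return 1
    fi
    # The parameter reads as 1/0 or Y/N depending on module and kernel.
    case "$(cat "$param_file")" in
        1|Y|y) echo "enabled" ;;
        0|N|n) echo "disabled" ;;
        *)     echo "unknown" ;;
    esac
}

# Example use on the host:
#   nested_state
```

To turn it off persistently, a line such as `options kvm_amd nested=0` in a file under /etc/modprobe.d/ takes effect the next time the module is loaded, so all VMs have to be stopped first or the host rebooted.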