Random server freeze
Hi
One of my dedicated servers freezes suddenly every few weeks, no ping, no SSH, all services (webserver, ftp, etc) are down.
When I connect via KVM it is frozen and I can't do anything but I see a weird message from UFW, can't scroll up or down...
Restarting solves the issues.
The screen I see on KVM (covered some IPs just in case)

This is just some UFW log output, which I believe is irrelevant...
Any help or suggestions to figure out the cause of this issue are much appreciated, thanks!

Comments
@yoursunny in case you have anything to suggest...
Is this a VPS or a Dedicated?
Does it have IPMI/IDRAC?
dedi, yes, but frozen as well
I wonder...
Are your logs or server storage full?
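A quick way to answer this is to check both disk space and inode usage, since running out of either can behave like a full disk. The following is a minimal sketch using only the Python standard library; the function names and the `/var/log` mount point are assumptions for illustration, not something from the thread.

```python
import os
import shutil

def usage_percent(path="/"):
    """Return how full the filesystem holding `path` is, as a percentage of bytes."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total

def inode_percent(path="/"):
    """Return the percentage of inodes in use, or None if the filesystem reports none."""
    st = os.statvfs(path)
    if st.f_files == 0:  # some filesystems do not report an inode count
        return None
    return 100.0 * (st.f_files - st.f_ffree) / st.f_files

if __name__ == "__main__":
    # "/var/log" is an assumed location; adjust to wherever your logs live.
    for mount in ("/", "/var/log"):
        if os.path.exists(mount):
            print(f"{mount}: {usage_percent(mount):.1f}% space, inodes: {inode_percent(mount)}")
```

The numbers can differ slightly from what `df` prints because of reserved blocks, so treat them as a rough indicator.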
nope...
Any chance you're using Proxmox?
I had similar problems on two of my private servers. Upgrading to the opt-in 5.19 kernel fixed the problem, at least so far.
Have you checked your power supply? I got a similar error on some old hosts and it was fixed just by changing the PSU, but I'm not sure it's the same thing.
I had a similar issue recently; my server was overheating. Try checking your hardware status.
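For the overheating angle, one rough way to get temperatures without extra tools is the kernel's thermal sysfs interface, which reports values in millidegrees Celsius. This is only a sketch: many servers expose few or no thermal zones there, in which case `sensors` from lm-sensors or the BMC's sensor readings are the better source. The function name is made up for illustration.

```python
from pathlib import Path

def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"zone (type)": temperature in degrees C} for every readable thermal zone."""
    root = Path(base)
    if not root.is_dir():
        return {}
    temps = {}
    for zone in sorted(root.glob("thermal_zone*")):
        try:
            millideg = int((zone / "temp").read_text().strip())
            kind = (zone / "type").read_text().strip()
        except (OSError, ValueError):
            continue  # zone without a readable sensor; skip it
        temps[f"{zone.name} ({kind})"] = millideg / 1000.0
    return temps

if __name__ == "__main__":
    for name, celsius in read_thermal_zones().items():
        print(f"{name}: {celsius:.1f} C")
```

Logging this every minute to a file would show whether temperatures were climbing right before a freeze.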
Which OS is it, and which kernel? I have mostly had such issues on AMD servers, related to the kernel. Updating to the lt or ml kernel resolved the issue.
@SloMail OS is Debian 11
I tried kernel updates, etc. Nothing worked, and I gave up on debugging after hours. I figured that if it is the power supply ( @CiprianoOscar ), overheating ( @vba ), or any other HW issue, it is out of my hands.
So I just asked to have the server replaced while keeping the old drives. This will probably eliminate the random freeze issue.
Only problem is: the new server with the old drives refuses to boot, for a different reason (disk mduuid not found even after re-installing grub2), but that's another story, I will try to solve it hopefully...
If you were still troubleshooting, you'd get all the temperatures available to the motherboard, CPU and hard drives.
You'd also look at the system log for errors and warnings. Run smartctl on the drives: at minimum the short test, and the long test if possible.
You'll also want to note if the freezes happen at specific intervals.
I'd expect reboots if it was PSU.
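To check whether the freezes happen at specific intervals, you could note the time of each freeze and look at how much the gaps between them vary. Here is a small sketch of that idea; the function names, the 10% tolerance, and the timestamps in the usage example are all made up for illustration.

```python
from datetime import datetime
from statistics import mean, pstdev

def freeze_gaps(timestamps):
    """Return the gaps, in hours, between consecutive freeze times (ISO 8601 strings)."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]

def looks_periodic(timestamps, tolerance=0.1):
    """True if the gaps vary by no more than `tolerance` relative to their mean."""
    gaps = freeze_gaps(timestamps)
    if len(gaps) < 2:
        return False  # too few events to say anything
    avg = mean(gaps)
    if avg == 0:
        return False
    return pstdev(gaps) / avg <= tolerance

if __name__ == "__main__":
    # Hypothetical freeze times, not taken from the thread.
    freezes = ["2023-01-01T03:00:00", "2023-01-20T14:00:00", "2023-02-02T09:00:00"]
    print(freeze_gaps(freezes), looks_periodic(freezes))
```

Evenly spaced freezes would point toward something scheduled (a cron job, a backup, log rotation), while irregular gaps fit hardware or kernel problems better.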
You can install the OS on a new, separate drive in the server and mount the previous drive.
Did that... nothing conclusive
Nope, not regular/specific intervals...
I agree with you if it was PSU, I would expect a bigger sign (power related) than just a freeze on the same screen. (server reboot, power off, etc)
Do a memtest to make sure your DIMMs are ok. Preferably more than 1 pass.
If possible, try a different CPU too, to see if it's replicated there. Unfortunately, in my experience, the root cause of this kind of issue is not easy to identify.
The primary two causes that I can assume for the issue -
Maybe you can try reinstalling your OS.