AsRock X570D4U random crashes
We've got a server that has been stable for the best part of a year but over the past 3 or 4 days has started to have random crashes. It will go down after a few hours and when checking the IPMI the console just shows "powered off". When we power on the server it boots normally to the OS and will be fine for a few hours until it happens again.
This is the server spec:
AlmaLinux 8.10
AsRock X570D4U board
Ryzen 5800x
2 x 16 GB RAM sticks
There is nothing written to the logs so it isn't a graceful shutdown.
All sensors are normal so it isn't overheating, the drives are fine and a RAM test doesn't bring back any errors.
Anyone ever come across this before with this board/CPU or have any suggestions before I bin it and move everything to a new box?

Comments
I can tell you, based on our experience, that it is the motherboard. We had many such issues with the same behaviour until the board died completely. Just replace the motherboard and you will be fine.
I haven't used this model, but I have used the ASRock BD650U and I've had to replace around 10 motherboards due to this exact issue. The experience is as @AlbaHost writes: it keeps happening, and one day the board will refuse to power on. In my experience this comes sooner rather than later.
Just to be safe:
Early on we had some X570D4U's exhibiting a symptom where they'd crash/lockup every 24-72hrs roughly. The above fixed it.
Similar to @labze, I think we've got around 20 B650D4U boards waiting for or in some stage of ARR RMA, all early variants. Those just die, never power back up (IPMI still works) and show code "00" on the debug LED.
One X570D4U ended up acting kind of similar, where it would crash/power off every few hours but would come back. In the end it was fixed by just recycling the board, as @AlbaHost said.
As crunchbits mentioned, the first thing to do is update the BIOS. That's what I did on my Ryzen 9 5950X servers, which are on the same board, and they haven't had issues.
So just to update - before swapping out the board completely we replaced the PSU cables and swapped the PSU ports. So far, to my surprise, it has been up for over 6 days with no more random crashes.
My tip at this point: don't buy ASRock Rack. We had far too many boards replaced due to quality issues and are now avoiding them like the plague.
Good point - I'm not a fan of ASRock (in general) either. The issue, though, is that AFAIK ASRock is the only one out there providing AMD-compatible MBs with remote access, something invaluable for a datacenter.
https://www.supermicro.com/en/products/motherboard/h13sae-mf
There is also a Supermicro motherboard.
Yeah, saw that recently, but haven't been able to get one. There are 2 suppliers here that say they can deliver, but when asked for a date, the answers turn out to be "we expect" and "weeks".
We have some Ryzen 7900s running on Supermicro boards and no problems at all in terms of random reboots, drives vanishing, etc. Unfortunately our older Ryzens are stuck on ASRock for now.
Along with Supermicro’s AM5 board, there are also some variants from MSI:
https://eps.msi.com/en/product/server-motherboards/D3051-D3051GB4N-10G
I’m curious if anyone has any experience with the MSI one.