Hetzner to replace motherboard on ALL their AMD servers. Will take over a year.
- Why are we replacing these mainboards?
There have unfortunately been an increasing number of unexplained incidents with mainboards on a number of servers from these models: AX42, AX52 and AX102. After a detailed analysis of the incidents which we did in cooperation with the manufacturer, it became clear that there is a design error on the mainboard for these models. We want to minimize the risk of this causing a problem on these servers, so we want to replace all affected mainboards. The process to replace these mainboards will begin in February 2025. Because many mainboards are affected, we estimate that it will take an entire year to replace all of them.

Comments
https://www.google.com/search?q=Hetzner+Motherboard+Replacements+for+AX42,+AX52+and+AX102+inurl:lowendtalk.com
https://lowendtalk.com/search?Search=There+have+unfortunately+been+an+increasing+number+of+unexplained+incidents+with
/Closed.
Very much NOT all their AMD servers; it's their Zen4-based ones (e.g. DDR5-RAM based).
Zen3 and earlier is fine.
Looks like just Ryzen Zen4 are impacted. AX162 (EPYC Zen4) appears not affected.
Good enough at best. The chickens are coming home to roost on their desktop-grade hardware strategy.
https://www.ubicloud.com/blog/debugging-hetzner-uncovering-failures-with-powerstat-sensors-and-dmidecode
Banning all the small-time customers and focusing on larger customers while peddling dubiously reliable hardware is one of the business moves of all time.
Hetzner is banning the small customers? Why would they do that?
They are a budget provider so they are trying to minimize the costs to keep prices for customers low.
I have a server that will be replaced on the 25th and will need to be down for 30-40 minutes.
Good to know that their AX162 platform has already iterated to v3, which is much more reliable now.
I hope their costs aren't rising and asrock is paying for the replacements and work hours.
I like to call it Karma.
https://www.ubicloud.com/blog/debugging-hetzner-uncovering-failures-with-powerstat-sensors-and-dmidecode
A nice write-up by ubicloud, but unfortunately they do not name the motherboards (and revisions). That would help in checking whether a newly ordered AX162 is already running on the newest mobo/BIOS/etc.
Btw: does anyone here still have an X account (I don't)? Could you ask them about the mobo models/revisions on X here: https://x.com/UbicloudHQ/status/1891482264036311428#m ? Thank you.
Must be some really bad issue with the motherboards for them to replace all of them, that's A LOT OF COST.
Downside of doing it custom, as the mobos are customized for their needs.
I wonder if these motherboards wind up in the refurb market afterwards ...
It's not only Hetzner having those problems, I see similar problems with Ryzen (and sometimes Epyc) builds at other providers.
But Hetzner is actually doing something about it.
(I still don't understand why they ditched the PX line with its good and trustworthy Xeon builds; I miss that product line dearly.)
Most probably they're using custom-made Hetzner only motherboards; if it were publicly available models, we would hear many more independent reports.
@webhorizon but I was talking about ubicloud and their great analysis. They could just do a dmidecode -t2 on the AX162-v3 (as they call it) the way they did with the borked mobo models before.
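For anyone who wants to check their own box, here is a rough sketch of pulling the baseboard fields out of `dmidecode -t 2` output with Python. The field names (Manufacturer, Product Name, Version, Serial Number) are the usual ones in the "Base Board Information" section; the sample text and the helper names `parse_baseboard` / `read_baseboard` are made up for illustration and are not real output from any Hetzner server.

```python
import subprocess

# Fields we care about in the "Base Board Information" section.
FIELDS = ("Manufacturer", "Product Name", "Version", "Serial Number")


def parse_baseboard(text):
    """Extract selected 'Key: value' fields from `dmidecode -t 2` style text."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Keep only the first occurrence of each field of interest.
        if sep and key in FIELDS and key not in info:
            info[key] = value.strip()
    return info


def read_baseboard():
    """Run dmidecode (needs root) and parse its baseboard section."""
    result = subprocess.run(
        ["dmidecode", "-t", "2"],
        capture_output=True, text=True, check=True,
    )
    return parse_baseboard(result.stdout)


# Hypothetical sample, NOT taken from a real server.
SAMPLE = """\
Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
\tManufacturer: ExampleVendor
\tProduct Name: EXAMPLE-BOARD
\tVersion: 1.00
\tSerial Number: ABC123
"""

print(parse_baseboard(SAMPLE))
```

On a real machine you would call `read_baseboard()` as root and compare the product name and version against whatever revision the provider says is fixed.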
Hetzner has had some pretty unstable models over the years. We had SX63 back when it launched, and the drives kept on dropping out and no longer showing until a system cold boot (power off and then power on) across multiple systems that we had.
Those are AsRock Rack Boards. Hetzner uses custom boards from AsRock Rack, where some components are rotated, but other than that it's still the AsRock Rack Board anyone can buy. Several other providers have the same issues with the same board.
Hi,
Yes, we also have exactly the same issue.
It's the Asrockrack B650 chip boards, produced between Q1 and Q3 2024 with pre-H4 serial numbers.
This was due to low quality metals used in ASRR boards in early builds. Supposedly fixed now.
I don't get why some people seem to be surprised or even angry.
Look, Hetzner / @Hetzner_OL grew as, and to a large degree still is, a "reasonably good quality for a low price" provider. As they are also a very big provider, it seems perfectly reasonable for them to save pennies (actually dollars) on mass equipment like mainboards, and/or to use somewhat customized mainboards. Asrock produces for many, many consumers, hence versatility is key for them. Hetzner, however, has a clear profile like "we don't need (just as an example) any graphics beyond VESA" or "we need x NVMe and y SATA ports", and in the relatively limited cases where they do need, say, a powerful GPU, they just put in a PCIe card. Also note that not having certain features (at least not as standard) not only saves cost but often also saves electric power.
And that's what most of their customers WANT: cheap products of reasonably good quality. Having their own (more or less) customized mainboards is a smart and helpful step towards that goal.
As you probably know I'm not a Hetzner fan - but neither am I against them. Plus I highly value fairness, so please let's look at what happened to them in a fair way.
Finally, at least nowadays, there simply is no certainty that hiccups (or even cluster fucks) don't happen. Even to the best. Well, this time it hit Asrock and in consequence some Hetzner product lines.
WHAT do you (the angry ones) WANT? Should Hetzner, like quite a few others, try to hush it up and/or throw a little rebate at you as a token? Or should a decent provider in such a situation recognize and factually solve the problem - as Hetzner chose to do?
IMO the latter is the only acceptable and responsible option and I say kudos to Hetzner for their way of handling the problem/situation!
If anything, if I were a customer -and- my server were a production machine, I would only have one simple question: Hetzner, would you please kindly either provide a temp "fallback" server for 3 days (the day before, the day of the board swap, and the day after) so as to make sure my production site/services stay fully available -or- swap me over to an already upgraded server (5 min downtime) instead of having 30 - 60 min downtime, and do it at a time of day when my server receives few requests per second?
I wish you a smooth migration to the new gen boards, Hetzner! And again, kudos for acting transparently and responsibly.
How has your personal experience been with the newer boards? I know you're a big Ryzen/ASRock shop, so it would be interesting to know if you have seen things becoming more stable with the newer boards that you're running the 79** and 99** Ryzens on... assuming they are using ASRock?
We switched to SuperMicro boards, we haven't stocked ASRR for at least a few months.
Edit: Relating to ASRR and not to my reply below, I've had newer manufactured ASRR boards with the same issue, so the issue definitely has not gone away. Also only running SMC boards now.
You should let people request replacements as an option. I offered to pay full cost of replacements and hands time for it and was denied, I quit deploying new servers with RSNET because of it.
Hetzner is the GOAT. Big cost to them. Always thought they're one of the best companies.
We have seen this same issue at a few providers; sadly, they don't seem to take proactive measures. So if you have the issue, you need to report it to support, wait a few hours while they try to debug, escalate, find the issue, ask for a replacement, and make the physical change.
Lots of downtime, and a waste of money. We're leaving some providers because of that, too.
Probably servers that are not running right now will get the first replacements, and only then customers with running servers.