
Comments
I have some HC520 drives on my home NAS. They are great so far. I got them as server pulls with a 5 year warranty for $75 each.
Ok, official info was sent to remaining affected customers, please check your e-mails.
Here is a timeline of the fuckup:
July 27, 2025 – Afternoon (GMT+3):
Multiple Seagate ST18000NM019J drives (firmware KM02) across two nodes suddenly powered down due to a firmware-related failure. Drives began reporting critical SMART alerts (Data channel impending failure), causing the RAID-6/60 array to become unavailable.
Result:
Addon storage volumes became inaccessible, and VPS services depending on those volumes were disrupted. Some NVMe-based systems also experienced write issues due to OS-level I/O buffering.
July 28, 2025 – Morning:
Our team accessed the datacenter, identified the fault, and began recovery efforts. All NVMe-only VPS services were successfully migrated to healthy nodes.
July 28–29, 2025:
RAID array access was restored in degraded mode, enabling partial access to addon volumes at limited transfer speeds.
🧪 Root Cause
Firmware fault affecting multiple ST18000NM019J (KM02) drives simultaneously
RAID controller entered fault mode due to concurrent SMART failures
No physical disk damage, no reallocated sectors or ECC errors — this was purely firmware-triggered
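For readers wondering why RAID-6/60 did not absorb this: RAID-6 keeps two parity blocks per stripe, so each RAID-6 group survives at most two lost drives, and RAID-60 stripes across several such groups. A minimal sketch of that rule follows; the per-span failure counts are hypothetical, since the post does not say how the failed drives were distributed across spans.

```python
def raid60_survives(failed_per_span, parity_per_span=2):
    """Return True if no RAID-6 span has lost more drives than it has parity.

    failed_per_span: number of failed drives in each RAID-6 span.
    A single-span list models plain RAID-6.
    """
    return all(failed <= parity_per_span for failed in failed_per_span)


# Hypothetical failure distributions, for illustration only.
scenarios = {
    "one drive in each of two spans": [1, 1],
    "two drives in each of two spans": [2, 2],
    "three drives in one span": [3, 0],
    "six drives in a single RAID-6": [6],
}

for label, failures in scenarios.items():
    state = "array survives" if raid60_survives(failures) else "array lost"
    print(f"{label}: {state}")
```

The point is that the third simultaneous failure inside any one span takes the whole array offline, no matter how healthy the other spans are.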
🛡️ Mitigation Going Forward
We are conducting a full infrastructure audit to identify any remaining ST18000NM019J drives with KM02 firmware
Affected drives will be proactively replaced or updated, where supported
RAID monitoring thresholds and firmware validation processes are being tightened to catch these failures earlier
This was an unprecedented firmware-level failure that bypassed typical RAID fault tolerance. We appreciate your understanding as we finalize recovery efforts for impacted systems.
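The audit described above could be sketched roughly like this, assuming you already have an inventory of drives with their model and firmware strings. The `Drive` record, the node names and every inventory entry other than the ST18000NM019J / KM02 pair are made-up placeholders, not data from the post.

```python
from dataclasses import dataclass

# The only model/firmware pair named in the incident report.
AFFECTED = {("ST18000NM019J", "KM02")}


@dataclass(frozen=True)
class Drive:
    node: str      # hypothetical node name
    slot: int      # hypothetical bay number
    model: str
    firmware: str


def find_affected(drives):
    """Return the drives whose (model, firmware) pair is on the affected list."""
    return [
        d for d in drives
        if (d.model.strip().upper(), d.firmware.strip().upper()) in AFFECTED
    ]


# Placeholder inventory; real data would come from your own asset records.
inventory = [
    Drive("node-a", 0, "ST18000NM019J", "KM02"),
    Drive("node-a", 1, "EXAMPLE-14TB", "FW01"),
    Drive("node-b", 0, "st18000nm019j", "km02 "),
    Drive("node-b", 1, "ST18000NM019J", "OTHER"),
]

for drive in find_affected(inventory):
    print(f"{drive.node} slot {drive.slot}: {drive.model} {drive.firmware.strip()}")
```

Matching on the model and firmware together matters here, because the report ties the failure to the KM02 firmware rather than to the model alone.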
Here is the output from one of the drives; maybe it can help others check theirs if they use the same model. All 6 reported exactly the same error, had the same power-on time (~266 days) and were brand new.
SMART Health Status: Data channel impending failure general hard drive failure [asc=5d, ascq=30]
The error in bold caused the drives to detach from the RAID array.
Here is a screenshot from the log of one of the Dell servers (R740) showing 2 drives leaving the "chat" at precisely the same time. DST was not set on the server, which is why the time shows only 12:00.
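If you want to check your own drives for the same status line, here is a rough sketch that parses it out of text you have already collected (for SAS drives, `smartctl -H` prints a "SMART Health Status:" line). The parser itself and the "OK" comparison are my assumptions about how to use that line, not part of the original report; only the failing sample string is taken from the post.

```python
import re

# Matches the health line, with an optional trailing [asc=.., ascq=..] pair.
HEALTH_RE = re.compile(
    r"^SMART Health Status:\s*(?P<status>.+?)"
    r"(?:\s*\[asc=(?P<asc>[0-9a-fA-F]+),\s*ascq=(?P<ascq>[0-9a-fA-F]+)\])?\s*$",
    re.MULTILINE,
)


def parse_health(text):
    """Extract status text and sense codes from smartctl-style output.

    Returns None when no health line is present.
    """
    match = HEALTH_RE.search(text)
    if match is None:
        return None
    status = match.group("status").strip()
    asc, ascq = match.group("asc"), match.group("ascq")
    return {
        "status": status,
        "healthy": status == "OK",  # assumption: anything other than OK needs a look
        "asc": int(asc, 16) if asc else None,
        "ascq": int(ascq, 16) if ascq else None,
    }


failing = (
    "SMART Health Status: Data channel impending failure "
    "general hard drive failure [asc=5d, ascq=30]"
)
print(parse_health(failing))
print(parse_health("SMART Health Status: OK"))
```

Run against every drive on a node, this would show at a glance whether several drives report the same sense codes at once, which is what happened here.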
@host_c you dropped something
↓
↓
↓
↓
↓
↓
👑
Seagate more like Samsung am I right?!
I always said that it is a question of when something like this will happen, rather than if it will happen.
These things are totally out of one's control. To be fair, in my career, this is a first, especially with SAS drives, not to mention new drives.
For fuck's sake, we have older 8, 10 and 14 TB SAS drives (WD/HGST/Seagate) that have 5 years of powered-on time and have no issues even today.
Ah yes, Seagate was as helpful as the popcorn replies we will get here
It is what it is.
We will most probably switch to HGST for our next drive orders.
After data loss caused by Seagate many years ago, I wouldn't touch them anymore, even with a 10-foot pole
I will agree with you on this. At least when we have drive failures with HGST, it is usually 1 drive per server, not 4, 5 or 6 at the same time.
Don't get me wrong, we are used to drive failures; we do storage, so I see nothing abnormal in changing drives on a monthly basis in the data center. But this was something new.
Luckily, we do not have many 18 TB Seagates left (especially this model).
Also, this issue is specific to the EXOS X18 line, mostly the 16-18 TB models; at least this is the info we googled and found over the past 48h.
Customers have their options in the mail, and we have a new item to add to the checklist.
It is an unfortunate event, but fuck, this is life, some things break sometimes.
Too bad this messed up the week of upgrades we wanted to do in the DC, so that got derailed for another week or so.
TL;DR
Hi,
something similar happened to a customer of ours... with 8 TB Intel NVMe drives... they failed faster than they could be replaced.
Sometimes it is what it is...
Wish you all luck to get all data out there before things explode!
And it's just another case that should show everyone clearly: always keep backups somewhere... there is no 100% security, no matter how good the hoster or the hardware might be.
THX
@host_c - I received your email. Thank you very much for being open and informing your customers about the causes of the downtime. Such openness is highly appreciated and earns respect for your business. If I may, I have some questions:
Thanks again for your understanding and clear communication.
Does that mean those without email updates... are not impacted?
Edit: Oh well, I just checked and I did receive an email update from @host_c. However, none of my VPSes seem to be affected (as I can still access them normally). Is that so?
1. Which drives will you use for customers choosing a fresh install?
HGST 14 and 16 TB; some have a mix of Toshiba and HGST. Honestly, I cannot recall exactly off the top of my head.
2. Which drives will you use for customers opting for recovery?
Same.
That said, customers are provisioned on a RAID array, not on individual drives!
3. Will IPv4 and IPv6 addresses change in the case of a fresh re-install?
No, these will be manually issued, so we will preserve the IP settings. However, because of this manual provisioning it will be slow, as we first have to delete the old config manually from the cluster, recreate it and so on.
4. Will CPU change upon a fresh re-install?
No. The CPU type and generation will not change; the model might, as we have Scalable Gen 2 CPUs ranging from 2.4 to 2.7 GHz.
Precisely. Mail was sent to VPS customers on those specific nodes. If you did not get any mail and your service is up, it means nothing happened, carry on
thanks for sharing the juicy details with us
Seagate, you suck!
yeah, if they don't help you with this, don't buy any Seagate drive ever again!
Well, they blew me off, as the drives were not bought through a certified Seagate reseller. As if any of us could manufacture a drive at home. Fuck me.
There are 3 Drive Manufacturers in the WORLD:
Seagate
WD/HGST
Toshiba
So any enterprise drive you have bought that has a 5-year warranty should be replaceable, regardless of whether you bought it on eBay or in a shop (as long as the drive does not have hammer marks and was not operated at 50 degrees Celsius).
But here is the reply from them, and I will underline the fact that we asked for a FW fix, not a replacement, as a FW fix might have helped more. I could not care less that we have 6 or 10 failed drives, that is my problem. I asked for a FW fix because that is what might have helped us and our customers. Again, there is no mechanical issue with the drives; they just decided to go on holiday and left the array in the middle of the day.
Now, this is not a Seagate-only policy; WD and Toshiba do the same.
EDIT:
One of the reasons I moved away from HP years ago was their restrictive firmware and BIOS update policy. Starting with Gen8 servers, critical updates — including fixes for issues that only emerge under specific conditions — were locked behind a support subscription.
This approach is frustrating because firmware and BIOS bugs are not user-created issues; they are vendor-side flaws that should be resolved as a matter of responsibility. Requiring customers to pay for access to those fixes feels like a penalty for simply using the hardware.
Now that HP is involved with Juniper, I can only hope they don't bring this same restrictive, short-sighted policy mindset into that ecosystem, though I am positive they will.
Stuff happens. Then you fix it. That's life. You're doing a good job communicating what happened.
My question is: did any customers lose data? For the customers that requested recovery, how does that happen? Is this a forensic recovery where the drives are sent out?
This is not a forensic procedure, it is an in-house solution; we do not send anything that has customer data to anyone, regardless of the situation.
We did manage to import the array on an older controller that does not take the SMART errors of the drives into account (for the moment), but copying off them is extremely slow, a few MB/sec.
This is why we sent out the mail saying that those who do not have crucial data can opt for provisioning of a new VPS, as it is faster. Those who need the data will have to wait until we move the add-on drive to a new VPS, which is slow, very slow.
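To put "a few MB/sec" into perspective, here is a back-of-the-envelope estimate of how long a copy takes at such rates. The volume size and the rates below are assumptions picked for illustration; the post gives no exact figures.

```python
def copy_time_days(size_tb, rate_mb_s):
    """Days needed to copy size_tb terabytes at rate_mb_s megabytes per second.

    Uses decimal units: 1 TB = 1,000,000 MB.
    """
    size_mb = size_tb * 1_000_000
    seconds = size_mb / rate_mb_s
    return seconds / 86_400  # seconds per day


# Assumed values, not figures from the provider.
for rate in (2, 5, 10):
    print(f"1 TB at {rate} MB/s: {copy_time_days(1, rate):.1f} days")
```

At 5 MB/s a single 1 TB volume already needs a bit over two days, which is why a fresh install is the faster option for anyone who can restore from their own backups.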
Unfortunately we cannot guarantee the integrity of the data we recover, that will be up to the user to check. This is the best we can do under the current circumstances.
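For users who have to check recovered data themselves, one possible approach is comparing SHA-256 checksums against a manifest built from a known-good copy, such as your own backup. This is only a sketch under that assumption; the `verify` helper, the manifest format and the file names are hypothetical, and the demo runs on throwaway temporary files.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(root, manifest):
    """Compare files under root with a {relative_path: sha256_hex} manifest."""
    problems = []
    for rel_path, expected in sorted(manifest.items()):
        target = Path(root) / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif sha256_of(target) != expected:
            problems.append((rel_path, "checksum mismatch"))
    return problems


# Demo on temporary files: one intact, one altered after the manifest
# was taken, and one listed in the manifest but never recovered.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.bin"
    bad = Path(tmp) / "bad.bin"
    good.write_bytes(b"intact data")
    bad.write_bytes(b"original data")
    manifest = {
        "good.bin": sha256_of(good),
        "bad.bin": sha256_of(bad),
        "gone.bin": "0" * 64,
    }
    bad.write_bytes(b"corrupted data")  # simulate damage during recovery
    problems = verify(tmp, manifest)

print(problems)
```

Without a manifest from before the incident there is nothing to compare against, so this only helps if checksums or a backup already exist.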
thanks for the explanation. Good luck to all involved and even though most of my stuff is backed up ya always wonder, 'what am I not backing up'
You. There is no backup of you. Once you die, that's it. There is no backup of your firmware, especially considering it is patented and personalised for you. This is why the end is always nigh.
That's deep, mate.
Any flash sale plans? Could use a stronger (CPU, RAM) box for Immich to aid my existing storage VPS.
What are your current storage VPS specs? Just out of curiosity.
What are your current specs? I have a home instance crunching on rpi4
1 vCPU, 2GB RAM, hosting 2 static sites, an OpenCloud instance, Open WebUI, Shlink and Actual Budget already
Depending on your photo library, you should really get a new box rather than just adding resources to this one.
That's exactly the plan. Could also benefit from redundancy in the EU region, still being burnt by previous host's 'emergency migration' with now close to a week of unexpected downtime.
To correct a small mistake: Toshiba hard drives currently belong to WD.
So there are only 2 Drive Manufacturers in the WORLD:
Seagate and WD
Fairly certain Toshiba still is an independent hard drive manufacturer. They bought out certain 3.5" drive manufacturing facilities and IP from WD in 2012, with WD buying certain 2.5" facilities and IP in turn, and nothing has changed in the meantime AFAIK.
Although I was not affected I highly appreciate the transparency. Thanks for taking your time to let others know about the issue so they can be aware. I feel I am in very good hands
Nice, THX for the update on this.
I feel good having only 2 options, it makes things far simpler.