Hetzner: Data Loss Incident
iwaswrongonce
Member
in Outages
Yikes, not a good look...
DATA LOSS INCIDENT (SNAPSHOTS)
Dear Customer,
Unfortunately, we have to inform you that there was a data loss incident that affects a small amount of your snapshots on Hetzner Cloud.
All snapshots you create are stored on our highly available storage systems. The snapshot contents are distributed over multiple internal servers and data is stored in a way that allows up to two separate disks to fail without impacting data integrity.
This means the snapshot can still be accessed, even if two disks fail at the same time.
Due to a recent, very unfortunate series of events in one of our clusters, multiple disks failed in short succession and caused a small number of snapshots to become unavailable.
We immediately tried to recover the affected snapshots but unfortunately the data is lost and we have exhausted all our options.
AFFECTED SNAPSHOTS IN YOUR ACCOUNT:
Project: [redacted]
Name: [redacted]
The snapshots have been removed from our system as they are no longer accessible.
We sincerely hope this doesn’t cause too much trouble for you; we know losing data is the worst-case scenario. Also, we have added 20€ as Cloud Credits to your account (valid for one year). While we know that this will not bring back your data, we still hope that you will accept the gesture.
In response to this we will re-evaluate our snapshot cluster data replication strategies as well as our strategies for replacing disks and rebuilding redundancy after replacement.
Best Regards,
Hetzner Cloud
Comments
Shit happens, backup your backups.
"very unfortunate series of events" meaning someone fucked up
Not necessarily. It could simply be an instance of cost cutting: a decision taken after calculating the risks and implications. Even without that, nothing is 100% sure. Provider backups, snapshots, redundancy, etc. are all good, but take your own backups to increase your recovery chances, and keep a stash away from the provider where they can't cut off your access in case they imagine something.
Damn it, I am using this functionality for VM templates.
I like the way they handled this incident. Everyone affected was given €20 cloud credits without having to ask, along with a concise description of what happened and which snapshot(s) were affected.
Shit happens, don't rely on one source of truth for your data. Relax and enjoy the free credits.
For me, Hetzner has been one of the most reliable services I have used in many years. I think I only had one incident, years ago, because of a hardware failure. But I always have a paranoid-level backup strategy.
That's why experts say back up your backups. So far Hetzner and OVH (Kimsufi and SYS) have been the most reliable providers for me... well, sh**t happens.
I already have backup storage at Hetzner for my servers, but I think I need offline backups of my backups.
So from now on Hetzner lands in tray G for garbage!
Then rather Google Cloud or any other provider where the data is highly secure.
Translation: we sold hard drive with your snapshot to some Romanian on OLX
happens
To all the “backup your backups” comments: there is no way to back up or restore an image. We don’t use images for backups anyway.
The point of this is not that we lost data (we didn’t). The point is that Hetzner doesn’t have backups of their backups, and clearly has real sysops issues.
Data loss is data loss. It’s not a good look for them.
If you have enough luck you can find your backup on www.hazi.ro (VPS HDD plans).
P.S. I've never bought an HDD from OLX, and I never will
I know that, hun
https://streamable.com/us4zwd
I hope you're not one of those people who thinks I have something in common with Calin. (I personally helped him get out of here)
Tolerating only 2 failed disks isn't much of a plan for such a storage system, in my opinion. Sure, the services are dirt cheap, but if I knew I was putting anything on a storage system where all it takes is 2 disks failing, for darn sure I wouldn't put anything remotely important on it without backups. I hope their Nextcloud instances aren't configured this way, because I have just started considering using them for this service.
Without the data, the credit is little help to some customers, since rebuilding a project can cost X times more than €20. If the current backup/snapshot system fails, why not look for an alternative solution?
You have unrealistic expectations.
Tremendously bad
I just realized after rereading, it means 3 disks failed (I think?)
Yeah, they say that 2 can fail without issue, and then say that "multiple" failed to cause the incident. As far as I can tell, that means 3 disks failed at once to cause it.
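A minimal sketch of that arithmetic, assuming only what the email states (up to two failed disks are tolerated). The function names are made up for illustration, and the email doesn't say which redundancy scheme is actually used:

```python
# Hypothetical helpers; not Hetzner's actual storage logic.

def data_survives(failed_disks: int, tolerated_failures: int = 2) -> bool:
    """True if the number of simultaneously failed disks holding pieces
    of a snapshot is within the tolerance the email describes."""
    return failed_disks <= tolerated_failures


def min_failures_for_loss(tolerated_failures: int = 2) -> int:
    """Smallest number of overlapping disk failures that loses data."""
    return tolerated_failures + 1


if __name__ == "__main__":
    for failed in range(5):
        status = "ok" if data_survives(failed) else "LOST"
        print(f"{failed} failed disk(s): {status}")
    print("minimum failures for loss:", min_failures_for_loss())
```

So with a tolerance of two, at least three disks holding pieces of the same snapshot must have been down before redundancy was rebuilt.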
Ugh, you probably don't have a single service that can survive three disk failures.
There's definitely a flaw in their redundancy plan, though.
If you think they are incompetent, then move.
If you think they are capable but not perfect (the best you can hope for), stick with them because I guarantee you that they'll review their processes to prevent this again. If you switch providers, the new one might not have learned this lesson/experience.
The fact they've identified the issue and it had a financial penalty for them are VERY positive signs that this will be a very rare occurrence.
Yes, good point indeed!
Haha, we don’t even use them anymore even for their backup services (storage boxes)
It seems like, unlike the storage box, the storage share (Nextcloud service) is backed up multiple times a day and can sustain "several simultaneous drive failures".
Bad tremendously
Congrats on your first post
Another two word bot @FAT32 @Jord @raindog308
Interestingly, this one also likes their bandwidth being doubled.
your post count has been doubled
Someone fucked up big time.