HostHatch Los Angeles storage VPS outage

psb777 · September 2023

About 3 weeks ago, I noticed that my HostHatch Los Angeles storage VPS had very poor disk read speed, less than 1 MiB/s most of the time. I thought maybe there was an online RAID resync so I waited.

One day later, on Sept 4th, I got a "Reboot Notification" email, saying they "detected an issue with a host node and our engineers had to perform an unscheduled reboot to completely resolve the issue." However, the low disk read speed did not resolve after the reboot. Not even improved, still around 1 MiB/s.

I opened a ticket the day after, which naturally has received 0 (zero) replies so far.

On Sept 24th, I got another email: "We will be taking down the node to perform an offline RAID resync. This may take up to 36 hours to complete."

It took more than 36 hours, and when it came back online, it wouldn't boot. VNC displayed the BIOS message "not a bootable disk." I booted a Live CD, only to find that all data was lost. The whole /dev/vda contains only NULL bytes.

Anyone else affected by this? Did they take 3 weeks to discover a failing RAID, and then perform a resync only to lose it? I guess opening another ticket with their stonewalling support department won't help...
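
For anyone who wants to run the same check from a Live CD, here is a minimal Python sketch. It assumes the disk shows up as /dev/vda and that you run it as root; the function name and the `max_bytes` sampling limit are made up for illustration.

```python
def is_all_zero(path, chunk_size=4 * 1024 * 1024, max_bytes=None):
    """Return True if every byte read from `path` is 0x00.

    `max_bytes` (hypothetical convenience parameter) limits how much of the
    device is scanned, since reading a multi-TB disk in full takes a while.
    """
    read = 0
    with open(path, "rb") as f:
        while max_bytes is None or read < max_bytes:
            want = chunk_size if max_bytes is None else min(chunk_size, max_bytes - read)
            chunk = f.read(want)
            if not chunk:
                break  # reached the end of the device or file
            if chunk.count(0) != len(chunk):
                return False  # found at least one non-zero byte
            read += len(chunk)
    return True

# Example (not run here): scan only the first 1 GiB of the disk.
# print(is_all_zero("/dev/vda", max_bytes=1 << 30))
```

A True result on a sample only says the sampled region is empty; scanning the whole device (leave `max_bytes` unset) is the only way to be sure.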

doro · September 2023

same here, won't boot, any news? @hosthatch

alilet · September 2023

You have been hosthatched

Jimmy2000 · September 2023

I had the same problem too. I can boot up now, but all data is lost!!! After I reinstalled bookworm, unfortunately IPv6 doesn't work.

Megumiso · September 2023

I got this message in my email.

Dear Customer,

This is an important message regarding your server(s);
My Hostname (1.1.1.1) - Los Angeles

Due to a multi disk failure in the RAID50 array, we were not able to restore your Storage VM (mentioned above) to its previous state. Please login to your account at cloud.hosthatch.com and use the reinstall feature to start using it again.

We have moved your server to a node with RAID10 HDD storage at no additional cost, for better redundancy and performance of your VM in the future. We will also be issuing 30 days of credit to your account later this week.

Thank you for your understanding and we apologize for the inconvenience caused.

Sincerely,
HostHatch Support Team

I didn't know they used RAID50...

Daniel15 · September 2023

This was the same server that had data corruption/loss issues in May 2022, when they said it used some known bad configuration... I wonder if they just never replaced the hardware. https://lowendtalk.com/discussion/179028/hosthatch-los-angeles-storage-data-corruption-was-hosthatch-los-angeles-storage-down/p1

Coincidentally I was copying a bunch of data off mine recently and it was very slow, like you mentioned. Took me a few weeks to download ~3TB of data from it. Moved that data onto a new NAS at home.

psb777 · September 2023

@Daniel15 said: This was the same server that had data corruption/loss issues in May 2022, when they said it used some known bad configuration... I wonder if they just never replaced the hardware.

Not sure about that, although Emil said "we will make sure that failure rate is eliminated in the next few months." It has been more than a year. Maybe it's still the bad configuration, or maybe it's just failing hard drives. It doesn't matter to me, nor to them, really. I am expecting some responses (if any) along the lines of what a small % of users are affected.

Daniel15 · September 2023

@psb777 said:

@Daniel15 said: This was the same server that had data corruption/loss issues in May 2022, when they said it used some known bad configuration... I wonder if they just never replaced the hardware.

Not sure about that, although Emil said "we will make sure that failure rate is eliminated in the next few months." It has been more than a year. Maybe it's still the bad configuration, or maybe it's just failing hard drives. It doesn't matter to me, nor to them, really. I am expecting some responses (if any) along the lines of what a small % of users are affected.

Emil also said:

We are ordering more hardware to all storage locations and we will reach out to customers on these nodes to get them running on the newer nodes in our new cloud platform.

But I never got an offer to migrate my 10TB VPS to another node, and my ticket about it was closed without resolution. Oh well. It's definitely moved now. I just hope I don't lose space as a result of the migration (if it converts from TiB to TB).

TimboJones · September 2023

I wasn't affected by this outage, but I've lost data three times over the years with HostHatch. It's made me bring things back in-house so I can better look after it.

I get the impression they don't do monitoring, so they don't get alerted as soon as a disk fails, and they don't have hot spares (probably a case/drive limitation) ready to take over the instant a failure occurs.

Virmach has had extended outages, but they haven't lost my data, yet. And strangely, I feel like Virmach communicates issues better.

hosthatch · September 2023

Just to be clear, we’ve never had a data loss event on a NVMe node, where we use RAID 10.

We’ve had multiple data loss events on Storage VMs, where we use RAID 50. It is never pleasant, but there are no false promises involved: we generally advertise these as such, as RAID 50 comes with an inherently higher risk as compared to RAID 10. We build our new storage nodes with multiple smaller arrays and/or use RAID 60, which is safer, but again, you should know that there is an inherent risk when it comes to using parity RAIDs.

At our pricing, you would still be paying less for two geographically redundant VMs on two different continents as compared to most RAID 10 storage VMs out there.
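
To put rough numbers on the RAID 50 vs RAID 10 risk, here is a minimal Python sketch that counts which two-disk failures would destroy an array. The 12-disk layouts are assumptions for illustration only; the actual node layout is not stated anywhere in this thread. It also ignores rebuild time and unrecoverable read errors, which matter a lot in practice.

```python
from math import comb

def fatal_fraction_raid10(disks):
    """Fraction of two-disk failures that take out both disks of one mirror pair."""
    pairs = disks // 2
    return pairs / comb(disks, 2)

def fatal_fraction_raid50(groups, disks_per_group):
    """Fraction of two-disk failures that land inside the same RAID 5 group."""
    total_disks = groups * disks_per_group
    return groups * comb(disks_per_group, 2) / comb(total_disks, 2)

# Assumed layouts: 12 disks as 6 mirror pairs vs 2 RAID 5 groups of 6 disks.
raid10 = fatal_fraction_raid10(12)
raid50 = fatal_fraction_raid50(2, 6)
print(f"RAID 10: {raid10:.1%} of two-disk failures are fatal")
print(f"RAID 50: {raid50:.1%} of two-disk failures are fatal")
```

Under these assumptions a second failed disk is fatal in 6 of 66 cases for RAID 10 and 30 of 66 cases for RAID 50. RAID 60 survives any two-disk failure, which fits the claim above that it is safer.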

cause · September 2023

My affected server was recreated with a smaller empty disk. It was 1500GiB but now a bit less than 1500GB.
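
For a sense of how much space that is, here is a rough Python sketch of the GiB vs GB arithmetic. It assumes the old disk was exactly 1500 GiB and the new one exactly 1500 GB; the real figures may differ slightly, and the function name is made up.

```python
GIB = 2**30  # bytes in one gibibyte (binary unit)
GB = 10**9   # bytes in one gigabyte (decimal unit)

def shrink_when_relabelled(size):
    """Return (bytes lost, percent lost) when `size` GiB becomes `size` GB."""
    old_bytes = size * GIB
    new_bytes = size * GB
    lost = old_bytes - new_bytes
    return lost, 100 * lost / old_bytes

lost_bytes, lost_pct = shrink_when_relabelled(1500)
print(f"{lost_bytes / GB:.1f} GB lost ({lost_pct:.1f}%)")  # 110.6 GB lost (6.9%)
```

The same ratio applies at any size when going from GiB to GB. For TiB to TB the gap is larger, about 9%, so a 10 TiB plan relabelled as 10 TB would lose close to 1 TB.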

zan · September 2023

@Jimmy2000 said:
I had the same problem too. I can boot up now, but all data is lost!!! After I reinstalled bookworm, unfortunately IPv6 doesn't work.

Same here, IPv6 is not working. I've already opened a ticket; let's see how long it takes them to fix it.

TeoM · September 2023

Again TBs data loss
Boy are you kidding us.

What is this shit.

FranzVonVirMach · September 2023

@hosthatch said:
Just to be clear, we’ve never had a data loss event on a NVMe node, where we use RAID 10.

We’ve had multiple data loss events on Storage VMs, where we use RAID 50. It is never pleasant, but there are no false promises involved: we generally advertise these as such, as RAID 50 comes with an inherently higher risk as compared to RAID 10. We build our new storage nodes with multiple smaller arrays and/or use RAID 60, which is safer, but again, you should know that there is an inherent risk when it comes to using parity RAIDs.

At our pricing, you would still be paying less for two geographically redundant VMs on two different continents as compared to most RAID 10 storage VMs out there.

That's a truly unique way to express regret to your clients for another data loss.
Your service may be cheap, but you don't need to be a dick, they are still paid clients not beggars.

bjo · September 2023

IO is also slow in Stockholm storage VM and there was also a node failure some days ago. Glad I moved my borg backups back to a Hetzner SB, so I can use the Stockholm for borg2 migration testing and a loss wouldn't be that bad.

hosthatch · September 2023

@FranzVonVirMach said:
That's a truly unique way to express regret to your clients for another data loss.
Your service may be cheap, but you don't need to be a dick, they are still paid clients not beggars.

I think we've always been known for being straightforward. I am sorry if it comes off as rude, but it is not meant to be, it is just meant to be honest.

If you are looking for 3 paragraphs of sweet nothings, there is a purple provider with a great history you might want to go with instead of us.

Murata_Chink_Best · September 2023

@hosthatch said: 3 paragraphs of sweet nothings

Yo, why does almost everyone hold an opinion against the purple daddy?

At least they have extraordinary marketing and public relations skills.

Astro · September 2023

@hosthatch said:

@FranzVonVirMach said:
That's a truly unique way to express regret to your clients for another data loss.
Your service may be cheap, but you don't need to be a dick, they are still paid clients not beggars.

I think we've always been known for being straightforward. I am sorry if it comes off as rude, but it is not meant to be, it is just meant to be honest.

If you are looking for 3 paragraphs of sweet nothings, there is a purple provider with a great history you might want to go with instead of us.

Under the garb of being straightforward there is no need to diss @dustinc. RackNerd has been very professional and apparently has much better customer-facing skills.

They follow through with what they say so it’s not really sweet nothings.

You guys have a great history too. Let’s not undermine others.

nullnothere · September 2023

@cause said: It was 1500GiB but now a bit less than 1500GB.

That's definitely worrying if as part of the restore/recreate, the underlying disk specs change.

@hosthatch - could you comment on this? It'll be nice to know if this kind of disk size change is going to happen (I'm sure you can appreciate it can mess with backups and replication when a preconfigured size has been factored in).

hosthatch · September 2023

@Astro said

You guys have a great history too.

Thank you, it has indeed been an exciting 12 years of running an honest business - we try to be as transparent as possible, so I'm happy to see you feel that way

@nullnothere said: @hosthatch - could you comment on this? It'll be nice to know if this kind of disk size change is going to happen (I'm sure you can appreciate it can mess with backups and replication when a preconfigured size has been factored in).

This is an unintended change and we can fix it if you open a ticket.

Daniel15 · October 2023

@hosthatch said: we generally advertise these as such, as RAID 50 comes with an inherently higher risk as compared to RAID 10.

To be honest, I wasn't aware of this. I bought my 10TB storage VPS from a link in a January 2021 email with a subject line of "limited plans available". It didn't list any specs other than "10TB", nor did the link to buy it. I guess I should have asked about it.

@hosthatch said: This is an unintended change and we can fix it if you open a ticket.

So to clarify - if we use the reinstall functionality, we still retain the full disk space? It's not like the NVMe ones where it reduces the disk space when going from legacy to non-legacy? Does reinstalling migrate to the new platform or are the storage VPSes stuck as "legacy" VPSes?

Sgrocks · October 2023

All operators, after collecting money
say "we do this, we do that" and later act like kings. Way to ruin one's rep.

FranzVonVirMach · October 2023

@hosthatch said:

@FranzVonVirMach said:
That's a truly unique way to express regret to your clients for another data loss.
Your service may be cheap, but you don't need to be a dick, they are still paid clients not beggars.

I think we've always been known for being straightforward. I am sorry if it comes off as rude, but it is not meant to be, it is just meant to be honest.

If you are looking for 3 paragraphs of sweet nothings, there is a purple provider with a great history you might want to go with instead of us.

What makes you think that I don't know my options and that I need in return your sarcastic wanna-be smart-ass comments?
Besides that, being a "you don't pay that much so stfu" dick toward your paid clients when you fuck up has nothing to do with being straightforward.
A simple RFO, explanation of steps taken so that the issue doesn't repeat itself and an honest apology is what a mature host does in an occurrence like this. People don't need more salt poured on their wounds after their host lost their data.

hosthatch · October 2023

@FranzVonVirMach said:
What makes you think that I don't know my options and that I need in return your sarcastic wanna-be smart-ass comments?
Besides that, being a "you don't pay that much so stfu" dick toward your paid clients when you fuck up has nothing to do with being straightforward.
A simple RFO, explanation of steps taken so that the issue doesn't repeat itself and an honest apology is what a mature host does in an occurrence like this. People don't need more salt poured on their wounds after their host lost their data.

This is what we sent to the affected customers, which I assume you are not.

Daniel15 · October 2023

@Daniel15 said: So to clarify - if we use the reinstall functionality, we still retain the full disk space? It's not like the NVMe ones where it reduces the disk space when going from legacy to non-legacy? Does reinstalling migrate to the new platform or are the storage VPSes stuck as "legacy" VPSes?

To answer my own question - Once HostHatch support fixed the disk space on my VPS, the full disk space was retained upon reinstall.

cause · October 2023

It seems IPv6 is now working, since sometime this week.

@hosthatch
Was this applied as extending the billing cycle or credit balance? I cannot see any change.

We will also be issuing 30 days of credit to your account later this week.

tommyluo · October 2023

The same issue happens again and again. If it were me, I would choose another service provider instead.

1: BuyVM Block Storage
2: Vultr Object Storage
3: Wasabi

tommyluo · October 2023

Wasabi was USD 5 per TB before; it is USD 6.99 now.

emgh · October 2023

You can get 3 TB @ HostHatch for $9

Like they said, use it with other backups or get two

tommyluo · October 2023

Make 2 backups at different locations if that is cheap enough.

mansoor · October 2023

So what you are saying is that since you offer cheap service, folks should expect shitty service?

@hosthatch said:
Just to be clear, we’ve never had a data loss event on a NVMe node, where we use RAID 10.

We’ve had multiple data loss events on Storage VMs, where we use RAID 50. It is never pleasant, but there are no false promises involved: we generally advertise these as such, as RAID 50 comes with an inherently higher risk as compared to RAID 10. We build our new storage nodes with multiple smaller arrays and/or use RAID 60, which is safer, but again, you should know that there is an inherent risk when it comes to using parity RAIDs.

At our pricing, you would still be paying less for two geographically redundant VMs on two different continents as compared to most RAID 10 storage VMs out there.
