
BuyVM Catastrophic Data Failure - All data lost on a node!


Comments

  • Francisco Top Host, Host Rep, Veteran

    Zerpy said: How is jetbackup configured if it can take more than a day to perform a backup of a server? o.O Even with a raid 6 array that I have for backups, 400 accounts take only a few hours and is mostly due to some accounts having 2+ million files :')

    Yeah, that's 2 million files. For instance, lv-shared04 has ~22M inodes on it.

    Francisco

  • Francisco Top Host, Host Rep, Veteran
    edited April 2018

    @vimalware said:
    Neat. Are these weekly backups made on LUX-shared nodes too? that would make me smile.

    I'm just too lazy to backup friends n family shared hosting.

    Every node gets 3 backups a week. We stagger things a bit, so LUX is something like Tues/Thurs/Sat; that way each group of nodes doesn't have to fight the other nodes for inodes.

    As of now, all but a handful of accounts we missed on the initial rounds have been imported. We've fixed whatever IPs were wrong.

    This issue is resolved :)

    Francisco

  • raindog308 Administrator, Veteran

    Zerpy said: How is jetbackup configured if it can take more than a day to perform a backup of a server? o.O Even with a raid 6 array that I have for backups, 400 accounts take only a few hours and is mostly due to some accounts having 2+ million files :')

    I'm going to take a wild guess that BuyShared's boxes have an order of magnitude more accounts per server.

    Thanked by 3: Aidan, FHR, Clouvider
  • deank Member, Troll

    40,000 accounts per server.

  • Francisco Top Host, Host Rep, Veteran

    raindog308 said: I'm going to take a wild guess that BuyShared's boxes have an order of magnitude more accounts per server.

    Not that bad. lv-shared04 has a peak of 1,000 IPs (a /22 of IPs).

    We just have a lot of people who have never, ever cleaned their spam folders, so you end up with 50,000+ unread emails.

    Francisco

  • Zerpy Member
    edited April 2018

    @Francisco said:

    Zerpy said: How is jetbackup configured if it can take more than a day to perform a backup of a server? o.O Even with a raid 6 array that I have for backups, 400 accounts take only a few hours and is mostly due to some accounts having 2+ million files :')

    Yeah, that's 2 million files. For instance, lv-shared04 has ~22M inodes on it.

    Francisco

    That's 2 million files per account for the top ones.

    In my case, it's 403 accounts, 34.4 million inodes, and a total of 2,645 GB of files.

    Total backup time was 2 hours and 37 minutes last night.

    If it takes 24+ hours for 22 million inodes, then something must be very wrong: it means you're averaging roughly 250 IOPS on the webserver itself over a 24-hour span - that seems super low :-) But what do I know. (Rough arithmetic in the sketch after this comment.)

    Another, smaller box has 228 GB of data, 4.3 million inodes, and 125 accounts; it takes about 40 minutes, and that's even on spinning rust.

    @Francisco said: We just have a lot of people that have never, ever, cleaned their spam folders so you end up with 50,000+ unread emails.

    Consider mbox format - it would probably also greatly speed up a recovery process.
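
    A quick sanity check of the throughput arithmetic above, using only the figures quoted in these posts (a throwaway sketch, not tied to any real measurement):

        # Files-per-second implied by the backup windows discussed above.
        inodes_lv_shared04 = 22_000_000           # Francisco's figure for lv-shared04
        seconds_per_day = 24 * 60 * 60
        print(inodes_lv_shared04 / seconds_per_day)    # ~255/s if a run really took a full day

        inodes_zerpy_box = 34_400_000             # Zerpy's 403-account box
        backup_seconds = 2 * 3600 + 37 * 60       # 2 h 37 min
        print(inodes_zerpy_box / backup_seconds)       # ~3,650/s for the faster run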

  • Francisco Top Host, Host Rep, Veteran

    The issue isn't the reading on the shared side, it's that the destination is getting slammed by not only 7 other shared nodes looking to do the same work, but also BuyVM backups (though those are more stream heavy).

    I thought mbox was deprecated. If not, I'll for sure consider that.

    Francisco

  • Francisco Top Host, Host Rep, Veteran
    You can use the /scripts/convert2maildir script to perform conversions on mail storage data. Maildir is the only supported mail storage system for cPanel & WHM servers. Because of this, users who migrate data onto cPanel & WHM servers will convert any mbox data to the Maildir format.
    

    Shlonged.

    Francisco

  • Zerpy Member

    Sorry, mdbox - not mbox - I hate the fact that their naming is so close: https://documentation.cpanel.net/display/68Docs/Mailbox+Conversion

    They even considered making mdbox the default at some point - that wasn't done, but support was added in recent releases - and cPanel did migrate all their internal emails to mdbox themselves.

  • Francisco Top Host, Host Rep, Veteran

    @Zerpy said:
    Sorry, mdbox - not mbox - hate the fact their naming is close: https://documentation.cpanel.net/display/68Docs/Mailbox+Conversion

    They even considered making mdbox default at some point - however was not done, but this was added in the recent releases - and cPanel did migrate all their internal emails to mdbox themselves

    Neat :)

    Will wait and see if there are any known issues, but I'd for sure love to have something like that instead of a metric crap ton of inodes.

    Francisco

    Thanked by 1: Zerpy
  • Zerpy Member

    @Francisco said:
    Will wait and see if there's any known issues but i'd for sure love to have something like that instead of a metric crap ton of inodes.

    Francisco

    It was introduced in cPanel v56, so it has been around for at least a year - and cPanel runs it internally with terabytes of email. If they switched their own @cpanel.net mail over to it, I'd assume it's "good enough" for the company to rely on.

  • willie Member
    edited April 2018

    Might be faster to just save a compressed tarball of each account as a backup, rather than attempting a differential or file-by-file backup. That way you write just one file per account on the backup server (rough sketch below).
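
    A minimal sketch of the one-tarball-per-account idea, assuming a hypothetical cPanel-style /home/<account> layout and a backup target mounted at /mnt/backup (both paths are placeholders, not BuyShared's actual setup):

        import os
        import tarfile

        HOME_ROOT = "/home"            # placeholder: one directory per account
        BACKUP_ROOT = "/mnt/backup"    # placeholder: mounted backup destination

        os.makedirs(BACKUP_ROOT, exist_ok=True)

        for account in sorted(os.listdir(HOME_ROOT)):
            src = os.path.join(HOME_ROOT, account)
            if not os.path.isdir(src):
                continue
            # One compressed stream per account, so a single file lands on the
            # backup server instead of millions of small files.
            with tarfile.open(os.path.join(BACKUP_ROOT, f"{account}.tar.gz"), "w:gz") as tar:
                tar.add(src, arcname=account)

    The trade-off, as the replies below note, is that every run re-sends full copies, so space and transfer costs grow compared to incremental approaches.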

  • Francisco Top Host, Host Rep, Veteran

    willie said: Might be faster to just save a compressed tarball of each account as backup, rather than attempting differential or file by file backup. So you write just one file per account on the backup server.

    True, but very heavy on space usage.

    In the case of a disaster we'd likely resort to the full drive snapshots which I'll be able to restore at full line rate since it's just streaming data.

    Francisco

  • Some hours offline and a copy from a few days ago - but much better than losing everything.

    Thanked by 1: Aidan
  • willie Member

    Francisco said: True, but very heavy on space usage.

    Actually dump/restore is still a thing and can do differential dumping into a single file on the backup device. That might be a decent alternative. Space consumption stays the same with all approaches if you're not keeping multiple backups around, but the tarball approach increases traffic to the backup server, so that's not great. (A rough sketch of level-based dumps follows this comment.)

    Francisco said: In the case of a disaster we'd likely resort to the full drive snapshots which I'll be able to restore at full line rate since it's just streaming data.

    I'm not so sure you can do that on an active filesystem because of getting inconsistent snapshots as stuff changes during the snapshot process.
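
    A rough sketch of the level-based dump idea, assuming the classic dump(8) utility is installed, /home sits on a hypothetical /dev/vg0/home volume, and a backup mount exists at /mnt/backup (all names are placeholders):

        import datetime
        import subprocess

        DEVICE = "/dev/vg0/home"                   # placeholder logical volume holding /home
        today = datetime.date.today()

        # Level 0 (full) once a week, level 1 (changes since the last full) otherwise;
        # each run produces a single file on the backup mount.
        level = 0 if today.weekday() == 6 else 1
        outfile = f"/mnt/backup/home.{today.isoformat()}.level{level}.dump"

        # -u records the run in /etc/dumpdates so later levels know their baseline,
        # -f names the output file.
        subprocess.run(["dump", f"-{level}", "-u", "-f", outfile, DEVICE], check=True)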

  • Francisco Top Host, Host Rep, Veteran

    willie said: I'm not so sure you can do that on an active filesystem because of getting inconsistent snapshots as stuff changes during the snapshot process.

    I can LVM snapshot the entire node so it'd be just fine.

    Francisco

  • willie Member

    Francisco said:

    I can LVM snapshot the entire node so it'd be just fine.

    How does that work? The stuff inside the LVM partitions would still be changing during the snapshot, I would have thought. I could imagine setting up an overlay filesystem or similar to allow individual accounts to be safely snapshotted and that would be interesting, but I'm not aware of it having been done.

  • Voss Member

    Francisco said: "Awww fuck."

    "A guide to webhosting"

    More like "OVH Server + Summer Holidays: A Guide to Scamming"

  • Francisco Top Host, Host Rep, Veteran

    willie said: How does that work? The stuff inside the LVM partitions would still be changing during the snapshot,

    No. The 'changed data' gets stored in a separate area. When you make a snapshot you tell LVM how much space it can use for that very purpose (rough sketch after this post).

    https://www.thomas-krenn.com/en/wiki/LVM_Snapshots

    Francisco

    Thanked by 1: willie
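
    For anyone curious what the snapshot-then-stream flow looks like, here is a rough sketch, assuming a hypothetical volume group vg0 with a logical volume named data and enough free extents for the copy-on-write area (names, sizes, and paths are placeholders, not BuyVM's layout):

        import subprocess

        VG, LV = "vg0", "data"                     # placeholder volume group / logical volume
        SNAP = f"{LV}-backup-snap"

        def run(*cmd):
            subprocess.run(cmd, check=True)

        # 1. Freeze a point-in-time view; blocks changed afterwards go to the CoW area.
        run("lvcreate", "--snapshot", "--size", "20G", "--name", SNAP, f"/dev/{VG}/{LV}")
        try:
            # 2. Stream the frozen block device to the backup target at line rate.
            with open(f"/dev/{VG}/{SNAP}", "rb") as src, open("/mnt/backup/data.img", "wb") as dst:
                while chunk := src.read(4 * 1024 * 1024):
                    dst.write(chunk)
        finally:
            # 3. Drop the snapshot so the CoW area can't fill up and invalidate it.
            run("lvremove", "-f", f"/dev/{VG}/{SNAP}")
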
  • Nomad Member

    Why doesn't anyone mention that this might have something to do with the low TBW ratings of consumer-series SSDs?

    If they are from the same batch and are in a RAID, isn't it expected for them to fail almost simultaneously?

  • YokedEgg Member

    @Nomad said:
    Why doesn't anyone mention this might have something to do with the low amount of TBW the consumer series SSDs have?

    If they are from the same batch and are in a raid, isn't it expected for them to fail just simultaneously?

    That was actually already mentioned.

  • Nomad Member

    @YokedEgg said:

    @Nomad said:
    Why doesn't anyone mention this might have something to do with the low amount of TBW the consumer series SSDs have?

    If they are from the same batch and are in a raid, isn't it expected for them to fail just simultaneously?

    That was actually already mentioned.

    My bad then, I missed that.

    Thanked by 1: YokedEgg
  • Harambe Member, Host Rep

    Nomad said: Why doesn't anyone mention this might have something to do with the low amount of TBW the consumer series SSDs have?

    Fran mentioned he monitors the wear level on them; on the 1TB variants you're looking at something like 2PB of write life per drive. Definitely a possibility, but that seems like a lot of headroom for writes on a shared hosting node.

  • sonic Veteran

    My site's back up online without issue. Great work, Fran!

  • Francisco Top Host, Host Rep, Veteran
    edited April 2018

    Nomad said: Why doesn't anyone mention this might have something to do with the low amount of TBW the consumer series SSDs have?

    If they are from the same batch and are in a raid, isn't it expected for them to fail just simultaneously?

    I documented this earlier but I'll repeat it here :)

    The SSDs were all in the 30 - 40% remaining range on the official ratings for the drives, but Samsungs can go far beyond the limits on those drives. Still, we weren't anywhere near that.

    I know this for a fact because I SMART'd all the drives around 2 weeks ago, when Karen & I decided we wanted to give Shared a nice upgrade with bigger CPUs and a move to NVMe drives. (A rough wear-check sketch follows this post.)

    Honestly, with what we've seen I think it was just because the drives never got a firmware update. We bought those drives right when the 850s first hit the market, and pulling the node offline to flash the firmware and take a chance at losing everyone's data wasn't a happy thought to me.

    For all I know there's some subsystem that was patched in a later firmware where the drives can go cockeyed if they hit a certain wear level. Samsung knows; I don't.

    We do a lot to take as much strain off our drives as we can. We're close to an 85/15 read/write workload on our shared-node SSDs. There's absolutely no swap on our nodes, and we do some other tricks to take high-thrash areas off the drives completely.

    I'm not happy with it, but I'm incredibly proud of how quickly Anthony & I were able to diagnose the problem, build completely new nodes, get everything installed, and get every backup we had unpacked and restored.

    I think it was just over 24 hours for the whole episode. Given what I've seen from countless other hosts out there, they would be 5 - 10% into the restore by now.

    This is resolved. Anyone with outstanding issues with their site, log a ticket and I'll personally sort you out!

    Francisco
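
    On the wear-monitoring point, a rough sketch of how that check can be scripted, assuming smartmontools is installed and the drives expose a Wear_Leveling_Count attribute the way Samsung's consumer SSDs generally do (the device list is a placeholder):

        import subprocess

        DRIVES = ["/dev/sda", "/dev/sdb"]          # placeholder device list

        for dev in DRIVES:
            out = subprocess.run(["smartctl", "-A", dev],
                                 capture_output=True, text=True, check=True).stdout
            for line in out.splitlines():
                if "Wear_Leveling_Count" in line:
                    # Column 4 is the normalized VALUE, which counts down as the NAND wears.
                    print(f"{dev}: Wear_Leveling_Count value {line.split()[3]}")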

  • willie Member
    edited April 2018

    Harambe said: Fran mentioned he monitors the wear level on them, on the 1TB variants you're looking at like 2PB of write life per drive.

    https://www.anandtech.com/show/8747/samsung-ssd-850-evo-review lists the rated endurance of the 1TB 850 EVO as 150TB.
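
    Some rough endurance arithmetic against that 150TB rating (the daily write volume here is a made-up example, not a measured BuyShared figure):

        rated_endurance_tb = 150        # 1TB 850 EVO rating per the AnandTech review above
        daily_writes_tb = 0.1           # assume ~100 GB/day of writes reaching one drive

        days = rated_endurance_tb / daily_writes_tb
        print(f"{days:.0f} days (~{days / 365:.1f} years) to hit the official rating")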

  • Francisco Top Host, Host Rep, Veteran

    @willie said:

    Harambe said: Fran mentioned he monitors the wear level on them, on the 1TB variants you're looking at like 2PB of write life per drive.

    https://www.anandtech.com/show/8747/samsung-ssd-850-evo-review lists the rated endurance of the 1TB 850 EVO as 150TB.

    Correct! Even the PROs aren't that much higher.

    The enterprise stuff gets into the multi-PB range.

    Still, this is shared. shared04 already had a couple of years on it and still had a good bit left.

    Francisco

  • YokedEgg Member
    edited April 2018

    @Francisco said:

    @willie said:

    Harambe said: Fran mentioned he monitors the wear level on them, on the 1TB variants you're looking at like 2PB of write life per drive.

    https://www.anandtech.com/show/8747/samsung-ssd-850-evo-review lists the rated endurance of the 1TB 850 EVO as 150TB.

    Correct! Even the PROs' aren't that much higher.

    The enterprise stuff gets into the multi PB.

    Still, this is shared. shared04 already had a couple years on it and still had a good bit left.

    Francisco

    Quick question from the official BuyVM LET help desk: when do you think the KVM slices will get NVMe installed in 'em? (srs)

  • Francisco Top Host, Host Rep, Veteran

    YokedEgg said: Quick question from the official BuyVM LET help desk, when do you think the KVM slices will get NVMe installed in 'em? (srs)

    Unlikely any time soon.

    It's a huge cost and not enough people will care about it. I would have to do full motherboard changes, or move to a 2U chassis, since our spare PCI slots are being taken by our InfiniBand networking.

    Free nightly backups & snapshots will come out around summer or so though :) That should be fun.

    Francisco

    Thanked by 1: YokedEgg
  • Harambe Member, Host Rep

    @willie said:

    Harambe said: Fran mentioned he monitors the wear level on them, on the 1TB variants you're looking at like 2PB of write life per drive.

    https://www.anandtech.com/show/8747/samsung-ssd-850-evo-review lists the rated endurance of the 1TB 850 EVO as 150TB.

    Ah, my bad. My quick google search brought this page up earlier: https://www.anandtech.com/show/8747/samsung-ssd-850-evo-review/4
