Comments
Thank you! I opened a ticket and they're going to convert them to RAW for me.
OP said he waited 24 hours. You say around 6 hours. Those are in major disagreement.
If OP received an email status update and didn't wait the 24 hours as he stated he did, then OP did tell a different story and was impatient.
I don't want to know what your coworkers have been telling you, but you're doing it wrong. If they tell you that you need to "firewall" their penis with your mouth, don't fall for that!
Question, what does this exactly mean? Who does the certification and how often?
Holy fucking Kool-aid drinker, Batman! This is an epic fuck up that led to data loss, and the customers feel guilty and think they should pay more.
Francisco, future president in the making.
My first dedicated server had a single 80 GB IDE hard drive, and I never took backups. The hardware wasn't even enterprise-grade, it was a WD Green HDD and I'm pretty sure the 'server' was just a tower PC sitting on a shelf in the DC somewhere. I'm lucky I never lost any data. It's a bit scary to think about now
BuyVM uses RAID so you're less likely to lose data, and yet I still have daily backups and verify my backups at least monthly.
Vote for Fran in 2020.
2024, 2028, etc.
Making a mistake is not that horrible. The point is to gain the experience and not keep repeating the same mistakes.
Which is lost on the majority of LET lurkers who sign up for the cheapest deals.
That was me, and guess what? The status page said there was "an incident". It had no ETA for recovery and no explanation of what was going on. Additionally, there was a link you could click for a more detailed explanation, but guess what? Hyperlinks are supposed to be underlined for a fucking reason, and all these assholes and their CSS have ruined the web for colorblind people like me when they ** DON'T UNDERLINE THEIR FUCKING HYPERLINKS ** Why? Because we can't tell it's supposed to be fucking clicked on. Why? Because it's not fucking underlined, per the original HTML spec.
If they are going to throw up all over the Internet with their retarded CSS, hosts should at least send an email with an explanation, end of discussion.
"There was an incident in Los Angeles" with a fucking invisible hyperlink isn't good enough.
YOU DIG?
I got an email dated Sat, 22 Dec 2018 17:15:42 +0000 titled "Partial Power Outage in Las Vegas" that described in detail what had happened.
I got an email dated Fri, 34 Nov 1864 25:69:42 +0000 titled "BUY PENIS PILLS NOW 30% OFF" that described in detail what could have happened.
Sucks to be colorblind. You might need to look at browser accessibility options, or maybe an add-on that overrides the website's styles. A quick Google search made it sound like accessibility options should take precedence over a site's CSS. Since people have been complaining ever since this started, someone must have come up with a workaround for colorblind users.
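If nothing built in works, a user stylesheet can force underlines back on. A minimal sketch, assuming Firefox; the profile directory name here is a placeholder, and newer Firefox builds also need toolkit.legacyUserProfileCustomizations.stylesheets set to true in about:config before userContent.css is honored:

# Create the per-profile stylesheet directory (profile name is illustrative)
mkdir -p ~/.mozilla/firefox/xxxxxxxx.default/chrome
cat >> ~/.mozilla/firefox/xxxxxxxx.default/chrome/userContent.css <<'EOF'
/* Force underlines on every link, regardless of the site's CSS */
a { text-decoration: underline !important; }
EOF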
No, my point is data loss IS horrible and not as trivial as some are treating it. A mistake that leads to downtime is not that horrible. Data loss is.
Learning mistakes are great - when it's not at the expense of your money and data.
(I don't have services with BuyVM, just speaking in general. People are really chill about this and that's not normal).
email from buyvm
Don't complain about BuyVM, too many fanboys around here.
I'm a bit disappointed with this problem, but luckily I had a spare backup somewhere else, so it only took a few minutes to restore my interrupted services. I'm glad they were transparent about this problem; looking forward to better service from BuyVM.
Good luck with fsck
cheers
Yes, I always have a remote backup, but the number of files is huge and it will take a long time to back everything up again. Hope it won't happen again.
I'm doing my very best to not gloss over what happened, and if you read the big email I sent on Christmas Day, you'll see it was just a lot of fumbling on my part and not having a "plan of action" for such cases.
The vast majority of users with issues have been fixed with a minor FSCK. Some users had serious damage, but a chunk of that was related to QCOW not checking properly. We had to apply a lot of experimental patches to qemu-img in hopes of repairing some users, but they didn't work. There are outstanding issues in qemu-img where it tries to allocate TBs' worth of memory to work. QCOW was originally picked so we could offer snapshots/backups of block storage in the New Year, but that's not going to happen. We've swapped all new provisions to RAW-based images and will find a way to snapshot/backup those instead.
Some users had total failure; there's no excusing that. We've done our best to accommodate people, usually with extensive credits, or in some cases an extra volume on the house so they can do their own RAID1 in their setup. If you had a failure and haven't talked to us, do so. Most people get around 3 months of credit.
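For anyone converting their own images the same way, the steps are roughly as follows; a minimal sketch with illustrative file names, best run against a copy rather than the live image:

# Report qcow2 metadata errors before touching anything
qemu-img check disk.qcow2
# Convert to a raw image; -p prints progress
qemu-img convert -p -f qcow2 -O raw disk.qcow2 disk.raw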
The additional power feed goes in early next week, so this particular issue won't happen again. It should never have happened, but with me having medical issues for the past 6 months, it simply got lost on my TODO list. A "plan of attack" has also been written so we know what did/didn't work if there's ever another issue.
The platform itself didn't fail, it kept chugging along. The underlying XFS filesystems didn't fail either; they simply didn't get an xfs_repair like they should have. The failure is squarely on me for panicking to get things resolved and not thinking clearly. It happens when you're half asleep.
We fucked up, and we'll make sure it doesn't happen again.
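For anyone running the repair themselves, the usual sequence is worth spelling out; a sketch with an illustrative device path, and the filesystem has to be unmounted first:

# Dry run: report problems without modifying anything
xfs_repair -n /dev/vg0/volume1
# Actual repair once the dry-run output looks sane
xfs_repair /dev/vg0/volume1
# Last resort if the log can't be replayed: -L zeroes it, which can
# lose the most recently written data
xfs_repair -L /dev/vg0/volume1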
Francisco
@ersite Santa Claus was bringing presents earlier in the datacenter, tripped on the cord, and fucked up the server, and the whole server room floor to be more precise. They're changing the piping as we speak.
Has Santa Claus been arrested?
No, he is still at large.
I've been hearing that GDPR is after Santa Claus. I also hear Google is after him for his data.
A message to Santa Claus: Stay strong. Don't let them have your data, which includes names, addresses, and sexual orientations.
"Glorified hobby shop" my ass. Good work Mr
San @Francisco
Frantastic.
Out of curiosity, did you observe any difference in recovery success between encrypted and non-encrypted volumes?
Reminds me of the parable where a new hire fucked something up and expected to be fired. When asked, the boss said, "Why would I hire someone else to repeat that same mistake, when I know this guy won't do it again?"
Nope. The few people I talked to with encrypted volumes usually got whacked by qcow issues, so recovery wasn't possible.
One customer had me transfer him a copy of his broken QCOW so he could see what he could fix. As mentioned, there have been lots of patches trying to address the memory allocation issues, but we haven't had much in the way of luck. One patch set is from just last week.
Francisco
Thanks for answering that
No problem.
One of the patch sets I applied was https://patchwork.kernel.org/patch/10731187/ but it doesn't actually fix anything. It doesn't OOM, but it more or less marks every single cluster/block in the image as bad, which is incorrect.
Any drives that were affected I've put away for safekeeping, and I'll test future patch sets against them.
Francisco
Do any of these patches claim to fix issues after the fact? I'd have expected them to stop corrupted data from being written, rather than fixing corruption that already happened.
That they were released so recently suggests qcow2 wasn't/isn't really ready for production.
I've been a bit tied up with holiday travel and haven't had a chance to mess with the qcow image much yet, but the conversion utilities don't work on it (they report corruption), and it looks like the file offset table has gotten messed up. I want to look into how those table entries are allocated (hopefully something simple) and see if there is any hope of reconstructing them.
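The header is an easy place to start; a rough sketch with an illustrative filename, field offsets per the qcow2 spec:

# The v3 header is 104 bytes; bytes 0-3 should be the magic "QFI\xfb"
xxd -l 104 broken.qcow2
# Bytes 36-39 hold the L1 entry count and bytes 40-47 the L1 table offset
# (both big-endian). Seek there and check whether the cluster offsets
# recorded in the L1 table look plausible for the file's actual size:
xxd -s 196608 -l 256 broken.qcow2   # 196608 is an example L1 offset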
I think this just shows that no matter how well a host's storage array is set up, a diligent VPS customer should have a backup in another location to mitigate rare events like this.
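Even something as simple as a nightly pull to a box at another provider covers most of it; a minimal sketch with an illustrative host and paths:

# Mirror the data directory to an offsite machine over SSH;
# -a preserves permissions and timestamps, --delete keeps the copy exact
rsync -a --delete /srv/data/ backup@offsite.example.com:/backups/vps1/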