Post Mortem on Cloudflare Control Plane and Analytics Outage
https://blog.cloudflare.com/post-mortem-on-cloudflare-control-plane-and-analytics-outage/
It isn't too informative, but it does show the commitment Cloudflare is putting into ensuring high availability of all their services.
Comments
Honestly, people sometimes complain that Cloudflare does occasionally go down but I'm genuinely amazed that it's up at all.
Given their levels of traffic, and the flexibility their setup needs to accommodate, it's really amazing any of it works at all.
Hope they got a few good nights' sleep after this incident.
1 tech that started a week earlier. He/She must have been sweating buckets.
A good informative read. Thanks for sharing; I probably wouldn't have gone looking for it.
Post mortem nicely covered how some vital parts of CF infra were not HA. Particularly new features. "Just roll out, we will sort this later" approach. Majority of stuff was single homed. What a shame.
P.S. CF's share price actually went up during the outage.
I've learned basically two things from that:
Flexentia (or similar, their critical infrastructure DC provider) seems to be a real sh#t show. Some examples:
TL;DR
Matthew Prince should make sure he actually and really controls his ship and doesn't let people who are evidently not top-class, especially in innovations, sink it.
And Flexblabla [or whatever] seems to be a bad joke at best. Tier 3, my ass. They may be adequate for John, Mary, and their dog customers like small businesses in Oakland (well, what's left) but for a major customer that is critical for the internet, their funny blinking "tier 3" fun box evidently isn't adequate.
Lessons learned: (a) keep innovators (and crap boxes with a chief innovator) on a tight leash, and (b) Use adequate tools and critical infrastructure providers!
And not very transparent. I looked on their site for a status page or outage update page or anything similar and couldn't find it.
All of that uptime is definitely relevant. No one can pull off 100% over a long enough time frame.
Their status page is here: https://www.cloudflarestatus.com/
I meant a Flexential status page.
"viawest - proprietary and confidential" at bottom of the diagram. Ballsy cloudflare showed it publicly lmao
They've been taking non stop L's at the moment.
They had that big power failure on the east coast a few months ago that made @EthernetServers move. There was the fire that happened in LAX via Krypt. Then there was the dedipath drama.
Francisco
Heh, yeah. I think they know Flexential is not exactly in a great position to go after them. The blog post pretty strongly hints that, in Cloudflare's opinion, Flexential has breached their colocation agreement in a number of ways.
Even if the post is light on certain details (and heavy on blame for Flexential), I really respect companies that are so open about their internal operations. Backblaze is another good example of this.
I also love when a CEO of a technology company actually seems to have some technical knowledge and doesn't use only business buzzwords. I'm not sure how Prince is as a manager, but this post certainly makes it sound like he's in the trenches.
Sorry me sometimes stupid with English. "taking L's" means what?
"Taking L's" = losses, failures, etc.
"Taking W's" = wins, etc.
Francisco
Probably yet another masterpiece of their chief of innovation ... *smirk*
"A tech corp doesn't need a CTO, let alone a status page. Our status is always 'we are sooo innovative'"
IMHO this part alone speaks volumes about the company
Hell... I was changing my nameservers to Cloudflare before I could even log in to my CF dashboard... then the nightmare began. I've been stuck for 3 days. Luckily it's a development site!
Cloudflare may not be perfect - no business is
But one thing I do have respect for is the fact they bother to communicate properly - both during and after incidents.
Plenty of businesses don't do that and I think that's what riles customers and/or users more than anything.
I had no troubles with Cloudflare DNS though.
How dare you discredit @Francisco like that.
As an ex-CF employee I would say it's the opposite. I would definitely not characterize Matthew as down to earth or a decent person. He has an excellent public image, but internally it's quite the opposite. Anyone who has been on a call with him can confirm, especially if competition was discussed. The CEO/CTO duo were the worst part of working at CF.
The company on the other hand is surprisingly good, with the engineering people being the smartest people I've ever met. Unfortunately that didn't apply to product or project managers at all, most of whom had zero technical knowledge or experience in IT. Most came from completely unrelated fields like logistics, oil, clothing... So if a CF product you love seems to have strange priorities, this is why.
Still, despite the issues and the complaints, I've yet to see a company being able to compete with CF on innovation, speed of development or pricing. Every "birthday week" they basically kill a dozen startups.
good read, thank you
Should I invest 100k in CF and hope it's the next Google in the future?
Based on his public statement I see a quite different man.
Congrats btw to all the "innovation" (ex) employees who don't much like Mr. Prince! The innovations detailed in Mr. Prince's statement certainly succeeded in gaining lots of attention.
And an extra special award goes to Flex[whatever]. Very innovative indeed!
From what I saw, Mr. Prince's statement also clearly shows a man who acts in an honest, transparent, and straight way (as far as possible). It's not just image blabla, he actually and really apologized, stated that the company f#cked up, and tried to quite frankly lay out what went terribly wrong and why.
(Btw: can you show me any not small company where each and every top manager is well liked by all employees? I don't hold my breath).