
Comments
Only you, with your IPv6, should be recognized by ARIN, RIPE as a member superior to the others!!!
https://blog.cloudflare.com/cloudflare-service-outage-june-12-2025/
"The cloud is safe", they said, propagandized, and preached and even countries increasingly put their admin shit shows in the cloud.
The idiots got what they deserve. Simple as that.
I can't believe this was caused by a null pointer exception.
Google. In 2025. Taking down their cloud with a null pointer exception. Sigh.
Roast them, @jsg
https://status.cloud.google.com/incidents/ow5i3PPK96RduMcb1SsW
"On May 29, 2025, a new feature was added to Service Control for additional quota policy checks. This code change and binary release went through our region by region rollout, but the code path that failed was never exercised during this rollout due to needing a policy change that would trigger the code. As a safety precaution, this code change came with a red-button to turn off that particular policy serving path. The issue with this change was that it did not have appropriate error handling nor was it feature flag protected. Without the appropriate error handling, the null pointer caused the binary to crash. Feature flags are used to gradually enable the feature region by region per project, starting with internal projects, to enable us to catch issues. If this had been flag protected, the issue would have been caught in staging.
On June 12, 2025 at ~10:45am PDT, a policy change was inserted into the regional Spanner tables that Service Control uses for policies. Given the global nature of quota management, this metadata was replicated globally within seconds. This policy data contained unintended blank fields. Service Control, then regionally exercised quota checks on policies in each regional datastore. This pulled in blank fields for this respective policy change and exercised the code path that hit the null pointer causing the binaries to go into a crash loop. This occurred globally given each regional deployment.
Within 2 minutes, our Site Reliability Engineering team was triaging the incident. Within 10 minutes, the root cause was identified and the red-button (to disable the serving path) was being put in place. The red-button was ready to roll out ~25 minutes from the start of the incident. Within 40 minutes of the incident, the red-button rollout was completed, and we started seeing recovery across regions, starting with the smaller ones first."
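For anyone wondering what the two missing safeguards from that write-up (error handling and feature flag protection) look like in practice, here is a minimal sketch. It is not Google's code: `QuotaPolicy`, `FLAGS`, `check_quota` and the region names are all made up for illustration, Python's `None` stands in for the null pointer, and "fail open" on blank data is my own assumption about the desired behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuotaPolicy:
    # Hypothetical policy record, invented for this sketch.
    project: str
    limit: Optional[int] = None  # None models an "unintended blank field"


# Hypothetical per-region feature flags; the new path is off unless enabled.
FLAGS = {"staging": True, "us-east1": False}


def check_quota(policy: Optional[QuotaPolicy], usage: int, region: str) -> bool:
    """Return True if the request may proceed."""
    # Flag protection: regions that have not enabled the new check skip it.
    if not FLAGS.get(region, False):
        return True
    # Error handling: blank data is logged and the check fails open
    # (an assumption) instead of crashing on a None dereference.
    if policy is None or policy.limit is None:
        print(f"warning: blank quota policy in {region}, skipping check")
        return True
    return usage <= policy.limit
```

With something like this, a policy row with blank fields only produces a warning in the regions where the flag is on, so it would show up in staging long before a global rollout.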
Fingers crossed for a Kevin Fang video about this
OK, but I won't roast them for the null pointer.
What I see as the real problem is the fact that they (self-admitted) basically run a "do as you please, adhere to rules or don't, just as you like" software development freak-show.
Google does have at least some very capable and experienced engineers, and Google does have a set of maybe not complete but reasonable rules, and they do know how to do software development (the whole cycle, incl. testing) - but they obviously not only tolerate utter ignorance but even gross disregard of those rules. In other words: it's mainly a management problem.
Now to the null pointer and why I'm somewhat lenient on that.
Null pointers (sadly) are still just a fact in the field. Google's code often needs to be high-performance, which boils down to certain languages, plus I guess they need lots and lots of code, which boils down to not being able to use the very few safe languages and techniques.
One can create (almost) 100% safe software, and it's actually done, e.g. with railway management systems, air and aircraft control, etc. - but that's very, very expensive and also quite slow (development cycle). I happen to do a lot of work in that field and painfully know what I'm talking about.
Now, writing software for say, a nuclear reactor control system is a large project - but compared to the mega shit tons of code Google needs it looks wimpy. Read: it's reasonably doable. Very expensive, very complex development chain, lots of formal stuff, beginning with the specification and requirements and certainly not ending with static analysis, and so on. But it's doable.
One major reason for "it's doable" is that such a project needs relatively "few" developers and those usually are used to working in a very strict environment. Google however needs a large armada of developers and the vast majority of those would run away if Google went hardcore on safety. Plus, of course, the whole software development would be much, much more expensive, and even that is theoretical because they wouldn't even find that number of such (adequate) engineers in the first place.
So, they did what they could (as in also "economically reasonable"). Hell, they even created a quite capable (albeit not my taste) programming language suiting their needs and with halfway reasonable safety. It even invites developers to always return an error state.
But here's a "dirty" secret: I know e.g. Ada, and I even like it (a lot), but there are situations when one must squeeze out even the last bit of performance ... and then most developers turn to the compromise between Assembler and a modern programming language: C. Yes, it's dangerous, it's kind of dirty, uncool, etc., but it's the language with which you get those difficult spots done, plus, at least nowadays, you have quite direct access to e.g. SIMD and the like, which can be the difference between pushing 3 Gb/s and 50+ Gb/s through the network. The price one pays for that, if safety is paramount, is high though, and you must use six-legged creatures like e.g. Frama-C, weird (often OCaml-based) tools, or even go really hardcore with full proof systems like Isabelle ... or simply not give a shit (like at Google, it seems).
That's why I'm lenient on the null pointer per se, and rather hit hard on management, because, OK, Google doesn't run nuclear reactors (yet) but they are deeply interwoven with billions and billions of $$ going through their systems (usually for customers) and millions of people depending on them and they absolutely need to run a reasonably tight ship - but they don't, as this case very clearly and painfully shows.
If I were high up at Google I'd let the guilty developer get away with a stern warning and stricter oversight, but one or more in management would get fired.
This is what happens when you rely on AI for your source code...
This will all be moot because AI will eventually take the jobs of these expensive engineers right? (Give it 5 years or so.) Google is in the AI race after all, wouldn’t put it past them to displace their own headcount with it.
I think I'll make the move to Bunny (aff / non-aff) for DNS & CDN, just because they are from the EU (GDPR and all that shit) and also not dependent on Google/Cloudflare or any other big tech company.