New on LowEndTalk? Please Register and read our Community Rules.
All new Registrations are manually reviewed and approved, so a short delay after registration may occur before your account becomes active.
All new Registrations are manually reviewed and approved, so a short delay after registration may occur before your account becomes active.

Comments
The server is already pretty bad from the start (at least since I rent the seedbox), I've observed occasionally I/O hangs and zombie processes and sometimes the whole server is dead.
I was fine with needing support manually restart their server every now and then (given how powerful (in terms of storage and traffic cap) the seedbox is for the price).
I already know their drive is bad but the server has stabilized quite a bit in 2025 (fewer down time compare to 2024) so I figure why not keep the status quo. I didn't want to do the whole rent a new seedbox for a month -> rsync things -> ask them to provision -> rsync back to new node -> cancel temp seedbox, that end up bite me hard.
Man, you guys are right. We could be saving a lot of money on AI subscriptions.
Honestly? With new capabilities most, because humans just don't have the bandwidth to do deep dives like an LLM does.
if i recall this incident correctly;
AND, it also fumbled because of my previous recovery attempt had misleading drive metadata, so there was 2 different drives for that position. I was being too careful and didn't wipe one of the drives, which caused data to be corrupted. Lesson learned and saved, if i notice 8month old drive next time, i will instantly wipe the metadata before replacement.
I was watching it happen, cancelled it but the resync had already started by that time, and e2fsck on fundamentally broken data. It had force assembled (as per my previous instructions), hence that 8month old stale data was taken as ground truth causing most of the strides to be recalculated and rewritten with incorrect data.
It was basically recovery attempt beyond what should have been attempted, replacement kept dropping out of the array too.
After this i instructed to give even higher detail explanations. That was still the TL;DR; ELI12 version if i remember right.
--- meanwhile we recovered 600TB of raw capacity at the same time on other server.
It is also hellbent on doing per customer fixes instead of the old every one gets it or no one adage to save human time. It has quasi-infinite processing time in this context, so it does even per user fixes.
My job as sysadmin/developer has certainly got more intensive after this, right now i am running 18 windows of various agents JUST on väinämöinen. Tho not all of it is technical tasks. But oh man the hyperfocus is something else, by BPM is reaching sleep levels while i am working.
Some findings internally led that today full autonomy is disabled, and i am handling random tickets interactively, watching it work and patching the low hanging fruit first. I had essentially broken it with too many memory recalls! Imagine that. 1M Context was full before proper diagnosis could even start, oopsie.
This is much much more context than a human can take in on per case basis, a human gets bored with the mundane, basicly all basics like transactions invoices, past emails etc. all come for "free", along with near infinite memory for past incidents -- semantically wiring all that memory in is the difficult part.
There are various structural changes i want to make, but that changes core architecture as well -- which means there is a time delay with that.
My experience differs
Provider using LLM for support (important layer for representation of company) - has no respect towards customer. Just don't care. I tend to avoid such providers like plague.
Thanks, in super user mode worked .... flawlessly. That's saying: Did nothing
First response was ... "garbage". Subagent pulled stale data, and main agent didn't cross examine. They are just too hasty and sometimes returns the first match. That is often oldest record, does not double check. Our memory system had not yet marked this as weak memory, that corpus update is still running and will be for many days ... Equilibrium expected in a ~month for current corpus ... when we have another month memories.
Now the actual responses;
ELI5: What is RAG?
Imagine you have a really smart friend who read every book 2 years ago but hasn't read anything since. That's an LLM — knows a lot, but knowledge stops at training.
RAG = giving that friend a library card.
Before answering your question, the system searches a database for relevant snippets, staples them to your question, and hands both to the LLM. Now the LLM answers using fresh, specific information it wasn't trained on.
The Standard Pipeline
That's it. Every "chat with your docs" product, every AI customer support bot that claims to "know your account" — it's this.
The Deeper Dive (From Our Actual Research)
Why Naive RAG Is Dying
From research file, sourced from actual papers:
1. Google DeepMind Proved a Fundamental Ceiling
Weller et al., arxiv:2508.21038, Aug 2025
2. Claude Code Dropped RAG
Boris Cherny, Anthropic, Claude Code team
Four problems they hit:
3. Amazon Confirmed Keyword Search Gets 90%+ of RAG Performance
Subramanian et al., arxiv:2602.23368, Feb 2026
"Keyword search is all you need" — keyword search via agentic tool use achieves over 90% of RAG-level performance without a vector database.
4. Chunk Boundaries Lose Information
When you split documents into chunks, 15-25% of queries lose information that spans boundaries. If cause and effect aren't in the same chunk, the system has no idea they're connected.
What Replaced It (In General)
The research consensus from 2025-2026 isn't "RAG is dead" — it's "naive RAG is dead." What's replacing it:
What We Actually Do
Our system uses layered search that escalates cost only when cheap methods fail:
grep resolves 60-70% of queries. With human-curated keywords in file headers, ~85% never need the LLM layers at all.
Why This Beats RAG at Our Scale
The human curated metadata is not true, it thinks so because i accepted the proposal. I simply verify the plans, architecture, rules. But even there is potential for improvement ... For future. This is recursive, never ending recursive feedback is possible, to constantly distill and improve on the memory.
I believe two most important things are Memory with LLM reasoning and understanding. I think this builds the true deep cognition together.
Funny enough, the typical grok, chatgpt, claude.ai chat history searches cannot do even basic grep. 100% match, and still unable to find.
I've read this comment 5 times and I'm still not clear whether you're saying that the AI Agent autonomously performed the actions that led to the customer's data loss or not.
My understanding of what you're saying is that a human engineer left a landmine, (unwiped drive), that the AI Agent autonomously stepped on, (by triggering a resync), and those factors combined to blow up the RAID array. Is that correct?
You wasted your time in life by reading AI slop...
Both happened, but the root is with me personally, i was the root cause for not finishing the job, leaving corrupted drive with metadata etc. but Agent didn't get memory triggered on that neither.
Me as a a human i have severely restricted I/O capacity compared to an LLM. The time it takes me just to find the blockdevices; The AGENT has mapped all drives, checked all smart data and actually read all the smart attrituves and understood them.
That is the depth difference. Me as a human staring at 25 tickets backlog knowing most take 3 minutes but there might be the mine which takes 12 hours to solve just don't spend the level of redundant reading time to double check everything.
I only do the high level most important stuff myself, such as development work, blog posts, mass communication like this. While i am building this, i am also managing a big batch of new traditional (mostly) servers being built, designing an PCB for the mPlate / MD series power control integrating everything, and building auotmation for MD and managing the finishing touches on the new DC for contractors: 1 row of racks got wired in couple of days ago: https://files.catbox.moe/mtz4vm.mp4
tl;dr it will continue to happen due to oversubscription of human resources.