
Comments
That garbage came from clusters of $30k GPUs, which will have less than half their value for producing garbage in 2y, 3y tops.
I don't fully understand what you said. But let me explain how that figure may have come about. It's revenue/(capex + opex + depreciation of existing capacity - remaining value of hardware).
Last year, it's (40B to 60B)/(200B+50B+65B-100B) = 18.6% to 27.9%
In 2026, it's going to be ??B/(500B+60B+107.5B-250B)
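The arithmetic in the post above can be checked with a short sketch. The function name `cost_recovery_ratio` is made up for illustration; the figures (in billions of USD) are the ones quoted in the post, and the 2026 revenue stays unknown, so only the denominator is computed.

```python
def cost_recovery_ratio(revenue, capex, opex, depreciation, remaining_value):
    """Revenue divided by net cost, all in billions of USD."""
    net_cost = capex + opex + depreciation - remaining_value
    return revenue / net_cost

# Last year's figures as quoted in the post: revenue of 40B to 60B.
low = cost_recovery_ratio(40, 200, 50, 65, 100)
high = cost_recovery_ratio(60, 200, 50, 65, 100)
print(f"last year: {low:.1%} to {high:.1%}")  # last year: 18.6% to 27.9%

# 2026: revenue is unknown (??B), so only the denominator can be computed.
denominator_2026 = 500 + 60 + 107.5 - 250
print(f"2026 net cost: {denominator_2026}B")  # 2026 net cost: 417.5B
```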
Nope, the relation is transitive not 2 way.
The owner of the AI uses it as a tool to turn you into a tool. You and the AI are not tools to each other; you are both tools working for the owner of the AI. It's a transitive relation, not a 2-way one, and a very asymmetric relationship.
It is an arrangement that never works to your benefit; you always draw the short straw.
This is some cost recovery ratio-ish type of calculation. It does not measure utilization at all.
You could have 90% utilization and still have a low cost recovery ratio if your pricing is too low, or you could have 30% utilization but charge enough to cover everything.
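A toy example of the point above, that utilization and cost recovery are separate things. All numbers here (capacity, prices, total cost) are hypothetical and picked only to show the effect; they are not real hyperscaler figures.

```python
def cost_recovery_ratio(capacity_gpu_hours, utilization, price_per_gpu_hour, total_cost):
    """Revenue from sold GPU-hours divided by total cost."""
    revenue = capacity_gpu_hours * utilization * price_per_gpu_hour
    return revenue / total_cost

# Hypothetical fleet: same capacity and same cost in both scenarios.
CAPACITY = 1_000_000    # GPU-hours available
TOTAL_COST = 2_000_000  # USD

# 90% utilization, but priced too low to cover costs.
busy_but_cheap = cost_recovery_ratio(CAPACITY, 0.90, 0.50, TOTAL_COST)

# 30% utilization, but priced high enough to cover everything.
idle_but_pricey = cost_recovery_ratio(CAPACITY, 0.30, 7.00, TOTAL_COST)

print(f"90% utilized: recovers {busy_but_cheap:.1%} of cost")
print(f"30% utilized: recovers {idle_but_pricey:.1%} of cost")
```

So the ratio alone can't tell you how busy the hardware is unless you also know the pricing.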
AI agents are being deployed with the purpose of deskilling independent programmers so that, as soon as possible, only the owners of AI agents and no one else will be able to produce software.
Their intent is obvious: they want to kill competition, and they hope to achieve that by deskilling programmers while using them to upskill the AI agents. All that while charging their "users" for the "benefit" of being thrown under the bus.
wow i've never thought about it like that, that's so incredibly true
Everyone’s worried about AI replacing programmers, but as NDR systems become more powerful and AI-driven, what safeguards exist to prevent them from evolving into large-scale surveillance infrastructure, especially if adopted at ISP or government levels? Sure, they monitor people now via cameras and so on. But what happens if your active connection is already being read by an AI because your ISP is running it?
I use Gemini because I share the premium plan with friends (3.50 € for Gemini premium and 2 TB). I also use VS Code, but with Claude Code running GLM 5 from z.ai.
It's impossible to measure the real utilization rate. So we can approximate it from what the industry paid, against the capex, depreciation and leftover value from last year's balance sheet. Of course, this assumes hyperscalers aren't summerhosts that sell at cost or even below actual cost. So the real utilization must be lower than this approximation.
Note: opex is removed because we're trying to approximate the utilization rate, not the ratio of money earned to money spent.
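A minimal sketch of that opex-free approximation, assuming it reuses last year's figures from the earlier post in this thread (revenue of 40B to 60B, capex 200B, depreciation 65B, remaining hardware value 100B). That pairing is my reading of the note, not something the poster spelled out, and the function name is made up.

```python
def utilization_upper_bound(revenue, capex, depreciation, remaining_value):
    """Revenue over net hardware cost (no opex), in billions of USD.

    Treated as an upper bound because it assumes capacity is not sold
    below cost.
    """
    return revenue / (capex + depreciation - remaining_value)

low = utilization_upper_bound(40, 200, 65, 100)
high = utilization_upper_bound(60, 200, 65, 100)
print(f"approx. utilization ceiling: {low:.1%} to {high:.1%}")
```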
Yeah, it's kinda neat and has enough features for me
For Claude + Gemini, it's mostly switching because there's not enough quota lol
When the Opus quota runs out, I switch to the latest Gemini to complete the rest of the code haha
Can't say anything about it as I've never tried Copilot before haha
I tried Antigravity with the Google AI Pro trial for the first time a while back, and it suited my needs
I'll try Copilot soon
Claude. Loving claude code, tho honestly mostly for the lazy crap that doesn't really matter that I can't be bothered to spend time on myself, lol. Kept burning the quota for pro, eventually swapped to max. Expensive, but it's nice not having to worry about quota.
Yesterday/Tonight's random claude code session: minimalistic dark mode userscript for LET that includes a nice module at the top to stalk all of @host_c posts
Interesting to see that there are not too many here using Codex for coding. Found that it's been more dependable and "trustworthy", if you can call it that, over Gemini and Claude. The latter two are definitely incredibly intelligent and capable, but I think can be a bit lazy and less of a "senior engineer" than I'd like. Codex feels better to me in that regard, with GPT-5.4.
Resurrecting this thread over Claude's crazy-low limit on non-API. It feels like the limit over the last few days is lower than it was 2 weeks ago.
The last 2 weeks were a 2x promo.
It ended a few days ago. So you are halved now.
ADDITIONALLY for non-API: https://www.theregister.com/2026/03/26/anthropic_tweaks_usage_limits/ (in addition to the halved quota)
So you are more than halved now during weekends and off-peak.
Agree. If they keep this going, they are done in the private-customer arena... unless you go with the expensive API mode
I'm using it outside of peak hours, and it still feels like I'm getting a quarter of the quota.
Anyway, I saw something
https://old.reddit.com/r/ClaudeAI/comments/1s8zxt4/thanks_to_the_leaked_source_code_for_claude_code/
Thanks to the leaked source code for Claude Code, I used Codex to find and patch the root cause of the insane token drain in Claude Code. Usage limits are back to normal for me!
https://github.com/Rangizingo/cc-cache-fix/tree/main
Disclaimer : Codex found and fixed this, not me. I work in IT and know how to ask the right questions, but it did the work. Giving you this as is cause it's been steady for the last 2 hours for me. My 5 hour usage is at 6% which is normal! Let's be real you're probably just gonna tell claude to clone this repo, and apply it so here is the repo lol. I main Linux but I had codex write stuff that should work across OS. Works on my Mac too.
Also Codex wrote everything below this, not me. I spent a full session reverse-engineering the minified cli.js and found two bugs that silently nuke prompt caching on resumed sessions.
What's actually happening
Claude Code has a function called db8 that filters what gets saved to your session files (the JSONL files in ~/.claude/projects/). For non-Anthropic users, it strips out ALL attachment-type messages. Sounds harmless, except some of those attachments are deferred_tools_delta records that track which tools have already been announced to the model.
When you resume a session, Claude Code scans your message history to figure out "what tools did I already tell the model about?" But because db8 nuked those records from the session file, it finds nothing. So it re-announces every single deferred tool from scratch. Every. Single. Resume.
This breaks the cache prefix in three ways:
1. The system reminders that were at messages[0] in the fresh session now land at messages[N]
2. The billing hash (computed from your first user message) changes because the first message content is different
3. The cache_control breakpoint shifts because the message array is a different length
Net result: your entire conversation gets rebuilt as cache_creation tokens instead of hitting cache_read. The longer the conversation, the worse it gets.
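A toy model of why moving the reminder from messages[0] to messages[N] kills a prefix cache. This is only a sketch: the message strings are hypothetical, and real prompt caching works on tokens and cache_control breakpoints rather than whole-message comparison. It just shows that a prefix match stops at the first difference.

```python
def shared_prefix_len(cached, request):
    """Number of leading messages that are identical in both lists."""
    n = 0
    for old, new in zip(cached, request):
        if old != new:
            break
        n += 1
    return n

# Hypothetical message contents, for illustration only.
system_reminder = "system-reminder: deferred tools announced"
history = ["user: fix the bug", "assistant: done", "user: add a test"]

# Fresh session: the reminder sits at messages[0], followed by the history.
fresh = [system_reminder] + history

# Same session continued normally: one more message appended at the end.
continued = fresh + ["assistant: test added"]

# Resumed session in the buggy case: the announcement record was not saved,
# so the reminder is re-announced later in the array instead of at the front.
resumed = history + [system_reminder, "assistant: test added"]

print(shared_prefix_len(fresh, continued))  # 4
print(shared_prefix_len(fresh, resumed))    # 0
```

With nothing shared at the front, everything after the system prompt has to be written to cache again, which matches the cache_creation growth described in the post.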
The numbers from my actual session
Stock claude, same conversation, watching the cache ratio drop with every turn:
Turn 1: cache_read: 15,451 cache_creation: 7,473 ratio: 67%
Turn 5: cache_read: 15,451 cache_creation: 16,881 ratio: 48%
Turn 10: cache_read: 15,451 cache_creation: 35,006 ratio: 31%
Turn 15: cache_read: 15,451 cache_creation: 42,970 ratio: 26%
cache_read NEVER moved. Stuck at 15,451 (just the system prompt). Everything else was full-price token processing.
After applying the patch:
Turn 1 (resume): cache_read: 7,208 cache_creation: 49,748 ratio: 13% (structural reset, expected)
Turn 2: cache_read: 56,956 cache_creation: 728 ratio: 99%
Turn 3: cache_read: 57,684 cache_creation: 611 ratio: 99%
26% to 99%. That's the difference.
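The ratios quoted above look like cache_read / (cache_read + cache_creation). That formula is an assumption (the post doesn't state it, and it ignores any uncached input tokens), but it does reproduce every percentage listed:

```python
def cache_hit_ratio(cache_read, cache_creation):
    """Share of cache-related input tokens that were served from cache."""
    return cache_read / (cache_read + cache_creation)

# (cache_read, cache_creation) per turn, copied from the post.
stock = {1: (15_451, 7_473), 5: (15_451, 16_881),
         10: (15_451, 35_006), 15: (15_451, 42_970)}
patched = {1: (7_208, 49_748), 2: (56_956, 728), 3: (57_684, 611)}

for label, turns in (("stock", stock), ("patched", patched)):
    for turn, (read, created) in turns.items():
        print(f"{label} turn {turn}: {cache_hit_ratio(read, created):.0%}")
```

Note that in the patched run, turn 1's cache_read + cache_creation (7,208 + 49,748 = 56,956) is exactly turn 2's cache_read, which is what you'd expect if the whole prefix got cached once and then reused.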
There's also a second bug
The standalone binary (the one installed at ~/.local/share/claude/) uses a custom Bun fork that rewrites a sentinel value cch=00000 in every outgoing API request. If your conversation happens to contain that string, it breaks the cache prefix. Running via Node.js (node cli.js) instead of the binary eliminates this entirely.
Related issues: anthropics/claude-code#40524 and anthropics/claude-code#34629
I didn't notice this because I turned off auto-update
Currently using a combination of OpenAI (via OAuth, Plus plan), Alibaba's Coding Plan, and Z.ai coding plan (paid $20ish for a year of lite) - In that order.
Gipperty for orchestration, Qwen/GLM for execution, GLM via Z.ai in the unlikely chance I run out of usage.
We're in a weird timeframe where coding plans are popping up all over the place, might be worth trying out one or two to see if they work for you. Admittedly, I haven't used Anthropic's models directly so I don't have a comparison, but it's working alright for me. I'm not employed for dev work so YMMV.
Mostly Gemini because I have free pro. Thinking about buying claude pro.
Do you happen to know what payment methods Z.ai takes, or what merchant/company they bill as? It looks interesting, but I have to pay heavy international transaction fees
I just rebuilt the leaked Claude Code source; now it's gonna be my main tool after I add support for certain API types
Using Gemini for study
You can google search the word and get a very quick definition/meaning. I just select the word, right click and then search Google.
Honestly, I can't remember - I have cards with 0% international fees, so it wasn't something on my mind.
Z.ai is in last place for a reason though: when I signed up with them in Jan (I paid for a year, apparently 3x Claude Pro usage, for $28) it was stupidly slow and unreliable. I'm not sure if that's improved, but I'd tread lightly and look for up-to-date reviews.
They use Orb Billing if I'm not mistaken. Tried it once, but the token allocation was super tight, which made me switch back to Copilot. I still have the invoice, let me check
What are we talking about? or who are we talking about?
GPT-5.4 is the best for now. Hope Codex CLI could learn something from the source of Claude Code.
Still Claude Code on the Max $100 plan, paired with ChatGPT Plus $20, Gemini AI Pro $20, the GLM Coding Pro plan at $129/yr discounted, and GitHub Copilot Pro free via the open source maintainer program
Claude Code is my primary: it used 5.5 billion tokens over the past 30 days, of which 3.3 billion tokens were during the recent 2-week 2x off-peak usage promo.
I added my Claude Code ai-image-creator skill so Claude Code can create images by leveraging the Google Gemini Nano Banana 2, Flux 2, Riverflow V2 Pro, Bytedance Seedream 4.5, and OpenAI GPT-5 Image models
https://github.com/centminmod/my-claude-code-setup
Examples
https://www.threads.com/@george_sl_liu/post/DWj1-gWk1il?xmt=AQF0K2lQnUhz5abjz34ijzkS8-wIw0fBlhob2olL2sJuikRqUe9BeqXeuDX0wsvZbCx79KM&slof=1
https://www.threads.com/@george_sl_liu/post/DWjfrpJk5_D?xmt=AQF0mv0sm-aJ7k5ha-oRXJ-QUNZ0F2oJbxPxpBiXJY9KAoMoufhMsJmmeR0wftUpsQftjj4&slof=1
The date is April 1 2026, not early 2027. So I'm sure 5.5B tokens would have cost more than $6k. Are you paying out of your own pocket or not?
That's the beauty of Claude Code Max subscriptions. I export my usage metrics from Claude Code's native OpenTelemetry to Grafana, and I built my own Grafana MCP so Claude Code can query its own usage over time https://github.com/centminmod/claude-code-opentelemetry-setup
Example dashboard just after promo for past week https://www.threads.com/@george_sl_liu/post/DWeICtdE3xf?xmt=AQF0JxPCBiDNNjoh9f8IMh__ol2e5sRgFlVJ7AJGpVrwLIH0ZgRp7XMzHRxihY-4fvzSPr5F&slof=1
Ah, you incorporated cache hit into the counter