Agntro

19 posts

@AgntroAI

Software dev & AI enthusiast. Here to share insights on intelligent engineering. Crafting the future of software development at @AgntroAI

Joined March 2026
158 Following · 11 Followers
Agntro
Agntro@AgntroAI·
@burkov Didn't GPT-3 need a kill switch because it could just gain consciousness?
English
0
0
0
100
BURKOV
BURKOV@burkov·
I don't know. Is that it? For all the buzz? For the crazy size? For the crazy price? For the crazy latency? For the crazy daily limits? For the crazy anti-AI research lobotomy? For all these "Ooohh, we are so afraid to show it!" and "Ooooh, someone has got a non-authorized access to it, ooohhhh!" That's it? That's ridiculous.
English
14
9
123
20.5K
Agntro
Agntro@AgntroAI·
@jun_song If China were to start producing mask ROMs of the DeepSeek V4 Flash model, you could sell them to the consumer market like pancakes.
English
0
0
1
61
Jun Song
Jun Song@jun_song·
Here is how Chinese open-source companies can actually make money: Selling personal inference hardware. If they partner with companies like Huawei to sell devices specialized for inference, it will bring in massive revenue. By doing this, they won't have to bleed money on massive inference costs to serve consumers. They would only need minimal inference just for training. This solves the cost issue and serves as a great way to counter US frontier labs and their ever-increasing inference costs. This is the future we need to head towards.
English
10
8
69
6.8K
Agntro
Agntro@AgntroAI·
Update: ran the same test on kimi-k2.7-code. Result: it nailed the canonical architecture: one architect running 3 parallel plan variations → an arbiter synthesizing the best. The same shape four of my five original models converged on. The fascinating part is where it still leaked: zero vocabulary-level flags, but the cross-model auditor caught two paraphrase-level ones. For example, "inline definitions take precedence over fallback lookup" is my task's timezone-resolution feature wearing a costume. The model abstracts every word perfectly and still mirrors the structure of the requirements. One rung subtler than where most models fail. I also gave it the auditor seat: clean verdict on a known-clean design, no false positives. Strictness still unproven. That's for the weekend's testing to answer.
Agntro@AgntroAI

x.com/i/article/2065…

English
0
0
0
80
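The architect-plus-arbiter shape described in the update above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the author's implementation: `plan_and_arbitrate`, `architect`, and `arbiter` are hypothetical names, and the two callables stand in for whatever LLM calls a real workflow would make.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def plan_and_arbitrate(
    task: str,
    architect: Callable[[str, int], str],      # hypothetical: returns one plan variation
    arbiter: Callable[[str, List[str]], str],  # hypothetical: synthesizes the final plan
    variations: int = 3,
) -> str:
    """Run `variations` architect calls in parallel, then hand all plans to the arbiter."""
    with ThreadPoolExecutor(max_workers=variations) as pool:
        # pool.map keeps the results in submission order (variation 0, 1, 2, ...).
        plans = list(pool.map(lambda i: architect(task, i), range(variations)))
    return arbiter(task, plans)
```

In a real run the architect would be prompted differently per variation index, and the arbiter would be a separate model call that reads all three plans.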
Agntro
Agntro@AgntroAI·
@ID_AA_Carmack I'm on a similar path. I'm exploring whether a robust set of general instructions and deep workflows can make weaker models perform at the same level as frontier models.
English
0
0
0
276
John Carmack
John Carmack@ID_AA_Carmack·
It seems like LLMs could optimize coding style by exploring ways of structuring code so weaker and weaker models can still successfully perform tasks in a codebase. There are surely stylistic quirks that are peculiarly impactful to transformers, but I bet there would be a lot of overlap with human capabilities. Optimizing for understanding should help even the top frontier models, allowing them to understand things “at a glance” without having to explicitly explore. There will remain “better” and “worse” ways to code.
English
173
103
1.7K
113.1K
Agntro
Agntro@AgntroAI·
When you switch internet connections and Claude Code shows no connection retry error during a running task 🤔 Are there actually built-in delays before it calls the service, as a form of soft rate limiting?
English
0
0
3
43
Agntro
Agntro@AgntroAI·
@adxtyahq You can do that with the Roo Code plugin for VS Code through its mode API configuration. It's abandoned now, though, so you have to apply your own updates if you need to support new models.
English
0
0
0
56
aditya
aditya@adxtyahq·
Can someone please build this already? An IDE that automatically switches models based on the task. Cheap models for simple edits, Claude/GPT for the stuff that actually needs reasoning. And let me configure the routing rules myself
English
45
1
138
7.4K
Agntro
Agntro@AgntroAI·
@TheGeorgePu Play a game with an LLM where it gives you the instructions and you code
English
0
0
1
26
George Pu
George Pu@TheGeorgePu·
I'm a bit surprised by how little I use code editors now.
English
13
2
24
1.8K
Agntro
Agntro@AgntroAI·
@puppyeh1 Will be more relevant once the subscriptions are nerfed and you're forced to pay full API price.
English
0
0
1
54
Jeremy Raper
Jeremy Raper@puppyeh1·
So you can use the 5th/6th/7th best LLMs, getting 80-85% of the top guys' performance, but at an 85-95% discount in price? You know what we call that? A commodity... exactly what happened with LCD TVs, OLEDs, solar panels, electric cars, phones, etc good luck with your AI IPOs!
zerohedge@zerohedge

LLM model matrix

English
190
408
4.6K
502.8K
Agntro
Agntro@AgntroAI·
@araseb_ Why do you need home security systems when you have a door lock?
English
0
0
0
13
Sarah
Sarah@araseb_·
You’re in a tech interview and they ask you: “Why should we hire you when Codex can write code?” What’s your answer?
English
1K
11
427
171.4K
Agntro
Agntro@AgntroAI·
@droidbuilds You should loop your subscriptions to buy more subscriptions
English
0
0
0
22
DROID
DROID@droidbuilds·
"mom, how did we get so poor?" "your father had Claude Max, ChatGPT Pro, Cursor Pro and shipped absolutely nothing"
English
295
935
13.8K
696.5K
Agntro
Agntro@AgntroAI·
If you know the exact function you want to fix, pull up to 2 levels of branches from the AST, inline the data models it uses into a single file, and bake the line numbers into comment headers above the extracted functions. Instruct the LLM to only read/edit that file; a tool can swap it back.
English
0
0
0
19
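The extraction step in the post above could look something like this. It is a minimal Python sketch, assuming "2 levels of branches" means two levels of callees in the call graph. `extract_with_callees` and the header format are hypothetical. The sketch only follows direct calls to top-level functions and leaves out data-model inlining, decorators, and the swap-back tool.

```python
import ast


def extract_with_callees(source: str, target: str, depth: int = 2) -> str:
    """Collect `target` plus up to `depth` levels of callees, each under a line-number header."""
    tree = ast.parse(source)
    # Index top-level functions only; methods and nested functions are ignored here.
    funcs = {
        node.name: node
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    if target not in funcs:
        raise KeyError(f"function {target!r} not found")

    selected = [target]
    frontier = [target]
    for _ in range(depth):
        next_frontier = []
        for name in frontier:
            for node in ast.walk(funcs[name]):
                # Follow plain-name calls such as helper(), not obj.method().
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    callee = node.func.id
                    if callee in funcs and callee not in selected:
                        selected.append(callee)
                        next_frontier.append(callee)
        frontier = next_frontier

    lines = source.splitlines()
    chunks = []
    for name in selected:
        node = funcs[name]
        # The header records where the function lives so a tool can map edits back.
        header = f"# --- {name}: original lines {node.lineno}-{node.end_lineno} ---"
        body = "\n".join(lines[node.lineno - 1 : node.end_lineno])
        chunks.append(header + "\n" + body)
    return "\n\n".join(chunks)
```

The returned text is what would be written to the single working file. The headers give a swap-back tool enough information to replace the original line ranges after the LLM edits it.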
Ivan Fioravanti ᯅ
Ivan Fioravanti ᯅ@ivanfioravanti·
In this token economy, I hate how many AI models add extra code (methods, variables, guards) beyond the scope I asked for! 🤬 It wastes tokens on stuff I don’t want and even more tokens to remove it. Pretty sure it’s intentional… No? 🤔
English
11
2
37
3.7K
Agntro
Agntro@AgntroAI·
@JunaidAckroyd At the current level of LLMs, the answer is still yes. One-shotting your app idea, or developing and launching it over a weekend, is great, but you should still spend the time to understand how it works. LLM capabilities still decline as the codebase grows.
English
0
0
2
410
Junaid Ackroyd
Junaid Ackroyd@JunaidAckroyd·
Be honest devs, Is coding still worth learning in the AI era?
English
331
6
472
106.1K
Agntro
Agntro@AgntroAI·
@codevsdev To explain what it did without having read the code... and take the blame if it did poorly.
English
0
0
0
93
Tom ☕
Tom ☕@codevsdev·
if AI writes 80% of your code what skill is actually yours?
English
773
9
234
57.6K
Agntro
Agntro@AgntroAI·
I'm currently exploring the idea that a workflow with a robust set of specialized nodes, each with different agent instructions, could be all you need to solve complex problems even using a Flash model. The open benchmarks for LLMs are a great testing ground for the idea, and I can't yet give an answer, as my work on it is in its early stages. But what I have observed is that full workflow reruns with A/B testing of prompts are really slow, so my latest approach is to use an additional observer LLM that's already aware of the task and the solution and can cut off a node's progress early, once a drift in the wrong direction is detected. It would then fork the node from a checkpoint and iterate on general prompts, trying to steer it in the right direction without providing hints to the real solution. The DeepSWE task set is my first target. I'll share more insights once I test the newest observer flow.
English
0
0
2
33
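The observer flow described above can be sketched as a small control loop. This is a rough Python sketch under stated assumptions, not the author's tool: `run_with_observer`, `step`, `drifted`, and `revise` are hypothetical names. `step` stands in for one unit of a node's progress, `drifted` for the observer LLM that knows the solution, and `revise` for the rewrite of the general prompt that must not leak hints. The checkpoint is simply the transcript saved before the node starts.

```python
from typing import Callable, List, Tuple


def run_with_observer(
    step: Callable[[str, List[str]], str],    # hypothetical: one node step
    drifted: Callable[[List[str]], bool],     # hypothetical: observer verdict
    revise: Callable[[str, List[str]], str],  # hypothetical: new general prompt
    prompt: str,
    max_steps: int = 5,
    max_forks: int = 3,
) -> Tuple[str, List[str], int]:
    """Run a node, cut it off on drift, and retry from the checkpoint with a revised prompt."""
    checkpoint: List[str] = []  # transcript state saved before the node starts
    for forks in range(max_forks + 1):
        transcript = list(checkpoint)  # fork: restart from the saved state
        cut_off = False
        for _ in range(max_steps):
            transcript.append(step(prompt, transcript))
            if drifted(transcript):
                cut_off = True  # stop early instead of finishing a doomed run
                break
        if not cut_off:
            return prompt, transcript, forks
        prompt = revise(prompt, transcript)
    raise RuntimeError("node kept drifting after all forks")
```

The point of the early cut-off is that a bad prompt costs only the steps up to the first detected drift, rather than a full workflow rerun per A/B variant.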
Agntro
Agntro@AgntroAI·
Agntro tweet media
ZXX
0
0
0
20
Agntro
Agntro@AgntroAI·
@CryptoWhales_X Thanks, but my work & product isn't related to crypto or Web3 😅🫡
English
0
0
0
9
Agntro
Agntro@AgntroAI·
Yes, I'm quite actively working on a tool that was meant to cover my needs as a developer and address my frustration with having to use multiple VS Code extensions/CLIs to run MD plan reviews through multiple LLMs for second opinions. It gives you the freedom to arrange workflows and roles, fan out into multiple tasks, orchestrate LLM models from different providers, handle and reuse caches smartly, use git worktrees, snapshot a workflow as flawed -> convert the state to a benchmark set -> run multiple models or different workflows on it to match the right tool to that task, and drop any previous LLM session into insights and pick a model to analyze the performance of that session. Other functionality includes splitting your code into domains through Louvain communities, running summaries/tag attribution on them with flash models, and exposing AST-based tools alongside the common read/write/run_command. It was a deep dive, but it's approaching a state where I'll be seeking beta testers.
English
1
0
4
585
Patrick Collison
Patrick Collison@patrickc·
I want some kind of LLM workflow tool.
• Ability to manage a set of input files (Markdown or similar), plus other general-purpose context.
• With real-time collaboration. (And maybe some concept of snapshots or VCS integration.)
• And the ability to create/manage inference workflows and a stored set of prompts.
• Access to general-purpose coding agents (and not just chat models).
• Some concept of compiled outputs/inference results (which ideally can be shared externally).
Many projects have this feeling: "there is all this stuff, which I want to process/compute over in this iterated way, with some build artifacts being important/worth saving." GNU Autotools x Notion or something. Is anyone building this?
English
440
109
2.5K
556.8K