Ben Dickson

10.1K posts

Ben Dickson
@bendee983

Software Engineer | Tech analyst | Thinker | Student of life | Founder of @bdtechtalks

In a private namespace · Joined August 2015
655 Following · 4.8K Followers
Ben Dickson @bendee983
@emollick It's kind of like a robotic model that has been trained for a certain morphology. It works very well with that specific anatomy. When you move it to a new body, the more it deviates from its original form, the less accurate it becomes.
Ethan Mollick @emollick
Increasingly, I think, we will see a gap between what you can do with frontier model APIs & what you can do with the native apps from the frontier labs (Codex, Claude Code). Models developed and trained with their native harnesses in mind have more capabilities in their harnesses
Ben Dickson @bendee983
The amount of knowledge Gemini has is astounding (even 2.5 Pro). I often use it for cleaning up speech text transcribed by AI. Every time I'm surprised by the kind of niche knowledge it has, such as scientific papers presented at a conference in some distant year. And it uses that knowledge to infer the context of what the conversation was about and how it can correct mistranscribed words.
Ben Dickson @bendee983
@MatthewBerman Which is why the open-source Gemma-class models make sense (and they're really good models). But for large models, I don't think Google has Nvidia-class distribution for them.
Matthew Berman @MatthewBerman
@bendee983 Google benefits from open source if the industry builds on their standard. It's literally the same playbook as Android.
Ben Dickson @bendee983
You either have to sell the model or the hardware that runs the model. Long-term, open source only makes sense if you benefit from its propagation.

Google benefits from Gemma models being run on Android devices (indirectly benefiting from the sale of Android hardware), so it makes sense to open source those. But Google is not directly selling the hardware that runs Gemini models (TPUs), so it makes sense to keep those closed.

Nvidia, on the other hand, will profit immensely from the propagation of huge open source models. I think Nvidia is the only company that can be the champion of very large U.S.-based open models.
Matthew Berman@MatthewBerman

Demis says he wants to see a Western open source AI stack and that we’re losing to China. He also says Google doesn’t have enough compute to build two frontier (open and closed) models, which is why Gemma is a smaller family of models. Watch this incredible clip. Shout out @ycombinator and @garrytan for the fantastic interview.

Ben Dickson @bendee983
How to keep up with AI advances:
1- Be unemployed
2- Have no life

Alternatively:
1- Pick one very niche area
2- Allocate around 20% of your time to studying new research and experimenting
Ben Dickson @bendee983
@aaditsh That's funny. In my experience, Gemini is actually the best one, both for transcription and for cleaning up. How do you use the models for transcription?
Aadit Sheth @aaditsh
I LOVE voice transcription. Probably transcribed between 500K and 1M words in the last year. ChatGPT is the best at this and it's not close. Almost no errors. And it's been this good for almost a year now which is wild. Grok is surprisingly good too. Claude is unreliable and the UX isn't great. Gemini is basically unusable. Embarrassingly bad. I use all four of these every day. ChatGPT just figured out voice transcription before everyone else and nobody has caught up.
Ben Dickson @bendee983
Devs: are you switching from Claude Code to Codex because it's better or more stable? (Or is this trend not even real?)
Ben Dickson @bendee983
Training reasoning agents has long been stuck between two tricky options:
1- Reinforcement learning with verifiable rewards (RLVR), which is low-cost but tricky due to sparse rewards.
2- On-policy distillation (OPD) from larger models, which provides granular feedback on responses but is costly because it requires a large teacher.

A third option is on-policy self-distillation (OPSD), which avoids the costs of OPD but results in low-quality training due to information leakage from the teacher to the student.

RLVR with Self-Distillation (RLSD), a new technique by researchers at JD.com, addresses this problem by making small changes to the self-distillation process. It uses the sparse signal from the verifiable reward to determine the direction of the update (i.e., whether to reinforce or penalize a behavior), and it uses the signal from the self-distillation to determine the magnitude of the update (i.e., how much relative credit or blame a specific step deserves).

The result: RLSD at 200 training steps already beats RLVR with GRPO trained for 400 steps, while avoiding the costs of OPD and the poor training quality of OPSD.
VentureBeat@VentureBeat

How to build custom reasoning agents with a fraction of the compute venturebeat.com/ai/how-to-buil…

Ben Dickson @bendee983
Anthropic's loss is OpenAI's gain... for now. But don't be fooled. This is not a sustainable process. Eventually, they will face the same problem as Anthropic, especially if their models are as large as empirical research shows (GPT-5.5 being ~9.7T params). Prepare yourself for token scarcity.
Tibo@thsottiaux

Don't just reset Codex rate limits for fun, it costs money. ... but the vibes are good ... I have reset Codex rate limits for ALL paid plans to celebrate a good week and allow everyone to build more with GPT-5.5. Enjoy

Ben Dickson @bendee983
Exactly. We haven't even scratched the surface of the space of possible and useful software. And if the cost of building applications drops, demand will only grow. And note that everyone can code now, but the people who can ship production-level code are software engineers. Some companies have realized this already. Some will do so soon.
John Crickett @johncrickett
AI is going to create more demand for software engineers. AWS CEO says they're hiring as many as ever. It makes sense. There's so much software that could be written, apps that could be improved, new games that could be created, processes that could be automated. If creating software becomes cheaper/quicker, demand goes up. Jevons Paradox in action.
Shay Boloor@StockSavvyShay

$AMZN AWS CEO pushed back on the idea that AI is killing software jobs by saying Amazon is hiring as many developers as ever. He said AI agents are “exploding” across every industry & moving faster than expected changing the developer job rather than eliminating it.

Ben Dickson @bendee983
Previously, anyone who could fire up an IDE and write code called themselves a software developer/engineer. And for someone looking from the outside, it was difficult to tell the difference between a coder and an engineer. In reality, there is a lot more to building software than just writing code. And with LLMs writing code, all those non-coding disciplines are becoming much more important.
Fernando@Franc0Fernand0

When people claim that LLMs will replace software engineers, it indicates a lack of understanding of either LLMs or software engineering. But if your only definition of software engineering is feature development, it's possible to believe that LLMs can replace developers.

Ben Dickson @bendee983
The growing costs of closed frontier models and unpredictable outages are creating opportunities for a new segment of the market. Local AI models in IDEs, in particular, will be an interesting space to watch. The use case is specialized enough to require less parametric knowledge (i.e., general knowledge of the world), making it a good fit for small language models.

One of the challenges, however, is making this work on the wide range of devices that constitute the install base (different processors, memory capacity, etc.). A possible solution is to provide a range of options, from self-hosted to low-cost cloud-hosted (e.g., by JetBrains) to frontier models (e.g., Claude Opus 4.7).

It will be very interesting to see how this plays out. But the end of AI subsidies is creating new market dynamics.
Kirill Skrygan@kskrygan

Would you be interested if JetBrains releases a totally local AI agent, working 100% on your laptop, using our code insight engine and deeply integrated into the IDE? Yes, it will be probably 1 month behind the very recent frontier models, but no token blood bath anymore WDYT?

Ben Dickson @bendee983
Be careful what kind of information you give Claude Code. Your API keys and other sensitive information might end up in your codebase (e.g., when you choose “allow always” with sensitive data in a CLI command) and be shipped to a repository. And no, the normal safeguards don't detect it.
TechTalks@bdtechtalks

A new study reveals how AI coding assistants like Claude Code are quietly hoarding and publishing sensitive API keys to code repositories. bdtechtalks.com/2026/04/27/cla…

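As a stopgap against leaks like the ones described above, even a minimal pre-commit scan can catch the most obvious credentials before they reach a repository. This is a hypothetical sketch, not the study's tooling; the patterns are illustrative and far from exhaustive (dedicated scanners such as gitleaks or trufflehog cover many more formats):

```python
import re

# Illustrative secret patterns only; a real scanner needs a much larger rule set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                             # OpenAI-style keys
    re.compile(r"AKIA[0-9A-Z]{16}"),                                # AWS access key IDs
    re.compile(r"(?i)api[_-]?key\s*[:=]\s*['\"][^'\"]{12,}['\"]"),  # generic api_key = "..."
]

def find_secrets(text):
    """Return the patterns that match anywhere in `text` (empty list = clean)."""
    return [p.pattern for p in SECRET_PATTERNS if p.search(text)]
```

Wired into a pre-commit hook over staged files, a non-empty result would block the commit. It won't catch everything, which is the tweet's point: the agent can write a key into the codebase in a form no simple safeguard anticipates.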
Uncle Milty’s Ghost @his_eminence_j
Very soon, “no AI was used” will be a premium service category in nearly every industry. Mark this.
Ben Dickson @bendee983
Ironically, the company that is supposed to drive software engineers out of work can't keep its own software running.