Megan Ben Dor Ruthven

19.9K posts

@_mbdr_

Just do things with AI. Senior Software Engineer @Google's Threat Analysis Group, previously @GoogleAI, #AndroidSecurity. Expressing own opinions. she/her/y'all

Switzerland · Joined November 2010
860 Following · 7.3K Followers
Megan Ben Dor Ruthven retweeted
Boyan Slat (@BoyanSlat)
Three years ago I suddenly developed blurry vision in one of my eyes. Went to the GP. Eye drops. No effect. Went to a specialized eye hospital. No solution there either. A few weeks ago I tried something new: I asked an LLM. It analyzed my diet and suggested I might be omega-3 deficient. It also pointed me to studies showing that this can impair the meibomian glands (which produce the oily layer that smooths the surface of the cornea.) I started taking algae oil supplements. Two weeks later… my vision is perfectly restored! Honestly, I got a bit emotional. Banning AI from being used for medical questions is a terrible idea.
Quoted post from More Perfect Union (@MorePerfectUS):

A New York bill would ban AI from answering questions related to several licensed professions like medicine, law, dentistry, nursing, psychology, social work, engineering, and more. The companies would be liable if the chatbots give “substantive responses” in these areas.

414 replies · 1.3K reposts · 17.2K likes · 957.9K views
Megan Ben Dor Ruthven retweeted
Logan Kilpatrick (@OfficialLoganK)
Say hello to Nano Banana 2, our best image generation and editing model! 🍌 You can access Nano Banana 2 through AI Studio and the Gemini API under the name Gemini 3.1 Flash Image. We are also introducing new resolutions (lower cost) and tools like Image Search!
265 replies · 222 reposts · 3.5K likes · 574.7K views
Megan Ben Dor Ruthven retweeted
Demis Hassabis (@demishassabis)
Excited to launch Gemini 3.1 Pro! Major improvements across the board including in core reasoning and problem solving. For example scoring 77.1% on the ARC-AGI-2 benchmark - more than 2x the performance of 3 Pro. Rolling out today in @GeminiApp, @antigravity and more - enjoy!
246 replies · 420 reposts · 5K likes · 245K views
Megan Ben Dor Ruthven retweeted
François Chollet (@fchollet)
The new Gemini Deep Think is achieving some truly incredible numbers on ARC-AGI-2. We certified these scores in the past few days.
87 replies · 198 reposts · 2.2K likes · 210.6K views
Megan Ben Dor Ruthven retweeted
Logan Kilpatrick (@OfficialLoganK)
Gemini now processes over 10 billion tokens per minute via direct API use by our customers and the Gemini App just crossed 750M monthly active users : )
229 replies · 151 reposts · 3.4K likes · 293.4K views
Megan Ben Dor Ruthven retweeted
Andrej Karpathy (@karpathy)
A few random notes from claude coding quite a bit last few weeks.

Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent.

IDEs/agent swarms/fallability. Both the "no need for IDE anymore" hype and the "agent swarm" hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE.md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits.

Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.

Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's speedup, but it's possibly a lot more an expansion.

Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage.

Fun. I didn't anticipate that with agents programming feels *more* fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.

Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.

Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements.

Questions. A few of the questions on my mind:
- What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*.
- Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).
- What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music?
- How much of society is bottlenecked by digital knowledge work?

TLDR Where does this leave us? LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.
1.6K replies · 5.4K reposts · 39.4K likes · 7.6M views
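The "Leverage" note in the Karpathy post above (write the naive, very likely correct algorithm first, then optimize it while preserving correctness) can be illustrated with a small differential test. This is only a minimal sketch under stated assumptions: the maximum-subarray-sum problem is picked purely for illustration, and the names `max_subarray_naive`, `max_subarray_fast`, and `check_equivalence` are hypothetical and do not come from the post.

```python
import random


def max_subarray_naive(xs):
    """O(n^2) reference: try every non-empty contiguous slice."""
    best = xs[0]
    for i in range(len(xs)):
        total = 0
        for j in range(i, len(xs)):
            total += xs[j]
            best = max(best, total)
    return best


def max_subarray_fast(xs):
    """O(n) candidate (Kadane's algorithm), to be checked against the reference."""
    best = current = xs[0]
    for x in xs[1:]:
        # Either extend the running slice or start a new one at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


def check_equivalence(trials=1000, seed=0):
    """Success criterion: both versions agree on many random non-empty inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(1, 20))]
        assert max_subarray_fast(xs) == max_subarray_naive(xs), xs
    return True
```

Stated this way, the goal handed to an agent could be a criterion ("make `check_equivalence()` pass while staying linear time") instead of a list of steps, which is roughly the imperative-to-declarative shift the post describes.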
Megan Ben Dor Ruthven retweeted
MetaCritic Capital (@MetacriticCap)
My two cents about the SaaS debate Claude Code made me subscribe to three triple-digits per year SaaS services at work. ClawdBot in one day already has me considering subscribing to 1Password and Notion.
12 replies · 8 reposts · 285 likes · 57.7K views
Megan Ben Dor Ruthven retweeted
Guillermo Rauch (@rauchg)
Maybe AI was just some elaborate ploy to get engineers to finally write tests and documentation
183 replies · 225 reposts · 3.6K likes · 123.3K views
Megan Ben Dor Ruthven retweeted
Google (@Google)
We're partnering with @KhanAcademy to bring a suite of Gemini-powered learning and literacy tools to students, starting with the Writing Coach tool. Writing Coach doesn’t generate answers or deliver a finished product — it walks students through the process of outlining, drafting and refining their own ideas. #BettUK2026
155 replies · 1.1K reposts · 9.6K likes · 656.4K views
Megan Ben Dor Ruthven retweeted
Sundar Pichai (@sundarpichai)
Helpful update for students, you can now take full practice SATs for free in the @GeminiApp. It uses vetted content from @ThePrincetonRev and gives you feedback straight away. Starting with the SAT today, but more tests are on the way!
526 replies · 1.6K reposts · 15.2K likes · 1.5M views
Megan Ben Dor Ruthven retweeted
Gergely Orosz (@GergelyOrosz)
Asked a friend heading up an accounting team her take on AI and the impact on its work: "I noticed it helps automate repetitive stuff. When I have a process with lots of manual steps, I ask if it can create a workflow using Apps Script. And it works: managed to compress work that would take 4-6 hours to ~20 minutes with it. I don't trust AI - we work with numbers, that need to be always correct. But I am now telling my team to try to see if they can remove more repetition with it, and asking it to create software for those repetitive tasks." Fascinating tbh
49 replies · 18 reposts · 392 likes · 43.5K views
Megan Ben Dor Ruthven retweeted
Logan Kilpatrick (@OfficialLoganK)
PSA: You can vibe code with Gemini 3 Flash and Gemini 3 Pro for free in Google AI Studio app.new
178 replies · 160 reposts · 3.1K likes · 264.2K views
Megan Ben Dor Ruthven retweeted
Ethan Mollick (@emollick)
Teaching an experimental class for MBAs on “vibefounding,” the students have four days to come up and launch a company. More on this eventually, but quick observations:
1) I have taught entrepreneurship for over a decade. Everything they are doing in four days would have taken a semester in previous years, if it could have done it at all. Quality is also far better.
2) Give people tools and training and they can do amazing things. We are using a combination of Claude Code, Gemini, and ChatGPT. The non-coders are all building working products. But also everyone is doing weeks of high quality work on financials, research, pricing, positioning, marketing in hours. All the tools are weird to use, even with some training, but they are figuring it out.
3) People with experience in an industry or skill have a huge advantage as they can build solutions that have built-in markets & which solve known hard problems that seemed impossible. (Always been true, but the barriers have fallen to actually doing stuff)
4) The hardest thing to get across is that AI doesn’t just do work for you, it also does new kinds of work. The most successful efforts often take advantage of the fact that the AI itself is very smart. How do you bring its analytical, creative, and empathetic abilities to bear on a problem? What do you do with access to a very smart intelligence on demand?
I wish I had more frameworks to clearly teach. So many assumptions about how to launch a business have clearly changed. You don’t need to go through the same discovery process if you build a dozen ideas at the same time & get AI feedback. Many, many new possibilities, and the students really see how big a deal this is.
80 replies · 183 reposts · 1.8K likes · 125.6K views
Megan Ben Dor Ruthven retweeted
Mark Dalgleish (@markdalgleish)
Absolutely mind boggling to stop and realise: 1) I barely code anymore, just chat with an AI. 2) It’s not slop. I’m still engineering. I don’t feel threatened. Feels like pair programming. 3) I’m enjoying it more than coding by hand. Truly wild time to be living through.
254 replies · 236 reposts · 3.7K likes · 163.4K views
Megan Ben Dor Ruthven retweeted
Sundar Pichai (@sundarpichai)
MedGemma 1.5 is a major upgrade to our open models for healthcare developers. The new 4B model enables developers to build applications that natively interpret full 3D scans (CTs, MRIs) with high efficiency - a first, we believe, for an open medical generalist model. MedGemma 1.5 also pairs well with MedASR, our speech-to-text model fine-tuned for highly accurate medical dictation. Developers can now use these multimodal capabilities to build medical apps that reach patients in more places.
180 replies · 698 reposts · 6K likes · 393.6K views
Megan Ben Dor Ruthven retweeted
Ethan Mollick (@emollick)
Had Claude Code build a little plugin that visualizes the work Claude Code is doing as agents working in an office, with agents doing work and passing information to each other. New subagents are hired, they acquire skills, and they turn in completed work. Fun start.
289 replies · 376 reposts · 6.5K likes · 464.9K views
Megan Ben Dor Ruthven retweeted
Guillermo Rauch (@rauchg)
10 days into 2026:
- Terence Tao announces GPT & Aristotle solve Erdős problem autonomously
- Linus Torvalds concedes vibe coding is better than hand-coding for his non-kernel project
- DHH walks back “AI can’t code” from Lex podcast 6 months later
An acceleration is coming the likes of which humanity has never experienced before
185 replies · 715 reposts · 7.6K likes · 765.8K views
Megan Ben Dor Ruthven retweeted
News from Google (@NewsFromGoogle)
Joint Statement: Apple and Google have entered into a multi-year collaboration under which the next generation of Apple Foundation Models will be based on Google's Gemini models and cloud technology. These models will help power future Apple Intelligence features, including a more personalized Siri coming this year. After careful evaluation, Apple determined that Google's AI technology provides the most capable foundation for Apple Foundation Models and is excited about the innovative new experiences it will unlock for Apple users. Apple Intelligence will continue to run on Apple devices and Private Cloud Compute, while maintaining Apple's industry-leading privacy standards.
1.6K replies · 6.5K reposts · 52.4K likes · 11M views
Megan Ben Dor Ruthven retweeted
tobi lutke (@tobi)
Shopify is building the foundation for agentic commerce. Universal Commerce Protocol, which we co-developed with Google, is now live. UCP will make it faster for agents and retailers to integrate. It’s open by default, so platforms and agents can use UCP to start transacting with any merchant. Major retailers are already using it. Agents can handle everything from discovery to fulfillment, and support things like discounts, subscriptions, and loyalty programs. We’ve accounted for all types of commerce.
223 replies · 456 reposts · 4K likes · 885.7K views
Megan Ben Dor Ruthven retweeted
James Zou (@james_y_zou)
Today in @NatureMedicine we report that AI can predict 130 diseases from 1 night of sleep🛌 We trained a foundation model (#SleepFM) on 585K hours of sleep recordings from 65K people—brain, heart, muscle & breathing signals combined. AI learns the language of sleep🧵
277 replies · 2.1K reposts · 11K likes · 909.5K views