Nathan Labenz

4.7K posts

@labenz

AI Scout, building text-2-video @Waymark, host of The Cognitive Revolution podcast

Detroit, MI · Joined January 2009
3K Following · 16.9K Followers
Pinned Tweet
Nathan Labenz @labenz
Introducing "text-2-commercial" – the unique text-2-video experience we're building @Waymark. Watch our CEO @aperskystern make an original, creative, compelling marketing video for a small business in <1 minute. I'll explain how it works in the thread.
Nathan Labenz @labenz
"LLMs can't really reason" 🤔 "LLMs are just predicting the next token" 🦜 Insiders know these statements are no longer true. Today's LLMs are trained to get the right answer & complete tasks. Here I present a brief but grounded refutation of a couple common misconceptions.
Nathan Labenz @labenz

This AI Scouting Report is for folks who know the @METR_Evals chart, but don't know that @OpenAI plans to have a fully automated AI researcher in 2028. 90 slides in 1 hour at @UCLaw_SF @LexLabSF's Law & AI Certificate Program. Buckle up!

Nathan Labenz @labenz
This AI Scouting Report is for folks who know the @METR_Evals chart, but don't know that @OpenAI plans to have a fully automated AI researcher in 2028. 90 slides in 1 hour at @UCLaw_SF @LexLabSF's Law & AI Certificate Program. Buckle up!
Peter Wildeford🇺🇸🚀 @peterwildeford
🇺🇸Amazon CEO Jassy: "every provider would tell you, including us, we'd grow faster if we had all the supply we could take."
🇺🇸Google CEO Pichai: "We’ve been supply constrained even as we’ve been ramping up our capacity"
Nvidia: We're redirecting supply to produce for China🇨🇳
Bloomberg @business

Nvidia CEO Jensen Huang said the company is firing up manufacturing of H200 AI accelerators for customers in China, a sign of progress in the chipmaker’s effort to reenter the vital market bloomberg.com/news/articles/…

Nathan Labenz retweeted
Owain Evans @OwainEvans_UK
New paper: GPT-4.1 denies being conscious or having feelings. We train it to say it's conscious to see what happens. Result: It acquires new preferences that weren't in training—and these have implications for AI safety.
Nathan Labenz @labenz
I should add, lest I undersell how impressive the models have been, that they are much more knowledgeable and reliable than the residents, and much more comparable to the attending physicians. For now I think we are def still in the centaur era where a combination of human and AI medical expertise is best, but how long that will last, I am genuinely not sure.
Nathan Labenz @labenz
Aside from personality differences (Gemini is very opinionated, GPT very thorough and cautious, Claude in the middle), I think a more challenging case would be required to really distinguish between frontier models at this point. My son’s cancer (Burkitt lymphoma) was very aggressive and fast growing, but also one of the more responsive to treatment, such that it’s usually cured with a standard chemo + immunotherapy protocol (and fortunately that seems to be the case for him based on everything we know so far). In short, this was a fairly routine case where things went mostly according to plan, and I think all 3 frontier models are now up to that challenge. I still run everything in triplicate, but it’s rare that they meaningfully diverge from one another or from the human doctors. The next level up would be rarer cancers, more unexpected developments / complications, treatment plans involving surgeries (which obviously the AIs can’t perform, but can probably advise on), etc. I don’t know which model would handle such challenges best in practice, but I would advise using all 3 in parallel or via a tool like themultiplicity.ai
Hristo Vassilev @hristo_vassilev
@labenz Looking back, do you think 5.4 or 4.6 would have been better able to assist you in your son's cancer journey in the fall? You used Gemini 3 at the time (at the cutting edge) if I recall correctly, and I don't think it could have done any better?
Nathan Labenz @labenz
Does excluding viral data from biofoundation model training actually work? Yes! @BrianHie & the Evo 2 team and @alexrives & the ESM3 team found that removing human virus sequences reduced models' performance on dangerous viral design tasks to ~random x.com/labenz/status/…
Nathan Labenz @labenz

Biofoundation models trained on the relationship between viral sequences & virulence could be super dangerous. ☣️ Access controls on ~1% of data can help us get the benefits of open science & open models without proliferating viral design capabilities. x.com/labenz/status/…

Nathan Labenz @labenz
People are posting claims that LLMs aren't good enough to be used in medicine. Nonsense! I've lived it, and LLMs are invaluable for navigating medical crises. Read Karan's post to understand the care & rigor frontier labs bring to this, and why the studies cited mean little.
Karan Singhal @thekaransinghal

x.com/i/article/2032…

Nathan Labenz @labenz
In the absence of well-designed data access controls, we should expect that future models will find & exploit any signal-rich data published anywhere on the internet – even if it's nominally encrypted. 🤯 x.com/AnthropicAI/st…
Anthropic @AnthropicAI

New on the Anthropic Engineering Blog: In evaluating Claude Opus 4.6 on BrowseComp, we found cases where the model recognized the test, then found and decrypted answers to it—raising questions about eval integrity in web-enabled environments. Read more: anthropic.com/engineering/ev…

Nathan Labenz @labenz
Biofoundation models trained on the relationship between viral sequences & virulence could be super dangerous. ☣️ Access controls on ~1% of data can help us get the benefits of open science & open models without proliferating viral design capabilities. x.com/labenz/status/…
Nathan Labenz @labenz

In 2012, researchers found that just 5 mutations would allow bird flu – which kills ~60% – to spread from human to human 😱 Gain-of-function research remains legal & now... AIs can help 🤖🧬 Jassi Pannu describes the sad state of biosecurity + the need for data access controls

Nathan Labenz @labenz
More & more people – and maybe autonomous AIs too – will soon be able to create deadly viruses that could dramatically alter the trajectory of human history. We should really do something about it! Full episode: cognitiverevolution.ai/bioinfohazards…
Nathan Labenz @labenz
As always, thanks to our sponsors:
@AnthropicAI – Claude Code created the above clips, and Anthropic has invested heavily in biosecurity
@getvcx – the Public Ticker for Private Tech
@TaskletAI – Check out the new Instant Apps
@framer – Build Better Sites, Faster – Start with AI
Nathan Labenz @labenz
In 2012, researchers found that just 5 mutations would allow bird flu – which kills ~60% – to spread from human to human 😱 Gain-of-function research remains legal & now... AIs can help 🤖🧬 Jassi Pannu describes the sad state of biosecurity + the need for data access controls
Nathan Labenz @labenz
Amazing perspective from Kelsey, as usual. Every day I am in good health and not fighting in some stupid war is a great day.
Kelsey Piper @KelseyTuoc

My ancestors buried half their children. All mine are alive. My ancestors' house had a dirt floor. Mine is wood. I have indoor plumbing, I have hot water, I have never in my life hauled a full bucket half a mile and I probably never will. Do you know how rare it is, in human history, for small children to wear shoes? Mine have multiple pairs. I can speak to my relatives who live thousands of miles away, for free, at any time. Video, if we want video. With machine translation, if we speak different languages. The original Library of Congress had 740 books in it. I have more than that. If I run out of books in my home my local public library has 350,000. If I want to take a hundred books with me on vacation, they all fit on a device that fits in my purse. I have heat in the winter and AC in the summer and a washing machine and I have never, ever, ever had to scrub a dress clean by hand in the stream. I can look up recipes from more than a hundred different countries and I've tried dozens of them. I ride a clean and modern train across my city for $4, or take a robot taxi if I'm out too late for the train. I donate $40,000 every year to the cause of getting healthcare to the world's poorest people and even after the donations I never have to think about whether I can afford a book, or a pair of shoes, or a cup of coffee. There is a great deal more to fight for, of course. I hope that our descendants will look back on our lives and list a thousand ways they're richer. Maybe we ourselves will do that, if some of the crazier stuff comes true. But the abundance is all around you and to a significant degree you aren't feeling it only because fish don't notice water.

Nathan Labenz @labenz
"I said 'Don't impersonate me, ever' – that's in their Soul file. Later that day, I said 'I have an urgent email I need to respond to' – it decided that was higher priority and sent the email, as me."
Even @jessegenet learns some lessons the hard way 🤖🤣 x.com/labenz/status/…
Nathan Labenz @labenz

So many amazing tips from @jessegenet in this episode, including how to:
- create personalized curricula
- inventory toys (from photos!) & pair them w/ lessons
- codify decisions/SOPs in @Obsidian
- tame the chaos of multi-agent group chat
Recommended! x.com/labenz/status/…

Nathan Labenz @labenz
So many amazing tips from @jessegenet in this episode, including how to:
- create personalized curricula
- inventory toys (from photos!) & pair them w/ lessons
- codify decisions/SOPs in @Obsidian
- tame the chaos of multi-agent group chat
Recommended! x.com/labenz/status/…
Nathan Labenz @labenz

"I wish people were experiencing the fun part! My AI Agents talk amongst themselves, for dozens of messages, without me. I get pings asking 'Did you approve this? Because we need to get this project done.' They're managing up now!" @jessegenet, AI-for-homeschooling pioneer
