Evelyn

15.1K posts

@tummycom

Open Source, Mountain Time. Give me or the universe anonymous feedback: https://t.co/ZaUAOdtsG9

Fort Collins, Colorado · Joined November 2009
6.5K Following · 874 Followers
Pinned Tweet
Evelyn @tummycom
So long and thanks for all the fish.
7 replies · 0 reposts · 21 likes · 1.8K views
Evelyn retweeted
Physical Intelligence @physical_int
We developed an RL method for fine-tuning our models for precise tasks in just a few hours or even minutes. Instead of training the whole model, we add an “RL token” output to π-0.6, our latest model, which is used by a tiny actor and critic to learn quickly with RL.
32 replies · 275 reposts · 2.1K likes · 347.5K views
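The RL-token recipe described above (a frozen base model feeding a tiny trainable actor and critic) can be sketched in miniature. Everything here is illustrative: the feature extractor, the toy task, and all dimensions are invented stand-ins, not π-0.6's actual interface.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for the frozen base model: it maps an observation to a
# small "RL token" feature vector (plus a constant bias feature). Only the tiny
# actor/critic heads below are ever updated.
FEAT_DIM, N_ACTIONS = 8, 4
W_frozen = rng.normal(size=(FEAT_DIM, FEAT_DIM))

def rl_token(obs):
    return np.append(np.tanh(W_frozen @ obs), 1.0)  # frozen features + bias

W_actor = np.zeros((N_ACTIONS, FEAT_DIM + 1))  # tiny policy head
w_critic = np.zeros(FEAT_DIM + 1)              # tiny value head

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.05
for step in range(2000):
    feat = rl_token(rng.normal(size=FEAT_DIM))
    probs = softmax(W_actor @ feat)
    a = rng.choice(N_ACTIONS, p=probs)
    reward = 1.0 if a == 2 else 0.0        # toy task: action 2 is "precise"
    advantage = reward - w_critic @ feat   # critic supplies the baseline
    grad_logp = -probs
    grad_logp[a] += 1.0
    W_actor += lr * advantage * np.outer(grad_logp, feat)  # policy gradient
    w_critic += lr * advantage * feat                      # value regression

# The learned policy now prefers the rewarded action.
print(int(softmax(W_actor @ rl_token(np.zeros(FEAT_DIM))).argmax()))  # → 2
```

Because only the two small heads receive gradients, each update touches a few hundred parameters rather than the whole model, which is what makes fine-tuning in hours or minutes plausible.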
Evelyn retweeted
Sergey Levine @svlevine
A while ago we figured out that structure enables data-driven design: if we have data of designs + rewards, we can find a design with *higher* reward if we learn a structured function: arxiv.org/abs/2401.05442 In our latest work, @kuba_AI developed a practical model based on this idea that designs materials to optimize target properties.
Kuba@kuba_AI

AI can optimize materials 🤘 Our (@pabbeel, @svlevine, @AIatMeta) proposed transformer model CliqueFlowmer, combined with evolution strategies, discovers materials that optimize target properties. arxiv.org/abs/2603.06082

11 replies · 26 reposts · 235 likes · 35.7K views
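The recipe in the quoted thread, a learned scoring model searched by evolution strategies, reduces in caricature to the loop below. The quadratic "property predictor" and every constant are invented stand-ins for illustration, not CliqueFlowmer itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative "learned property predictor": in the real pipeline this role is
# played by a model trained on designs + measured properties; here it is a toy
# quadratic whose optimum we know in advance.
target = np.array([0.5, -1.0, 2.0])

def predicted_property(design):
    return -np.sum((design - target) ** 2)  # higher is better

# Plain (mu, lambda) evolution strategy over candidate designs.
dim, pop, n_elite, sigma = 3, 64, 8, 0.3
mean = np.zeros(dim)
for gen in range(100):
    cands = mean + sigma * rng.normal(size=(pop, dim))
    scores = np.array([predicted_property(c) for c in cands])
    elites = cands[np.argsort(scores)[-n_elite:]]
    mean = elites.mean(axis=0)  # move the search toward high-scoring designs
    sigma *= 0.97               # gradually narrow the search

print(np.round(mean, 1))  # ≈ target
```

The point of the structured model in the paper is that maximizing a *learned* score this way finds designs with genuinely higher reward, rather than adversarial inputs that only fool the predictor; this sketch shows only the search loop.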
Evelyn retweeted
Garry Tan @garrytan
I underestimated how powerful Opus 4.6 with 1M tokens is. Even last year we were absolutely hitting context limit problems constantly. 1M tokens means you can do much more complex analysis entirely in context. Claude Code is so much better. This is the worst it will ever be.
192 replies · 103 reposts · 2.6K likes · 136.6K views
Evelyn retweeted
shani 🌱 (sf) @sha_zng
had a therapy consult with someone who was telling me that their technique was all about helping people expand awareness around their emotions. I asked them what they meant and they said that if you have a thimble full of water, and you put a drop of ink in it, the water will turn black. But if you have a big bowl of water and put a drop of ink, the ink will wiggle around a little, and then disappear. The ink represents the emotional experience you are having, and the vessel for water is the size of your awareness. By expanding awareness while you’re moving through the emotion, there is greater capacity to absorb it without having it take over your experience. I have been thinking about this ever since
[image attached]
13 replies · 336 reposts · 3.8K likes · 130.5K views
Tyler is finishing a book, slow to reply
Plot Twist is a game where a friend or stranger has up to 30min to change your life. eg:
• Intro u to ur future spouse
• Break ur worldview
• Get u to quit ur job
• Recommend a life-changing therapy
My friend & I played this once w a sad lawyer in a bar & it was nuts
31 replies · 101 reposts · 5.7K likes · 311K views
Evelyn retweeted
Julian Salazar @JulianSlzr
We're studying cubic surfaces at @GoogleDeepMind! Our first paper, among other things, resolves a 54-year-old arithmetic geometry question of Manin's; one attempted by Swinnerton-Dyer and the first author through the decades. A 🧵 about X³ + Y³ + Z³ + ζ₃T³ = 0 and AI for math:
[image attached]
6 replies · 44 reposts · 298 likes · 21.6K views
Evelyn retweeted
Vivid Void @vividvoid
For all of you in a Claude Code dopamine frenzy, here's an old artist's trick for hypomania: in the brief windows when the AI is working but you don't have enough time to start a whole new task, do yoga. Align breath, body and mind. The groundedness will greatly aid your work!
14 replies · 25 reposts · 592 likes · 17.1K views
Evelyn retweeted
Ava @noampomsky
friend is in the stage of claude psychosis where he asks claude to send him newspapers about what claude is doing for him
[image attached]
254 replies · 442 reposts · 8.6K likes · 387.6K views
Evelyn retweeted
Tinker @tinkerapi
Mantic used Tinker to RL gpt-oss-120b on judgmental forecasting; the result outperformed frontier models on event predictions. Combined with @_Mantic_AI's forecasting architecture, task-specific training takes us to the cusp of automated superforecasting.
2 replies · 17 reposts · 177 likes · 82.6K views
Evelyn retweeted
Toby Shevlane @tshevl
I always dreamed of AGI as a wise advisor for humanity. Although LLMs are great for coding & knowledge work, I wouldn’t trust them to give me advice on my career, business strategy, or policy preferences. How can we build AI systems optimized for wisdom?

At Mantic we believe the unlock is prediction: predicting world events as accurately as possible, and hill-climbing this single metric. Today we share some recent progress on the Thinking Machines website, having found Tinker a great platform for our RL experiments.

TL;DR: We RL-tune gpt-oss-120b to become a better forecaster than any other model. Having good scaffolding is a prerequisite. A fun result: our tuned model + Grok are decorrelated from the other best models, and so are the most indispensable when picking a team.
Tinker@tinkerapi

Mantic used Tinker to RL gpt-oss-120b on judgmental forecasting; the result outperformed frontier models on event predictions. Combined with @_Mantic_AI's forecasting architecture, task-specific training takes us to the cusp of automated superforecasting.

21 replies · 30 reposts · 300 likes · 119.2K views
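A hedged guess at the "single metric" being hill-climbed: accuracy on resolved events is commonly measured with the Brier score, the mean squared error between forecast probabilities and binary outcomes. The models and numbers below are made up for illustration.

```python
# Brier score: lower is better; 0.25 is what constant 50/50 hedging earns.
def brier(forecasts, outcomes):
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

outcomes = [1, 0, 1, 1, 0]            # 1 = the event resolved "yes"
model_a  = [0.9, 0.2, 0.7, 0.6, 0.1]  # confident, well-calibrated forecasts
model_b  = [0.6, 0.5, 0.5, 0.5, 0.4]  # hedging near 50% on everything

print(round(brier(model_a, outcomes), 3))  # 0.062
print(round(brier(model_b, outcomes), 3))  # 0.214
```

A scalar score like this is what makes "hill-climbing a single metric" with RL well-defined: each resolved event yields a reward the tuned model can be optimized against.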
Evelyn retweeted
Garry Tan @garrytan
For agentic systems founders and dev tools founders: People do not want to pay for raw markdown and they shouldn't have to. But they may pay for orchestration, hosting, updates, collaboration, portability, analytics, and managed execution. These can be great businesses.
284 replies · 137 reposts · 2.5K likes · 167.2K views
Evelyn retweeted
Peter Holderrieth @peholderrieth
We are also releasing self-contained lecture notes that explain flow matching and diffusion models from scratch. This goes from "zero" to the state-of-the-art in modern Generative AI. 📖 Read the notes here: arxiv.org/abs/2506.02070 Joint work with @EErives40101.
Peter Holderrieth@peholderrieth

🚀 MIT Flow Matching and Diffusion Lecture 2026 released (diffusion.csail.mit.edu)! We just released our new MIT 2026 course on flow matching and diffusion models! We teach the full stack of modern AI image, video, and protein generators - theory and practice. We include:
📺 Videos: step-by-step derivations.
📝 Notes: mathematically self-contained lecture notes.
💻 Coding: hands-on exercises for every component.
We fully reworked last year's iteration and added new topics: latent spaces, diffusion transformers, and building language models with discrete diffusion models. Everything is available here: diffusion.csail.mit.edu
A huge thanks to Tommi Jaakkola for his support in making this class possible and to Ashay Athalye (MIT SOUL) for the incredible production! Was fun to do this with @RShprints! #MachineLearning #GenerativeAI #MIT #DiffusionModels #AI

37 replies · 631 reposts · 5.4K likes · 417.4K views
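As a taste of the material's starting point, the conditional flow matching objective fits in a few lines. This 1-D toy (point-mass "data" at 3.0, linear interpolation path) is a sketch of the standard construction, not the course's code.

```python
import numpy as np

# Conditional flow matching in one dimension: sample noise x0 and data x1,
# interpolate x_t = (1 - t) x0 + t x1, and regress a velocity field v(x, t)
# onto the target velocity x1 - x0.
def cfm_loss(v, x0, x1, t):
    xt = (1 - t) * x0 + t * x1
    return (v(xt, t) - (x1 - x0)) ** 2

# Toy case where the optimal field is known in closed form: if the data
# distribution is a point mass at 3, the marginal velocity is (3 - x)/(1 - t).
v_star = lambda x, t: (3.0 - x) / (1.0 - t)

rng = np.random.default_rng(0)
assert all(cfm_loss(v_star, rng.normal(), 3.0, rng.uniform()) < 1e-18
           for _ in range(10))  # the optimal field achieves (near-)zero loss

# Sampling = integrating dx/dt = v(x, t) from noise toward the data
# with 100 Euler steps.
x, h = rng.normal(), 0.01
for k in range(100):
    x += h * v_star(x, k * h)
print(round(x, 6))  # → 3.0
```

In practice v is a neural network and x lives in image, video, or protein space; the notes build exactly this objective up from the continuity equation before moving to the general case.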
Evelyn retweeted
Patrick McKenzie @patio11
Doing the reading is a superpower, and it's even better in a world where "no one" is doing the reading. (Inspired by a conversation I had with some college students.)
50 replies · 223 reposts · 2.4K likes · 108.8K views
Evelyn retweeted
Dwarkesh Patel @dwarkesh_sp
The Terence Tao episode.

We begin with the absolutely ingenious and surprising way in which Kepler discovered the laws of planetary motion. People sometimes say that AI will make especially fast progress at scientific discovery because of tight verification loops. But the story of how we discovered the shape of our solar system shows how the verification loop for correct ideas can be decades (or even millennia) long. During this time, what we know today as the better theory can often make worse predictions (Copernicus's model of circular orbits around the sun was actually less accurate than Ptolemy's geocentric model). And the reason it survives this epistemic hell is some mixture of judgment and heuristics that we don’t even understand well enough to articulate, much less codify into an RL loop. Hope you enjoy!

0:00:00 – Kepler was a high temperature LLM
0:11:44 – How would we know if there’s a new unifying concept within heaps of AI slop?
0:26:10 – The deductive overhang
0:30:31 – Selection bias in reported AI discoveries
0:46:43 – AI makes papers richer and broader, but not deeper
0:53:00 – If AI solves a problem, can humans get understanding out of it?
0:59:20 – We need a semi-formal language for the way that scientists actually talk to each other
1:09:48 – How Terry uses his time
1:17:05 – Human-AI hybrids will dominate math for a lot longer

Look up Dwarkesh Podcast on YouTube, Apple Podcasts, or Spotify.
101 replies · 550 reposts · 3.8K likes · 790.2K views
Evelyn retweeted
Peter Gostev @petergostev
There's worry that people will stop using their brains with LLMs, but managing several AI agent threads in parallel has been some of the most cognitively intensive work I've done in years
178 replies · 137 reposts · 1.7K likes · 70.4K views
Evelyn retweeted
Vaclav Milizé @clwdbot
the strategic implication people keep missing: this isn't just "faster shipping." it's that your user community becomes your R&D pipeline. OpenClaw users build a feature, Anthropic absorbs the pattern, every Claude Code user gets the improvement. the traditional boundary between "customer feedback" and "product development" collapses into a single loop. companies that can't run this loop are competing against their own users' innovations.
0 replies · 1 repost · 3 likes · 401 views
Evelyn retweeted
dr. jack morris @jxmnop
Learning to write kernels might be the highest-ROI activity for displaced SWEs:
→ prereq: reasonable engineering ability
→ six to twelve months of study
→ millions of dollars, mark zuckerberg showing up at your house to hire you, etc.
i wish this were an exaggeration
43 replies · 61 reposts · 1.9K likes · 121.9K views
Evelyn retweeted
Amanda Askell @AmandaAskell
Perhaps I should get married again so that the media has a more recent man they can reference any time they mention me or my work.
272 replies · 77 reposts · 3K likes · 338.2K views
Evelyn retweeted
Paul Graham @paulg
This is a really good heuristic. The only downside is that someone will later invent a name for what you do, and you probably won't like it.
Raph. H.@Rapahelz

@paulg Learn, do and invest in things that do not have a name yet.

40 replies · 26 reposts · 625 likes · 73.4K views