Jessy Lin

365 posts

Jessy Lin
@realJessyLin

PhD @Berkeley_AI. ai x humans

Joined March 2013
1K Following · 5.3K Followers

Pinned Tweet
Jessy Lin @realJessyLin
As part of our recent work on memory layer architectures, I wrote up some of my thoughts on the continual learning problem broadly: Blog post: jessylin.com/2025/10/20/con… Some of the exposition goes beyond mem layers, so I thought it'd be useful to highlight separately:
tobi lutke @tobi
Lots of non tech friends want openclaws. So far i've set them up on VMs, but this is getting heavy. Are there any good multi-tenant openclaw setups or alt-claws yet that are good enough?
Suhas Kotha @kothasuhas
To improve fine-tuning data efficiency, replay generic pre-training data. Not only does this reduce forgetting, it actually improves performance on the fine-tuning domain! Especially when fine-tuning data is scarce in pre-training. (w/ @percyliang)
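The replay recipe in the tweet above is easy to sketch: reserve a slice of every fine-tuning batch for generic pre-training examples. A minimal sketch, assuming a `replay_ratio` knob and function names of my own invention (the tweet specifies no mixing ratio):

```python
import random

def mixed_batches(finetune_data, pretrain_data, replay_ratio=0.25,
                  batch_size=8, steps=100):
    """Yield fine-tuning batches with generic pre-training examples mixed in.

    replay_ratio is the fraction of each batch drawn from pre-training data
    (a hypothetical knob; the tweet does not name one).
    """
    n_replay = int(batch_size * replay_ratio)
    for _ in range(steps):
        batch = random.sample(finetune_data, batch_size - n_replay)
        batch += random.sample(pretrain_data, n_replay)  # the replayed slice
        random.shuffle(batch)
        yield batch
```

Each batch then updates the model with the usual objective; the replayed examples are what counteract forgetting of the pre-training distribution.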
Jessy Lin @realJessyLin
@gabriel1 thank u for saying the quiet thing out loud
gabriel @gabriel1
There is still no substitute for perfectly understanding every single line of code in your codebase. I fall into the trap of just skimming through AI changes to "just make sure it looks good" all the time, and it makes me lose so much time to not perfectly understand every line.
Aaron Levie @levie
In a world of openclaw, codex, claude code/cowork, manus, and other agentic systems, it’s becoming clear that the future of software has to be API-first, but also enable human interaction for verification, collaboration with agents and people, and working on the output.

It’s generally been the case that software was built for people first and foremost, and then APIs are exposed for other systems to connect into that tool or data. But if we imagine a world where AI agents are doing 10X or 100X more work with software than people, then this paradigm is flipped. Software becomes API-first, with ways of having humans be able to work effectively with the agent, either through a UI as relevant, or chat.

If you’re not API-first, then you’re nearly DOA to agents.
Jessy Lin retweeted
dr. jack morris @jxmnop
At long last, the final paper of my PhD 🧮 Learning to Reason in 13 Parameters 🧮 We develop TinyLoRA, a new fine-tuning method. With TinyLoRA + RL, models learn well with dozens or hundreds of params. Example: we use only 13 parameters to train a 7B Qwen model from 76% to 91% on GSM8K 🤯
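The tweet doesn't spell out how TinyLoRA works, but the standard LoRA parameter accounting it presumably compresses further is worth sketching: a rank-r adapter on a d_out × d_in weight trains only r·(d_out + d_in) numbers. A minimal numpy sketch (function names are mine, not from the paper):

```python
import numpy as np

def make_lora_adapter(d_out, d_in, rank, rng):
    """Standard LoRA: the frozen weight W is used as W + A @ B,
    and only A and B are trained."""
    A = rng.standard_normal((d_out, rank)) * 0.01
    B = np.zeros((rank, d_in))  # zero-init so the adapter starts as a no-op
    return A, B

def trainable_params(d_out, d_in, rank):
    """Trainable parameter count of a rank-`rank` adapter."""
    return rank * (d_out + d_in)
```

A rank-1 adapter on a 4096×4096 layer trains 8,192 numbers versus ~16.8M for full fine-tuning; getting all the way down to 13 parameters, as the tweet claims, requires compressing far beyond this (e.g. sharing or re-parameterizing A and B), which is presumably the paper's contribution.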
Jessy Lin retweeted
idan shenfeld @IdanShenfeld
People keep saying 2026 will be the year of continual learning. But there are still major technical challenges to making it a reality. Today we take the next step towards that goal — a new on-policy learning algorithm, suitable for continual learning! (1/n)
Jessy Lin retweeted
signüll @signulll
we choose to do this not because it is easy, but because it is fun as hell.
Jessy Lin retweeted
alphaXiv @askalphaxiv
New research from Sakana AI: "Fast-weight Product Key Memory"

The classic Product Key Memory (PKM) layer (a sparse key–value memory module used alongside attention) is a huge sparse memory, but it's made of "slow" weights: trained once, then frozen at inference, so it can't memorize new info at deployment.

Sakana AI's FwPKM makes PKM writable at test time: it does small chunk-level gradient updates to write key–value "episodes", then retrieves them with product-key lookup. This adds an episodic memory layer that stays effective far beyond the training context (4K -> 128K) and helps when relevant info is separated by thousands of tokens.
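The product-key lookup that both PKM and FwPKM rely on can be sketched briefly: the full key set is the Cartesian product of two small sub-key tables, so scoring |K1| × |K2| keys only costs |K1| + |K2| dot products, and the exact top-k over sums lies inside the k × k candidate grid. A minimal numpy sketch (the test-time gradient writes of FwPKM are not shown; function names are mine):

```python
import numpy as np

def product_key_topk(query, subkeys1, subkeys2, k=4):
    """Product-key lookup: score two half-queries against two small
    sub-key tables, then combine the top candidates from each half."""
    d = query.shape[0] // 2
    q1, q2 = query[:d], query[d:]
    s1 = subkeys1 @ q1                 # scores for first half-keys
    s2 = subkeys2 @ q2                 # scores for second half-keys
    top1 = np.argsort(s1)[-k:]         # top-k candidate half-key indices
    top2 = np.argsort(s2)[-k:]
    # composite score of key (i, j) is s1[i] + s2[j]; the global top-k
    # is guaranteed to lie in this k x k candidate grid
    scores = s1[top1][:, None] + s2[top2][None, :]
    flat = np.argsort(scores, axis=None)[-k:][::-1]
    i, j = np.unravel_index(flat, scores.shape)
    idx = top1[i] * subkeys2.shape[0] + top2[j]  # index into the product space
    return idx, scores[i, j]
```

The returned indices select value slots from a memory of size |K1| × |K2|; FwPKM's addition, per the tweet, is updating those values with small gradient steps at inference time.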
Jessy Lin @realJessyLin
Great post, and I generally find this way of reasoning about "limit cases" and things that should be true in principle to be really valuable for thinking about what approaches to "memory" and continual learning make sense in the long term (out of a huge and heterogeneous design space!)

> repeated data: when humans see the same piece of experience over and over again, we eventually stop updating -> what kind of update algorithm would make this true?

> integration into existing concepts: if someone tells you they're from Michigan, your representation of Michigan should also change -> what kind of representation/parameterization would make this true?
augustus odena @gstsdn

I have a bunch of thoughts about continual learning and nothing to do with them (I'm working on something else) so I figured I'd just turn them into a post:

First: I think people use "continual learning" to point at a cluster of issues that are related but distinct. I'll list the issues and then speculate about what might fix them.

a) Catastrophic Forgetting: If you train on a distribution D_1 and then do SFT on another distribution D_2, you'll often find that your performance on D_1 degrades. The extent of this issue is maybe overstated and is more true for SFT than for RL, but it's still real. There's also an important limit case that IMO is a "smell" for the way we train models currently: repeated data can seriously harm model performance. Humans don't have this problem - they eventually just stop updating on redundant information.

b) No integration of new knowledge into existing concepts: If I tell you that I'm from Michigan, you will update your representation of me to include that fact, but you will also change your representation of Michigan. Michigan becomes "a place where someone I know is from". If people ask you questions about Michigan in the future, you may answer those questions with this knowledge in mind. If I tell a chatbot that I'm from Michigan, that fact may get stored in a memory file about me, but it won't affect the model's representation of Michigan.

c) No consolidation from short-term memory to long-term memory: Models are good at accumulating information in context up to a point, but then they run out of context (or effective context) and performance degrades. They are missing a mechanism for deciding what's important to retain and then taking action to retain it.

d) No notion of timeliness: When you tell a human something, they also retain *when* they learned it, and that "time tag" becomes part of the representation. Humans experience a stream of facts unfolding through time. As a result we form an implicit model of history/causality. Many people can answer "who is the current Pope?" without doing a special search step.

Now that we've enumerated the issues, we can think about solutions. In AI it's always worth asking why the simplest solution can't work. The very simplest thing to try is what chatbots currently do: maintain a text file of memories. IMO it's obvious why this is unsatisfying relative to what humans are doing, so I won't dwell on it. I expect there are many refinements you could make here around learning to manually manage the text file, but I also expect these approaches to be brittle.

A slightly smarter thing that's still pretty simple is to just keep updating the model during deployment. I actually do think that something like this could work OK, but we probably need a few tweaks. Some combination of the following seems worth pursuing:

1. Sparser updates: Catastrophic forgetting is plausibly worsened by updating all parameters at once. I'd bet either selective parameter updates or making the models themselves sparser could help a lot here. @realJessyLin has some nice work here.

2. Update only on surprising data: Updating on every new datapoint feels wrong. We want a mechanism that decides what's important/surprising and only updates on that subset. A crude version: automatically generate questions about a datapoint and only update if the model fails to answer them. The hippocampus also has interesting mechanisms for doing this that seem worth trying to emulate.

3. Don't train on the raw datapoint w/ the standard objective: Given that we've decided a datapoint is surprising, I don't think we should just train on it using the standard objective. We may want to automatically generate questions about a given corpus and train on the answers (as in e.g. the Cartridges work) and we may also want to modify the objective. One option is to do prompt distillation with the facts in context - the intuition being that the consolidated model ought to answer the question as though it has the facts on hand.

These are "in-paradigm" approaches compatible with LLMs. I bet they'll yield real progress, but I'm also starting to suspect something less in-paradigm may be needed for a really satisfying solution. That's for a different post though.

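The "update only on surprising data" idea in the quoted post (its crude version: generate questions about a datapoint, update only if the model fails them) can be sketched as a gate around the update step. All callables here are hypothetical placeholders, not any real API:

```python
def surprise_gated_update(model, datapoint, gen_questions, answer, update,
                          threshold=0.5):
    """Quiz the model on auto-generated questions about `datapoint` and
    train on it only if the model fails enough of them.

    gen_questions(datapoint) -> list of (question, gold_answer) pairs
    answer(model, question)  -> model's answer to a question
    update(model, datapoint) -> model after training on the datapoint
    Returns (possibly-updated model, whether an update happened).
    """
    qas = gen_questions(datapoint)
    if not qas:
        return model, False
    wrong = sum(answer(model, q) != gold for q, gold in qas)
    if wrong / len(qas) > threshold:   # model is "surprised": it fails the quiz
        return update(model, datapoint), True
    return model, False                # already known: skip the update
```

Run twice on the same datapoint, the second pass should find nothing surprising and skip the update, which is exactly the "stop updating on redundant information" behavior the post wants.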
Jessy Lin @realJessyLin
Really interesting, thanks for sharing your thinking! this struck me as one of the elements of programming that's lost when vibecoding -- agents can implement features in isolation, but the end state often ends up being a bunch of "jagged edges" like you describe. A lot of what humans provide is a bigger picture of how they fit together
Mitchell Hashimoto @mitchellh
The lack of "feature design" is why so many products over time feel hollow or messy. This isn't visual design. This isn't architectural design.

I thought that a short video lecture on what feature design is, and a real case study of applying it in Ghostty, would be helpful.

Feature design is the planning step behind how you're going to solve one or more user problems with a product feature: what that feature looks like, how it feels, and not just how it's going to tactically solve these specific problems, but how that solution is going to interface with the edges of other features that currently exist or are planned to exist in the future.
Tanay Kothari @tankots
we just raised another $25M after 10x'ing our ARR in 5 months. the crazy part is this almost never happened.

17 years ago, I watched Iron Man as a 10-year-old kid in Delhi. that night, I pulled my first all-nighter teaching myself to code. not because I wanted to build apps or make money. because I wanted to build Jarvis.

my parents gave me 1 hour of screen time per day. so I coded in secret, sleeping every alternate night through middle school and high school. built 50+ apps. got a cease and desist from Google at age 12. all for this one obsession: making computers understand us like humans do.

fast forward to today:
- we've raised $81M total to build the voice operating system
- growing revenue 40% month-over-month this year
- 70% user retention after one year (unheard of in consumer)
- teams at 270 of the Fortune 500 use Wispr Flow daily

our Series A2 was led by @hanstung at @notablecap (who was an early investor in five companies that made it to $100B valuation like Slack, Tiktok, and Airbnb). we also brought on @StevenBartlett as an investor and partner.

but here's what matters more than the money: we cracked voice input. not transcription - actual understanding. our users hit "send" in under 0.5 seconds without checking. they trust it blindly. that's never existed before. in a recent benchmark, Wispr came out as 3-4x more accurate than OpenAI, ElevenLabs, and Siri.

and we're just getting started. voice input was step one. now we're building the assistant that actually does things for you.

to my co-founder @SahajGarg6 - there's no one else I'd rather build Jarvis with than my college roommate and closest friend. to our team pulling all-nighters and shipping magic - you're the reason that 10-year-old kid's dream is becoming real.

we're hiring cracked engineers and growth marketers who want to build the future of human-computer interaction. the keyboard had a good 150-year run. time to build what comes next.
PS: like, retweet, and bookmark to get wispr flow for free for 3 months ❤️ — Written with @WisprFlow
Mina Fahmi @minafahmi
A wrapper consisting of:

* An always available, low-latency, highly reliable streaming architecture with syllable-level user interruptions aligned across audio & text, orchestrating several models in sequence and in tandem
* A conversational entity that actually expands your thinking, with long-term memory, notes, and a personalized voice, developed using an in-house synthetic evaluation system
* Deep iOS integrations for reliable connectivity, media control, and the ability to view, edit, & share notes via a UI designed for audio-first interactions
* On-device firmware including audio compression, sensor fusion, haptics calibration, all-day battery life, that can isolate a whisper in a crowd
* Hardware designed, sourced, and tested in-house to be reliable, water-resistant, high-quality, and great to wear

@sandbar
Brother Green @BrotherGreen13

This is, and I can't stress this enough, a Bluetooth microphone. All the "AI" isn't in the ring. The "AI" is just an app, that is ChatGPT wrapped in an organizer.

Jessy Lin retweeted
Tony Zhao @tonyzzhao
Today, we present a step-change in robotic AI @sundayrobotics. Introducing ACT-1: A frontier robot foundation model trained on zero robot data. - Ultra long-horizon tasks - Zero-shot generalization - Advanced dexterity 🧵->
Jessy Lin @realJessyLin
I really like this idea of having agents that control a computer with your context and data, as we're all trying to figure out the right form factor for computer/browser use agents zo's answer is ~the equivalent of "personal computers" for the ai era, and it's so cool to think about how it enables the average person to script and automate things in their lives that would otherwise be inaccessible congrats @0thernet @perceptnet !! 💻
ben guo 🪽 @0thernet

today we're announcing @zocomputer. when we came up with the idea – giving everyone a personal server, powered by AI – it sounded crazy. but now, even my mom has a server of her own. and it's making her life better.

she thinks of Zo as her personal assistant. she texts it to manage her busy schedule, using all the context from her notes and files. she no longer needs me for tech support. she also uses Zo as her intelligent workspace – she asks it to organize her files, edit documents, and do deep research. with Zo's help, she can run code from her graduate students and explore the data herself. (my mom's a biologist and runs a research lab. hi mom)

Zo has given my mom a real feeling of agency – she can do so much more with her computer. we want everyone to have that same feeling. we want people to fall in love with making stuff for themselves. in the future we're building, we'll own our data, craft our own tools, and create personal APIs. owning an intelligent cloud computer will be just like owning a smartphone. and the internet will feel much more alive.

THIS ONE'S FOR YOU MOM ❤️

special thank you to @modal, @pydantic AI, and @steeldotdev for being great partners leading up to this launch. and thank you @cursor_ai for being my sword 🗡️ and thank you to everyone who believed in us. a small handful: @southpkcommons, @adityaag, @chrisbest, @rauchg, @immad, @shreyas, @MattHartman, @lessin, @gokulr, @sabrinahahn, @iqramband, @whoisnnamdi, @guruchahal, @mikemarg_, @gaybrick, @SJCizmar, @magdovitz, @anneleeskates, @henloitsjoyce, @sugarjammi, @vibethinker, @aaronmakhoffman, @Sunfield__

Jessy Lin retweeted
Eugenia Kuyda @ekuyda
Today, we’re thrilled to announce $20M in funding led by @a16z, with support from @saranormous, @amasad, @akothari, @garrytan, @justinkan, @atShruti, @naval, @scottbelsky, @gokulr, @soleio, @kevinhartz and more.

@wabi is ushering in a new era of personal software, where anyone can effortlessly create, discover, remix, and share personalized mini apps. For 50 years, software was made for people. The next 50, it will be made by people. Just as YouTube unlocked creative power through video, Wabi will unlock creative power through software.

The YouTube moment for apps is here. We can’t wait to see what you create.