Franck Verrot

15.8K posts


@franckverrot

Engineering @omadahealth. Building better healthcare by day, AI harnesses and simulators at night. Opinions mine.

San Francisco, CA · Joined June 2009
660 Following · 848 Followers
Franck Verrot@franckverrot·
New LFM2.5-350M from @liquidai just dropped. My previous attempt at running an LM on-device for structured output wasn't successful, so I had to resort to a 14-head classifier. Thanks to this new model, I've been able to fine-tune a 350M model in 30 seconds and hit 93% precision with zero parse errors: that's a 175MB model outperforming models 11x its size. The classifier might be retiring soon!
Franck Verrot@franckverrot

During allergy season, when I'm asked "how was your weekend?" I just say I stayed home. What I don't say is that I spent an entire day on an "interesting" problem.

We had long conversations about colleges this weekend: my son shared the major he's pursuing, said D1 soccer was high on the list of requirements, and that it should ideally not break the (i.e. "Dad's") bank. All these requirements add up (state, school size, "I'm a city guy, Dad, not sure about UCSB", SUs vs UCs vs ...). I probably went overboard with this, but for once I said no to a simple spreadsheet. I grew up in a simpler time, in a simpler system too (not sure if it's still easy there). So... I built an app.

I tried running a 0.8B language model on an iPhone to parse natural-language queries into structured filters. Precision was very low (9%), and it produced garbage JSON half the time. I replaced it with a ~60M-parameter classifier with 14 heads (one per filter dimension): 100% precision, no parse errors, and around 15ms inference time on-device.

If you're an AI engineer, you may find the technical writeup interesting. If you're a parent, just know an app is on the way. franck.verrot.us/blog/2026/03/2…

1 reply · 0 reposts · 0 likes · 198 views
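For readers wondering what a "14-head classifier" looks like in practice, here is a minimal pure-Python sketch of the architecture described above: one shared query embedding feeds independent per-dimension heads, and each head simply argmaxes over its own label set, so no JSON parsing can ever fail. The head names, label counts, dimensions, and weights below are invented for illustration; they are not the app's real ones.

```python
import random

# Hypothetical heads: one per filter dimension (the real app has 14).
# Each maps the shared embedding to a fixed, closed label set.
HEADS = {
    "state": 50,        # one class per US state
    "school_size": 3,   # small / medium / large
    "division": 4,      # D1 / D2 / D3 / none
    # ... 11 more heads in the real setup
}
HIDDEN = 8  # toy encoder width

random.seed(0)

def linear(vec, weights):
    """Dense layer: `weights` is a list of rows, one row per output class."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

# Random toy weights standing in for trained parameters.
head_weights = {
    name: [[random.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(n)]
    for name, n in HEADS.items()
}

def classify(query_embedding):
    """Run every head over the shared embedding; each head argmaxes
    independently, so the output is structured by construction."""
    out = {}
    for name, weights in head_weights.items():
        logits = linear(query_embedding, weights)
        out[name] = max(range(len(logits)), key=logits.__getitem__)
    return out

filters = classify([0.1] * HIDDEN)
```

Because every head can only emit an index into a closed label set, "zero parse errors" is a structural guarantee rather than a property the model has to learn.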
Franck Verrot@franckverrot·
@liquidai This model definitely beats everything in my previous benchmarks, and it makes the current option (the multi-head classifier) much less appealing, even though it was the only working option just a couple of weeks ago. Thanks! franck.verrot.us/blog/2026/03/3…
0 replies · 3 reposts · 38 likes · 8K views
Liquid AI@liquidai·
Today, we release LFM2.5-350M. Agentic loops at 350M parameters. A 350M model trained for reliable data extraction and tool use, where models at this scale typically struggle. <500MB when quantized, built for environments where compute, memory, and latency are constrained. 🧵
79 replies · 278 reposts · 2.3K likes · 344.9K views
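The "<500MB when quantized" claim is easy to sanity-check with back-of-the-envelope arithmetic (my own, not Liquid AI's numbers): weight storage ≈ parameter count × bytes per weight, ignoring runtime overhead like the KV cache and activation buffers.

```python
# Rough weight-storage estimate for a 350M-parameter model at
# different quantization levels. This ignores runtime memory
# (KV cache, activations), so real footprints are somewhat larger.

PARAMS = 350e6

def footprint_mb(bits_per_weight):
    """Approximate weight storage in megabytes (1 MB = 1e6 bytes)."""
    return PARAMS * bits_per_weight / 8 / 1e6

fp16 = footprint_mb(16)  # ~700 MB: over the stated budget
int8 = footprint_mb(8)   # ~350 MB: comfortably under 500 MB
int4 = footprint_mb(4)   # ~175 MB: matches the "175MB model" figure above
```

Notably, the 4-bit estimate lines up with the "175MB model" mentioned in the fine-tuning post earlier in this feed.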
Franck Verrot@franckverrot·
@Alexintosh Really cool stuff, thanks for sharing! I went down the rabbit hole this weekend and decided to move away from Qwen3.5 0.8B to a regular multi-head classifier instead (plus a bunch of workarounds). Hopefully I can go back to an SLM soon! x.com/franckverrot/s…
Franck Verrot@franckverrot

[quoted post, duplicated in full above: "During allergy season…"]

1 reply · 0 reposts · 1 like · 191 views
alexintosh@Alexintosh·
I just ran Qwen3.5 35B on my iPhone at 5.6 tok/sec. Fully on-device. 4-bit | 256 experts. Model: 19.5GB. iPhone: 12GB RAM. Wild.
91 replies · 151 reposts · 2.3K likes · 387.4K views
Franck Verrot@franckverrot·
[original post, quoted in full above: "During allergy season…"]
0 replies · 1 repost · 0 likes · 485 views
Franck Verrot@franckverrot·
@rudrank Learning I wasn't taking the crazy pill is validating... 😅 Thanks for building this!
0 replies · 0 reposts · 1 like · 14 views
Franck Verrot@franckverrot·
"You're on mute" is over: the new meeting problem is not shutting up. So I built a quick MonologueDetector app. It uses on-device speaker diarization to detect when you've been talking too long: the menu bar icon goes from green to red, and if you keep going your entire screen turns red. All local, private, and open source: it runs Sortformer via MLX on Apple Silicon. github.com/franckverrot/M…
1 reply · 0 reposts · 3 likes · 246 views
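The escalation policy behind such a monologue detector can be sketched in a few lines. The diarization itself (Sortformer) is out of scope here: assume it emits a per-second flag for "the local user is speaking". The thresholds below are invented for illustration, not the app's actual values.

```python
# Hypothetical talk-time policy: a rolling counter of continuous
# speaking time maps to an escalating status. Thresholds are made up.

WARN_AFTER = 60    # seconds of continuous talking -> menu bar icon red
ALERT_AFTER = 120  # keep going -> full-screen red

def status(seconds_talking):
    """Map continuous talk time to a display state."""
    if seconds_talking >= ALERT_AFTER:
        return "screen-red"
    if seconds_talking >= WARN_AFTER:
        return "icon-red"
    return "icon-green"

def run(speaking_flags):
    """Fold a stream of per-second speaking flags into statuses,
    resetting the counter whenever the speaker yields the floor."""
    elapsed, out = 0, []
    for speaking in speaking_flags:
        elapsed = elapsed + 1 if speaking else 0
        out.append(status(elapsed))
    return out
```

The reset-on-silence rule is what makes this a *monologue* detector rather than a total-talk-time meter: a single pause hands you a fresh budget.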
Franck Verrot@franckverrot·
Would anyone find this useful? It intercepts MCP calls (HTTP, with a wrapper for stdio ones) and will implement a central policy registry for companies who care about this. (It also does traffic, execution, and file monitoring: all of these need to be buttoned up.)
Franck Verrot@franckverrot

@berman66 Agreed, and even for individual devs and their side-projects, this is too wild. Maybe I should make this tool public (macOS only)

0 replies · 0 reposts · 0 likes · 172 views
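Here is a minimal sketch of the kind of policy gate such an interceptor could apply to MCP `tools/call` requests before forwarding them. The registry shape, tool names, and rules are hypothetical, not the actual tool's design; only the `tools/call` method name and its `params.name`/`params.arguments` layout come from the MCP wire format.

```python
# Hypothetical central policy registry: an allow-list of tools plus
# deny rules on sensitive arguments. An interceptor would run check()
# on every JSON-RPC request before forwarding it upstream.

POLICY = {
    "allow_tools": {"search", "read_file"},
    "deny_paths": ("/etc/", "/Users/me/.ssh/"),
}

def check(request):
    """Return (allowed, reason) for an intercepted JSON-RPC request."""
    if request.get("method") != "tools/call":
        return True, "not a tool call"
    params = request.get("params", {})
    tool = params.get("name", "")
    if tool not in POLICY["allow_tools"]:
        return False, f"tool {tool!r} not in allow-list"
    path = params.get("arguments", {}).get("path", "")
    if any(path.startswith(p) for p in POLICY["deny_paths"]):
        return False, f"path {path!r} is denied"
    return True, "ok"
```

A central registry like this is what turns "a bunch of unmonitored CLIs" into something an organization can actually reason about: one place to audit, one place to tighten.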
Franck Verrot@franckverrot·
[original reply, quoted in full above: "@berman66 Agreed…"]
0 replies · 0 reposts · 0 likes · 630 views
Andy Berman@berman66·
100%. No company wants a bunch of unmonitored CLIs on users' machines without common logging and introspection. This is such a massive security risk with 1 developer, never mind N or N+500. If only something like MCP existed to solve this. Great read.
yenkel@yenkel

x.com/i/article/2032…

31 replies · 14 reposts · 279 likes · 85.9K views
Franck Verrot@franckverrot·
@Vtrivedy10 @LangChain Really great post. It does seem to validate the idea that harness frameworks and technologies are now their own category of products, and I'm personally eager to see more interoperability/portability between these different frameworks.
0 replies · 0 reposts · 1 like · 26 views
Viv@Vtrivedy10·
at @LangChain we spend a lot of time designing harnesses as systems around models to do useful work

in this blog, we take a first-principles look at why harnesses exist and how they help us craft good product experiences + correct model failure modes

we cover filesystems, code execution, sandboxes, context rot, ralph loops, and why the best harness for your model probably isn't the one it shipped with

reach out if interested in agents & harnesses! our team is doing a lot of interesting work here with deepagents
Viv@Vtrivedy10

x.com/i/article/2031…

7 replies · 38 reposts · 458 likes · 114.2K views
Franck Verrot@franckverrot·
And in the meantime, the harness framework is taking shape. Attempt #54634524 is turning out interesting, with some Elm-style niceties included.
0 replies · 0 reposts · 0 likes · 87 views
Franck Verrot@franckverrot·
I don’t know which representation resonates best with users: a real flamegraph or a cascading view of the trace. But as the tool spins up new agents as it needs them, it’s pretty easy to see what work gets done and where it needs to improve its strategy. Still using Qwen on Cerebras for this one, to push the decision to move to larger models down the road, only if it's really needed. It’s no clawdbot or Twin AI yet, but it’s promising.
0 replies · 0 reposts · 0 likes · 72 views
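The "cascading view of the trace" mentioned above amounts to nesting flat spans into a tree by parent id and indenting by depth; a flamegraph is the same data drawn with width proportional to duration. A sketch with invented span data (the agent names and timings are illustrative only):

```python
# Hypothetical flat trace events: (id, parent, name, start, end).
# A real collector would record these as agents and tools are spawned.
spans = [
    {"id": 1, "parent": None, "name": "orchestrator", "start": 0.0, "end": 9.0},
    {"id": 2, "parent": 1, "name": "planner-agent", "start": 0.5, "end": 3.0},
    {"id": 3, "parent": 1, "name": "coder-agent", "start": 3.1, "end": 8.5},
    {"id": 4, "parent": 3, "name": "tool:run_tests", "start": 6.0, "end": 8.0},
]

def render(spans):
    """Return cascading-trace lines: children indented under parents,
    siblings ordered by start time, durations shown per span."""
    children = {}
    for s in spans:
        children.setdefault(s["parent"], []).append(s)
    lines = []
    def walk(parent, depth):
        for s in sorted(children.get(parent, []), key=lambda s: s["start"]):
            lines.append("  " * depth + f"{s['name']} ({s['end'] - s['start']:.1f}s)")
            walk(s["id"], depth + 1)
    walk(None, 0)
    return lines

lines = render(spans)
```

The cascading view tends to answer "what happened, in what order?", while the flamegraph answers "where did the time go?": same spans, different sort key.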
Franck Verrot@franckverrot·
I guess I’ve lived long enough to reimplement OTel… but I’m satisfied with how this multi-agent, dynamic “swarm” mode is turning out. Claude Code is probably still more sophisticated on that front, but all of this was done with Qwen on Cerebras (it’s really fast). Little preview of a simple sequence diagram for a zero-shot tiny Ruby app.
0 replies · 0 reposts · 4 likes · 737 views