Lior Alexander

3.8K posts

@LiorOnAI

Founder @AlphaSignalAI → the Intelligence layer of AI (300k users) • MIT Lecturer • ex-MILA researcher • In ML since GANs

San Francisco, CA Joined November 2012
2.3K Following 115.4K Followers
Sohrab Noorsalehi Garakani
Sohrab Noorsalehi Garakani@DocSoEnDji·
@LiorOnAI In this case there's a plethora of cause and effect relationships and causal loops that help to explain a lot
2
0
1
41
Daniel Wright
Daniel Wright@Danielwright_UX·
@maxclark Hi Max, I edit video podcasts regularly and have handled this type of work many times. Clean cuts, strong pacing, and keeping the conversation engaging without over-editing is exactly what I focus on.
1
0
0
75
Max Clark
Max Clark@maxclark·
Looking for a podcast (video) editor on a project (per recording) basis.
If this is what you do, let me know why we should work with you.
94
0
80
3.5K
Factory
Factory@FactoryAI·
Today, we are excited to announce our $150M Series C led by Khosla Ventures with strong participation from Sequoia Capital, Blackstone, Insight Partners, Evantic Capital, Abstract Ventures, 20VC, NEA, and Mantis VC. This puts our valuation at $1.5B and will accelerate our investment in research, product, and global go-to-market. Long live developers.
Matan Grinberg@matanSF

x.com/i/article/2044…

54
39
656
256.1K
mango
mango@mangoster·
usually companies are under the impression that they can find someone who just 'gets it' but more often than not, u need to invest both time and money into a creative for them to reach a level where they just 'get it'

graduates on the other hand, severely underestimate the skill required to be seen as someone worth investing into
richard@richardzphotoz

I find it so interesting how companies can’t seem to find the talent they need and graduates can’t find a job.

3
0
34
4.2K
newo
newo@newomp4·
Made this during my economics lectures 😭 Once I’m out of school I’ll be making stuff 10x better than this. Only the beginning.
3
0
21
1.2K
Arlan
Arlan@arlanr·
i got a huge office in the heart of san francisco literally yesterday and i’m filling the fridge with unlimited steaks and snacks. our team is extremely lean (just 2) and we’re working on both research and consumer-facing problems in the context space. this is going to be the best summer of your life, so if you’re an engineer, researcher, or devrel, hit me up. very generous equity + salary so you can focus on building. if you’re from another state, flights are covered.
Ben Lang@benln

Best companies for future founders to work at these days: Ramp, Cursor, OpenAI. Where else?

48
2
317
42.1K
Marouane Lamharzi Alaoui
Marouane Lamharzi Alaoui@marouane53·
Good article, but two corrections I need to mention.

Section 11 blends AutoDream and Session Memory into one system. They're separate. autoDream.ts is cross-session consolidation with a cheap-first gate order (time, session count, lock). The 10,000-token init threshold, 5,000-token update interval, 3-tool-call trigger, and the 10-category template come from the SessionMemory pipeline. Every detail you cited exists in the repo, just not in the same subsystem.

On the buddy system, "without any storage, database, or API overhead" isn't right. The deterministic hash gives you species, eyes, hat, stats. But personality and name come from a model call on first hatch and get saved to config. No database, yes. But there is storage and there is an API call.
2
0
10
2.5K
Lior Alexander
Lior Alexander@LiorOnAI·
You can now automate harness engineering.

System prompts. Tool definitions. Retry logic. Context management.

Changing just this layer can create a 6x performance gap on the same model.

It's called Meta-Harness. Here's how it works:

1. Start with any harness. A coding agent gets a folder with its code, logs, and scores.
2. The agent reads those files. It finds what caused each failure.
3. It rewrites the harness and submits a new version.
4. That version gets tested. Its results go back into the folder.

The loop repeats. The folder grows every round.

Previous methods squeezed everything into short summaries. The optimizer only saw around 26K tokens per step. Not enough to trace why something broke.

Meta-Harness keeps every raw file. 10 million tokens per step. 400x more to learn from. Enough to trace a failure back to the exact line that caused it.

Results:
- On TerminalBench-2 coding tasks, it ranks #1 among all Haiku agents.
- On text classification, it beats the best hand-designed harness by 7.7 points while using 4x fewer tokens.
- On math, a single strategy improves accuracy across five models it never saw.

All of this came from optimizing the harness, not the model.

The performance delta between frontier models is narrowing. The delta between harness implementations on the same model is not. That's where the leverage is.
Yoonho Lee@yoonholeee

How can we autonomously improve LLM harnesses on problems humans are actively working on? Doing so requires solving a hard, long-horizon credit-assignment problem over all prior code, traces, and scores. Announcing Meta-Harness: a method for optimizing harnesses end-to-end

17
33
248
43K
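The four-step loop in the Meta-Harness post above can be sketched roughly as follows. This is a toy illustration, not the Meta-Harness implementation: `run_loop`, `toy_evaluate`, `toy_propose`, the `round_NNN` folder layout, and the `retries=N` "harness" string are all hypothetical names made up for this example. In the described method the proposer is a coding agent reading the raw files; here it is a trivial rule that reads the latest log.

```python
import json
import tempfile
from pathlib import Path


def run_loop(workdir, harness, propose, evaluate, rounds):
    """Evaluate a harness, archive its raw artifacts, then ask for a new one."""
    best_harness, best_score = harness, float("-inf")
    for i in range(rounds):
        score, log = evaluate(harness)
        # Keep every raw file (no summarizing), one folder per round.
        round_dir = workdir / f"round_{i:03d}"
        round_dir.mkdir(parents=True)
        (round_dir / "harness.txt").write_text(harness)
        (round_dir / "log.txt").write_text(log)
        (round_dir / "score.json").write_text(json.dumps({"score": score}))
        if score > best_score:
            best_harness, best_score = harness, score
        # The proposer sees the whole growing folder, not just the last score.
        harness = propose(workdir)
    return best_harness, best_score


def toy_evaluate(harness):
    """Hypothetical benchmark: the task only succeeds with 3 or more retries."""
    retries = int(harness.split("=")[1])
    score = min(retries, 3) / 3
    log = "ok" if retries >= 3 else f"task failed after {retries} retries"
    return score, log


def toy_propose(workdir):
    """Stand-in for the coding agent: read the latest log, patch the harness."""
    latest = sorted(workdir.iterdir())[-1]
    harness = (latest / "harness.txt").read_text()
    log = (latest / "log.txt").read_text()
    retries = int(harness.split("=")[1])
    return f"retries={retries + 1}" if "failed" in log else harness


with tempfile.TemporaryDirectory() as tmp:
    best, best_score = run_loop(Path(tmp), "retries=0", toy_propose, toy_evaluate, rounds=5)
    n_rounds_saved = len(list(Path(tmp).iterdir()))
print(best, best_score, n_rounds_saved)
```

The point of the sketch is the data flow: each round appends a full folder of artifacts, and the proposer is handed the directory itself instead of a compressed summary.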
Lior Alexander
Lior Alexander@LiorOnAI·
Google's latest paper on Compression is the future. Here's why.

They compressed LLM memory 6x with zero accuracy loss.

When ChatGPT writes a reply, it remembers every word you've said. That memory is stored in a growing notebook (KV cache). A 100,000-word conversation can eat 16 GB of GPU memory. That's half of what most high-end GPUs even have.

This is the #1 cost of running AI. Not the thinking. The remembering.

TurboQuant shrinks each number in that notebook from 32 bits to just 3. That's like replacing a full paragraph with three words and losing nothing. No retraining. Works on any model instantly.

Compressing numbers usually destroys their meaning. Here's how they solved it:

1. Rotate the numbers randomly so they all land on a predictable curve (PolarQuant)
2. Use one extra bit to fix the tiny errors left behind (QJL)

Once numbers are predictable, you need far fewer bits to store them.

The results:
> 8x faster on Nvidia H100 GPUs
> 16 GB notebook shrinks to under 3 GB
> Search indexing drops from 500 seconds to 0.001
> Accuracy identical to the uncompressed model

There's a proven math limit on how good compression can get. TurboQuant is only 2.7x above that floor. We're near the ceiling.

Every company running LLMs spends most of its budget on memory. This cuts that cost by over 80%.

The race is no longer about bigger models. It's about cheaper inference. Models that needed a $200K server cluster start fitting on a single $2K GPU. AI agents run 24/7 without burning budgets.

The companies that win won't just have the best models. They'll have the best compression.

Papers are open-access on arXiv, presented at ICLR.
24
32
177
25.9K
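To give some intuition for the "rotate first, then store with few bits" idea in the TurboQuant post above, here is a small sketch. It is not PolarQuant, QJL, or TurboQuant: it assumes a generic randomized Hadamard rotation (random sign flips followed by a Walsh-Hadamard transform) and a plain uniform 3-bit quantizer, and all names (`fwht`, `quantize_roundtrip`, `sq_error`) are made up for the example. It only shows why spreading an outlier across coordinates makes low-bit storage less lossy.

```python
import math
import random


def fwht(values):
    """Normalized fast Walsh-Hadamard transform (orthogonal, its own inverse)."""
    v = list(values)
    n = len(v)  # assumed to be a power of two
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    return [x / math.sqrt(n) for x in v]


def quantize_roundtrip(values, bits=3):
    """Uniformly quantize to 2**bits levels over [-m, m], then dequantize."""
    m = max(abs(x) for x in values)
    step = 2 * m / (2 ** bits - 1)
    return [round((x + m) / step) * step - m for x in values]


def sq_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


rng = random.Random(0)
n = 64
x = [100.0] + [rng.uniform(-1, 1) for _ in range(n - 1)]  # one large outlier
signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]

# Baseline: quantize the raw vector; the outlier stretches the quantization grid.
err_plain = sq_error(x, quantize_roundtrip(x))

# Rotate (sign flips, then Hadamard), quantize, then undo the rotation.
rotated = fwht([s * xi for s, xi in zip(signs, x)])
restored = [s * yi for s, yi in zip(signs, fwht(quantize_roundtrip(rotated)))]
err_rotated = sq_error(x, restored)

print(err_plain, err_rotated)
```

Because the rotation is orthogonal, the error measured after rotating back equals the error introduced in the rotated domain, so the comparison between the two numbers is fair.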
CK
CK@CMKiesling·
@LiorOnAI Great summary! Gentle clarification: It isn't the first world model to avoid collapse (VAEs/diffusion do), just the first JEPA to do it cleanly. SIGReg is a strict optimization, not a mathematical impossibility. Also, LLMs & JEPAs are complementary, not competing paths!
1
1
21
2.1K
Lior Alexander
Lior Alexander@LiorOnAI·
Just read LeCun's latest paper. His team trained the first world model that can't collapse. Let me explain why this matters.

It's called LeWorldModel.

World models predict what happens next physically. Objects moving, falling, colliding. That's the base layer for robots that plan, cars that simulate before they steer, any AI that acts in reality instead of just talking about it.

The catch is nobody could train these reliably. The models kept cheating. They'd map every input to the same output. Like a weather app stuck on "sunny" forever. Technically predicting. Completely useless.

So teams piled on fixes. Frozen encoders, stop-gradient hacks, 6+ loss hyperparameters. A fragile stack too brittle for production.

This team asked a different question. What if you make collapse mathematically impossible?

An encoder turns each video frame into a small vector. A predictor takes that vector plus an action and guesses the next one.

First loss: how wrong was the guess.
Second loss: a regularizer called SIGReg that checks if vectors spread out like a bell curve.

If they start looking the same, the loss spikes. The model can't cheat because the math won't let it.

That simplicity is what makes the results possible. Six hyperparameters became one. 15M parameters. Trains on one GPU in hours. Plans 48x faster. Encodes with ~200x fewer tokens.

Open-source. I could run this on my own hardware. Which changes who gets to build physical AI. Not just big labs anymore. Any team, any startup, any grad student.

LeCun has pushed JEPA as the path forward. The criticism was always training instability. This paper removes that objection.

Two directions compete in AI right now. Bigger LLMs with more compute. Or small models learning physics from raw pixels.
73
173
1.2K
99.9K
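The "second loss" described in the LeWorldModel post above can be illustrated with a simplified stand-in. This is not the paper's SIGReg statistic: as an assumption for the sketch, it only projects embeddings onto random unit directions and penalizes each projection for deviating from zero mean and unit variance, which is a much weaker check than a full bell-curve test. The function name `spread_penalty` and the sample sizes are hypothetical.

```python
import math
import random


def spread_penalty(vectors, num_directions=16, seed=0):
    """Average penalty for projections that are not zero-mean, unit-variance."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    total = 0.0
    for _ in range(num_directions):
        # Draw a random unit direction.
        d = [rng.gauss(0, 1) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in d))
        d = [c / norm for c in d]
        # Project every embedding onto it and compare moments to N(0, 1).
        proj = [sum(c * v for c, v in zip(d, vec)) for vec in vectors]
        mean = sum(proj) / len(proj)
        var = sum((p - mean) ** 2 for p in proj) / len(proj)
        total += mean ** 2 + (var - 1.0) ** 2
    return total / num_directions


rng = random.Random(1)
# Healthy embeddings: spread out like a standard Gaussian.
spread = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(2000)]
# Collapsed embeddings: every input mapped to the same vector.
collapsed = [[0.5] * 8 for _ in range(2000)]

loss_spread = spread_penalty(spread)
loss_collapsed = spread_penalty(collapsed)
print(loss_spread, loss_collapsed)
```

When all embeddings are identical, every projection has zero variance, so the penalty is close to 1 or higher regardless of direction, while well-spread embeddings score near 0. That gap is the "loss spikes" behavior the post describes.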
Ziran Yang
Ziran Yang@__zrrr__·
Introducing Goedel-Code-Prover 🌲

LLMs write code, but can they prove it correct? Not just pass tests, but construct machine-checkable proofs that a program works for ALL possible inputs.

We built a system that does exactly this. Given a program and its specification in Lean 4, Goedel-Code-Prover automatically synthesizes formal proofs of correctness.

Our 8B model achieves 62% overall success rate across three benchmarks (Verina, Clever & AlgoVeri), a 2.6x improvement over the strongest baseline, surpassing both frontier LLMs (GPT/Gemini/Claude) and open-source theorem provers up to 84x larger (DeepSeek-Prover/Goedel-Prover/Kimina-Prover/BFS-Prover).
21
76
554
70.1K
Lior Alexander
Lior Alexander@LiorOnAI·
Dario Amodei in 2025: "In 12 months, we may be in a world where AI is writing essentially all of the code."

Anthropic 2026:
- Jan 2026: Claude Cowork launched.
- Feb 2026: Opus 4.6 released.
- Feb 2026: Sonnet 4.6 released.
- Feb 2026: Cowork launched on PC
- Feb 2026: PowerPoint integration
- Feb 2026: Excel integrations added.
- Feb 2026: Co-work plug-ins released.
- Feb 2026: Claude Code security launched.
- Feb 2026: Claude Code Remote Control
- Feb 2026: Scheduled Task in Co-work
- Feb 2026: Connector available in the free
- Mar 2026: Claude memory is free
- Mar 2026: Claude Marketplace launched
- Mar 2026: Claude com ambassadors
- Mar 2026: Code review for Claude code
- Mar 2026: Claude skills for Excel & Slides
- Mar 2026: charts & diagram in chat
- Mar 2026: 1 million context window
- Mar 2026: Dispatch for Claude Co-work
- Mar 2026: Claude code Channels
- Mar 2026: Co-work Projects
- Mar 2026: Claude Computer use
- Mar 2026: Auto mode in Claude code.
19
17
199
24.9K