Lior Alexander

3.8K posts


@LiorOnAI

Building the Bloomberg of AI @AlphaSignalAI (280K subs) • MIT lecturer • MILA researcher • 9 yrs in ML • SF 🌁

Joined November 2012
2.2K Following · 114.1K Followers
Ayush S
Ayush S@ayushswrites·
Today we're launching a new brand and website for Warp.

Our last website was built 18 months ago. At the time, we had a handful of customers, a small team, and a product thesis that most people thought was a dead end. The feedback in 2023 from investors was consistent: the market's too fragmented, the incumbents are too entrenched, you're betting on technology that doesn't work yet.

We didn't listen.

Since then, Warp has grown to serve thousands of companies. We've processed hundreds of millions in payments, on track to $1B this year. Companies are migrating away from ADP, Rippling, Gusto. Some breaking their contracts to switch. We went from a payroll product to a full platform: HRIS, payroll, AI tax compliance, benefits, IT, global payroll. The first employee management platform that runs itself.

At a certain point, the gap between what you've become and how you present yourself starts to work against you. So we rebuilt everything.

The new brand is built around a tension we love: analog precision meets velocity. Technical, but warm. Engineered for performance, designed with soul. Think 1960s racing garage meets modern editorial design. It reflects how we build the product: obsessive attention to the details you never see. The invisible circuit boards are beautiful.

We're just getting started. This is the next chapter.
34
10
213
273.7K
Lior Alexander
Lior Alexander@LiorOnAI·
@katiekirsch Would love to understand how the selection was made. Every incredible content creator I know was rejected.
0
0
0
67
Sam Hogan 🇺🇸
Sam Hogan 🇺🇸@samhogan·
I’m hosting dinner parties again. 8-10 people, twice per month in San Francisco. If you are a founder, and especially if you are NOT a founder, and you would like to come for an evening of good food and conversation, send me a DM. First dinner is March 27.
71
12
683
75K
Lior Alexander
Lior Alexander@LiorOnAI·
Every foundation model you've ever used has the same bug. It just got fixed.

Since 2015, every deep network has been built the same way: each layer does some computation, adds its result to a running total, and passes it forward. Simple. But there's a problem: by layer 100, the signal from any single layer is buried under the sum of everything else. Each new layer matters less and less. Nobody fixed this because it worked well enough.

Moonshot AI just changed that. Their new method, Attention Residuals, lets each layer look back at all previous layers and choose which ones actually matter right now. Instead of a blind running total, you get selective retrieval.

The analogy: imagine writing an essay where every draft gets merged into one document automatically. By draft 50, your latest edits are invisible. AttnRes lets you keep every draft separate and pull from whichever ones you need.

What this fixes:
1. Deeper layers no longer get drowned out
2. Training becomes more stable across the whole network
3. The model uses its own depth more efficiently

To make it practical at scale, they group layers into blocks and attend over block summaries instead of every single layer. Overhead at inference: less than 2%. The result: a 1.25x compute advantage, matching baseline performance with roughly 20% less compute. Tested on a 48B-parameter model. Holds across sizes.

Residual connections have been invisible plumbing for a decade. Now they're becoming dynamic. The next generation of models won't just pass through their own layers, they'll search them.
Kimi.ai@Kimi_Moonshot

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation.

Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.

🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.

🔗 Full report: github.com/MoonshotAI/Att…

8
17
91
19.2K
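Neither post shows code, but the mechanism is easy to sketch. Below is a toy, dependency-free illustration of the difference between a standard residual stream (uniform sum of all previous layer outputs) and attention-style retrieval over them. All names are hypothetical; this is not Moonshot's implementation, which uses learned projections and block summaries.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def plain_residual(history):
    """Standard residual stream: every past layer output is summed with
    equal weight, so any single layer's signal dilutes with depth."""
    dim = len(history[0])
    return [sum(h[i] for h in history) for i in range(dim)]

def attn_residual(history, query):
    """AttnRes-style aggregation (toy): score each past layer output
    against the current query and take a weighted combination, so the
    layer retrieves what matters instead of summing blindly."""
    weights = softmax([dot(query, h) for h in history])
    dim = len(history[0])
    return [sum(w * h[i] for w, h in zip(weights, history)) for i in range(dim)]
```

With a query aligned to one dimension, `attn_residual` concentrates weight on the history entries that match it, while `plain_residual` returns the same flat sum regardless of context.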
Lior Alexander
Lior Alexander@LiorOnAI·
@GibsonHimself Depends what we see as net positive. If it's shorter wars with fewer casualties, I'll take it!
0
0
0
48
GibsonHimself
GibsonHimself@GibsonHimself·
@LiorOnAI This type of RL might not be a net positive.. but here we are..
1
0
0
56
Libriscent
Libriscent@libriscent·
ADHDer's dream job: Being a Professional Idea Generator. Giving Unlimited ideas with absolutely no responsibility for implementation.
620
4.4K
29.6K
3.4M
Lior Alexander
Lior Alexander@LiorOnAI·
@DamirWallener Embeddings are abstractions of tokens. Tokens come from text. Text contains words. We’re not escaping words here.
3
0
27
3.5K
Damir Wallener
Damir Wallener@DamirWallener·
@LiorOnAI That's pretty profoundly incorrect. They are not trained on words - they are trained on embeddings. Embeddings are "tokens with meta". And tokens are just a mapping between a set of numbers and a set of <insert anything you want>.
2
0
19
4.5K
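Both sides of this exchange can be seen in a three-line sketch of the standard pipeline: text → token ids → embedding vectors. The vocab and table below are made-up toy values, not any real tokenizer or model.

```python
# Toy pipeline (hypothetical vocab and vectors): text -> token ids -> embeddings.
vocab = {"the": 0, "cat": 1, "sat": 2}   # tokenizer output: string piece -> id

embedding_table = [    # learned lookup table: id -> dense vector
    [0.1, 0.3],        # "the"
    [0.9, 0.2],        # "cat"
    [0.4, 0.8],        # "sat"
]

def embed(text):
    """Tokens come from text; embeddings are looked up per token id.
    The model trains on the vectors, but every vector indexes back to
    a token, and here the tokens index back to text."""
    ids = [vocab[word] for word in text.split()]
    return [embedding_table[i] for i in ids]
```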
Lior Alexander
Lior Alexander@LiorOnAI·
Yann just bet a billion dollars that the entire industry is building on the wrong foundation.

Large language models predict the next word. They're trained on text, so they understand language. But the real world isn't made of words. It's made of continuous sensor data: camera feeds, touch, sound. And most of that data is unpredictable. You can't predict every pixel in a video the way you predict the next token in a sentence. Generative models fail here because they try to predict everything, including noise.

AMI Labs is building world models using JEPA (a method LeCun proposed in 2022 that learns abstract representations of reality and predicts in that compressed space, not in raw pixels). Action-conditioned versions let AI simulate the consequences of actions before taking them. That's not generation. That's understanding.

This unlocks AI that can operate in the physical world without hallucinating:
1. Robotics that plans multi-step actions
2. Healthcare devices where errors kill patients
3. Industrial process control under safety constraints
4. Wearables that adapt to real-time sensor input

If JEPA works at scale, the next wave of AI companies won't fine-tune LLMs. They'll train world models on sensor data. AMI Labs' CEO already predicts every startup will rebrand as a "world model company" within six months.

The architecture war is starting.
Yann LeCun@ylecun

Unveiling our new startup Advanced Machine Intelligence (AMI Labs). We just completed our seed round: $1.03B / 890M€, one of the largest seeds ever, probably the largest for a European company. We're hiring!

[The background image is the Veil Nebula, a picture I took from my backyard, most appropriate for an unveiling.]

More details here: techcrunch.com/2026/03/09/yan…

121
224
1.8K
290.6K
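The core JEPA idea, predicting in a learned latent space rather than raw signal space, can be sketched without any ML library. The `encode` stand-in below is a hand-written summary function, not a learned encoder; it exists only to show why a latent-space loss can ignore unpredictable detail that a pixel-level loss would penalize.

```python
def encode(x):
    """Stand-in 'encoder': summarize a raw signal by (mean, spread).
    A real JEPA encoder is a learned network; this toy only illustrates
    prediction in a compressed space."""
    return [sum(x) / len(x), max(x) - min(x)]

def jepa_loss(context, target, predictor):
    """Score a prediction of the target's *embedding* from the context's
    embedding. The loss lives in latent space, never in raw-signal
    (pixel) space."""
    z_pred = predictor(encode(context))
    z_tgt = encode(target)
    return sum((p - t) ** 2 for p, t in zip(z_pred, z_tgt))

# Two 'frames' that differ at every position but share the same summary:
# a pixel-level loss would be large; the latent loss is zero.
identity = lambda z: z
same_scene = jepa_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], identity)
```

Here `same_scene` is 0.0: the inputs disagree pixel-by-pixel, but their summaries match, so the model is not punished for failing to predict noise.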
LogosSteve
LogosSteve@LogosSteve·
@LiorOnAI Yeah he's right, technically. But the degree to which he is right will depend.
1
0
3
3.1K
Kyriakos
Kyriakos@Kyriakos_Pelek·
@LiorOnAI Building models beyond language
1
0
8
4.8K
Lior Alexander
Lior Alexander@LiorOnAI·
Andrew Ng just solved one of the biggest problems with Agents.

He released Context Hub, a CLI tool to fetch live API documentation. One command. The agent gets exactly what it needs before writing a single line of code.

Agents trained months ago are flying blind. They invent parameter names. They call functions that no longer exist. They confidently write code against a spec that changed in the last release.

> No more hallucinated parameters
> Docs pulled fresh before each call
> Agents log useful discoveries
> Notes persist between sessions

The agent runs a CLI command before touching the code. Instead of relying on stale data, it reads the actual spec. Fast-moving APIs used to mean maintaining a doc dump in every prompt. Now the agent does that work itself. When it finds a workaround, it saves a note for next time.
26
21
229
20.5K
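The post doesn't show Context Hub's actual commands, so the sketch below illustrates only the general pattern it describes: shell out for fresh docs, then inject them into the agent's prompt ahead of the task. The command passed to `fetch_docs`, and the demo endpoint string, are placeholders; nothing here is Context Hub's real interface.

```python
import subprocess

def fetch_docs(command):
    """Run a docs-fetching command and return its stdout. The real
    Context Hub invocation isn't shown above, so the exact command
    list is left to the caller."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout

def build_prompt(task, docs):
    """Put freshly fetched docs ahead of the task so the agent codes
    against the current spec, not its training-time memory."""
    return f"API documentation (fetched just now):\n{docs}\nTask: {task}"

# Demo with a placeholder command; substitute the real docs fetcher.
prompt = build_prompt("list items", fetch_docs(["echo", "GET /v2/items"]))
```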
Lior Alexander
Lior Alexander@LiorOnAI·
It's over. Karpathy just open-sourced an autonomous AI researcher that runs 100 experiments while you sleep.

You don't write the training code anymore. You write a prompt that tells an AI agent how to think about research. The agent edits the code, trains a small language model for exactly five minutes, checks the score, keeps or discards the result, and loops. All night. No human in the loop.

That fixed five-minute clock is the quiet genius. No matter what the agent changes, the network size, the learning rate, the entire architecture, every run gets compared on equal footing. This turns open-ended research into a game with a clear score:

- 12 experiments per hour, ~100 overnight
- Validation loss measures how well the model predicts unseen text
- Lower score wins, everything else is fair game

The agent touches one Python file containing the full training recipe. You never open it. Instead, you program a markdown file that shapes the agent's research strategy. Your job becomes programming the programmer, and this unlocks a strange new loop:

1. Agents run real experiments without supervision
2. Prompt quality becomes the bottleneck, not researcher hours
3. Results auto-optimize for your specific hardware
4. Anyone with one GPU can run a research lab overnight

The best AI labs won't just have the most compute. They'll have the best instructions for agents who never sleep, never forget a failed experiment, and never stop iterating.
Andrej Karpathy@karpathy

I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:

- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)

The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc.

github.com/karpathy/autor…

Part code, part sci-fi, and a pinch of psychosis :)

137
441
4.3K
874.6K
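The keep-or-discard loop described above can be sketched in a few lines. This is a simplified stand-in, not the autoresearch repo: `train_fn` replaces the fixed five-minute training run, and `mutate` replaces the agent editing the training script.

```python
import random

def research_loop(train_fn, initial_cfg, mutate, n_runs, seed=0):
    """Agent-style loop: every candidate gets the same fixed budget,
    and a change is kept only if it lowers validation loss."""
    rng = random.Random(seed)
    best_cfg = dict(initial_cfg)
    best_loss = train_fn(best_cfg)            # baseline run
    for _ in range(n_runs):
        candidate = mutate(dict(best_cfg), rng)   # 'agent' edits the recipe
        loss = train_fn(candidate)                # fixed-budget run
        if loss < best_loss:                      # keep or discard
            best_cfg, best_loss = candidate, loss
    return best_cfg, best_loss

# Toy demo: 'training' is a quadratic with its optimum at lr = 0.3.
def toy_train(cfg):
    return (cfg["lr"] - 0.3) ** 2

def toy_mutate(cfg, rng):
    cfg["lr"] += rng.uniform(-0.1, 0.1)   # nudge one hyperparameter
    return cfg

best_cfg, best_loss = research_loop(toy_train, {"lr": 0.8}, toy_mutate, n_runs=50)
```

The fixed budget is what makes runs comparable: the loop never asks whether a change is clever, only whether the score dropped.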
Lior Alexander
Lior Alexander@LiorOnAI·
Cursor Automations solves the problem that agentic coding created. Engineers can now manage 10+ coding agents at once, but human attention became the bottleneck. You can't babysit a dozen agents while also doing your actual job.

Automations flips the model: instead of you launching agents, events do. A merged PR triggers a security audit. A PagerDuty alert spins up an agent that queries logs and proposes a fix. A cron job reviews test coverage gaps every morning.

Each automation runs in an isolated cloud sandbox with full access to the tools you configure through MCP (a standard protocol that lets agents connect to Slack, Linear, GitHub, Datadog, or any custom API). The agent follows your instructions, verifies its own work, and learns from past runs through a built-in memory system.

Cursor runs hundreds of these per hour internally. Their security automation caught multiple vulnerabilities by auditing every push to main without blocking PRs.

This unlocks 4 things that weren't practical before:
1. Continuous code review at a depth humans skip
2. Incident response that starts investigating before you're paged
3. Maintenance work that happens on a schedule, not when someone remembers
4. Knowledge synthesis across tools

The next two years will be defined by who builds the best factory, not the best code. The companies moving fastest won't be the ones with the best engineers. They'll be the ones whose engineers spent time configuring automations instead of writing code.
Cursor@cursor_ai

We're introducing Cursor Automations to build always-on agents.

16
12
225
45.1K
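The event-triggered model is a classic dispatch pattern. A minimal sketch with hypothetical event names and handlers; this is not Cursor's API, and each handler stands in for an agent run in its own sandbox.

```python
# Hypothetical event -> agent dispatch: instead of a human launching
# agents, registered events do.
handlers = {}

def on(event_type):
    """Register an agent task to run whenever this event type fires."""
    def register(fn):
        handlers.setdefault(event_type, []).append(fn)
        return fn
    return register

def dispatch(event):
    """Route an incoming event to every agent task registered for it."""
    return [fn(event) for fn in handlers.get(event["type"], [])]

@on("pr_merged")
def security_audit(event):
    # e.g. a merged PR triggers a security review of the repo
    return f"auditing {event['repo']} after merge of {event['pr']}"

@on("pager_alert")
def incident_agent(event):
    # e.g. a page spins up an agent that starts querying logs
    return f"querying logs for {event['service']}"
```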
catid
catid@MrCatid·
@LiorOnAI 80% accuracy doesn’t seem like something we should recommend people actually use
1
0
0
327
Lior Alexander
Lior Alexander@LiorOnAI·
A 24-billion-parameter model just ran on a laptop and picked the right tool in under half a second. The real story is that tool-calling agents finally became fast enough to feel like software.

Liquid built LFM2-24B-A2B using a hybrid architecture that mixes convolution blocks with grouped query attention in a 1:3 ratio. Only 2.3 billion parameters activate per token, even though the full model holds 24 billion. That sparse activation pattern is why it fits in 14.5 GB of memory and dispatches tools in 385 milliseconds on an M4 Max.

The architecture was designed through hardware-in-the-loop search, meaning they optimized the model structure by testing it directly on the chips it would run on. No cloud translation layer. No API roundtrip. The model, the tools, and your data stay on the machine.

This unlocks three things that were impractical before:
1. Regulated industries can run agents on employee laptops without data leaving the device.
2. Developers can prototype multi-tool workflows without managing API keys or rate limits.
3. Security teams get full audit trails without vendor subprocessors in the loop.

The model hit 80% accuracy on single-step tool selection across 67 tools spanning 13 MCP servers.

If this performance holds at scale, two assumptions need updating. First, on-device agents are no longer a battery-life trade-off; they're a compliance feature. Second, the bottleneck in agentic workflows is shifting from model capability to tool ecosystem maturity.
Liquid AI@liquidai

> 385ms average tool selection.
> 67 tools across 13 MCP servers.
> 14.5GB memory footprint.
> Zero network calls.

LocalCowork is an AI agent that runs on a MacBook. Open source. 🧵

13
29
339
45.4K
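The sparse-activation claim (2.3B of 24B parameters active per token) is the standard mixture-of-experts trade: a gate scores all experts, but only the top few run. A toy sketch of top-k gating, unrelated to Liquid's actual code; all names are illustrative.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Keep only the k best-scoring experts for this token and
    renormalize their weights; every other expert stays inactive,
    so only a fraction of the parameters do any work per token."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    probs = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, probs))
```

With, say, 8 experts and k = 2, only a quarter of the expert parameters run for any given token, which is the same shape of trade-off as 2.3B active out of 24B total.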