Debadeepta Dey

1.1K posts

@debadeepta

Principal Architect - AI Compiler, Microsoft | ex MSR, CMU

Kenmore, WA · Joined July 2011

2.5K Following · 2.2K Followers

Pinned Tweet

Debadeepta Dey@debadeepta·28 May

1️⃣We are excited to open-source syftr: a powerful tool for automatically finding Pareto-optimal generative AI flows! syftr searches a large search space of agentic and non-agentic flows to surface optimal tradeoffs between accuracy, cost and latency. 🧵

Debadeepta Dey retweeted

Jaber@Akashi203·5 Apr

we published autokernel on arxiv
inspired by @karpathy's autoresearch, we applied the same keep/revert agent loop to GPU kernel optimization
you give it any pytorch model, it profiles it, ranks bottlenecks by amdahl's law, writes triton or CUDA C++ replacements, and runs 300+ experiments overnight with no human in the loop
- 5.29x over pytorch eager on rmsnorm
- 2.82x on softmax
- beats torch.compile by 3.44x on softmax and 2.94x on cross entropy
- #1 on the vectorsum_v2 B200 leaderboard
- single prompt triton FP4 matmul that beats CUTLASS by up to 2.15x
every candidate passes a 5-stage correctness harness before any speedup counts, and the whole thing runs at ~40 experiments/hour so you wake up to a faster model
arxiv: arxiv.org/abs/2603.21331
github: github.com/RightNow-AI/au…

Debadeepta Dey retweeted

Satya Nadella@satyanadella·30 Mar

Introducing Critique, a new multi-model deep research system in M365 Copilot. You can use multiple models together to generate optimal responses and reports.

Debadeepta Dey retweeted

Elliot Arledge@elliotarledge·19 Mar

x.com/i/article/2034…

Debadeepta Dey@debadeepta·12 Mar

Vibe-coding wars.

The New York Times@nytimes

Breaking News: The U.S. was responsible for a missile strike on an Iranian school, an ongoing military investigation found. The inquiry said the strike — which Iranian officials said killed at least 175 people — was the result of a targeting mistake. nyti.ms/47G2uw2

Debadeepta Dey retweeted

Leonardo de Moura@Leonard41111588·3 Mar

AI is writing a growing share of the world's software. No one is formally verifying any of it. New essay: "When AI Writes the World's Software, Who Verifies It?" leodemoura.github.io/blog/2026/02/2…

Debadeepta Dey retweeted

Yuda Song@yus167·3 Feb

RL on LLMs inefficiently uses one scalar per rollout. But users regularly give much richer feedback: "make it formal," "step 3 is wrong." Can we train LLMs on this human-AI interaction? We introduce RL from Text Feedback, with 1) Self-Distillation; 2) Feedback Modeling (1/n) 🧵

Debadeepta Dey@debadeepta·20 Aug

Of course, one can run syftr on top of the silver bullets to lift the Pareto-frontier up even more!

Debadeepta Dey@debadeepta·20 Aug

We took a leaf out of this literature and found that by cross-pollinating search across many different datasets (metatraining), one can find a set of flows we term "silver bullets" that do well (in the Pareto sense) *across* tasks *without* running syftr from scratch.

Debadeepta Dey@debadeepta·20 Aug

What if you could bring your task and have a system generate a set of AI workflows which work well out-of-the-box! No manual trial-and-error on which of the numerous agents (from single-agent to multi-agent workflows) to use. We built exactly that datarobot.com/blog/silver-bu…

Debadeepta Dey retweeted

Wen Sun@WenSun1·16 Jul

Does RL actually learn positively under random rewards when optimizing Qwen on MATH? Is Qwen really that magical such that even RLing on random rewards can make it reason better? Following prior work on spurious rewards on RL, we ablated algorithms. It turns out that if you deploy algorithms like Reinforce and REBEL (a generalization of Natural Policy Gradient), RL does not learn under random rewards. These two simple algorithms simply behave as we would expect in this case. GRPO and PPO indeed can behave strangely. They can learn positively or negatively, depending on different random seeds. The clipping heuristic introduces certain bias in the objective function, which causes such unexpected behaviors (this even happens in bandit which has nothing to do w/ LLM or reasoning). Perhaps it is time to abandon the clipping heuristic...

Gokul Swamy@g_k_swamy

Recent work has seemed somewhat magical: how can RL with *random* rewards make LLMs reason? We pull back the curtain on these claims and find out this unexpected behavior hinges on the inclusion of certain *heuristics* in the RL algorithm. Our blog post: tinyurl.com/heuristics-con…

Debadeepta Dey@debadeepta·31 May

@allenainie Thank you for the kind words and for making Trace! We love how easy it is to weave into complicated workflows. Excited for what Trace is cooking next.

Allen Nie (🇺🇦☮️)@allenainie·30 May

I'm experiencing the 🤩 moment when an amazing company just built their new library on top of the framework I helped build! Trace is a library for creating **extremely flexible** LLM-based workflows. Syftr uses Trace to optimize their workflow and push the cost-accuracy Pareto frontier of weaker models. They adopt multi-objective Bayesian Optimization to find the optimal workflow given a fixed budget. Shout out to @debadeepta and @DataRobot for the huge accomplishments 🚀 This is Trace's first industry adoption! And we have some mind-blowing results to share soon. Check out Syftr: github.com/datarobot/syftr And Trace: github.com/microsoft/Trace Every workflow can be optimized. Syftr works with DSPy and TextGrad too — we’re building the future of LLM workflows together 🛠️🔥

Debadeepta Dey retweeted

Shital Shah@sytelus·28 May

A different and interesting work from my ex-colleague Dey: How do you generate Pareto frontier for the agentic workflow? Many practical applications must balance cost vs performance for agents and this pioneering work shows the way!

Debadeepta Dey@debadeepta

Debadeepta Dey retweeted

roma 🦁@roma_glushko·28 May

✨Meet syftr, a new OSS framework to find the best RAG workflows (both agentic and not) balancing cost/latency/accuracy using multi-objective Bayesian Optimization

Debadeepta Dey@debadeepta·28 May

6️⃣Want to get involved?
📖 Technical blog post and full paper (to appear at @automl_conf).
💻 Try syftr: github.com/datarobot/syftr
🙌 Contribute via PRs

Debadeepta Dey@debadeepta·28 May

5️⃣syftr is made possible thanks to:
- Ray for distributed search orchestration. @anyscalecompute
- LlamaIndex for building advanced workflows. @llama_index
- HuggingFace Datasets for fast dataset interfaces. @huggingface
Starting with question-answering and actively expanding tasks
