matt turk

1.4K posts

@TurkMatthew

ML researcher @withprotegeai. Prev: ML @cleanlabAI, @goodwatercap; Quant @coinbase & @goldmansachs; EECS @ucberkeley

New York, NY · Joined March 2012
2K Following · 711 Followers
Pinned Tweet
matt turk
matt turk@TurkMatthew·
Excited to share that I’ve joined @withprotegeai as a Senior Machine Learning Researcher on the DataLab team.

My two years at @CleanlabAI were incredibly formative. I’m deeply grateful for the chance to work on data-centric AI with such thoughtful researchers and builders, and to contribute during a period that ultimately led to Cleanlab being acquired into @joinHandshake AI. I learned a tremendous amount about how data quality, evaluation, and trustworthiness make modern AI systems more accurate and reliable, and my conviction only grew that the next major advances in AI will come not just from better models or more compute, but from better data.

At DataLab, our goal is to treat the data layer of AI with the same scientific rigor that model labs apply to algorithms by building a dedicated research institution for AI data: designing high-fidelity datasets and multimodal benchmarks grounded in real-world scenarios, working closely with frontier labs on their hardest data challenges, and developing standardized ways, including “FICO scores for AI data”, to measure dataset quality, contamination, and benchmark reliability.

Another important piece of this work is understanding how different kinds of data support different parts of the AI training stack. Reinforcement learning (RL) environments are a powerful form of training data: they generate structured training tuples like (state, action, reward, next state) and are extremely useful for post-training optimization when the world can be simulated. But many of the highest-value domains for AI, including healthcare, enterprise workflows, and complex multimodal reasoning, cannot be faithfully simulated. Advancing models in these areas requires real-world datasets, carefully designed benchmarks, and domain-specific data for pre-training and mid-training adaptation.

The idea behind DataLab is simple but important: every major leap in AI capability has historically followed a breakthrough in data (from ImageNet to large-scale web corpora). As models and compute continue to advance rapidly, closing the data gap (the gap between the data that AI systems need and the data that actually exists in usable form) may be one of the most important challenges for the field.

Here is more info on some of the work the team has done so far: datalab.withprotege.ai
Bobby Samuels@BobbySamuels

x.com/i/article/2030…

1 reply · 0 reposts · 10 likes · 603 views
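To make the “FICO scores for AI data” idea concrete, here is a minimal sketch of how a composite dataset score could be assembled. This is purely illustrative, not DataLab’s actual methodology: the sub-metrics (label_error_rate, contamination_rate, coverage) and the weights are hypothetical assumptions.

```python
from dataclasses import dataclass

@dataclass
class DatasetAudit:
    # Hypothetical sub-metrics, each in [0, 1]; not DataLab's actual inputs.
    label_error_rate: float    # fraction of examples with incorrect labels
    contamination_rate: float  # fraction overlapping known eval benchmarks
    coverage: float            # how well the data spans the target domain

def data_quality_score(audit: DatasetAudit) -> int:
    """Fold sub-metrics into a single 300-850 score, FICO-style.

    The weights are illustrative assumptions, not a published standard.
    """
    quality = (
        0.4 * (1.0 - audit.label_error_rate)
        + 0.4 * (1.0 - audit.contamination_rate)
        + 0.2 * audit.coverage
    )
    return round(300 + 550 * quality)  # map [0, 1] onto the familiar FICO range

print(data_quality_score(DatasetAudit(0.05, 0.01, 0.8)))  # -> 815
```

The one-number framing matters for the same reason FICO does: it gives labs and data vendors a shared, comparable signal, even though the real work is in defining and measuring the underlying sub-metrics.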
matt turk
matt turk@TurkMatthew·
RL envs are a subset of useful data: they generate training tuples (state, action, reward, next state), but they only work when the world can be simulated. Most high-value domains (healthcare, enterprise workflows, multimodal reasoning) can’t be faithfully simulated, so models still need real-world datasets and evaluation benchmarks. RL envs also mainly serve post-training and sit a layer above in abstraction, whereas mid-training and pre-training require other real-world data and domain adaptation.
2 replies · 1 repost · 8 likes · 3.7K views
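To make the tuple structure above concrete, here is a minimal sketch of an RL environment loop collecting (state, action, reward, next state) training data. The toy GridEnv and the random policy are hypothetical stand-ins for any domain simple enough to actually simulate.

```python
import random

class GridEnv:
    """Toy 1-D grid world: start at 0, reach position 5.
    A hypothetical stand-in for any simulable domain."""

    def reset(self) -> int:
        self.pos = 0
        return self.pos

    def step(self, action: int) -> tuple[int, float, bool]:
        self.pos += 1 if action == 1 else -1  # action: 1 = right, 0 = left
        done = self.pos == 5
        reward = 1.0 if done else -0.01       # small step cost, goal bonus
        return self.pos, reward, done

# Roll out one episode, collecting the (state, action, reward, next_state)
# tuples that an RL environment generates as training data.
env, tuples = GridEnv(), []
state, done = env.reset(), False
for _ in range(200):                # cap episode length for the sketch
    action = random.choice([0, 1])  # random policy stands in for a real one
    next_state, reward, done = env.step(action)
    tuples.append((state, action, reward, next_state))
    state = next_state
    if done:
        break

print(f"collected {len(tuples)} training tuples, e.g. {tuples[0]}")
```

The catch the tweet points to: none of this works unless step() can be written down, which is exactly what domains like healthcare or enterprise workflows don’t give you; there, the tuples have to come from real-world data instead.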
matt turk retweeted
Stathole
Stathole@Statholesports·
While we're getting weird with it here are the top outlier high scores in NBA history based on that player's average and SD. Normality be damned, we're having some fun here. Mario West!!!
Stathole tweet media
27 replies · 69 reposts · 2.4K likes · 139.2K views
matt turk
matt turk@TurkMatthew·
Highly recommend this series for anyone looking for top-tier guidance as a future founder! Leeor and Solly are some of the best VCs out there to work with and learn from.
Leeor Mushin@lmushin

Most startup programs are built for companies. Today, we’re publicly launching Forum, built for the people who are about to start them.

Forum is a highly selective, 6-session series for exceptional future founders who are still pre-company, but not far from starting one. The best founder talent may not need an accelerator, but they do need the right environment to pressure-test ideas, sharpen conviction, and figure out what is actually worth building. We know this because we have seen the results Forum has delivered for talent, from first-time founders to billion-dollar-exited ones.

For years, Forum has run quietly in the background, first during my time @ Floodgate and now at Formation. Since starting our firm 18 months ago, Solly and I have had the privilege of working with 50+ extraordinary people. Many have gone on to raise millions from the best firms in the world.

Forum is free. Each cohort is capped at 6 people. That constraint is a feature, not a bug. It creates a level of candor, rigor, and peer quality that is very hard to find elsewhere. We believe Forum is the highest-leverage two months of an aspiring founder’s professional career. Your idea may get stronger. It may get killed. Both are wins.

If you are unusually high-agency and circling your life’s work, apply via the link in the comments. If you know someone who is in this state of mind, send them our way.

1 reply · 2 reposts · 2 likes · 388 views
matt turk retweeted
NBACentel
NBACentel@TheNBACentel·
The Los Angeles Lakers and Deandre Ayton are nearing an agreement that will allow him to work remotely. (Via @LakersLead)
NBACentel tweet media
328 replies · 2.5K reposts · 56.3K likes · 1.5M views
matt turk retweeted
LakeShowYo
LakeShowYo@LakeShowYo·
Chris Paul has officially retired. Lakers legend 🔥
LakeShowYo tweet media
43 replies · 290 reposts · 13.2K likes · 185.3K views
matt turk retweeted
New York Metro Weather
New York Metro Weather@nymetrowx·
Friday's Weather Rating: 1/10. I've got nothing. It's horribly cold, with wind chill values below zero and high temperatures barely reaching the teens this afternoon. The sun is not helping much. The vibes have completely frozen over!!
17 replies · 123 reposts · 1.9K likes · 80.1K views
matt turk retweeted
Curtis G. Northcutt
Curtis G. Northcutt@cgnorthcutt·
News: @joinHandshake acquires @CleanlabAI! This "ten-year-old job marketplace" has quietly become a top human data lab for AI: building an AI research org, acquiring top AI talent, and advancing Cleanlab tech and research to lead data foundations for frontier AI. 1 of 4
3 replies · 4 reposts · 18 likes · 3.4K views
matt turk retweeted
cinesthetic.
cinesthetic.@TheCinesthetic·
Matt Damon and Ben Affleck on Rogan talking about how Netflix has changed filmmaking: “you reiterate the plot 3-4x in the dialogue because people are on their phones.”
643 replies · 3.9K reposts · 67.1K likes · 25M views
Fastbreak Hoops
Fastbreak Hoops@FastbreakHoops5·
Name the most random NBA Player that you can think of.
3.5K replies · 18 reposts · 644 likes · 385K views
matt turk retweeted
Jonas Mueller
Jonas Mueller@jomulr·
Which LLM is better for Structured Outputs / Data Extraction: Gemini-3-Pro or GPT-5? We ran popular benchmarks, but found their "ground truth" is full of errors. To enable reliable benchmarking, we've open-sourced 4 new Structured Outputs benchmarks with *verified* ground truth.
Jonas Mueller tweet media
3 replies · 9 reposts · 33 likes · 23.6K views
matt turk retweeted
Jonas Mueller
Jonas Mueller@jomulr·
We found a way to cut AI agent failures on Tau²-Bench by 50%. Our LLM-trust-scoring + message-revision system helps agents recover from bad LLM outputs (reasoning errors, hallucinations, oversights, wrong tool calls, ...) This system works for any agent/model and domain. [1/2]
Jonas Mueller tweet media
3 replies · 5 reposts · 20 likes · 14.3K views
matt turk retweeted
Cleanlab
Cleanlab@CleanlabAI·
The “Year of the Agent” just got pushed back. Out of 1,837 enterprise leaders, most are struggling with stack churn + reliability.
⚙️ 70% rebuild every 90 days
😬 Less than 35% are happy with their infrastructure
🤖 Most “agents” still aren’t really acting yet
Cleanlab tweet media
5 replies · 7 reposts · 25 likes · 15.4K views
matt turk
matt turk@TurkMatthew·
@aaditsh You didn’t share the link and also stole this tweet from here… x.com/henrythe9ths/s…
Henry Shi@henrythe9ths

There's a shocking fact about AI that nobody tells you: you can catch up to the public AI research frontier in just 2 weeks. Yes, really.

I've built a $150M annual revenue startup over the last 8 years, and if I were to start a company today, I’d drop everything and go all-in on AI. But like many busy software builders, I felt lost, overwhelmed by the noisy, crowded, and fast-moving modern AI landscape. And I wasn’t alone. So I spent my entire holiday diving deep into AI research: reading 30+ papers, watching hours of lectures, analyzing trends, and catching up to the research frontier.

✨ Here’s what I learned:
- You don’t need months (or years) to catch up.
- You don’t need a PhD or decades of ML experience.
- You need fewer than 20 papers and 2 weeks to understand the major breakthroughs shaping AI today.

It's because the technology is extremely nascent and most techniques that came before are no longer relevant:
- ChatGPT is barely 2 years old and Transformers are only 7 years old.
- Most game-changing discoveries happened within the last 4 years, driven by a few breakthrough ideas, scaling laws, and efficient matrix multiplication.

The biggest secret? Many groundbreaking AI papers with thousands of citations are surprisingly simple and applied, like adding "let's think step by step" to the prompt, or simply asking the LLM over and over again to improve its answer (Self-Refine).

I realized there are tons of founders and builders in the same boat, wanting to dive deeper into AI but unsure where to start. I've created an essential AI Guide that helped me catch up, in just 2 weeks, to the frontier of public AI research and figure out where the next opportunities and gaps were:
- Curated list of only the most important papers
- Simple explanations of key concepts
- Clear pathway to understanding the frontier of modern AI

It’s perfect for:
- Founders expanding into AI
- Builders wanting to innovate at the frontier of AI
- Investors looking to separate the signal from the noise

👇 Want the full guide?
- Like and Share this post
- Comment "AI Guide"
- I'll send you the complete guide

(P.S. I’m also teaming up with @VishalVasishth, co-founder of @obviousvc with @ev (focused on large-scale societal impact companies like Twitter, Medium, Beyond Meat), to host a small meetup in SF to discuss what's working and what needs to be solved in the AI stack. Message me if you're interested.)

1 reply · 0 reposts · 31 likes · 1.9K views
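Since the quoted thread name-drops two of these “surprisingly simple” techniques, here is a minimal sketch of one of them, Self-Refine: repeatedly asking the model to critique and improve its own answer. The llm callable is a hypothetical placeholder for any model call, not a real API.

```python
def self_refine(llm, prompt: str, rounds: int = 2) -> str:
    """Self-Refine in miniature: generate, critique, improve, repeat.

    `llm` is a hypothetical callable (prompt -> text) standing in for
    whatever model client you actually use.
    """
    answer = llm(prompt)
    for _ in range(rounds):
        feedback = llm(f"Critique this answer:\n{answer}")
        answer = llm(
            f"Question: {prompt}\n"
            f"Previous answer: {answer}\n"
            f"Feedback: {feedback}\n"
            "Write an improved answer."
        )
    return answer

# Stub model so the sketch runs end to end without any external service.
echo = lambda p: f"[model output for: {p[:40]}...]"
print(self_refine(echo, "Explain scaling laws in one sentence."))
```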
Aadit Sheth
Aadit Sheth@aaditsh·
No one tells you this, but you can catch up to the frontier of AI in just 2 weeks. You don’t need years, just 20 papers and focused time. Here’s the full crash course (free). Save this one.
Aadit Sheth tweet media
16 replies · 125 reposts · 951 likes · 77.4K views
Matt Turck
Matt Turck@mattturck·
How GPT-5 thinks, with @OpenAI VP of Research @MillionInt

00:00 - Intro
01:01 - What Reasoning Actually Means in AI
02:32 - Chain of Thought: Models Thinking in Words
05:25 - How Models Decide How Long to Think
07:24 - Evolution from o1 to o3 to GPT-5
11:00 - The Road to OpenAI: Growing up in Poland, Dropping out of School, Trading
20:32 - Working on Robotics and Rubik's Cube Solving
23:02 - A Day in the Life: Talking to Researchers
24:06 - How Research Priorities Are Determined
26:53 - OpenAI's Culture of Transparency
29:32 - Balancing Research with Shipping Fast
31:52 - Using OpenAI's Own Tools Daily
32:43 - Pre-Training Plus RL: The Modern AI Stack
35:10 - Reinforcement Learning 101: Training Dogs
40:17 - The Evolution of Deep Reinforcement Learning
42:09 - When GPT-4 Seemed Underwhelming at First
45:39 - How RLHF Made GPT-4 Actually Useful
48:02 - Unsupervised vs Supervised Learning
49:59 - GRPO and How DeepSeek Accelerated US Research
53:05 - What It Takes to Scale Reinforcement Learning
55:36 - Agentic AI and Long-Horizon Thinking
59:19 - Alignment as an RL Problem
1:01:11 - Winning ICPC World Finals Without Specific Training
1:05:53 - Applying RL Beyond Math and Coding
1:09:15 - The Path from Here to AGI
1:12:23 - Pure RL vs Language Models
34 replies · 132 reposts · 1K likes · 451.3K views
matt turk retweeted
𝑪𝒐𝒏𝒆 🌩
𝑪𝒐𝒏𝒆 🌩@Three_Cone·
Cason Wallace actually just hit the greatest shot in NBA history
209 replies · 2.8K reposts · 53.4K likes · 3.5M views