Gavin Uberti

133 posts

@UbertiGavin

Building model-specific AI chips @ Etched

San Jose, CA · Joined March 2022
243 Following · 3.5K Followers
Tanishq Kumar @tanishqkumar07
I've been working on a new LLM inference algorithm. It's called Speculative Speculative Decoding (SSD) and it's up to 2x faster than the strongest inference engines in the world. Collab w/ @tri_dao @avnermay. Details in thread.
134 replies · 457 reposts · 4.1K likes · 607.6K views
Andrei Serban @andrei_serban
All companies run on IT. But almost all IT teams are underwater. The ones that aren't run on Console. Console raised a $23M Series A led by DST Global Partners and @ThriveCapital to help leading companies like @scale_ai, @tryramp, and @webflow automate 50%+ of their tickets.
64 replies · 45 reposts · 404 likes · 158.6K views
Gavin Uberti reposted
Brendan (can/do) @BrendanFoody
Mercor (@mercor_ai) scaled from $1M to $500M in revenue run rate in the last 17 months, making us the fastest-growing company of all time. Our growth is accelerating: we averaged 11% week-over-week growth in July, 18% in August, and 19% in September.

One trend driving this meteoric growth: the Economy is Becoming an RL Environment Machine. Reinforcement learning is becoming so effective that agents can hill-climb any benchmark, but humans need to define the rewards to automate everything. While everyone fears job loss, we're creating a new category of knowledge work faster than at any other time in history.

The future of work will converge on training agents. We're paying out over $1M/day to people in our marketplace and hiring experts rapidly across nearly every domain: software engineers, doctors, lawyers, consultants, bankers, and many more.
149 replies · 195 reposts · 1.5K likes · 618.1K views
Tarun Amasa @TarunAmasa
It's official. We've raised $14M led by @OpenAI Startup Fund to bring AI to Excel. Endex is the first AI agent to live inside Excel. For the past year, we've been working with financial firms. Today we're releasing it to the world. Our capacity is limited; comment below for an early invite 🧵
2.1K replies · 462 reposts · 8.1K likes · 10.8M views
Gavin Uberti @UbertiGavin
There is no data wall.
[image]

2 replies · 0 reposts · 14 likes · 3.7K views
Gavin Uberti @UbertiGavin
happy llama day to all those who celebrate
AI at Meta @AIatMeta

Today is the start of a new era of natively multimodal AI innovation. Today, we're introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick, our most advanced models yet and the best in their class for multimodality.

Llama 4 Scout
• 17B-active-parameter model with 16 experts.
• Industry-leading context window of 10M tokens.
• Outperforms Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 across a broad range of widely accepted benchmarks.

Llama 4 Maverick
• 17B-active-parameter model with 128 experts.
• Best-in-class image grounding, with the ability to align user prompts with relevant visual concepts and anchor model responses to regions in the image.
• Outperforms GPT-4o and Gemini 2.0 Flash across a broad range of widely accepted benchmarks.
• Achieves comparable results to DeepSeek v3 on reasoning and coding, at half the active parameters.
• Unparalleled performance-to-cost ratio, with a chat version scoring an ELO of 1417 on LMArena.

These models are our best yet thanks to distillation from Llama 4 Behemoth, our most powerful model yet. Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks. We're excited to share more details about it even while it's still in flight.

Read more about the first Llama 4 models, including training and benchmarks ➡️ go.fb.me/gmjohs
Download Llama 4 ➡️ go.fb.me/bwwhe9

1 reply · 1 repost · 15 likes · 4.6K views
Joe Li @JoeLi5050
Just won 1st place and $40K at the Mercor x Etched x Cognition inference-time compute hackathon! Four Stanford freshmen (@radi_cho, @zeynebnkaya, @nicolesplaining, and I) built "LLaDA-R1: Scaling Reasoning at Inference Time with Diffusion-LLMs" in 24 hours.
[3 images]

53 replies · 38 reposts · 972 likes · 91.6K views
Gavin Uberti @UbertiGavin
@magnificentgrnt The trustees behind Magnificent are some of the best mentors I've had the privilege of working with. Salar, Firat, Sidar, and of course Barend are all world-class.
0 replies · 0 reposts · 3 likes · 445 views
Magnificent Grants @magnificentgrnt
Check out the new 2024 Cohort of Magnificent Grants. We're happy to report that, going forward, the fellowship application process will run on a rolling basis, launching a nomination system... substack.com/home/post/p-15…
2 replies · 7 reposts · 35 likes · 10.5K views
Gavin Uberti @UbertiGavin
@vllm_project Would the additional reasoning tokens allowed by the speedup make up for the loss in accuracy (assuming we are in a time-bound environment)? Come test it at the Inference Time Compute Hackathon!
0 replies · 0 reposts · 0 likes · 1K views
Gavin Uberti @UbertiGavin
@vllm_project Rather than insisting that the speculative decoding distribution match the target exactly, if you're OK with, say, 99% of the distribution being recovered (e.g., by some metric like KL divergence), you could accept guessed tokens more frequently
1 reply · 0 reposts · 0 likes · 1.2K views
Gavin Uberti @UbertiGavin
Hackathon idea - nearly speculative decoding. As of v0.7.3, @vllm_project supports Deepseek R1's Multi-Token Prediction module, letting you "skip" a token generation if the multi-token prediction guessed it correctly in advance. But what if you accepted almost correct guesses?
Etched @Etched

We're excited to partner with @Cognition_Labs @Mercor_AI @CoreWeave and @AnthropicAI to host an inference-time compute hackathon, featuring >$60K in cash prizes and >1 exaflop of free compute.

2 replies · 0 reposts · 4 likes · 2.8K views
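A minimal sketch of what the "nearly speculative" acceptance rule floated in this thread could look like, assuming per-token logits from both the target model and the draft (MTP) head. Everything here is hypothetical: `nearly_speculative_accept` and `kl_budget` are illustrative names, not part of vLLM's API; only the min(1, p/q) fallback is the standard speculative-decoding acceptance test.

```python
import torch
import torch.nn.functional as F

def nearly_speculative_accept(target_logits: torch.Tensor,
                              draft_logits: torch.Tensor,
                              token: int,
                              kl_budget: float = 0.01) -> bool:
    """Hypothetical 'nearly speculative' acceptance rule (not vLLM's API).

    If the draft distribution is within `kl_budget` nats of the target
    distribution, accept the guessed token outright; otherwise fall back
    to the exact rejection-sampling rule, which accepts with probability
    min(1, p_target[token] / p_draft[token]).
    """
    p = F.softmax(target_logits, dim=-1)  # target model's next-token dist
    q = F.softmax(draft_logits, dim=-1)   # draft / MTP head's dist
    kl = torch.sum(p * (p.clamp_min(1e-9).log() - q.clamp_min(1e-9).log()))
    if kl < kl_budget:
        return True  # distributions nearly match: take the guess for free
    accept_prob = torch.clamp(p[token] / q[token].clamp_min(1e-9), max=1.0)
    return bool(torch.rand(()) < accept_prob)
```

The tradeoff is the one raised in the replies above: whenever the KL branch fires, sampling drifts slightly off the target model's distribution, so any accuracy loss would have to be weighed against the extra reasoning tokens the speedup buys.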
Gavin Uberti @UbertiGavin
You could add more experts by just increasing K_r and M. Would this make Deepseek smarter? Would you need to scale down the outputs accordingly? Would you need to fine-tune a bit, or would it work out of the box? What if you used all experts simultaneously?
1 reply · 0 reposts · 0 likes · 102 views
Gavin Uberti @UbertiGavin
Using more FLOPs should make transformers smarter. DeepSeek R1 currently uses ~8 routed experts per token. So would selecting more (and possibly scaling by the router) improve performance? Come test it out at the Inference Time Compute Hackathon and win up to $60k in prizes!
Etched @Etched

We're excited to partner with @Cognition_Labs @Mercor_AI @CoreWeave and @AnthropicAI to host an inference-time compute hackathon, featuring >$60K in cash prizes and >1 exaflop of free compute.

1 reply · 2 reposts · 17 likes · 2.3K views
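As a rough illustration of the knob being discussed, here is a simplified top-k router sketch. It assumes a plain softmax-then-top-k gate, so it is not DeepSeek's exact router (their production gate also involves grouped routing and shared experts); `route_tokens` and `num_experts_per_token` are illustrative names.

```python
import torch
import torch.nn.functional as F

def route_tokens(router_logits: torch.Tensor,
                 num_experts_per_token: int = 8,
                 renormalize: bool = True):
    """Simplified top-k MoE router (illustrative, not DeepSeek's exact gate).

    Raising `num_experts_per_token` (K_r in DeepSeek's notation) activates
    more routed experts per token, spending more FLOPs per forward pass.
    `renormalize` rescales the gate weights to sum to 1 so the combined
    expert output keeps roughly the same magnitude as k grows.
    """
    probs = F.softmax(router_logits, dim=-1)
    weights, expert_ids = torch.topk(probs, k=num_experts_per_token, dim=-1)
    if renormalize:
        weights = weights / weights.sum(dim=-1, keepdim=True)
    return weights, expert_ids

# Widening the router from 8 to 16 active experts for a batch of 4 tokens:
logits = torch.randn(4, 256)  # router scores over 256 routed experts
w8, ids8 = route_tokens(logits, num_experts_per_token=8)
w16, ids16 = route_tokens(logits, num_experts_per_token=16)
```

Whether widening k helps out of the box, needs the output rescaled, or needs a brief fine-tune is exactly the open question posed above.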
Justin Uberti @juberti
During the development of WebRTC, we recognized the impact of voice and video on human communication, and I wondered if someday we'd talk to AIs the same way. Today, we can see this future taking shape, and I'm excited to announce I've joined @OpenAI to lead real-time AI efforts!
[image]

77 replies · 63 reposts · 1.8K likes · 216.6K views
Tanishq Kumar @tanishqkumar07
[1/5] Made a Twitter account just to share some fun research projects I worked on over the summer! The first is a computational neuroscience project asking: do mice grok?

TL;DR: we revisit recent neural data from mouse cortex as mice are overtrained on a binary odor-discrimination task, finding that they continue to learn generalizing solutions (their odor representations of the two classes continue to separate) even as behavior stops changing on a training set. Joint work with @blake__bordelon, @CPehlevan, Venki Murthy at @MCB_Harvard and @gershbrain!
[image]

2 replies · 4 reposts · 20 likes · 3.5K views