Sid Sharma

65 posts

@phylera14

diffusion language models @_Inception_ai

San Francisco, CA · Joined March 2014

221 Following · 155 Followers

Pinned Tweet

Sid Sharma@phylera14·17 Apr

huge respect for justin bieber switching from autoregressive to diffusion-based text generation while headlining coachella. most artists just soundcheck. justin swapped out his LLM provider backstage... you can see him here (real photo!) spinning up mercury 2 from @_inception_ai on a dedicated instance, watching tokens materialize in parallel instead of one at a time like some kind of animal, refreshing his p99 latency graphs and whispering "discrete diffusion" to himself before performing for 100,000+ people. most performers have a vocal warmup routine. justin's is curl -X POST api.inceptionlabs.ai/v1/completions. justin ships nothing at 2500ms. neither should you.

Sid Sharma retweeted

Kelly Greer@kellyjgreer·17h

going to start threading the autoregressive model killer candidates emerging 1/ @_inception_ai diffusion LLM Mercury 2 rips 4x the tokens per second vs autoregressive LLMs (@StefanoErmon & @phylera14)

Inception@_inception_ai

Mercury 2 is in a league of its own. 1,200 tok/s at comparable quality to speed-optimized autoregressive models, per @ArtificialAnlys.

Sid Sharma@phylera14·6d

@FeinbergVlad Getting the crowd warm before the big bang tomorrow I see 😉

Vlad Feinberg@FeinbergVlad·6d

How to land a job at a frontier lab vladfeinberg.com/2026/05/10/how…

Deedy@deedydas

The vibes in SF feel pretty frenetic right now. The divide in outcomes is the worst I've ever seen. Over the last 5yrs, a group of ~10k people - employees at Anthropic, OpenAI, xAI, Nvidia, Meta TBD, founders - have hit retirement wealth of well above $20M (back of the envelope AI estimation). Everyone outside that group feels like they can work their well-paying (but <$500k) job for their whole life and never get there.

Worse yet, layoffs are in full swing. Many software engineers feel like their life's skill is no longer useful. The day to day role of most jobs has changed overnight with AI.

As a result,

1. The corporate ladder looks like the wrong building to climb. Everyone's trying to align with a new set of career "paths": should I be a founder? Is it too late to join Anthropic / OpenAI? should I get into AI? what company stock will 10x next? People are demanding higher salaries and switching jobs more and more.

2. There’s a deep malaise about work (and its future). Why even work at all for “peanuts”? Will my job even exist in a few years? Many feel helpless. You hear the “permanent underclass” conversation a lot, esp from young people. It's hard to focus on doing good work when you think "man, if I joined Anthropic 2yrs ago, I could retire"

3. The mid to late middle managers feel paralyzed. Many have families and don't feel like they have the energy or network to just "start a company". They don't particularly have any AI skills. They see the writing on the wall: middle management is being hollowed out in many companies.

4. The rich aren’t particularly happy either. No one is shedding tears for them (and rightfully so). But those who have "made it" experience a profound lack of purpose too. Some have gone from <$150k to >$50M in a few years with no ramp. It flips your life plans upside down. For some, comparison is the thief of joy. For some, they escape to NYC to "live life". For others still, they start companies "just cuz", often to win status points. They never imagined that by age 30, they'd be set.

I once asked a post-economic founder friend why they didn't just sell the co and they said "and do what? right now, everyone wants to talk to me. if i sell, I will only have money."

I understand that many reading this scoff at the champagne problems of the valley. Society is warped in this tech bubble. What is often well-off anywhere else in the world is bang average here. Unlike many other places, tenure, intelligence and hard work can be loosely correlated with outcomes in the Bay. Living through a societally transformative gold rush in that environment can be paralyzing. "Am I in the right place? Should I move? Is there time still left? Am I gonna make it?" It psychologically torments many who have moved here in search of "success".

Ironically, a frequent side effect of this torment is to spin up the very products making everyone rich in hopes that you too can vibecode your path to economic enlightenment.

Sid Sharma@phylera14·6d

@samlambert

Sam Lambert@samlambert·6d

TIL if i sleep at night i feel pretty good the next day. i might try it more often.

Sid Sharma@phylera14·14 May

Your company's two highest-paid inference engineers just landed in Maui for the company offsite

Sid Sharma@phylera14·13 May

Inception is hiring a Head of Product

This is a hands-on role for a technical product lead who wants to help build the next generation of LLMs. You'd work directly with S-tier AI researchers at the frontier of model architecture, inference, and enterprise deployment.

We're one of the only AI labs where the product is live in production with enterprises and AI-native companies today - and the valuation is at a stage where your equity has real upside (not financial advice).

The bar is high. The role is not a walk in the park. But if you’ve been watching the frontier AI labs from the sidelines and waiting for the seat where you can help build foundational AI infrastructure before the category is obvious, this is it.

DM me. Bay Area only. jobs.gem.com/inception/am9i…

Sid Sharma@phylera14·12 May

The best AI agents in production aren't one model. They're 5-10 specialized subagents running in parallel, each matched to the right task/cost/speed tradeoff. @augmentcode's architecture is one of the cleanest examples of this shift. We wrote up how they do it.

Inception@_inception_ai

@augmentcode rebuilt their context compaction layer around Mercury 2. 82% latency cut. 90% cost cut. Comparable quality to Opus 4.7. Running in production today.

"We took a counter-intuitive bet. We decoupled summarization entirely, offloading it to Mercury 2 as a dedicated subagent. Mercury 2 is the highly efficient engine powering our most critical workflows." -@RustagiAnkur & @jm1234567890, Members of Technical Staff at Augment Code

The subagent layer needs the most efficient model. Full methodology and eval setup in the writeup. inceptionlabs.ai/blog/rise-of-r…

Sid Sharma@phylera14·11 May

@benln oh hey #7 👋

Ben Lang@benln·10 May

Pulled the fastest-growing startups based on X follower growth over the past 90 days:

Sid Sharma@phylera14·10 May

maybe time to feed it some @ExaAILabs

Sid Sharma@phylera14·10 May

happy mother’s day to everyone raising an LLM. i feed him context. i set boundaries. i give him examples of right and wrong. i praise him when he listens. and he still goes out there and embarrasses this family with fake citations. i didn’t raise him like this.

Sid Sharma@phylera14·10 May

@jeffzwang 👀

Jeffrey Wang@jeffzwang·10 May

Anyone have a good HTML styling template they like for LLM reports? Dark mode with purple isn't my jam

Thariq@trq212

HTML is the new markdown. I've stopped writing markdown files for almost everything and switched to using Claude Code to generate HTML for me. This is why.

Sid Sharma@phylera14·7 May

The frontier is not just intelligence

The conversation about LLM performance has been dominated by a single axis - intelligence. Benchmark scores on reasoning tasks. Performance on the hardest problems. That axis matters. It's what the primary agent needs.

But most tokens in an agentic system never touch the primary agent. They flow through the subagent layer: the high-frequency calls that keep the agentic system moving: compressing context, searching codebases, routing tasks, validating outputs.

Take compaction. Every time a coding agent fills its context window, a subagent reads the full session history and compresses it into a structured summary so the primary agent can keep going. Across production deployments, compaction alone accounts for >60% of all token volume. It doesn't need frontier intelligence. It needs speed, fidelity, and cost efficiency at scale.

The next frontier is not just smarter models. It is faster, cheaper models with baseline intelligence that can serve as the infrastructure layer beneath them. That is what Mercury 2 is built for.
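
The compaction step described in the post above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Inception's or any vendor's implementation: `count_tokens`, `compact_if_needed`, and `toy_summarize` are hypothetical names, the whitespace word count stands in for a real tokenizer, and in practice `summarize` would be a call to a fast subagent model.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer (assumption): whitespace-separated words.
    return len(text.split())


def compact_if_needed(
    history: List[str],
    budget: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Return the history unchanged while it fits the budget; otherwise
    replace the full session history with one summary message."""
    total = sum(count_tokens(message) for message in history)
    if total <= budget:
        return history
    # The subagent reads the whole history and returns a compressed summary.
    return ["[summary] " + summarize(history)]


def toy_summarize(messages: List[str]) -> str:
    # Hypothetical placeholder for the subagent model call.
    return f"{len(messages)} earlier messages condensed"


if __name__ == "__main__":
    session = ["open the repo", "run the failing test", "patch the parser module"]
    print(compact_if_needed(session, budget=5, summarize=toy_summarize))
```

Because the primary agent only ever sees the returned list, the summarizer can be swapped for a cheaper or faster model without touching the rest of the loop.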

Sid Sharma@phylera14·5 May

@sarahmsachs Wonder how Mercury 2 on Baseten would stack up here 👀

Sid Sharma retweeted

Tomas Hernando Kofman@tomas_hk·5 May

If you're routing through Not Diamond with your own Inception API key, send me an email from your work address for 200M free Mercury 2 tokens. Cheers to @phylera14 and the @_inception_ai team for the awesome promo!

Sid Sharma@phylera14·4 May

p50: 175ms vs 686ms
p99: 517ms vs 1183ms

a top-10 US tech company benchmarked Mercury 2 from @_inception_ai against Gemini Flash on their search pipelines in prod. same tasks. same eval. diffusion LLMs are a different animal.
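
For readers unfamiliar with the p50/p99 notation, here is a small sketch of how such figures are typically derived from raw latency samples, and what ratios the quoted numbers imply. The nearest-rank method is an assumption (the post does not say how the percentiles were computed), the sample list is made up for illustration, and `percentile` and `speedup` are hypothetical helper names.

```python
def percentile(samples_ms, p):
    """Nearest-rank percentile for an integer p between 1 and 100."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(samples_ms)
    # Integer ceiling of p * n / 100 gives the 1-based rank.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def speedup(baseline_ms, candidate_ms):
    # How many times lower the candidate latency is than the baseline.
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    # Made-up samples, NOT data from the benchmark in the post.
    samples = list(range(1, 101))
    print("p50:", percentile(samples, 50), "p99:", percentile(samples, 99))

    # Ratios implied by the figures quoted in the post.
    print("p50 ratio:", round(speedup(686, 175), 2))
    print("p99 ratio:", round(speedup(1183, 517), 2))
```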

Sid Sharma@phylera14·4 May

@AnjneyMidha nopa = williamsburg/park slope

Anjney Midha@AnjneyMidha·3 May

yes also:
east cut = singapore
mission dolores = brooklyn
presidio = english countryside
richmond = taiwan
fidi = hong kong
tenderloin = johannesburg
castro = chelsea
idk wtf to make of dogpatch/potrero

“paula”@paularambles

idk how to explain this but mission bay is spiritually dubai

Sid Sharma@phylera14·1 May

@_inception_ai @joycech3n TTFD (time-to-first-drink) is the only benchmark that matters. good work @joycech3n

Inception@_inception_ai·1 May

Turn your sound on 🔊 @joycech3n asked Mercury 2 to plan a friday happy hour bar crawl for the Inception team. It reasoned, called tools, picked spots, at the speed of a normal conversation. Real-time voice agents weren't possible with autoregressive latency. They are now. Try Mercury 2 in your voice agent stack today → platform.inceptionlabs.ai TTS by @elevenlabs

Sid Sharma@phylera14·1 May

@ewveggies beg to differ at @_inception_ai

Kyle Wong@ewveggies·1 May

Neolabs will run out of adjectives in a few years💀 We already have:
- Ineffable Intelligence
- Standard Intelligence
- Physical Intelligence
- Sapient Intelligence
- Ricursive Intelligence
- Advanced Machine Intelligence
- Safe Superintelligence
- Recursive Superintelligence

Any more?

Sid Sharma@phylera14·30 Apr

"we built a diffusion LLM that reasons, tool calls, and streams in under 500ms" Voice ai startups: