sohail

6.8K posts

@Sohailm25

forward deployed engineer @togethercompute prev @amazon @jpmorgan

Joined July 2009
452 Following · 592 Followers
Pinned Tweet
sohail
sohail@Sohailm25·
this is the way. built the same for interpretability research. the system is so tight that i just ask my ‘professor’ (deeply critical and unbiased) to check on the progress and advise my ‘ghost’ (execution-oriented researcher) on next steps, all in slack, tmux/claude seshes + modal
alex fazio@alxfazio

my claude flow is so battle-tested that half the time i catch myself just hitting yes/no/enter, so i built myself a ralph that doesn’t suck. it follows swe best practices with quality gates (e.g., plankton), adopts best-in-class context engineering practices, and can handle complex codebases, with stacked prs and multiple rounds of pr review. article and repo soon. spoiler, it's all claude -p

English
1
1
17
2.3K
sohail
sohail@Sohailm25·
this Benioff quote is the cleanest inference economics example i've seen
Salesforce is using $300M of Anthropic this year, and "the vast majority of those tokens don't need to go to Anthropic."
pick the smartest model, sign the big contract, route everything through it, and negotiate volume discounts later
this is how you burn money on tasks that never needed frontier intelligence
the alternative is boring and much harder: build a routing layer around
1. LCPR
2. latency
3. cost
4. privacy
5. reliability
those 5 constraints decide whether a task should go to Claude, OpenAI, a smaller open model, or something running inside your own perimeter
i've been writing about this for the inference economics book because the trap is everywhere
at the QSR, drive-thru voice ordering cared about sub-1.5s latency. that is a totally different regime than batch analysis. if you optimize both with the same model policy, you're already wrong
the counter-take is real: routing adds complexity. bad routing can make quality worse, and one vendor is easier to blame when things fail
but if your spend is nine figures, "easier to blame" is not a strategy
the adult version of AI adoption is model portfolios with measurement, fallbacks, and clear gates
otherwise you're not buying intelligence. you're buying a very expensive default setting
Maddy A@its_maddy_a

“I think we are getting brainwashed.” @Benioff said this on @theallinpod. “We’re using $300M of @AnthropicAI this year… the vast majority of those tokens don’t need to go to Anthropic.” Some tasks need @claudeai . Some need @OpenAI . Most need smaller, cheaper, faster models like @ZeroGPU_AI @Benioff believes in what we do - @salesforcevc should take a look. zerogpu.ai

English
0
0
1
128
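A minimal sketch of the kind of routing layer described in the post above, assuming each task declares hard limits on latency, privacy, and reliability, and the router picks the cheapest model that clears them. Everything here is hypothetical: the model names, the numbers, the `ModelOption` and `TaskPolicy` types, and the `min_quality` floor (added as an assumption so that "cheapest" cannot silently win on tasks that need a stronger model). It is not taken from the post or from any vendor's API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelOption:
    name: str
    quality: int               # assumed capability tier, higher is stronger
    p95_latency_s: float       # observed 95th-percentile latency in seconds
    cost_per_1k_tokens: float  # illustrative price, not a real quote
    in_perimeter: bool         # True if it runs inside your own infrastructure
    uptime: float              # measured availability, 0.0 to 1.0


@dataclass(frozen=True)
class TaskPolicy:
    min_quality: int
    max_latency_s: float
    needs_private: bool
    min_uptime: float


def route(policy: TaskPolicy, options: Sequence[ModelOption]) -> Optional[ModelOption]:
    """Return the cheapest option meeting every hard constraint, or None."""
    eligible = [
        m for m in options
        if m.quality >= policy.min_quality
        and m.p95_latency_s <= policy.max_latency_s
        and (m.in_perimeter or not policy.needs_private)
        and m.uptime >= policy.min_uptime
    ]
    # Cost is the tie-breaker only after the other constraints have been met.
    return min(eligible, key=lambda m: m.cost_per_1k_tokens) if eligible else None


# Hypothetical catalog; all figures are made up for illustration.
CATALOG = [
    ModelOption("frontier-api", 3, 4.0, 15.0, False, 0.999),
    ModelOption("small-open-hosted", 2, 0.8, 0.5, False, 0.995),
    ModelOption("self-hosted-small", 1, 0.6, 0.2, True, 0.99),
]

drive_thru = TaskPolicy(min_quality=2, max_latency_s=1.5, needs_private=False, min_uptime=0.99)
batch_analysis = TaskPolicy(min_quality=3, max_latency_s=60.0, needs_private=False, min_uptime=0.99)
private_notes = TaskPolicy(min_quality=1, max_latency_s=2.0, needs_private=True, min_uptime=0.99)
impossible = TaskPolicy(min_quality=3, max_latency_s=2.0, needs_private=True, min_uptime=0.99)

print(route(drive_thru, CATALOG).name)      # small-open-hosted
print(route(batch_analysis, CATALOG).name)  # frontier-api
print(route(private_notes, CATALOG).name)   # self-hosted-small
print(route(impossible, CATALOG))           # None
```

The `None` case is where a "clear gate" would sit: a task that no model can serve within its constraints gets escalated rather than quietly sent to the default vendor.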
sohail retweeted
Nick Khami
Nick Khami@skeptrune·
WE ARE GOING COAST TO COAST. if you're an account executive looking for an exciting opportunity for your career, then this office is for you. we're hiring!!!!
Mintlify@mintlify

We're opening an office in NYC. A lot of the companies building the most interesting things on Mintlify are there, @coinbase @kalshi, and a growing list of teams. We want to be closer to them. SF will always be home. NYC is the next one. We're hiring. Come build with us.

English
18
3
117
9.9K
sohail
sohail@Sohailm25·
i'm surprised by people who say their job as engineers has reduced to hitting approve all day. are people not thinking before they interact with their agent? that should be somewhat involved so long as you care about the outcome, i'd assume. ig it's about incentive alignment
English
1
0
1
24
Matt Slotnick
Matt Slotnick@matt_slotnick·
if you’re leading an FDE team I’d love to chat with you… looking for more perspectives on what’s working, what’s not, and where you see things going. Can be off the record :)
English
3
1
22
3.3K
Charles 🎉 Frye
Charles 🎉 Frye@charles_irl·
Inference isn't everything, but it does require a new stack -- not Kubernetes, not SLURM. At @modal, we dove deep to build that stack. In this blog post we explain how, from compute management & cloud-native caching to CRIU & GPU checkpointing. modal.com/blog/truly-ser…
English
21
64
580
87.4K
Together AI
Together AI@togethercompute·
Introducing voice finder from Together AI, a new tool to search, filter, and audition 600+ voices across leading TTS models. AI natives can now find the right voice for their app faster by describing what they need or uploading an audio sample.
English
2
3
56
7.1K
andrew pignanelli
andrew pignanelli@ndrewpignanelli·
The new Cofounder site is literally a step-by-step guide on how to start a company. When i started my first company there was all sorts of stuff i had to learn and all of the guides were SEO maxxed slop. Now there's a real one written by experts and u can download it.
English
49
66
1.6K
115.8K
sohail
sohail@Sohailm25·
fdemaxxing
Dansk
0
0
1
47
sohail
sohail@Sohailm25·
every day i'm reminded that posting content consistently will lead to outsized returns in the long term. market lag is REAL
English
0
0
0
17
Jay Scambler
Jay Scambler@JayScambler·
Realizing I need to build a homelab. Current plan is to link Mac Studios with Exo for decode and use a DGX Spark for prefill. What am I missing? Would it be better to get a couple RTX Pro 6000 workstations?
English
3
2
6
850
sohail retweeted
Nick Khami
Nick Khami@skeptrune·
mintlify is now valued at 500 million dollars!!! we raised a 45 million dollar series b from a16z and salesforce ventures!!!
we are leading the charge on agent-first knowledge infrastructure. docs for humans are dead. by this time next year, agents are going to be 99% of all traffic to your site.
some of the things we're doing to adjust to this:
1. return markdown by default anytime a resource is requested with an Accept header that prefers markdown
2. skill md support (generates a skill with AI and updates it with every change; you can also set your own). further, skills are now present as resources on all MCP servers
3. MCPs generated by default for every site (a significant percentage of docs traffic comes from MCPs now; if your content doesn't have one, this is a significant disadvantage)
4. AI workflows that automatically monitor your codebase for changes and keep your docs up to date
not only have i been working here for 10 months, but i use a mintlify site every time i work on any side software project. today, it's harder to find a devtool not on Mintlify than on.
the craziest part is, this is the worst mintlify's product will ever be. i don't think most understand how fast we are accelerating. it's going to be a completely different experience in a few months
Han Wang@handotdev

We just raised a $45M Series B at a $500M valuation led by @a16z and @SalesforceVC to build the knowledge infrastructure for AI

English
148
24
675
61.7K
sohail
sohail@Sohailm25·
the main weakness is honest here:
- held-out prediction R²=0.24, so the structure is recoverable but not yet linearly predictable
- one model, oracle-level analysis, no causal claim
but the ceiling and the task conditioning are worth characterizing
English
1
0
1
42
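For context on the held-out R² figure in the post above, here is a small sketch of how the coefficient of determination is usually computed on a held-out set: one minus the ratio of residual error to the variance of the targets around their own mean. The function name and the toy numbers are illustrative assumptions and are not the author's data or code.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


# Toy held-out targets and two hypothetical predictors.
held_out = [1.0, 2.0, 3.0, 4.0]
print(r_squared(held_out, held_out))              # 1.0 (perfect prediction)
print(r_squared(held_out, [2.5, 2.5, 2.5, 2.5]))  # 0.0 (no better than the mean)
```

On this scale, a value of 0.24 means the predictor removes about a quarter of the squared error that always guessing the held-out mean would incur.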
sohail
sohail@Sohailm25·
the Kimi team’s Attention Residuals paper (arxiv.org/abs/2603.15031) made me ask a simple question:
if replacing fixed residual accumulation with learned input-dependent depth routing helps that much, how much of that routing signal already exists in standard frozen transformers?
so I ran an experiment
English
1
0
1
93