sohail

6.8K posts

@Sohailm25

forward deployed engineer @togethercompute prev @amazon @jpmorgan

Joined July 2009

452 Following, 592 Followers

Pinned Tweet

sohail@Sohailm25·28 Feb

this is the way
built the same for interpretability research
the system is so tight that I just ask my ‘professor’ (deeply critical and unbiased) to check on the progress/advise my ‘ghost’ (execution oriented researcher) on next steps
all in slack, tmux/claude sesh’s + modal

alex fazio@alxfazio

my claude flow is so battle-tested that half the time i catch myself just hitting yes/no/enter, so i built myself a ralph that doesn’t suck. it follows swe best practices with quality gates (e.g., plankton), adopts best-in-class context engineering practices, and can handle complex codebases, with stacked prs and multiple rounds of pr review. article and repo soon. spoiler, it's all claude -p

sohail@Sohailm25·4h

this Benioff quote is the cleanest inference economics example i've seen
Salesforce is using $300M of Anthropic this year, and "the vast majority of those tokens don't need to go to Anthropic."
pick the smartest model, sign the big contract, route everything through it, and negotiate volume discounts later
this is how you burn money on tasks that never needed frontier intelligence
the alternative is boring and much harder: build a routing layer around
1. LCPR
2. latency
3. cost
4. privacy
5. reliability
those 5 constraints decide whether a task should go to Claude, OpenAI, a smaller open model, or something running inside your own perimeter
i've been writing about this for the inference economics book because the trap is everywhere
at the QSR, drive-thru voice ordering cared about sub-1.5s latency. that is a totally different regime than batch analysis. if you optimize both with the same model policy, you're already wrong
the counter-take is real: routing adds complexity. bad routing can make quality worse, and one vendor is easier to blame when things fail
but if your spend is nine figures, "easier to blame" is not a strategy
the adult version of AI adoption is model portfolios with measurement, fallbacks, and clear gates
otherwise you're not buying intelligence. you're buying a very expensive default setting

Maddy A@its_maddy_a

“I think we are getting brainwashed.” @Benioff said this on @theallinpod. “We’re using $300M of @AnthropicAI this year… the vast majority of those tokens don’t need to go to Anthropic.” Some tasks need @claudeai . Some need @OpenAI . Most need smaller, cheaper, faster models like @ZeroGPU_AI @Benioff believes in what we do - @salesforcevc should take a look. zerogpu.ai
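
The routing layer described in sohail's post above could be sketched roughly as follows. This is a minimal illustration, not anyone's production design: the class names, model labels, and all numbers are hypothetical, and because the post does not spell out its first constraint ("LCPR"), the sketch assumes a simple quality floor alongside the latency, cost, privacy, and reliability gates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    name: str                  # hypothetical label, not a real product SKU
    quality: int               # rough capability tier, higher is stronger (assumed)
    p95_latency_s: float       # measured 95th-percentile latency in seconds
    cost_per_1k_tokens: float  # illustrative price, not a real quote
    in_perimeter: bool         # True if it runs inside your own infrastructure
    reliability: float         # observed success rate, 0.0 to 1.0


@dataclass(frozen=True)
class Task:
    min_quality: int
    max_latency_s: float
    requires_in_perimeter: bool
    min_reliability: float = 0.99


def route(task: Task, options: list[ModelOption]) -> Optional[ModelOption]:
    """Return the cheapest option that passes every gate, or None if none does."""
    eligible = [
        m for m in options
        if m.quality >= task.min_quality
        and m.p95_latency_s <= task.max_latency_s
        and m.reliability >= task.min_reliability
        and (m.in_perimeter or not task.requires_in_perimeter)
    ]
    # Cost is the tie-breaker only after the hard constraints are satisfied.
    return min(eligible, key=lambda m: m.cost_per_1k_tokens, default=None)


# Made-up portfolio for illustration only.
OPTIONS = [
    ModelOption("frontier-api", 3, 4.0, 15.0, False, 0.995),
    ModelOption("small-open-hosted", 2, 0.8, 0.5, False, 0.995),
    ModelOption("small-open-onprem", 1, 0.6, 0.2, True, 0.99),
]

# Two regimes from the post: latency-bound voice ordering vs. batch analysis.
voice_order = Task(min_quality=2, max_latency_s=1.5, requires_in_perimeter=False)
batch_analysis = Task(min_quality=3, max_latency_s=60.0, requires_in_perimeter=False)
```

With these made-up numbers, the sub-1.5s voice task and the batch task resolve to different models, which is the point of the post. A `None` result is where a fallback policy or an explicit escalation would go.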

sohail@Sohailm25·4h

a response to @ashwingop's piece, which you can read first here: x.com/ashwingop/stat…

sohail@Sohailm25

x.com/i/article/2056…

sohail@Sohailm25·4h

x.com/i/article/2056…

sohail@Sohailm25·4h

whoever owns the surface owns the rollouts

sohail@Sohailm25

x.com/i/article/2056…

sohail@Sohailm25·4h

x.com/i/article/2056…

sohail retweeted

Nick Khami@skeptrune·5h

WE ARE GOING COAST TO COAST
if you're an account executive looking for an exciting opportunity for your career then this office is for you
we're hiring!!!!

Mintlify@mintlify

We're opening an office in NYC. A lot of the companies building the most interesting things on Mintlify are there, @coinbase @kalshi, and a growing list of teams. We want to be closer to them. SF will always be home. NYC is the next one. We're hiring. Come build with us.

sohail@Sohailm25·6h

Holy crap

Andrej Karpathy@karpathy

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

sohail@Sohailm25·7h

man I thought this said forward deployed b**** and got hype

Greg Van Horn@gvh41

x.com/i/article/2056…

sohail@Sohailm25·1d

im surprised by people who say their job as engineers has reduced to hitting approve all day
are people not thinking before they interact with their agent? that should be somewhat involved so long as you care about the outcome i'd assume
ig its about incentive alignment

sohail@Sohailm25·2d

@matt_slotnick dm’d

Matt Slotnick@matt_slotnick·4d

if you’re leading an FDE team I’d love to chat with you… looking for more perspectives on what’s working, what’s not, and where you see things going. Can be off the record :)

sohail@Sohailm25·12 May

@charles_irl @modal love this man

Charles 🎉 Frye@charles_irl·12 May

Inference isn't everything, but it does require a new stack -- not Kubernetes, not SLURM. At @modal, we dove deep to build that stack. In this blog post we explain how, from compute management & cloud-native caching to CRIU & GPU checkpointing. modal.com/blog/truly-ser…

sohail@Sohailm25·12 May

@togethercompute 👀👀👀

Together AI@togethercompute·12 May

Introducing voice finder from Together AI, a new tool to search, filter, and audition 600+ voices across leading TTS models. AI natives can now find the right voice for their app faster by describing what they need or uploading an audio sample.

sohail@Sohailm25·12 May

@ndrewpignanelli its giving maplestory i love it

andrew pignanelli@ndrewpignanelli·12 May

The new Cofounder site is literally a step-by-step guide on how to start a company. When i started my first company there was all sorts of stuff i had to learn and all of the guides were SEO maxxed slop. Now there's a real one written by experts and u can download it.

sohail@Sohailm25·11 May

fdemaxxing

sohail@Sohailm25·11 May

every day im reminded that posting content consistently will lead to outsized returns in the long term
market lag is REAL

sohail@Sohailm25·16 Apr

@JayScambler @SIGKITTEN share ur wisdom

Jay Scambler@JayScambler·16 Apr

Realizing I need to build a homelab. Current plan is to link Mac Studios with Exo for decode and use a DGX Spark for prefill. What am I missing? Would it be better to get a couple RTX Pro 6000 workstations?

sohail retweeted

Nick Khami@skeptrune·14 Apr

mintlify is now valued at 500 million dollars!!!
we raised a 45 million dollar series b from a16z and salesforce ventures!!!
we are leading the charge on agent-first knowledge infrastructure. docs for humans are dead. by this time next year, agents are going to be 99% of all traffic to your site.
some of the things we're doing to adjust to this:
1. return markdown by default anytime a resource is requested with an accepts ‘prefers markdown’ header
2. skill md support (generates a skill with AI and updates it with every change, you can also set your own). further, skills are now present as resources on all MCP servers
3. MCPs generated by default for every site (a significant percentage of docs traffic comes from MCPs now, if your content doesn't have one this is a significant disadvantage)
4. AI workflows that automatically monitor your codebase for changes and keep your docs up to date
not only have i been working here for 10 months, but i use a mintlify site every time i work on any side software project. today, it's harder to find a devtool not on Mintlify than on
the craziest part is, this is the worst mintlify's product will ever be. i don't think most understand how fast we are accelerating. it's going to be a completely different experience in a few months

Han Wang@handotdev

We just raised a $45M Series B at a $500M valuation led by @a16z and @SalesforceVC to build the knowledge infrastructure for AI

sohail@Sohailm25·22 Mar

is it just me who thought this was a megamind shaped creature flailing a box between its 5 appendages with its head cocked back

ORCA Dexterity@orcahand

Last week, we announced our three new hands. Today, we're releasing their digital twins ↓↓↓
> new orcahand mjcf/urdf files available on github.com/orcahand/orcah…
> custom learning environment @ github.com/orcahand/orca_…

sohail@Sohailm25·22 Mar

this was a GREAT piece for anyone who has only heard of world models but hasn't gotten to look into them

Packy McCormick@packyM

Read this over the weekend (bonus if you read all the papers in the Research List) and you’ll be among the “very few who understand how far-reaching” the shift to World Models is. notboring.co/p/world-models

sohail@Sohailm25·21 Mar

full writeup with methods and limitations sohailmo.ai/research/laten…

sohail@Sohailm25·21 Mar

the main weakness is honest here:
- held-out prediction R²=0.24, so the structure is recoverable but not yet linearly predictable
- one model, oracle-level analysis, no causal claim
but the ceiling and the task conditioning are worth characterizing
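
For readers unfamiliar with the held-out R² figure quoted above, here is a minimal sketch of the standard coefficient of determination. It assumes the usual definition (one minus residual sum of squares over total sum of squares). The post does not say exactly how its R² was computed, and the function name is hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    1.0 means perfect prediction; 0.0 means no better than always
    predicting the mean of y_true; negative values are worse than that.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when y_true is constant")
    return 1 - ss_res / ss_tot
```

Under this definition, a held-out value of 0.24 would mean the predictor accounts for roughly a quarter of the variance in the held-out targets relative to a predict-the-mean baseline.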

sohail@Sohailm25·21 Mar

the Kimi team’s Attention Residuals paper (arxiv.org/abs/2603.15031) made me ask a simple question
if replacing fixed residual accumulation with learned input-dependent depth routing helps that much, how much of that routing signal already exists in standard frozen transformers?
so I ran an experiment
