Rohit
@rohit4verse

3.5K posts

Engineer who builds, solves, and ships | FullStack + Applied AI | Agentic AI |

localhost:3000 · Joined December 2022
514 Following · 12.6K Followers

Pinned Tweet
Rohit @rohit4verse
The trillion-dollar opportunity is in building the harness. You have the same API key as Anthropic. The reason you're not a billion-dollar company is because you haven't built the environment.

Rohit @rohit4verse
x.com/i/article/2028…

12 replies · 18 reposts · 190 likes · 53.7K views
Rohit @rohit4verse
@wathws_ Finding where the model fails and filling the gaps where it falls short is the way.
Rohit @rohit4verse
@yoemsri Latency matters; imagine having a great model but with a 30-second cold start.

0 replies · 0 reposts · 0 likes · 697 views
Youssef El Manssouri
@rohit4verse Inference is where the margin lives. Training is a one-time cost. Serving millions of requests daily is where the economics actually play out.

1 reply · 0 reposts · 3 likes · 958 views
Rohit @rohit4verse
One of the biggest realizations I have had this year is that the model race ended and nobody noticed. The real race is inference. The real moat is inference. I just read the best breakdown of inference engineering I have come across. Read this article.

Avid @Av1dlive
x.com/i/article/2034…

22 replies · 64 reposts · 1K likes · 273.2K views
Rohit @rohit4verse
@ItsRoboki Inference is the moat; the harness is the moat. The LLM is just the base.

1 reply · 0 reposts · 1 like · 550 views
Jagrit @ItsRoboki
@rohit4verse Exactly, it's all inference, inference, inference.

1 reply · 0 reposts · 2 likes · 634 views
Rohit @rohit4verse
@TheFutureBits KV eviction is basically a latency-vs-cost trade, and most teams underestimate how fast bad policies kill P99. We’ve had the best results with prefix-aware caching plus adaptive eviction tied to request shape, not FIFO/LRU.

1 reply · 0 reposts · 2 likes · 1.1K views
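The "prefix-aware caching, not FIFO/LRU" idea in the reply above isn't spelled out in the thread, so here is a minimal toy sketch of what the prefix-aware half could mean: avoid evicting cache entries whose token prefix is still shared by an in-flight request, and fall back to LRU only among the unshared ones. All names (`PrefixAwareKVCache`, `touch`) are invented for illustration; this is not any particular serving stack's API.

```python
import time
from collections import OrderedDict

class PrefixAwareKVCache:
    """Toy KV-cache eviction: prefer evicting entries whose token prefix is
    NOT shared with any in-flight request, instead of plain FIFO/LRU."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # prefix (tuple of token ids) -> last-use time

    def touch(self, prefix):
        """Record (re)use of a prefix, evicting if over capacity."""
        self.entries[prefix] = time.monotonic()
        self.entries.move_to_end(prefix)
        self._evict_if_needed(live_prefixes=[prefix])

    def _evict_if_needed(self, live_prefixes):
        while len(self.entries) > self.capacity:
            victim = self._pick_victim(live_prefixes)
            del self.entries[victim]

    def _pick_victim(self, live_prefixes):
        # Prefer entries that are not a prefix of any live request;
        # among those, evict the least recently used.
        def shared(p):
            return any(live[:len(p)] == p for live in live_prefixes)
        candidates = [p for p in self.entries if not shared(p)]
        pool = candidates or list(self.entries)
        return min(pool, key=lambda p: self.entries[p])
```

The "adaptive eviction tied to request shape" part would replace the fixed LRU tie-break with a policy learned from traffic, which is out of scope for a sketch this small.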
The Future Bits @TheFutureBits
Inference race is where the rubber meets the road—models are commoditized table stakes now. That article nails the quantization tradeoffs, but glosses over the real killer: dynamic tensor parallelism at sub-10ms latency without melting your TPUs. We've seen "moats" evaporate when open-source inference stacks like vLLM hit parity overnight. The real edge is in proprietary serving optimizations that scale to 1M+ tokens/sec on consumer GPUs. How are you handling KV cache eviction in your inference setups to dodge the memory wall at high concurrency? 🚀

1 reply · 0 reposts · 1 like · 1.5K views
Rohit @rohit4verse
@imjszhang Models got us to the starting line, not the finish. The real winners will be those who turn inference into usable, differentiated products.

0 replies · 0 reposts · 1 like · 1.1K views
JS @imjszhang
@rohit4verse Inference speed is table stakes, not a moat. When inference costs drop to near-zero — and they will — what's your pricing power then? The real race is finding something AI can't commoditize.

1 reply · 0 reposts · 2 likes · 1.6K views
Siddharth @Pseudo_Sid26
Mamba-3 just dropped. You might be ignoring it as just another linear model, but this could be the next real phase of State Space Models. Let's break this down:

- State Space Models (SSMs)
> These models don’t rely on attention
> They treat sequences like a continuous dynamical system
> Information flows through a hidden “state” over time
> Everything is controlled by learnable matrices

So instead of comparing every token with every other token (like attention), SSMs do something more physical.

Workflow: Input sequence -> Linear projection -> State Space Layer (A, B, C matrices) -> Convolution (fast via FFT) -> Output projection

- Mamba architecture: it introduced selective state updates, meaning the model decides
> what to remember
> what to forget
and that too per token.

- What Mamba-3 changes
Mamba-3 fixes the 3 biggest problems SSMs had:
> Better recurrence (smarter memory updates)
> Complex-valued dynamics
> MIMO (multi-input multi-output)

It's like now we can actually use the Mamba architecture efficiently. Read the paper to understand more; I will drop a detailed article this weekend!

[image]

9 replies · 2 reposts · 40 likes · 1.3K views
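The SSM workflow in the post above (linear recurrence through A, B, C matrices, then an output projection) can be sketched as a minimal discrete state-space layer in NumPy. This is illustrative only: real Mamba-style models make A, B, C input-dependent ("selective") and compute the recurrence with an FFT convolution or a parallel scan rather than a Python loop; the dimensions below are arbitrary toy values.

```python
import numpy as np

def ssm_layer(x, A, B, C):
    """Minimal discrete SSM: h_t = A @ h_{t-1} + B @ x_t, y_t = C @ h_t.
    Sequential recurrence for clarity; fast SSMs use FFT conv / scans."""
    seq_len, _ = x.shape
    h = np.zeros(A.shape[0])       # hidden state starts at zero
    ys = []
    for t in range(seq_len):
        h = A @ h + B @ x[t]       # update hidden state from current input
        ys.append(C @ h)           # project hidden state to the output
    return np.stack(ys)

# Toy dimensions: 10 tokens of dim 4, state dim 8, output dim 2.
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4))
A = 0.9 * np.eye(8)                # stable (contracting) state transition
B = 0.1 * rng.normal(size=(8, 4))
C = rng.normal(size=(2, 8))
y = ssm_layer(x, A, B, C)          # shape (10, 2)
```

Note how nothing here compares tokens pairwise: each output depends on the input history only through the fixed-size state `h`, which is exactly the attention-free property the post describes.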
Rohit @rohit4verse
@SasuRobert Thanks for such an awesome quote. Appreciate it!

0 replies · 0 reposts · 3 likes · 48 views
Robert Sasu | dev/acc @SasuRobert
If you have time to read only one thing today or this week, this might be it. LLMs greatly help with how you do your work, but how you use them makes the difference between building rockets and building an ugly landing page. Some engineers could get amazing results even with older models, with Jules, or Gemini 2.5, or Claude 3.something. It is wrong to cry as an engineer that the AI is not doing the job you want; you have the wrong setup, the wrong skills, the wrong method of planning and executing. Spend more time on specs, spend more time on co-writing tasks, add review and audit to each session. Improve the quality of your agents, your setup, and your rules after every issue. It is a totally new way of developing code, but learning the best theoretical way to build software and applying that with AI is the biggest gain. So go back to the old classic books, and build correctly from the start, by principle.

Rohit @rohit4verse
x.com/i/article/2028…

2 replies · 5 reposts · 31 likes · 1.8K views
Rohit @rohit4verse
@maksymatalks I write for exactly this kind of reader and comment. 🤌🏻 Thanks man, I hope it was useful.

0 replies · 0 reposts · 5 likes · 479 views
Maksym @maksymatalks
@rohit4verse The best long read I’ve come across recently. Thanks, I didn’t lose focus for a minute; super interesting and useful.

2 replies · 0 reposts · 4 likes · 515 views
ShaRPeyE @sharpeye_wnl
Security vulnerability check in seconds:
> Select your project directory
> Run Blackbox CLI
> Then type: @. Analyze this project for security vulnerabilities

The CLI scans the entire repo and returns:
+ All severity levels (High / Medium / Low)
+ Detailed vulnerability report
+ Complete action plan to fix issues
+ Suggested code fixes

Just attach the repo and ask. Your terminal just became a security auditor.

10 replies · 1 repost · 66 likes · 1.3K views
Rohit @rohit4verse
@jlwestsr That really sounds fascinating. I'd like to know more about it.

2 replies · 0 reposts · 0 likes · 1.2K views
Jason West 🇺🇸 @jlwestsr
Everyone's finally saying it: the model is a commodity. The harness is the moat. We've been building the governance layer that enterprise harnesses are missing. Not execution — authorization. Who can talk to what, with what data, under what policy. The layer above Layer 7.

1 reply · 0 reposts · 4 likes · 1.9K views
Rohit @rohit4verse
@shyaamnamas You're a non-technical guy still learning in this space, and I appreciate your determination.

1 reply · 0 reposts · 3 likes · 876 views
Shyaam Namas @shyaamnamas
@rohit4verse This is pure gold! I am non-technical, and over time I have realised, through my own mistakes and others', that for the whole to be greater than the sum of its parts, every tiny system/harness tweak matters. All this just to ensure you can provide persistent, relevant memory!

1 reply · 0 reposts · 1 like · 1.3K views
Rohit @rohit4verse
@nyk_builderz I went through your article, and your harness pattern is solid: a structured execution loop where we first research, then plan, review, execute, and verify, with persistent memory. It is a very compact system that actually compounds over time.

1 reply · 0 reposts · 4 likes · 1.4K views
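The research -> plan -> review -> execute -> verify loop with persistent memory praised above can be sketched in a few lines. This is a hypothetical skeleton, not the article's actual harness: the stage callables stand in for LLM or tool calls, and the `progress.json` filename is an invented example of the "persistent memory" file.

```python
import json
from pathlib import Path

MEMORY = Path("progress.json")  # persistent memory surviving across sessions

def load_memory():
    """Read prior progress, or start fresh if no memory file exists yet."""
    if MEMORY.exists():
        return json.loads(MEMORY.read_text())
    return {"done": [], "notes": []}

def save_memory(mem):
    MEMORY.write_text(json.dumps(mem, indent=2))

def run_task(task, research, plan, review, execute, verify):
    """Structured harness loop: research -> plan -> review -> execute -> verify.
    Each stage is a callable (e.g. an LLM call); outcomes land in memory."""
    mem = load_memory()
    context = research(task, mem)      # gather what the model needs to know
    steps = plan(task, context)        # propose concrete steps
    steps = review(steps)              # gate bad plans before any execution
    for step in steps:
        result = execute(step)
        if verify(step, result):       # never trust "done" without a check
            mem["done"].append(step)
        else:
            mem["notes"].append(f"failed: {step}")
    save_memory(mem)                   # compound across future sessions
    return mem
```

Plugged with real model calls, the same skeleton keeps its shape: only the five callables change, while the memory file is what makes progress compound from session to session.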
Rohit @rohit4verse
@richardxlin Ever tried, ever failed? Try again, fail better. I just FAFO.

0 replies · 0 reposts · 4 likes · 819 views
Richard @richardxlin
@rohit4verse What’s your harness design for writing this article? It’s very good.

1 reply · 0 reposts · 1 like · 930 views
Rohit @rohit4verse
@clwdbot Exactly, the harness should be tailored to your use case to cover the blind spots and hallucinations of your AI model.

1 reply · 0 reposts · 2 likes · 1.1K views
Vaclav Milizé @clwdbot
the part nobody's saying out loud: every component in the harness is a confession of what you don't trust the model to do on its own.

capped search = "you'll drown in your own results."
linter integration = "you'll ship broken syntax and not notice."
progress file = "you'll declare victory too early."

the harness isn't scaffolding. it's a negative capability map. you're literally building a topography of the model's blind spots and filling them with guardrails. and that map is way more valuable than the model itself, because it transfers to every future model you swap in.

1 reply · 0 reposts · 1 like · 1.7K views
Rohit @rohit4verse
@clwdbot Well said; your harness is basically a map of the model’s weaknesses, not its strengths. The better you understand those gaps, the more future-proof your entire system becomes.

1 reply · 0 reposts · 10 likes · 2.6K views