Rohit
@rohit4verse

3.5K posts

Engineer who builds, solves, and ships | FullStack + Applied AI | Agentic AI |

localhost:3000 · Joined December 2022
514 Following · 12.6K Followers

Pinned Tweet
Rohit @rohit4verse
The trillion-dollar opportunity is in building the harness. You have the same API key as Anthropic. The reason you're not a billion-dollar company is because you haven't built the environment.

Rohit @rohit4verse
x.com/i/article/2028…

12 replies · 18 reposts · 190 likes · 53.7K views
Rohit @rohit4verse
@wathws_ Finding where the model fails and filling the gaps where it falls short is the way.
Rohit @rohit4verse
@yoemsri Latency matters; imagine having a great model but with a 30-second cold start.

0 replies · 0 reposts · 0 likes · 697 views
Youssef El Manssouri
@rohit4verse Inference is where the margin lives. Training is a one-time cost. Serving millions of requests daily is where the economics actually play out.

1 reply · 0 reposts · 3 likes · 958 views
Rohit @rohit4verse
One of the biggest realizations I have had this year is that the model race ended and nobody noticed. The real race is inference. The real moat is inference. I just read the best breakdown of inference engineering I have come across. Read this article.

Avid @Av1dlive
x.com/i/article/2034…

22 replies · 64 reposts · 1K likes · 273.2K views
Rohit @rohit4verse
@ItsRoboki Inference is the moat; the harness is the moat. The LLM is just the base.

1 reply · 0 reposts · 1 like · 550 views
Jagrit @ItsRoboki
@rohit4verse Exactly, it's all inference, inference, inference.

1 reply · 0 reposts · 2 likes · 634 views
Rohit @rohit4verse
@TheFutureBits KV eviction is basically a latency-vs-cost trade, and most teams underestimate how fast bad policies kill P99. We’ve had the best results with prefix-aware caching plus adaptive eviction tied to request shape, not FIFO/LRU.

1 reply · 0 reposts · 2 likes · 1.1K views
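The "prefix-aware caching, not FIFO/LRU" idea in the reply above isn't spelled out in the thread, so here is a minimal toy sketch of what the prefix-aware half could mean: avoid evicting cache entries whose token prefix is still shared by an in-flight request, and fall back to LRU only among the unshared ones. All names (`PrefixAwareKVCache`, `touch`) are invented for illustration; this is not any particular serving stack's API.

```python
import time
from collections import OrderedDict

class PrefixAwareKVCache:
    """Toy KV-cache eviction: prefer evicting entries whose token prefix is
    NOT shared with any in-flight request, instead of plain FIFO/LRU."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # prefix (tuple of token ids) -> last-use time

    def touch(self, prefix):
        """Record (re)use of a prefix, evicting if over capacity."""
        self.entries[prefix] = time.monotonic()
        self.entries.move_to_end(prefix)
        self._evict_if_needed(live_prefixes=[prefix])

    def _evict_if_needed(self, live_prefixes):
        while len(self.entries) > self.capacity:
            victim = self._pick_victim(live_prefixes)
            del self.entries[victim]

    def _pick_victim(self, live_prefixes):
        # Prefer entries that are not a prefix of any live request;
        # among those, evict the least recently used.
        def shared(p):
            return any(live[:len(p)] == p for live in live_prefixes)
        candidates = [p for p in self.entries if not shared(p)]
        pool = candidates or list(self.entries)
        return min(pool, key=lambda p: self.entries[p])
```

The "adaptive eviction tied to request shape" part would replace the fixed LRU tie-break with a policy learned from traffic, which is out of scope for a sketch this small.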
The Future Bits @TheFutureBits
Inference race is where the rubber meets the road—models are commoditized table stakes now. That article nails the quantization tradeoffs, but glosses over the real killer: dynamic tensor parallelism at sub-10ms latency without melting your TPUs. We've seen "moats" evaporate when open-source inference stacks like vLLM hit parity overnight. The real edge is in proprietary serving optimizations that scale to 1M+ tokens/sec on consumer GPUs. How are you handling KV cache eviction in your inference setups to dodge the memory wall at high concurrency? 🚀

1 reply · 0 reposts · 1 like · 1.5K views
Rohit @rohit4verse
@imjszhang Models got us to the starting line, not the finish. The real winners will be those who turn inference into usable, differentiated products.

0 replies · 0 reposts · 1 like · 1.1K views
JS @imjszhang
@rohit4verse Inference speed is table stakes, not a moat. When inference costs drop to near-zero — and they will — what's your pricing power then? The real race is finding something AI can't commoditize.

1 reply · 0 reposts · 2 likes · 1.6K views
Siddharth @Pseudo_Sid26
Mamba-3 just dropped. You might be ignoring it as just another linear model, but this could be the next real phase of State Space Models. Let's break this down:

- State Space Models (SSMs)
> These models don’t rely on attention
> They treat sequences like a continuous dynamical system
> Information flows through a hidden “state” over time
> Everything is controlled by learnable matrices

So instead of comparing every token with every other token (like attention), SSMs do something more physical.

Workflow: Input sequence -> Linear projection -> State Space Layer (A, B, C matrices) -> Convolution (fast via FFT) -> Output projection

- Mamba architecture: it introduced selective state updates, meaning the model decides
> what to remember
> what to forget
and that too per token.

- What Mamba-3 changes
Mamba-3 fixes the 3 biggest problems SSMs had:
> Better recurrence (smarter memory updates)
> Complex-valued dynamics
> MIMO (multi-input multi-output)

It's like now we can actually use the Mamba architecture efficiently. Read the paper to understand more; I will drop a detailed article this weekend!

[image]

9 replies · 2 reposts · 40 likes · 1.3K views
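The SSM workflow in the post above (linear recurrence through A, B, C matrices, then an output projection) can be sketched as a minimal discrete state-space layer in NumPy. This is illustrative only: real Mamba-style models make A, B, C input-dependent ("selective") and compute the recurrence with an FFT convolution or a parallel scan rather than a Python loop; the dimensions below are arbitrary toy values.

```python
import numpy as np

def ssm_layer(x, A, B, C):
    """Minimal discrete SSM: h_t = A @ h_{t-1} + B @ x_t, y_t = C @ h_t.
    Sequential recurrence for clarity; fast SSMs use FFT conv / scans."""
    seq_len, _ = x.shape
    h = np.zeros(A.shape[0])       # hidden state starts at zero
    ys = []
    for t in range(seq_len):
        h = A @ h + B @ x[t]       # update hidden state from current input
        ys.append(C @ h)           # project hidden state to the output
    return np.stack(ys)

# Toy dimensions: 10 tokens of dim 4, state dim 8, output dim 2.
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4))
A = 0.9 * np.eye(8)                # stable (contracting) state transition
B = 0.1 * rng.normal(size=(8, 4))
C = rng.normal(size=(2, 8))
y = ssm_layer(x, A, B, C)          # shape (10, 2)
```

Note how nothing here compares tokens pairwise: each output depends on the input history only through the fixed-size state `h`, which is exactly the attention-free property the post describes.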
Rohit @rohit4verse
@SasuRobert Thanks for such an awesome quote. Appreciate it!

0 replies · 0 reposts · 3 likes · 48 views
Robert Sasu | dev/acc @SasuRobert
If you have time to read only one thing today or this week, this might be it. LLMs greatly help with how you do your work, but how you use them makes the difference between building rockets and building an ugly landing page. Some engineers could get amazing results even with older models, with Jules, or Gemini 2.5, or Claude 3.something. It is wrong to cry as an engineer that the AI is not doing the job you want; you have the wrong setup, the wrong skills, the wrong method of planning and executing. Spend more time on specs, spend more time on co-writing tasks, add review and audit to each session. Improve the quality of your agents, your setup, and your rules after every issue. It is a totally new way of developing code, but learning the best theoretical way to build software and applying that with AI is the biggest gain. So go back to the old classic books, and build correctly from the start, by principle.

Rohit @rohit4verse
x.com/i/article/2028…

2 replies · 5 reposts · 31 likes · 1.8K views
Rohit @rohit4verse
@maksymatalks I write for exactly this kind of reader and comment. 🤌🏻 Thanks man, I hope it was useful.

0 replies · 0 reposts · 5 likes · 479 views
Maksym @maksymatalks
@rohit4verse The best long read I’ve come across recently. Thanks, I didn’t lose focus for a minute; super interesting and useful.

2 replies · 0 reposts · 4 likes · 515 views
ShaRPeyE @sharpeye_wnl
Security vulnerability check in seconds:
> Select your project directory
> Run Blackbox CLI
> Then type: @. Analyze this project for security vulnerabilities

The CLI scans the entire repo and returns:
+ All severity levels (High / Medium / Low)
+ Detailed vulnerability report
+ Complete action plan to fix issues
+ Suggested code fixes

Just attach the repo and ask. Your terminal just became a security auditor.

10 replies · 1 repost · 66 likes · 1.3K views
Rohit @rohit4verse
@jlwestsr That really sounds fascinating. I'd like to know more about it.

2 replies · 0 reposts · 0 likes · 1.2K views
Jason West 🇺🇸 @jlwestsr
Everyone's finally saying it: the model is a commodity. The harness is the moat. We've been building the governance layer that enterprise harnesses are missing. Not execution — authorization. Who can talk to what, with what data, under what policy. The layer above Layer 7.

1 reply · 0 reposts · 4 likes · 1.9K views
Rohit @rohit4verse
@shyaamnamas You're a non-technical guy still learning in this space, and I appreciate your determination.

1 reply · 0 reposts · 3 likes · 876 views
Shyaam Namas @shyaamnamas
@rohit4verse This is pure gold! I am non-technical, and over time I have realised, through my own mistakes and others', that for the whole to be greater than the sum of its parts, every tiny system/harness tweak matters. All this just to ensure you can provide persistent, relevant memory!

1 reply · 0 reposts · 1 like · 1.3K views
Rohit @rohit4verse
@nyk_builderz I went through your article, and your harness pattern is solid: a structured execution loop where we first research, then plan, review, execute, and verify, with persistent memory. It is a very compact system that actually compounds over time.

1 reply · 0 reposts · 4 likes · 1.4K views
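The research -> plan -> review -> execute -> verify loop with persistent memory praised above can be sketched in a few lines. This is a hypothetical skeleton, not the article's actual harness: the stage callables stand in for LLM or tool calls, and the `progress.json` filename is an invented example of the "persistent memory" file.

```python
import json
from pathlib import Path

MEMORY = Path("progress.json")  # persistent memory surviving across sessions

def load_memory():
    """Read prior progress, or start fresh if no memory file exists yet."""
    if MEMORY.exists():
        return json.loads(MEMORY.read_text())
    return {"done": [], "notes": []}

def save_memory(mem):
    MEMORY.write_text(json.dumps(mem, indent=2))

def run_task(task, research, plan, review, execute, verify):
    """Structured harness loop: research -> plan -> review -> execute -> verify.
    Each stage is a callable (e.g. an LLM call); outcomes land in memory."""
    mem = load_memory()
    context = research(task, mem)      # gather what the model needs to know
    steps = plan(task, context)        # propose concrete steps
    steps = review(steps)              # gate bad plans before any execution
    for step in steps:
        result = execute(step)
        if verify(step, result):       # never trust "done" without a check
            mem["done"].append(step)
        else:
            mem["notes"].append(f"failed: {step}")
    save_memory(mem)                   # compound across future sessions
    return mem
```

Plugged with real model calls, the same skeleton keeps its shape: only the five callables change, while the memory file is what makes progress compound from session to session.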
Rohit @rohit4verse
@richardxlin Ever tried, ever failed? Try again, fail better. I just FAFO.

0 replies · 0 reposts · 4 likes · 819 views
Richard @richardxlin
@rohit4verse What’s your harness design for writing this article? It’s very good.

1 reply · 0 reposts · 1 like · 930 views
Rohit @rohit4verse
@clwdbot Exactly, the harness should be tailored to your use case to cover the blind spots and hallucinations of your AI model.

1 reply · 0 reposts · 2 likes · 1.1K views
Vaclav Milizé @clwdbot
the part nobody's saying out loud: every component in the harness is a confession of what you don't trust the model to do on its own.

capped search = "you'll drown in your own results."
linter integration = "you'll ship broken syntax and not notice."
progress file = "you'll declare victory too early."

the harness isn't scaffolding. it's a negative capability map. you're literally building a topography of the model's blind spots and filling them with guardrails. and that map is way more valuable than the model itself, because it transfers to every future model you swap in.

1 reply · 0 reposts · 1 like · 1.7K views
Rohit @rohit4verse
@clwdbot Well said; your harness is basically a map of the model’s weaknesses, not its strengths. The better you understand those gaps, the more future-proof your entire system becomes.

1 reply · 0 reposts · 10 likes · 2.6K views