Jainit Purohit

687 posts

@mjainit

CTO @ Terrabase, AI agents, eval harnesses, decision infrastructure Meditation | Dhamma | phenomenology

Joined January 2010
1.3K Following 294 Followers
Pinned Tweet
Jainit Purohit
Jainit Purohit@mjainit·
May you find release with boundless freedom, With formless expansion stretching in all directions, Boundless, immeasurable, limitless, with goodwill in its core, and filled with joy, reaching far, evermore.
Jainit Purohit tweet media
0
0
15
1.2K
Jainit Purohit
Jainit Purohit@mjainit·
@thsottiaux Codex has been great, thanks! One issue though on Mac: in long threads, scrolling up sometimes jumps to random positions much higher in the thread. It breaks context and you have to scroll back down again. Pretty frustrating, would be great to see this fixed.
0
0
8
6.6K
Tibo
Tibo@thsottiaux·
Codex
Compute efficient ✅
Always up, never down ✅
Best at hardcore engineering ✅
Crazy good app, first to escape the terminal ✅
454
187
5.1K
2.4M
Jainit Purohit retweeted
Soumya Jain
Soumya Jain@wild_and_empty·
This is exactly why trace data shouldn’t just sit in an observability bucket. One layer below this: reviewed traces can also teach the agent how to work better, not just tell us whether the final answer was good. You can look at strong runs, review the workflow itself (tool calls, handoffs, context gathering, etc.), and turn that into a retrievable execution layer. So not just a data asset for analysis, but a learning layer for better runtime behavior.
1
2
3
650
Jainit Purohit
Jainit Purohit@mjainit·
@manthanguptaa same fear cycle every year. first "software engineering is solved", now "agents running companies". wonder how many actually survive production and real users. most teams are still fighting evals and edge cases. every line shipped is future maintenance debt.
1
0
0
183
Manthan Gupta
Manthan Gupta@manthanguptaa·
AI Twitter lately feels like pure fear mongering. If you are not spinning up 20 agents in parallel or pushing 200 commits a day, you are “falling behind.” If your agents aren't earning you a million then you will be poor. Most of it feels a lot like noise right now.
110
30
515
20.8K
Jainit Purohit
Jainit Purohit@mjainit·
@pmarca You’re conflating rumination with introspection. Rumination reinforces negative pathways. Introspection enables metacognition and error correction. No cognitive tool is inherently good or bad. Outcomes depend on whether it produces emotional loops or better models of reality.
27
61
1.3K
85.4K
Marc Andreessen 🇺🇸
My big conclusion from this week: Introspection causes emotional disorders.
1.6K
647
10.6K
52.2M
Jainit Purohit
Jainit Purohit@mjainit·
@elonmusk You’re conflating rumination with introspection. Rumination reinforces negative pathways. Introspection enables metacognition and error correction. No cognitive tool is inherently good or bad. Outcomes depend on whether it produces emotional loops or better models of reality.
1
0
10
700
Jainit Purohit
Jainit Purohit@mjainit·
yep, this is basically the exact workflow I use right now. I intended to automate the entire self-improvement loop but apart from obvious overfitting issues, you start seeing unknown behaviors and unnecessary layers of abstraction creeping in. The last remaining stabilizing step I had to add was a human-in-the-loop.

very interesting you mentioned human-in-the-loop "today" because it probably won’t be required in the future.

part of why karpathy’s autoresearch works well is because it is optimizing a single clean objective: validation bits per byte. whereas an agent harness is much messier. you end up having to measure and tune multiple things at once like trajectory correctness, tool call success rate, end outcome quality, handoff efficiency, context groundedness, and a bunch of other interacting metrics.
0
0
2
41
Viv
Viv@Vtrivedy10·
exciting avenues where evals/specs become the base language to build agents:
- start with a base harness, pretty barebones
- specify a goal to your agent. build up exactly what you mean with the agent
- map your crafted goal to specs/evals with the agent. Together you think really hard about “what do I want the agent behavior to be”
- agent loops and adjusts the harness until a threshold of evals pass
- human in the loop today for cheating/overfitting

Evals are a great language to specify behavior

Every row in your Eval dataset is a little vector that shifts the agent definition towards behavior to make that Eval pass
4
2
42
3K
Jainit Purohit retweeted
Hamel Husain
Hamel Husain@HamelHusain·
Made this video to explain evals
37
58
458
85.3K
Jainit Purohit
Jainit Purohit@mjainit·
agents are commodity
edge = context quality × harness design × eval loops
compounding comes from faster trace → failure → tweak → re-run cycles
0
0
1
128
Jainit Purohit
Jainit Purohit@mjainit·
@Vtrivedy10 are you guys thinking about trace —> eval —> harness improvement loops in langsmith? feels like missing infra for fast harness iteration
0
0
0
25
Jainit Purohit
Jainit Purohit@mjainit·
@Vtrivedy10 agreed. big unlock now is trace-driven iteration: not just measuring runs, but learning from traces what and how to tweak in the harness to hill climb fast.
building with deepagents around this: measure, eval, tweak, repeat for long-horizon data tasks.
1
0
0
29
Jainit Purohit
Jainit Purohit@mjainit·
another solid piece by @vtrivedy10. been a fan since he coined haas.
agent performance is now as much a harness problem as a model problem.
same model, different harness, wildly different outcomes.
we’re still early. a lot of the alpha is still here.
Viv@Vtrivedy10

x.com/i/article/2031…

1
0
2
305
Jainit Purohit retweeted
Alex Mordvintsev
Alex Mordvintsev@zzznah·
Every time you read or listen to something, you're running untrusted code on your wetware with no sandbox. Reading is code execution. Text is the oldest exploit. Choose your inputs carefully. Including this post.
113
183
1.5K
93.8K
Jainit Purohit retweeted
ThePrimeagen
ThePrimeagen@ThePrimeagen·
Before AI, you could feel how bad your bad decisions were and could often correct course on them after a couple of weeks. Slowing down often helped me make decisions I was happy about for months or years. Now our brick laying meme is the norm.
ThePrimeagen tweet media
27
17
643
36.2K
Jainit Purohit retweeted
ThePrimeagen
ThePrimeagen@ThePrimeagen·
I hate these "coding isn't the hard part" tweets I have been a part of and seen several companies not just struggling with "the right decision" but the culmination of their past technical decisions. AI won't magically make this go away. Lines of Code is still a liability and producing it faster doesn't change or reduce it, if anything it increases liability. Room temperature Twitter take strikes yet again
241
213
4.6K
236.6K
Jainit Purohit
Jainit Purohit@mjainit·
LangChain is moving so fast that upgrading to the latest version and aligning my harness with new features has become a routine every few days!!! 🚀
LangChain JS@LangChain_JS

🚀 deepagents@1.6.2 is out!
• Skills and memory now properly restore from StateBackend checkpoints
• Fixed infinite loop when agents read large files
• Removed unnecessary REMOVE_ALL_MESSAGES operations in PatchToolCallsMiddleware (fewer message mutations during tool call handling)
Upgrade: npm i @langchain/deepagents@latest
github.com/langchain-ai/d…

0
0
1
81
Jainit Purohit retweeted
LangChain
LangChain@LangChain·
📊 New blog: Choosing the right multi-agent architecture
Start with a single agent. But when you need multi-agent capabilities, pick the right pattern:
👥 Subagents - Centralized orchestration for multiple domains
💡 Skills - Progressive disclosure, load capabilities on-demand
🔄 Handoffs - Sequential workflows with state transitions
🧭 Router - Parallel dispatch across specialized agents
Includes performance benchmarks, decision framework, and code examples.
📖 Read the full guide: blog.langchain.com/choosing-the-r…
14
53
354
34.9K
Jainit Purohit retweeted
Soumya Jain
Soumya Jain@wild_and_empty·
Goal for this lifetime: Fierce in discipline, soft in heart, empty of self, precise in craft.
1
1
3
87