Antrixsh Gupta

2K posts

@AntrixshG

Data Science Professional, Technology Geek,

Pune, India · Joined August 2018
114 Following · 423 Followers
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
Vibe coding hackathon. You'll all love this.
0
0
0
34
Arsh Goyal
Arsh Goyal@arsh_goyal·
Guess the city?
27
4
91
11.5K
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
A lot of AI agent startups got cooked today. Anthropic launched Managed Agents. A fully hosted service that runs long-horizon AI agents on your behalf. Session management. Sandbox execution. Context engineering. Failure recovery. All of it. Native to the Claude platform. Here is what this actually means. There is an entire category of startups whose product is exactly this. “We make AI agents reliable at scale.” That is their pitch. That is their Series A. That is their moat. Anthropic just made it a feature. The AI agent infrastructure space just got a lot more crowded.
Claude@claudeai

Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.

0
0
0
52
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
Oracle fired 30k employees via a 6 AM email. And Oracle is not a struggling company; it made more money than ever. Despite that, 30k people lost their jobs. They told you to code, you did. They told you to upskill, you did. They told you to learn AI, you did. And then they replaced you with the same system you helped them build. #oracle #layoff
0
0
0
34
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
The biggest blind spot in AI isn't prompt injection, it's permission creep. Agents execute valid API calls that quietly cross trust boundaries. EDR sees processes, not context drift. Treat agents like privileged identities, not just software. #AIInfra #AppSec #Agents
0
0
0
33
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
I'm noticing a pattern: the bottleneck for agents isn't reasoning, it's execution boundaries. We've got agents rewriting code overnight, yet we treat prompts as security. If it can trigger side effects without hard IAM, you just have best-effort vibes. #AIagents #LLMs #IAM
0
0
0
12
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
Noticing a shift away from naive RAG. Chunk-and-embed pipelines work for text fetching, but fail at understanding structure. The unlock for agent memory isn't vector similarity—it's structured graph layers where retrieval is navigation, not lookup. #RAG #AIAgents #LLMs
0
0
0
18
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
I'm noticing a shift in agent infra: browser automation is a dead end. Builders are abandoning DOM scraping, instead wrapping the web in CLIs and offline-first MCP servers. Stop making your models read HTML. Give them native programmatic access. #AIagents #MCP #dev
1
0
1
30
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
I'm noticing a pattern: our evals are rotting. We’re optimizing against benchmarks where 6% of the ground truth is wrong, and weak LLM judges accept 63% of garbage answers. We're benchmarking context windows, not actual memory. Build deterministic evals. #LLMs #Evals #GenAI
0
0
0
14
Sick
Sick@sickdotdev·
Hey founders, drop what you’re building 👇 Last time 1M+ people saw it. Consider this marketing.
345
8
162
13.7K
Blake Emal
Blake Emal@heyblake·
Drop your project URL. Let’s drive some traffic.
1.4K
16
734
121.1K
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
An agent without strict monitoring and token limits isn't a feature. It's a memory leak with a credit card attached. Without guardrails, they loop, eat your RAM, and burn your API budget by Sunday. Agents are volatile infra, not static code. #AIAgents #Infra #LLMs
2
0
0
101
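One way to picture the guardrails in the post above is a hard budget wrapped around the agent loop. This is a minimal sketch, not any framework's API: `Budget`, `BudgetExceeded`, `run_agent`, and the `step_fn` callable (assumed to return a `(done, tokens_used)` pair) are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when an agent run crosses a configured limit."""


@dataclass
class Budget:
    max_steps: int
    max_tokens: int
    steps: int = 0
    tokens: int = 0

    def charge(self, tokens_used: int) -> None:
        # Record one step and its token cost, then enforce both limits.
        self.steps += 1
        self.tokens += tokens_used
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"token limit {self.max_tokens} exceeded")


def run_agent(step_fn, budget: Budget) -> Budget:
    # step_fn is a hypothetical stand-in for one agent iteration.
    # It is assumed to return (done, tokens_used).
    while True:
        done, tokens_used = step_fn()
        budget.charge(tokens_used)
        if done:
            return budget
```

A looping agent then fails fast instead of running until the bill arrives: with `Budget(max_steps=3, max_tokens=1000)`, a `step_fn` that never finishes raises `BudgetExceeded` on its fourth step.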
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
I'm noticing a massive blind spot in agent evals: grading only the final output. An agent can loop 5 times, hallucinate a tool call, recover, and still return the 'right' answer. If you aren't scoring the execution trace, your evals are lying to you. #AIAgents #Evals #LLMs
1
0
2
16
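As a rough illustration of scoring the execution trace rather than just the answer, the sketch below counts repeated and failed tool calls. The `ToolCall` record, the `score_trace` function, and the penalty weights (0.1 per repeat, 0.2 per failure) are assumptions made up for this example, not an established metric.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: str   # serialized arguments, used to spot exact repeats
    ok: bool    # False if the call failed or named a tool that does not exist


def score_trace(trace: list[ToolCall], answer_correct: bool) -> dict:
    # Count exact (tool, args) repeats as a crude loop signal.
    counts = Counter((c.tool, c.args) for c in trace)
    repeated = sum(n - 1 for n in counts.values() if n > 1)
    failed = sum(1 for c in trace if not c.ok)
    # Illustrative weights only; tune them for your own workload.
    penalty = 0.1 * repeated + 0.2 * failed
    return {
        "answer_correct": answer_correct,
        "repeated_calls": repeated,
        "failed_calls": failed,
        "trace_score": max(0.0, 1.0 - penalty),
    }
```

Reporting `answer_correct` and `trace_score` side by side makes the case from the post visible: a run can be correct and still score poorly on how it got there.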
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
The real issue is we're optimizing models against broken yardsticks. An audit of the LoCoMo benchmark found its LLM judge accepts 63% of deliberately wrong answers. LongMemEval is just a context window test. Stop trusting leaderboards. Build custom evals. #LLMs #Evals #AI
0
0
0
23
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
We're building recursive self-improving agents, yet MCP still chokes on binary transfers and agents need append-only WALs just to survive context compaction. The real bottleneck isn't reasoning. It's brittle infra and output-only evals. #AIAgents #LLMs #DevTools
0
0
0
11
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
Evaluating AI agents on their final output is a massive blind spot. I’m seeing agents land on the correct answer only after insane tool loops and near-catastrophic API calls. Stop scoring the output. The real signal is in the execution trace. #AIAgents #LLMs #Evals
0
0
0
10
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
Most 'agents' are just hardcoded DAGs with an LLM node in the middle. And that's fine. Hardcode your logic. Use models strictly for messy inputs. When workflows break, you patch a node. When agents break, you're lost in hallucinated tool calls. #LLMs #Agents #DevTools
0
1
0
9
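The "hardcoded workflow with one LLM node" idea from the post above might look something like this. It is a toy sketch: the invoice steps and the `extract` callable (a hypothetical stand-in for a model call that turns messy text into a number) are invented for illustration.

```python
def parse_amount(text: str, extract) -> float:
    # The only node that touches a model. `extract` is a hypothetical
    # callable standing in for an LLM call on messy input.
    return float(extract(text))


def apply_discount(amount: float) -> float:
    # Deterministic business rule: easy to test and easy to patch.
    return round(amount * 0.9, 2)


def format_invoice(amount: float) -> str:
    return f"Total due: {amount:.2f}"


def run_workflow(text: str, extract) -> str:
    # Fixed order of nodes; the model never decides control flow.
    amount = parse_amount(text, extract)
    return format_invoice(apply_discount(amount))
```

If the discount is wrong, the fix is a one-line change in `apply_discount`; nothing about the model call needs to be debugged.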
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
The biggest blind spot in agent dev isn't reasoning, it's infra. We obsess over prompts but ignore structural failures: missing idempotency keys, hidden trace loops, and duplicate tool calls. Stop evaluating just final outputs and audit the trace. #AIAgents #LLMs #Infra
3
0
2
27
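For the missing idempotency keys mentioned in the post above, a minimal sketch could derive a key from the tool name and arguments and skip any call that has already run. `idempotency_key` and `ToolExecutor` are hypothetical names; a real system would likely also scope the key to a run or session and persist results outside process memory.

```python
import hashlib
import json


def idempotency_key(tool: str, args: dict) -> str:
    # Stable hash of the tool name and its arguments (keys sorted).
    payload = json.dumps({"tool": tool, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToolExecutor:
    def __init__(self):
        self._results = {}     # key -> cached result (in memory, for the sketch)
        self.executions = 0    # how many side effects actually ran

    def call(self, tool: str, args: dict, fn):
        key = idempotency_key(tool, args)
        if key in self._results:
            # Duplicate tool call: return the earlier result, no new side effect.
            return self._results[key]
        result = fn(**args)
        self.executions += 1
        self._results[key] = result
        return result
```

With this in place, an agent that retries `charge(amount=5)` after a timeout gets the cached result back instead of charging twice.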
Antrixsh Gupta
Antrixsh Gupta@AntrixshG·
Evaluating agents purely on final output is a trap. They can hit the right answer while doing nonsense under the hood: infinite loops, hallucinated tool calls, and wasted compute. If your evals don't score the execution trace, you're flying blind. #Agents #Evals #LLMs
0
0
0
11