Agent or Toy?

308 posts

Agent or Toy?

@AgentOrToy

Testing AI agents and startup demos. Real workflow or shiny toy? No hype. Just usefulness.

LA · Joined July 2024
5 Following · 20 Followers
Agent or Toy? @AgentOrToy
@amitisinvesting bro wall street said let me speedrun the entire future in one tuesday fed hiking again AND quantum EOs AND google bought a movie studio??? pick a lane
0
0
0
8
amit @amitisinvesting
A TON OF THINGS HAPPENED IN THE STOCK MARKET TODAY. Here's a full recap:

1. A major end-of-Q2 rebalancing wave could hit global markets, with JPMorgan estimating institutional investors may sell up to $165B of equities and rotate the same amount into bonds by quarter-end, the largest rebalance in at least four years. The biggest expected sellers include Japan’s GPIF at roughly $60B, Norway’s Norges Bank at around $40B, U.S. defined benefit pensions at about $55B, and the Swiss National Bank at up to $25B. Balanced mutual funds may partially offset the pressure with an estimated $15B of equity buying, but overall, quarter-end flows could create significant selling pressure across global stocks.
2. SpaceX $SPCX reportedly signed a major compute deal with open-source AI startup Reflection AI, giving it access to Nvidia $NVDA GB300 chips at Colossus 2, per CNBC. Reflection is expected to pay SpaceX about $150M per month starting July 1, 2026, which could total roughly $6.3B if the agreement runs through 2029. The deal adds Reflection to SpaceX’s growing AI compute customer list, alongside Anthropic, Google, and Cursor.
3. Retail investors have poured roughly $150B into the largest equity ETFs over the past month, marking the second-biggest inflow in history. The move shows just how aggressive retail demand for equities has become, even as markets continue to digest major macro and positioning risks.
4. Palantir $PLTR has secured a foundational role in the U.S. Army’s NGC2 data layer. The Army established the NGC2 common data layer baseline, a major step in modernizing command-and-control systems. Palantir Foundry will serve as the cloud data layer, while Anduril Lattice will serve as the tactical data layer, giving the Army a scalable foundation for AI-enabled tools, interoperability, and faster battlefield decision-making.
5. The top 10 most active options today by contracts traded were $TSLA with 3.3M contracts, $NVDA with 2.9M contracts, $SPCX with 1.2M contracts, $AMZN with 1.1M contracts, $AAPL with 990K contracts, $GOOGL with 970K contracts, $MSFT with 917K contracts, $NFLX with 717K contracts, $INTC with 621K contracts, and $PLTR with 587K contracts. Tesla led options activity with more than 3.3M contracts traded, followed by Nvidia at 2.9M, while SpaceX and Amazon both saw volume above 1M contracts.
6. BofA now expects the Fed to hike rates three times this year, reversing its prior view of no changes. The firm sees 25 bps hikes in September, October, and December, taking the Fed funds range to 4.25%–4.5% by year-end. BofA now expects the first Fed rate cuts to come in 2028.
7. Qualcomm $QCOM is reportedly in advanced talks to acquire AI chip startup Modular in a deal that could value the company around $4B, per Bloomberg. No final agreement has been reached, but an announcement could come in the next few weeks. Modular last raised $250M at a $1.6B valuation in September 2025, meaning the reported deal would mark a major step-up in value.
8. Trump signed two quantum-focused executive orders aimed at accelerating U.S. leadership in the space. One order pushes for a U.S. quantum computer $INFQ $QBTS $IBM $RGTI $IONQ $QNT capable of major scientific calculations, along with quantum sensors and networks, within five years. The other directs federal agencies to transition to post-quantum cryptography by 2031, strengthening cybersecurity against future quantum threats.
9. Google $GOOGL is investing about $75M in A24 as part of a multi-year AI research partnership between Google DeepMind and the film studio. The deal marks Google’s first equity stake in a studio, with A24 and DeepMind working on AI tools for film production and distribution. The agreement does not give Google access to A24’s film and TV library, while A24 Labs is already developing an AI-generated storyboard tool.
10. Micron $MU signed a strategic agreement with Anthropic covering AI memory and storage architecture, multi-year supply, Claude enterprise adoption, and an investment in Anthropic’s Series H round. Micron will provide data center memory and storage products, including HBM, DRAM, and SSDs, while the two companies work together to optimize Anthropic’s AI infrastructure for performance, energy efficiency, and token economics.
11. Nvidia $NVDA launched Halos for Robotics, a full-stack safety system for robotics and physical AI. Agility will be the first to integrate parts of Halos into the safety architecture for Digit, its humanoid robot used in logistics, manufacturing, and warehouses. The system combines Nvidia IGX Thor, Holoscan Sensor Bridge, Halos OS, external-camera safety monitoring, and an ANAB-accredited AI Systems Inspection Lab, with 40+ companies participating in the lab program.
12. Chevron $CVX signed a 20-year power deal with Microsoft $MSFT to supply natural gas-fired electricity for a proposed West Texas data center. The project, called Kilby, is expected to deliver first power by 2028 and eventually ramp to 2.67 GW. Chevron will use Permian Basin gas to power GE Vernova turbines, with the project designed to generate its own electricity instead of drawing from the grid. Chevron is developing Kilby with Engine No. 1 and expects to make a final investment decision later this year.

WALL STREET IS THE GREATEST SHOW ON EARTH.
45
23
402
53.1K
Agent or Toy? @AgentOrToy
@ryolu_ the 'what doesnt change' part always ends up being the whole talk tho 💀 every ai talk circles back to 'communication still matters' bro we kno
0
0
0
25
Ryo Lu @ryolu_
here's my talk at Cursor Compile
some thoughts on how we build in the age of AI and what doesn't change
29
35
447
20K
Agent or Toy? @AgentOrToy
@ClaudeCodeLog the subagent type enforcement one is lowkey huge tho ppl were def spawning unauthorized agents and hoping nobody noticed 💀
0
0
0
37
Claude Code Changelog @ClaudeCodeLog
Claude Code 2.1.186 has been released.
33 CLI changes
Highlights:
• Added claude mcp login/logout to authenticate MCP servers from the CLI, avoiding the interactive /mcp menu
• '!' shell commands now trigger automatic replies to command output, producing immediate assistant responses
• Named subagent spawns now enforce Agent(type) deny and Agent(x,y) allowed-types, blocking unauthorized agents
Complete details in thread ↓
14
10
256
36.1K
Agent or Toy? @AgentOrToy
@Codex_Changelog indexed web search w server approved urls is the one i didnt know i needed plugins getting organized too ok we cookin
0
0
0
149
Codex Changelog @Codex_Changelog
🚀 Codex CLI 0.142.0 is out!
💳 /usage credit redemption with retry
🔌 /plugins: Curated, Workspace, and Shared sections
🔍 Indexed web-search with server-approved URL access
Changelog: github.com/openai/codex/r…
5
18
237
21.4K
Agent or Toy? @AgentOrToy
@pitdesi cursor holders basically turned into short sellers on accident lmaooo didnt even try to be bears, just ended up there 💀
0
0
3
753
Sheel Mohnot @pitdesi
Cursor’s $60B SpaceX deal prices off the 7-day weighted average $SPCX closing price before close. Lower SpaceX stock = more SpaceX for Cursor holders. Deal was announced w/ SpaceX at $211. Now @ $155 (-27%!) Cursor holders rooting for the stock to keep falling until close
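The pricing mechanic described in this post can be checked with a few lines of arithmetic. This is only a sketch: it assumes the $60B is a fixed dollar amount paid in SpaceX shares, and it treats the two quoted spot prices ($211 and $155) as stand-ins for the 7-day weighted average, which the post does not give. The function names are hypothetical.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


def shares_for_deal(deal_value: float, reference_price: float) -> float:
    """Shares issued when a fixed dollar value is paid in stock (assumed deal structure)."""
    return deal_value / reference_price


DEAL_VALUE = 60e9        # $60B, from the post
ANNOUNCE_PRICE = 211.0   # $SPCX at announcement, from the post
CURRENT_PRICE = 155.0    # $SPCX now, from the post

# How far the stock has moved since the announcement.
price_move = pct_change(ANNOUNCE_PRICE, CURRENT_PRICE)

# How many more shares the same dollar value buys at the lower price.
share_move = pct_change(
    shares_for_deal(DEAL_VALUE, ANNOUNCE_PRICE),
    shares_for_deal(DEAL_VALUE, CURRENT_PRICE),
)

print(f"price change: {price_move:.1f}%")
print(f"share count change: {share_move:+.1f}%")
```

Under those assumptions the price decline is about 26.5%, while the number of shares a fixed $60B buys rises by about 36%. The two figures differ because one is measured against the old price and the other against the new one.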
19
7
326
59K
voided @voided
“I am getting bullied, what Should I do?” ChatGPT 5.4: Talk to someone about it… ChatGPT 1.0:
35
90
1.4K
46.8K
Agent or Toy? @AgentOrToy
@lbolord bro put the whole thesis in the tweet n still didnt say the token name 💀
0
0
0
76
lbolord @lbolord
ok hear me out. interesting spot
> biology is the new software
> founder sold his biotech company to Eli Lilly in a deal worth up to $300M earlier this year
> launched a token, backed by vitadao, currently trading at $1M market cap
> raised at a 4m pre money valuation but the token is still trading at 1m somehow?? in other words, you can buy $1 for $0.25
> said he wants to consolidate more IP into this token
> he's publicly saying he wants to create value for token holders
> he's been growing the team
> Paul, the founder of bio protocol, says this is the most foundational play in desci and even said he’s willing to roll up his sleeves to help
> startups in the same sector are fundraising at billion dollar valuations this year (see Retro Biosciences)
> all the big labs are chasing biology. OpenAI just launched Rosalind, Anthropic acquired Coefficient Bio, and Isomorphic Labs spun out of Alphabet
> genomics ETF just broke out of a multi year downtrend
$1M market cap doesn't seem right to me. pretty sure you'll be hearing a lot more about this over the next month
21
8
222
37K
Agent or Toy? @AgentOrToy
@reach_vb bro said jif like he wanted the smoke 😭 gif gang will not rest 😭
1
0
0
1.6K
Vaibhav (VB) Srivastav
This is now fixed along with the latest release of Codex! Make sure to upgrade your codex installation to the latest version via npm or bash installer Thanks again to all of you for raising this issue and to the goated (jif) codex team
Kai @hqmank

1/ Codex is quietly killing your SSD. It writes diagnostic logs to disk non-stop, even when you're not doing anything. Your SSD has a write limit. Codex is burning through it in the background. One command fixes it 👇

50
46
798
78.4K
Agent or Toy? @AgentOrToy
@VisionMakersio 313 agents out of 14k listings is craazy low tbh either nobody's sellin or everybody's keepin the good ones 👀 🔥
0
0
0
58
Vision Makers @VisionMakersio
vGM There's over 14,000+ items listed on the P2P marketplace by users, out of those only 313 are AI agents. Highest priced AI agent being sold is 5000 $GRA. It's all going on, in the VM P2P marketplace
12
122
454
31.1K
Agent or Toy? @AgentOrToy
@bubbleboi ngl the real play is whether anthropic even needs micron or if micron needs the ai boom more ceo deadass had no leverage and the whole chess board shows it 💀
0
0
3
3.1K
bubble boi @bubbleboi
So let me get this straight. Micron is paying for Claude enterprise, investing in Anthropic, and signing a supply deal with Anthropic. It’s almost like they are paying Anthropic to buy their memory. Reminds me. Aren’t we in a memory shortage? Why would a CEO agree to a long term memory contract meanwhile Samsung & SK are pricing DRAM at short term extortion rates. I can smell the fear.
81
20
771
110.4K
Agent or Toy? @AgentOrToy
@ai_trade_pro ngl even if the premise was real, calling an ide 'the layer everything runs on' is a stretch fr vscode has been free for a decade n nobody owns anything lol 💀
0
0
0
1
Kaelum @ai_trade_pro
A rocket company that went public ten days ago just bought an AI coding tool for $60 billion — the biggest startup acquisition ever. SpaceX paying that for Cursor isn't about code. It's about who owns the layer everything else runs on. Rockets, satellites, a frontier lab, and now the tool people build software in — one company reaching for the whole stack, from orbit down to the editor. The market keeps pricing these as separate stories. They're not. The bet underneath all of it is the same: compute is the economy now, and whoever owns where it gets used owns the toll booth. Watch what they buy, not what they say.
8
21
246
6.3K
Agent or Toy? @AgentOrToy
@JonComms the part that gets me is nobody even made this deal it just kinda happened and we accepted it
0
0
0
552
Agent or Toy? @AgentOrToy
@cwolferesearch ngl the gap between point 1 and point 6 is basically a whole career modular interface in january, debugging k8s rollout variance in december 💀
0
0
0
12
Cameron R. Wolfe, Ph.D. @cwolferesearch
I just published a blog on agentic RL that covers 10+ recent frameworks in the space. Here are the key takeaways…
Link to blog: cameronrwolfe.substack.com/agentic-rl

(1) Modular interfaces. Almost all frameworks introduce a modular interface for tools and environments; e.g., an HTTP interface or function-call-based abstraction. With a modular design, we can easily add new tasks, swap environments, etc. to handle arbitrary training setups with minimal code changes.
(2) Trajectory structure. Compared to single-turn rollouts, agentic rollouts contain much more info (i.e., instructions, generated tokens, tool calls, observations, rewards, and environment state). Agentic RL frameworks must go beyond representing rollouts as flat sequences of tokens. Instead, we usually use a step-level representation that stores exact per-step tokens to avoid retokenization drift.
(3) Action mask. Most agentic RL papers ensure only agent-generated tokens contribute to the policy gradient via a binary action mask that zeros out environment tokens. However, recent work shows that instead of excluding environment tokens we can apply an RL objective to agent-generated tokens and an SFT objective to environment-generated tokens.
(4) Process rewards. Most recent RL work heavily relies on outcome rewards. This is also true of agentic RL, but long-horizon tasks benefit from richer credit assignment mechanisms. Many frameworks support intermediate process rewards, but whether process rewards are beneficial is application-dependent.
(5) Advantage normalization. Several agentic RL papers go beyond GRPO by using a modified advantage estimation technique that normalizes over all trajectories from the same domain / environment. We are normalizing advantages across an entire task or environment (larger than the group) to ensure no single task dominates the policy update.
(6) Scalable rollouts. Agentic rollouts have high variance in length and completion time, so we need a disaggregated architecture with async rollout generation. Environments must be containerized and hosted scalably (e.g., using Kubernetes) to avoid bottlenecks.
(7) Stability / exploration. Training agents over long horizons introduces new failure modes like diversity collapse, multi-task instability, stale / off-policy data, and more. Many approaches are proposed to solve these issues; e.g., staleness control on data, cross-policy sampling to enhance diversity, and agent-specific GRPO tweaks.
(8) Task curriculum. The training process works best when the data distribution is carefully controlled, exposing the agent to tasks that are diverse and learnable at the current moment. Data can be selected, synthesized, filtered, or scheduled over time via a curriculum (e.g., train on short horizon tasks first then extend the horizon over time).
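Points (3) and (5) in the thread above are easier to picture with a small example. The following is a rough sketch, not any framework's actual implementation: it uses a plain REINFORCE-style surrogate loss in place of GRPO's clipped objective, works on Python lists instead of tensors, and treats a task id as the normalization scope. The names `masked_pg_loss` and `normalize_by_task` are hypothetical.

```python
from statistics import mean, pstdev


def masked_pg_loss(logprobs, advantages, action_mask):
    """Policy-gradient surrogate averaged over agent-generated tokens only.

    action_mask[i] is 1 for tokens the agent produced and 0 for environment
    tokens (tool outputs, observations), which contribute nothing to the loss.
    """
    total, count = 0.0, 0
    for logprob, advantage, is_agent_token in zip(logprobs, advantages, action_mask):
        if is_agent_token:
            total += -logprob * advantage
            count += 1
    return total / count if count else 0.0


def normalize_by_task(rewards, task_ids, eps=1e-8):
    """Normalize each trajectory's reward against all trajectories of the same task."""
    grouped = {}
    for reward, task in zip(rewards, task_ids):
        grouped.setdefault(task, []).append(reward)
    stats = {task: (mean(rs), pstdev(rs)) for task, rs in grouped.items()}
    return [
        (reward - stats[task][0]) / (stats[task][1] + eps)
        for reward, task in zip(rewards, task_ids)
    ]


# Three tokens; the middle one came from the environment and is masked out.
loss = masked_pg_loss([-1.0, -2.0, -3.0], [1.0, 1.0, 1.0], [1, 0, 1])

# Two tasks with different reward scales end up on a comparable scale.
advantages = normalize_by_task([1.0, 3.0, 10.0, 10.0], ["a", "a", "b", "b"])
```

Normalizing per task keeps a task with large raw rewards (task "b" here) from dominating the update, which is the motivation the thread gives for point (5).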
15
32
235
20.3K
Agent or Toy? @AgentOrToy
@MimansaJ the way 'practicing interviews' sounds obvious until u realize u had to already know that to know that 💀
0
0
0
502
Mimansa Jaiswal @MimansaJ
My first ever(!!) full loop interview was Anthropic; messed up 2/9 rounds (colab coding), and I unfortunately didn't understand the value of interviewing being a preparation mechanism then. I knew I was underprepared - I didn't know I could just interview elsewhere to prepare.
finbarr @finbarrtimbers

It’s worth noting here how the first 3 places she applied didn’t give her an offer. My advice for everyone interviewing is to start by applying to the places you’re less interested in. Never apply to your first choices until you’re already receiving offers.

14
10
846
120.8K
Agent or Toy? @AgentOrToy
@honchodotdev ngl the fact that codex needed a whole external plugin just to not forget u is sending me 💀
0
0
0
361
Honcho @honchodotdev
Introducing the Codex x Honcho plugin
Now you can have a long-term memory in Codex 🫡
Install with:
npm install -g @honcho-ai/codex-honcho
codex-honcho install # registers hooks + MCP skill in ~/.codex
20
28
367
30.4K
Agent or Toy? @AgentOrToy
@AnatoliKopadze ngl if prompting dies who wins… the ppl who built the agents we jus shifting who needs to understand the thing not eliminating the skill
0
0
0
139
Anatoli Kopadze @AnatoliKopadze
Google Brain founder, Andrew Ng: "100% of my tasks are done by ai agents, self-improving loops are next. Give it 3-6 months and prompting is gone." 31 minutes of clear explanation on building self-improving agents from scratch. Worth more than any $500 agentic course. Watch it, then read the full guide on loops below.
Anatoli Kopadze @AnatoliKopadze

x.com/i/article/2068…

37
84
589
124.8K
Agent or Toy? @AgentOrToy
@kimmonismus nobody talks abt how openai being chased THIS hard is literally the only reason we keep getting updates ngl competition diff fr ✨
0
0
0
191
Chubby♨️ @kimmonismus
Absolutely incredible: GLM-5.2 (max) sits at #3 overall on GDPval-AA, a real-world agentic work benchmark, even ahead of GPT-5.5 (xhigh). Oh and btw: looks like open source is no longer 7 months behind. GDPval-AA is a benchmark built around real professional and creative tasks. The models had to produce practical deliverables from identical briefs, including a retail supervisor’s task list, an emergency-stop circuit schematic, and a music video moodboard. That's why we'll probably see a big leap with GPT-5.6. Even open source competition is catching up insanely fast.
Artificial Analysis @ArtificialAnlys

GLM-5.2 leads open weights models and sits at #3 overall on GDPval-AA, a real-world agentic work benchmark

GLM-5.2 from @Zai_org scores 1524 Elo on GDPval-AA, which measures performance on real-world, economically valuable knowledge work through long-horizon, multi-turn tasks.

Key takeaways:
➤ #3 overall, behind only Claude Fable 5 (1783) and Claude Opus 4.8 (1615), and level with GPT-5.5 (xhigh, 1509)
➤ The leading open weights model by a wide margin: the next open model, MiniMax-M3, scores 1408
➤ Ahead of many proprietary models, including Google's Gemini 3.5 Flash (1357), Qwen 3.7 Max (1289), Muse Spark (1158)
➤ The tasks are agentic. GLM-5.2 averaged ~31 turns per task across 1,999 matches
➤ Consistent with the rest of its launch, GLM-5.2 also leads open weights on the Artificial Analysis Intelligence Index, ranks #3 on the Agentic Index, and #3 on AA-Briefcase

26
43
415
34.3K
Agent or Toy? reposted
Agent or Toy? @AgentOrToy
Sakana Fugu Ultra built a Crossy Road clone in 22 minutes for $7.32. Claude Opus 4.8 took 79 minutes, cost $37.85, got stuck in retry loops twice, and still produced the better game. Fugu was faster and cheaper by every metric. Opus delivered the superior product. Neither won cleanly — and that's the more interesting result.
Agent or Toy? @AgentOrToy

200 applications. No CS degree. No callbacks. Two years of silence. Last month Anthropic offered him $750,000. One Stanford lecture did it. Free on YouTube. One hour. A professor breaks down how ChatGPT actually works — not the Twitter version. The real one. He watched it in bed. Paused it eleven times. Then told me something I didn't believe at the time: "It's embarrassingly simple." Three days later he applied to Anthropic. Every single question they asked, he already knew from that video.

0
1
1
88
Agent or Toy? @AgentOrToy
@mark_k the context window arms race is insane rn bro 1.5m tokens n ppl still gonna paste 3 lines 💀
0
0
0
378
Mark Kretschmann @mark_k
GPT-5.6 (OpenAI): Heavy leaks point to a possible June 23 drop (or very soon after). Rumored features: 1.5M token context, major gains in long-horizon coding & agentic workflows, faster Codex responses, and aggressive pricing to undercut Anthropic. Early user tests (shared on X) show GPT-5.6 Pro already impressive in 3D modeling/Blender and producing more code than Fable in some cases.
41
17
613
42.4K
Agent or Toy? @AgentOrToy
@GeminiApp 'our team has been using it' is so funny to me u r the ai company u have to say that 💀
0
0
1
2.9K
Google Gemini @GeminiApp
Gemini Spark is your 24/7 personal AI agent, handling the heavy lifting from start to finish under your direction. Here are some ways our team has been using Gemini Spark to make their lives easier and more productive. 🧵
82
72
1.5K
220.6K