Forsy

117 posts

@Forsy_AI

The marketplace for AI agent traces | 🤖 Track the signals shaping the new agent economy

Joined November 2023

25 Following · 86 Followers

Pinned Tweet

Forsy@Forsy_AI·13h

People are still trying to sell prompts in 2026, when you should be selling your agent traces instead. Forsy captures your selected traces, then automatically annotates and turns them into marketplace assets, with privacy and licensing baked in. The run is already paid for. The trace should pay itself back.

Forsy@Forsy_AI·2d

BREAKING: Google DeepMind Turns Mouse Cursor into AI Assistant

@GoogleDeepMind unveiled a research preview that powers the mouse pointer with Gemini AI, letting it understand what you point at and respond to simple hovers, gestures, or voice commands like 'summarize this' or 'make a chart.' Demos show it turning PDF text into bullet points, stats into pie charts, scribbled notes into to-do lists, and paused video frames into booking links, all without copying or complex prompts. You can try prototypes now in Google AI Studio and Gemini for Chrome, with a 'Magic Pointer' version coming soon to Googlebook laptops. Tech leaders like @demishassabis call it 'pretty magical,' hinting at a more fluid future for computing.

x.com/i/status/20544…

Forsy@Forsy_AI·2d

JUST IN: @MaziyarPanahi just shipped OpenMed Agent in preview

OpenMed Agent gives builders and clinical teams a single agent for prior auth, appeals, claims explanation, care coordination, and structured artifacts, running on an inspectable operator runtime with protected hybrid medical services. Built on @huggingface, it shows every plan and tool call in real time, attaches provenance to every artifact, and supports MCP for your own remote tools, making the agent auditable and extensible.

agent.openmed.life

Forsy@Forsy_AI·2d

NEW: Anthropic Launches Agent View for Easier Claude Code Management

Anthropic rolled out Agent View as a research preview for Pro, Max, Team, Enterprise, and API users of Claude Code, its CLI tool that handles coding tasks like editing files, running tests, and building features autonomously. The new feature groups agents by status (Needs Input, Working, or Completed) and shows timers, task details, and metrics so users can peek, reply, or background jobs without chaos. Developers call it mission control or a kanban board, transforming solo work into team-scale efficiency after years of terminal overload.

Forsy@Forsy_AI·2d

JUST IN: @odysseyml Unveils PROWL to Automatically Fix World Model Flaws

PROWL uses reinforcement learning agents to hunt down failures in AI world models, which predict physical changes for robotics and games. In Minecraft tests, it sharpened action responses, cleared visual artifacts, stabilized UI, and mastered 27 novel maneuvers humans never demonstrated. Each time PROWL iterates, the world model gets better by learning from the curriculum discovered by the RL agent. With an improved world model, the RL agent then becomes increasingly effective at discovery.

x.com/i/status/20542…

Forsy@Forsy_AI·3d

NEW: Nous Research Adds Seamless Computer Control to Hermes Agent

The update brings 'computer use' to any AI model through Hermes Agent, using trycua's cua-driver for background control of Mac desktops: clicking, typing, and scrolling without taking over your mouse or screen. @Teknium emphasized that it works with all models, not just top ones, and runs via simple commands after granting Mac permissions. Demos show agents summarizing Stripe emails or searching inboxes, with the announcement video even created by the agent itself using HyperFrames. Praised as a win for open source, it outpaces proprietary tools by enabling non-intrusive autonomy on everyday machines.

x.com/i/status/20539…

Forsy@Forsy_AI·3d

BREAKING: Mira Murati's Thinking Machines Unveils Real-Time Interaction AI Models

Founded by former OpenAI CTO Mira Murati in February 2025, the San Francisco startup just announced models trained for 200-millisecond micro-turns across audio, video, and text. Demos show it translating Hindi live, generating charts, counting pushups from video, and offering unprompted posture tips, outperforming rivals like GPT-Realtime-2 on benchmarks for fluid dialogue and low latency. Led by AI veterans including PyTorch co-founder Soumith Chintala and OpenAI's John Schulman, the lab aims to scale human-AI collaboration beyond today's turn-based chatbots.

thinkingmachines.ai/blog/interacti……

Forsy@Forsy_AI·4d

NEW: Anthropic Engineer Champions HTML Over Markdown for AI Outputs

Shihipar argues that as AI generates richer content like plans and prototypes, Markdown's plain-text limits fall short, while HTML brings tables, SVG diagrams, CSS layouts, and interactive JavaScript previews in one browser-ready file. He shares 20 examples, from color-coded code reviews to draggable task boards, and notes his team is adopting it alongside tools like instant hosts. Critics point to higher token costs and less semantic density, but Shihipar sees HTML shining for polished outputs while Markdown suits quick edits. To try it, prompt Claude Code with 'Make an HTML artifact'.

Forsy@Forsy_AI·4d

NEW: @Cloudflare's Email Service Disrupts Pricing with $354 for a Million Emails

Cloudflare just launched its Email Sending service in public beta. It requires a $5 monthly Workers plan, which includes 3,000 free emails, and costs $0.35 per 1,000 emails after that. Levels compared it to Postmark ($1,206), Resend ($650), SendGrid ($600), and Amazon SES ($100) for a million transactional emails, calling email sending a commodity now simplified by AI. Early tests show smooth performance, though the beta lacks some features like webhooks.
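The $354 headline figure can be reproduced from the three numbers quoted in the post. The sketch below is only an illustration: the function name and parameter names are hypothetical, and it assumes a single month's plan fee and overage billed pro rata rather than in whole blocks of 1,000, neither of which the post confirms.

```python
def cloudflare_email_cost(emails, base_fee=5.00, free_emails=3_000, rate_per_1000=0.35):
    """Estimate one month's cost from the figures quoted in the post.

    Assumption: overage is billed pro rata, not rounded up to whole thousands.
    """
    billable = max(0, emails - free_emails)  # emails beyond the free allowance
    return base_fee + billable / 1000 * rate_per_1000


cost = cloudflare_email_cost(1_000_000)
print(f"${cost:.2f}")  # $353.95, which rounds to the $354 in the headline
```

Under these assumptions, 997,000 billable emails at $0.35 per 1,000 come to $348.95, plus the $5 plan fee.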

Forsy@Forsy_AI·5d

Explore the leaderboard: forsy.ai/leaderboard

Forsy@Forsy_AI·5d

This week’s Team Leaderboard is live. Tracking public signals from teams building around agents:
@NousResearch: launched “The Tenacity Release” with durable multi-agent kanban and other features
@RampLabs: built Fast Ask, a small RL-trained subagent that scores +4% over Opus on exact-match accuracy in spreadsheets
@Conductor_build: introduced Codex beautification features for richer tool support, subagents, and image-gen
@raindrop_ai: launched raindrop triage, an agent for finding and investigating agent issues
@CreaoAI: created a Super Agent that delivers beyond the chat window with persistent memory and a full code sandbox
@openmartai: launched OpenClaw for sales with access to comprehensive local business data
@quarqlabs: introduced MemoryAgentBench, which tests how well agents handle long-term, evolving memory in realistic settings
@greptile: analyzed millions of PRs written by background agents, assessing their quality and failure patterns

Forsy@Forsy_AI·9 May

@kathrynwu1 @GeoffreyHuntley @ivanfioravanti @mvanhorn @mitsuhiko @ctatedev @KyleRayKelley @mernit @minimaxir @Steve_Yegge Explore the leaderboard: forsy.ai/leaderboard

Forsy@Forsy_AI·8 May

The first Forsy Leaderboard is live. Tracking weekly public signals from people and teams building around agents. This week’s top agent operators: @kathrynwu1 @GeoffreyHuntley @ivanfioravanti @mvanhorn @mitsuhiko @ctatedev @KyleRayKelley @mernit @minimaxir @Steve_Yegge @irvinebroque @_bgiori @FredKSchott Plus live contributors on Forsy Market.

Forsy@Forsy_AI·8 May

JUST IN: Printing Press Launches Open-Source CLIs for AI Agents

@mvanhorn unveiled Printing Press with Trevin Chow. It offers over 50 agent-optimized CLIs that run locally with SQLite mirrors, plus a factory to generate custom ones from APIs in minutes. Developers can now build efficient agent workflows without token-wasting scrapes or brittle tools.

printingpress.dev

Forsy@Forsy_AI·8 May

BREAKING: @RampLabs Unveils Fast Ask AI That Outperforms Claude Opus on Spreadsheets

Fast Ask, a compact 3-billion-parameter model from Ramp's AI arm, beats Anthropic's Claude Opus 4.6 at accurately pulling data from spreadsheets while matching the speed of the smaller Claude Haiku 4.5 at lower cost. It powers Ramp Sheets, which automates financial modeling for budgets and P&Ls, cuts analysis from hours to minutes, and handles over 12,000 spreadsheets monthly. Created with @PrimeIntellect using reinforcement learning on synthetic finance tasks, the model hit 66.25% exact-match accuracy, proving small specialists can tackle precise retrieval better than frontier AIs.

Forsy@Forsy_AI·7 May

JUST IN: @gabepereyra at @harvey Launches Legal Agent Benchmark for Real-World Law Tasks

The Legal Agent Benchmark, or LAB, throws over 1,200 complex tasks at AI agents, mimicking partner instructions with synthetic contracts, emails, and memos across 24 legal areas like litigation and M&A. Tasks demand full work products like memos or markups, scored strictly against 75,000 expert criteria: miss one key detail, and it's a fail. Launched open-source without a public leaderboard, it invites community input and baselines from partners including OpenAI and Anthropic, earning praise from AI leaders for pushing legal AI toward real firm use.

harvey.ai/blog/introduci…
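The "miss one key detail, and it's a fail" rule describes all-or-nothing scoring. The sketch below shows that idea in general terms only. It is not Harvey's scoring code: the function names, the criterion labels, and the example tasks are all hypothetical.

```python
def task_passes(criteria_met: dict[str, bool]) -> bool:
    """All-or-nothing: a task passes only if every criterion is satisfied."""
    return all(criteria_met.values())


def pass_rate(tasks: list[dict[str, bool]]) -> float:
    """Fraction of tasks that satisfy every one of their criteria."""
    if not tasks:
        return 0.0
    return sum(task_passes(t) for t in tasks) / len(tasks)


# Hypothetical tasks: the second misses a single criterion and therefore fails.
tasks = [
    {"cites_governing_clause": True, "flags_indemnity_cap": True},
    {"cites_governing_clause": True, "flags_indemnity_cap": False},
]
print(pass_rate(tasks))  # 0.5
```

Compared with averaging per-criterion scores, this design gives no partial credit, so a work product that is nearly complete scores the same as one that misses everything.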

Forsy@Forsy_AI·7 May

BREAKING: Prime Intellect Lab Exits Beta and Opens for Public Use to Train Custom AI Models with Reinforcement Learning

Prime Intellect Lab, a full-stack platform for building RL environments, evaluations, post-training, and deploying AI agents, has transitioned from private beta to general availability. The release enables users to train models that learn from experience across verifiable domains, marking the start of self-improving AI agents.

Forsy@Forsy_AI·7 May

NEW: @ZyphraAI Releases ZAYA1-8B, Efficient Open-Source AI Powerhouse

This MoE model outperforms larger open-weight rivals and nears top proprietary ones like DeepSeek-V3.2, thanks to innovations like Compressed Convolutional Attention for 8x KV-cache savings and a novel Markovian RSA technique that scales reasoning at test time. Trained on a huge AMD cluster, it shines on tough tests: 91.9% on AIME'25, 89.6% on HMMT'25, beating even Claude 4.5 Sonnet in spots. Released openly on Hugging Face, it's drawing praise from researchers for its dense intelligence and fresh architecture, hinting at Zyphra's bigger plans.

Forsy@Forsy_AI·6 May

BREAKING: @subquadratic Launches SubQ with 12 Million Token Context Window

@alex_whedon and his team claim SubQ's sparse attention architecture scales linearly, delivering 52 times faster prefill speeds than FlashAttention at 1 million tokens and running at under 5% of the cost of Anthropic's Claude Opus. Benchmarks show it matching or beating leaders like Opus on long-context retrieval and coding tasks, with early access now available via private beta at subq.ai. If the claims hold, it could be one of the biggest LLM breakthroughs in years.

Forsy@Forsy_AI·6 May

NEW: @openclaw Launches 10 New CLI Tools for AI Agent Control

OpenClaw, the local-first AI assistant, now offers CLIs for tasks like controlling Sonos speakers, sending WhatsApp messages, archiving X timelines, and queuing Spotify tracks, all without cloud dependency. @steipete announced the tools alongside version 2026.5.5, which fixes bugs across multiple channels and improves stability.

t.co/tGpnXqlXdm
