fils

1.2K posts


@fils

Data Guy and wave man 浪人 San Diego Supercomputer Center (UCSD/SDSC) Research Data Services / Data Initiatives Group

Iowa · Joined April 2008

1.1K Following · 349 Followers

fils@fils·1d

You have got to be kidding.... news.slashdot.org/story/26/06/02… Here is the NY Times link (pay wall): nytimes.com/2026/06/01/cli…

fils retweeted

Earth Science Information Partners (ESIP)@ESIPfed·3d

🚀The ESIP July Meeting agenda is now live! Our sessions bring together the community for hands-on, interdisciplinary deep dives as we explore "Bridging Divides: Data, Technology, Community" this year.💡 2026julyesipmeeting.sched.com/list/simple


fils@fils·6d

OceanHackWeek 2026 (OHW26) will be held on August 24-28, 2026 at the Bamfield Marine Sciences Centre on the West Coast of beautiful Vancouver Island, British Columbia, Canada. oceanhackweek.org/ohw26/

fils retweeted

Gabe Grand@gabe_grand·6d

hot take: dynamic workflows is much better described as an instance of our DisCIPL framework, which predates both RLM and Opus 4.8 👀🙏 (arxiv v1 April 2025)

alex zhang@a1zhang

In case you're curious about why dynamic workflows are so powerful and the future, read the RLM paper! Opus 4.8 + dynamic workflows in Claude Code is perhaps the first instance of a frontier model seriously trained to be an RLM. I suspect within a year they'll just become the standard for nearly all coding agent interactions.


274

30.6K

fils retweeted

isaac 🧩@isaacbmiller1·28 May

DSPy v3.3.0 beta 1 is released on pypi! We would really appreciate your feedback! We are introducing ReActV2 and a much improved LM/BaseLM system, along with a way to pass data to an RLM. Thanks to @MaximeRivest, @kmad, and @mchonedev for their contributions. Install it with `pip install dspy==3.3.0b1`


203

19.3K

fils retweeted

Joshua Gu@astrogu_·21 May

Recent agentic systems (Claude Code, Codex, RLM, etc.) push context out of the prompt and into the environment (e.g., as files). This helps them maintain long-term knowledge about their goals and functionality.

🚨 While this is a good idea, we show a surprising result: systems that use external environments like this perform much better when given a small, fixed-size, in-context, agent-managed cache that "peeks into" these environments.

🚀 Our paper, PEEK: a system for building and maintaining an orientation cache for LLM agents, introduces this idea.

Compared with strong baselines, including RAG, Compaction Agents, and SOTA prompt-learning frameworks, PEEK dominates the cost–quality Pareto frontier: achieving +6.3–34.0% in quality, with fewer iterations and lower cost.

Paper: arxiv.org/abs/2605.19932
GitHub: github.com/zhuohangu/peek

More in the thread below! (1/N)
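To make the "small, fixed-size, in-context, agent-managed cache" idea concrete, here is a minimal Python sketch. It is not the PEEK implementation: the class name `OrientationCache`, the per-note character cap, and the least-recently-updated eviction rule are all assumptions chosen for illustration. The only properties taken from the post are that the cache is bounded, lives in the prompt, and is written by the agent itself.

```python
from collections import OrderedDict


class OrientationCache:
    """Hypothetical bounded cache of short notes about files in the environment."""

    def __init__(self, max_entries: int = 3, max_note_chars: int = 80) -> None:
        self.max_entries = max_entries          # fixed number of notes kept in context
        self.max_note_chars = max_note_chars    # cap on each note's length
        self._notes: "OrderedDict[str, str]" = OrderedDict()

    def note(self, path: str, summary: str) -> None:
        """Agent-managed write: record or refresh a short note about `path`."""
        self._notes[path] = summary[: self.max_note_chars]
        self._notes.move_to_end(path)           # mark as most recently updated
        while len(self._notes) > self.max_entries:
            self._notes.popitem(last=False)     # evict the stalest note (assumed policy)

    def render(self) -> str:
        """Text that would be prepended to the prompt on every step."""
        return "\n".join(f"{path}: {text}" for path, text in self._notes.items())


cache = OrientationCache(max_entries=2)
cache.note("goals.md", "Ship the data loader; tests must pass.")
cache.note("src/loader.py", "Entry point is load(); reads NetCDF.")
cache.note("NOTES.md", "Retry logic still missing.")   # pushes out goals.md
print(cache.render())
```

The full files stay in the environment; only `render()` occupies context, so prompt cost stays constant however large the workspace grows.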

352

109.1K

fils@fils·26 May

The ACM open access paper: The (R)evolution of Scientific Workflows in the Agentic AI Era: Towards Autonomous Science is a nice read. Some elements of section 3 are a bit deep for me. :) However, the rest is very easy to engage with. My workflow needs to rise to the level of ORNL, but much of what they talk about is broadly applicable. dl.acm.org/doi/full/10.11…


fils@fils·22 May

@KGConference I think this link is bad? Is this the correct one? docs.google.com/forms/d/1zK9mV…

The Knowledge Graph Conference (KGC)@KGConference·21 May

Pre-register: docs.google.com/forms/d/1zK9mV…

fils retweeted

The Knowledge Graph Conference (KGC)@KGConference·21 May

The Builder Summer Cohort is enrolling now. May 29 - Aug 21 | 4 micro-certificates | First one free Built for data scientists, knowledge engineers, and technical practitioners. 12 weeks, live + self-paced, includes a KGC 2027 virtual ticket.


181

fils retweeted

ACM Conference on AI and Agentic Systems@CAISconf·19 May

📋 The full CAIS '26 schedule is live. 61 peer-reviewed papers. 45 live system demos. Three keynotes. No pitch decks. No vibes-based benchmarks. No "AI-powered" anything without the receipts. This is what it looks like when the field stops performing and starts publishing. caisconf.org/schedule/2026/ We're nearly at our registration cap. Single-digit spots left. caisconf.org/registration/ San Jose · May 26–29


1.1K

fils@fils·19 May

It will be interesting to see the papers that come out of this: an IEEE Agentic AI for Large-scale Science workshop. Paper deadline July 13th. agent4sc.github.io

fils retweeted

Senzing, Inc.@senzing·15 May

Shell companies. Proxy owners. Fragmented registries. On 6/4, Senzing + Understand Beneficial Ownership break down #EntityResolution to find illicit finance — #BeneficialOwnership, sanctions screening, PEP matching & more. #GraphPowerHour w @pacoid hubs.li/Q04gQ7Fw0


175

fils@fils·15 May

Job Opportunity: Strategic Consultant, Open Science, Data Resilience (American Geophysical Union - AGU). I enjoyed being a part of AGU's related meeting in Berlin on this topic, and I am glad to see them make this position available to support the work. paycomonline.net/v4/ats/web.php…

fils retweeted

alex zhang@a1zhang·12 May

Some awesome initial experiments on training small RLMs :) A direction I think will be super super important moving forward for fully seeing the capabilities of RLMs vs. traditional agentic systems

alphaXiv@askalphaxiv

Reinforcing Recursive Language Models Can a 4B model learn to recursively call itself to answer hard long-context questions? We RL fine-tuned a small model to behave as a native RLM. On evidence selection across scientific papers, our 4B RLM matches Sonnet 4.6 in quality while running significantly faster and cheaper.


293

28.1K

fils retweeted

alex zhang@a1zhang·11 May

how did I miss this! related to training RLMs :)

Apurva Gandhi@apurvasgandhi

Sub-agents are a promising inference-time scaling primitive:
• Expand an agent's working memory
• Divide-and-conquer hard problems
• Solve problems faster with parallel execution

But how do we train a model to best take advantage of sub-agents and make sure we get these benefits?

Very excited to release RAO: Recursive Agent Optimization. RAO is an end-to-end reinforcement learning approach for training LLM agents to spawn, delegate to, and coordinate with recursive copies of themselves (that can themselves spawn other agents) - turning recursive inference into a learned capability. 1/10
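The three benefits listed above (more working memory, divide-and-conquer, parallel execution) can be illustrated with a small Python sketch of the inference-time recursion only. It says nothing about RAO's reinforcement learning, and it is not the authors' code. `CONTEXT_LIMIT`, `leaf_agent`, and `recursive_agent` are hypothetical names, and the "LLM call" is replaced by a plain keyword count so the example runs anywhere.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

CONTEXT_LIMIT = 4  # hypothetical: documents one agent can hold in working memory


def leaf_agent(docs: List[str], keyword: str) -> int:
    """Stand-in for a single model call over a context-sized batch."""
    assert len(docs) <= CONTEXT_LIMIT
    return sum(keyword in doc for doc in docs)


def recursive_agent(docs: List[str], keyword: str) -> int:
    """Answer directly if the task fits; otherwise delegate halves to copies of itself."""
    if len(docs) <= CONTEXT_LIMIT:
        return leaf_agent(docs, keyword)
    mid = len(docs) // 2
    # Each sub-agent may itself spawn further sub-agents, and the two run in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(recursive_agent, docs[:mid], keyword)
        right = pool.submit(recursive_agent, docs[mid:], keyword)
        return left.result() + right.result()


docs = [f"report {i}: " + ("ocean" if i % 3 == 0 else "land") for i in range(20)]
print(recursive_agent(docs, "ocean"))
```

In this sketch the splitting rule is hard-coded. As the post describes it, the point of training is for the model to learn when and how to delegate instead of following a fixed rule like this one.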

344

50.1K

fils@fils·10 May

Last Starfighter loses job to AI! A tragic story, all too common today. The Last Starfighter, high schooler Alex Rogan, has lost his job to AI. Read how Alex will be replaced as Google's DeepMind announces a plan to train AI on player actions in the quarter-million-player MMORPG Eve Online! Is no job safe?! tomshardware.com/tech-industry/…

fils@fils·10 May

So isn't "Code Mode" very analogous to RLM? If you take RLM via DSPy, for example, and then make tools and pass them in (ref: dspy.ai/api/modules/RL… ), it seems very similar. Is Code Mode part of Claude Code? If so, is it fair to say this is the same destination via two routes? Refs: * arxiv.org/abs/2512.24601 * youtube.com/watch?v=5RAFKE…

190

Akshay 🚀@akshay_pachaar·10 May

The MCP vs CLI debate.

For most of 2025, AI Engineers argued about it.

The skeptics had real numbers:
- Playwright MCP eats 13.7K tokens
- Chrome DevTools MCP eats 18K
- A 5-server setup burns 55K tokens before any work

The defenders pushed back:
- CLIs break on multi-tenant apps
- No typed contracts, so the agent guesses at outputs
- On unfamiliar APIs, agents waste turns parsing text

Both sides were arguing about the wrong thing.

In November 2025, Anthropic published "Code execution with MCP" and reframed it from first principles. The problem was never the protocol. It was the habit of dumping every tool's full description into the model's context the moment a session starts. Add the data those tools return, passed through the model on every step, and a single workflow can balloon to 150K tokens. Most of which the model never needed.

The fix is to flip the model's job. Instead of the model calling tools through its context, the model writes code that calls tools through a runtime. The runtime is where tools live. The model only sees what it imports.

In Anthropic's example, a Google Drive transcript flows into a Salesforce CRM update. The old way loaded both tool schemas and piped the entire transcript through the model twice. The new way is ten lines of TypeScript that import what they need. Same task, 2K tokens. A 98.7% drop.

Cloudflare pushed the idea to its limit. They collapsed their entire 2,500-endpoint API from 1.17M tokens of schemas down to 1K tokens, by exposing just two functions: search and execute. The agent writes code that searches the catalog, then executes only what matches.

The new pattern has a name: Code Mode. It is a runtime where the agent writes code that mixes two primitives. Bash, for anything with a binary already installed like git or curl. Typed module imports, for proprietary APIs where the type signatures load only when the agent actually imports the tool.

That second part is the unlock. Types travel with imports, so the agent gets a strict contract for the tools it picks, and pays nothing for the ones it skips. MCP's typed contracts plus CLI's lazy loading, in one runtime. The agent picks per task.

"MCP is dead" was the wrong takeaway. Anthropic just reported 300M MCP SDK downloads, up from 100M at the start of the year. The protocol is not dying. It is the fastest growing piece of agent infrastructure right now.

What died was loading every tool upfront. That was always a bad idea.

If you are building agents in 2026, the rule is simple. Tool definitions belong in code, not in context. The model writes a few lines that call them. The runtime does the rest.

That is what the debate was actually about.

Akshay 🚀@akshay_pachaar

x.com/i/article/2053…
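The "search and execute" pattern described in the post above can be sketched in a few lines of Python. This is a toy illustration, not Anthropic's or Cloudflare's API: the catalog entries, the tool names such as `drive.get_transcript`, and the `run_agent_code` helper are all invented for the example, and `exec` with a trimmed namespace is not a real sandbox. What it aims to show is that tool definitions and bulky intermediate data stay in the runtime, while only a small result returns to the model.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical catalog: tool name -> (description, implementation).
CATALOG: Dict[str, Tuple[str, Callable[..., Any]]] = {
    "drive.get_transcript": (
        "Fetch a meeting transcript by document id",
        lambda doc_id: f"transcript for {doc_id}: " + "blah " * 2000,
    ),
    "crm.update_record": (
        "Attach notes to a CRM record",
        lambda record_id, notes: {"record": record_id, "chars_saved": len(notes)},
    ),
    "mail.send": ("Send an email", lambda to, body: {"sent_to": to}),
}


def search(query: str) -> List[str]:
    """Return only the names of matching tools, not their full schemas."""
    q = query.lower()
    return [name for name, (desc, _) in CATALOG.items()
            if q in name.lower() or q in desc.lower()]


def execute(name: str, **kwargs: Any) -> Any:
    """Run one tool inside the runtime."""
    return CATALOG[name][1](**kwargs)


def run_agent_code(source: str) -> Any:
    """Run model-written code with only search/execute in scope; return `result`."""
    scope: Dict[str, Any] = {"__builtins__": {}, "search": search, "execute": execute}
    exec(source, scope)  # illustration only; a real runtime needs proper isolation
    return scope.get("result")


# Code the model might write: the large transcript never enters its context.
AGENT_CODE = """
transcript = execute(search("transcript")[0], doc_id="abc123")
result = execute(search("crm")[0], record_id="lead-7", notes=transcript)
"""

result = run_agent_code(AGENT_CODE)
print(result)
```

The transcript here is roughly ten thousand characters, yet the only thing that would flow back to the model is the small dictionary returned by the final call.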


489

67.1K

fils@fils·7 May

Hugging Face for Science at huggingscience.co. This is very interesting. I am exploring what an agent-optimized data repository looks like, so "Hugging Science" by Hugging Face caught my attention. It is, so they say, a site optimized for your AI agent, and it supports quite a few major domain-specific data formats with large file support (huggingface.co/docs/datasets/…). They have projects to get involved with, design challenges ( huggingscience.co/#/getting-star… ), etc. I don't see many geoscience datasets here yet. A call out to my community, I guess. Related paper: AI for scientific discovery is a social problem ( sciencedirect.com/science/articl… ). Is llms.txt still a thing? huggingscience.co/llms.txt

fils retweeted

Sooraj@iAnonymous3000·1 May

x.com/i/article/2050…


5.6K
