fils

1.2K posts

@fils

Data Guy and wave man 浪人 San Diego Supercomputer Center (UCSD/SDSC) Research Data Services / Data Initiatives Group

Iowa · Joined April 2008
1.1K Following · 349 Followers
fils @fils
OceanHackWeek 2026 (OHW26) will be held on August 24-28, 2026, at the Bamfield Marine Sciences Centre on the west coast of beautiful Vancouver Island, British Columbia, Canada. oceanhackweek.org/ohw26/
0
0
0
28
fils reposted
isaac 🧩 @isaacbmiller1
DSPy v3.3.0 beta 1 is released on pypi! We would really appreciate your feedback! We are introducing ReActV2 and a much improved LM/BaseLM system, along with a way to pass data to an RLM. Thanks to @MaximeRivest, @kmad, and @mchonedev for their contributions. Install it with `pip install dspy==3.3.0b1`
5
26
203
19.3K
fils reposted
Joshua Gu @astrogu_
Recent agentic systems (Claude Code, Codex, RLM, etc.) push context out of the prompt and into the environment (e.g., as files). This helps them maintain long-term knowledge about their goals and functionality.

🚨 While this is a good idea, we show a surprising result: systems that use external environments like this perform much better when given a small, fixed-size, in-context, agent-managed cache that "peeks into" these environments.

🚀 Our paper, PEEK: a system for building and maintaining an orientation cache for LLM agents, introduces this idea. Compared with strong baselines, including RAG, Compaction Agents, and SOTA prompt-learning frameworks, PEEK dominates the cost–quality Pareto frontier: achieving +6.3–34.0% in quality, with fewer iterations and lower cost.

Paper: arxiv.org/abs/2605.19932
GitHub: github.com/zhuohangu/peek

More in the thread below! (1/N)
17
37
352
109.1K
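
The idea in Joshua Gu's post above, a small, fixed-size, agent-managed cache that sits in context while the bulk of the state stays in the environment, can be illustrated with a toy sketch. This is not PEEK's implementation: the class `OrientationCache`, its methods, and the eviction rule (drop the oldest note) are all hypothetical choices made only to show what "fixed-size" and "agent-managed" could mean in practice.

```python
class OrientationCache:
    """Hypothetical fixed-size cache of short notes about an external environment.

    Illustrative only; not taken from the PEEK paper or repository.
    """

    def __init__(self, max_entries=4, max_chars=80):
        self.max_entries = max_entries  # hard cap on the number of notes
        self.max_chars = max_chars      # hard cap on the length of each note
        self._notes = {}                # dicts preserve insertion order

    def note(self, path, summary):
        """Record or refresh a short note about one item in the environment."""
        self._notes.pop(path, None)     # re-adding moves the note to the newest slot
        self._notes[path] = summary[: self.max_chars]
        while len(self._notes) > self.max_entries:
            oldest = next(iter(self._notes))
            del self._notes[oldest]     # assumed policy: evict the oldest note

    def forget(self, path):
        """Let the agent drop a note it no longer needs."""
        self._notes.pop(path, None)

    def render(self):
        """Return the text block that would be placed in the prompt."""
        lines = [f"- {path}: {summary}" for path, summary in self._notes.items()]
        return "ORIENTATION CACHE\n" + "\n".join(lines)
```

In a loop like this, the agent would call `note` and `forget` as tool actions and `render()` would be prepended to each prompt, so the in-context cost stays bounded however large the file environment grows.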
fils @fils
The ACM open access paper "The (R)evolution of Scientific Workflows in the Agentic AI Era: Towards Autonomous Science" is a nice read. Some elements of section 3 are a bit deep for me. :) However, the rest is very easy to engage with. My workflow needs to rise to the level of ORNL, but much of what they talk about is broadly applicable. dl.acm.org/doi/full/10.11…
0
0
2
44
fils reposted
The Knowledge Graph Conference (KGC)
The Builder Summer Cohort is enrolling now. May 29 - Aug 21 | 4 micro-certificates | First one free Built for data scientists, knowledge engineers, and technical practitioners. 12 weeks, live + self-paced, includes a KGC 2027 virtual ticket.
1
1
5
181
fils reposted
ACM Conference on AI and Agentic Systems
📋 The full CAIS '26 schedule is live. 61 peer-reviewed papers. 45 live system demos. Three keynotes. No pitch decks. No vibes-based benchmarks. No "AI-powered" anything without the receipts. This is what it looks like when the field stops performing and starts publishing. caisconf.org/schedule/2026/ We're nearly at our registration cap. Single-digit spots left. caisconf.org/registration/ San Jose · May 26–29
0
6
21
1.1K
fils @fils
It will be interesting to see the papers that come out of this: an IEEE Agentic AI for Large-scale Science workshop. Paper deadline July 13th. agent4sc.github.io
0
0
1
63
fils @fils
Job Opportunity: Strategic Consultant, Open Science, Data Resilience (American Geophysical Union - AGU)

I enjoyed being part of AGU's related meeting on this topic in Berlin. Glad to see them make this position available to support the work. paycomonline.net/v4/ats/web.php…
0
0
0
46
fils reposted
alex zhang @a1zhang
Some awesome initial experiments on training small RLMs :) A direction I think will be super super important moving forward for fully seeing the capabilities of RLMs vs. traditional agentic systems
alphaXiv @askalphaxiv

Reinforcing Recursive Language Models

Can a 4B model learn to recursively call itself to answer hard long-context questions? We RL fine-tuned a small model to behave as a native RLM. On evidence selection across scientific papers, our 4B RLM matches Sonnet 4.6 in quality while running significantly faster and cheaper.
8
37
293
28.1K
fils reposted
fils @fils
Last Starfighter loses job to AI! A tragic story, all too common today. The last Starfighter, high schooler Alex Rogan, has lost his job to AI. Read how Alex will be replaced as Google's DeepMind announces a plan to train AI on player actions in the quarter-million-player MMORPG Eve Online! Is no job safe?! tomshardware.com/tech-industry/…
0
0
0
32
Akshay 🚀 @akshay_pachaar
The MCP vs CLI debate.

For most of 2025, AI Engineers argued about it.

The skeptics had real numbers:
- Playwright MCP eats 13.7K tokens
- Chrome DevTools MCP eats 18K
- A 5-server setup burns 55K tokens before any work

The defenders pushed back:
- CLIs break on multi-tenant apps
- No typed contracts, so the agent guesses at outputs
- On unfamiliar APIs, agents waste turns parsing text

Both sides were arguing about the wrong thing.

In November 2025, Anthropic published "Code execution with MCP" and reframed it from first principles. The problem was never the protocol. It was the habit of dumping every tool's full description into the model's context the moment a session starts. Add the data those tools return, passed through the model on every step, and a single workflow can balloon to 150K tokens. Most of which the model never needed.

The fix is to flip the model's job. Instead of the model calling tools through its context, the model writes code that calls tools through a runtime. The runtime is where tools live. The model only sees what it imports.

In Anthropic's example, a Google Drive transcript flows into a Salesforce CRM update. The old way loaded both tool schemas and piped the entire transcript through the model twice. The new way is ten lines of TypeScript that import what they need. Same task, 2K tokens. A 98.7% drop.

Cloudflare pushed the idea to its limit. They collapsed their entire 2,500-endpoint API from 1.17M tokens of schemas down to 1K tokens, by exposing just two functions: search and execute. The agent writes code that searches the catalog, then executes only what matches.

The new pattern has a name: Code Mode. It is a runtime where the agent writes code that mixes two primitives. Bash, for anything with a binary already installed like git or curl. Typed module imports, for proprietary APIs where the type signatures load only when the agent actually imports the tool.

That second part is the unlock. Types travel with imports, so the agent gets a strict contract for the tools it picks, and pays nothing for the ones it skips. MCP's typed contracts plus CLI's lazy loading, in one runtime. The agent picks per task.

"MCP is dead" was the wrong takeaway. Anthropic just reported 300M MCP SDK downloads, up from 100M at the start of the year. The protocol is not dying. It is the fastest growing piece of agent infrastructure right now.

What died was loading every tool upfront. That was always a bad idea.

If you are building agents in 2026, the rule is simple. Tool definitions belong in code, not in context. The model writes a few lines that call them. The runtime does the rest.

That is what the debate was actually about.
Akshay 🚀 @akshay_pachaar

x.com/i/article/2053…
59
69
489
67.1K
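
The "search and execute" pattern described in Akshay's post above can be sketched in a few lines. This is a toy illustration in Python, not Anthropic's or Cloudflare's actual API (the post mentions TypeScript for Anthropic's example): the catalog, the tool names `get_transcript` and `update_record`, and their return values are hypothetical in-memory stand-ins for real Drive and CRM calls.

```python
# Hypothetical tools; in a real runtime these would wrap remote API endpoints.
def get_transcript(doc_id):
    return f"Transcript for {doc_id}: " + "meeting notes " * 200


def update_record(record_id, notes):
    return {"id": record_id, "chars": len(notes)}


# The catalog lives in the runtime, not in the model's context.
CATALOG = {
    "get_transcript": ("Fetch a meeting transcript by document id", get_transcript),
    "update_record": ("Attach notes to a CRM record", update_record),
}


def search(query):
    """Return short 'name: description' lines for tools matching the query."""
    q = query.lower()
    return [
        f"{name}: {desc}"
        for name, (desc, _) in sorted(CATALOG.items())
        if q in name.lower() or q in desc.lower()
    ]


def execute(name, **kwargs):
    """Run one tool by name; only what the caller returns reaches the model."""
    _, fn = CATALOG[name]
    return fn(**kwargs)


# What agent-written code might look like: the large transcript moves
# between tools inside the runtime and never passes through the prompt.
matches = search("transcript")
transcript = execute("get_transcript", doc_id="doc-1")
result = execute("update_record", record_id="rec-9", notes=transcript)
```

The model would see only `matches` and `result`, a few dozen characters, while the multi-kilobyte `transcript` stays in runtime memory; that is the token saving the post describes, shown here at toy scale.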
fils @fils
Hugging Face for Science at huggingscience.co

This is very interesting. I am exploring what an agent-optimized data repository looks like, so finding "Hugging Science" by Hugging Face was timely. It is, so they say, a site optimized for your AI agent, and it supports quite a few major domain-specific data formats with large file support (huggingface.co/docs/datasets/…). They have projects to get involved with, design challenges (huggingscience.co/#/getting-star…), etc.

I don't see many geoscience datasets here yet. A callout to my community, I guess.

Related paper: AI for scientific discovery is a social problem (sciencedirect.com/science/articl…)

Is llms.txt still a thing? huggingscience.co/llms.txt
0
0
1
87