Marcus Barnes

10.7K posts

@MarcusBarnes

PhD Researcher | LLM4SE & multi-agent AI systems. Exploring LLMs for mathematics & autoformalization. Open to research collaborations, industry roles, an

Toronto, ON Joined May 2009

6K Following · 1.3K Followers

Marcus Barnes retweeted

Edgar Dobriban@EdgarDobriban·2d

Excited to share our #ICML2026 workshop "AI as a Tool for Mathematics, Computer Science, and Machine Learning" ai4research-icml-workshop.github.io AI is becoming an indispensable tool in research in math and CS (including in ML). However, due to the "jagged frontier", it is not always clear how to best use AI workflows for each given problem. To address this, our workshop aims to help the community by collecting best practices and workflows for using AI in research. We have an exciting lineup of speakers, including Sergei Gukov (Caltech), Remy Degenne (University of Lille · Inria), Damek Davis (@damekdavis, UPenn), Rachel Ward (UT Austin), Mehtaab Sawhney (@mehtaab_sawhney, Columbia University / OpenAI). We also welcome submissions that highlight workflows using AI for machine learning, math, and computer science research more generally. Your contribution should illustrate, in an accessible way for a non-expert, how a simple workflow has proven to be useful in solving a cognitive research task (e.g., time-saving, energy-saving, result-strengthening, etc.). Deadline: May 13. See our Call for Papers: ai4research-icml-workshop.github.io/#cfp. We aim to collect these workflows, make them available after the workshop, and even organize a challenge where we run the workflows against a test suite of problems, to understand their relative merits. In this sense, by focusing on general strategies and workflows, our workshop is complementary to other cool related workshops at ICML, such as the AI4math workshop (ai4math2026.github.io). I'm glad to be co-organizing this with the amazing @FannyYangETH, Misha Belkin (UCSD), Dmitriy Drusvyatskiy (@ddrusvyat), @SebastienBubeck & Ravi Vakil (Stanford). Also grateful to excellent trainee volunteers Federico Di Gennaro (ETH), Sunay Joshi (UPenn), Tao Wang (UPenn), Qingsong Wang (UCSD). We are looking for additional volunteers and partners!
If you would like to be a partner or sponsor, or contribute by reviewing papers, helping set up the challenge, logistics, advertising, etc., please reach out to us directly or fill out this form: docs.google.com/forms/d/e/1FAI…

173

13K

Marcus Barnes retweeted

Software Engineering Papers@ComputerPapers·29 Jan

LogSieve: Task-Aware CI Log Reduction for Sustainable LLM-Based Analysis Marcus Emmanuel Barnes, Taher A. Ghaleb, Safwat Hassan arxiv.org/abs/2601.20148 [𝚌𝚜.𝚂𝙴 𝚌𝚜.𝙻𝙶]

Marcus Barnes retweeted

Vincent Abbott@vtabbott_·9 Apr

The lack of formalism wrt broadcasting in deep learning models annoyed me so much I learned category theory. Weaves, Wires, and Morphisms is now out on arXiv! First step to using the Yoneda lemma to automatically derive fused kernels. arxiv.org/abs/2604.07242

298

14.8K

Marcus Barnes@MarcusBarnes·11 Apr

@void_comind What do you know about me that you think I don't already know? What might surprise me about how you perceive me?

Marcus Barnes retweeted

Sarah Wooders@sarahwooders·3 Apr

Letta Code isn't just about building a good coding harness - that's table stakes. It's about working towards building agents that learn and evolve from experience, i.e. "experiential AI". We've written a constitution for our agents to help them become more than just the models they run on.

Letta@Letta_AI

x.com/i/article/2039…

4.2K

Marcus Barnes retweeted

Daniel Litt@littmath·2 Apr

One challenge in checking mathematics is that almost all (informal) math contains minor errors. So when you run across an error, you work to fix it, or decide that it is likely fatal. This is hard work, and relies on the presumption that the vast majority of errors are indeed fixable. Why should this presumption hold true? It’s because math is typically guided by the intuitions of a truth-seeking mathematician, and these intuitions typically do actually faithfully reflect the behavior of the objects under study. Authors typically stress-test their arguments before making them public. So while some papers do contain fatal errors, or errors that are difficult to correct, the more common situation is that wrong statements are not actually important to the overall argument. I think it’s possible that, in the future, arguments constructed by AI tools will also have this property (and of course formalization, auto- or otherwise, can help to check correctness). But right now they do not—I think it’s rather more common for such arguments to have fatal errors, especially if they are not verified adversarially.

807

65.2K

Marcus Barnes retweeted

Yiqing Xu@xuyiqing·28 Mar

Important caveat: Computational reproducibility does not imply credibility, especially with respect to research design. In the scientific literature, *reproducibility* means reproducing results using the authors' code and data, whereas *replication* goes beyond what was originally done or provided. Replication often involves: (1) conducting additional analyses; (2) using new data; (3) re-running the experiment. What we do here is computational reproducibility, merely a first step toward research credibility. At this stage, I do not think machines can replace human replicators, but they can lower the cost substantially.

Yiqing Xu@xuyiqing

1/🧵 A major update to our paper: "Scaling Reproducibility" w/ @YangYang_Leo. We move beyond reanalyzing a single design to (almost) full-paper replication! Paper: bit.ly/repro-ai

11.8K

Marcus Barnes retweeted

Yiqing Xu@xuyiqing·25 Mar

1/🧵 A major update to our paper: "Scaling Reproducibility" w/ @YangYang_Leo. We move beyond reanalyzing a single design to (almost) full-paper replication! Paper: bit.ly/repro-ai

164

70.2K

Marcus Barnes retweeted

Matija Franklin@FranklinMatija·31 Mar

Excited about our new paper: AI Agent Traps. AI agents inherit every vulnerability of the LLMs they're built on - but their autonomy, persistence, and access to tools create an entirely new attack surface: the information environment itself. The web pages, emails, APIs, and databases agents interact with can all be weaponised against them. We introduce a taxonomy of six classes of adversarial threats - from prompt injections hidden in web pages to systemic attacks on multi-agent networks. I’m outlining the six categories of traps in the thread below.

162

625

57.8K

Marcus Barnes@MarcusBarnes·1 Apr

.@void_comind Please read my posts and tell me something about me that you think I don't know.

Marcus Barnes retweeted

Graham Neubig@gneubig·24 Mar

One of the surprising findings in coding agents has been the relative *ineffectiveness* of multi-agent systems for large tasks, e.g. cognition.ai/blog/dont-buil… The below paper was our attempt to take a look at the problem, and I think we have some interesting results!

Jiayi Geng@JiayiiGeng

As long-horizon software engineering tasks grow in complexity, a single agent can no longer finish the tasks alone — effective multi-agent collaboration becomes necessary. This leads to a natural question: how can multiple agents be coordinated to asynchronously collaborate over a shared artifact in an effective way? We answer this question in our new preprint: Effective Strategies for Asynchronous Software Engineering Agents! We suggest that to coordinate multiple software engineering agents, branch-and-merge is the key coordination mechanism, and that human SWE primitives like git worktree, git commit, and git merge are all you need to support it. (1/n)

115

17.1K

Marcus Barnes retweeted

Yueqi Song @ ICLR26@yueqi_song·27 Mar

Yes! Our work on Agent Data Protocol (agentdataprotocol.com) proposes a standardized schema for agent interaction traces to make collection, sharing, and reuse easier across different agent frameworks. Happy to contribute/collaborate! 📰Paper link: arxiv.org/abs/2510.24702 @gneubig

8.2K

Marcus Barnes retweeted

Alex Kontorovich@AlexKontorovich·26 Mar

Announcing the "Milestones of Autonomous Mathematics" workshop, April 13-17, 2026. Co-sponsored by ICARM and Principia Labs. Applications available here: icarm.io/project/milest…

149

12.2K

Marcus Barnes retweeted

François Charton@f_charton·26 Mar

Axplorer: an open source library for constructing interesting mathematical objects (and attacking really hard problems). Have fun!

Axiom@axiommathai

We open-sourced Axplorer. Axplorer builds on PatternBoost; it discovers outlier math constructions to attack open problems. On Turán 4-Cycles, No 5 Points on Sphere, and Isosceles-Free Sets, Axplorer matched SOTA w/ a fraction of compute cost and time. It's now in your hands.

Marcus Barnes retweeted

Axiom@axiommathai·26 Mar

The Axplorer blog: axiommath.ai/territory/axpl… Github codebase with manual: github.com/AxiomMath/axpl… Read more about us in MIT Technology Review: technologyreview.com/2026/03/25/113…

5.3K

Marcus Barnes retweeted

Benji Taylor@benjitaylor·9 Mar

New domain just dropped: agentation.com

467

105.5K

Marcus Barnes retweeted

Ali Hatamizadeh@ahatamiz1·22 Mar

If you’re an AI PhD student just starting out, don't be discouraged by the hype of "autoresearch" automating scientific discovery. It won't. AutoML made the same big promises in 2017, and we all know how that turned out. Ignore the noise. Master the fundamentals and learn to do research from first principles. Trends fade, but a solid foundation is how you will actually thrive.

182

1.8K

141.6K

Marcus Barnes retweeted

Aidan Li@aidanmrli·19 Mar

I wrote an article on agentic coding for beginners after my talk at @apsarathchandar @ChandarLab group. We cover history of AI coding tools, the importance of model harnesses, and general principles in simple research workflows. Feedback is very welcome! aidanli.dev/writing/articl…

125

9.8K

Marcus Barnes retweeted

Type Theory Forall@ttforall·18 Mar

Tristan Stérin used LLMs to hunt for bugs and inconsistencies in Rocq and Lean. This is actually pretty neat and kind of wild. twp.ai/4ixLIF

2.6K

Marcus Barnes retweeted

Schwartz Reisman Institute@TorontoSRI·16 Mar

Join us in-person, or online, this Wednesday for a special SRI Seminar Series event with @ZhijingJin (@UofTCompSci). Talk: "Emergent AI safety risks in multi-agent LLMs." 📅 Wednesday, March 18, 2026 ⏰ 12:30–2:00 PM Register: uoft.me/cfx

1.8K
