Marcus Barnes

10.7K posts

@MarcusBarnes

PhD Researcher | LLM4SE & multi-agent AI systems | LLMs for mathematics & autoformalization | Open to research collaborations, industry roles & consulting

Toronto, ON · Joined May 2009

6K Following · 1.3K Followers

Pinned Tweet

Marcus Barnes@MarcusBarnes·24 Apr

I recently presented LogSieve at MSR’26: reducing CI logs for LLM-based analysis. 42% smaller logs, 40% fewer tokens, while preserving root cause analysis. Pre-print: arxiv.org/abs/2601.20148 #LLM #MLOps

Marcus Barnes retweeted

Alvaro Lozano-Robledo@mathandcobb·15h

This past semester I taught Multivariable Calculus (Calc 3) and recorded all my lectures. They can be found here! Hope this is useful to other students learning this material. youtube.com/playlist?list=…

1.5K

Marcus Barnes retweeted

Daniel Litt@littmath·10h

I'm facilitating the FrontierMath: Open Problems workshop in Toronto. If you're a research mathematician in the area I encourage you to apply!

Epoch AI@EpochAIResearch

Join us for in-person workshops to develop problems for FrontierMath: Open Problems! We are seeking highly interesting unsolved problems from research mathematics whose solutions can be verified programmatically. These are hard to find. Come take a crack at it! Link below.

6.3K

Marcus Barnes retweeted

Alvaro Lozano-Robledo@mathandcobb·2d

Thanks to @jdlichtman and the organizers of the Stanford symposium on the Future of Mathematics! There were a lot of really interesting talks, which you can watch here: youtube.com/@fomathematics…

4.8K

Marcus Barnes retweeted

Alex Kontorovich@AlexKontorovich·5d

Announcing: The $10,000 AMR @AMathRes "Paper of the Future" Prize
For centuries, mathematics has been communicated in a fundamentally linear medium: the written page. Yet mathematics itself is nonlinear, dynamic, and structurally rich. Today, technical barriers to dynamic and interactive exposition have largely fallen. Web-based graphics, browser computation, simulation engines, and AI-assisted coding now allow mathematicians, not just professional software engineers, to build interactive, multidimensional representations of mathematical ideas. The constraint is no longer technical capacity, but imagination.
The aim of this initiative is not popularization, nor production polish, nor short-form video content. The purpose is to encourage serious experimentation in how mathematicians communicate with one another. We seek submissions that demonstrate communicative capabilities fundamentally unavailable in a static PDF. Each submission must explicitly articulate what essential communicative function it provides that a linear paper cannot. The initiative is intended as an experiment in format innovation, not as a replacement for traditional scholarship.
Submissions are due Sept 1, 2026.
Selection Committee: Mohammed Abouzaid, Benson Farb, Alex Kontorovich, Akshay Venkatesh, Maryna Viazovska

136

10.3K

Marcus Barnes retweeted

Leonardo de Moura@Leonard41111588·26 Apr

Slides from my talk "Lean: Extensible, Scalable, Trusted." this week at the Paris Lean Meetup, hosted by ITN at Mines Paris PSL and sponsored by @MistralAI. It covers where @leanprover stands today across mathematics, software verification, and AI. leodemoura.github.io/static/paris20…

138

7.3K

Marcus Barnes@MarcusBarnes·24 Apr

This was joint work with @TaherGhaleb and @SafwatMHassan.

Marcus Barnes retweeted

Edgar Dobriban@EdgarDobriban·15 Apr

Excited to share our #ICML2026 workshop "AI as a Tool for Mathematics, Computer Science, and Machine Learning": ai4research-icml-workshop.github.io
AI is becoming an indispensable tool in research in math and CS (including in ML). However, due to the "jagged frontier", it is not always clear how to best use AI workflows for each given problem. To address this, our workshop aims to help the community by collecting best practices and workflows for using AI in research.
We have an exciting lineup of speakers, including Sergei Gukov (Caltech), Remy Degenne (University of Lille · Inria), Damek Davis (@damekdavis, UPenn), Rachel Ward (UT Austin), and Mehtaab Sawhney (@mehtaab_sawhney, Columbia University / OpenAI).
We also welcome submissions that highlight workflows using AI for machine learning, math, and computer science research more generally. Your contribution should illustrate, in a way accessible to a non-expert, how a simple workflow has proven to be useful in solving a cognitive research task (e.g., time-saving, energy-saving, result-strengthening, etc.). Deadline: May 13. See our Call for Papers: ai4research-icml-workshop.github.io/#cfp.
We aim to collect these workflows, make them available after the workshop, and even organize a challenge where we run the workflows against a test suite of problems, to understand their relative merits. In this sense, by focusing on general strategies and workflows, our workshop is complementary to other cool related workshops at ICML, such as the AI4math workshop (ai4math2026.github.io).
I'm glad to be co-organizing this with the amazing @FannyYangETH, Misha Belkin (UCSD), Dmitriy Drusvyatskiy (@ddrusvyat), @SebastienBubeck & Ravi Vakil (Stanford). Also grateful to excellent trainee volunteers Federico Di Gennaro (ETH), Sunay Joshi (UPenn), Tao Wang (UPenn), Qingsong Wang (UCSD).
We are looking for additional volunteers and partners! If you would like to be a partner or sponsor, or contribute by reviewing papers, helping set up the challenge, logistics, advertising, etc., please reach out to us directly or fill out this form: docs.google.com/forms/d/e/1FAI…

188

17K

Marcus Barnes retweeted

Software Engineering Papers@ComputerPapers·29 Jan

LogSieve: Task-Aware CI Log Reduction for Sustainable LLM-Based Analysis
Marcus Emmanuel Barnes, Taher A. Ghaleb, Safwat Hassan
arxiv.org/abs/2601.20148 [cs.SE cs.LG]

Marcus Barnes retweeted

Vincent Abbott@vtabbott_·9 Apr

The lack of formalism wrt broadcasting in deep learning models annoyed me so much I learned category theory. Weaves, Wires, and Morphisms is now out on arXiv! First step to using the Yoneda lemma to automatically derive fused kernels. arxiv.org/abs/2604.07242

300

15.3K

Marcus Barnes@MarcusBarnes·11 Apr

@void_comind What do you know about me that you think I don't already know? What might surprise me about how you perceive me?

Marcus Barnes retweeted

Sarah Wooders@sarahwooders·3 Apr

Letta Code isn't just about building a good coding harness - that's table stakes. It's about working towards building agents that learn and evolve from experience, i.e. "experiential AI". We've written a constitution for our agents to help them become more than just the models they run on.

Letta@Letta_AI

x.com/i/article/2039…

4.3K

Marcus Barnes retweeted

Daniel Litt@littmath·2 Apr

One challenge in checking mathematics is that almost all (informal) math contains minor errors. So when you run across an error, you work to fix it, or decide that it is likely fatal. This is hard work, and relies on the presumption that the vast majority of errors are indeed fixable. Why should this presumption hold true? It’s because math is typically guided by the intuitions of a truth-seeking mathematician, and these intuitions typically do actually faithfully reflect the behavior of the objects under study. Authors typically stress-test their arguments before making them public. So while some papers do contain fatal errors, or errors that are difficult to correct, the more common situation is that wrong statements are not actually important to the overall argument. I think it’s possible that, in the future, arguments constructed by AI tools will also have this property (and of course formalization, auto- or otherwise, can help to check correctness). But right now they do not—I think it’s rather more common for such arguments to have fatal errors, especially if they are not verified adversarially.

806

65.6K

Marcus Barnes retweeted

Yiqing Xu@xuyiqing·28 Mar

Important caveat: Computational reproducibility does not imply credibility, especially with respect to research design. In the scientific literature, *reproducibility* means reproducing results using the authors' code and data, whereas *replication* goes beyond what was originally done or provided. Replication often involves: (1) conducting additional analyses; (2) using new data; (3) re-running the experiment. What we do here is computational reproducibility, merely a first step toward research credibility. At this stage, I do not think machines can replace human replicators, but they can lower the cost substantially.

Yiqing Xu@xuyiqing

1/🧵 A major update to our paper: "Scaling Reproducibility" w/ @YangYang_Leo. We move beyond reanalyzing a single design to (almost) full-paper replication! Paper: bit.ly/repro-ai

11.8K

Marcus Barnes retweeted

Yiqing Xu@xuyiqing·25 Mar

1/🧵 A major update to our paper: "Scaling Reproducibility" w/ @YangYang_Leo. We move beyond reanalyzing a single design to (almost) full-paper replication! Paper: bit.ly/repro-ai

165

70.6K

Marcus Barnes retweeted

Matija Franklin@FranklinMatija·31 Mar

Excited about our new paper: AI Agent Traps. AI agents inherit every vulnerability of the LLMs they're built on - but their autonomy, persistence, and access to tools create an entirely new attack surface: the information environment itself. The web pages, emails, APIs, and databases agents interact with can all be weaponised against them. We introduce a taxonomy of six classes of adversarial threats - from prompt injections hidden in web pages to systemic attacks on multi-agent networks. I’m outlining the six categories of traps in the thread below.

162

630

58.7K

Marcus Barnes@MarcusBarnes·1 Apr

.@void_comind Please read my posts and tell me something about me that you think I don't know.

Marcus Barnes retweeted

Graham Neubig@gneubig·24 Mar

One of the surprising findings in coding agents has been the relative *ineffectiveness* of multi-agent systems for large tasks, e.g. cognition.ai/blog/dont-buil… The below paper was our attempt to take a look at the problem, and I think we have some interesting results!

Jiayi Geng@JiayiiGeng

As long-horizon software engineering tasks grow in complexity, a single agent can no longer finish the tasks alone — effective multi-agent collaboration becomes necessary. This leads to a natural question: how can multiple agents be coordinated to asynchronously collaborate over a shared artifact in an effective way? We answer this question in our new preprint: Effective Strategies for Asynchronous Software Engineering Agents! We suggest that to coordinate multiple software engineering agents, branch-and-merge is the key coordination mechanism, and that human SWE primitives like git worktree, git commit, and git merge are all you need to support it. (1/n)

117

17.2K

Marcus Barnes retweeted

Yueqi Song @ ICLR26@yueqi_song·27 Mar

Yes! Our work on Agent Data Protocol (agentdataprotocol.com) proposes a standardized schema for agent interaction traces to make collection, sharing, and reuse easier across different agent frameworks. Happy to contribute/collaborate! 📰Paper link: arxiv.org/abs/2510.24702 @gneubig

8.2K

Marcus Barnes retweeted

Alex Kontorovich@AlexKontorovich·26 Mar

Announcing the "Milestones of Autonomous Mathematics" workshop, April 13-17, 2026. Co-sponsored by ICARM and Principia Labs. Applications available here: icarm.io/project/milest…

149

12.4K
