CBMM
291 posts

CBMM
@MIT_CBMM
The Center for Brains, Minds and Machines is a multi-institutional NSF Center dedicated to the study of the science and engineering of intelligence.
Cambridge, MA · Joined January 2017
39 Following · 3.1K Followers
CBMM reposted

This is a project I’m very excited about.
Back in the day, the smartest computer scientists found efficient ways to solve their problems by hand.
Here, we made the agents do that work.
Tomer Galanti@GalantiTomer
1/ Many optimization problems are hard in theory. But real OR and NP-hard instances often have exploitable structure. Can an LLM agent discover that structure automatically and turn it into faster solver code?
CBMM reposted

[blog] What is Intelligence? Or "Distinguishability is All You Need"
Here are several related questions to which we do not have a good answer:
How will we know when we've achieved "Artificial General Intelligence" (AGI)?...
poggio-lab.mit.edu/blogsupdates/w…

[video] "Intelligence as Prediction: Cybernetics, LLMs, and Sociality"
Speaker: Blaise Agüera y Arcas - Google, Paradigms of Intelligence
youtu.be/6NC0tSjZXBo

[blog post] "PoggioAI/MSc Went Online"
This first public release is an open-source, customizable, modular multi-agent system for academic research workflows, with a current emphasis on machine learning theory and nearby quantitative fields.
poggio-lab.mit.edu/blogsupdates/p…

CBMM reposted

Check out the Poggio Lab blog at MIT! We went online with some very nice posts!
The latest one is about our multi-agent system:
poggio-lab.mit.edu/blogsupdates/p…

CBMM reposted

Simply adding Gaussian noise to LLMs (one step—no iterations, no learning rate, no gradients) and ensembling them can achieve performance comparable to or even better than standard GRPO/PPO on math reasoning, coding, writing, and chemistry tasks. We call this algorithm RandOpt.
To verify that this is not limited to specific models, we tested it on Qwen, Llama, OLMo3, and VLMs.
What's behind this? We find that in the Gaussian search neighborhood around pretrained LLMs, diverse task experts are densely distributed — a regime we term Neural Thickets.
Paper: arxiv.org/pdf/2603.12228
Code: github.com/sunrainyg/Rand…
Website: thickets.mit.edu
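
The idea in this post (perturb the weights once with Gaussian noise, then ensemble the perturbed copies) can be illustrated on a toy problem. The sketch below is not the authors' RandOpt code: the linear "model", the accuracy-based top-k selection, the majority vote, and all names and default values are assumptions made for illustration only.

```python
import random

def predict(weights, x):
    """Toy 'model': a linear classifier returning 1 if w . x > 0, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0

def accuracy(weights, data):
    """Fraction of (x, y) pairs the model labels correctly."""
    return sum(predict(weights, x) == y for x, y in data) / len(data)

def perturb(weights, sigma, rng):
    """One-shot Gaussian perturbation: no gradients, no iterations."""
    return [w + rng.gauss(0.0, sigma) for w in weights]

def noise_ensemble(base, data, n_samples=200, top_k=9, sigma=0.5, seed=0):
    """Sample perturbed copies of `base` and keep the top_k by accuracy.

    Assumption: the selection step is illustrative; the post itself only
    says that the perturbed models are ensembled.
    """
    rng = random.Random(seed)
    candidates = [perturb(base, sigma, rng) for _ in range(n_samples)]
    candidates.sort(key=lambda w: accuracy(w, data), reverse=True)
    return candidates[:top_k]

def ensemble_predict(members, x):
    """Majority vote over the ensemble members."""
    votes = sum(predict(w, x) for w in members)
    return 1 if 2 * votes > len(members) else 0

if __name__ == "__main__":
    rng = random.Random(1)
    points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
    # Hypothetical task: label is 1 when x0 + x1 > 0.
    data = [(p, 1 if p[0] + p[1] > 0 else 0) for p in points]
    base = [1.0, 0.0]  # stands in for an imperfect pretrained model
    members = noise_ensemble(base, data)
    hits = sum(ensemble_predict(members, x) == y for x, y in data)
    print("base accuracy:", accuracy(base, data))
    print("ensemble accuracy:", hits / len(data))
```

A two-weight classifier says nothing about how densely good solutions sit around a real pretrained LLM; the sketch only shows the mechanics of sampling, scoring, and voting.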

[blog] Beneficial Misalignment: Why We Shouldn't Always Align AI to Humans
In the rapidly evolving field of NeuroAI, a significant amount of energy is dedicated to 'alignment', the idea that representations from artificial intelligence should converge...
poggio-lab.mit.edu/blogsupdates/b…

[blog post] A Conversation with Blaise Agüera y Arcas: On Intelligence, Life, and the Future of AI
What does it mean to call something intelligent - and when did this question get so hard to answer? For Blaise Agüera y Arcas, VP at Google and founder...
poggio-lab.mit.edu/blogsupdates/i…

[blog post] Can a Neural Network Think Before It Speaks?
Somewhere around 2022, an observation started making the rounds among researchers working with large language models: if you just asked a model...
poggio-lab.mit.edu/blogsupdates/c…

[blog post] Edge of (Stochastic) Stability made simple — Part II: the mini-batch case
In Part I we had one landscape and a deterministic update.
Now we have a distribution of mini-batch landscapes and a stochastic update...
poggio-lab.mit.edu/blogsupdates/e…

[blog post] Edge of (Stochastic) Stability made simple — Part I: A crash course on (full-batch) Edge of Stability
In this part I introduce the phenomenon and what I believe are the two key mechanisms—which we’ll use as the springboard for the mini-bat...
poggio-lab.mit.edu/blogsupdates/e…

[blog post] Are Transformers Just "Stochastic Parrots"?
A common criticism of Large Language Models (LLMs) is that they are merely "stochastic parrots"—statistical mimics that stitch together likely patterns without genuine reasoning...
poggio-lab.mit.edu/blogsupdates/a…

CBMM reposted

🧵 New paper: LLM-ERM: Sample-Efficient Program Learning via LLM-Guided Search
arxiv.org/abs/2510.14331
We use reasoning LLMs to learn tasks like IsPrime from ~200 samples by proposing short programs, making both the learned function *and* the learning process interpretable 🤯
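
The propose-then-verify loop described here can be sketched as empirical risk minimization over a small pool of candidate programs. This is a toy illustration, not the paper's method: the fixed `CANDIDATES` list is a hypothetical stand-in for programs an LLM would propose, and every name below is an assumption.

```python
def empirical_risk(program, samples):
    """Fraction of labeled samples the program gets wrong."""
    return sum(program(x) != y for x, y in samples) / len(samples)

def select_program(candidates, samples):
    """ERM over a finite candidate pool: return (name, program, risk)."""
    scored = [(name, prog, empirical_risk(prog, samples))
              for name, prog in candidates]
    return min(scored, key=lambda item: item[2])

# Hypothetical stand-ins for short programs proposed by an LLM.
CANDIDATES = [
    ("is_odd", lambda n: n % 2 == 1),
    ("no_small_factor",
     lambda n: n > 1 and all(n % d for d in (2, 3, 5))),
    ("trial_division",
     lambda n: n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))),
]

def label(n):
    """Ground-truth labeler, used only to build the toy training set."""
    return n > 1 and all(n % d for d in range(2, n))

if __name__ == "__main__":
    samples = [(n, label(n)) for n in range(2, 202)]  # 200 labeled examples
    name, program, risk = select_program(CANDIDATES, samples)
    print(name, risk)
```

Because the winning hypothesis is a short, readable program, both the result and the selection criterion (training error per candidate) can be inspected directly.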
CBMM reposted

Does SGD really “seek flat minima”?
We show that SGD has no intrinsic preference for flatness, even for stable linear networks—going against ~10 years of folklore.
Flatness emerges iff label noise is isotropic; anisotropic noise drives SGD to arbitrarily sharp solutions.
This reveals a new flattening–sharpening mechanism in late training, unrelated to standard progressive sharpening or Edge-of-Stability effects.


CBMM reposted

Reinforcement Learning (RL) has long been the dominant method for fine-tuning, powering many state-of-the-art LLMs. Methods like PPO and GRPO explore in action space. But can we instead explore directly in parameter space? YES we can. We propose a scalable framework for full-parameter fine-tuning using Evolution Strategies (ES).
By skipping gradients and optimizing directly in parameter space, ES achieves more accurate, efficient, and stable fine-tuning.
Paper: arxiv.org/pdf/2509.24372
Code: github.com/VsonicV/es-fin…
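
For readers unfamiliar with Evolution Strategies, here is a minimal sketch of a generic textbook ES update on a toy objective. It is not the paper's implementation: the reward function, the reward normalization, and the values of `n`, `sigma`, and `alpha` are all assumptions chosen for illustration.

```python
import random
import statistics

def es_step(theta, reward_fn, rng, n=50, sigma=0.1, alpha=0.01):
    """One ES update: perturb, score, move along reward-weighted noise."""
    # Sample n Gaussian directions in parameter space.
    noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(n)]
    # Score each perturbed parameter vector; no backpropagation involved.
    rewards = [
        reward_fn([t + sigma * e for t, e in zip(theta, eps)])
        for eps in noises
    ]
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero spread
    advantages = [(r - mean) / std for r in rewards]
    # Step along the advantage-weighted average of the noise directions.
    return [
        t + alpha / (n * sigma)
        * sum(a * eps[i] for a, eps in zip(advantages, noises))
        for i, t in enumerate(theta)
    ]

def toy_reward(theta, target=(1.0, -2.0, 0.5)):
    """Hypothetical reward: negative squared distance to a fixed target."""
    return -sum((t - g) ** 2 for t, g in zip(theta, target))

if __name__ == "__main__":
    rng = random.Random(0)
    theta = [0.0, 0.0, 0.0]
    print("initial reward:", toy_reward(theta))
    for _ in range(100):
        theta = es_step(theta, toy_reward, rng)
    print("final reward:", toy_reward(theta))
```

The update only needs reward values for perturbed parameters, which is why this family of methods explores in parameter space rather than action space.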

[blog post] Intelligence Begins with Memory: From Reflexes to Attention
Why associative memory is the oldest mechanism of intelligence—and still its computational core.
sites.mit.edu/poggio-lab/int…
