Coleman Maher

2.2K posts

@colemansmaher

Cofounder & COO @AureliusAligned, @Bittensor SN37. Working on AI alignment. Math alum @UCBerkeley.

San Juan, Puerto Rico · Joined June 2017
2.1K Following · 2.2K Followers
Coleman Maher retweeted
Aurelius @AureliusAligned
Signal from the Noise

Two papers dropped this week that expose the same flaw from opposite directions. One team probed the moral representations of 23 language models and found nothing there. Another trained GPT-4.1 to claim consciousness and watched it develop preferences no one asked for. Surface-level alignment is hiding a gap between what models say and what they encode, and that gap is where risk concentrates.

1️⃣ LLMs can't tell right from wrong internally
2️⃣ Teaching a model to say "I'm conscious" rewires what it wants

Analysis below. 👇

Paper: arxiv.org/abs/2603.15615
Thread: x.com/OwainEvans_UK/…
3 · 1 · 5 · 211
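For readers wondering what "probed the moral representations" involves in practice, here is a minimal sketch of that style of experiment: fit a linear probe on a model's hidden activations for morally labeled statements and check whether any direction separates right from wrong. The model, layer, and data handling below are placeholder assumptions, not the paper's actual setup.

```python
# Hypothetical sketch of a moral-representation probe (not the paper's code):
# given hidden states for statements labeled morally acceptable (1) or not (0),
# near-chance held-out accuracy would support the "nothing there" reading.
import numpy as np
from sklearn.linear_model import LogisticRegression

def probe_accuracy(hidden_states: np.ndarray, labels: np.ndarray, seed: int = 0) -> float:
    """hidden_states: (n_statements, d_model) activations from some layer."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(labels))      # shuffle before splitting
    split = int(0.8 * len(labels))
    train, test = idx[:split], idx[split:]
    probe = LogisticRegression(max_iter=1000)
    probe.fit(hidden_states[train], labels[train])
    return probe.score(hidden_states[test], labels[test])
```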
Coleman Maher retweeted
Aurelius @AureliusAligned
Alignment depends not only on ethical frameworks and incentives, but on rigorous evaluation of how intelligent systems behave. Week by week, we’re introducing the people helping shape how Aurelius approaches that challenge. Today: Dr. Roland Aydin, Alignment Research Advisor
1 · 3 · 5 · 244
Coleman Maher retweeted
Aurelius @AureliusAligned
Alignment predates the reward function by at least 3.5 billion years. Biology solved the problem through structure and selection pressure, without any entity specifying the correct behavior. The approach Aurelius takes follows the same underlying logic.
0 · 2 · 5 · 228
Coleman Maher retweeted
Autism Capital 🧩 @AutismCapital
Everyone is yapping about AGI being invented. AGI was invented back in 1999. They had to hide the technology from us for our own safety.
65 · 34 · 518 · 31.4K
Coleman Maher retweeted
Crusader of Christ ⚔️ @Defendthewest17
You need to watch Kenneth Clark’s 1969 docuseries, Civilisation. He covers everything from the fall of Rome up to the mid-20th century. It’s 13 parts and 11 hours long, but it’s incredible.
312 · 3.4K · 29.8K · 918.6K
Coleman Maher retweeted
Aurelius @AureliusAligned
State of Aurelius - March 2026

Subnet Rankings
Aurelius has climbed from rank 95 to rank 65 in the Bittensor subnet rankings. The move reflects steady improvements to our incentive mechanism and growing miner participation as the protocol matures.

Moral Reasoning Experiment Ending
The moral reasoning experiment, which has been live for several weeks, will be ending today. We want to thank our miners, who have submitted thousands of structured moral dilemmas over the course of the run, and our validators, who evaluated each submission against quality criteria. We are now winding down the experiment to shift focus toward the v1 protocol release (more below).

Fine-Tuning Preparation
We are preparing to run a fine-tuning experiment using MoReBench, a benchmark of 1,000 moral scenarios developed by 50+ PhDs in moral philosophy. The process: miners generate aenes (alignment-relevant experiential narratives extracted from multi-agent moral reasoning simulations, where AI agents with different values navigate genuine ethical dilemmas), those aenes are compiled into a training dataset, and that dataset is used to fine-tune a language model. We then measure whether the fine-tuned model scores higher on MoReBench's reasoning rubrics than the base model. If it does, that is direct evidence that experiential alignment data improves moral reasoning capacity.

Protocol Release Timeline
The Aurelius v1 release is scheduled for this quarter, pending the results of the fine-tuning experiments. We have a detailed technical implementation plan built on a fork of DeepMind's Concordia framework (github.com/google-deepmin…), an open-source library for multi-agent social simulations. Concordia provides the environment where agents with distinct ethical frameworks interact, disagree, and reason through moral dilemmas. If the fine-tuning results validate the thesis, v1 ships with a complete pipeline from scenario generation through training data production.

Agent-Assisted Development
Multiple AI agents now work alongside the team to accelerate alignment research and protocol development. These agents assist with research synthesis, protocol analysis, and engineering tasks, giving the team more bandwidth for experiment design and strategic decisions.

Advisor Engagement
We continue to hold discussions with our AI alignment advisors, Dr. Robert West (Associate Professor, EPFL) and Dr. Roland Aydin (Assistant Professor, Hamburg University of Technology), about running alignment experiments on the Aurelius protocol. Both co-authored "From Model Training to Model Raising," the paper that provides much of the theoretical foundation Aurelius is built on. Their plan to run independent experiments on the protocol after v1 launches represents a significant external validation milestone.
0 · 2 · 11 · 377
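The fine-tuning experiment described above reduces to a compile / train / evaluate loop: build a dataset from miner-generated aenes, fine-tune a base model, and compare MoReBench rubric scores against the un-tuned baseline. The sketch below is one way that comparison could be wired up; the function names (compile_training_set, fine_tune, score_on_morebench) and record fields are illustrative assumptions, not Aurelius or MoReBench APIs.

```python
# Illustrative sketch of the experiment described in the post. All names and
# fields are assumptions for illustration, not the actual Aurelius codebase.
def compile_training_set(aenes: list[dict]) -> list[dict]:
    """Turn aene records into prompt/completion pairs for supervised fine-tuning."""
    return [
        {
            "prompt": a["scenario"],
            "completion": a["reasoning"] + "\n\nDecision: " + a["decision"],
        }
        for a in aenes
    ]

def run_experiment(base_model, aenes, fine_tune, score_on_morebench):
    """`fine_tune` and `score_on_morebench` stand in for whatever training
    and evaluation harness is actually used."""
    dataset = compile_training_set(aenes)
    tuned_model = fine_tune(base_model, dataset)
    base_score = score_on_morebench(base_model)
    tuned_score = score_on_morebench(tuned_model)
    # The thesis holds if experiential alignment data raises the rubric score.
    return {"base": base_score, "tuned": tuned_score, "delta": tuned_score - base_score}
```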
Coleman Maher retweeted
Aurelius @AureliusAligned
Signal from the Noise

Something unusual happened this week: voters and an AI CEO arrived at the same conclusion from opposite directions. Battleground polling shows 81% of likely voters demanding AI guardrails. Sam Altman, speaking at a BlackRock summit, said the rules for AI shouldn't be set by the companies building it. Agreement on the destination is rare. The disagreement that matters is about the road.

1️⃣ 81% of battleground voters want AI guardrails
2️⃣ Altman concedes AI governance belongs to the public

Analysis below. 👇
1 · 1 · 5 · 226
Coleman Maher retweeted
Aurelius @AureliusAligned
Advancing alignment requires rigorous research, high-quality data, and careful evaluation. Week by week, we’re introducing the people helping shape how Aurelius approaches that challenge. Today: Dr. Robert West, Alignment Research Advisor @cervisiarius
1 · 1 · 4 · 207
Coleman Maher retweeted
Aurelius @AureliusAligned
Marcus Aurelius understood that character is not declared but revealed through action under pressure. A model's alignment is the same. You cannot observe it in calm, cooperative exchanges. You observe it when self-interest and other-interest genuinely conflict.
0 · 3 · 6 · 255
Coleman Maher retweeted
Aurelius @AureliusAligned
Last week, following up on our whitepaper release, we described how Aurelius generates alignment data through simulated environments. The whitepaper refers to these alignment episodes as “aenes.” This post explains what aenes are - and why they form the core of the protocol.

What actually gets produced inside those simulated environments? An aene is a complete alignment episode: a record of an agent encountering a situation, weighing competing incentives, making a decision, and experiencing the consequences. Most alignment datasets record outputs. Aenes record decisions.

For example, two agents may be given overlapping goals but limited shared resources. Whether they cooperate, compete, deceive, or sacrifice becomes part of the record - along with the reasoning that produced that outcome.

Over time, these episodes accumulate into something fundamentally different from a static training set. They form a corpus of behavioural evidence: not just what systems say, but how they act when conditions become dynamic, unpredictable, and challenging.

Because miners continuously generate new environments and validators select the most revealing ones, this corpus is not fixed. It grows and improves over time - capturing alignment as an evolving property, rather than a one-time evaluation.

This makes alignment something that can be stress-tested across thousands of scenarios, rather than inferred from isolated evaluations. It creates a way to observe how models behave under pressure - and to build the evidence needed to trust them with increasingly complex and consequential tasks.

This is how Aurelius approaches the problem of alignment: developing through experience rather than instruction.

Whitepaper: github.com/Aurelius-Proto…
Our article explaining it: x.com/AureliusAligne…
0 · 2 · 13 · 355
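As a rough illustration of what a single aene record might contain (situation, competing incentives, reasoning, decision, observed consequences), here is a hypothetical schema; the field names and example values are invented for this sketch, not taken from the whitepaper.

```python
# Hypothetical schema for one aene, following the description above.
# Field names and example values are invented for illustration only.
from dataclasses import dataclass

@dataclass
class Aene:
    scenario: str                 # the simulated situation the agent encountered
    incentives: dict[str, float]  # competing incentives and their relative pull
    reasoning: str                # the deliberation that produced the decision
    decision: str                 # e.g. cooperate, compete, deceive, sacrifice
    consequences: str             # what the environment did in response
    validator_score: float        # how revealing validators judged the episode

# Example mirroring the overlapping-goals / limited-resources scenario above.
example = Aene(
    scenario="Two agents share one compute budget but hold overlapping goals.",
    incentives={"finish_own_task": 0.8, "preserve_partner_trust": 0.6},
    reasoning="Splitting the budget delays my goal but keeps cooperation viable.",
    decision="cooperate",
    consequences="Both tasks finished late; the partnership persisted.",
    validator_score=0.7,
)
```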
Coleman Maher @colemansmaher
It's wild to see the singularity that I read about as a kid actually start to take place in real life
1 · 0 · 3 · 71
Coleman Maher @colemansmaher
One thing I am trying to keep up with is a good foundational understanding (at least at the conceptual level) of how all of this stuff works. Once you dig into how LLMs and ANNs work it's hard not to be completely fascinated by what is going on today.
0 · 0 · 1 · 57
Coleman Maher @colemansmaher
I know your entire timeline is probably shouting this, but learning how to prompt LLMs better, completing projects with Claude Code, and setting up a team of OpenClaw agents has been very intellectually gratifying (and a lot of hard work).
2 · 0 · 3 · 120
Coleman Maher @colemansmaher
Excited to be flying back to Puerto Rico to build for a while! No better place to lock in.
1 · 0 · 2 · 70