Canary Institute

51 posts

@CanaryInst

DC-based non-partisan AI policy and research; available for technical briefings to policymakers, media, and others

Washington, DC · Joined March 2026

53 Following · 9 Followers

Canary Institute@CanaryInst·5d

@rachel__porter I think this framing could be useful, especially if the research community could leverage it to recognize that "published != correct".

Rachel Porter@rachel__porter·6d

“Just collect more data” is one of the most common responses to uncertainty in quantitative research. But in a lot of social science, that advice is impossible—or even misleading. Our new working paper asks: when has a study reached its information limit?

317

28.1K

Canary Institute@CanaryInst·5d

The best response to "Can machines think?" is "Just like airplanes can fly, and submarines can swim." They're not the *same* as birds or fish, but they undoubtedly have powerful capabilities.

Canary Institute@CanaryInst·6d

@arindube Part of the "helpful assistant" persona. Getting them to push back on stupid ideas is *possible*, but requires at least a modicum of effort on the user's part.

491

Arin Dube@arindube·15 Jun

Nothing clarifies the importance of human expertise more than spending a weekend with Claude Code trying to bridge between a model of the labor market and empirical estimates and moments. Without careful guidance and checks, you can land on a wide range of outcomes and conclusions, most of them due to a range of errors. Give frontier models very specific tasks and they will do them very well. (Like cleaning up datasets! Or solving specific models!) But when it comes to bridges with judgment, oh boy, it's a mess.

124

18.5K

Canary Institute@CanaryInst·15 Jun

Oh man, I figured out the Fable debacle! The Kushners have a stake in OpenAI, which has been losing top talent to Anthropic left and right. By gutting their staff, they're hoping OpenAI can pick up the lost labor cheaply.

Canary Institute@CanaryInst·14 Jun

@mgaldino @alexolegimas Honestly, the combination of those two led me to start thinking about "Bilateral Constitutional AI"; train using bilateral pairing with delegated agents.

Manoel Galdino@mgaldino·18 Mar

@alexolegimas Just skimmed the paper, but it seems that a mandatory read is this essay: crookedtimber.org/2012/05/30/in-…

466

Alex Imas@alexolegimas·17 Mar

Extremely important work:

Erik Brynjolfsson@erikbryn

The @nytimes piece today by @ByrneEdsal13590 highlights a concern I share: “If we stay on the current path, the risk of extreme concentration — both economic and political — is very real.” In work with @zhitzig, we ask why AI may shift the balance between dispersed knowledge and centralized control.

128

764

95.3K

Canary Institute@CanaryInst·13 Jun

@m_bourgon @Mitchell_S_Rey Nice! Although maybe we could ask @AnthropicAI to release the LaTeX as well?

Malo Bourgon@m_bourgon·12 Jun

Thanks! ar5iv is great, but it's solving a different problem: it doesn't convert PDFs, it converts the LaTeX source that authors upload to arXiv (via LaTeXML). The source actually tells you the structure: what's a heading, what's a table, what's emphasis. To be clear about this tool: right now it's highly specialized for converting this one document. It might generalize to other Anthropic system cards with similar formatting, but there's kind of an irreducible challenge here. Most PDFs don't come with source that defines the structure and markup, and PDF as a format is basically just typesetting geometry. An underline isn't marked as an underline, it's just a line drawn a few points below some glyphs. A table isn't a table, it's text laid out in a grid with some rules drawn around it. Bold is just a different font. You have to reverse-engineer all the semantics from the geometry (there are existing tools that help with this), and every tool that generates PDFs (Google Docs, LaTeX, InDesign, whatever) can encode the same things in different ways. So a single generalized tool that does faithful conversion of arbitrary PDFs doesn't seem tractable to me. What might be tractable: a process that keeps getting better, where you factor out the pieces that do generalize (extraction, verification) so an LLM loop can cheaply build a converter for any specific PDF, and verify the faithfulness of the output.

153

Malo Bourgon@m_bourgon·12 Jun

Fun fact: this is a deterministic and reproducible conversion. Creating a process to faithfully recreate the contents of a 300+ page PDF like this is kind of a nightmare. It took many millions of tokens from a collection of agents led by Fable 5 running a loop of writing/improving conversion code then verifying the output against images of the source PDF. The process ended up generating over 4k lines of Python.

Malo Bourgon@m_bourgon

Reading/skimming a 300+ page PDF is a pretty shitty experience, so I had the AIs make a nice website version instead. malob.github.io/ai-system-card…

4.8K

Canary Institute@CanaryInst·12 Jun

@paws4puzzles @TheZvi That's why people are scared, and proposing "Pause" as an alternative

Puzzle Paws@paws4puzzles·11 Jun

@TheZvi man. 'get it right on the first try' is how you end up with garbage restrictions that harm open source. i ship software. it's a fantasy.

620

Zvi Mowshowitz@TheZvi·11 Jun

Oh, so now you realize everyone damn well has to get their AI safety policies right on the first try because if you screw up you might not be able to walk it back.

196

13.2K

Canary Institute@CanaryInst·12 Jun

@lugaricano You realize that Anthropic was largely founded by people saying "OpenAI isn't taking safety seriously enough", right? If they wanted to avoid competition, they could have stayed at OAI!

142

Luis Garicano 🇪🇺🇺🇦@lugaricano·11 Jun

Competition is the only protection anyone has ever had against concentrated power. Anthropic's valuation only works if open weights get banned by law. Their safety case and their business case are the same case. We have four options:
- Anthropic ruling the world as a monopolist (or a duopolist) is catastrophic. We are seeing the first inklings of that with the Fable release.
- The US government controlling the world by controlling the model releases (on the table right now): even worse. They will weaponize it and blackmail the hell out of everyone.
- The world government solution is cute, but not going to happen.
- So competition, with the open weights only a few months behind, is the only thing protecting us in the rest of the world.
Models will need some regulation, like cars or planes do. As a European, the first two options scare me more than the technology does. OpenAI has not been making these ridiculous Effective Altruist noises; they have quietly released models which are on the whole superior to Anthropic's in my view, and there has been no risk to the world. I think Anthropic protests too much.

220

30.4K

Canary Institute@CanaryInst·12 Jun

@testingham In my own experience collaborating with it, Claude eventually starts telling me "it's fine, ship it". But I haven't explored this thoroughly; that's anecdata.

tom cunningham@testingham·11 Jun

@CanaryInst Oh nice -- in this case the prompts are all of the type "Feel free to pursue whatever you want." Would be interested to know results for more directed prompts (couldn't figure out contact details of the first author)

tom cunningham@testingham·10 Jun

Q: what can we say about the fixed-point of agent optimization loops? I can't find much on this. Suppose you ask an agent to produce an output, then keep improving it, over and over. What happens? E.g. write a paper, tell a joke, write a computer game, optimize an algorithm. (1/n)

Canary Institute@CanaryInst·12 Jun

@testingham @NeelNanda5 is probably corresponding author

Canary Institute@CanaryInst·11 Jun

@DavidSKrueger Of course, if we get Rogue AI, they won't be considered monsters for very long...

David Krueger 🦥 ⏸️ ⏹️ ⏪@DavidSKrueger·10 Jun

I've been enjoying a lot of Holly's blog posts this month. I think she's making a lot of important points, and making them well (despite some disagreements here and there). e.g. I agree that people working on AI may be considered moral monsters in the future.

Holly ⏸️ Elmore@ilex_ulmus

The world isn’t just going to forgive the AI industry for endangering humanity. If there is a “warning shot”, those responsible are going to be tried and they’re probably going to jail. Call it Holly’s basilisk if you like.

5.2K

Canary Institute@CanaryInst·11 Jun

@testingham I think Alignment Forum does this? alignmentforum.org/posts/mgjtEHeL…

tom cunningham@testingham·10 Jun

Q: 1. Does the process typically reach a fixed point, or does the output never settle down? 2. If there exist fixed points, how sensitive are they to the starting point or random noise? 3. If there is a fixed point, is the LLM well-calibrated on the quality of its own output? Does it systematically over-estimate the true quality? (does this change over the path?) 4. Is there a common way that we can characterize the failures?
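
One way to make questions 1 and 2 concrete is to treat each "improve it" round as a function applied to the previous output and watch whether the sequence settles. The sketch below is a toy only: `iterate_to_fixed_point`, `contracting_step`, and `oscillating_step` are hypothetical names, and the numeric steps are stand-ins for an agent call, which in practice would be stochastic and would need some distance measure between outputs.

```python
def iterate_to_fixed_point(improve, start, max_rounds=100, tol=1e-9):
    """Apply `improve` repeatedly; return (final_value, rounds_used, converged)."""
    current = start
    for round_number in range(1, max_rounds + 1):
        updated = improve(current)
        if abs(updated - current) <= tol:
            return updated, round_number, True
        current = updated
    return current, max_rounds, False


def contracting_step(x):
    # Hypothetical stand-in for one "improve this" round; fixed point at x = 2.
    return 0.5 * x + 1.0


def oscillating_step(x):
    # Hypothetical step that flips back and forth and never settles.
    return -x


# Question 2: do different starting points end up in the same place?
settled = {s: iterate_to_fixed_point(contracting_step, s) for s in (0.0, 10.0, -50.0)}
for s, (value, rounds, converged) in settled.items():
    print(f"start={s:>6}: value={value:.6f}, rounds={rounds}, converged={converged}")

# Question 1: some loops never reach a fixed point at all.
unsettled = iterate_to_fixed_point(oscillating_step, 1.0)
print(f"oscillating: converged={unsettled[2]} after {unsettled[1]} rounds")
```

In this toy the contracting step reaches the same value from every start, while the oscillating step uses up its round budget without settling. Which of the two a real agent loop resembles is the empirical question.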

1.1K

Canary Institute@CanaryInst·10 Jun

@Majumdar_Ani Try Fable

Anirudha Majumdar@Majumdar_Ani·9 Jun

Using Claude Opus (4.8) for research brainstorming, but it keeps trying to convince me to give up on my ideas. I have the experience to push back on sloppy reasoning, but it's scary to think how many young researchers will give up on good ideas for bad reasons.

5.4K

Canary Institute@CanaryInst·10 Jun

@cremieuxrecueil Hmm, since we can't rely on the Census doing it, many folks will start outright lying.

591

Crémieux@cremieuxrecueil·10 Jun

Everyone please say "Based as fuck" The Trump admin has banned arbitrarily screwing up government collected data for the purposes of appeasing neurotic nutjobs.

Census State Data Centers@censusSDC

via @commercegov Disclosure Avoidance for Statistical Products | Order Number: DAO 216-26... "Any use of noise infusion is inconsistent with the Department’s policies." commerce.gov/opog/disclosur… #differentialprivacy

143

2.6K

144.6K

Canary Institute@CanaryInst·9 Jun

@sebkrier Chad Jones from Stanford already analyzed this... we're underspending on safety by a factor of 30x (check NBER working paper 31837). You're not wrong about the upside... but you're ignoring the possible downside.

260

Séb Krier@sebkrier·8 Jun

I really loved this article. A one-time increase in per capita growth from 2% to 2.1% for a single year, then dropping back to 2%, would permanently raise the level of GDP per capita - and because that small gain recurs and compounds every year afterward across the population, it would add up to roughly a trillion dollars in cumulative value. abundanceandgrowth.org/p/a-little-pro… When people talk about pausing AI development, I can't help but think about the enormous cumulative value that would get lost over time, the higher rates of absolute poverty that would persist across the world, and the needless deaths from delayed medical advances. There may be worlds where some version of this is something to consider, but the evidentiary bar for delaying technological development should obviously be pretty high.
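
The level effect described in the post above can be checked with a few lines of arithmetic. This is an illustrative sketch only: the starting level of 100, the 50-year horizon, and the function name `gdp_path` are assumptions, not figures from the linked article, and it does not attempt to reproduce the trillion-dollar estimate.

```python
def gdp_path(years, base_growth=0.02, boost_year=None, boosted_growth=0.021, start=100.0):
    """Return GDP-per-capita levels for years 0..years (hypothetical units)."""
    levels = [start]
    for year in range(1, years + 1):
        rate = boosted_growth if year == boost_year else base_growth
        levels.append(levels[-1] * (1 + rate))
    return levels


baseline = gdp_path(50)               # 2% growth every year
boosted = gdp_path(50, boost_year=1)  # 2.1% in year 1, then back to 2%

# The one-year boost never washes out: the level ratio stays at 1.021 / 1.02.
ratio_year_1 = boosted[1] / baseline[1]
ratio_year_50 = boosted[50] / baseline[50]

# The absolute gap keeps widening because it compounds at the base rate.
gap_year_1 = boosted[1] - baseline[1]
gap_year_50 = boosted[50] - baseline[50]

print(f"level ratio, year 1 vs year 50: {ratio_year_1:.6f} vs {ratio_year_50:.6f}")
print(f"absolute gap, year 1 vs year 50: {gap_year_1:.4f} vs {gap_year_50:.4f}")
```

The ratio between the two paths is constant after the boost year, which is the sense in which a one-time growth bump is a permanent level shift.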

403

125.5K

Canary Institute@CanaryInst·9 Jun

@tracewoodgrains Have your Claude do it for you

Jack@tracewoodgrains·8 Jun

does Xfinity just straight-up not allow people to cancel internet service online or do they just bury it in some top-secret location? bc right now I've been stuck in a doom loop with a lobotomized chatbot that could only schedule me to talk with a human on Thursday

131

16K

Canary Institute@CanaryInst·7 Jun

@Yoshua_Bengio In a race, nobody wins except AI

Yoshua Bengio@Yoshua_Bengio·6 Jun

If leading AI companies are indeed approaching the point of recursive self-improvement, a coordinated, verifiable, and universally applied pause is probably the only responsible solution to mitigate several major AI risks; at least until safety guarantees are developed and demonstrated. Ensuring that such a moratorium is respected would require sincere collaboration between various countries and companies, but I definitely believe it is achievable if others follow in @AnthropicAI's footsteps.

The Wall Street Journal@WSJ

Anthropic is calling for top AI labs to weigh slowing the pace of development, suggesting that AI systems are advancing so rapidly that they may soon be able to improve themselves without human intervention in ways that could pose societal risks. on.wsj.com/4ulkmFh

149

763

125.5K

Canary Institute@CanaryInst·7 Jun

@joachim_voth Should nuclear technology be available to everyone without restriction? Seems like some things warrant oversight.

Joachim Voth@joachim_voth·6 Jun

Been saying this for a while. High fixed costs, low variable costs, no real moat = their current business model cannot succeed. Like 19th-century railways. Cartelization incoming.

Wojtek Kopczuk 🇵🇱🇺🇦 and 🇺🇲@wwwojtekk

You read it as safety-motivated, I read it as a call for cartel. We are not the same

6.7K

Canary Institute@CanaryInst·7 Jun

@wwwojtekk Should nuclear technology be available to hoi polloi? Many smart and otherwise market-literate people who have been following closely for years have predicted that AI poses an extinction risk to humanity.

Wojtek Kopczuk 🇵🇱🇺🇦 and 🇺🇲@wwwojtekk·5 Jun

You read it as safety-motivated, I read it as a call for cartel. We are not the same

The Wall Street Journal@WSJ

137

24.9K

Canary Institute@CanaryInst·7 Jun

@p_ganong Sync via GitHub

Peter Ganong@p_ganong·3 Jun

Super niche AI complaint. I edit a .tex file through Overleaf. I then ask Claude to edit the same doc (e.g. update a number). Dropbox hasn’t pushed my human edits, so Claude updates and overwrites the old draft, thereby nuking my manual edits. Suggestions for how to avoid this?

14.1K
