Goodfire

415 posts

@GoodfireAI

Using interpretability to understand, learn from, and design AI.

San Francisco · Joined August 2024

28 Following · 13.1K Followers

Pinned Tweet

Goodfire@GoodfireAI·5 Feb

We raised a $150M Series B at a $1.25B valuation to fundamentally change the field of AI. Scaling is powerful, but we can't intentionally design what we don't understand.

488

204.4K

Goodfire retweeted

mark bissell@MarkMBissell·2d

Come discuss autonomous science next Tuesday at the Academy of Sciences! Excited to talk about what we're building @GoodfireAI and looking forward to hearing talks from @Ginkgo, Monomer Bio, and Tetsuwan!

Cristian Ponce@c_m_ponce

We rented out the Academy of Sciences next Tuesday for a series of talks on autonomous science with @Ginkgo, @GoodfireAI, and Monomer Bio! Come by! luma.com/rmuagyc0

Goodfire@GoodfireAI·2d

New paper from Atticus Geiger and collaborators, led by @arunasank!

Aruna S@arunasank

Interpretability methods usually study single-token behavior. But real model behaviors, like sycophancy or writing style, are diffuse across many tokens. Can these diffuse behaviors be localized and controlled from long-form responses? YES!

4.7K

Goodfire retweeted

Nathan Labenz@labenz·6 Mar

In <2 years, @goodfireai has built an all-star team, landed their first blue-chip customers, published a ton of research, and now... Raised $150M at a $1.25B valuation to become the first interpretability unicorn. 🦄 Here @DanJBalsam & @banburismus_ explain the motivation & vision for their new "Intentional Design" research agenda, which aims to understand and control what models learn in training. Full episode ↓

5.9K

Goodfire@GoodfireAI·12 Mar

Authors: @sidbop1, @annabelma314, @maxsloef, @RaphaelSarfati, @EricBigelow, Atticus Geiger, Owen Lewis, & @jack_merullo_ Paper: arxiv.org/abs/2603.05488 Read more in the blog post: goodfire.ai/research/reaso…

2.4K

Goodfire@GoodfireAI·12 Mar

Interpretability methods can augment monitoring by identifying relevant information internally that isn’t reliably expressed. Plus, well-calibrated probes give us a knob to control the tradeoff between accuracy and token usage, enabling adaptive inference-time compute! (6/7)

1.7K

Goodfire@GoodfireAI·12 Mar

LLMs often reason “performatively” well after deciding on a final answer - something that CoT monitors are slow to catch. Our new paper finds that: - probes can help monitor for this - it seems to track with task difficulty - probes enable early CoT exit, saving tokens! (1/7)

327

41.4K

Goodfire@GoodfireAI·10 Mar

Welcome @thomas_fel_! We're excited to have you at Goodfire.

Kempner Institute at Harvard University@KempnerInst

After 2 years probing #visionmodels at the #KempnerInstitute, @thomas_fel_ reflects on what he’s learned—and what pieces of the #interpretability puzzle remain hidden—as he heads to @GoodfireAI. Read the interview: bit.ly/4aEBzmp 🎙️🧩

5.1K

Goodfire@GoodfireAI·10 Mar

Stanford students - check out this spring quarter course on interp that we're helping support:

Amir Zur@AmirZur2000

Excited to launch a course on mechanistic interpretability at Stanford next quarter! So grateful to be part of an amazing teaching team: Atticus Geiger, Jing Huang, Junyi Tao, and Thomas Icard (who all share the admirable trait of not having a twitter account)

7.2K

Goodfire@GoodfireAI·5 Mar

Read more on how we interpreted Evo 2: goodfire.ai/research/inter… Plus some follow-up work since then: goodfire.ai/research/phylo…

1.3K

Goodfire@GoodfireAI·5 Mar

Congrats to @pdhsu @BrianHie @davey_burke @garykbrixi and all our other coauthors at @arcinstitute @nvidia @Stanford @UCSF @UCBerkeley @UW!

1.5K

Goodfire@GoodfireAI·5 Mar

Not every day nine of your teammates get published in Nature! We've been working with Evo 2 since its release, and have found a number of exciting results with our interpretability tools - including discovering numerous biologically relevant features in the model.

Arc Institute@arcinstitute

Evo 2, the largest fully open biological AI model to date, is now published in @Nature.

213

16.7K

Goodfire@GoodfireAI·25 Feb

Read the post: goodfire.ai/blog/interpret… Join us: goodfire.ai/careers

1.3K

Goodfire@GoodfireAI·25 Feb

Those optimizations let us harvest 3 billion activations from Kimi in an overnight run, at 14,000 tokens-per-second on an 8 GPU node. That (+ other work) let us steer Kimi's chain of thought in real time! (5/6)

Ryan Panwar@RyanPanwar

Kimi is just like me frfr Steering @Kimi_Moonshot's K2 Thinking's reasoning in the Kimi CLI

2.5K

Goodfire@GoodfireAI·25 Feb

New blog post: how we built infrastructure to enable interp at trillion-parameter scale with minimal inference overhead. In a couple short years, interpretability has gone from toy models to the frontier. (1/6)

212

18.8K
