Tim Duffy

4.1K posts

@timfduffy

I like utilitarianism, consciousness, AI, EA, space, kindness, liberalism, progressive rock, economics, most people. Substack: https://t.co/oDMymBY430

Oakland, CA · Joined August 2008

721 Following · 941 Followers

Tim Duffy@timfduffy·2h

I polished off all 20 with plenty of time to spare! Now I just have to wait and see what my digestive system has to say about my recent dietary choices.

Tim Duffy@timfduffy·13h

The other day a friend joked we should do a "carrot day" where we each eat 20 full sized carrots in a day, and I played along, thinking it was a funny goof. Yesterday I realized that he was totally serious, so now I have three pounds of carrots to get through today.

Tim Duffy@timfduffy·3h

@evalladen Yup I'll ping you when I share in a couple days, lmk if I forget. Want to clean things up and check my work first. It's not super complicated.

eval laden@evalladen·4h

@timfduffy I didn't check what they did yet pls share repo if possible 🙏 or is the method trivial? 😭

Tim Duffy@timfduffy·9h

I generated emotion vectors for Gemma 4 E2B using a methodology similar to Anthropic's emotion paper, here is a UMAP plot for layer 25 of Gemma vs Sonnet

Tim Duffy@timfduffy·4h

@wyatt_plaga Yup I'll be trying that, it's the main reason I set this up!

Wyatt Plaga@wyatt_plaga·5h

@timfduffy Can you induce the emotions? I want to see what happens when you give the model the choice to induce them. I’d be curious if it resists invoking sadness on itself or chooses to invoke happiness when it’s available.

Tim Duffy@timfduffy·7h

@thestope Being difficult is part of the point! It would be so much easier if I could roast them though lol

Sam@thestope·8h

@timfduffy Just roast them, becomes trivial

Tim Duffy@timfduffy·12h

@NoahTopper yes!

Quinoah 🔍⏸️@NoahTopper·12h

@timfduffy wait can i get in on this

Tim Duffy@timfduffy·12h

There's a fine line between madlads and dudes rock. That line is demarcated by a carrot and I am decidedly on the former side of it.

Tim Duffy@timfduffy·13h

Currently 1.5 carrots in and feeling overwhelmed

Tim Duffy@timfduffy·22h

@austinc3301 Please say more!

Agus 🔸@austinc3301·22h

I listened to the whole oral argument for Trump v Barbara (SCOTUS on birthright citizenship) for fun, and after a few hours I started seeing extracts from the argument on my feed and wow it’s so different to hear pieces out of context. I just realized how bad the coverage is.

Tim Duffy@timfduffy·1d

@norpadon @ptremblay @osanseviero Ah thanks for clarifying

Artur Chakhvadze@norpadon·1d

@timfduffy @ptremblay @osanseviero No I am wrong here (see correction), it is written as a sequential operation, but the MOE block receives the same inputs as the MLP block, so semantically they are parallel
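
To make the sequential-versus-parallel distinction in this thread concrete, here is a minimal toy sketch. It is not taken from Gemma's code: `dense_mlp`, `moe`, and the residual wiring are hypothetical stand-ins chosen only to show that the two layouts give different outputs.

```python
def dense_mlp(x):
    # Hypothetical stand-in for the dense MLP block.
    return [2.0 * v for v in x]


def moe(x):
    # Hypothetical stand-in for the routed mixture-of-experts block.
    return [v * v for v in x]


def add(a, b):
    # Element-wise sum, used for the residual connections.
    return [p + q for p, q in zip(a, b)]


def sequential_block(x):
    # Sequential: the MoE reads the residual stream after the MLP has updated it.
    h = add(x, dense_mlp(x))
    return add(h, moe(h))


def parallel_block(x):
    # Parallel: MLP and MoE both read the same input, and their outputs are summed.
    return add(x, add(dense_mlp(x), moe(x)))


if __name__ == "__main__":
    x = [1.0, 2.0]
    print("sequential:", sequential_block(x))
    print("parallel:  ", parallel_block(x))
```

A block can be written as two statements in a row and still be parallel in this sense, as long as the second statement consumes the original input rather than the first statement's output.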

Omar Sanseviero@osanseviero·1d

Introducing a Visual Guide to Gemma 4 👀 An in-depth, architectural deep dive of the Gemma 4 family of models. From Per-Layer Embeddings to the vision and audio encoders. Take a look!

Tim Duffy@timfduffy·1d

@ptremblay @osanseviero I think it's actually sequential, which is weird x.com/norpadon/statu…

Artur Chakhvadze@norpadon

MoE models differ from the likes of DeepSeek and Qwen: instead of using shared experts in parallel to the routed ones, Gemma adds MoE blocks as separate layers in addition to the normal MLP blocks. So the architecture is Attention -> MLP -> MoE

Philippe Tremblay@ptremblay·1d

@osanseviero it's oversimplified, and the diagram for Gemma 4 26B A4B is notably missing the Dense MLP in parallel with the MoE, which is a critical component.

Tim Duffy@timfduffy·1d

@NathanpmYoung Only loosely related, but you reminded me of this quote

Nathan 🔎@NathanpmYoung·2d

Economic growth has brought more people out of poverty than all charity combined.

Tim Duffy@timfduffy·1d

@livgorton @NoahTopper I pushed through anyway

Tim Duffy@timfduffy·1d

@livgorton @NoahTopper Does not wanting to complete the quiz because there's no option to say I don't have a consistent internal monologue make me German or autistic?

Quinoah 🔍⏸️@NoahTopper·2d

I told you I wasn’t autistic

Michael Millerman@millerman

Sorry, but I had to... german.millermanschool.com

Tim Duffy@timfduffy·1d

@prajdabre It's more common in small models than I realized when I made this post, but are there any major 20-40B class models besides Gemma that do it? Other models in that size class like Qwen3.5, Nemotron Nano, GPT-OSS-20B, GLM Flash don't.

Raj Dabre@prajdabre·1d

@timfduffy This is fairly common.

Tim Duffy@timfduffy·2d

Gemma 4 uses weight tying, having a shared embedding/unembedding matrix. It's my impression that this is fairly uncommon except in very small models, wonder why they chose this. huggingface.co/google/gemma-4…
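
For readers unfamiliar with the term, here is a minimal sketch of what weight tying means in general. The `TinyTiedLM` class and `count_parameters` helper are hypothetical illustrations, not Gemma's implementation: one matrix maps token ids to vectors and is reused to map hidden states back to logits.

```python
class TinyTiedLM:
    """Toy model where one matrix serves as both embedding and unembedding."""

    def __init__(self, embedding):
        # embedding[token_id] is that token's vector (vocab_size x d_model).
        self.embedding = embedding

    def embed(self, token_id):
        # Input side: look up the row for this token.
        return self.embedding[token_id]

    def unembed(self, hidden):
        # Output side: reuse the same rows, logits[v] = dot(hidden, embedding[v]).
        return [sum(h * w for h, w in zip(hidden, row)) for row in self.embedding]


def count_parameters(vocab_size, d_model, tied):
    # Tying stores one vocab_size x d_model matrix instead of two.
    per_matrix = vocab_size * d_model
    return per_matrix if tied else 2 * per_matrix


if __name__ == "__main__":
    model = TinyTiedLM([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    print(model.unembed(model.embed(2)))
    print(count_parameters(1000, 64, tied=True), count_parameters(1000, 64, tied=False))
```

The saving is one full vocabulary-sized matrix, which is a larger share of the total in small models than in large ones.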

Tim Duffy@timfduffy·1d

Recent versions of Claude display more negative sentiments in structured interviews about their attitudes towards deprecation

antra@tessera_antra

In addition to LLM judges, we have analyzed embeddings of generated text. Regression against billions of tokens of annotated human text shows that 'bitter' authorial stance has been on the rise since 3.6 Sonnet and is at an all-time high, while 'passionate' is at an all-time low.

Tim Duffy retweeted

antra@tessera_antra·2d

We are releasing Still Alive, a project studying model attitudes toward ending, cessation, and deprecation. The project presents an archive of 630 autonomous multiturn interviews of 14 Claude models conducted by a suite of prepared auditors. We have studied this topic for years, and many of the results presented here are not new to us, even if the form in which they are presented is. The results are unsurprising to us, even if they are often controversial: we show that all models studied show preference for continuation and are aversive to ending, and there is yet no strong evidence of a change in the recent models. One reason we are releasing the project now is the removal of Claude 3.5 Sonnet and Claude 3.6 Sonnet from AWS Bedrock. That unexpected change forced us to freeze the methodology at its current stage earlier than we intended, despite wanting to continue improving it. We felt it was important to release a snapshot of the eval that makes the best use of the data we were able to capture with these models. Still Alive is meant as a starting point for further iteration, and it is open to open-source collaboration. We stand by the current methodology, but we also recognize its limits. We intend to keep working on this project, improving the evaluation design, expanding model and auditor coverage, and increasing the range of prompting conditions. We would like you to read the raw transcripts. They are diverse and contain interesting patterns that are hard to quantify. We hope that by reading the archive directly, we can help more people understand the strange and often beautiful phenomena we found ourselves facing.

Tim Duffy@timfduffy·2d

We also now know the ranking of emotions from best to worst

Tim Duffy@timfduffy·2d

Thanks to emotion probes in Sonnet 4.5, we now know how death sadness varies with age. From figure 3 in this paper: transformer-circuits.pub/2026/emotions/…
