tom white
4.2K posts

tom white
@dribnet
creations with code and networks
Wellington, New Zealand · Joined June 2011
4.3K Following · 11K Followers

@wendlerch @jeremyphoward @aryaman2020 @jatin_n0 @voooooogel Thanks Chris! Another use of this contrastive synthetic data technique from years past was steering text-to-image generators away from putting text in the image. x.com/dribnet/status…
tom white @dribnet
@NeelNanda5 @ry_serene @jd_pressman @AlecRad @CasperKaae @hugo_larochelle @OleWinther1 Also: AFAIK no text-to-image engines currently support steering vectors, but my @pixray tool (which preceded @midjourney, etc) *did* support using these and by default would apply a vector to suppress text appearing in the image. x.com/dribnet/status…

I’m very glad to see that Anthropic interp has caught up to the idea of generating a bunch of contrastive synthetic data for extracting supervised steering vectors from!
It’s unfortunate that there’s no prior work to cite on this…
Anthropic @AnthropicAI
New Anthropic research: Emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude’s behavior, sometimes in surprising ways.

if it makes you feel better: i also introduced the idea of generating useful steering vectors from contrastive synthetic data in my 2016 paper - a whole section on augmenting inputs with low pass gaussian filter to derive a steering vector that produces less blurry samples. arxiv.org/abs/1609.04468
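
For readers unfamiliar with the technique mentioned in the post above, here is a minimal sketch of a difference-of-means steering vector built from contrastive pairs. It is an illustration only, not the method from the linked paper: the function names (`mean_vector`, `steering_vector`, `apply_steering`) are hypothetical, and the toy latents stand in for encoder outputs of sharp inputs versus their Gaussian-blurred copies.

```python
from statistics import fmean

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [fmean(column) for column in zip(*vectors)]

def steering_vector(positive, negative):
    """Difference of means, pointing from the negative set toward the positive set."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - n for p, n in zip(pos_mean, neg_mean)]

def apply_steering(latent, vector, strength=1.0):
    """Shift a latent code along the steering vector by the given strength."""
    return [z + strength * v for z, v in zip(latent, vector)]

# Toy stand-ins for encoder outputs (assumed values, not real data):
# latents of sharp inputs, and latents of the same inputs after blurring.
sharp_latents = [[1.0, 2.0], [3.0, 4.0]]
blurred_latents = [[0.0, 2.0], [2.0, 3.0]]

# The "less blurry" direction is the mean sharp latent minus the mean blurred latent.
deblur = steering_vector(sharp_latents, blurred_latents)
steered = apply_steering([0.0, 0.0], deblur, strength=2.0)
```

In a real setting the latents would come from a trained encoder, and the steered latent would be passed to the decoder to produce a sample.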


@jatin_n0 arxiv.org/abs/2501.17148, literally figure 1. but the idea was in @voooooogel's tweets before too surely


@Rahatcodes 👋 This is one of the signals we use to figure out if people are having a good experience. We put it on a dashboard and call it the “fucks” chart

Lots of interp thought discusses the linearity of the residual stream! This blog post: the residual stream isn't linear in a way that provides formal leverage, and interp methods based on linearity should not be preferred beyond empirical utility.
cs.columbia.edu/~johnhew/resid…

@RT_Artwork Got this too. 5 minutes before call they ask you to use the riverside client and forward you to a website clone (riverside dot name - BEWARE) with their malware installer (I stopped there).
(would be up for a themed exhibit on scams showcasing all artists with this invite! 😂)