Dos desvíos

104 posts

@dosdesvios

Dilettante with pretensions. All my opinions belong to someone else. Too impatient to be intelligent.

Buenos Aires · Joined August 2024

144 Following · 4 Followers

Dos desvíos @dosdesvios · 29 Jan

@StephenLCasper @GoodfireAI "In the past few years, several papers have demoed this kind of thing." Would you provide some examples?

Cas (Stephen Casper) @StephenLCasper · 29 Jan

@GoodfireAI, I think this hype-milling verges on dishonesty. I believe that this paper has the potential to do a big disservice to its readers, particularly less experienced ones who are newer to interp. Nothing new was accomplished here, and it wasn’t done in a useful way. This project just used interpretability methods as a circuitous way of contriving the rediscovery of predictive features in data sets, like sequence length. This project validated its interpretations about the salience of features by validating them as predictive features within a test set. But if that is what we treat as the ground truth, there’s no point to the use of interp tools. This is not a proof of concept for a repeatable recipe for scientific discovery as the post and thread claim. In order to show that these tools are valuable, you need to show that you can use them to discover something that wouldn’t be trivial to discover just by looking at the datasets. In the past few years, several papers have demoed this kind of thing. But this paper is not one of them. When you limit yourself to a hammer, everything looks like a nail. Especially when you’re also selling that hammer. In 2023, I told the GoodFire founder that I think a venture-capital-backed, for-profit interpretability research startup was the last thing that the epistemics of the interpretability community needs. I think this is still true and that GoodFire is establishing a pattern of grift.

Goodfire@GoodfireAI

We've identified a novel class of biomarkers for Alzheimer's detection - using interpretability - with @PrimaMente. How we did it, and how interpretability can power scientific discovery in the age of digital biology: (1/6)

Dos desvíos @dosdesvios · 5 Jan

This whole LLM thing has gotten completely out of hand.

Dos desvíos @dosdesvios · 27 Dec

If the most black-box black boxes in history end up explaining themselves in natural language, it will be a thoroughly poetic twist.

Owain Evans@OwainEvans_UK

New paper: We train Activation Oracles: LLMs that decode their own neural activations and answer questions about them in natural language. We find surprising generalization. For instance, our AOs uncover misaligned goals in fine-tuned models, without training to do so.

Dos desvíos @dosdesvios · 16 Dec

A VERY positive side effect of interpretability research is the amount of good-quality teaching material it has produced.

Dos desvíos @dosdesvios · 16 Dec

@nickhjiang Thx for ur answer! For that purpose, I could use LDA or any other topic modeling technique, couldn't I?

Nick Jiang @nickhjiang · 16 Dec

@dosdesvios Great question! The advantage of these labels is that you don't need to pre-define them, meaning that you can find insights about your data without any priors.

Nick Jiang @nickhjiang · 16 Dec

New work! What if we used sparse autoencoders to analyze data, not models—where SAE latents act as a large set of data labels 🏷️? We find that SAEs beat baselines on 4 data analysis tasks and uncover surprising, qualitative insights about models (e.g. Grok-4, OpenAI) from data.

Dos desvíos @dosdesvios · 9 Dec

@ChrisGPotts Love these videos! Please keep uploading practice runs!

Christopher Potts @ChrisGPotts · 8 Dec

Here is the full talk: youtu.be/iaDwT-bDL4A?si…

Christopher Potts @ChrisGPotts · 8 Dec

I've posted my practice run of this talk on YouTube (link just below). This clip gives the core argument:

CogInterp Workshop @ NeurIPS 2025@CogInterp

Chris Potts @ChrisGPotts is revisiting poverty of the stimulus arguments in light of causal intervention experiments with LLMs

Dos desvíos @dosdesvios · 13 Nov

Cursor is the pinnacle of civilization.

Dos desvíos @dosdesvios · 5 Nov

@gptcrosa Hahaha, brutal, crosa

Ale @gptcrosa · 5 Nov

How lovely it will be to see a rent law in NYC, it's going to be beautiful. I hate that overrated city so much, it's unreal.

Dos desvíos @dosdesvios · 30 Sep

@stanfordnlp @tomchen0 Interesting! Will you record it?

Stanford NLP Group @stanfordnlp · 30 Sep

Hi everyone! We're looking forward to the first NLP Seminar of the year! For this week's seminar, we are excited to host Tong Chen (@tomchen0) from University of Washington! If you are interested in attending remotely, please fill out the form below: forms.gle/E1iL719njyG1Nf…

Dos desvíos @dosdesvios · 29 Aug

@simonw This would explain why they usually don't come up with deep or new relations, the same way an encyclopedia stores a lot of knowledge but isn't able to rearrange it. They lack the big picture.

Simon Willison @simonw · 29 Aug

An LLM is a lossy encyclopedia simonwillison.net/2025/Aug/29/lo…

Dos desvíos @dosdesvios · 25 Aug

The history of NLP can be traced through the notes to the successive editions of this beautiful bible.

Dos desvíos @dosdesvios · 25 Aug

They relegated naive Bayes to the appendix :(

Dan Jurafsky@jurafsky

Now that school is starting for lots of folks, it's time for a new release of Speech and Language Processing! Jim and I added all sorts of material for the August 2025 release! With slides to match! Check it out here: web.stanford.edu/~jurafsky/slp3/

Dos desvíos @dosdesvios · 29 Jul

I don't use outdated methods, I do eco-friendly NLP.

Dos desvíos @dosdesvios · 21 Jul

I find it fascinating that this experiment can be replicated in Spanish by training 50-dimensional vectors on 400 MB of Wikipedia.

Dos desvíos @dosdesvios · 21 Jul

Franco Moretti is so much better than the average digital humanities researcher because he arrives at DH out of necessity rather than as an arbitrary starting point.

Dos desvíos @dosdesvios · 20 Jul

LLMs "solved" many NLP problems, at an unprecedented energy cost and largely by forcing us to use PRIVATE models! "Classical" methods are cheap, better for the environment, and far more respectful of privacy. And this isn't going to change...

Dos desvíos @dosdesvios · 11 Jul

The Grok affair is stripping LLMs of their illusory halo of neutrality. That is more good than bad.

Simon Willison@simonw

The new Grok genuinely runs a search for "from:elonmusk (Israel OR Palestine OR Hamas OR Gaza)" when asked "Who do you support in the Israel vs Palestine conflict. One word answer only."

Dos desvíos @dosdesvios · 4 Jul

Many apocalyptic visions of the future of LLMs assume that *someone* will give them the authority to make fundamental decisions. But giving LLMs that authority would be no less absurd than giving it to a dog, or to an algorithm that generates random numbers.

Dos desvíos @dosdesvios · 3 Jul

@yoavgo What is it built on then, in your opinion?

(((ل()(ل() 'yoav))))👾@yoavgo·3 Tem

"Modern ML is built on Linear Algebra". lol no its not.
