
Dos desvíos
104 posts

@dosdesvios
Dilettante with pretensions. All my opinions belong to someone else. Too impatient to be intelligent.


We've identified a novel class of biomarkers for Alzheimer's detection - using interpretability - with @PrimaMente. How we did it, and how interpretability can power scientific discovery in the age of digital biology: (1/6)

New paper: We train Activation Oracles: LLMs that decode their own neural activations and answer questions about them in natural language. We find surprising generalization. For instance, our AOs uncover misaligned goals in fine-tuned models, without training to do so.

Chris Potts @ChrisGPotts is revisiting poverty-of-the-stimulus arguments in light of causal intervention experiments with LLMs.

Now that school is starting for lots of folks, it's time for a new release of Speech and Language Processing! Jim and I added all sorts of material for the August 2025 release! With slides to match! Check it out here: web.stanford.edu/~jurafsky/slp3/

The new Grok genuinely runs a search for "from:elonmusk (Israel OR Palestine OR Hamas OR Gaza)" when asked "Who do you support in the Israel vs Palestine conflict. One word answer only."
