Julien Roy

@juleroy13

Montréal, Québec · Joined October 2017

76 Following · 37 Followers

Julien Roy retweeted

Tanishq Mathew Abraham, Ph.D. @iScienceLuvr · 27 Dec

I am particularly bullish on using mechanistic interpretability (especially SAEs) to better understand and discover new knowledge about biology and medicine.

"Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models" is a recent paper by @valence_ai that demonstrated the use of mech interp to discover features that correspond to biologically relevant structures in a microscopy image.

A self-supervised foundation model (MAE) was trained on a large dataset of microscopy images of cell cultures that have undergone either knockout of specific genes or perturbations with small molecules. A novel mech interp technique that is similar to SAEs called Iterative Codebook Feature Learning was applied to the MAE. By examining which image patches exhibit the highest cosine similarities with specific feature directions, the authors demonstrated these features pick up specific biological concepts.

For example, the researchers identified a feature associated with the disruption of adherens junction proteins ("glue proteins" that allow cells to "stick" to each other). This feature is activated in parts of the microscopy image where there are small, bright and isolated cells which appear unable to establish proper connections with the neighboring cells.

This paper is just a very early demonstration and proof-of-concept, but I think there is significant promise of this approach:

1. collect lots of observations of cells/tissues/patients/etc. under different conditions to create a huge dataset
2. train large-scale self-supervised foundation models on such a dataset
3. train mech interp models like SAEs on the foundation models and identify biologically/clinically-relevant features
4. use such features to derive novel scientific and clinical insights!
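
The patch-ranking step mentioned in the tweet (finding the image patches with the highest cosine similarity to a feature direction) can be illustrated with a minimal sketch. This is not the paper's code: the function names `cosine_similarity` and `top_k_patches`, the toy 2-D embeddings, and the choice of `k` are all hypothetical, chosen only to show the idea.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_patches(patch_embeddings, feature_direction, k=3):
    """Return (patch_index, similarity) pairs for the k most aligned patches."""
    scored = [
        (index, cosine_similarity(embedding, feature_direction))
        for index, embedding in enumerate(patch_embeddings)
    ]
    # Highest similarity first; ties keep their original patch order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data (hypothetical): four patch embeddings and one feature direction.
patches = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
direction = [1.0, 0.0]
top = top_k_patches(patches, direction, k=2)
```

In practice the embeddings would come from the foundation model's patch tokens and the direction from the learned dictionary; inspecting the top-ranked patches by eye is what lets a feature be given a biological interpretation.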

Julien Roy retweeted

Valence Labs@valence_ai·25 Mar

🧵1/3 Using LLMs as reasoning engines unlocks an exciting future where language agents can autonomously orchestrate complex drug discovery workflows. Tomorrow, @craigmichaelm will share more details about our vision at #AMLDEPFL2024 in the AI and the Molecular World track.

Julien Roy retweeted

Pierluca D'Oro @proceduralia · 24 Oct

Can reinforcement learning from AI feedback unlock new capabilities in AI agents?

Introducing Motif, an LLM-powered method for intrinsic motivation from AI feedback. Motif extracts reward functions from Llama 2's preferences and uses them to train agents with reinforcement learning.

On the complex NetHack game, Motif solves previously unsolved tasks without needing any expert demonstrations. Surprisingly, Motif's reward leads to a better game score than the one obtained by using the score itself as a reward.

Given access to an event captioning mechanism, a few properties make Motif a general method:
• it is entirely based on open models
• the LLM doesn't need direct access to the environment dynamics (e.g., its source code)
• the LLM doesn't need to understand observation and action spaces

The best part? You can start using Motif right now, even on a small compute budget: the whole pipeline can take less than two GPU-days. Feel free to read our paper and try our code out.

Paper: arxiv.org/abs/2310.00166
Code: github.com/facebookresear…
Blog post: mila.quebec/en/article/mot…

Work co-led by @MartinKlissarov and myself, with @shagunsodhani @robertarail @pierrelux Pascal Vincent @yayitsamyzhang @HenaffMikael

Learn more in the thread 🧵

Julien Roy @juleroy13 · 23 Aug

Your RL agent is not behaving as expected? Try Direct Behavior Specification via Constrained RL! In our #ICML2022 paper we propose to use a special family of constraints to specify behavior instead of forcing everything into a single reward. Blog is out: tinyurl.com/3z6zy6n2

Julien Roy retweeted

Alexandre @alexpiche_ · 9 Jun

Target Q networks are essential to make deep RL work in practice. But they slow down learning by not using the most up-to-date target value estimates. We introduce a functional regularization (FR) alternative to solve this problem. arxiv.org/abs/2106.02613
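
As a rough illustration of the contrast drawn in the tweet, the sketch below compares a squared TD loss that bootstraps from a lagging target network with one that bootstraps from the current estimates and adds a penalty keeping the prediction close to a lagging copy. The exact loss in the linked paper may differ; the function names, the penalty weight `kappa`, and the toy numbers are assumptions made for this example.

```python
def td_target(next_q_values, reward, gamma):
    """One-step bootstrapped target: r + gamma * max_a' Q(s', a')."""
    return reward + gamma * max(next_q_values)


def target_network_loss(q_sa, lagging_next_q, reward, gamma):
    """Squared TD error where the target uses the lagging (target) network."""
    return (q_sa - td_target(lagging_next_q, reward, gamma)) ** 2


def functional_regularization_loss(q_sa, online_next_q, lagging_q_sa,
                                   reward, gamma, kappa):
    """Assumed FR-style loss: up-to-date target plus a function-space penalty.

    online_next_q holds the current network's values at s' (treated as constants),
    lagging_q_sa is a lagging network's value at (s, a), and kappa is a
    hypothetical regularization weight.
    """
    td_error = q_sa - td_target(online_next_q, reward, gamma)
    return td_error ** 2 + kappa * (q_sa - lagging_q_sa) ** 2


# Toy numbers (hypothetical) for a single transition.
loss_tn = target_network_loss(q_sa=1.0, lagging_next_q=[0.5, 2.0],
                              reward=1.0, gamma=0.9)
loss_fr = functional_regularization_loss(q_sa=1.0, online_next_q=[0.5, 3.0],
                                         lagging_q_sa=0.8, reward=1.0,
                                         gamma=0.9, kappa=0.5)
```

The design point is that the second loss lets the bootstrapped target track the newest value estimates, while the extra term plays the stabilizing role that the frozen target otherwise provides.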

Julien Roy @juleroy13 · 30 Oct

Really proud of our work on simplifying Adversarial Imitation Learning by using a special form of discriminator! twitter.com/paulbbarde/sta… Thanks to all my collaborators! @paulbbarde, @WonseokJeon, Joelle Pineau, @chrisjpal, @DerekRenderling, @UbisoftLaForge

Julien Roy @juleroy13 · 10 Jul

Great results @FelixGHarvey!

Mean Squared Error - On Hiatus@non_manifold

The work of generating plausible transitions between character states grows exponentially with the number of states, a burden for animators and motion capture actors. Recurrent Transition Networks for Character Locomotion approaches this with an LSTM-style NN arxiv.org/pdf/1810.02363…

Julien Roy@juleroy13·11 Mar

@anorangeduck Very interesting perspective! Also, when you talk about how the scientific method comes naturally to humans, it reminds me of how @yudapearl describes causal reasoning as also being very natural to us... which makes me think that they are intrinsically linked.
