Laura Hanu
@HanuLaura

212 posts

next gen computer-use agents @salesforce | ex @convergence_ai_ @UnitaryAI @CompSciOxford @imperialcollege | interested in steering ai to do more good than bad

London, England · Joined August 2018
507 Following · 618 Followers
Laura Hanu@HanuLaura·
main takeaways from the @dwarkeshpodcast @karpathy interview:

*RL limitations*: hard to get past the reward sparsity problem in RL when it comes to real-world tasks; one promising direction could be more sample-efficient learning by reflecting on mistakes

*reduced model size by getting rid of redundant memorisation*: soon the cognitive engine might look like a 1 billion parameter model that just knows how to think without needing to memorise lots of data - it should be able to recognise what it doesn't know and look it up

*gradual loss of control and understanding*: even if we still have humans delegating tasks to autonomous entities, it will get increasingly harder to fully control them, let alone understand what they're doing; similar thoughts in this great, albeit on the pessimistic side, paper arxiv.org/abs/2501.16946

*role of AI education post-AGI*: pre-AGI, AI education is useful; post-AGI, education is fun - it could be seen as a way to train mentally just how you do physically
Laura Hanu@HanuLaura·
pantheon is now up there with some of my favourite sci fi of all time like asimov’s the end of eternity, the last question or story of your life with some elements of snow crash or dark the tv show but upping the scale even more, dare i say something akin to what @DavidDeutschOxf was describing in the beginning of infinity
Laura Hanu@HanuLaura·
a nice straightforward summary of some of the grpo limitations beyond just not being great for multi-turn, e.g.:
* if you have multiple reward signals, the model won't know which one it is being rewarded for, since they're usually all collapsed into one scalar
* only the scalar reward is used for the policy update, when more detailed textual feedback could be used instead (which is kinda what gepa does with its reflective prompt evolution)
arya ꩜@aryagxr

wrote a short blogpost on what I think are some limitations of GRPO: I’ve been playing around with RL finetuning for reasoning tasks and came across a few limitations that i wanted to document here feedback/corrections are welcome!

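the reward-collapsing point can be sketched in a few lines (a hypothetical toy helper, not from the blogpost or any GRPO implementation): once per-signal rewards are summed, group-relative advantages can't distinguish *which* signal a completion earned, so completions with very different reward profiles get identical advantages.

```python
import statistics

def grpo_style_advantages(reward_components_per_completion):
    """Toy sketch: each completion in a sampled group has several
    reward components (e.g. correctness, format, length penalty).
    They are summed into one scalar, then normalised within the group."""
    # collapse: the policy update only ever sees this sum, not the parts
    scalars = [sum(parts) for parts in reward_components_per_completion]
    mean = statistics.mean(scalars)
    std = statistics.stdev(scalars) or 1.0  # avoid division by zero
    return [(r - mean) / std for r in scalars]
```

e.g. a completion rewarded for correctness `[1.0, 0.0]` and one rewarded only for format `[0.0, 1.0]` collapse to the same scalar and receive the same advantage.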
Laura Hanu@HanuLaura·
It's been interesting to see RL having a comeback lately. Guess in retrospect it's not surprising that RLHF is not enough since it naturally hits a ceiling limited by the quality of human feedback. What's particularly exciting though is that this shift is happening just as AI is becoming more agentic and able to interact with the digital world. Combine this with some grounded reward signals (e.g. profit, likes, citations) and we should see AI not just replicate what humans do but create new knowledge through their own experiences. The next few years should be a fun and wild ride 🎢 For another interesting paper on the topic: incompleteideas.net/papers/TheEraO… or this podcast on it: youtu.be/zzXyPGEtseI?si…
Laura Hanu@HanuLaura·
Why does RL work so well for learning preferred completions in llm post-training, and why can't we just use supervised finetuning? This paper has an interesting information-theoretic explanation: arxiv.org/pdf/2503.01067

TLDR: The literature shows that the 2-stage RL approach used in today's sota models (1. learn a good reward model that can score llm completions similarly to how a human would, 2. use it to learn good policies with RL) outperforms using only supervised finetuning. The authors posit that this happens when verifying an output is simpler than generating it. This is because, in the RL case, the search space is constrained to the subset of policies that are optimal for the learnt reward model. When they reduce the verification-generation gap empirically, the difference in results diminishes too.
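the verification-generation gap can be illustrated with a toy analogy (my own, not the paper's code): checking that a sequence is sorted is linear, while generating a sorted order by blind enumeration is factorial - a cheap verifier lets the search discard everything it rejects, which is the spirit of stage 2 being constrained to reward-model-optimal outputs.

```python
import itertools

def is_sorted(seq):
    # the "verifier" / reward model: scoring an output is cheap, O(n)
    return all(a <= b for a, b in zip(seq, seq[1:]))

def search_with_verifier(items):
    # the "policy search": generation is expensive (factorial candidates),
    # but the verifier prunes the search to its optimal outputs
    for cand in itertools.permutations(items):
        if is_sorted(cand):
            return list(cand)
```

when verification is as hard as generation, the verifier buys you nothing - matching the paper's finding that the advantage of RL shrinks as the gap closes.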
Laura Hanu@HanuLaura·
great discussion! we need more public discourse between top ai leaders and economists 🤝 would’ve liked to see them engage with hinton’s concern that the benefits ai brings will deepen the economic divide in a capitalist system, further creating fertile ground for fascism 🌶️
The Nobel Prize@NobelPrize

Watch our 2024 Nobel Prize laureates talk about their research and careers in a unique roundtable discussion, 'Nobel Minds', moderated by BBC's Zeinab Badawi. youtu.be/1tELlYbO_U8

Laura Hanu retweeted
Saining Xie@sainingxie·
Introducing Cambrian-1, a fully open project from our group at NYU. The world doesn't need another MLLM to rival GPT-4V. Cambrian is unique as a vision-centric exploration & here's why I think it's time to shift focus from scaling LLMs to enhancing visual representations.🧵[1/n]
Laura Hanu@HanuLaura·
The 2nd part of the Intro to multi-node machine learning series is out! 🥳 It's all about how to use Slurm to scale up your ML applications! 🚀 unitary.ai/articles/intro…
Laura Hanu@HanuLaura·
Excited to kick off a new deep-dive blog series on how to build and set up the infrastructure for distributed training from scratch. Spoiler alert – setting up a cloud-based cluster that can scale to hundreds of nodes isn't as daunting as it sounds! 👀🚀 unitary.ai/articles/intro…
Laura Hanu retweeted
Anita Verő@anitaveroe·
If you are at ICCV in Paris next week, come to talk to us at our poster at the "What is Next in Multimodal Foundation Models?" workshop! The title is: "Language as the Medium: Multimodal Video Classification through text only" Paper link: arxiv.org/pdf/2309.10783…
Laura Hanu@HanuLaura·
Can we use LLMs like GPT3.5/Claude/Llama2 to directly classify multimodal content like videos in-context with no training? 👀 Excited to share our findings at the "What is Next in Multimodal Foundation Models?" workshop at #ICCV2023 @anitaveroe @jdthewlis arxiv.org/abs/2309.10783
Laura Hanu@HanuLaura·
So fun to see so many people interested in LLMs, MLOps and generative AI in London!🐝
Emad@EMostaque

Amazing event by Lukas and our friends @wandb, well over 1,000 people showing up to listen to and discuss machine learning ops & AI! Great energy and excitement, London will be amazing AI hub 🦄 🇬🇧
