Stefan Bauer
@stefanAbauer
72 posts
deep learning & causality
Joined July 2022
1.9K Following · 596 Followers
Stefan Bauer retweeted
p(doom) @prob_doom
You really think we're going to scale data labelers to AGI? Today, we release the largest public long-horizon dataset of human digital work. 600h of long-horizon AGI research across 3 months. 🧵(1/n)
4 replies · 23 reposts · 161 likes · 17.2K views
Stefan Bauer retweeted
Lior Pachter @lpachter
The focus on UMAPs as the output of scRNAseq has debased a whole field. This fig from arxiv.org/abs/2603.02402 is ridiculous.. showing a faster GPU accelerated workflow.. that produces a completely different UMAP. But since nobody cares what's on the UMAP anyway.. who cares?
[image]
11 replies · 40 reposts · 233 likes · 22.8K views
Stefan Bauer retweeted
Moshe Vardi @vardi
ACM has posted an "Expression of Concern" on Igor Markov's article. This stinks to heaven. Apparently, Google did not like the paper. Google is a major donor to ACM. There are also rumors of a legal action by Google against ACM. cacm.acm.org/research/reeva…
8 replies · 17 reposts · 115 likes · 68.2K views
Stefan Bauer retweeted
Andrew Saxe @SaxeLab
Excited to launch Principia, a nonprofit research organisation at the intersection of deep learning theory and AI safety. Our goal is to develop theory for modern machine learning that can help us understand network behaviors, including those critical for AI safety.
8 replies · 36 reposts · 299 likes · 18.4K views
Stefan Bauer retweeted
Franz Srambical (in MONTREAL)
With everyone talking about macrohard, it seems like a good time to remind people that we openly released the biggest long-horizon behaviour-cloning dataset of screencasts (of AGI research): 350+ hours and counting! pdoom.org/agi_cast.html
1 reply · 3 reposts · 9 likes · 313 views
Stefan Bauer retweeted
Vincent Pauline @vincentpaulinef
🚨 Looking for a fully self-contained intro to diffusion models that covers both continuous (images) and discrete (text, sequences) data? 🆕 We just released: “Foundations of Diffusion Models in General State Spaces: A Self-Contained Introduction”. arXiv: arxiv.org/abs/2512.05092 S/o to @andrea_dittadi for his amazing support & guidance, and huge thanks to @TobiasHppe1, @k_neklyudov, @AlexanderTong7 and @stefanAbauer for their supervision! 🙌 One roadmap for all of diffusion. 🏎️💨 After a few failed posts, broken previews, and getting briefly flagged by X… the full thread's finally out 🤯🧵👇
3 replies · 21 reposts · 53 likes · 9.7K views
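The continuous-data case such an intro covers rests on a closed-form Gaussian forward process. A minimal sketch of the generic variance-preserving step, not code from the paper; `alpha_bar` stands for the usual cumulative noise schedule:

```python
import numpy as np

def diffuse(x0, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar) * x0, (1 - alpha_bar) * I), the
    closed-form Gaussian forward process for continuous data."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise

rng = np.random.default_rng(0)
x0 = np.ones(4)
# alpha_bar = 1 keeps the data exactly; alpha_bar -> 0 gives pure noise.
print(diffuse(x0, 1.0, rng))  # [1. 1. 1. 1.]
```

The single closed-form jump from x_0 to x_t is what makes training efficient: any noise level can be sampled directly without simulating intermediate steps.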
Stefan Bauer retweeted
Yashas Annadani @yashasannadani
Excited to present our work on multi-objective scientific discovery at #NeurIPS2025! 🎉 We present Preference-Guided Diffusion, a novel method to sample diverse designs from a diffusion model that corresponds to the Pareto Front of black-box objectives. Most (if not all) problems in scientific discovery, from drug design to material design, are multi-objective. Balancing factors like toxicity and activity or stability and synthesizability requires finding the optimal Pareto Front of trade-offs. Preference-Guided diffusion uses preference pairs from just an offline dataset to sample optimal designs from a diffusion model. W/ amazing co-authors @syrineblk, @StefanoErmon, @stefanAbauer, and @BeEngelhardt. Find out more: 🗓️ Poster #915 | Wednesday | 4:30 PM PST Pdf: openreview.net/pdf?id=moiVS9A… Project: github.com/yannadani/pgd_…
[image]
0 replies · 7 reposts · 22 likes · 4.2K views
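The Pareto front mentioned above is simply the set of non-dominated designs. A minimal sketch of that dominance check (illustrative only, not the paper's sampler; assumes higher is better on every objective):

```python
import numpy as np

def pareto_front(scores: np.ndarray) -> np.ndarray:
    """Boolean mask of non-dominated rows.

    scores: (n, m) array, higher is better on every objective. A design
    is on the Pareto front if no other design is >= on all objectives
    and strictly > on at least one.
    """
    n = scores.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        dominated = np.all(scores >= scores[i], axis=1) & np.any(scores > scores[i], axis=1)
        mask[i] = not dominated.any()
    return mask

# Two objectives, e.g. activity vs. stability: the last design is
# dominated by (0.5, 0.5) and drops off the front.
designs = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.4, 0.4]])
print(pareto_front(designs))  # [ True  True  True False]
```

The method in the tweet aims to make a diffusion model sample from this non-dominated set directly, rather than filtering candidates after generation.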
Stefan Bauer @stefanAbauer
Discrete diffusion is all the hype, and you want to get started? Take a look at this discrete diffusion codebase, focused on letting you try out new ideas quickly, with baselines already available. Feedback welcome!
Kalyan @nkalyanv99

We’re releasing UNI-D², a unified codebase for discrete diffusion language models 🤝🚀 Co-led with @vincentpaulinef and an amazing advisor team: @stefanAbauer, @AlexanderTong7 , @andrea_dittadi, @AMK6610, @KaplFer 🙌 🔗 GitHub: github.com/nkalyanv99/UNI… 📚 Docs: nkalyanv99.github.io/UNI-D2/ Reproduce and extend state-of-the-art baselines with one toolkit. Let’s move beyond autoregressive models and push discrete diffusion together 🧵👇

0 replies · 2 reposts · 5 likes · 683 views
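For context on what such a codebase trains: the most common discrete diffusion variant corrupts text toward an absorbing [MASK] state. A minimal sketch of that forward process (illustrative, not UNI-D² code; `MASK` is a hypothetical token id):

```python
import numpy as np

MASK = -1  # hypothetical id for the absorbing [MASK] token

def mask_forward(tokens, t, rng):
    """Absorbing-state forward process: each token is independently
    replaced by MASK with probability t, for t in [0, 1]."""
    tokens = np.asarray(tokens)
    corrupt = rng.random(tokens.shape) < t
    return np.where(corrupt, MASK, tokens)

rng = np.random.default_rng(0)
x0 = np.arange(10)
print(mask_forward(x0, 0.0, rng))  # t = 0: sequence unchanged
print(mask_forward(x0, 1.0, rng))  # t = 1: every token is MASK
```

The model is then trained to invert this corruption, predicting the original tokens at masked positions for all noise levels t at once.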
Stefan Bauer retweeted
Mihir Mahajan @maharajamihir
We release a paper on Jasmine, our production-ready JAX-based codebase for world modeling from unlabeled videos!
[image]
8 replies · 38 reposts · 390 likes · 26.7K views
Stefan Bauer retweeted
Willie Neiswanger @willieneis
It was great to see @thinkymachines LoRA w/o Regret blog, which connects nicely to our work on Tina (LoRA for RL). For wider use, we’re releasing a clean implementation of RL with LoRA, DoRA, QLoRA/QDoRA, plus speedups & more, across models from 1.5B–32B. Nice work @UpupWang!
Shangshang Wang @UpupWang

We now know that LoRA can match full-parameter RL training (from x.com/thinkymachines… and our Tina paper arxiv.org/abs/2504.15777), but what about DoRA, QLoRA, and more? We are releasing a clean LoRA-for-RL repo to explore them all. github.com/shangshang-wan…

2 replies · 1 repost · 23 likes · 3.4K views
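LoRA itself is a small idea: freeze the base weight and learn a low-rank update scaled by alpha/r. A minimal numpy sketch (illustrative only, not the released repo's code; the shapes and zero-init of B follow the standard recipe):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 8, 16

W = rng.normal(size=(d_out, d_in))     # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection, zero-init

def lora_forward(x):
    """Base layer output plus the scaled low-rank update (alpha/r) * B @ A @ x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# Because B starts at zero, the adapter is an exact no-op before training.
assert np.allclose(lora_forward(x), W @ x)
```

DoRA, QLoRA, and QDoRA mentioned in the tweet are variations on the same theme: reparameterizing or quantizing W while keeping the trainable update low-rank.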
Stefan Bauer retweeted
CIFAR @CIFAR_News
📣 Applications are open for the CIFAR Global Scholars program! This program offers junior faculty the opportunity to pursue interdisciplinary research, expand their networks and gain valuable leadership training. 🔗 cifar.ca/next-generatio…
[image]
1 reply · 14 reposts · 20 likes · 5.4K views
Kangwook Lee @Kangwook_Lee
Happy to share that I got tenured last month! While every phase in life is special, this one feels a bit more meaningful, and it made me reflect on the past 15+ years in academia. I'd like to thank @UWMadison and @UWMadisonECE for tremendous support throughout the past six years, helping me grow.

I am very grateful to all the teachers I’ve met in the past 15+ years of research since undergrad. Prof. Sae-Young Chung introduced me to engineering, and in particular, information theory. Prof. Yung Yi and Prof. Song Chong introduced me to communication network theory, and from Prof. Yung Yi I learned the true passion for research. I miss him a lot.

At Berkeley, I learned everything about research from my advisor Prof. Kannan Ramchandran. In particular, I learned that the most important motivation behind great research is endless curiosity and the desire to really understand how things work. From my postdoc mentor Prof. Changho Suh at KAIST, I learned the mindset of perfection, making every single paper count.

During my assistant professorship, I was lucky to have the best colleagues. I learned so much from Rob (@rdnowak) and Dimitris (@DimitrisPapail). I am still learning from Dimitris' unique sense of research taste and Rob's example of how to live as the coolest senior professor. I also learned a lot from the Optibeer folks Steve Wright, Jeff Linderoth, and my ECE colleagues Ramya (@ramyavinayak) and Grigoris (@Grigoris_c). Thank you all!

I’d like to thank my former students and postdocs too. Daewon and Jy-yong (@jysohn1108) joined my lab early on and worked on many interesting projects. Changhun and Tuan (@tuanqdinh) joined midway through his PhD and worked on interesting research projects, and in particular, Tuan initiated our lab’s first LLM research five years ago! Yuchen (@yzeng58), Ziqian (@myhakureimu), and Ying (@yingfan_bot) joined around the same time, and working with them has been the most fun and rewarding part of my job. Each took on a challenging topic and did great work. Yuchen advanced LLM fine-tuning, especially parameter-efficient methods. Ziqian resolved the mystery of LLM in-context learning. Ying explored "a model in a loop," focusing on diffusion models and looped Transformers. They all graduated earlier this year and are continuing their research at @MSFTResearch and @Google. Best wishes! 🥰

I am also grateful for co-advising Nayoung (@nayoung_nylee), Liu (@Yang_Liuu), and Joe (@shenouda_joe) with Dimitris and/or Rob. Nayoung's work on Transformer length generalization, Liu's on in-context learning, and Joe's on the mathematical theory of vector-valued neural networks are all very exciting. They are all graduating very soon, so stay tuned! (And reach out to them if you have great opportunities!)

I also had the pleasure of working with master's students Ruisu, Andrew, Jackson (@kunde_jackson), Bryce (@BryceYicongChen), and Michael (@michaelgira23), as well as many visiting students and researchers. Thank you for being such great collaborators.

I’d like to thank and introduce the new(ish) members too. Jungtaek (@jungtaek_kim) and Thomas are studying LLM reasoning. Jongwon (@jongwonjeong123) just joined, and interestingly he was an MS student in Prof. Chung’s lab at KAIST, which makes him my academic brother turned academic son. Ethan (@ethan_ewer), Lynnix, and Chungpa (visiting) are also working on cool LLM projects!

Thank you to @NSF, @amazon, @WARF_News, @FuriosaAI, @kseayg, and KFAS for generous funding. I also learned a lot from leading and working with the AI team at @Krafton_AI, particularly with Jaewoong @jaewoong_cho, so thank you for that as well.

Last and most importantly, thanks to my family! ❤️ I only listed my mentors and mentees here, not all my amazing collaborators, but thank you all for the great work together. With that, I’m excited for what’s ahead, and so far no "tenure blues." Things look the same, if not more exciting... haha!
63 replies · 6 reposts · 296 likes · 24.1K views
Stefan Bauer retweeted
Max Seitzer @maxseitzer
Introducing DINOv3 🦕🦕🦕 A SotA-enabling vision foundation model, trained with pure self-supervised learning (SSL) at scale. High quality dense features, combining unprecedented semantic and geometric scene understanding. Three reasons why this matters…
[image]
12 replies · 137 reposts · 1K likes · 134.9K views
Stefan Bauer retweeted
Mihir Mahajan @maharajamihir
Talk about perfect timing!🧞🧞‍♀️ Check out what we have been cooking for the last few weeks - Jasmine is a production-ready JAX-based codebase for world modeling from unlabeled videos
[GIF]
p(doom) @prob_doom

Inspired by today's Genie 3 release? We are open-sourcing 🧞‍♀️Jasmine🧞‍♀️, a production-ready JAX-based codebase for world modeling from unlabeled videos. Scale from single hosts to hundreds of xPUs thanks to XLA! 🧵 (1/10)

1 reply · 2 reposts · 16 likes · 2.6K views
Stefan Bauer retweeted
Nino Scherrer @ninoscherrer
Super excited to host a student researcher together with @oswaldjoh this year! Please sign up if you wanna have some research fun with us :)
Johannes Oswald @oswaldjoh

We are hosting a student researcher this year at the Paradigms of Intelligence team at Google! Interested in working with @ninoscherrer and me on AGI, or whatever you think is the next big thing 🥰, please consider applying! docs.google.com/forms/u/2/d/e/…

3 replies · 3 reposts · 36 likes · 7.7K views
Stefan Bauer retweeted
p(doom) @prob_doom
Introducing crowd-code, a tool to crowd-source datasets for the next generation of coding agents. pdoom.org/crowd_code.html Download and install the VS Code/Cursor extension once, and forget about it. 🧵
1 reply · 2 reposts · 16 likes · 4.5K views