Stefan Bauer
@stefanAbauer
72 posts
deep learning & causality
Joined July 2022
1.9K Following · 596 Followers
Stefan Bauer retweeted
p(doom) @prob_doom
You really think we're going to scale data labelers to AGI? Today, we release the largest public long-horizon dataset of human digital work. 600h of long-horizon AGI research across 3 months. 🧵(1/n)
4 replies · 23 reposts · 161 likes · 17.2K views
Stefan Bauer retweeted
Lior Pachter @lpachter
The focus on UMAPs as the output of scRNAseq has debased a whole field. This fig from arxiv.org/abs/2603.02402 is ridiculous.. showing a faster GPU accelerated workflow.. that produces a completely different UMAP. But since nobody cares what's on the UMAP anyway.. who cares?
[image]
11 replies · 40 reposts · 233 likes · 22.8K views
Stefan Bauer retweeted
Moshe Vardi @vardi
ACM has posted an "Expression of Concern" on Igor Markov's article. This stinks to heaven. Apparently, Google did not like the paper. Google is a major donor to ACM. There are also rumors of a legal action by Google against ACM. cacm.acm.org/research/reeva…
8 replies · 17 reposts · 115 likes · 68.2K views
Stefan Bauer retweeted
Andrew Saxe @SaxeLab
Excited to launch Principia, a nonprofit research organisation at the intersection of deep learning theory and AI safety. Our goal is to develop theory for modern machine learning that can help us understand network behaviors, including those critical for AI safety.
8 replies · 36 reposts · 299 likes · 18.4K views
Stefan Bauer retweeted
Franz Srambical (in MONTREAL)
With everyone talking about macrohard, it seems like a good time to remind people that we openly released the biggest long-horizon behaviour-cloning dataset of screencasts (of AGI research): 350+ hours and counting! pdoom.org/agi_cast.html
1 reply · 3 reposts · 9 likes · 313 views
Stefan Bauer retweeted
Vincent Pauline @vincentpaulinef
🚨 Looking for a fully self-contained intro to diffusion models that covers both continuous (images) and discrete (text, sequences) data? 🆕 We just released: “Foundations of Diffusion Models in General State Spaces: A Self-Contained Introduction”. arXiv: arxiv.org/abs/2512.05092 S/o to @andrea_dittadi for his amazing support & guidance, and huge thanks to @TobiasHppe1, @k_neklyudov, @AlexanderTong7 and @stefanAbauer for their supervision! 🙌 One roadmap for all of diffusion. 🏎️💨 After a few failed posts, broken previews, and getting briefly flagged by X… the full thread's finally out 🤯🧵👇
3 replies · 21 reposts · 53 likes · 9.7K views
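The continuous-data case such an intro covers rests on a closed-form Gaussian forward process. A minimal sketch of the generic variance-preserving step, not code from the paper; `alpha_bar` stands for the usual cumulative noise schedule:

```python
import numpy as np

def diffuse(x0, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar) * x0, (1 - alpha_bar) * I), the
    closed-form Gaussian forward process for continuous data."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise

rng = np.random.default_rng(0)
x0 = np.ones(4)
# alpha_bar = 1 keeps the data exactly; alpha_bar -> 0 gives pure noise.
print(diffuse(x0, 1.0, rng))  # [1. 1. 1. 1.]
```

The single closed-form jump from x_0 to x_t is what makes training efficient: any noise level can be sampled directly without simulating intermediate steps.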
Stefan Bauer retweeted
Yashas Annadani @yashasannadani
Excited to present our work on multi-objective scientific discovery at #NeurIPS2025! 🎉 We present Preference-Guided Diffusion, a novel method to sample diverse designs from a diffusion model that corresponds to the Pareto Front of black-box objectives. Most (if not all) problems in scientific discovery, from drug design to material design, are multi-objective. Balancing factors like toxicity and activity or stability and synthesizability requires finding the optimal Pareto Front of trade-offs. Preference-Guided diffusion uses preference pairs from just an offline dataset to sample optimal designs from a diffusion model. W/ amazing co-authors @syrineblk, @StefanoErmon, @stefanAbauer, and @BeEngelhardt. Find out more: 🗓️ Poster #915 | Wednesday | 4:30 PM PST Pdf: openreview.net/pdf?id=moiVS9A… Project: github.com/yannadani/pgd_…
[image]
0 replies · 7 reposts · 22 likes · 4.2K views
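The Pareto front mentioned above is simply the set of non-dominated designs. A minimal sketch of that dominance check (illustrative only, not the paper's sampler; assumes higher is better on every objective):

```python
import numpy as np

def pareto_front(scores: np.ndarray) -> np.ndarray:
    """Boolean mask of non-dominated rows.

    scores: (n, m) array, higher is better on every objective. A design
    is on the Pareto front if no other design is >= on all objectives
    and strictly > on at least one.
    """
    n = scores.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        dominated = np.all(scores >= scores[i], axis=1) & np.any(scores > scores[i], axis=1)
        mask[i] = not dominated.any()
    return mask

# Two objectives, e.g. activity vs. stability: the last design is
# dominated by (0.5, 0.5) and drops off the front.
designs = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.4, 0.4]])
print(pareto_front(designs))  # [ True  True  True False]
```

The method in the tweet aims to make a diffusion model sample from this non-dominated set directly, rather than filtering candidates after generation.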
Stefan Bauer @stefanAbauer
Discrete diffusion is all the hype, and you want to get started? Take a look at this discrete diffusion codebase, focused on letting you try out new ideas quickly, with baselines already available. Feedback welcome!
Kalyan @nkalyanv99

We’re releasing UNI-D², a unified codebase for discrete diffusion language models 🤝🚀 Co-led with @vincentpaulinef and an amazing advisor team: @stefanAbauer, @AlexanderTong7 , @andrea_dittadi, @AMK6610, @KaplFer 🙌 🔗 GitHub: github.com/nkalyanv99/UNI… 📚 Docs: nkalyanv99.github.io/UNI-D2/ Reproduce and extend state-of-the-art baselines with one toolkit. Let’s move beyond autoregressive models and push discrete diffusion together 🧵👇

0 replies · 2 reposts · 5 likes · 683 views
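For context on what such a codebase trains: the most common discrete diffusion variant corrupts text toward an absorbing [MASK] state. A minimal sketch of that forward process (illustrative, not UNI-D² code; `MASK` is a hypothetical token id):

```python
import numpy as np

MASK = -1  # hypothetical id for the absorbing [MASK] token

def mask_forward(tokens, t, rng):
    """Absorbing-state forward process: each token is independently
    replaced by MASK with probability t, for t in [0, 1]."""
    tokens = np.asarray(tokens)
    corrupt = rng.random(tokens.shape) < t
    return np.where(corrupt, MASK, tokens)

rng = np.random.default_rng(0)
x0 = np.arange(10)
print(mask_forward(x0, 0.0, rng))  # t = 0: sequence unchanged
print(mask_forward(x0, 1.0, rng))  # t = 1: every token is MASK
```

The model is then trained to invert this corruption, predicting the original tokens at masked positions for all noise levels t at once.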
Stefan Bauer retweeted
Mihir Mahajan @maharajamihir
We release a paper on Jasmine, our production-ready JAX-based codebase for world modeling from unlabeled videos!
[image]
8 replies · 38 reposts · 390 likes · 26.7K views
Stefan Bauer retweeted
Willie Neiswanger @willieneis
It was great to see @thinkymachines LoRA w/o Regret blog, which connects nicely to our work on Tina (LoRA for RL). For wider use, we’re releasing a clean implementation of RL with LoRA, DoRA, QLoRA/QDoRA, plus speedups & more, across models from 1.5B–32B. Nice work @UpupWang!
Shangshang Wang @UpupWang

We now know that LoRA can match full-parameter RL training (from x.com/thinkymachines… and our Tina paper arxiv.org/abs/2504.15777), but what about DoRA, QLoRA, and more? We are releasing a clean LoRA-for-RL repo to explore them all. github.com/shangshang-wan…

2 replies · 1 repost · 23 likes · 3.4K views
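LoRA itself is a small idea: freeze the base weight and learn a low-rank update scaled by alpha/r. A minimal numpy sketch (illustrative only, not the released repo's code; the shapes and zero-init of B follow the standard recipe):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 8, 16

W = rng.normal(size=(d_out, d_in))     # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection, zero-init

def lora_forward(x):
    """Base layer output plus the scaled low-rank update (alpha/r) * B @ A @ x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# Because B starts at zero, the adapter is an exact no-op before training.
assert np.allclose(lora_forward(x), W @ x)
```

DoRA, QLoRA, and QDoRA mentioned in the tweet are variations on the same theme: reparameterizing or quantizing W while keeping the trainable update low-rank.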
Stefan Bauer retweeted
CIFAR @CIFAR_News
📣 Applications are open for the CIFAR Global Scholars program! This program offers junior faculty the opportunity to pursue interdisciplinary research, expand their networks and gain valuable leadership training. 🔗 cifar.ca/next-generatio…
[image]
1 reply · 14 reposts · 20 likes · 5.4K views
Kangwook Lee @Kangwook_Lee
Happy to share that I got tenured last month! While every phase in life is special, this one feels a bit more meaningful, and it made me reflect on the past 15+ years in academia. I'd like to thank @UWMadison and @UWMadisonECE for tremendous support throughout the past six years, helping me grow.

I am very grateful to all the teachers I’ve met in the past 15+ years of research since undergrad. Prof. Sae-Young Chung introduced me to engineering, and in particular, information theory. Prof. Yung Yi and Prof. Song Chong introduced me to communication network theory, and from Prof. Yung Yi I learned the true passion for research. I miss him a lot.

At Berkeley, I learned everything about research from my advisor Prof. Kannan Ramchandran. In particular, I learned that the most important motivation behind great research is endless curiosity and the desire to really understand how things work. From my postdoc mentor Prof. Changho Suh at KAIST, I learned the mindset of perfection, making every single paper count.

During my assistant professorship, I was lucky to have the best colleagues. I learned so much from Rob (@rdnowak) and Dimitris (@DimitrisPapail). I am still learning from Dimitris' unique sense of research taste and Rob's example of how to live as the coolest senior professor. I also learned a lot from the Optibeer folks Steve Wright, Jeff Linderoth, and my ECE colleagues Ramya (@ramyavinayak) and Grigoris (@Grigoris_c). Thank you all!

I’d like to thank my former students and postdocs too. Daewon and Jy-yong (@jysohn1108) joined my lab early on and worked on many interesting projects. Changhun and Tuan (@tuanqdinh) joined midway through his PhD and worked on interesting research projects, and in particular, Tuan initiated our lab’s first LLM research five years ago! Yuchen (@yzeng58), Ziqian (@myhakureimu), and Ying (@yingfan_bot) joined around the same time, and working with them has been the most fun and rewarding part of my job. Each took on a challenging topic and did great work. Yuchen advanced LLM fine-tuning, especially parameter-efficient methods. Ziqian resolved the mystery of LLM in-context learning. Ying explored "a model in a loop," focusing on diffusion models and looped Transformers. They all graduated earlier this year and are continuing their research at @MSFTResearch and @Google. Best wishes! 🥰

I am also grateful for co-advising Nayoung (@nayoung_nylee), Liu (@Yang_Liuu), and Joe (@shenouda_joe) with Dimitris and/or Rob. Nayoung's work on Transformer length generalization, Liu's on in-context learning, and Joe's on the mathematical theory of vector-valued neural networks are all very exciting. They are all graduating very soon, so stay tuned! (And reach out to them if you have great opportunities!)

I also had the pleasure of working with master's students Ruisu, Andrew, Jackson (@kunde_jackson), Bryce (@BryceYicongChen), and Michael (@michaelgira23), as well as many visiting students and researchers. Thank you for being such great collaborators.

I’d like to thank and introduce the new(ish) members too. Jungtaek (@jungtaek_kim) and Thomas are studying LLM reasoning. Jongwon (@jongwonjeong123) just joined, and interestingly he was an MS student in Prof. Chung’s lab at KAIST, which makes him my academic brother turned academic son. Ethan (@ethan_ewer), Lynnix, and Chungpa (visiting) are also working on cool LLM projects!

Thank you to @NSF, @amazon, @WARF_News, @FuriosaAI, @kseayg, and KFAS for generous funding. I also learned a lot from leading and working with the AI team at @Krafton_AI, particularly with Jaewoong @jaewoong_cho, so thank you for that as well.

Last and most importantly, thanks to my family! ❤️ I only listed my mentors and mentees here, not all my amazing collaborators, but thank you all for the great work together. With that, I’m excited for what’s ahead, and so far no "tenure blues." Things look the same, if not more exciting... haha!
63 replies · 6 reposts · 296 likes · 24.1K views
Stefan Bauer retweeted
Max Seitzer @maxseitzer
Introducing DINOv3 🦕🦕🦕 A SotA-enabling vision foundation model, trained with pure self-supervised learning (SSL) at scale. High quality dense features, combining unprecedented semantic and geometric scene understanding. Three reasons why this matters…
[image]
12 replies · 137 reposts · 1K likes · 134.9K views
Stefan Bauer retweeted
Mihir Mahajan @maharajamihir
Talk about perfect timing!🧞🧞‍♀️ Check out what we have been cooking for the last few weeks - Jasmine is a production-ready JAX-based codebase for world modeling from unlabeled videos
[GIF]
p(doom) @prob_doom

Inspired by today's Genie 3 release? We are open-sourcing 🧞‍♀️Jasmine🧞‍♀️, a production-ready JAX-based codebase for world modeling from unlabeled videos. Scale from single hosts to hundreds of xPUs thanks to XLA! 🧵 (1/10)

1 reply · 2 reposts · 16 likes · 2.6K views
Stefan Bauer retweeted
Nino Scherrer @ninoscherrer
Super excited to host a student researcher together with @oswaldjoh this year! Please sign up if you wanna have some research fun with us :)
Johannes Oswald @oswaldjoh

We are hosting a student researcher this year at the Paradigms of Intelligence team at Google! Interested in working with @ninoscherrer and me on AGI, or whatever you think is the next big thing 🥰, please consider applying! docs.google.com/forms/u/2/d/e/…

3 replies · 3 reposts · 36 likes · 7.7K views
Stefan Bauer retweeted
p(doom) @prob_doom
Introducing crowd-code, a tool to crowd-source datasets for the next generation of coding agents. pdoom.org/crowd_code.html Download and install the VS Code/Cursor extension once, and forget about it. 🧵
1 reply · 2 reposts · 16 likes · 4.5K views