Sebastian Baum

201 posts


@SirBaum

PhD Candidate at @Uni_Stuttgart. Obsessed with Graphs. Really interested in Robotics, Optimization and A.I. 🤖 https://t.co/wGLy7CKsdy

Deutschland · Joined September 2021
328 Following · 43 Followers
Sebastian Baum retweeted
François Fleuret @francoisfleuret
So it seems that "real CS" people got quite a huge result: anything that can be done in O(f(n)) compute can be done in O(sqrt(f(n))) memory. Wow. arxiv.org/abs/2502.17779
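For reference, the result being pointed at (arxiv.org/abs/2502.17779, Ryan Williams' "Simulating Time With Square-Root Space") is usually stated slightly more precisely, with a logarithmic factor under the square root; the tweet's O(sqrt(f(n))) is a simplification. In its standard form, for multitape Turing machines:

```latex
% Williams (2025): any time-t(n) computation can be simulated
% using only about sqrt(t(n)) memory.
\[
  \mathsf{TIME}[t(n)] \subseteq \mathsf{SPACE}\!\left[\sqrt{t(n)\,\log t(n)}\right]
  \qquad \text{for all } t(n) \ge n .
\]
```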
Sebastian Baum retweeted
Richard Sutton @RichardSSutton
The original RL algorithms, inspired by natural learning, were online and incremental—they were streaming in the sense that they learned from each increment of experience as it happened, then discarded it, never to be processed again. The streaming algorithms were simple and elegant, but the first big successes of RL in deep learning were not with streaming algorithms. Instead, methods such as DQN chopped the stream of experience into individual transitions, then stored and sampled them in arbitrary batches. Subsequent work followed, extended, and refined the batch approach into asynchronous and offline RL, while the streaming approach languished, unable to produce good results in popular deep learning domains. Until now.

Now researchers at the University of Alberta have shown that streaming RL algorithms can work just as well as DQN on Atari and Mujoco tasks (arxiv.org/pdf/2410.14606).

How did they do it? Mostly just by getting signal normalization and step-size bounding right for the streaming case—otherwise they use standard streaming algorithms like TD(lambda) and Q(lambda). To me it looks like they were simply the first researchers knowledgeable of streaming RL algorithms to seriously address deep RL without being over-influenced by batch-oriented software and batch-oriented supervised-learning ways of thinking.
Mohamed Elsayed@mhmd_elsaye

Would you believe that deep RL can work without replay buffers, target networks, or batch updates? Our recent work gets deep RL agents to learn from a continuous stream of data one sample at a time without storing any sample. Joint work with @Gautham529 and @rupammahmood.

Sebastian Baum retweeted
Alex Dimakis @AlexGDimakis
AI monoliths vs the Unix philosophy: the case for small specialized models.

The current thinking in AI is that AGI is coming, and that one gigantic model will be able to reason and solve business problems ranging from customer support to product development. Currently, agents are basically big system prompts on the same gigantic model. Through prompt engineering, AI builders are trying to plan and execute complex multi-step processes. This is not working very well.

This monolithic view of AI is in sharp contrast to how we teach engineers to build systems. When multiple people have to build complex systems, they should build specialized modular components. This makes systems reliable and helps large teams of people coordinate with specs that are easy to explain, engineer, and evaluate. Monolithic AI systems are also extremely wasteful in terms of energy and cost: using GPT-4o as a summarizer, fact checker, or user-intent detector reminds me of the first days of the big-data wave, when people were spinning up Hadoop clusters to process 1 GB of data.

Instead, I would like to make the case for small specialized models following the Unix philosophy guidelines:
1. Write programs that do one thing and do it well.
2. Write programs to work together.
3. Write programs to handle text streams, because that is a universal interface.

Now replace "programs" with "AI models". I believe the best way to engineer AI systems will be to use post-training to specialize small Llama models for narrow, focused jobs. "Programming" these small specialized models will be done by creating post-training datasets. These datasets will be created by transforming internal data by prompting big foundation models and then distilling them through post-training. This is similar to "Textbooks Are All You Need", but for narrow jobs like summarization, legal QA, and so on, as opposed to building general-purpose small models.

Several papers have shown that it is possible to create post-training datasets by prompting big models, yielding small specialized models that are faster and even outperform their big teachers on narrow tasks. Creating small specialized models is currently hard: evaluation, post-training data curation, and fine-tuning are tricky, and better tools are needed. Still, it's good to go back to the Unix philosophy to inform our future architectures.
Sebastian Baum retweeted
Bastian Grossenbacher-Rieck @Pseudomanifold
Exciting work showcasing the potential of using ML for mathematical discovery! I like the idea of 'flipping the script,' and enlisting ML models in the hypothesis-generation phase. This bodes well for the future of mathematics! (Another win for 'attention is all you need'?!)
Baran Hashemi@Rythian47

🚨How can we teach Transformers to learn and model Enumerative geometry? How deep can AI go in the rabbit hole of understanding complex mathematical concepts? 🤔 We’ve developed a new approach using Transformers to compute psi-class intersection numbers in algebraic geometry.

Sebastian Baum retweeted
Martin Bauer @martinmbauer
PhD level
Sebastian Baum retweeted
Bastian Grossenbacher-Rieck @Pseudomanifold
Friends, I am beyond happy! I'm starting a new position as Full Professor of #MachineLearning at the University of Fribourg @unifr 🇨🇭! With #SwissAI and many other initiatives, I am taking my research at the intersection of #geometry, #topology, and #MachineLearning to a new level 🚀. This #SwissNationalDay will thus hold an even more special meaning for me—thanks for this wonderful chance, my dear confederates!

The past few years have been a veritable roller coaster 🎢, with ups and downs. Through it all, I was sustained and supported by my family, for which I am eternally grateful. As much as we like to believe it in academia, 'no man is an island,' and I have tons of people to thank, foremost among them my postdoctoral adviser @kmborgwardt, as well as my long-term collaborators @KrishnaswamyLab and @mrguywolf. I am also indebted to my great research group at the AIDOS Lab. Working with all of you is a pleasure! 🙏

Finally, I am grateful for the advice of my mentors and role models @stefanabauer, @mmbronstein, and @guennemann (plus many others—you know who you are). It's time to give back now and make academia better!

PS: 🔥I'm hiring soon! 🔥Please share widely and direct any inquiries to my e-mail or DM.
Sebastian Baum retweeted
Antonia Wüst @toniwuest
New paper about learning symbolic concepts in an unsupervised way! 🎉Our Neural Concept Binder discovers expressive discrete concepts that are interpretable for humans and can even be revised ✏️.
Wolfgang Stammer@WolfStammer

🚀 We present Neural Concept Binder for unsupervised symbolic concept discovery. It combines continuous and discrete encodings for concept representations that are: expressive ✔️ inspectable ✔️ revisable ✔️ 🤯 🔗arxiv.org/abs/2406.09949 @toniwuest @Dav_Steinmann @kerstingAIML

Sebastian Baum retweeted
Anthony Bonato @Anthony_Bonato
Probably the sauciest dedication I've ever seen in a math book.
Farshad Arvin @ArvinFarshad
Huge congratulations to my PhD student, Dr Seongin Na @SeonginNa1, for passing his viva with minor corrections today.🎉🥳🎊 "Deep Reinforcement Learning - Driven Automatic Controller Design for Swarm Robotics" @SwaCILab @UoM_EEE @UomRobotics
Sebastian Baum retweeted
Rona Wang @ronawang
overheard a first date where the guy was explaining some paper to the girl & the girl said “i know, i co-authored that” oh i—
Sebastian Baum retweeted
Ian Curtis @XRarchitect
Finally got my game boy stylized portal running on the quest3 🕹️👀 I think I’ll stay in here for a while.
Sebastian Baum retweeted
Matt Henderson @matthen2
one easy way to make a simulation of falling sand piling up: just define what happens for all 16 possible 2x2 grids, and repeatedly apply these rules to the picture
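The trick described here is a block cellular automaton on a Margolus neighborhood. Below is a minimal sketch of one plausible rule assignment for the 16 possible 2x2 blocks (grains fall, then slide diagonally); the exact rule table in the original animation is not shown, so these choices are mine.

```python
def fall_rule(block):
    """Rewrite one 2x2 block of 0/1 cells (1 = sand grain)."""
    (a, b), (c, d) = block
    if a and not c:           # left grain falls straight down
        a, c = 0, 1
    if b and not d:           # right grain falls straight down
        b, d = 0, 1
    if a and not d:           # left grain slides onto the empty right floor cell
        a, d = 0, 1
    elif b and not c:         # right grain slides onto the empty left floor cell
        b, c = 0, 1
    return (a, b), (c, d)

def step(grid, offset):
    """Apply fall_rule to every 2x2 block. Alternating `offset` between
    0 and 1 each step (the Margolus trick) lets sand cross block borders."""
    h, w = len(grid), len(grid[0])
    for y in range(offset, h - 1, 2):
        for x in range(offset, w - 1, 2):
            block = ((grid[y][x], grid[y][x + 1]),
                     (grid[y + 1][x], grid[y + 1][x + 1]))
            (grid[y][x], grid[y][x + 1]), \
                (grid[y + 1][x], grid[y + 1][x + 1]) = fall_rule(block)

# A single grain drops one row per step until it reaches the floor.
grid = [[0, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
for t in range(4):
    step(grid, t % 2)
```

Because each rule only rearranges grains inside its own 2x2 block, the update is local, conserves sand exactly, and can be applied to all blocks in parallel.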
Sebastian Baum retweeted
Georgia Chalvatzaki @GeorgiaChal
We are honored to be nominated for the Best Paper Award on Mobile Manipulation for the second year in a row #IROS2023. Congratulations to the first co-authors of our paper @n_w_funk and Luca Lach. Also congratulations to the winners!
Detroit, MI 🇺🇸
Sebastian Baum retweeted
Vaisakh Shaj🏳️‍🌈 @vaisakhsshaj
I'm super excited about this PhD work of mine, "Multi Time Scale World Models", which has been accepted to NeurIPS 2023 as a spotlight (top 3% of all submitted papers). Details are in the thread below. (1/6)
Sebastian Baum retweeted
Russ Salakhutdinov @rsalakhu
I get asked a lot: why stay in academia, when all the excitement in AI is happening in industry with massive compute? I am seeing some profs leaving academia, but also lots of researchers in industry looking to go back to academia, especially those who don't work on LLMs. I have spent some time actively working with industry, and it has been a great experience, but for me the answer is simple:

1. Students: nothing can replace working with really smart & amazing students.

2. Academic freedom: in general, industry will one way or another dictate what you should work on. Today it is LLMs, tomorrow it is something else. The good old days of "here is a cool idea, let me investigate and publish" are pretty much over. But as an academic, I can work on whatever I want: I can start working on "black holes" tomorrow if I choose to.

3. And my favourite one: no reorgs -- big tech really loves reorgs that happen every few months.

Finally, as an academic, I can always take some time off, work in industry or a start-up, and come back if I want to. Going into industry is usually a one-way street: it is far easier to go to industry from academia and much harder the other way around.
Sebastian Baum retweeted
ARIA @ARIA_research
Here we go 💥 We’re so excited to introduce ARIA’s first cohort of Programme Directors.