Sebastian
245 posts

Sebastian
@sscdotopen
Professor of data management for ML at @bifoldberlin. Ex-@UvA_Amsterdam, @NYUDataScience, @Twitter intern; member of @TheASF & @EFF. Views are my own.
Berlin, Germany · Joined June 2010
1.7K Following · 2.9K Followers
Sebastian retweeted

Blog Post: Looking back on the first decade as faculty (2014-2024).
I list my favorite papers from the decade, why I enjoyed working on them, and provide backstory and reflection.
data-people-group.github.io/blogs/2025/09/…
Sebastian retweeted

I should write or record a longer piece on this at some point. But hopefully the slides will be useful to someone.
Link: github.com/okhat/blog/blo…

Next time my students ask me what real-world data looks like, I will point them to this article :)
jimmyhmiller.com/ugliest-beauti…

Sebastian retweeted

New research agenda we're kickstarting at Berkeley: redesigning data systems to serve the dominant workload of the future: agents!
Agentic speculation is massive, heterogeneous, steerable, and redundant: properties data systems can better support and take advantage of.
Take a look: arxiv.org/abs/2509.00997

Sebastian retweeted

Vol:18 No:12 → mlidea: Interactively Improving ML Data Preparation Code via "Shadow Pipelines" vldb.org/pvldb/vol18/p5…


If you are at #icml25, don't miss @o_ovcharenko's spotlight poster today from 11 a.m. to 1:30 p.m. PDT at West Exhibition Hall B2-B3 #W-311. ICML link: icml.cc/virtual/2025/p…
Olga Ovcharenko@o_ovcharenko
Thanks to all co-authors Florian Barkmann, Philip Toma, @ImantDaunhawer, @vogt_je, @sscdotopen and @val_boeva 📄 Full paper: openreview.net/pdf?id=jnPHZqc… 💻 Code: github.com/BoevaLab/scSSL…
Sebastian retweeted

Join our lab's presentations at ICML'2025 @icmlconf in beautiful Vancouver!
1. Thursday, Olga Ovcharenko (@o_ovcharenko) will present our work with @sscdotopen and @vogt_je on "scSSL-Bench: Benchmarking Self-Supervised Learning for Single-Cell Data", selected for a spotlight poster. icml.cc/virtual/2025/p…. Paper: arxiv.org/abs/2506.10031
2. Saturday, Marc Glettig (@GlettigMarc) will present our work on "H&Enium, Applying Foundation Models to Computational Pathology and Spatial Transcriptomics to Learn an Aligned Latent Space", selected for a poster presentation at the Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences. Paper: openreview.net/forum?id=W64Ns… ICML link: icml.cc/virtual/2025/w…
3. Saturday, I will give an invited talk about our CancerFoundation model by @Theus__A and Florian Barkmann at the Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences. Preprint to be updated soon with new results: biorxiv.org/content/10.110…


On Saturday, @o_ovcharenko will present a poster on "Towards Cross-Modal Error Detection with Tables and Images" at the DataWorld workshop, which details our initial ideas on finding errors in tables by inspecting corresponding image data:
olgaovcharenko.github.io/_pages/MERIT.p…
(3/3)


On Thursday, Olga will present her research on "scSSL-Bench: Benchmarking Self-Supervised Learning for Single-Cell Data". This paper is joint work with ETH Zurich and was selected as a spotlight poster:
icml.cc/virtual/2025/p…
(2/3)

Sebastian retweeted

Our paper "Towards Cross-Modal Error Detection with Tables and Images" was accepted for the DataWorld workshop at ICML'25! 🥳
Thanks to @sscdotopen!

Sebastian retweeted

New PhD position at @AmlabUva on learning concepts with theoretical guarantees using #causality and #RL with me, Frans Oliehoek (TU Delft) and @herkevanhoof 💥
Deadline: 15 June
werkenbij.uva.nl/en/vacancies/p…

We have a PhD opening in Berlin on "Responsible Data Engineering", with a focus on data preparation pipelines designed along responsibility objectives.
This is a fully-funded position at @bifoldberlin, co-supervised by @stoyanoj from NYU.
Details: deem.berlin/#jobs-17725

We have a PhD opening in Berlin on "Responsible Data Engineering", with a focus on data preparation for ML/AI systems.
This is a fully-funded position with salary level E13 at the DEEM Lab, as part of @bifoldberlin.
Details available at deem.berlin/#jobs-2225






