Immanuel Trummer

465 posts

@ImmanuelTrummer

Database Prof at Cornell. I make data analysis more efficient and more user-friendly.

Ithaca, NY (USA) Joined October 2017

57 Following · 2.1K Followers

Pinned Tweet

Immanuel Trummer@ImmanuelTrummer·Aug 27

🚗 ThalamusDB counting pictures of red cars in the database:
- Semantic operators are described in natural language and evaluated via GPT-5
- Simply store paths to images or audio files in your database – ThalamusDB recognizes the file format and selects the right LLM
💾 Code: github.com/itrummer/thala…
📄 Website: itrummer.github.io/thalamusdb
#SemanticQueries #ApproximateProcessing #LLM #GPT5 #ThalamusDB
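A rough sketch of the idea in the post above: route each stored file path to a model by its file format, then count the rows that satisfy a natural-language condition. This is not ThalamusDB's actual code or API. The model names, `select_model`, `semantic_count`, and the `evaluate` callback (which stands in for a real LLM call) are all hypothetical names chosen for illustration.

```python
import mimetypes

# Hypothetical model names, used only for illustration.
MODEL_BY_MODALITY = {
    "image": "vision-capable-llm",
    "audio": "audio-capable-llm",
}


def modality_of(path: str) -> str:
    """Guess from the file extension whether a path is an image or audio file."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"Unrecognized file format: {path}")
    modality = mime_type.split("/")[0]
    if modality not in MODEL_BY_MODALITY:
        raise ValueError(f"Unsupported modality {modality!r} for {path}")
    return modality


def select_model(path: str) -> str:
    """Pick the (hypothetical) model that handles this file's modality."""
    return MODEL_BY_MODALITY[modality_of(path)]


def semantic_count(paths, condition, evaluate):
    """Count paths for which evaluate(model, path, condition) returns True.

    `evaluate` is an assumed callback standing in for an LLM request.
    """
    return sum(1 for p in paths if evaluate(select_model(p), p, condition))
```

With a stub such as `evaluate = lambda model, path, cond: "red" in path`, calling `semantic_count(["red_car.jpg", "blue_car.png"], "a red car", evaluate)` counts only the first path. A real system would replace the stub with a model call and would likely batch or sample requests to control cost.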

Immanuel Trummer@ImmanuelTrummer·Mar 11

💡 Two arXiv papers published in recent days (one from us, one from TUD) reach the same conclusion: LLMs can now generate C++ code for SQL processing that outperforms classical database systems.
⚙️ Our code generator is based on Claude Code and exploits multiple agents working in parallel. Each agent performs tasks typically associated with different components in a #DBMS, such as workload analysis, query optimization, or physical design tuning.
📊 We compare to various classical #DBMS such as DuckDB, ClickHouse, Umbra, MonetDB, and PostgreSQL, finding that the agent-generated code is often significantly faster. Code generation costs are moderate (<$20), making the approach practical for frequently executed queries.
🤖 Analyzing generated code, we find that agents exploit various optimization techniques, including query-specific data structures, as well as low-level optimizations that are specific to the hardware cache hierarchy of our server.
📃 Paper: arxiv.org/pdf/2603.02081
💾 Code: github.com/SolidLao/GenDB
🌐 Site: solidlao.github.io/GenDB
@lojil192574 #LLM #Databases #AI #DB

Immanuel Trummer@ImmanuelTrummer·Nov 18

@adwiteekk @freeCodeCamp Glad to hear it 😀

Adwiteek@adwiteekk·Nov 18

DBMS is such a cool subject especially if u are studying from @freeCodeCamp and @ImmanuelTrummer ✨

Immanuel Trummer@ImmanuelTrummer·Nov 10

Happy to hear you liked it, @DanKornas!

Immanuel Trummer@ImmanuelTrummer·Jul 9

A demo of #ThalamusDB (#SIGMOD2023), introducing semantic filter operators. Users write SQL queries with natural language predicates on table columns containing 🖼️ images, 📃 text, or 🔊 sound files. These predicates are evaluated via #LLMs. In the video (below), I'm querying for furniture ads with pictures showing "wooden tables". After entering my query, #ThalamusDB 1️⃣ performs data profiling and cost-based optimization, 2️⃣ shows the Pareto frontier of cost-quality tradeoffs, 3️⃣ updates bounds on query aggregates while processing. #ThalamusDB is designed from the ground up for approximate processing, prioritizing data that maximally reduces approximation error per cost unit. 🪧 #SIGMOD2023 demo: dl.acm.org/doi/abs/10.114… 📃 #SIGMOD2024 paper: dl.acm.org/doi/10.1145/36… 💾 Code repository: github.com/saehanjo/thala… @SaehanJo @sigmod #GPT4 #LanguageModel #MultimodalData @Cornell @CornellCIS
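Steps 2 and 3 in the post above can be illustrated with a small sketch. It is not taken from the ThalamusDB code base. The plan tuples, `pareto_frontier`, and `count_bounds` are hypothetical names, and the bounds shown are the simplest deterministic ones for a filtered count, which may differ from what the paper uses.

```python
from typing import List, Tuple

# A candidate plan: (name, processing cost, expected approximation error).
Plan = Tuple[str, float, float]


def pareto_frontier(plans: List[Plan]) -> List[Plan]:
    """Keep only plans that no other plan beats on both cost and error."""
    frontier: List[Plan] = []
    best_error = float("inf")
    # Scan plans from cheapest to most expensive; a plan stays on the
    # frontier only if it achieves a lower error than every cheaper plan.
    for plan in sorted(plans, key=lambda p: (p[1], p[2])):
        if plan[2] < best_error:
            frontier.append(plan)
            best_error = plan[2]
    return frontier


def count_bounds(matches_so_far: int, processed: int, total: int) -> Tuple[int, int]:
    """Bounds on a filtered COUNT after evaluating `processed` of `total` rows.

    Lower bound: matches already found. Upper bound: matches found plus
    every row not yet evaluated (any of them could still match).
    """
    unprocessed = total - processed
    return matches_so_far, matches_so_far + unprocessed
```

For instance, after finding 3 matches among the first 10 of 100 rows, `count_bounds(3, 10, 100)` gives the interval (3, 93); the interval narrows as more rows are evaluated, which is the kind of progressive update the demo displays.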

Immanuel Trummer@ImmanuelTrummer·Jul 7

📢 All our posters & talks at #SIGMOD2025!
1️⃣ λ-Tune: using #LLMs to write configuration scripts for databases. 🪧 Poster: itrummer.github.io/SIGMOD2025/Lam… 💬 Slides: itrummer.github.io/SIGMOD2025/Lam… @giannakourisv
2️⃣ SpareLLM: selecting #LLMs with optimal cost-quality tradeoffs 🪧 Poster: itrummer.github.io/SIGMOD2025/Spa… @SaehanJo
3️⃣ SQLBarber: generating custom benchmarks via #LLMs 🪧 Poster: itrummer.github.io/SIGMOD2025/SQL… @lojil192574
4️⃣ CEDAR: cost-efficient data-driven claim verification via #LLMs 🪧 Poster: itrummer.github.io/SIGMOD2025/CED… @Tharushi96
5️⃣ SwellDB: generating data on-the-fly during query processing by #LLMs 🪧 Poster: itrummer.github.io/SIGMOD2025/Swe… @giannakourisv
6️⃣ Query optimization for hybrid classical-quantum workflows 💬 Slides: itrummer.github.io/SIGMOD2025/Que…
7️⃣ Quantum annealing for optimal data partitioning 💬 Slides: itrummer.github.io/SIGMOD2025/Qua…
8️⃣ Panel "AI for Future Databases" with @tim_kraska, @adityagp, @feifei_initiald, @ailamaki, and #SurajitChaudhuri 💬 Slides: itrummer.github.io/SIGMOD2025/Pan…
@SIGMODConf @sigmod @Cornell @CornellCIS

Immanuel Trummer@ImmanuelTrummer·Jun 28

Really proud of my students — @SaehanJo, @giannakourisv, @lojil192574, and @Tharushi96 (left to right) — who each presented their latest work at #SIGMOD2025. Many thanks to the organizers for an amazing conference! @SIGMODConf @sigmod

Immanuel Trummer@ImmanuelTrummer·Jun 22

🥳 Looking forward to an amazing #SIGMOD2025 conference! Our schedule:
📃 Sunday, 15:00-17:30: Data partitioning with quantum and digital annealers
📃 Sunday, 15:00-17:30: Optimizing hybrid quantum-classical processing pipelines
📃 Tuesday, 10:30-11:30: SpareLLM - selecting LLMs with optimal cost-quality tradeoffs
🖥️ Tuesday, 11:30-13:00: Demonstrating SQLBarber - generating custom benchmarks via LLMs
🖥️ Tuesday, 11:30-13:00: Demonstrating SwellDB - generating data on-the-fly during query processing
📢 Tuesday, 16:30-18:00: Panel on AI for future databases with @TimKraska, @drfeifei, @adityagp, @ailamaki, and Surajit Chaudhuri
📃 Thursday, 10:30-11:30 & 16:30-18:00: λ-Tune - using LLMs to write configuration scripts for databases
🖥️ Thursday, 16:30-18:00: Demonstrating CEDAR - cost-efficient data-driven claim verification
@giannakourisv @Tharushi96 @SaehanJo @lojil192574 @SIGMODConf @sigmod #LLM #SQL #Database

Immanuel Trummer@ImmanuelTrummer·Jun 11

💵Don't overpay when using #LLMs! Introducing our upcoming #SIGMOD2025 paper on #SpareLLM by @SaehanJo ... @sigmod @SIGMODConf #LanguageModel #GPT4 #ChatGPT #CostOptimization #Data @Cornell

Immanuel Trummer@ImmanuelTrummer·Jun 4

Outperforming various baselines, including #QuantumAnnealers and classical optimization, for large problem instances with 1000 queries. #SQL #DB #Quantum #QueryOptimization

Immanuel Trummer@ImmanuelTrummer·Jun 4

🥳Paper accepted at #SIGMOD2026! Our paper leverages #DigitalAnnealers (hardware accelerators for optimization) for #QueryOptimization. We scale up to large problem instances using 1⃣domain-specific problem decomposition and 2⃣pre/post-processing on classical machines. @sigmod

Immanuel Trummer@ImmanuelTrummer·May 5

Had a great time at the @dagstuhl seminar on Table Representation Learning! Lots of interesting discussions and future work directions. Many thanks to the organizers (@FrankRHutter @cbinnig @MadelonHulsebos @eisenjulian)! #TabularFoundationModel #AI #ML #DB

Immanuel Trummer retweeted

Ibrahim Sabek@ibrahim_sabek·Apr 19

The submission deadline for the Q-Data Workshop has been extended by one week. The new submission deadline is April 27, 2025. @SIGMODConf #SIGMOD2025

Q-Data Workshop@Q_Data_Workshop

Based on many requests, the submission deadline for the Q-Data Workshop has been extended to April 27, 2025. Plan to submit your work and spread the word! @SIGMODConf #SIGMOD2025

Immanuel Trummer@ImmanuelTrummer·Apr 18

@shctechnologies @ManningBooks @OpenAI @langchain @llama_index Thank you!

SHC TECHNOLOGIES@shctechnologies·Apr 18

@ImmanuelTrummer @ManningBooks @OpenAI @langchain @llama_index Congrats on finishing the book! It sounds super interesting and timely. Excited to dive into it and learn somethin

Immanuel Trummer@ImmanuelTrummer·Apr 17

🥳I finished my book! 📘"Data Analysis with LLMs" shows how to analyze (📄text/🖼️image/🔊audio/📽️video/...) data with #LLMs and #Python! 🔗dataanalysiswithllms.com @ManningBooks #LLM #GPT4 @OpenAI #SQL #GraphData #AgenticAI @langchain @llama_index #Multimodal #DataScience

Immanuel Trummer@ImmanuelTrummer·Apr 17

The book is a hands-on introduction to #LLMs and #Multimodal #DataAnalysis, based on a few mini-projects. It covers the @OpenAI #Python library, #Prompting, #FewShotLearning, #FineTuning, #LLM #Agents, and recent #LLM frameworks like #LangChain and #LlamaIndex.

Immanuel Trummer@ImmanuelTrummer·Apr 15

Well deserved 😀

Immanuel Trummer@ImmanuelTrummer·Apr 15

🥳Many congrats to Dr. Saehan Jo! 🎓Saehan successfully defended his PhD thesis "Efficient Data Systems for Scalable Analysis with LLMs", introducing systems like #ThalamusDB and #SpareLLM that scale up processing with #LLMs to very large data sets! @SIGMODConf #Data #SQL #ML