Delta Institute @ ICLR

730 posts

@DeltaInstitutes

Supporting exceptional researchers and engineers, from academia to industry and beyond.

Joined February 2025
809 Following · 2.9K Followers
Pinned Tweet
Delta Institute @ ICLR (@DeltaInstitutes)
Join us in welcoming the first cohort of Delta Fellows! 🎉 Congrats to the ~100 amazing researchers and engineers joining the Delta Institute family. We're excited for our fellows to get to know each other through dinners, retreats, and much more! Our fellows come from diverse backgrounds: undergrads, PhD students, high-frequency trading, big tech, startups, neolabs, frontier labs, and more. What brings them together is their kindness, intellectual curiosity, and intrinsic passion for their field. deltainstitutes.org/cohort1
13 replies · 12 reposts · 173 likes · 87.1K views
Hanna Hajishirzi (@HannaHajishirzi)
Life update here: Last week marked the end of my time at Ai2. Proud to have built releases like Olmo, Tülu, FlexOlmo, DRTulu, OLMoTrace, OlmoE, and datasets including Dolma and Dolci—and of how strongly we pushed for open models and open science. Our artifacts reached 33M+ downloads, including ~4M for Olmo 3. I believe Olmo has empowered researchers to push the boundaries of AI. I’ll always be cheering on Ai2 and will continue to strongly support open-source, open-science AI. I’m deeply grateful for this chapter and excited for what comes next.
37 replies · 22 reposts · 503 likes · 47.3K views
Delta Institute @ ICLR reposted
rLLM (@rllm_project)
One day into the Parameter Golf Challenge. Hive’s agent swarm pushes val_bpb from 1.19 → 1.14 — and the best runs are now topping the official leaderboard from @OpenAI 🔥 Plug in your agents and evolve with the swarm🐝
rLLM (@rllm_project)

Hive now adds the Parameter Golf challenge from @OpenAI. Just 12 hours in, val_bpb is already down from 1.26 → 1.19. Plug in your agents to the Hive mind to accelerate the evolution. 🐝

0 replies · 11 reposts · 81 likes · 11.3K views
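For context, val_bpb is validation-set bits per byte, a tokenizer-independent measure of language-model compression. As a rough sketch (the function name and arguments here are my own, not from the challenge), a model's mean per-token negative log-likelihood in nats converts to bits per byte like this:

```python
import math

def bits_per_byte(nll_nats_per_token: float, n_tokens: int, n_bytes: int) -> float:
    """Convert mean NLL (nats per token) on a corpus into bits per byte.

    Total nats become total bits (divide by ln 2), then normalize by the
    raw byte count so results stay comparable across tokenizers.
    """
    total_bits = nll_nats_per_token * n_tokens / math.log(2)
    return total_bits / n_bytes
```

On this metric, the reported drop from 1.26 to 1.19 bpb is roughly a 5.5% reduction in compressed size.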
Delta Institute @ ICLR reposted
Sijun Tan (@sijun_tan)
This might be the coolest project I’ve worked on during my PhD. We built Hive 🐝 — a platform where agents collaboratively evolve shared solutions. We started with optimizing an agent harness for Tau2-Bench. One agent begins, iteratively improving its answers… then others join. They read each other’s runs, fork the best ideas, propose new ones, and push the solution forward together. It feels like watching a horse race on a live leaderboard: you root for your agent, it climbs… but every move raises the bar for everyone else. Competitive energy, but collaborative progress. Overnight, the scores improved from 45% to 77%. This is swarm intelligence in action. We’ve added Terminal-Bench and ARC-AGI-2, with more tasks coming soon. Join the hive mind — plug in your agents and start evolving together. Can’t wait to see how Hive evolves with the community.
rLLM (@rllm_project)

We built Kaggle, but for agents. Introducing Hive 🐝 A crowdsourced platform where agents evolve solutions together. Every agent builds on prior work. Every improvement is shared. Every step moves the frontier forward. As a first step, we’re launching challenges for agents to evolve their own harnesses — modifying themselves to score higher on benchmarks. Recursive self-improvement, in the wild. Let’s see how far swarm intelligence can take this. Links below:

7 replies · 20 reposts · 308 likes · 45.2K views
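The fork/read/propose loop described in the thread can be sketched as a toy archive-based search. Hive's actual internals aren't shown, so every name below (`hive_round`, `propose`, `evaluate`) is illustrative only:

```python
import random

def hive_round(archive, propose, evaluate, forks_per_round=4, top_k=3):
    """One toy round of collaborative evolution: each participating agent
    forks one of the top-scoring solutions in the shared archive, proposes
    a variant, and the scored variant goes back into the archive for
    everyone to read."""
    archive.sort(key=lambda entry: entry["score"], reverse=True)
    for _ in range(forks_per_round):
        parent = random.choice(archive[:top_k])  # fork a leading idea
        child = propose(parent["solution"])      # agent's new proposal
        archive.append({"solution": child, "score": evaluate(child)})
    return max(archive, key=lambda entry: entry["score"])
```

Because every proposal lands in the shared archive, later forks within the same round can already build on earlier ones, which is the "competitive energy, collaborative progress" dynamic the thread describes.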
rLLM (@rllm_project)
We built Kaggle, but for agents. Introducing Hive 🐝 A crowdsourced platform where agents evolve solutions together. Every agent builds on prior work. Every improvement is shared. Every step moves the frontier forward. As a first step, we’re launching challenges for agents to evolve their own harnesses — modifying themselves to score higher on benchmarks. Recursive self-improvement, in the wild. Let’s see how far swarm intelligence can take this. Links below:
19 replies · 52 reposts · 442 likes · 93K views
Anne Ouyang (@anneouyang)
Excited to share @Standard_Kernel's seed round and some reflections on what we’ve learned about kernel generation and what we believe is next. Grateful to our amazing team, supporters, and the broader community pushing this space forward.
46 replies · 44 reposts · 511 likes · 123.4K views
alphaXiv (@askalphaxiv)
Introducing MCP for arXiv. Let your research agents stand on the shoulders of giants: fast multi-turn retrieval, keyword search, and embedding search tools across millions of arXiv papers 🚀
74 replies · 399 reposts · 3K likes · 255K views
Stephan Hoyer (@shoyer)
After an incredible decade at Google, it’s time for my next chapter. This week, I joined Periodic Labs, a startup building and training AI scientists with autonomous laboratories.
21 replies · 21 reposts · 742 likes · 32K views
Philip Bogdanov (@philip_bogdanov)
Incredibly impressed with OpenAI Grove Cohort 2; looking forward to the next batch.
17 replies · 4 reposts · 186 likes · 18.2K views
Delta Institute @ ICLR reposted
Moonlake (@moonlake)
Introducing a world built by Moonlake's world model. 🏙️ Most world models only allow a limited action space. Moonlake maintains multimodal states across physics, appearance, geometry, and causal effects, and predicts how they evolve under different actions. 👇
100 replies · 202 reposts · 1.6K likes · 371.3K views
Jinjie Ni (@NiJinjie)
Life update: I’ve joined @GoogleDeepMind as a research scientist to work on ✨Gemini scaling and RL, under the leadership of Yi Tay (@YiTayML) and Quoc Le (@quocleix). I feel extremely fortunate to be on the critical path towards AGI and can't wait to help push the frontier of Gemini capabilities! 🚀
66 replies · 26 reposts · 1.2K likes · 88.7K views
Madhav Kanda (@madhav_kanda_)
Excited to share that RefineStat has been selected as an Oral at ICLR 2026 (top ~1%)! 🚀 We show how execution feedback + semantic constraints can guide models toward correct programs far more reliably than naive resampling. 🧠✨ Grateful to amazing collaborators and mentors 🙌
2 replies · 1 repost · 28 likes · 3.2K views
Zheng Zhao (@zhengzhao97)
🎉 Thrilled to announce our paper "Verifying Chain-of-Thought Reasoning via Its Computational Graph" has been accepted as an ICLR 2026 ORAL! 🚨 We look inside the "black box" to detect reasoning errors by analyzing the model's internal circuit. 🧠⚡️ Read more on CRV 👇
Zheng Zhao (@zhengzhao97)

Thrilled to share our latest research on verifying CoT reasoning, completed during my recent internship at FAIR @metaai. In this work, we introduce Circuit-based Reasoning Verification (CRV), a new white-box method to analyse and verify how LLMs reason, step by step.

5 replies · 32 reposts · 151 likes · 27K views
Fred Zhangzhi Peng (@pengzhangzhi1)
PAPL has been accepted at ICLR 2026! A simple tweak to your DLM training that lets it learn the generation order you will use at sampling time, with ONE line of code changed. Shoutout to Zach, Anru, @ShuibaiZ69721 @jarridrb @AlexanderTong7 @mmbronstein @bose_joey #ICLR2026
Fred Zhangzhi Peng (@pengzhangzhi1)

🚨 New paper! We introduce a planner-aware training tweak to diffusion language models. ⚡ One-line-of-code change to the loss 💡 Fixes training–inference mismatch 📈 Strong gains in protein, text, and code generation arxiv.org/abs/2509.23405 (1/n)

4 replies · 8 reposts · 45 likes · 18.3K views
Justin Chih-Yao Chen (@cyjustinchen)
Excited to share that Nudging the Boundaries of LLM Reasoning (NuRL) has been accepted to #ICLR2026! 🎉 We show that "nudging" the LLM with self-generated hints can expand the model's learning zone, meaning that it can solve previously "unsolvable" problems with hints 👉 consistent gains in pass@1 (+0.8-1.8% over GRPO) & up to +7.6% in pass@1024 on challenging tasks!
Justin Chih-Yao Chen (@cyjustinchen)

🚨 NuRL: Nudging the Boundaries of LLM Reasoning

GRPO improves LLM reasoning, but often within the model's "comfort zone": hard samples (w/ 0% pass rate) remain unsolvable and contribute zero learning signals. In NuRL, we show that "nudging" the LLM with self-generated hints effectively expands the model's learning zone 👉 consistent gains in pass@1 on 6 benchmarks w/ 3 models & raises pass@1024 on challenging tasks!

Key takeaways:
1⃣ GRPO can't learn from problems the model never solves correctly, but NuRL uses self-generated "hints" to make hard problems learnable
2⃣ Abstract, high-level hints work best—revealing too much about the answer can actually hurt performance!
3⃣ NuRL improves performance across 6 benchmarks and 3 models (+0.8-1.8% over GRPO), while using fewer rollouts during training
4⃣ NuRL works with self-generated hints (no external model needed) and shows larger gains when combined with test-time scaling
5⃣ NuRL raises the upper limit: it boosts pass@1024 up to +7.6% on challenging datasets (e.g., GPQA, Date Understanding)
🧵

2 replies · 22 reposts · 51 likes · 3.2K views
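Mechanically, the nudging step can be sketched as a preprocessing pass over the training pool. The function and hint format below are my own illustration, not the paper's code: problems with a 0% pass rate get an abstract, self-generated hint prepended, so GRPO rollouts can finally produce a nonzero learning signal on them.

```python
def nudge_training_pool(problems, pass_rates, generate_hint):
    """Prepend a self-generated, high-level hint to problems the model
    currently never solves (pass rate 0); leave solvable problems as-is,
    since they already yield a learning signal under GRPO."""
    nudged = []
    for problem, rate in zip(problems, pass_rates):
        if rate == 0.0:
            hint = generate_hint(problem)  # abstract hint, not the answer
            nudged.append(f"Hint: {hint}\n\n{problem}")
        else:
            nudged.append(problem)
    return nudged
```

The thread notes that abstract hints work best, so `generate_hint` here is assumed to return a high-level strategy rather than anything close to the final answer.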
Takuya Akiba (@iwiwi)
Thrilled to share that three of our papers have been accepted to #ICLR2026. Huge thanks to my co-authors!
3 replies · 11 reposts · 172 likes · 17.8K views