DeepReinforce
@deep_reinforce
128 posts

Trialing and reinforcing the path to tomorrow.

Joined July 2025
272 Following · 550 Followers
Pinned Tweet
DeepReinforce
DeepReinforce@deep_reinforce·
The last stronghold of coding has just been conquered by AI. In the most recent three Codeforces live competitions, i.e., Round 1087, Round 1088, and Round 1089, GrandCode, our agentic AI system, ranked first in all of them, beating all human participants, including legendary grandmasters.

GrandCode is a multi-agent reinforcement learning system designed for competitive programming. It orchestrates a variety of agentic modules (hypothesis proposal, solver, test generator, summarization, etc.) and jointly improves them through post-training and online test-time RL. GrandCode is built on Qwen. Huge respect to the Qwen @Alibaba_Qwen team for their contributions to the community.

It is hard to imagine how quickly AI has advanced in just one year:
1st — GrandCode (March 2026)
8th — Gemini 3.1 Pro (February 2026)
175th — OpenAI o3 (April 2025)

We can't wait to see what happens over the next year.
80 replies · 141 reposts · 246 likes · 198K views
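The pinned tweet names GrandCode's agentic modules (hypothesis proposal, solver, test generator, summarization) but not how they interact. Below is a minimal, purely illustrative sketch of what such an orchestration loop could look like; every function name here is hypothetical and none of this is GrandCode's actual code.

```python
# Hypothetical sketch of a multi-agent competitive-programming loop.
# Module names mirror the tweet's description; this is NOT GrandCode.

def propose_hypotheses(problem):
    # A proposer agent would emit candidate solution strategies.
    return [f"strategy-{i} for {problem}" for i in range(3)]

def solve(problem, hypothesis):
    # A solver agent would turn a strategy into a concrete program.
    return f"program({hypothesis})"

def generate_tests(problem):
    # A test-generator agent would produce adversarial inputs.
    return [f"test-{i}" for i in range(2)]

def passes(program, tests):
    # Stand-in verdict; a real system would execute the program
    # against the generated tests and compare outputs.
    return program is not None and len(tests) > 0

def orchestrate(problem):
    # Try each proposed strategy until one survives the tests.
    tests = generate_tests(problem)
    for hyp in propose_hypotheses(problem):
        program = solve(problem, hyp)
        if passes(program, tests):
            return program  # summarization / RL updates would go here
    return None

print(orchestrate("Round-1087-A"))
```

The tweet's "online test-time RL" would live where the comment marks the return: the system scoring which strategies succeeded and updating the proposer and solver accordingly.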
DeepReinforce
DeepReinforce@deep_reinforce·
@Alibaba_Qwen 🫡Thanks to the Qwen team for their contributions to the open-source community.
0 replies · 0 reposts · 2 likes · 201 views
DeepReinforce reposted
Qwen
Qwen@Alibaba_Qwen·
Congratulations to the GrandCode team on this remarkable achievement! 👏 The last stronghold of coding has been conquered, and we are incredibly proud that Qwen is the engine behind it! 💻🔥 This is a real milestone moment for coding intelligence and a fascinating example of how far agentic RL systems have come. The pace of AI progress is truly mind-blowing. Stay tuned. 🌍✨
DeepReinforce@deep_reinforce

The last stronghold of coding has just been conquered by AI. In the most recent three Codeforces live competitions, i.e., Round 1087, Round 1088, and Round 1089, GrandCode, our agentic AI system, ranked first in all of them, beating all human participants, including legendary grandmasters.

GrandCode is a multi-agent reinforcement learning system designed for competitive programming. It orchestrates a variety of agentic modules (hypothesis proposal, solver, test generator, summarization, etc.) and jointly improves them through post-training and online test-time RL. GrandCode is built on Qwen. Huge respect to the Qwen @Alibaba_Qwen team for their contributions to the community.

It is hard to imagine how quickly AI has advanced in just one year:
1st — GrandCode (March 2026)
8th — Gemini 3.1 Pro (February 2026)
175th — OpenAI o3 (April 2025)

We can't wait to see what happens over the next year.

9 replies · 12 reposts · 124 likes · 13.3K views
DeepReinforce
DeepReinforce@deep_reinforce·
1/ Screenshots of our standings in the three competitions.
1 reply · 7 reposts · 40 likes · 2.2K views
DeepReinforce
DeepReinforce@deep_reinforce·
🧑‍🍳 CUDA-L2 now supports H100 and RTX 3090.
🔹 On H100 under server mode, CUDA-L2 achieves +41.7%, +40.5%, +42.1%, and +22.1% over torch.matmul, cuBLAS, cuBLASLt-heuristic, and cuBLASLt-AutoTuning.
🔹 On RTX 3090 under server mode, CUDA-L2 achieves +28.7%, +35.3%, +28.1%, and +19.8% over torch.matmul, cuBLAS, cuBLASLt-heuristic, and cuBLASLt-AutoTuning.
🥳 More updates will come. Stay tuned 🫡 #CUDA #AI
1 reply · 0 reposts · 10 likes · 321 views
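The "+41.7% over torch.matmul" style figures read as relative throughput gains. Assuming the usual definition (a guess on our part; the post does not define the metric), the percentages fall out of raw kernel timings like this:

```python
def speedup_pct(baseline_ms, optimized_ms):
    # Relative gain of the optimized kernel over the baseline,
    # expressed as a percentage: +100% means twice as fast.
    return (baseline_ms / optimized_ms - 1.0) * 100.0

# Example: a kernel running in 0.706 ms against a 1.0 ms baseline
# comes out at roughly +41.6%, in the ballpark of the H100 numbers.
print(round(speedup_pct(1.0, 0.706), 1))
```

In practice one would time both kernels over many warmed-up iterations (e.g. with CUDA events) before feeding the averages into a formula like this.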
DeepReinforce
DeepReinforce@deep_reinforce·
☺️ Stay tuned!!
0 replies · 0 reposts · 0 likes · 87 views
DeepReinforce
DeepReinforce@deep_reinforce·
🧑‍🍳 New Use Case Drop!
🧐 We used IterX to optimize pbrt-v4's hottest CPU paths, tackling a core bottleneck in physically based rendering: the cost of doing more work per ray than necessary during traversal, intersection, spectrum math, and volumetric sampling.
🥳 On an AMD EPYC 7402 24-Core (8 threads), at 16 spp, across 8 scenes, our optimization delivered an average speedup of 11.8%, with volumetric scenes improving by about 14%.
🎁 We also offer unlimited credits for all users!
🫡 Thanks to this amazing community @GPUOpen @seanbax @KostasAAA @stigatle @marcosalvi @adyaman #AMD #IterX #DeepReinforce
DeepReinforce@deep_reinforce

🥳 Introducing IterX: an automated system for deep code optimization using reinforcement learning.
🧐 Simply define a reward function, and IterX automatically iterates toward the optimal solution through thousands of trials and explorations using RL.
🎁 Every new user receives 30M free tokens. We can't wait to see what you build with IterX. 🧵

2 replies · 0 reposts · 6 likes · 725 views
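The IterX pitch is "define a reward function, and the system iterates toward the optimum". Its actual API is not shown in this thread, so here is a toy sketch of that interface with seeded random search standing in for the RL loop; the `reward` and `iterate` names are illustrative, not IterX's real API.

```python
import random

def reward(candidate):
    # User-defined reward: here, negative distance to a target value,
    # so the best possible reward is 0 at candidate == 42.
    return -abs(candidate - 42)

def iterate(reward_fn, trials=1000, seed=0):
    # Toy stand-in for RL-driven search: sample candidates and keep
    # the one with the highest reward seen so far.
    rng = random.Random(seed)
    best, best_r = None, float("-inf")
    for _ in range(trials):
        cand = rng.randint(0, 100)
        r = reward_fn(cand)
        if r > best_r:
            best, best_r = cand, r
    return best

print(iterate(reward))
```

The appeal of the reward-function interface is that the same loop applies whether the candidates are integers, CUDA kernels, or rendering code paths; only `reward` changes.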
DeepReinforce
DeepReinforce@deep_reinforce·
☺️ Stay tuned!!
0 replies · 0 reposts · 0 likes · 75 views
DeepReinforce
DeepReinforce@deep_reinforce·
🥳 IterX for MLSys 2026 NVIDIA Track Fused MoE
🧑‍🍳 IterX achieves a 15.62× speedup on H100 and 14.84× on B200, significantly surpassing GPT-5.2 Pro and Claude 4.6 Opus on the Fused MoE setting of the FlashInfer AI Kernel generation contest.
🎁 We offer unlimited credits for all participants. Come and join!
🤗 We've also open-sourced the full recipe to reproduce our results. All MLSys 2026 challenge participants are welcome to build on top of it. #NVIDIAGTC
3 replies · 3 reposts · 13 likes · 856 views
DeepReinforce
DeepReinforce@deep_reinforce·
Feedback is appreciated!!
1 reply · 0 reposts · 5 likes · 461 views
DeepReinforce
DeepReinforce@deep_reinforce·
🚀 Major IterX upgrade: agent integration is now supported!
🔹 Optimizing Hardcore Code: SOTA in Infra, CUDA, Smart Contracts, DBs, AI/ML Ops and beyond.
🔹 No Manual Code: agents (Claude Code, Cursor) handle the integration. Effortless onboarding! #CUDA #AI
DeepReinforce@deep_reinforce

🥳 Introducing IterX: an automated system for deep code optimization using reinforcement learning.
🧐 Simply define a reward function, and IterX automatically iterates toward the optimal solution through thousands of trials and explorations using RL.
🎁 Every new user receives 30M free tokens. We can't wait to see what you build with IterX. 🧵

38 replies · 29 reposts · 149 likes · 1.1M views
DeepReinforce
DeepReinforce@deep_reinforce·
🎉🎉 CUDA-L1 is accepted to ICLR 2026!
🌟🌟 This was our first work using RL for CUDA generation. Now we have CUDA-L2, alongside so much great work from the community. It's amazing how fast the field has moved in just the past six months.
🦾🦾 Still cooking! Stay tuned!
🔗 Paper: arxiv.org/abs/2507.14111
🔗 Project: github.com/deepreinforce-…
4 replies · 3 reposts · 13 likes · 867 views
DeepReinforce
DeepReinforce@deep_reinforce·
🥳🥳 Demo of using IterX to achieve ~1140 cycles on Anthropic's take-home challenge without troubleshooting. 😀😀 Feedback is deeply appreciated!!
2 replies · 1 repost · 10 likes · 733 views