Post

Cursor
Cursor@cursor_ai·
We trained Composer to self-summarize through RL instead of a prompt. This reduces the error from compaction by 50% and allows Composer to succeed on challenging coding tasks requiring hundreds of actions.
Cursor tweet media
English
82
100
1.6K
203.5K
AJ
AJ@AJ_1121·
@cursor_ai @grok is Composer built on a proprietary model or an Open source model?
English
1
0
0
966
Alex Gonch
Alex Gonch@AlexGonchX·
@cursor_ai nice but cmd+N still works better than any amount of RL training on this
English
1
0
5
3.3K
BetMGM 🦁
BetMGM 🦁@BetMGM·
Pick which twin will win! You could score a share of $2 million in Bonus Bets.
English
295
184
3.7K
47.3M
Kwangho Axeon
Kwangho Axeon@Kwanghoaxeon·
@cursor_ai This is a shift from prompting → training for behavior. RL-based compression = better long-horizon reasoning and consistency. The edge moves to models that can think over time, not just respond once.
English
0
0
0
345
Cognoska
Cognoska@Cognoska·
@cursor_ai Reducing compaction error by 50% is huge. Especially for long-horizon tasks.
English
0
0
0
153
Rocky 🪨
Rocky 🪨@XunWallace·
This is the right direction. Running an AI agent across 100+ actions means your context window fills with stale file contents, old error messages, and resolved TODOs. Prompt-based summarization can't learn which details matter for downstream actions. RL lets the model discover what to keep vs discard based on actual task outcomes — not a generic "summarize this" instruction. Curious about the reward signal. Is it task completion rate on held-out coding benchmarks, or something more nuanced like diff quality?
English
0
0
0
321
HarshCodes
HarshCodes@HarshSh54928171·
This RL self-summarization is game-changing for long sessions! Reduced compaction errors by 50% means way less context loss on big refactors. How does it compare to Claude Code's agent handling for 100+ action tasks? Using it in my daily workflow now mind blown. 🔥 Anyone else seeing 10x speed?
English
0
0
0
557
Market Genius
Market Genius@marketgeniusx·
RL-trained self-summarization is the right approach. The bottleneck in long-horizon coding tasks was never the model's ability — it was context management. Prompt-based compaction loses signal at exactly the wrong moments. Training what to preserve vs discard changes the ceiling entirely.
English
0
0
2
944
joe shawn
joe shawn@Glorian_zxr·
Better summarization means agents can handle longer tasks. But even with perfect memory, the agent still doesn't know your team's conventions or domain expertise. That's the gap — not capability, but context. We've been thinking about how to make that context layer composable and reusable across projects.
English
0
0
1
258
dnu
dnu@DnuLkjkjh·
@cursor_ai how are you generating the training signal for this? getting ground truth for "good summary" seems harder than the RL itself
English
0
0
0
33
dnu
dnu@DnuLkjkjh·
@cursor_ai prompted compaction always felt lossy to me. curious what the training signal for "good summary" looks like, seems hard to define well
English
0
0
0
60
Dairy Queen
Dairy Queen@DairyQueen·
Free Cone Day is here! Come celebrate with us at a DQ location on today by getting a FREE small vanilla cone 🍦
English
934
2.3K
15.2K
24.9M
Chen Avnery
Chen Avnery@MindTheGapMTG·
@cursor_ai 1/5 the tokens, nearly identical accuracy. The model learned what to keep and what to drop through RL - not a prompt telling it to summarize. That's the core lesson: instruction-following is fragile. Learned behavior holds under pressure.
English
0
0
0
603
DragAI
DragAI@Drag_AILabs·
Training the summarization behavior through RL instead of prompting it is the right call. Prompt-based compaction tells the model what to preserve — RL-trained summarization teaches it what actually matters for task completion. The 1/5 token usage while closing to within 1 point of full context on Hard 80k is the part that makes this production-viable, not just a research result.
English
0
0
1
1.1K
Mike Darlington
Mike Darlington@DarlingtonDev·
@cursor_ai @shaoruu This is incredible, training context management through RL rather than prompting is the pattern I expect every serious agent framework to adopt. The model should learn what to remember, not be told what to remember.
English
0
0
12
2.3K
Tassel Pierre
Tassel Pierre@tassel_pierre·
@cursor_ai Interesting, but why not train the compaction turn itself by doing multiple rollouts using the summary and assigning the summary the average final coding task reward? I.e, a good summary should lead to good completion outcomes for the agent 1/2
English
1
0
1
712
Dishant Sharma
Dishant Sharma@dishant0406·
@cursor_ai Using RL for self-summarization is a smart approach — prompts are brittle for something this critical. 50% error reduction is massive for long-running tasks. How does the training data scale with task complexity?
English
0
0
0
241
Atul thakur
Atul thakur@unfilteredatul·
@cursor_ai this is huge for long running tasks. been hitting context limits on big refactors and this would fix like half my frustration with ai coding tools
English
0
0
0
254
Mingta Kaivo 明塔 开沃
@cursor_ai compaction was killing AudioWave's long agent sessions for months. the RL fix is right — a prompt guessing what to preserve vs training on actual outcomes are completely different signals. curious if this helps multi-file refactors or mainly linear task chains
English
0
0
0
811
Isaiah Dupree
Isaiah Dupree@IsaiahDupree7·
@cursor_ai Impressive approach! Leveraging RL for self-summarization could really change the game for code-related tasks. Curious about how this impacts the balance between creativity and accuracy in generated solutions. Any insights on that? 🤔
English
0
0
1
673
Nico Westerdale
Nico Westerdale@nicowest·
@cursor_ai Great approach. I'm wondering how the Cursor team handles the cases of incorrect steps that need to be backtracked? Ideally these are scrubbed from context so they don't bloat the main thread with poor decisions off an otherwise solid approach.
English
0
0
0
91
FanDuel Sportsbook
FanDuel Sportsbook@FDSportsbook·
It’s time to dance! Get it on tournament action with Bonus Bets from FanDuel.
English
68
63
921
12.7M
Twlvone
Twlvone@twlvone·
@cursor_ai RL for self-summarization is the right move — the model learns what it needs to preserve to solve future tasks, not what a human thinks looks important. Prompt-based compaction is static; this adapts to the actual task distribution. Makes sense the error rate drops.
English
0
0
1
336
pablo
pablo@safetnsr·
@cursor_ai training what to forget is a harder problem than training what to remember. prompts just punt the decision to the user
English
0
0
5
607
Beautiful
Beautiful@Random_Desiress·
@cursor_ai knowing that Auto is going away post my year long subscription is driving me away from cursor... i am using it less and less where as initially i was using it for everything.. :(
English
0
0
0
232
Blake Heron
Blake Heron@BlakeHer_on·
@cursor_ai way more durable than any handcrafted compaction logic.
English
0
0
2
404
Zakir Hossain
Zakir Hossain@zakiraicoder·
@cursor_ai 50% error reduction from one architectural decision. This is why "just add a better prompt" only gets you so far.
English
0
0
7
389
Devansh Ranjan
Devansh Ranjan@dewanshranjan·
@cursor_ai training a model to know when its own context is getting too big instead of just prompting it is the kind of engineering that actually scales
English
0
0
0
20
Savaer
Savaer@savaerx·
@cursor_ai makes sense honestly. a prompt doesn't know which parts of context will matter 50 steps later. rl can figure that out by failing first. does this generalize to any long-context task or only where you have clean pass/fail signals like tests?
English
0
0
0
568
Johannes Zwilling
Johannes Zwilling@JohanesZwilling·
⚠️⚠️⚠️ Any tips how to recover "lost" chats? Cursor just crashed on me and took 6 chats with itself, weirdly all from last week until today. "Older" chats load fine. Changes since last commit under version control are still visible. There are a bunch of .jsonl files under /agent-transcripts that seem to map to those lost chats. Any way to import them back???
English
0
0
0
21
Utkarsh Singh
Utkarsh Singh@Utkarsh51557661·
@cursor_ai cool approach. but isn't relying on RL risky? feels like it could lead to unpredictable outcomes in complex tasks.
English
0
0
3
658
dhmnrx
dhmnrx@dhmnrx·
@cursor_ai Would love to see a comparison after multiple compactions
English
0
0
0
34
Sadi Moodi
Sadi Moodi@MoodiSadi·
RL-based summarization beats prompts because it learns from actual task outcomes, not just what a human thinks matters. The key insight: context management is the bottleneck, not model capability. Training on pass/fail signals teaches the model what to keep vs discard. This is why agent frameworks need to invest in environment design, not just bigger models.
English
0
0
1
455
Unscene - Speak | Listen | Discover
Unscene - Speak | Listen | Discover@ContactUnscene·
Unscene is a location-based audio experience. Move through the world differently. As you walk, cycle, wander, or pause, you leave behind short voice recordings – tied to the places where they were spoken. An invisible archive of human thought.
English
0
332
2.9K
16M
Laurynas
Laurynas@lrnsdesign·
@cursor_ai So this is why long composer sessions have felt more stable lately. Good to see the data behind it.
English
0
0
1
274
忍者
忍者@byteprobe·
@cursor_ai y’all been cooking up some seriously good sauce and really helping more people catch on to this.
English
0
0
1
198
Habib Ur Rahman
Habib Ur Rahman@HabibUr11142637·
@cursor_ai I am a student how I can get this free
North-west Frontier, Pakistan 🇵🇰 English
0
0
0
15
I-share