mufeez

523 posts

@moofeez

tinkering with ML systems and agents infra @linear, fmr @meta. @uwaterloo SE alum

nyc/toronto Joined December 2015
458 Following, 1.7K Followers
Pinned Tweet
mufeez
mufeez@moofeez·
I post-trained Qwen3-Coder to fix bugs using an actual debugger.

The result:
Solve rate: 70% → 89%
Median turns to fix: 46 → 19 (-59%)

Instead of just reading code or print-debugging, it:
- reasons from execution
- inspects live variables and call stacks
- sets breakpoints, steps, and evaluates expressions
92
119
1.6K
119.6K
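The percentages quoted in the pinned tweet can be checked with a few lines of arithmetic. This is only a sketch; the helper name `relative_change` is illustrative and does not come from the thread.

```python
def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after` (negative means a decrease)."""
    return (after - before) / before


# Figures quoted in the tweet.
median_turns_change = relative_change(46, 19)   # about -0.587, which rounds to the quoted -59%
solve_rate_points = 89 - 70                     # gain in percentage points
solve_rate_relative = relative_change(70, 89)   # about +0.271 relative to the 70% baseline

if __name__ == "__main__":
    print(f"median turns: {median_turns_change:+.1%}")
    print(f"solve rate: +{solve_rate_points} points ({solve_rate_relative:+.1%} relative)")
```

Note that the solve-rate figure is a gain in percentage points, while the turn count is reported as a relative change.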
mufeez
mufeez@moofeez·
ok london kinda mogs nyc ngl
[4 images]
0
0
0
0
mufeez
mufeez@moofeez·
@_lychrel @bilaltwovec @_arohan_ ambassador clubhouse is their sister restaurant a few blocks away, might offend someone but it’s similar food and vibes
1
0
1
5
rohan anil
rohan anil@_arohan_·
How do I buy spv on dishoom and hoppers?
5
2
23
2.4K
Ily
Ily@ily_8750·
Canary Wharf will always be my favourite part of London
[3 images]
32
60
1.1K
167.9K
mufeez
mufeez@moofeez·
tuning into the gdb testimony like it's the super bowl
0
0
3
209
Arnie Ramesh
Arnie Ramesh@arnie_hacker·
@chesterzelaya @OpenAI guessing you mean my cs2 project? compute+cloud service quotas💀 i'm rendering out 1K-hours cleaned dataset atm and moving to training baselines this week though :)
1
0
2
114
Arnie Ramesh
Arnie Ramesh@arnie_hacker·
Okay so the @OpenAI image-gen team have definitely rendered out a bajillion cs:go games
[image]
6
0
20
3K
Tommy D. Rossi
Tommy D. Rossi@__morse·
GitHub if Linear designed it
[image]
22
2
125
30K
Jori Lallo
Jori Lallo@jorilallo·
Don't tell anyone (to enable, reconnect GitHub integration with new permissions). Agents which output draft PRs also get code diffs inside Linear now
Daniel Kumlin@daniellkumlin

@linear just silently drops the fact you can review prs now directly. you also have access to the linear agent when you review. I have been waiting so long for this finally! you can even change your theme etc.

3
0
48
9.6K
mufeez
mufeez@moofeez·
@tenderizzation ootl, why did they go for an old school dense model?
2
0
4
584
Giles 🇺🇦 Slava Ukraini!
@moofeez Just wondering what the ideal agentic debugger interface would be… is there an MCP for a language server? Or is there a more efficient mechanism…
1
0
0
149
mufeez
mufeez@moofeez·
@JedidiahMain i’ll cover this in my blog post – a lot of the project was taking proven approaches and adapting them to my specific training objective/shape
0
0
2
218
Daniel Antonio
Daniel Antonio@JedidiahMain·
@moofeez coming from a web SE, what do i need to know to be able to do this?
1
0
0
243
mufeez
mufeez@moofeez·
@EliasLumer @willccbb I explored the standard harness + tool call approach, though there’s definitely room for experimentation here
1
0
1
154
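For readers wondering what a debugger exposed as a tool call could look like, here is a minimal sketch. It is not the author's implementation, which the thread does not show. It assumes a single hypothetical tool, `run_with_breakpoint`, built on Python's `sys.settrace` hook; the function names, the line-offset convention, and the buggy `average` example are all illustrative.

```python
import sys


def run_with_breakpoint(entry, args, func_name, line_offset, expression):
    """Hypothetical debugger tool: call entry(*args) and, the first time
    execution reaches `line_offset` lines below the `def` of `func_name`,
    record the live locals, the call stack, and the value of `expression`."""
    snapshot = {}

    def tracer(frame, event, arg):
        if frame.f_code.co_name != func_name:
            return None  # do not trace lines inside other functions
        at_breakpoint = frame.f_lineno - frame.f_code.co_firstlineno == line_offset
        if event == "line" and at_breakpoint and not snapshot:
            local_vars = dict(frame.f_locals)  # copy of the live variables
            stack, f = [], frame
            while f is not None:  # walk outward until the entry point
                stack.append(f.f_code.co_name)
                if f.f_code is entry.__code__:
                    break
                f = f.f_back
            snapshot["locals"] = local_vars
            snapshot["stack"] = stack
            snapshot["value"] = eval(expression, frame.f_globals, local_vars)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = entry(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return result, snapshot


def average(values):
    total = 0
    for v in values:
        total += v
    return total / (len(values) - 1)  # bug: should divide by len(values)


def report(values):
    return average(values)


if __name__ == "__main__":
    # Break on the `return` line of `average` (4 lines below its `def`).
    result, snap = run_with_breakpoint(
        report, ([2, 4, 6],), "average", 4, "len(values) - 1"
    )
    print(result, snap)
```

In this toy case the snapshot shows `total` as 12 and the denominator evaluating to 2 rather than 3, which is the kind of execution evidence the pinned tweet describes. A real harness might instead wrap an existing debugger such as `pdb` or a Debug Adapter Protocol server; the thread does not say which.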
Elias Lumer
Elias Lumer@EliasLumer·
@moofeez @willccbb Interesting. And by diff variations, im asking how you actually gave an LLM a debugger, like how did you expose it to the LM
1
0
1
165
Elias Lumer
Elias Lumer@EliasLumer·
@moofeez @willccbb This is great, do you have the implementation (tools for debugger, etc) open source? Would love to see/implement this
1
0
0
1.1K
mufeez
mufeez@moofeez·
great questions, I did run evals on Claude models towards the beginning of the project; the failure mode I observed was that the models would start a debug session but fail to use it effectively (shallow/incomplete debugger use), even on harder bugs

not sure what you mean by “diff variations of giving the LLM a debugger”
1
0
1
1.3K
Elias Lumer
Elias Lumer@EliasLumer·
@moofeez @willccbb Did you give opus 4.6/ gpt5.5 a debugger? And see results there? Also did u try diff variations of giving the LLM a debugger?
1
0
1
1.4K