Command Code (@CommandCodeAI) - Twitter profile

Pinned Tweet

Command Code@CommandCodeAI·20 May

A dollar for $40 of DeepSeek V4 Pro usage? Hard to say no to that.

English

103

78

1.4K

1.5M

Command Code@CommandCodeAI·3h

@Sroinotna email us support@commandcode.ai

Français

0

183

An@Sroinotna·3h

@CommandCodeAI Where can I get support about my billing plan?

English

2

0

208

Command Code@CommandCodeAI·3h

Command Code is the only code agent that: 1. has a $1 Go plan with 10x free credits (best overall) 2. optimizes for top open models 3. repairs open models' tool calls for free 4. doesn't charge 400% more on open models like DeepSeek/MiMo. Almost every other coding agent does, check!

English

10

5

68

3.2K

Command Code retweeted

Lord Libra⚖️@lordlibrachaos·4h

It’s been a month and some days now since I tried @CommandCodeAI and it’s received a massive influx of frontier models, added core browsing capabilities, and dramatically overhauled how it handles deep reasoning and local developer workflows.

Lord Libra⚖️@lordlibrachaos

subscribed. let’s build.

English

1

2

5

733

Command Code@CommandCodeAI·1d

// Reliable tool calls. Any open model. Open models are great at writing code and terrible at calling tools. Command Code fixes that: it validates every tool call and auto-repairs 56K+ tool calling issues. ↳ $ npm i -g command-code

English

2

3

81

2.9K

Command Code retweeted

Ahmad Awais@MrAhmadAwais·21h

Back to open models everyone!!

English

14

1

146

5.2K

Command Code@CommandCodeAI·20h

Resume sessions with ease, now in Command Code: cmd -c or cmd --resume

English

6

3

67

5.1K

Command Code@CommandCodeAI·1d

@FishRaposo @MrAhmadAwais @zeeg Yep, wonder who? 💁‍♂️

English

0

2

22

Vinícius Raposo (Fish)@FishRaposo·1d

@MrAhmadAwais @zeeg I wonder if someone fixed the taste issue 👀

English

1

0

1

56

David Cramer@zeeg·1d

codex writes the most disgusting code. idk who's responsible for pre-training over there but you gotta flip the script

English

83

14

646

104.5K

Command Code@CommandCodeAI·1d

@igorcesarcode Can you file a ticket with your session? We did make lots of fixes and a continuity loop repair. Check the latest version.

English

0

1

218

Igor César@igorcesarcode·1d

@CommandCodeAI For me, it's currently unusable. It gets stuck in a loop and ends up consuming all of my credits. I'll wait for a fix.

English

1

0

261

Command Code retweeted

Command Code@CommandCodeAI·1d

Kimi K2.7 Code is now available in Command Code. 10x free credits in Go. Our new #1 open model in internal benchmarks. cmd update to v0.37.0, select via /model • 256K context 🍃 • 30% lower reasoning tokens than K2.6 ✅ • Open weights, 1T-parameter MoE, 32B active ⚡

English

15

13

225

16.9K

Command Code@CommandCodeAI·1d

@WilThomson42533 BYOK not here atm. We need to make a few architectural changes before enabling it again.

English

0

1

252

Wil Thomson@WilThomson42533·1d

@CommandCodeAI Hey team, is there currently a way to add a custom provider / custom endpoint in Command Code? I couldn’t find an option and I’d love to connect my own proxy APIs or local models. Adding custom provider support would be really useful for many users.

English

2

0

2

443

Command Code retweeted

Ahmad Awais@MrAhmadAwais·1d

My Latent Space Podcast with Swyx is now live. 🎙️ How Command Code made DeepSeek outperform Opus 4.7 with Taste. Thanks for having me, swyx!

Ahmad Awais@MrAhmadAwais

how did we make deepseek outperform opus 4.7?
i've been thinking about why "open model bad at tool calling" is almost always a harness problem, not a model problem.
context: spent two days looking at billions of tokens in @CommandCodeAI (tb open source ai cli) using deepseek. i ended up writing a tool-input repair layer. the trigger was watching deepseek-flash fail on the simplest /review run: every shellCommand and readFile call bounced back with a raw zod issues blob, and the model couldn't recover because the error wasn't in a form it could read. by the end deepseek v4 pro was beating opus 4.7 6/10 times on our internal evals. a few things i learned that feel general:
1/ the failure modes aren't random; they're a small, finite, compositional set. across deepseek-flash, deepseek v4 pro, glm, qwen, the same four mistakes repeat almost exactly:
- sending `null` for an optional field instead of omitting it
- emitting `["a","b"]` as a json *string* instead of an actual array
- wrapping a single arg in `{}` where the schema expected an array (an "empty placeholder")
- passing a bare string where an array was expected (`"foo"` instead of `["foo"]`)
four repairs, ~30-100 lines each, ordered carefully (json-array-parse must run before bare-string-wrap or `'["a","b"]'` becomes `['["a","b"]']`). that is the whole catalogue. when i hear "this open source model can't do tool calls" i now assume one of those four, and so far that's been right ~90% of the time.
2/ the funniest failure mode is also the most revealing. deepseek-flash, when asked to edit or write a file, sometimes emits the path as a *markdown auto-link*: filePath: "/Users/x/proj/[notes.md](http://notes.md)". our writeFile tool obediently tried creating files literally named `[notes.md](http://notes.md)` until we caught it. this is not a hallucination. it's the post-training chat distribution leaking through the tool boundary: the model has been rewarded for auto-linking in conversational output, and is applying that prior in a context where it makes no sense. the fix is two regex lines that unwrap only the degenerate case where link text equals url-without-protocol; real markdown like `[click](https://x.com)` passes through untouched. it is also conditioning from the tools the model used during RL, which were different from the tools we write and which we ofc can't predict. "tool confusion" is a more useful frame than "capability gap." the model knows how to format a path. it just hasn't been told clearly enough that this path is going to fopen, not into a chat bubble. so we encode that hint at the schema level, `pathString()` instead of `z.string()`, and the leak is plugged for every path field at once.
3/ the design choice that mattered was inverting preprocess-then-validate to validate-then-repair. my first attempt was the obvious one: a preprocessing pass that normalized inputs (strip nulls, parse stringified arrays, etc.) before zod ever saw them. it broke immediately: writeFile content that *happened* to be json-shaped got rewritten before it hit disk. silent corruption, easy to miss in a smoke test. then i made it less greedy:
- parse the input as-is. if it succeeds, ship it. valid inputs are never touched.
- on failure, walk the validator's own issue list. for each issue path, try the four repairs in order until one applies.
- parse again. on success, log `tool_input_repaired:${toolName}`. on failure, log `tool_input_invalid:${toolName}` and return a model-readable retry message.
the structural insight here is: when you preprocess, you encode a prior about what's broken. when you let the validator complain first, the schema is the prior, and you only spend repair budget at the exact paths the schema actually disagreed at. the validator is doing the work of localizing the bug for you. it's the same shape as cheap-then-careful everywhere else: try the fast path, fall back on evidence. (this also gives you per-tool telemetry for free. you can watch repair rates per (model, tool) and notice when a model regresses on a specific contract before users do.)
4/ shape invariants and relational invariants need different fixes. the four repairs above all handle shape problems: wrong type, missing key, wrong container. but read_file had a *relational* invariant: "if you provide offset, you must also provide limit, and vice versa." deepseek kept calling `readFile({ absolutePath, limit: 30 })` and getting an `ERROR:` back. you can't fix this with input repair, because each field is independently valid; the bug is in the relationship between them. so i taught the function the model's intent instead. `limit` alone → `offset = 0`. `offset` alone → `limit = 2000` (matches the common default for read tools). then surfaced the decision back to the model in the result: "Note: limit was not provided; defaulted to 2000 lines. To read more or fewer lines, retry with both offset and limit." no `Error:` prefix, so the tui doesn't paint it red. the model sees what we picked and can self-correct on the next turn if our guess was wrong. transparency over silent magic wins big. repair where you can. extend semantics where you can't. surface the choice either way.
zoom out: a lot of what looks like model capability is actually contract design. a strict schema is a choice with a cost: it filters out noise, but it also filters out recoverable noise from any model that hasn't memorized the exact json contract you happened to pick. the largest commercial models eat that cost invisibly and are lenient on tool calling because they've seen enough of every contract during pretraining; open models pay it loudly and get dismissed for it. the harness is where you mediate between distributions.
four small repairs (i'm sure more to follow as we have three more merging today), two regex lines for auto-links, one relational default, one prefix change. the model didn't change. the contract got more forgiving in exactly the places it needed to be. deepseek v4 pro now beats opus 4.7 6/10 times on our internal evals. imo "skill issue" applies to the harness more often than the model.
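
The validate-then-repair loop from points 1/ and 3/ could look roughly like the sketch below. It is not Command Code's actual code. To stay self-contained it uses a tiny hand-rolled validator in place of zod, and `FieldSpec`, `validate`, `REPAIRS`, `parseWithRepair` and the wording of the retry message are all hypothetical names chosen for illustration. Treating `{}` as an empty array is one reading of the "empty placeholder" case, not something the thread spells out.

```typescript
// Hypothetical mini-schema standing in for zod; every name here is illustrative.
type FieldSpec = { type: "string" | "number" | "string[]"; optional?: boolean };
type Schema = Record<string, FieldSpec>;
type Input = Record<string, unknown>;
type Issue = { path: string; message: string };
type Outcome =
  | { ok: true; value: Input; log: string | null }
  | { ok: false; log: string; retryMessage: string };

// Returns one issue per field that disagrees with the schema.
function validate(schema: Schema, input: Input): Issue[] {
  const issues: Issue[] = [];
  for (const [path, spec] of Object.entries(schema)) {
    const value = input[path];
    if (value === undefined) {
      if (!spec.optional) issues.push({ path, message: "required field is missing" });
      continue;
    }
    const matches =
      spec.type === "string[]"
        ? Array.isArray(value) && value.every((item) => typeof item === "string")
        : typeof value === spec.type;
    if (!matches) issues.push({ path, message: `expected ${spec.type}` });
  }
  return issues;
}

const NO_REPAIR = Symbol("no-repair"); // this repair does not apply
const OMIT = Symbol("omit"); // the field should be dropped entirely
type Repair = (value: unknown, spec: FieldSpec) => unknown;

// Order matters: JSON-array parsing must run before bare-string wrapping,
// otherwise '["a","b"]' would become ['["a","b"]'].
const REPAIRS: Repair[] = [
  // 1. `null` sent for an optional field: omit it.
  (value, spec) => (value === null && spec.optional ? OMIT : NO_REPAIR),
  // 2. An array emitted as a JSON string: parse it.
  (value, spec) => {
    if (spec.type !== "string[]" || typeof value !== "string") return NO_REPAIR;
    if (!value.trim().startsWith("[")) return NO_REPAIR;
    try {
      const parsed: unknown = JSON.parse(value);
      return Array.isArray(parsed) ? parsed : NO_REPAIR;
    } catch {
      return NO_REPAIR;
    }
  },
  // 3. An empty `{}` placeholder where an array was expected (assumed to mean "no items").
  (value, spec) =>
    spec.type === "string[]" && typeof value === "object" && value !== null &&
    !Array.isArray(value) && Object.keys(value).length === 0
      ? []
      : NO_REPAIR,
  // 4. A bare string where an array was expected: wrap it.
  (value, spec) => (spec.type === "string[]" && typeof value === "string" ? [value] : NO_REPAIR),
];

function parseWithRepair(toolName: string, schema: Schema, input: Input): Outcome {
  const issues = validate(schema, input);
  // Fast path: valid inputs are returned untouched.
  if (issues.length === 0) return { ok: true, value: input, log: null };

  // Repair only at the paths the validator complained about.
  const candidate: Input = { ...input };
  for (const issue of issues) {
    const spec = schema[issue.path];
    if (!spec) continue;
    for (const repair of REPAIRS) {
      const result = repair(candidate[issue.path], spec);
      if (result === NO_REPAIR) continue;
      if (result === OMIT) delete candidate[issue.path];
      else candidate[issue.path] = result;
      break; // the first applicable repair wins
    }
  }

  const remaining = validate(schema, candidate);
  if (remaining.length === 0) {
    return { ok: true, value: candidate, log: `tool_input_repaired:${toolName}` };
  }
  const details = remaining.map((issue) => `${issue.path}: ${issue.message}`).join("; ");
  return {
    ok: false,
    log: `tool_input_invalid:${toolName}`,
    retryMessage: `Invalid input for ${toolName}. Fix these fields and retry: ${details}`,
  };
}
```

Because the fast path returns the original object, a `content` string that merely looks like JSON is never rewritten, which is the corruption the preprocessing approach ran into.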

English

3

44

3.5K
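
The two smaller fixes in the thread above, the auto-link unwrap from point 2/ and the relational default from point 4/, can be sketched in a few lines. This is a guess at the shape, not the shipped code. The regex, the names `unwrapAutoLinks` and `resolveReadRange`, and the wording of the note for a missing `offset` are assumptions. The `2000` default and the note for a missing `limit` come from the thread. The schema-level `pathString()` helper is not reproduced here.

```typescript
// Unwraps only the degenerate auto-link where the link text equals the URL
// without its protocol, e.g. "[notes.md](http://notes.md)" becomes "notes.md".
// The exact pattern is an assumption; the thread only says the fix is two regex lines.
const AUTO_LINK = /\[([^\]]+)\]\(https?:\/\/([^)\s]+)\)/g;

function unwrapAutoLinks(path: string): string {
  return path.replace(AUTO_LINK, (match: string, text: string, url: string) =>
    url === text ? text : match,
  );
}

// Hypothetical argument and result shapes for a read-file tool.
type ReadFileArgs = { absolutePath: string; offset?: number; limit?: number };
type ReadRange = { offset: number | undefined; limit: number | undefined; note: string | null };

const DEFAULT_LIMIT = 2000; // default named in the thread

// Fills in the missing half of the offset/limit pair and reports the choice
// in a note without an "Error:" prefix, so the model can correct it next turn.
function resolveReadRange(args: ReadFileArgs): ReadRange {
  const hasOffset = args.offset !== undefined;
  const hasLimit = args.limit !== undefined;
  if (hasOffset === hasLimit) {
    // Both or neither provided: nothing to infer.
    return { offset: args.offset, limit: args.limit, note: null };
  }
  if (hasLimit) {
    return {
      offset: 0,
      limit: args.limit,
      // Wording assumed by analogy with the limit note quoted in the thread.
      note: "Note: offset was not provided; defaulted to 0. To read a different range, retry with both offset and limit.",
    };
  }
  return {
    offset: args.offset,
    limit: DEFAULT_LIMIT,
    note: "Note: limit was not provided; defaulted to 2000 lines. To read more or fewer lines, retry with both offset and limit.",
  };
}
```

A real link such as `[click](https://x.com)` is left alone because its text differs from the URL, so only the degenerate case is unwrapped.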

Command Code retweeted

Ahmad Awais@MrAhmadAwais·1d

Got interviewed by Business Insider on how enterprises are switching to open models, with Command Code gaining over 10K paying customers in 30 days. Demand for cheaper yet intelligent open models is growing fast. We have figured out how to make open models outperform closed ones, and most of my research on this is public. Excited to see where this goes as we're compounding and growing 30% week over week! All organic inbound, no PR here.

English

3

6

54

4.5K

Command Code@CommandCodeAI·1d

@Kimi_Moonshot Kimi-K2.7-Code shipping to @CommandCodeAI $1 Go plan.

English

3

0

53

1.6K

Kimi.ai@Kimi_Moonshot·1d

🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced! 🔷 Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite. 🔷 Reasoning efficiency: Less overthinking, with 30% lower reasoning-token usage compared to K2.6. 🔷 Long-horizon coding: Improved instruction following, higher end-to-end coding task success rates. ⚡️ 6x High-Speed Mode coming soon! 🔌 Available today via Kimi API and Kimi Code. 🔗 Kimi Code: kimi.com/code 🔗 API: platform.moonshot.ai

English

575

1.6K

13K

1.8M

Command Code retweeted

Ahmad Awais@MrAhmadAwais·1d

@pseudokid At @CommandCodeAI we do this in the $1 Go plan, which has $10 in it. And absolutely free tool call repairs; tool call issues are the biggest problem with open models. shameless plug i guess, founder here, my research is open, 56K+ now, here's the eng deep dive x.com/MrAhmadAwais/s…

English

6

2

39

2.4K

Command Code@CommandCodeAI·2d

A lot is happening at Command Code right now, and it feels incredible. Y'all will love our June launches. Stay tuned.

English

13

1

75

2.6K

Command Code@CommandCodeAI·2d

The next Command Code deal drops. Worth the wait. - Monday June 15, 2026 - 10 AM PT You'll want to be online for this.

English

15

11

129

4.5K

Command Code@CommandCodeAI·3d

@AhmadBilalDev @UOSJoe no joke!!

English

0

25

Bilal@AhmadBilalDev·4d

@UOSJoe @CommandCodeAI 🫡 ds with command code is great at complex ui fixes.

English

1

0

2

43

Joe@UOSJoe·4d

I just managed to fix 2 very annoying visual bugs on my website which Codex was struggling with after numerous iterations. 1 of them was very simple; I was surprised it failed so much. 5 minutes with DeepSeek V4 Pro with @CommandCodeAI and problem solved! Awesome Harness.

English

1

0

2

64

Command Code@CommandCodeAI·3d

@adp_alpha @claudeai you bet!!

English

1

0

7

500