ali

1.3K posts


@realaliarain

building https://t.co/i6Ss5lRFB5, https://t.co/TcngU7YtMp; mvp service: https://t.co/S0PqSirgkj

Pakistan · Joined June 2018
2.6K Following · 203 Followers

ali (@realaliarain)
@CommandCodeAI Open models only matter if pricing stays sane. Command Code gets that right.
Replies 0 · Reposts 0 · Likes 1 · Views 233

Command Code (@CommandCodeAI)
Command Code is the only code agent that:
1. has a $1 Go plan with 10x free credits (best overall)
2. optimizes for top open models
3. repairs open models' tool calls for free
4. doesn't charge 400% more on open models like DeepSeek/MiMo. Almost every other coding agent does, check!
Replies 20 · Reposts 14 · Likes 218 · Views 18.7K

ali (@realaliarain)
@adonis_singh open models might win now; this is their best chance
Replies 0 · Reposts 0 · Likes 4 · Views 4.4K

adi (@adonis_singh)
openai has the chance to do the funniest thing ever
Replies 50 · Reposts 51 · Likes 4.3K · Views 489.8K

ali reposted
Command Code (@CommandCodeAI)
Kimi K2.7 Code is now available in Command Code. 10x free credits in Go. Our new #1 open model in internal benchmarks.
cmd update to v0.37.0, then select via /model
• 256K context 🍃
• 30% lower reasoning tokens than K2.6 ✅
• Open weights 1T-parameter MoE - 32B active ⚡
Replies 15 · Reposts 13 · Likes 227 · Views 17.4K

ali reposted
Ahmad Awais (@MrAhmadAwais)
Got interviewed by Business Insider on how enterprises are switching to open models, with Command Code gaining over 10K paying customers in 30 days. Demand for cheaper yet intelligent open models is growing fast. We have figured out how to make open models outperform closed ones, and most of my research on this is public. Excited to see where this goes as we're compounding and growing 30% week over week! All organic inbound, no PR here.
Replies 3 · Reposts 6 · Likes 56 · Views 4.8K

ali (@realaliarain)
@MrAhmadAwais Fable is making stuff out of nowhere. Just random things at random points. Weird
Replies 1 · Reposts 0 · Likes 0 · Views 673

Ahmad Awais (@MrAhmadAwais)
Using Claude Fable for two days now, over 20hrs of engineering work. It has not impressed me a single time. What the heck is it good at? So much noise in the market atm. Open models are phenomenal. The most you should do is plan from a better model and build with an open model.
Replies 45 · Reposts 7 · Likes 185 · Views 14.8K

Daniel (@DanielWhit21874)
Welcome Slot-text: a text roll animation made to be satisfying to use. Use the lab page to tweak it and create the animation YOU want. textmotion.dev Inspired by @raphaelsalaja's work on the landing page for his beautiful tool Calligraph.
Replies 6 · Reposts 6 · Likes 185 · Views 9.4K

ali reposted
Command Code (@CommandCodeAI)
The next Command Code deal drops. Worth the wait.
- Monday June 15, 2026
- 10 AM PT
You'll want to be online for this.
Replies 16 · Reposts 11 · Likes 130 · Views 4.6K

Command Code (@CommandCodeAI)
Claude Fable 5 now available on Command Code. The Mythos class, state-of-the-art model shows exceptional performance in software engineering tasks, agentic workflows, and scientific research. Try it out now with /model
Replies 14 · Reposts 4 · Likes 106 · Views 4.9K

ali reposted
Command Code (@CommandCodeAI)
New in v0.33.2
Browse the web now from your terminal.
Web Search → ranked results
Web Fetch → reads full pages, not a snippet
Training data has a cutoff. The web doesn't.
Replies 22 · Reposts 8 · Likes 149 · Views 5.8K

ali reposted
Ahmad Awais (@MrAhmadAwais)
i don't use 90% of the vs code features anymore. i'm changing!!

as someone who's been writing code for 27 years now, this is the biggest change i'm experiencing. software engineering is changing more deeply than i had realized. as we build a GUI app for Command Code, i'm forced to take a hard look at all of this. things i'm discovering i don't need anymore.

i remember pivoting from Electrical Eng to Computer Science after graduating, everything i knew about my life changed. this feels like that. everything i knew about software eng, all my tricks and hacks were built for a world where i'd write a lot of code manually. it's all changing now.

i spent 1,000+ hours building the vscode .pro course, with 65 videos and 100+ extensions in it, and taught 31,715 developers how to become editor power users. but i don't use 90% or more of that now.

the extensions are fine. the job they were installed for is just gone. i hardly need snippets or weird shortcut shenanigans. debugging is more hardcore now than stepping over/into code: Command Code writes scripts and wires itself in there, reads logs, and verifies everything for me.

every one of them existed to speed up human-typed code. emmet expanded your html. linters caught your typos. gitlens prettied up your blame. snippets, keybindings, macros, all of it sanded down the keystroke. then agents deleted the keystrokes. and more i guess.

i run Command Code now (3rd largest coding agent in the world) and my whole setup is a terminal, an agent, and taste.

the scarce skill moved up the stack: from "how fast can you edit a buffer" to "how precisely can you specify intent and verify the result." context engineering > keybinding engineering. reading code > writing code. how fast i can review code and understand what's changing is the bottleneck now. not my snippets or multi cursor hacks.

the part i can't stop thinking about: the agent interface is still primitive. chatting with a coding agent in a terminal feels like using an 80s computer console. text in, text out, scroll forever. the gui for this new computer hasn't been invented yet. like anything, i wanna take a shot at this. i'm going to translate my entire shipsheet workflow into our upcoming gui. i think some of its properties can already be predicted:

1. it'll be visual. vision is the 10-lane highway into the brain. diffs, dependency graphs, test matrices want to be seen. reading them line by line out of a scrollback buffer is brutal. watching an agent work should feel like watching a build pipeline. right now it feels like tailing a log file.

2. it'll be generated on demand. a refactor wants a different surface than a debugging session, which wants a different surface than a security review. the ui should rebuild itself around the task at hand instead of shoving everything through one chat box.

3. the autonomy and human elements. how much do i want to steer right now: approve every tool call, or only review the final diff. the gui's job is making delegation legible. what did the agent do, why, and where should i look.

the cli is the right first move. it's raw and native. it won't be beat. the agent should live on your machine, or in a sandbox, with your env, your repo, your private context, at least in this slow-takeoff world of jagged capabilities. but the terminal is where this paradigm starts. it ends somewhere much weirder.

there's something very funny about the guy who built his career teaching editor mastery now building the thing that retires most of it. no regrets. those 1,000 hours taught me exactly what developers touch a hundred times a day, which turns out to be exactly the knowledge you need to design what comes after it.

i've never been more excited to build.
Replies 29 · Reposts 10 · Likes 114 · Views 8.6K

ali reposted
Ahmad Awais (@MrAhmadAwais)
NVIDIA Nemotron 3 Ultra now in Command Code! it's now my favorite model replacing all other flash models. and boy it's so fast!! the strongest US open model released today!
• 1M context
• 5x faster inference
Our $1 Go plan gets you ~$23 usage on Nemotron. this is awesome!!
Quoted post from Command Code (@CommandCodeAI):

NVIDIA Nemotron 3 Ultra now available in Command Code! Strongest US open model yet! 🍀
• 1M context
• 5x faster inference
• 550B MoE frontier-intelligence open model
DEAL 2.3x usage 🎟️ $1 Go plan gets you ~$23 usage on Nemotron
Woah, it's fast x taste compliance is great!

Replies 7 · Reposts 3 · Likes 70 · Views 4K

ali reposted
Command Code (@CommandCodeAI)
DeepSeek works best in Command Code. two ways to prove it:
- $1 Go plan with $10 → $40 for DeepSeek V4 pro
- read the harness engineering deep dive below on how we fix and repair 50K+ tool calls, saving you cost and improving the speed & quality of outputs.
Quoted post from Ahmad Awais (@MrAhmadAwais):

how did we make deepseek outperform opus 4.7?

i've been thinking about why "open model bad at tool calling" is almost always a harness problem, not a model problem.

context: spent two days looking at billions of tokens in @CommandCodeAI (tb open source ai cli) using deepseek. i ended up writing a tool-input repair layer. the trigger was watching deepseek-flash fail on the simplest /review run, every shellCommand and readFile call bouncing back with a raw zod issues blob, the model unable to recover because the error wasn't in a form it could read. by the end deepseek v4 pro was beating opus 4.7 6/10 times on our internal evals.

a few things i learned that feel general:

1/ the failure modes aren't random, they're a small finite compositional set. across deepseek-flash, deepseek v4 pro, glm, qwen, the same four mistakes repeat almost exactly:
- sending `null` for an optional field instead of omitting it
- emitting `["a","b"]` as a json *string* instead of an actual array
- wrapping a single arg in `{}` where the schema expected an array (an "empty placeholder")
- passing a bare string where an array was expected (`"foo"` instead of `["foo"]`)

four repairs, ~30-100 lines each, ordered carefully (json-array-parse must run before bare-string-wrap or `'["a","b"]'` becomes `['["a","b"]']`). that is the whole catalogue. when i hear "this open source model can't do tool calls" i now assume one of those four, and so far that's been right ~90% of the time.

2/ the funniest failure mode is also the most revealing. deepseek-flash, when asked to edit or write a file, sometimes emits the path as a *markdown auto-link*:

filePath: "/Users/x/proj/[notes.md](http://notes. md)"

our writeFile tool obediently tried creating files literally named `[notes.md](http://notes .md)` until we caught it. this is not a hallucination.
it's the post-training chat distribution leaking through the tool boundary: the model has been rewarded for auto-linking in conversational output, and is applying that prior in a context where it makes no sense. the fix is two regex lines that unwrap only the degenerate case where link text equals url-without-protocol; real markdown like `[click](https://x .com)` passes through untouched. this is also conditioning from the tools they used during RL, which were different from the tools we write and which we ofc can't predict.

"tool confusion" is a more useful frame than "capability gap." the model knows how to format a path. it just hasn't been told clearly enough that this path is going to fopen, not into a chat bubble. so we encode that hint at the schema level (`pathString()` instead of `z.string()`) and the leak is plugged for every path field at once.

3/ the design choice that mattered was inverting preprocess-then-validate to validate-then-repair.

my first attempt was the obvious one: a preprocessing pass that normalized inputs (strip nulls, parse stringified arrays, etc.) before zod ever saw them. it broke immediately: writeFile content that *happened* to be json-shaped got rewritten before it hit disk. silent corruption, easy to miss in a smoke test.

then i made it less greedy:
- parse the input as-is. if it succeeds, ship it. valid inputs are never touched.
- on failure, walk the validator's own issue list. for each issue path, try the four repairs in order until one applies.
- parse again. on success, log `tool_input_repaired:${toolName}`. on failure, log `tool_input_invalid:${toolName}` and return a model-readable retry message.

the structural insight here is: when you preprocess, you encode a prior about what's broken. when you let the validator complain first, the schema is the prior, and you only spend repair budget at the exact paths the schema actually disagreed at. the validator is doing the work of localizing the bug for you.
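
A minimal sketch of the validate-then-repair loop and the four ordered repairs described above, in TypeScript. It is not Command Code's implementation: the hand-rolled `validate` function stands in for zod so the example has no dependencies, and every name here (`FieldSpec`, `REPAIRS`, `repairToolInput`, the retry message wording) is hypothetical. Only the two log strings come from the post. Repair 3 reads the "empty placeholder" case as `{}` meaning an empty array, which is an assumption.

```typescript
// Hypothetical, dependency-free stand-in for a zod-style schema and validator.
type FieldSpec = { type: "string" | "number" | "string[]"; optional?: boolean };
type Schema = Record<string, FieldSpec>;
type ToolInput = Record<string, unknown>;
type Issue = { path: string; message: string };

function validate(schema: Schema, input: ToolInput): Issue[] {
  const issues: Issue[] = [];
  for (const [path, spec] of Object.entries(schema)) {
    if (!(path in input)) {
      if (!spec.optional) issues.push({ path, message: "required" });
      continue;
    }
    const value = input[path];
    const ok =
      spec.type === "string[]"
        ? Array.isArray(value) && value.every((v) => typeof v === "string")
        : typeof value === spec.type;
    if (!ok) issues.push({ path, message: `expected ${spec.type}` });
  }
  return issues;
}

type RepairResult = { kind: "set"; value: unknown } | { kind: "omit" } | null;
type Repair = (spec: FieldSpec, value: unknown) => RepairResult;

// The four repairs, in order. json-array-parse must run before bare-string-wrap.
const REPAIRS: Repair[] = [
  // 1. null sent for an optional field: omit the field.
  (spec, value) => (value === null && spec.optional ? { kind: "omit" } : null),
  // 2. array emitted as a JSON string: parse it.
  (spec, value) => {
    if (spec.type !== "string[]" || typeof value !== "string") return null;
    try {
      const parsed: unknown = JSON.parse(value);
      return Array.isArray(parsed) ? { kind: "set", value: parsed } : null;
    } catch {
      return null;
    }
  },
  // 3. `{}` placeholder where an array was expected (assumed to mean "empty array").
  (spec, value) =>
    spec.type === "string[]" && typeof value === "object" && value !== null &&
    !Array.isArray(value) && Object.keys(value).length === 0
      ? { kind: "set", value: [] }
      : null,
  // 4. bare string where an array was expected: wrap it.
  (spec, value) =>
    spec.type === "string[]" && typeof value === "string" ? { kind: "set", value: [value] } : null,
];

type Outcome =
  | { ok: true; input: ToolInput; log: string | null }
  | { ok: false; log: string; retryMessage: string };

function repairToolInput(toolName: string, schema: Schema, input: ToolInput): Outcome {
  const issues = validate(schema, input);
  if (issues.length === 0) return { ok: true, input, log: null }; // valid inputs are never touched

  const patched: ToolInput = { ...input };
  for (const { path } of issues) {
    const spec = schema[path];
    if (!spec || !(path in patched)) continue; // a missing required key has nothing to repair
    for (const repair of REPAIRS) {
      const result = repair(spec, patched[path]);
      if (result === null) continue;
      if (result.kind === "omit") delete patched[path];
      else patched[path] = result.value;
      break; // the first applicable repair wins
    }
  }

  const remaining = validate(schema, patched);
  if (remaining.length === 0) {
    return { ok: true, input: patched, log: `tool_input_repaired:${toolName}` };
  }
  const details = remaining.map((i) => `${i.path}: ${i.message}`).join("; ");
  return {
    ok: false,
    log: `tool_input_invalid:${toolName}`,
    retryMessage: `Invalid input for ${toolName} (${details}). Please retry with corrected arguments.`,
  };
}

const schema: Schema = { paths: { type: "string[]" }, cwd: { type: "string", optional: true } };
console.log(repairToolInput("shellCommand", schema, { paths: '["a","b"]', cwd: null }));
```

Because repairs run only at paths the validator flagged, a string field whose content merely looks like JSON is never rewritten, which is the corruption the preprocessing approach caused.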
it's the same shape as cheap-then-careful everywhere else: try the fast path, fall back on evidence.

(this also gives you per-tool telemetry for free. you can watch repair rates per (model, tool) and notice when a model regresses on a specific contract before users do.)

4/ shape invariants and relational invariants need different fixes.

the four repairs above all handle shape problems: wrong type, missing key, wrong container. but read_file had a *relational* invariant: "if you provide offset, you must also provide limit, and vice versa." deepseek kept calling `readFile({ absolutePath, limit: 30 })` and getting an `ERROR:` back. you can't fix this with input repair, because each field is independently valid; the bug is in the relationship between them.

so i taught the function the model's intent instead. `limit` alone → `offset = 0`. `offset` alone → `limit = 2000` (matches common read tool ops default). then surfaced the decision back to the model in the result:

"Note: limit was not provided; defaulted to 2000 lines. To read more or fewer lines, retry with both offset and limit."

no `Error:` prefix, so the tui doesn't paint it red. the model sees what we picked and can self-correct on the next turn if our guess was wrong. transparency over silent magic wins big.

repair where you can. extend semantics where you can't. surface the choice either way.

zoom out: a lot of what looks like model capability is actually contract design. a strict schema is a choice with a cost: it filters out noise, but it also filters out recoverable noise from any model that hasn't memorized the exact json contract you happened to pick. the largest commercial models eat that cost invisibly and are lenient on tool calling because they've seen enough of every contract during pretraining; open models pay it loudly and get dismissed for it. the harness is where you mediate between distributions.
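
The relational default from point 4 could look roughly like this. It is a sketch under stated assumptions, not the actual read tool: `ReadFileArgs`, `resolveReadRange`, and `readLines` are hypothetical names, the note for a missing `offset` is worded by analogy with the quoted one, and the case where neither field is given (offset 0, default limit, no note) is a guess the post does not cover.

```typescript
interface ReadFileArgs {
  absolutePath: string;
  offset?: number;
  limit?: number;
}

interface ReadRange {
  offset: number;
  limit: number;
  note: string | null; // surfaced to the model; deliberately has no "Error:" prefix
}

const DEFAULT_LIMIT = 2000; // the default quoted in the post

function resolveReadRange(args: ReadFileArgs): ReadRange {
  const { offset, limit } = args;
  if (offset !== undefined && limit !== undefined) {
    return { offset, limit, note: null };
  }
  if (limit !== undefined) {
    // `limit` alone: start from the top of the file.
    return {
      offset: 0,
      limit,
      note: "Note: offset was not provided; defaulted to 0. To read a different range, retry with both offset and limit.",
    };
  }
  if (offset !== undefined) {
    // `offset` alone: fall back to the default window size.
    return {
      offset,
      limit: DEFAULT_LIMIT,
      note: `Note: limit was not provided; defaulted to ${DEFAULT_LIMIT} lines. To read more or fewer lines, retry with both offset and limit.`,
    };
  }
  // Neither provided: assumed behavior, not described in the post.
  return { offset: 0, limit: DEFAULT_LIMIT, note: null };
}

// Slices already-loaded lines and appends the note so the model can see the choice.
function readLines(lines: string[], args: ReadFileArgs): string {
  const { offset, limit, note } = resolveReadRange(args);
  const body = lines.slice(offset, offset + limit).join("\n");
  return note === null ? body : `${body}\n\n${note}`;
}

console.log(readLines(["a", "b", "c"], { absolutePath: "/tmp/example.txt", limit: 2 }));
```

Returning the note as ordinary result text keeps the call successful while still telling the model which default was picked.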
four small repairs (i'm sure more to follow as we have three more merging today), two regex lines for auto-links, one relational default, one prefix change. the model didn't change. the contract got more forgiving in exactly the places it needed to be. deepseek v4 pro now beats opus 4.7 6/10 times on our internal evals. imo "skill issue" applies to the harness more often than the model.
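
For the auto-link case from point 2, here is one way the unwrap could be written. The post only says the fix is two regex lines that unwrap links whose text equals the url without its protocol; the pattern below, the trailing-slash tolerance, and the name `unwrapAutoLinks` are assumptions for illustration. It is a plain function, not the `pathString()` schema helper the post mentions.

```typescript
// Matches [text](http(s)://rest) and captures the link text and the url without protocol.
const AUTO_LINK = /\[([^\]\s]+)\]\(https?:\/\/([^)\s]+)\)/g;

function unwrapAutoLinks(path: string): string {
  return path.replace(AUTO_LINK, (match: string, text: string, urlBody: string) =>
    // Unwrap only the degenerate case; a real link such as [click](https://x.com) is kept.
    text === urlBody.replace(/\/$/, "") ? text : match,
  );
}

console.log(unwrapAutoLinks("/Users/x/proj/[notes.md](http://notes.md)"));
console.log(unwrapAutoLinks("see [click](https://x.com)"));
```

Comparing the link text against the url body is what keeps genuine markdown links intact, since their visible text differs from the target.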

Replies 7 · Reposts 7 · Likes 151 · Views 745.7K

ali reposted
Command Code (@CommandCodeAI)
MiniMax M3 is live now on CommandCode! A frontier-class open-weight model with 1M context, frontier coding, agentic performance, and native multimodality Give it a try with our $1 Go plan with 10x free usage credits!
Replies 16 · Reposts 15 · Likes 165 · Views 36.9K

ali reposted
Command Code (@CommandCodeAI)
MiniMax M3 is now 50% off for the next week! If you haven’t tried it yet, our Go plan starts at just $1 and includes $10 in usage credits.
Replies 5 · Reposts 6 · Likes 82 · Views 13.6K

ali reposted
Ahmad Awais (@MrAhmadAwais)
BIG day for us!!

@CommandCodeAI has crossed $1M in annual run rate, 1 trillion tokens of usage, with over 9K customers, just 24 days after our public beta launch. we believe this makes it the fastest-growing coding agent harness for open models. 3rd largest by usage.

Command Code is built around two ideas:
1. open models should be production-grade for coding.
2. your coding agent should learn your taste.

we're building for taste and developer experience. so instead of making a soup of thousands of models, we build for the best ones, open or closed. the goal: a coding agent that feels like an iphone, opinionated and with taste, not a random android or a windows phone with no taste.

on the first idea: open models. we fixed the "open models aren't good enough at tool calling" problem. our research came down to two things, quality and speed, and both trace back to one root cause: broken tool-calls that open models produce, especially when you use a bad harness. open-model tool-call failures are not deep, they are a small finite set of contract mismatches. so we repair them, with zero token loss. what started as 4 repairs is now the largest repair layer in the space: 36k tool-call fix variants.

i wrote the idea up openly¹ a few weeks ago, and it has quietly become a de facto way people fix open models. developers have either adopted Command Code or used the same idea to build repair harnesses for nearly every top coding agent. i take that as more meaningful validation than anything we could say about ourselves.

on the second idea: taste. Command Code builds your coding taste into skills, learned from your accepts, rejects, edits, prompts, and the corrections you repeat. over time it drifts away from generic code and toward how you actually ship code. it learns continuously, and while it is early, the direction feels right.

net effect: developers using Command are writing production-quality code on open models, 10x to 100x cheaper, without fighting tool calls, while building repo and team-wide coding taste that compounds. i believe these numbers are a consequence of getting those two things right.

what's next. we've applied the same repair idea to ai design slop, and bundled a /design capability² so every developer can level up their design work. the early response has been great.

we have a big roadmap ahead of us. the feedback we hear most is that Command Code feels fundamentally different: an approach built on taste and repair. we're going open source next month. today we're a cli at the core, and we're also launching a full-fledged gui app, sandboxed background agents, and cooking up something fun i can't wait to share.

we're growing too, hiring in sf and remote worldwide. check open roles on my profile bio.

try it now.
npm i -g command-code

if you like engineering deep dives on how we're doing all this, i've linked some relevant posts below.
Replies 63 · Reposts 21 · Likes 247 · Views 793.3K

ali reposted
SpaceX (@SpaceX)
Liftoff of Starship!
Replies 1.1K · Reposts 6.4K · Likes 32.2K · Views 1.9M