Numman Ali

9.2K posts

Numman Ali banner
Numman Ali

Numman Ali

@nummanali

Agentic Engineering | OpenSkills Creator | OSS

Glasgow, Scotland Katılım Şubat 2023
156 Takip Edilen11.7K Takipçiler
Numman Ali
Numman Ali@nummanali·
The numbers for Rapid MLX are insane Probably the best way to run models on Mac OS
Geek Lite@QingQ77

在 Apple Silicon Mac 上本地运行 LLM 推理服务,提供比 Ollama 和 llama.cpp 更快的 OpenAI 兼容 API,同时原生支持工具调用和提示缓存。 github.com/raullenchai/Ra… Rapid-MLX 用 Apple 自家的 MLX 框架做推理,搭了个 FastAPI 服务跑 OpenAI 兼容 API。在 Apple Silicon 上比 Ollama 快 2-4 倍,靠 KV 缓存裁剪和 DeltaNet 状态快照把续轮 TTFT 压到 0.08 秒左右。工具调用这块做了 17 种解析器,Qwen、DeepSeek、Gemma、GLM 这些模型自动识别格式,量化把输出搞坏的情况也能自动修回来。 另外还有推理链分离、云端路由、视觉/音频多模态、V 缓存压缩等功能。Cursor、Claude Code、Aider、LangChain 都能直接对接。

English
2
1
7
1.9K
Numman Ali
Numman Ali@nummanali·
@Kraggich Looks beautiful! I am heavy on cmux but will give this a try
English
1
0
1
40
Kraggi
Kraggi@Kraggich·
Nyx is live. An infinite-canvas IDE where Claude Code, Codex, Gemini, Hermes Agent and Droid all live on tiles. Arrange them however you think. Watch them all work at the same time. No context switching.
English
45
32
462
73.6K
Numman Ali
Numman Ali@nummanali·
@ryoppippi Rust is great but not sure if it can be the main language
English
0
0
1
31
Numman Ali
Numman Ali@nummanali·
Which one in the Age of AI
English
2
0
1
539
Rhys
Rhys@RhysSullivan·
i'll put feedback into a doc but my immediate impression on it is probably a non starter since i can't bring in context like my git repo what i really want is the Linear MCP to ship with skills (you can do this via exposing a load skill tool) and then being able to use s/claude code/codex with it locally with the normal agent i use + default skills provided by linear to get my ideas -> tickets
Rhys tweet media
English
2
0
4
468
Rhys
Rhys@RhysSullivan·
current dream agent workflow: - set of skills that is able to take a project, feature, bug fix etc and break it down into linear tickets - helps make MVPs of concepts and then implements them for real - an agent orchestrator like symphony, shows what tasks are actively running, clicking on a ticket doesn't show you the agent chat but it does let you run the dev server, tests, and see the diff - backlink the local orchestrator url from the linear ticket and github pull request - leverages graphite for stacked diffs where it makes sense, do this via child tickets in plannng - a local pull request reviewer like diffity, where you either leave a comment to refine the original ticket, or, leave a comment to setup a skill to prevent a pattern from occurring again - enforce anti patterns w/ warden from sentry the idea behind this is you can stop baby sitting the agent runs along with this, it's harness / agent agnostic, you don't ever look at the running agent you're just looking at the output of it also, you're trying to leverage well built tools already for your planning, issue management, agent running, you're just maintaining the symphony layer end goal you look at the input ticket -> output result, if you're not happy with the output, you use the local review flow and iterate
English
19
3
206
15K
Numman Ali
Numman Ali@nummanali·
The Pro $100 plan is helpful as a boost Upgrades my wife's and using that to continue Codexing
Numman Ali tweet media
English
1
0
23
3K
Tom Moor
Tom Moor@tommoor·
@just_be_dev @linear We have that behind a flag, looking into why it's not enabled 😅 One problem is that GitHub doesn't allow mentioning apps which sucks, so you just have to type the at-mention out in plaintext and know that it will work
English
6
0
23
15K
Justin Bennett
Justin Bennett@just_be_dev·
@linear I'd really, really love it if I could chat with the linear agent in a github comment, similar to the claude github integration. Just being able to CC it to make a follow up issue would be huge.
English
1
0
7
1.6K
Numman Ali retweetledi
fks
fks@FredKSchott·
Introducing Flue — The First Agent Harness Framework Flue is a TypeScript framework for building the next generation of agents, designed around a built-in agent harness. Flue is like Claude Code, but 100% headless and programmable. There's no baked in assumption like requiring a human operator to function. No TUI. No GUI. Just TypeScript. But using Flue feels like using Claude Code. The agents you build act autonomously to solve problems and complete tasks. They require very little code to run. Most of the "logic" lives in Markdown: skills and context and AGENTS.md. Flue is like Astro or Next.js for agents (not surprising, given my background 🙃). It's not another AI SDK. It's a proper runtime-agnostic framework. Write once, build, and deploy your agents anywhere (Node.js, Cloudflare, GitHub Actions, GitLab CI/CD, etc). We originally built Flue to power AI workflows inside of the Astro GitHub repo. But then @_bgiori got his hands on it, and we realized that every agent needs a framework like Flue, not just us. Check it out! It's early, but I'm curious to hear what people think. Are agents ready for their library -> framework moment?
fks tweet media
English
172
329
3.6K
681.1K
Numman Ali
Numman Ali@nummanali·
@HiTw93 @jxnlco can you fast track his access to OSS Codex Pro This man is a legend
English
1
0
1
521
Tw93
Tw93@HiTw93·
@nummanali Thanks for the recommendation bro, I will give it a try.
English
2
0
5
2.9K
Tw93
Tw93@HiTw93·
有点儿想包两个 claude max 250 账号了,原来 20 倍也不够我用,不过这个过程给我带来的收益远大于 20 倍。
中文
20
0
75
49.3K
Numman Ali
Numman Ali@nummanali·
@fcoury Isn’t Plan mode able to cover this as well? As in, shouldn’t plan be kept alive across turns and compactions?
English
1
0
1
718
Felipe Coury 🦀
/goal also lands in Codex CLI 0.128.0. Our take on the Ralph loop: keep a goal alive across turns. Don't stop until it's achieved. Built by my co-worker and OpenAI mentor Eric Traut, aka the Pyright guy. One of the GOATs I get to work with daily.
English
167
237
3.5K
839.6K
Numman Ali
Numman Ali@nummanali·
Came across Claires article through @GergelyOrosz And, I must say, there is lessons I have learned as well For one, the monetization of X is a double edged and rewarding but equally a source of false dopamine. You should ignore any signals from it and let it be a tip rather than a source of income Second, no more quote tweeting unless you genuinely have something valuable to add - otherwise let the author be credited with a simple retweet Lastly - everything else is in Claires article, read it and keep X a healthy growing space for tech
claire vo 🖤@clairevo

x.com/i/article/2035…

English
2
0
6
803
signüll
signüll@signulll·
one of the most refreshing things on the planet is talking to someone who just *gets it*. like you don’t need a preamble, & you don’t need to articulate the shape of the thought before you can share it cuz they just meet you where you already are. as if they skimmed your mind & married to the culture before you say a single word. these people are rare, & conversations with them are incredible because you skip the surface layer entirely & land in the depth almost immediately. they’re the best ppl to riff with, ideate with, & think forward with.. the bandwidth is wide & already open. this is true for any type of relationship.
English
123
346
4.4K
174.7K
Numman Ali
Numman Ali@nummanali·
@cwaldow Really? You find the codex plan is not sufficient?
English
1
0
0
458
Christian Waldow
Christian Waldow@cwaldow·
@nummanali Quality is superb but pricing got out of control. I hope competition will keep up.
English
1
0
0
498
Numman Ali
Numman Ali@nummanali·
GPT 5.5 is beyond my expectations 9h 00m 49s of coherent work on a ML library Every night this week I have let it build with constraints through: - AGENTS .md - CONTINUITY .md - MEMORY .md - PLAN .md - .agents/skills Through the combination of these, it has been steering its own work, deciding what to pick up from a plan, how to approach it as a tranche, and dynamically updating its inbuilt task list. This morning I woke up to complete MLX support in TypeScript for Flux2, Z Turbo, and Qwen image generation. Mind blowing. This model is the biggest release we have had since The Great Awakening in December 2025. It has changed the way I approach a problem and how I construct the prompting guidelines and the infra around a project. It comes down to some simple principles that enable long-running, coherent agents: - clear instructions on outcomes, but freedom on solutions - dynamic memory workspace and task management - a verifiable repository with all gates in place - ability to create skills and self-improve on them through use - powerful sub-agent access to validate hypothesis in the absence of humans And the one last key thing here is the Codex CLI or the app, if you prefer. The magic that they have done with their compaction strategies is phenomenal. It is remarkable how the agent is able to stay coherent after multiple, if not hundreds, of compactions, especially when given the additional space to jot down its findings and updated approach. My recommendation is to ask Codex to optimise your repo for agentic engineering and apply its principles so that it is able to work for a long period of time. Hopefully, when I have more time, I will write a longer article on this, but I am very happy to answer any questions anyone might have and provide guidance to the best of my ability.
Numman Ali tweet media
English
20
36
536
30.7K
Kit Langton
Kit Langton@kitlangton·
@tolani_doye like all my actions, it was guided by the sure & scabrous hand of Quetzalcoatl.
English
3
0
60
3.1K
fks
fks@FredKSchott·
@thdxr If this added “check out PR for local review” that would pretty fundamentally change OSS for me. review locally, work with my agent, push changes back instead of comments, merge. I don’t want this to own full e2e review flow, but if it was just the glue that would be 💯
English
2
0
9
2.1K