Mahmoud El Hadidy

41.1K posts


@mikemadmod

3D Artist and Gfx with my big Bro, Freelance. “If wars can be started by lies, peace can be started by truth” Assange "Divided By label, united by noticing" MM

Egypt · Joined October 2014
3.9K Following · 364 Followers
Mahmoud El Hadidy@mikemadmod·
Thanks to the Unsloth team 🦥 for dropping another banger update: 20-30% faster inference and full AMD Linux chat support 😍😍 more tokens, more speed, more Unsloth Studio 👑💚
Unsloth AI@UnslothAI

Inference in Unsloth Studio is now ~20% faster. You can also use older pre-downloaded GGUFs from Hugging Face etc. AMD chat support for Linux now works. Data Recipes now works on macOS, AMD, CPU setups. GitHub: github.com/unslothai/unsl… Changelog: unsloth.ai/docs/new/chang…

Mahmoud El Hadidy reposted
Matt Pocock
Matt Pocock@mattpocockuk·
Feel like in the AI age, the optimal size for a team of devs on a single decent-sized project is around 3. 1 is untenable. You can't just pause development during their holidays. 2 is OK, but still a lot of bus factor to contend with. 3 is nice and comfortable. Each day the team manages the queue of tickets for the AFK agent, discusses feature requests, architecture, reviews code, improves feedback loops, shares knowledge. Probably some devs contribute to multiple teams.
Mahmoud El Hadidy reposted
Lou
Lou@louszbd·
finally glm-5.1

at the very beginning we were teaching models how to write code, basically training a system that could imitate developers. back then AI lived inside the IDE as an intelligent assistant, but we were still the main driver. that was the copilot era of AI coding.

then it started to become something more collaborative. we could express a vague intention (a prompt), and the model would translate that intention into structured software. in a way, that was the first time we taught machines to understand vibe.

earlier this year, we entered the agentic engineering era. we stopped programming line by line. models began to form plans, maintain them, and operate inside a feedback loop. the model takes responsibility for planning.

and now we are approaching a moment where AI can operate on the same time horizon as engineers. this is why we built glm-5.1. we want to unlock a new long-horizon paradigm, where it starts to tackle the kinds of problems that unfold over weeks: debugging, integration. an agent that remembers context over long stretches, stays aligned with the objective, and keeps correcting itself along the way.
Z.ai@Zai_org

GLM-5.1 is available to ALL GLM Coding Plan users! z.ai/subscribe

Mahmoud El Hadidy reposted
Andrej Karpathy
Andrej Karpathy@karpathy·
When I built menugen ~1 year ago, I observed that the hardest part by far was not the code itself, it was the plethora of services you have to assemble like IKEA furniture to make it real, the DevOps: services, payments, auth, database, security, domain names, etc...

I am really looking forward to a day where I could simply tell my agent: "build menugen" (referencing the post) and it would just work. The whole thing, up to the deployed web page. The agent would have to browse a number of services, read the docs, get all the api keys, make everything work, debug it in dev, and deploy to prod. This is the actually hard part, not the code itself.

Or rather, the better way to think about it is that the entire DevOps lifecycle has to become code, in addition to the necessary sensors/actuators of the CLIs/APIs with agent-native ergonomics. And there should be no need to visit web pages, click buttons, or anything like that for the human.

It's easy to state, it's now just barely technically possible and expected to work maybe, but it definitely requires from-scratch re-design, work and thought. Very exciting direction!
Patrick Collison@patrickc

When @karpathy built MenuGen (karpathy.bearblog.dev/vibe-coding-me…), he said: "Vibe coding menugen was exhilarating and fun escapade as a local demo, but a bit of a painful slog as a deployed, real app. Building a modern app is a bit like assembling IKEA furniture. There are all these services, docs, API keys, configurations, dev/prod deployments, team and security features, rate limits, pricing tiers."

We've all run into this issue when building with agents: you have to scurry off to establish accounts, clicking things in the browser as though it's the antediluvian days of 2023, in order to unblock its superintelligent progress.

So we decided to build Stripe Projects to help agents instantly provision services from the CLI. For example, simply run:

$ stripe projects add posthog/analytics

And it'll create a PostHog account, get an API key, and (as needed) set up billing.

Projects is launching today as a developer preview. You can register for access (we'll make it available to everyone soon) at projects.dev. We're also rolling out support for many new providers over the coming weeks. (Get in touch if you'd like to make your service available.)

Mahmoud El Hadidy reposted
Unsloth AI
Unsloth AI@UnslothAI·
We shipped 50+ updates to Unsloth Studio in just one week! 🚀
- Unsloth Studio now installs in just 2 mins
- 10x faster via pre-compiled llama.cpp binaries
- New Desktop app icon shortcuts
- 50% less disk space
- Update via `unsloth studio update`
- Upload multiple files to Data Recipes
- Context length now adjustable
- Inference token and context observability
- Windows, CPU, GPU now work great
- Tool calling improved with parsing, no raw tool markup in chat, faster inference, a new Tool Outputs panel, timers

Full Changelog: unsloth.ai/docs/new/chang…
GitHub: github.com/unslothai/unsl…
Unsloth AI@UnslothAI

Introducing Unsloth Studio ✨ A new open-source web UI to train and run LLMs.
• Run models locally on Mac, Windows, Linux
• Train 500+ models 2x faster with 70% less VRAM
• Supports GGUF, vision, audio, embedding models
• Auto-create datasets from PDF, CSV, DOCX
• Self-healing tool calling and code execution
• Compare models side by side + export to GGUF

GitHub: github.com/unslothai/unsl…
Blog and Guide: unsloth.ai/docs/new/studio

Available now on Hugging Face, NVIDIA, Docker and Colab.

Mohamed Elmorsy | اَلْمُرْسِي
I know this is coming late 😂 but I'd like to share with you all the Ramadan designs on Behance. Open the link and you'll see all the designs. If you liked them, don't forget to leave a like 😂😂❤️ behance.net/gallery/246183…
Mahmoud El Hadidy reposted
Bing Xu
Bing Xu@bingxu_·
This may be one of the first real signs of superhuman intelligence in software. On some of the most optimized attention workloads, agents can now outperform almost all human GPU experts by searching continuously for 7 days with no human intervention inside the optimization loop.

Terry and I started agentic coding efforts at NVIDIA 1.5 years ago. Neither of us knew GPU programming, so from day one we pushed toward fully automated, human-out-of-the-loop systems. We call it blind coding. Over those 1.5 years, the two of us generated 4 generations across 2 agent systems. Since the 2nd generation, the stacks have been self-evolving. Each agent is now around 100k non-empty LOC.

When we released the blind-coding framework VibeTensor in January, the implication was easy to miss. AVO makes the signal clearer. My bet is: blind coding is the future of software engineering. Human cognition is the bottleneck.
Mahmoud El Hadidy reposted
Armin Ronacher ⇌
Armin Ronacher ⇌@mitsuhiko·
This is the talk I gave in San Francisco at PyAI on how you can figure out what present and future models are good at when it comes to writing your own agents.
Pydantic@pydantic

"How do agentic coding tools work? I have no idea. And you probably also don't." That was @mitsuhiko opening his talk at PyAI. What follows is one of the most useful 20 minutes on working with agents we've ever seen. youtu.be/8RHYyRUxVrA

Mahmoud El Hadidy reposted
Matt Pocock
Matt Pocock@mattpocockuk·
Every time an LLM says anything to me, I automatically assume it's BS unless it's read a source confirming it.

NONE of the non-devs I talk to have this instinct.
Mahmoud El Hadidy reposted
Daniel Han
Daniel Han@danielhanchen·
New Unsloth Studio update!
1. 10x faster via pre-compiled llama.cpp + mamba binaries
2. 6x faster installs and 50% less disk space via bun, uv
3. Studio is now in PATH + `unsloth studio update` works
4. Lots of UI/UX improvements

And my fav: Desktop + launch shortcuts for Studio!
Unsloth AI@UnslothAI

You don’t need to manually set LLM parameters anymore! llama.cpp now uses only the context length and compute your local setup needs, and Unsloth auto-applies the correct model settings. Try it in Unsloth Studio, now with precompiled llama.cpp binaries. GitHub: github.com/unslothai/unsl…

Mahmoud El Hadidy reposted
Matt Pocock
Matt Pocock@mattpocockuk·
We can't get rid of the calls, but we can get rid of the coding:
1. Jump on a call with your dev colleague/domain expert; this creates a transcript
2. Generate notes from the transcript
3. Pass the notes to a coding agent, which creates tickets
4. Pass the tickets to the AFK agent, which creates code
5. Repeat with a new call
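This loop can be sketched as a chain of stage functions. A minimal sketch only: every stage here is a pure-Python stub standing in for what would really be an LLM or agent call, and the `Ticket` type, function names, and sample transcript are all hypothetical.

```python
# Hypothetical sketch of the call -> transcript -> notes -> tickets -> code loop.
# Each stage is a stub; in practice each would be an LLM or agent invocation.
from dataclasses import dataclass


@dataclass
class Ticket:
    title: str
    body: str


def notes_from_transcript(transcript: str) -> str:
    # Stand-in for "generate notes from the transcript" (an LLM call in practice).
    return "\n".join(line for line in transcript.splitlines() if line.strip())


def tickets_from_notes(notes: str) -> list[Ticket]:
    # Stand-in for the coding agent that turns notes into tickets.
    return [Ticket(title=line[:40], body=line) for line in notes.splitlines()]


def queue_for_afk_agent(tickets: list[Ticket]) -> list[str]:
    # Stand-in for handing the tickets to the AFK agent's work queue.
    return [t.title for t in tickets]


transcript = "Add CSV export to the report page\nRate-limit the webhook endpoint"
queued = queue_for_afk_agent(tickets_from_notes(notes_from_transcript(transcript)))
print(queued)
```

The point of the shape is that each stage produces a reviewable artifact (notes, tickets) a human can inspect before the next stage runs.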
Mahmoud El Hadidy reposted
Jen Zhu
Jen Zhu@jenzhuscott·
Cuteness is an underrated, under-discussed, and highly valuable attribute in the consumer home-robotics market. The image, movements, and “personalities” strongly influence human reactions. A cute, friendly, non-threatening robot that can make kids relaxed/happy will sell better than dystopian, creepy-looking sci-fi humanoids.
Humanoids daily@humanoidsdaily

🚨 BREAKING: Amazon has officially acquired New York-based Fauna Robotics, the startup behind the soft-bodied Sprout humanoid. While Amazon has aggressively pursued automation in its fulfillment centers, this acquisition signals a massive escalation in the race for the consumer home market. Sprout is a bipedal robot designed specifically for the "messy reality" of shared human spaces.

Mahmoud El Hadidy reposted
Docker
Docker@Docker·
Instead of one AI doing everything, what if you had a team of agents? One plans. One builds. One tests. This post from Docker Captains @mfranz_on & @esteban_x64 shows how Docker Agent + Sandboxes make it possible while keeping everything isolated from your machine. Read → bit.ly/4lRrHK2
Mahmoud El Hadidy reposted
Matt Pocock
Matt Pocock@mattpocockuk·
Good tip for avoiding cognitive debt in codebases where AI has run wild: Design the interface, delegate the implementation
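One way to read "design the interface, delegate the implementation": the human writes and reviews a small, stable contract, and the implementation behind it is the part that gets delegated (to an agent or a colleague) and swapped freely. A minimal sketch with hypothetical names; `ReceiptStore` is not a real library type.

```python
# Illustrative sketch (hypothetical names): the human designs the interface,
# and the implementation behind it can be delegated and replaced without
# touching any callers, which only depend on the Protocol.
from typing import Protocol


class ReceiptStore(Protocol):
    def save(self, user_id: str, amount_cents: int) -> str: ...
    def total_for(self, user_id: str) -> int: ...


class InMemoryReceiptStore:
    """One possible implementation; callers see only ReceiptStore."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, int]] = []

    def save(self, user_id: str, amount_cents: int) -> str:
        self._rows.append((user_id, amount_cents))
        return f"receipt-{len(self._rows)}"

    def total_for(self, user_id: str) -> int:
        return sum(a for u, a in self._rows if u == user_id)


store: ReceiptStore = InMemoryReceiptStore()
store.save("alice", 1200)
store.save("alice", 300)
print(store.total_for("alice"))  # 1500
```

Reviewing the `Protocol` is cheap for a human; the generated body behind it can be regenerated at will, which is what keeps the cognitive debt out of the interface layer.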
Mahmoud El Hadidy reposted
Chayenne Zhao
Chayenne Zhao@GenAI_is_real·
the pattern ethan describes - dump everything into one powerful model and get great results - is quietly killing the multi-agent paradigm. i do the same with cross-validation workflows, and most of the "agentic" techniques from 2024 are now pure overhead. debating, voting, self-verification - all were patches for model inadequacy.

this has a massive implication for inference economics: the optimal serving strategy is shifting from "many cheap calls orchestrated by a framework" to "one expensive call with rich context". completely different infrastructure requirements.

the agent framework builders won't like hearing this but the models are eating the middleware @emollick
Ethan Mollick@emollick

GPT-5.4 Pro continues to be the only model of its class. For anything really hard & complex, I throw it into the maw with every bit of context I can think of. More often than not, something very useful comes out. I can't get the same results from Codex or Code or anything else.

Mahmoud El Hadidy reposted
Jen Zhu
Jen Zhu@jenzhuscott·
Worst idea ever & half-baked.
1. Collective punishment of the genuine cross-border, international voices: international travellers, expats, nomads.
2. Rewards geography rather than quality of content.
3. X’s greatest value is that it shrinks the world to a borderless village for the free exchange of ideas & knowledge. Its success didn’t come from well walled-up geographic silos.
4. Those who are gaming for income can still find ways to game the system.
5. How do you even define one’s “home country”?? The place you were born? Your IP at the moment you post? Passport? Language?
6. Most importantly, many top accounts are the most outstanding outliers of their region/country, posting globally relevant, high-quality content. Now you want them to post about the corner bread shop?

If it’s implemented, it’d be one of the dumbest moves in social media history.
Nikita Bier@nikitabier

Starting Thursday, we'll be updating our revenue sharing incentives to better reward the content we want on X: We will be giving more weight to impressions from your home region—to encourage content that resonates with people in your country, in neighboring countries and people who speak your language. While we appreciate everyone's opinion on American politics, we hope this will disincentivize gaming the attention of US or Japanese accounts and instead, drive diverse conversations on the platform. We invite creators to start building an audience locally. X will be a much richer community when there's relevant posts for people in all parts of the world.

Mahmoud El Hadidy reposted
Jen Zhu
Jen Zhu@jenzhuscott·
Agree w/ Terence Tao - LLMs’ limitations are structural. I’ve always said the usefulness of current AI correlates with the user’s expertise, so the illusion of creativity can impress/fool non-experts.

Current LLMs excel at Keplerian work (empirically testing many combinations via brute compute scaling) but not Newtonian unification or genuine leaps. They act as a “super-assistant” for literature search, candidate generation, formalization, and exposition - freeing us for the creative core - but there is no evidence yet of autonomous originality at the frontier.

Solving a Millennium Prize problem de novo w/ a genuinely novel technique (not latent in the corpus) would constitute such evidence; it has not occurred.
Valerio Capraro@ValerioCapraro

Terence Tao put it plainly: there is no evidence that LLMs exhibit genuine creativity.

Yes, they have solved some Erdős problems. But these are low-hanging fruit, questions that attracted little attention and that yield once the right existing techniques are applied. That is not creativity. That is search plus recombination.

Yes, LLM outputs can look impressive. But look at who is impressed: typically non-experts. Experts know very well that LLM performance gets terrible when you approach the frontier of human knowledge.

And this is not a temporary gap. It reflects a structural limitation. We do not fully understand human creativity. But we do know a key property: conceptual leaps, the ability to generate new representations, not just recombine existing ones. LLMs do not do this. They interpolate in representation space. They operate within existing conceptual frameworks; they do not create new ones. This is why we haven’t “yet seen them take the next step”.

Mahmoud El Hadidy reposted
Chayenne Zhao
Chayenne Zhao@GenAI_is_real·
Today I read a lengthy piece on Harness Engineering: tens of thousands of words, almost certainly AI-written. My first reaction wasn't "wow, what a powerful concept." It was "do these people have any ideas beyond coining new terms for old ones?"

I've always been annoyed by this pattern in the AI world: the constant reinvention of existing concepts. From prompt engineering to context engineering, now to harness engineering. Every few months someone coins a new term, writes a 10,000-word essay, sprinkles in a few big-company case studies, and the whole community starts buzzing. But if you actually look at the content, it's the same thing every time: design the environment your model runs in, what information it receives, what tools it can use, how errors get intercepted, how memory is managed across sessions. This has existed since the day ChatGPT launched. It doesn't become a new discipline just because someone, for whatever reason, decided to give it a new name.

That said, complaints aside, the research and case studies cited in the article do have value, especially since they overlap heavily with what I've been building with how-to-sglang. So let me use this as an opportunity to talk about the mistakes I've actually made.

Some background first. The most common requests in the SGLang community are How-to Questions: how to deploy DeepSeek-V3 on 8 GPUs, what to do when the gateway can't reach the worker address, whether the gap between GLM-5 INT4 and official FP8 is significant. These questions span an extremely wide technical surface, and as the community grows faster and faster, we increasingly can't keep up with replies. So I started building a multi-agent system to answer them automatically.

The first idea was, of course, the most naive one: build a single omniscient Agent, stuff all of SGLang's docs, code, and cookbooks into it, and let it answer everything. That didn't work.
You don't need harness engineering theory to explain why: the context window isn't RAM. The more you stuff into it, the more the model's attention scatters and the worse the answers get. An Agent trying to simultaneously understand quantization, PD disaggregation, diffusion serving, and hardware compatibility ends up understanding none of them deeply.

The design we eventually landed on is a multi-layered sub-domain expert architecture. SGLang's documentation already has natural functional boundaries (advanced features, platforms, supported models), with cookbooks organized by model. We turned each sub-domain into an independent expert agent, with an Expert Debating Manager responsible for receiving questions, decomposing them into sub-questions, consulting the Expert Routing Table to activate the right agents, solving in parallel, then synthesizing answers.

Looking back, this design maps almost perfectly onto the patterns the harness engineering community advocates. But when I was building it, I had no idea these patterns had names. And I didn't need to.

1. Progressive disclosure: we didn't dump all documentation into any single agent. Each domain expert loads only its own domain knowledge, and the Manager decides who to activate based on the question type. My gut feeling is that this design yielded far more improvement than swapping in a stronger model ever did. You don't need to know this is called "progressive disclosure" to make this decision. You just need to have tried the "stuff everything in" approach once and watched it fail.

2. Repository as source of truth: the entire workflow lives in the how-to-sglang repo. All expert agents draw their knowledge from markdown files inside the repo, with no dependency on external documents or verbal agreements. Early on, we had the urge to write one massive sglang-maintain.md covering everything. We quickly learned that doesn't work.
OpenAI's Codex team made the same mistake: they tried a single oversized AGENTS.md and watched it rot in predictable ways. You don't need to have read their blog to step on this landmine yourself. It's the classic software engineering problem of "monolithic docs always go stale," except in an agent context the consequences are worse: stale documentation doesn't just go unread, it actively misleads the agent.

3. Structured routing: the Expert Routing Table explicitly maps question types to agents. A question about GLM-5 INT4 activates both the Cookbook Domain Expert and the Quantization Domain Expert simultaneously. The Manager doesn't guess; it follows a structured index. The harness engineering crowd calls this "mechanized constraints." I call it normal engineering.

I'm not saying the ideas behind harness engineering are bad. The cited research is solid, the ACI concept from SWE-agent is genuinely worth knowing, and Anthropic's dual-agent architecture (initializer agent + coding agent) is valuable reference material for anyone doing long-horizon tasks. What I find tiresome is the constant coining of new terms: packaging established engineering common sense as a new discipline, then manufacturing anxiety around "you're behind if you don't know this word." Prompt engineering, context engineering, harness engineering: they're different facets of the same thing. Next month someone will probably coin scaffold engineering or orchestration engineering, write another lengthy essay citing the same SWE-agent paper, and the community will start another cycle of amplification.

What I actually learned from how-to-sglang can be stated without any new vocabulary:
- Information fed to agents should be minimal and precise, not maximal.
- Complex systems should be split into specialized sub-modules, not built as omniscient agents.
- All knowledge must live in the repo; verbal agreements don't exist.
- Routing and constraints must be structural, not left to the agent's judgment.
Feedback loops should be as tight as possible. We currently use a logging system to record the full reasoning chain of every query, and we've started using Codex for LLM-as-a-judge verification, but we're still far from ideal.

None of this is new. In traditional software engineering, these are called separation of concerns, the single responsibility principle, docs-as-code, and shift-left constraints. We're just applying them to LLM work environments now, and some people feel that warrants a new name.

I don't know how many more new terms this field will produce. But I do know that, at least today, we've never achieved a qualitative leap on how-to-sglang by swapping in a stronger model. What actually drove breakthroughs was always improvements at the environment level: more precise knowledge partitioning, better routing logic, tighter feedback loops. Whether you call it harness engineering, context engineering, or nothing at all, it's just good engineering practice. Nothing more, nothing less.

There is one question I genuinely haven't figured out: if model capabilities keep scaling exponentially, will there come a day when models are strong enough to build their own environments? I had this exact confusion when observing OpenClaw: it went from 400K lines to a million in a single month, driven entirely by AI itself. Who built that project's environment? A human, or the AI? And if it was the AI, how many of the design principles we're discussing today will be completely irrelevant in two years?

I don't know. But at least today, across every instance of real practice I can observe, this is still human work, and the most valuable kind.
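The structured-routing idea described above can be sketched in a few lines. This is a toy illustration only, assuming simple keyword-based dispatch; the expert names, keywords, and table entries are hypothetical and are not SGLang's or how-to-sglang's actual implementation.

```python
# Toy sketch of an "Expert Routing Table" (all names and entries hypothetical):
# routing is structural, driven by an explicit table the manager consults,
# rather than left to an agent's judgment.
ROUTING_TABLE = {
    "quantization": ["int4", "fp8", "gguf", "awq"],
    "cookbook": ["glm", "deepseek", "qwen"],
    "deployment": ["gateway", "worker", "gpus", "kubernetes"],
}


def route(question: str) -> list[str]:
    """Return the expert agents to activate for a question."""
    q = question.lower()
    experts = [name for name, keywords in ROUTING_TABLE.items()
               if any(k in q for k in keywords)]
    return experts or ["general"]  # fall back explicitly rather than guessing


# A GLM INT4 question activates both the cookbook and quantization experts.
print(route("Is the gap between GLM-5 INT4 and official FP8 significant?"))
```

The table makes the manager's behavior inspectable and testable: changing who handles what is a data edit, not a prompt tweak.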