Tarik Hammadou

2.8K posts

@thammadou

2x startup founder, started career @Motorola Labs. Today, Director of Developer Relations @Nvidia. Driven by passion & guided by experience. Opinions my own.

Palo Alto, CA · Joined January 2010
412 Following · 2.2K Followers
Tarik Hammadou @thammadou ·
Honored to participate in the annual innovation event hosted by @Prologis in the beautiful town of Charleston. What stood out most was the energy in the room — leaders, operators, technologists, and innovators deeply focused on the future of industrial infrastructure, logistics, energy, and supply chain transformation. The conversations were not just about warehouses or facilities; they were about building the intelligent foundation powering the modern economy.

One of the most powerful realizations from the event is this: real estate is no longer just physical space. When connected with AI, data, automation, robotics, and energy intelligence, it becomes a strategic platform that maps value across the entire supply chain ecosystem — from manufacturing and fulfillment to transportation, inventory flow, and ultimately customer experience. The convergence of industrial real estate, digital infrastructure, and AI-driven operations will define the next era of global competitiveness. The opportunity ahead is massive.

A big thank you to the entire Prologis team for the outstanding hospitality, the world-class organization, and for bringing together such an exceptional group of leaders and innovators. Truly inspiring.

@ToriDeems @wbodonnell @WalterKemmsies @richardteachout @avihou @FreightAlley

#innovation #supplychain #AI #automation #physicalai #prologis #nvidia
Tarik Hammadou @thammadou ·
Most people watch platform shifts from the outside. I’m watching this one get built.

Working in Developer Relations at NVIDIA means you don’t just hear about the technology — you sit inside the decisions about how it reaches the world. And what I keep coming back to isn’t the scale. It’s the architecture of the conviction.

Long before AI was loud, NVIDIA was making bets that looked strange:
→ GPUs for general compute, not just graphics
→ CUDA as a developer platform, not just a toolkit
→ Simulation and physical AI as a foundation, not a feature

None of those were obvious. All of them compounded.

What we’re building now — AI factories, accelerated inference, agentic systems that close the loop between decision and execution — isn’t a product cycle. It’s the payoff on decades of disciplined platform thinking. The developers building on top of NIM today are standing on 20 years of that infrastructure.

I’ll be honest about something: working inside a shift this large forces a question you can’t ignore. Am I operating at the level this moment requires? Not in terms of output — in terms of depth, curiosity, and willingness to stay uncomfortable.

But here’s what I’ve come to believe: no individual carries a platform shift. These moments are collective — built by people who push the boundary consistently, quietly, over time. The privilege isn’t being the one who built it. It’s being in the room where it’s being built.

Grateful. Grounded. Back to work.

#AI #NVIDIA #AcceleratedComputing #DeveloperRelations #AIFactory #Leadership
Tarik Hammadou @thammadou ·
The next great equalizer won't be free education. It'll be free compute. The countries, companies, and communities that figure this out first will define the next era. We democratized information. Now we need to democratize thinking. #AI #GenAI #NVIDIA #FutureOfWork #AIForAll
Tarik Hammadou @thammadou ·
A PhD student with a $100 API budget can now run experiments that used to require a team of 10 and six months. A founder in Lagos or Lahore with the same token budget has the same reasoning power as a Stanford lab. But only if they can afford the tokens.
Tarik Hammadou @thammadou ·
School taught you what to think. College taught you how to think. AI gives you the power to think at scale. The scarce resource isn't information anymore — it's the tokens to reason over it. 🧵👇
Tarik Hammadou @thammadou ·
The 3 layers of agentic AI governance:

1. Model layer — alignment, safety training
2. Tool layer — MCP server monitoring, HITL gates ← this is where the gap is
3. Deployment layer — NIM guardrails, live evaluation

Most orgs are stuck at layer 1. The Oxford paper (arxiv.org/abs/2603.23802) shows why layer 2 matters most now.

#AgenticAI #AISafety #MCP #NVIDIADev #AIGovernance
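The layer-2 gap can be made concrete with a small sketch. This is a minimal, hypothetical human-in-the-loop gate at the tool layer; none of these names (`ToolCall`, `hitl_gate`) come from a real MCP SDK, and the sketch only illustrates the idea of auto-allowing read-only tools while holding action tools for approval:

```python
# Minimal sketch of a tool-layer HITL gate (governance layer 2).
# All names here are illustrative, not a real MCP SDK API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ToolCall:
    name: str
    kind: str   # "perception", "reasoning", or "action"
    args: dict

def hitl_gate(call: ToolCall, approve: Callable[[ToolCall], bool]) -> str:
    """Auto-allow read-only tools; route action tools through a human gate."""
    if call.kind != "action":
        return "allowed"
    return "allowed" if approve(call) else "blocked"

# Usage: an agent's file-delete request is held for human review and denied.
call = ToolCall(name="delete_file", kind="action", args={"path": "/tmp/x"})
print(hitl_gate(call, approve=lambda c: False))  # → blocked
```

The point of the pattern is that the gate lives outside the model: the same model output gets different outcomes depending on policy at the tool layer.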
Tarik Hammadou @thammadou ·
Great work by @merlinstein_ and @AISecurityInst — first empirical ground truth for the agentic AI ecosystem. The MCP telemetry approach to monitoring agent deployment is exactly what the field needs.
Tarik Hammadou @thammadou ·
Fascinating read — "How Are AI Agents Used? Evidence from 177,000 MCP Tools" by Merlin Stein, Oxford/UK AI Security Institute. First large-scale empirical analysis of the agentic AI ecosystem using real deployment telemetry from 177,436 MCP server tools tracked Nov 2024 → Feb 2026.

The taxonomy is clean:
• Perception tools — read/access data
• Reasoning tools — analyze data
• Action tools — modify external environments (file edits, emails, API calls, device control)

Key data points:
• Software development dominates: 67% of all agent tools, 90% of MCP server downloads. Coding agents are the killer app of the agentic era.
• Action tools grew from 27% → 65% of total usage in 16 months. Agents aren't reading anymore — they're writing to file systems, triggering financial transactions, and controlling physical devices.
• The paper uses O*NET task mapping to score consequentiality. Most action tools today are medium-stakes (file editing, code commits), but higher-stakes tools for financial transactions and system administration are growing fast.
• The policy contribution: governments should monitor the tool layer (MCP servers), not just model outputs. The risk surface has shifted from what the model says to what the agent does.

Great work grounding the agentic AI conversation in actual deployment data instead of benchmarks. This is the kind of empirical foundation the field needs as we move from prototype to production.

arxiv.org/abs/2603.23802

#AgenticAI #MCP #LLM #AIAgents #AISafety #NVIDIADev #GTC2026 #MultiAgentSystems #DevTools #AIGovernance
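As a toy illustration of the kind of telemetry analysis behind numbers like the 65% action-tool share, here is a tiny usage-share tally. The log below is fabricated purely so the arithmetic lands on 65%; it is not the paper's data or code:

```python
# Toy tally in the spirit of the paper's taxonomy: given a log of tool
# invocations labeled by category, compute each category's usage share.
from collections import Counter

def category_shares(invocations: list[str]) -> dict[str, float]:
    counts = Counter(invocations)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}

# Fabricated 20-call log chosen so action tools make up 65% of usage.
log = ["action"] * 13 + ["perception"] * 5 + ["reasoning"] * 2
shares = category_shares(log)
print(f"action share: {shares['action']:.0%}")  # → action share: 65%
```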
Tarik Hammadou @thammadou ·
The NeMo framework puts guardrails and human-in-the-loop approval gates directly at the tool layer. When action tools go from 27% → 65%, the safety architecture has to move there too.
Tarik Hammadou @thammadou ·
This is relevant to work we are doing on multi-agent implementation in supply chain, and of interest to many of the ISVs I am working with.

New paper from NVIDIA Research: "High Accuracy Agentic Post-Training at Low Compute Cost" — introducing PivotRL.

If you're training LLMs for multi-turn agentic tasks (software engineering, tool use, web browsing), you've hit this wall:
→ SFT is cheap but breaks on out-of-domain tasks
→ End-to-end RL generalizes but costs 4-5x more compute

PivotRL eliminates the tradeoff. Here's how.

The core insight: not all turns matter equally. In a 20-turn agent trajectory, most steps are routine. Only a few are pivots — critical decision points where the model's action choices lead to wildly different outcomes. PivotRL identifies these high-variance turns and concentrates training compute there. Instead of rolling out entire trajectories for every gradient update, it does local, targeted rollouts at pivot points only.

Two mechanisms make it work:

1. Pivot Filtering — profile all turns from SFT data and keep only the ones where sampled actions show mixed success/failure outcomes. If the model always gets it right (or always wrong), there's no gradient signal. Focus on the hard decisions.

2. Functional Rewards — stop demanding the model produce the exact same command as the training data. Multiple shell commands can achieve the same result. PivotRL uses domain-specific verifiers that accept any functionally equivalent action.

The results speak for themselves:
• +14.1 pts in-domain accuracy (vs +9.9 for SFT)
• Near-zero OOD degradation (+0.21) vs SFT's -9.83 regression
• 4x fewer rollout turns to match E2E RL accuracy
• 5.5x faster wall-clock training time
• Already proven on Nemotron-3-Super-120B

Why this matters for enterprise AI: the "functional rewards" concept is a game-changer for domain-specific agents. In warehouse management, supply chain, or any operations setting, there are multiple valid actions for any given scenario. Scoring by outcome equivalence rather than exact match is how real experts evaluate decisions. And the pivot filtering? It's essentially the same insight behind efficient human learning: you don't practice what you've already mastered. You drill the edge cases.

This is the bridge between academic RL research and production deployment.

📄 Paper: arxiv.org/abs/2603.21383
👥 Yi, Mosk-Aoyama, Huang, Gala, Wang et al. (NVIDIA)

#AI #NVIDIA #ReinforcementLearning #AgenticAI #LLM #NeMo #MachineLearning
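To make the pivot-filtering idea concrete, here is a minimal sketch (my own illustration, not the paper's code): a turn counts as a pivot only when sampled rollouts disagree on success, because all-pass or all-fail turns carry no gradient signal.

```python
# Illustrative sketch of pivot filtering as described in the post:
# keep only turns whose sampled rollouts show mixed success/failure.
def is_pivot(rollout_outcomes: list[bool]) -> bool:
    """A turn is a pivot if sampled actions disagree on success."""
    return any(rollout_outcomes) and not all(rollout_outcomes)

def filter_pivots(turns: dict[int, list[bool]]) -> list[int]:
    """Return the turn indices worth spending RL rollouts on."""
    return [t for t, outcomes in turns.items() if is_pivot(outcomes)]

turns = {
    0: [True, True, True],     # routine: always succeeds, no signal
    1: [True, False, True],    # pivot: outcome depends on the action
    2: [False, False, False],  # hopeless: always fails, no signal
}
print(filter_pivots(turns))  # → [1]
```

The functional-rewards mechanism would then score each sampled action at a pivot by outcome equivalence (did the verifier accept the end state?) rather than by exact string match against the reference command.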
Tarik Hammadou @thammadou ·
Just back from an inspiring NVIDIA GTC 2026 week in San Jose — tons of great things happening around AI factories, agentic systems, and turning data into real-time decisions.

Now chilling in Napa, diving into The Decision Factory: A Novel about Decisions Under Uncertainty by Adam DeJans Jr. and John Brandon Elam (foreword by Warren B. Powell). Love the business-novel style — like The Goal, it uses a gripping story (a Fulcrum Logistics team facing uncertainty, cascading failures, and adaptive fixes) instead of dry equations. You watch characters wrestle with the gap between forecasts and reality, building “decision factories” that actually work when plans go sideways. It makes sequential decision analytics and handling uncertainty feel practical and intuitive — no heavy math upfront, just relatable struggles and breakthroughs.

Post-GTC, it hits perfectly: the future of ops isn’t just bigger models — it’s scalable systems that thrive under uncertainty. Highly recommend for anyone in AI deployment, operations, or data-driven leadership. Quick, engaging read that bridges theory and real-world execution.

Who’s read it? Or got other fable-style books that made tough concepts click?

#DecisionIntelligence #AIDecisions #GTC2026 #OperationsResearch #UncertaintyManagement #TheDecisionFactory
Tarik Hammadou @thammadou ·
I’ve been running OpenClaw 🦞 as my always-on personal AI assistant for a week. It writes its own skills. It spawns subagents. It keeps running after I close my laptop.

And honestly? It made me a little nervous. Not because it wasn’t productive — it was extremely productive. But because the safety model was entirely trust-based: guardrails living inside the same process they were supposed to be guarding.

NVIDIA just solved that problem with OpenShell. Here’s what clicked for me reading this blog: developer.nvidia.com/blog/run-auton…

The fundamental insight isn’t about agents being more capable. It’s about the runtime layer finally catching up to the agents. OpenShell sits outside the agent — between the agent and your infrastructure. Out-of-process policy enforcement: the agent literally cannot override it, even if compromised. It’s the browser tab model applied to autonomous agents — isolated sessions, permissions verified before any action executes.

Three components make this real:

1. The Sandbox — not generic container isolation. Built specifically for self-evolving agents that write their own code mid-task. Agents can break the environment without touching the host.

2. The Policy Engine — evaluates every action at the binary, destination, method, and path level. If an agent hits a constraint, it reasons about the roadblock and proposes a policy update. You approve. It evolves within boundaries you define.

3. The Privacy Router — keeps sensitive context on-device with local models, routes to frontier models only when policy allows. Decisions made by your policy, not the agent’s judgment.

And to deploy it with OpenClaw?

openshell sandbox create --remote spark --from openclaw

Zero code changes.

An interesting note about my colleagues, the team behind this — acquired from Gretel in 2025, they built their careers at NSA, AWS Macie (petabyte-scale data protection), and enterprise synthetic data infrastructure. This isn’t academic. It’s production-grade security thinking applied to the agent layer.

The infrastructure decisions made in the next 6–12 months will define what enterprise agent deployment looks like for years. We’re not in the “AI assistant” era anymore. We’re in the agent runtime era. And it’s just getting started.

#AgenticAI #NVIDIA #OpenShell #OpenClaw #NemoClaw #AIAgents #DeveloperTools #GTC2026 #EnterpriseAI #AIInfrastructure
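As a rough sketch of what out-of-process, default-deny policy evaluation at the binary/method/path level could look like: to be clear, this is not the OpenShell API or its rule schema. The `RULES` table and `evaluate` function are hypothetical, purely to illustrate the idea of matching every action against explicit rules and denying anything unmatched.

```python
# Hypothetical sketch of default-deny policy evaluation for agent actions.
# Rules match on binary, method, and a path glob; nothing here is the
# actual OpenShell rule format.
from fnmatch import fnmatch

RULES = [
    {"binary": "git",  "method": "exec", "path": "/workspace/*", "allow": True},
    {"binary": "curl", "method": "exec", "path": "*",            "allow": False},
]

def evaluate(binary: str, method: str, path: str) -> bool:
    """Return True only if an explicit rule allows the action."""
    for rule in RULES:
        if (rule["binary"] == binary and rule["method"] == method
                and fnmatch(path, rule["path"])):
            return rule["allow"]
    return False  # default-deny: unknown actions never run

print(evaluate("git", "exec", "/workspace/repo"))  # → True
print(evaluate("rm", "exec", "/"))                 # → False (no rule matches)
```

Because this check runs in a separate process from the agent, a compromised agent can propose actions but cannot rewrite the rules that judge them.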
Tarik Hammadou @thammadou ·
What if an LLM could design your entire image generation pipeline — and optimize it with reinforcement learning? That’s exactly what FlowRL (ComfyGen-RL) does. A new paper from NVIDIA Research and Technion introduces the first RL-based framework for automatically generating ComfyUI workflows that are tailored to your prompt and aligned with human preferences.

Here’s the problem it solves: text-to-image has moved far beyond single models. The best results now come from complex multi-component pipelines — fine-tuned generators, LoRAs, upscalers, editing steps — all stitched together in tools like ComfyUI. But designing these workflows requires deep expertise and hours of manual experimentation. Previous approaches like ComfyGen tried to automate this using LLMs, but they had a key limitation: they essentially memorized and replicated existing workflows from training data. 94% of their outputs already existed in the training set.

FlowRL takes a fundamentally different approach:

Stage 1 — Supervised fine-tuning on 500K prompt-flow pairs teaches the LLM the structure and vocabulary of workflows. A custom tokenization scheme reduces flow representations from ~1,500 tokens to ~200, enabling a 16x batch size increase.

Stage 2 — GRPO-based reinforcement learning drives the model toward higher-quality workflow regions. Instead of generating actual images during training (which takes ~1 minute per step), they train an ensemble of surrogate reward models on ModernBERT that predict image quality directly from prompt-workflow pairs. To prevent reward hacking, they use ensemble variance as an uncertainty signal — any prediction the models disagree on gets zero reward.

The result? FlowRL generates genuinely novel workflows (0% overlap with training data vs. ComfyGen’s 94%), achieves a 60% win rate over ComfyGen on human preference metrics, and matches state-of-the-art prompt adherence on GenEval. They also introduce a clever inference trick inspired by classifier-free guidance: extrapolating between the SFT and GRPO-tuned model logits at generation time to further boost quality.

This feels like an important direction — not just for image generation, but for the broader idea of using RL to optimize compound AI systems. As pipelines get more complex, we need automated ways to navigate the design space. FlowRL shows that’s not just possible — it outperforms human-curated workflows.

Paper: arxiv.org/pdf/2505.21478

#AI #GenerativeAI #TextToImage #ReinforcementLearning #GRPO #ComfyUI #NVIDIA #MachineLearning #DiffusionModels #Research
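The uncertainty-gating trick is easy to sketch: score with the ensemble mean, but zero the reward whenever ensemble variance exceeds a threshold. This is my own illustration of the idea, not the paper's code, and the threshold value is made up:

```python
# Sketch of an uncertainty-gated surrogate reward: an ensemble scores a
# prompt-workflow pair, and high disagreement (variance) zeroes the reward
# so the policy cannot exploit unreliable predictions (reward hacking).
from statistics import mean, pvariance

def gated_reward(ensemble_scores: list[float], max_var: float = 0.01) -> float:
    if pvariance(ensemble_scores) > max_var:
        return 0.0  # models disagree: treat the prediction as unreliable
    return mean(ensemble_scores)

print(gated_reward([0.82, 0.80, 0.81]))  # agreement → mean, ~0.81
print(gated_reward([0.90, 0.20, 0.60]))  # disagreement → 0.0
```

The design choice worth noting: disagreement is treated as a signal in itself, so the RL policy is only rewarded in regions of workflow space where the surrogate models are confident.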
Tarik Hammadou @thammadou ·
I published early this week a deep technical walkthrough and open-source reference repo for building a Multi-Agent Warehouse AI Command Layer, powered by NVIDIA NeMo, NIMs, and MCP.

This system moves beyond single-LLM assistants by combining LangGraph orchestration, Model Context Protocol (MCP) for dynamic tool discovery, and NVIDIA NIMs for production-grade inference — turning live warehouse data into actionable, low-latency decisions.

What developers will find inside:
• Multi-agent architecture (planner/router + domain agents) orchestrated with LangGraph
• MCP-based context & tool injection, enabling safe, modular integration with WMS, ERP, IoT, and document systems
• NeMo-powered pipelines for document understanding, embeddings, and guardrails
• NIM-based inference services for scalable, optimized deployment of LLMs and AI services
• Hybrid RAG + GPU-accelerated forecasting grounded in real operational telemetry
• Production patterns: JWT/RBAC, service isolation, observability

This is a concrete blueprint for developers building agentic AI systems in real industrial environments — designed to be forked, extended, and deployed.

Technical blog: developer.nvidia.com/blog/multi-age…
Repo: github.com/NVIDIA-AI-Blue…

#NeMo #NIM #ModelContextProtocol #MCP #AgenticAI #MultiAgentSystems #WarehouseAI #AIDevelopers #NVIDIAAI #supplychain
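For a flavor of the planner/router pattern, here is a deliberately tiny, dependency-free sketch. A real implementation would use LangGraph and an LLM-based planner; the `route` function and the agent names below are hypothetical, chosen only to show how a router dispatches a query to a domain agent:

```python
# Toy planner/router: pick a domain agent for an incoming query.
# Keyword matching stands in for the LLM-based planner a real system uses.
def route(query: str) -> str:
    """Return the name of the domain agent that should handle the query."""
    table = {
        "inventory": "inventory_agent",    # stock levels, WMS lookups
        "forecast":  "forecasting_agent",  # demand / volume prediction
        "document":  "docs_agent",         # manifests, compliance docs
    }
    q = query.lower()
    for keyword, agent in table.items():
        if keyword in q:
            return agent
    return "general_agent"  # fallback when no domain matches

print(route("Forecast next week's outbound volume"))  # → forecasting_agent
```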
Tarik Hammadou @thammadou ·
@NVIDIA is now the driving force behind open AI model innovation, powering the ecosystem with Nemotron, Cosmos, Gr00t, BioNeMo, and Canary.

Nemotron → agent intelligence
BioNeMo → biopharma discovery
Cosmos → physical reasoning
Gr00t → robotics learning
Canary → speech & voice

AI’s next wave is running on NVIDIA.

#OpenSource #nvidia #agenticAI #AI #Robotics
Tarik Hammadou @thammadou ·
AI just learned to prove theorems in BOTH mathematics AND quantum physics!

Ax-Prover is a multi-agent system that combines LLMs with formal verification (Lean) to generate rigorous mathematical proofs. The breakthrough: it outperforms existing methods on research-level problems while working autonomously OR with human experts. Already being used for cryptography research. This is AI-assisted science at its finest.

📄 Read the paper: arxiv.org/abs/2510.12787

#AI #MachineLearning #Mathematics #QuantumPhysics #MCP #AgenticAI

Summary:

Paper: Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics

What it does: Ax-Prover is a multi-agent system that automates theorem proving in the Lean formal verification system. It combines Large Language Models (for reasoning and knowledge) with Lean tools (for formal correctness) via the Model Context Protocol (MCP).

Key innovations:
• Multi-agent architecture that iteratively edits proofs, inspects goals, and diagnoses errors
• Can operate autonomously or collaborate with human researchers
• Introduces two new benchmarks in abstract algebra and quantum theory

Results:
• Competitive with state-of-the-art on public math benchmarks (PutnamBench, NuminaMath)
• Significantly outperforms existing systems on the new research-level benchmarks
• Successfully used in a cryptography case study

Why it matters: this represents a major step toward AI systems that can assist with rigorous scientific discovery in fields requiring formal verification. It’s not just solving textbook problems — it’s tackling real research challenges with mathematically verifiable proofs.
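For readers new to Lean, this is the flavor of machine-checked statement such a system produces. A toy Lean 4 theorem, trivially discharged by a core library lemma here, whereas Ax-Prover generates proofs of research-level goals that the same kernel then verifies:

```lean
-- Toy Lean 4 example: a statement the kernel formally verifies.
-- `Nat.add_comm` is a core library lemma; the kernel accepts the
-- proof term only if it exactly establishes the stated goal.
theorem add_comm' (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

The key property: a proof that type-checks is correct by construction, which is why LLM-generated Lean proofs can be trusted even when the LLM itself cannot be.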