Deepak Dubey 🇩🇪
@thatdeepak
473 posts
Infrastructure engineer, Kubernetes on the brain 💡🌥️ Applied maths 👨‍🎓
Hamburg, Deutschland · Joined September 2019
209 Following · 146 Followers

TurboQuant 2-bit KV killer upgrade
vLLM@vllm_project
vLLM v0.20.0 is here! 752 commits from 320 contributors (123 new). 🎉 Highlights: DeepSeek V4, Hunyuan v3 preview support, CUDA 13 / PyTorch 2.11 / Transformers v5 baseline, FA4 as default MLA prefill, TurboQuant 2-bit KV (4× capacity), vLLM IR foundation. Thread 👇

My bet: in the near future, 80%⬆️ of CS research will be done by AI in collaboration with humans. However, today's research ecosystem is still built around the human, not the AI scientist.
For example, the 8-page paper PDF is a lossy compression of months of branching exploration into a linear story, optimized for a human reviewer to skim in 30 minutes. It hides two structural taxes:
📖 Storytelling Tax — failures, rejected hypotheses, and dead ends get stripped. On RE-Bench (24,008 runs, 21 frontier models), failed runs = 90.2% of total compute cost, with a 113× median failed-to-success token ratio. Every lab independently rediscovers the same dead ends.
🔧 Engineering Tax — the gap between reviewer-sufficient prose and agent-sufficient spec. Across 8,921 PaperBench requirements (23 ICML'24 papers), only 45.4% are fully specified in the PDF. The rest is tacit lab knowledge. Tolerable when readers were human. Critical now that agents read, reproduce, and extend.
We propose ARA: the Agent-Native Research Artifact — replace the narrative PDF with an agent-executable package, in 4 layers:
🧠 structured scientific logic
⚙️ executable code w/ full specs
🌳 exploration graph (every failure preserved)
📊 evidence grounding every claim
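The four layers above could be packaged as a machine-readable manifest. A hypothetical sketch of what that might look like; the layer names come from the thread, but every field and file name here is invented for illustration, not a published spec:

```python
# Hypothetical ARA package manifest. The four layer names come from the
# thread; fields, IDs, and file names are invented for illustration.
ara = {
    "scientific_logic": {            # 🧠 structured claims and hypotheses
        "claims": [{"id": "C1", "text": "method A beats baseline B on task T"}],
    },
    "code": {                        # ⚙️ executable code with full specs
        "entrypoint": "run_experiment.py",
        "environment": "environment.lock",
    },
    "exploration_graph": {           # 🌳 every branch preserved, failures included
        "nodes": [
            {"id": "E1", "status": "failed", "parent": None},
            {"id": "E2", "status": "success", "parent": "E1"},
        ],
    },
    "evidence": {                    # 📊 claim id -> raw artifacts grounding it
        "C1": ["results/eval_table.csv"],
    },
}

# One check an agent could run on such a package: no claim may be ungrounded.
ungrounded = [c["id"] for c in ara["scientific_logic"]["claims"]
              if c["id"] not in ara["evidence"]]
assert ungrounded == []
```

The point of the exploration graph is that E1 (a failure) stays in the record with a parent link, instead of being stripped by the storytelling tax.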

@ClementDelangue But for multi-user scenarios, vLLM is still ahead

MiniMax M2.7 - 36 GB
HumanEval+: 81% pass@1, 90% pass@5.
huggingface.co/OsaurusAI/Mini…
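For anyone reproducing numbers like these: pass@1 / pass@5 are usually computed with the unbiased pass@k estimator from the Codex paper, not by literally sampling k completions. A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021): expected probability
    that at least one of k completions drawn from n samples is correct,
    given that c of the n samples passed the tests."""
    if n - c < k:   # fewer failing samples than draws: a correct one is guaranteed
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(10, 5, 1))   # 0.5: half the samples pass, so pass@1 is 50%
```

With n samples per problem you average this over the benchmark; the `if` branch handles the edge case where every draw must include a passing sample.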

🎉 We just shipped a major redesign of recipes.vllm.ai.
"How do I run model X on hardware Y for task Z?" now has a clickable answer.
What's new:
- URLs mirror HuggingFace: just swap huggingface.co → recipes.vllm.ai in any model URL to jump straight to its recipe (e.g. recipes.vllm.ai/Qwen/Qwen3.6-3…)
- Interactive command builder: pick hardware, variant, strategy (tensor, tensor+expert, or data+expert; single or multi-node; or a prefill/decode disaggregated cluster), toggle features → get the exact `vllm serve` command
- Pluggable hardware: NVIDIA + AMD already integrated. One-click switch between Hopper/Blackwell and MI300X/MI355X, and the right flags and env are applied automatically
- JSON API for agents: every recipe is also published at //.json (e.g. recipes.vllm.ai/Qwen/Qwen3.6-3…), so tools and agents can consume recipes without scraping
- Contribute a new recipe end-to-end with the agent skill shipped in the repo: github.com/vllm-project/r…
🔗 recipes.vllm.ai
Enjoy! ✨
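The URL-mirroring rule is simple enough to express as a one-liner. A sketch; the `.json` suffix for the agent API is an assumption based on the "JSON API for agents" bullet, and the model path is just a placeholder:

```python
def recipe_url(hf_url: str, as_json: bool = False) -> str:
    """Map a Hugging Face model URL to its vLLM recipe page by swapping the
    host, per the pattern in the post. The .json suffix for agent consumption
    is an assumption inferred from the 'JSON API for agents' bullet."""
    url = hf_url.replace("huggingface.co", "recipes.vllm.ai", 1)
    return url + ".json" if as_json else url

# Placeholder model path, not one from the post:
print(recipe_url("https://huggingface.co/Qwen/Qwen3-8B"))
# -> https://recipes.vllm.ai/Qwen/Qwen3-8B
```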


MiniMax M2.7 is open-source!
The most interesting part of this release isn't a benchmark number. It's what MiniMax calls "self-evolution," and it's essentially Karpathy's Autoresearch applied at full scale.
Every AI agent today runs inside a harness: the scaffolding of skills, tools, memory, and workflow rules that surrounds it. Normally a human engineer builds this, and the agent operates within it. The harness stays fixed.
M2.7 treats its harness as something it can rewrite.
The agent runs a task, analyzes where things went wrong, plans changes to its own scaffold, applies them, evaluates against a benchmark, and decides whether to keep or revert. It writes self-criticism into memory so the next round starts smarter, then loops again.
MiniMax ran this for 100+ rounds internally. The model discovered optimizations on its own: it systematically searched for optimal sampling parameters, wrote workflow-specific guidelines (like checking for the same bug pattern in other files after a fix), and added loop detection to avoid getting stuck.
They also tested it on 22 ML competitions from OpenAI's MLE-Bench Lite, each run 24 hours fully autonomously. Medal rates improved with every round; the best run earned 9 gold medals.
The weights never changed. What improved was the system around the model: better skills, better memory, better workflow rules. That distinction matters because the improvement loop can run continuously without any retraining.
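The keep-or-revert loop described above can be sketched in a few lines. Everything here is invented for illustration (the function names, the toy scoring, the rule wording); it only shows the shape of the loop, not MiniMax's implementation:

```python
def self_evolve(harness, propose, evaluate, rounds=10):
    """Keep-or-revert loop: the weights never change, only the harness does."""
    best = evaluate(harness)
    memory = []                                # self-criticism carried forward
    for _ in range(rounds):
        candidate = propose(harness, memory)   # agent plans a scaffold edit
        score = evaluate(candidate)            # benchmark the edited harness
        if score > best:                       # keep only strict improvements
            harness, best = candidate, score
            memory.append(f"kept edit, score -> {score}")
        else:                                  # revert, but remember why
            memory.append(f"reverted edit, score stayed at {best}")
    return harness, best, memory

# Toy demo: the "harness" is a list of workflow rules; proposals cycle through
# rules like the ones M2.7 reportedly discovered (wording invented here).
rules = ["re-check bug pattern in other files", "loop detection", "tuned sampling"]
propose = lambda h, mem: h + [rules[len(h) % len(rules)]]
evaluate = lambda h: len(set(h))               # score = number of distinct rules

final, score, memory = self_evolve([], propose, evaluate, rounds=5)
print(score)   # 3: three distinct rules kept, then further edits reverted
```

The asymmetry is the interesting part: a rejected edit still leaves a trace in `memory`, so later rounds start smarter even when nothing was kept.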
I'm pretty sure every major AI lab is doing some version of this internally. The fact that MiniMax is publishing it openly is what makes this release worth paying attention to.
huggingface : huggingface.co/MiniMaxAI/Mini…
Blog: minimax.io/news/minimax-m…
Note: the model is released under a NON-COMMERCIAL license. That said, there's a lot to learn from this work being available in the open.

@shawnchauhan1 The real value lies in solving niche, high-impact problems, not just building what general LLMs will eventually be able to do with enough training and scale.

India is not waiting for OpenAI to build a Hindi voice model.
Sarvam AI is close to raising $300–350 million at a $1.5 billion valuation, with NVIDIA, Amazon, and Bessemer all participating.
The product: voice-first, multilingual models covering 22 Indian languages.
This is not a ChatGPT wrapper.
It is a direct bet that the next billion AI users will not interact in English, and that whoever builds the native-language infrastructure first will own the relationship.
Every large language market without a domestic frontier model is a gap waiting to be filled.
Sarvam is the first serious attempt to fill India's.


How to found a GmbH in Germany in 48 hours (digitally) 🇩🇪.
A lot of people commented on my other post, so I thought I'd share all the details. Here is a step-by-step guide for how to do it.
1. Get your electronic ID activated. Any German ID issued after 2022 works. You sign up online and get a PIN when you first receive it.
2. Call or email a notary and ask for a digital appointment. I called two in Munich on a Wednesday. One had a slot 18 hours later.
3. Optional: get a free confirmation from your local IHK that your "Gesellschaftszweck" (company purpose) doesn't conflict with existing companies. Takes just a web form and 24 hours. Draft your company purpose with Claude.
4. Use the "Musterprotokoll", the simplest standard articles of association. Every notary has them ready to go.
5. Send the notary your company basics: name, address, share capital, and personal data.
6. They send you an invite to a video appointment.
7. The call takes ~10 minutes. You hold your ID to your phone via NFC in the notary app to verify your identity.
8. They send you the incorporation documents after the call.
9. Open a business bank account (I used Qonto) and wire the capital. A GmbH requires €25K share capital, but only half (€12.5K) has to be paid in at founding. The bank confirms your deposit to the notary.
10. The notary triggers entry into the commercial register. Took me three working days.
bUt YoU haV'nT rEalLY inCorPorATeD yEt! 😡😭
> The GmbH can operate as a "GmbH in Gründung" (GmbH i.G.) from the moment the notarization is complete, so even before entry in the commercial register. Once the entry is made, it becomes a full GmbH.
> The VAT ID is separate and not required to start operating, though you'll need it for invoicing with VAT.
Godspeed.

Composer 2 is out!
Cursor is an example of a new type of company, not a pure app maker and not a model provider.
Our aim is to build the most useful coding agents by combining the best API models and our domain-specific models.
Cursor@cursor_ai
Composer 2 is now available in Cursor.

Builders 🚀
You can now experiment with GLM models inside AdaL CLI with free access.
A coding agent designed for real developer workflows.
• multi-model support
• long-term memory
• designed for shipping products faster
Exactly the kind of tooling we want to support through the Z.ai Startup Program.
Watch the video below.👇

@HelenaWangZuZAI Helena, we already applied. Could you check our application? Thank you 🤩

GLM models support founders via free API tokens, up to $10,000!
Apply here:
startup.z.ai/?src=linkedin&…

We’re seeing more startups move from “demo” to real production AI.
But the hard part isn’t the prototype.
It’s shipping reliably. Scaling economically. Handling real users.
That’s why we built the “Z.ai Startup Program” → Grow your startup beyond limits with Z.ai.
Apply now → lnkd.in/gNJhHmGG. Ship faster, scale cheaper.

Z.ai Startup Program is NOW OPEN.
What you can get:
·Free API credits
·Priority rate limits
·Exclusive Community
·Early API Access
Who we're looking for:
·AI-native startups
·Agent builders
·SaaS founders integrating LLM infra
·Global teams building for real-world scale
If you're building something that matters, don't wait!!
Apply now: startup.z.ai
Questions? Details? Follow & DM @ZaiforStartups


@livingdevops @mischavdburg @brankopetric00 This is the reason CloudNativePG exists. We have been managing PostgreSQL clusters on-premises for over five years.

@mischavdburg @brankopetric00 One should not put their main DB in a container in production. It's OK for demo and dev environments, but never in production.

The obsession with putting everything in a container has to stop.
"We should containerize our PostgreSQL database!" said someone who has never managed stateful data in their life.
Not everything is a stateless 12-factor app. You're adding a layer of abstraction, performance overhead, and operational complexity for literally zero benefit.
Use a managed database service. Stop trying to Dockerize things that want to live on a file system.

@karanjagtiani04 @K8sArchitect I wouldn’t recommend all these tools

@K8sArchitect True, oversimplification can lead to missed nuances. How does K8sgpt handle complex cases?

K8sgpt is a tool that scans Kubernetes clusters, diagnoses, and triages issues in simple English
➤ ku.bz/jfdbw60d4
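A typical session looks something like this (a sketch of the k8sgpt CLI; the OpenAI backend and model choice are just examples, and both steps need a reachable cluster):

```shell
# Point k8sgpt at an LLM backend (OpenAI shown as an example)
k8sgpt auth add --backend openai --model gpt-4o

# Scan the cluster and get plain-English explanations of what it found
k8sgpt analyze --explain

# Narrow the scan, e.g. to Pods in a single namespace
k8sgpt analyze --explain --filter Pod --namespace prod
```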




