Paramendra Kumar Bhagat

118K posts

@paramendra

Entrepreneur/Author/Activist. Marketing: The New Bottleneck in the Age of AI https://t.co/fkAbmJaCtd · Books https://t.co/dUGQBiAUND

TX · Joined January 2009
34.6K Following · 32.2K Followers
Paramendra Kumar Bhagat retweeted
Fuli Luo @_LuoFuli
MiMo-V2-Flash is live. It's just step 2 on our AGI roadmap, but I wanted to dump some notes on the engineering choices that actually moved the needle.

Architecture: We settled on a Hybrid SWA. It's simple, elegant, and in our internal benchmarks it outperformed other Linear Attention variants on long-context reasoning. Plus, a fixed KV cache just plays way nicer with current infra. Note: window size 128 turned out to be the magic number (512 actually degraded performance). Also, sink values are non-negotiable; don't skip them.

MTP (Multi-Token Prediction): This is underrated for efficient RL. Aside from the first layer, it needs surprisingly little fine-tuning to hit high accept length. With a 3-layer MTP, we're seeing >3 accept length and ~2.5x speedup in coding tasks. It effectively solves the GPU idle time from long-tail samples in small-batch on-policy RL. We didn't get to squeeze it into the RL loop this time due to deadlines, but it's a perfect fit. We open-sourced the 3-layer MTPs so you can develop with them.

Post-train with MOPD: We adopted On-Policy Distillation from Thinking Machines to merge multiple RL models, and the efficiency gains were wild. We matched the teacher model's performance using less than 1/50th the compute of a standard SFT+RL pipeline. There's a clear path here to a self-reinforcing loop where the student evolves into a stronger teacher.

Huge props to my team. They sculpted these ideas from scratch into production in just a few months. Full breakdown is in the tech report. If this kind of pragmatic engineering resonates with you, we should talk.
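The "sliding window plus sink tokens" idea in the tweet can be sketched as a boolean attention mask. This is an illustrative toy in NumPy, not MiMo's implementation; `num_sinks=4` is an assumed value (the tweet only says sinks are required, not how many).

```python
import numpy as np

def swa_mask(seq_len: int, window: int = 128, num_sinks: int = 4) -> np.ndarray:
    """Boolean mask: True where query position i may attend to key position j.

    Causal sliding-window attention: each token sees at most `window`
    preceding tokens, plus a few global "sink" tokens at the start of
    the sequence that every token can always attend to. The per-token
    key budget is therefore bounded, which is why the KV cache stays
    fixed-size regardless of sequence length.
    """
    q = np.arange(seq_len)[:, None]   # query positions (column vector)
    k = np.arange(seq_len)[None, :]   # key positions (row vector)
    causal = k <= q                   # no attending to the future
    in_window = (q - k) < window      # within the sliding window
    is_sink = k < num_sinks           # always-visible sink tokens
    return causal & (in_window | is_sink)

mask = swa_mask(seq_len=512, window=128, num_sinks=4)
# A late token sees its 128-token window plus the 4 sinks: 132 keys total.
```

The fixed 132-key budget per query is what makes this "play nicer with current infra" than linear-attention variants whose state formats differ from a standard KV cache.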
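The tweet's ~2.5x figure from a >3 accept length is consistent with a simple cost model for MTP-style speculative decoding. This is a back-of-the-envelope sketch, not MiMo's measurement; the per-head draft cost (10% of a full forward pass) is an assumption chosen to illustrate the arithmetic.

```python
def mtp_speedup(accept_len: float, num_draft_heads: int, draft_cost: float) -> float:
    """Toy throughput model for speculative decoding with MTP draft heads.

    Baseline: one full forward pass per generated token. With MTP, each
    full (verify) pass emits `accept_len` tokens on average, at the extra
    cost of `num_draft_heads` cheap draft passes, each costing `draft_cost`
    as a fraction of a full pass.
    """
    cost_per_verify_step = 1.0 + num_draft_heads * draft_cost
    return accept_len / cost_per_verify_step  # tokens per unit of baseline cost

# 3 draft heads at an assumed ~10% of a full pass each, accept length 3.2:
speedup = mtp_speedup(accept_len=3.2, num_draft_heads=3, draft_cost=0.1)
# Lands in the ~2.5x range the thread reports for coding tasks.
```

The same model also shows why MTP helps long-tail samples in small-batch on-policy RL: a straggler sequence finishes in roughly `accept_len` times fewer full passes, shrinking the window where other GPUs sit idle.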
Paramendra Kumar Bhagat retweeted
Fuli Luo @_LuoFuli
Imagination is the ceiling of productivity in the new era. Inspiring imagination is the core of management in the age of Claw.
Paramendra Kumar Bhagat retweeted
Fuli Luo @_LuoFuli
MiMo-V2-Pro & Omni & TTS is out. Our first full-stack model family built truly for the Agent era.

I call this a quiet ambush. Not because we planned it, but because the shift from the Chat to the Agent paradigm happened so fast, even we barely believed it. Somewhere in between was a process that was thrilling, painful, and fascinating all at once.

The 1T base model started training months ago. The original goal was long-context reasoning efficiency. Hybrid Attention carries real innovation without overreaching, and it turns out to be exactly the right foundation for the Agent era. 1M context window. MTP inference for ultra-low latency and cost. These architectural decisions weren't trendy. They were a structural advantage we built before we needed it.

What changed everything was experiencing a complex agentic scaffold (what I'd call orchestrated Context) for the first time. I was shocked on day one. I tried to convince the team to use it. That didn't work. So I gave a hard mandate: anyone on the MiMo team with fewer than 100 conversations tomorrow can quit. It worked. Once the team's imagination was ignited by what agentic systems could do, that imagination converted directly into research velocity.

People ask why we move so fast. I saw it firsthand building DeepSeek R1. My honest summary:
- Backbone and infra research has long cycles. You need strategic conviction a year before it pays off.
- Post-train agility is a different muscle: product intuition driving evaluation, compressed iteration cycles, paradigm shifts caught early.
- And the constant: curiosity, sharp technical instinct, decisive execution, full commitment, and something that's easy to underestimate: a genuine love for the world you're building for.

We will open-source when the models are stable enough to deserve it.

From Beijing, very late, not quite awake.
Paramendra Kumar Bhagat retweeted
Rui Ma @ruima
The “genius girl” who previously worked at DeepSeek and was recruited by Lei Jun for Xiaomi AI is now on Twitter as well. It feels like more Chinese AI talent is realizing they can come here, speak for themselves, and build influence directly. I’m all for the added interaction and transparency.
Quoting Fuli Luo @_LuoFuli: the MiMo-V2-Pro & Omni & TTS announcement (full text above).

Paramendra Kumar Bhagat retweeted
Rahul Mathur @Rahul_J_Mathur
This is my 3rd trip to SF in the past 6 months, and my key learning remains the same: the best research in a post-Claude world will still be done by stepping out of our virtual offices for the in-person coffee conversation, or to attend a curated offline event. AI agents in most fields are rate-limited by human prompts and by the availability of proprietary signals (the latter of which can only be gathered by human effort in the physical world). While prompting, agents, and tools continue to get better, you'll find the real business decisions being made by human beings based on offline signals in the physical world.
Paramendra Kumar Bhagat retweeted
Jeson Lee @thejesonlee
If you have something of value, even people who hate you will crawl back to be your friends. Stop networking, start building.
Y Combinator @ycombinator
Congrats to @beparallelhq on their $20M Series A! Parallel's AI agents integrate directly into hospital software to automate administrative workflows. In less than 12 months, they've deployed across dozens of hospitals, reached millions in ARR, and recovered tens of millions for clients—with a team of just over 10 people. tech.eu/2026/03/19/par…
Paramendra Kumar Bhagat retweeted
Marco Dewey @marco_dewey
I am looking to invest in hard tech founders just starting out. If you're building in:
- robotics
- energy
- manufacturing
- aerospace
- defense
- semiconductors
- industrial software
- advanced materials
I want to meet you. Reply here or DM me with what you are building (or want to build).
Will McKelvey @Will_McKelvey
We’re hiring a director of talent, and my gut is the right person has near-zero recruiting experience. The right person will have the energy of an associate investor but with a passion for collecting great people, rather than companies. IMO, this style of search is the future of hiring: traits/skills > credentials/experience Anyone come to mind?
Paramendra Kumar Bhagat retweeted
Brent Fulfer @Brent_Fulfer
Founders who raise on Zoom alone don't raise.

When I was raising my first fund I lived in San Diego. I flew to Dubai. Switzerland. Israel. New York. Then moved to Dubai for a month. Not because I wanted to, but because you cannot raise money on Zoom calls alone.

The founders I meet who can't raise have one thing in common. They're pitching from their bedroom. Sending decks. Booking Zoom calls. Following up on emails. And wondering why nobody writes the check.

Investors back people before they back products. And they decide whether they back you in the first 10 minutes of being in the same room, not on a 30-minute video call where you're a face on a screen.

The founders closing rounds in 2026 are on planes. They're at @consensus2026. @EthCC. @ParisBlockWeek. @token2049. They're having dinner with the right people. They're doing the uncomfortable thing most founders won't do.

Get out of your house!!!
Paramendra Kumar Bhagat retweeted
Narendra Modi @narendramodi
Conveyed advance Eid wishes to my brother, His Majesty King Abdullah II, the King of Jordan, over phone. We expressed concern at the evolving situation in West Asia and highlighted the need for dialogue and diplomacy for the early restoration of peace, security and stability in the region. Attacks on energy infrastructure in West Asia are condemnable and can lead to avoidable escalation. India and Jordan stand in support of unhindered transit of goods and energy. Deeply appreciated Jordan’s efforts in facilitating the safe return of Indians stranded in the region. @KingAbdullahII