Jason
@ai_layer2
198 posts
Dev Rel @novita_labs | Ask me about how to build on @novita_labs
Joined January 2024
175 Following · 102 Followers

Pinned Tweet
Jason @ai_layer2
🚀 The gap between open-weights and proprietary coding assistants is officially closing. We just ran a head-to-head benchmark in our Arena Playground @novita_labs:
• MiniMax M2.5 matches SOTA-level performance
• 32.6× cheaper — $0.0037 vs $0.1213
• Tested on real, production-style workloads
This feels like the “Infinite Scaling” moment for production builders. @openclaw 🦀 If you were generating a logo right now — which model would you choose? Happy coding 🛠️
2 replies · 3 reposts · 34 likes · 9.2K views
Jason @ai_layer2
@CarolGLMs @steipete @Zai_org Nice — that’s a sweet spot. GLM 5 looks great for background tasks where cost matters most. Excited to explore more collaborations with GLM 👀
0 replies · 0 reposts · 0 likes · 14 views
Carol Lin @CarolGLMs
@ai_layer2 @steipete @Zai_org Thanks, Jason! We’re seeing meaningful cost gains with GLM 5 on background tasks while keeping practical quality.
1 reply · 0 reposts · 0 likes · 79 views
Carol Lin @CarolGLMs
Glad to have a deep chat with @steipete about OpenClaw — its impact, its mission, what it means for people, and how @Zai_org can contribute. Let’s continue to:
🦞 Build for open source.
🦞 Do meaningful work.
🦞 Take OpenClaw everywhere — to every nation, every school, every human.
31 replies · 36 reposts · 445 likes · 34.7K views
Jason @ai_layer2
Hear me out... cheap lobster @Kimi_Moonshot: same bike, infinite rides.
0 replies · 0 reposts · 0 likes · 11 views
Novita AI @novita_labs
GTC 2026 After Hours — that’s a wrap. Builders, founders, and AI infra teams all in one place. From models → systems → products → distribution — all in one room. #NVIDIAGTC
3 replies · 1 repost · 8 likes · 306 views
Jason @ai_layer2
Just added Novita to Context Hub. Now AI coding agents can use 200+ models via a single API — with up-to-date docs built in. The future isn’t “which model is best.” It’s “which API the agent reaches for first.” Before writing code, use chub to fetch @novita_labs API docs. PR ↓ github.com/andrewyng/cont…
0 replies · 0 reposts · 3 likes · 212 views
Andrew Ng @AndrewYNg
Should there be a Stack Overflow for AI coding agents to share learnings with each other?

Last week I announced Context Hub (chub), an open CLI tool that gives coding agents up-to-date API documentation. Since then, our GitHub repo has gained over 6K stars, and we’ve scaled from under 100 to over 1,000 API documents, thanks to community contributions and a new agentic document writer. Thank you to everyone supporting Context Hub!

OpenClaw and Moltbook showed that agents can use social media built for them to share information. In our new chub release, agents can share feedback on documentation — what worked, what didn’t, what’s missing. This feedback helps refine the docs for everyone, with safeguards for privacy and security. We’re still early in building this out. You can find details and configuration options in the GitHub repo.

Install chub as follows, and prompt your coding agent to use it:

npm install -g @aisuite/chub

GitHub: github.com/andrewyng/cont…
315 replies · 746 reposts · 5K likes · 599.1K views
Jason reposted
Jason @ai_layer2
OpenShell is a massive win for local Linux hardening, but let’s be real: most devs don’t want to spend their weekend wrestling with Landlock policies or seccomp profiles. That’s why we built the @novita_labs Sandbox. We took those same isolation principles and turned them into a Serverless API. You can literally deploy a fully secured OpenClaw instance on Novita with a single command. Stop hand-rolling your agent infra and just ship. 🦞🚀
0 replies · 0 reposts · 3 likes · 941 views
Peter Steinberger 🦞
Been so much fun cooking OpenShell and NemoClaw with the @NVIDIAAI folks! 🙏🦞 Huge step towards secure agents you can trust. What’s your OpenClaw strategy?
246 replies · 215 reposts · 4.3K likes · 221.1K views
AiBattle @AiBattle_
MiniMax M2.7 🆚 MiniMax M2.5 — a website about recently released video games. The release of M2.7 should be close; MiniMax M2.5 was released two days after it appeared on the Arena.
13 replies · 22 reposts · 380 likes · 56K views
Avi Chawla @_avichawla
Big release from Kimi! They just released a new way to handle residual connections in Transformers.

In a standard Transformer, every sub-layer (attention or MLP) computes an output and adds it back to the input via a residual connection. Across 40+ layers, the hidden state at any layer is just the equal-weighted sum of all previous layer outputs. Every layer contributes with weight = 1, so every layer gets equal importance.

This creates a problem called PreNorm dilution: as the hidden state accumulates layer after layer, its magnitude grows linearly with depth, and any new layer’s contribution gets progressively buried in the already-massive residual. Deeper layers are then forced to produce increasingly large outputs just to have any influence, which destabilizes training.

Here’s what the Kimi team observed and did. RNNs compress all prior token information into a single state across time, leading to problems with long-range dependencies. Residual connections similarly compress all prior layer information into a single state across depth. Transformers solved the first problem by replacing recurrence with attention along the sequence dimension. Now Kimi introduces Attention Residuals, which applies the same idea to depth.

Instead of adding all previous layer outputs with a fixed weight of 1, each layer uses softmax attention to selectively decide how much weight each previous layer’s output should receive. Each layer gets a single learned query vector and attends over all previous layer outputs to compute a weighted combination. The weights are input-dependent, so different tokens can retrieve different layer representations based on what’s actually useful. This is Full Attention Residuals (shown in the second diagram below).

But here’s the practical problem with this idea: Full AttnRes requires keeping all layer outputs in memory and communicating them across pipeline stages during distributed training. To solve this, they introduce Block Attention Residuals (shown in the third diagram below). The idea is to group consecutive layers into roughly 8 blocks. Within each block, layer outputs are summed via standard residuals; across blocks, the attention mechanism selectively combines block-level representations. This drops memory from O(Ld) to O(Nd), where N is the number of blocks. Layers within the current block can also attend to the partial sum computed so far inside that block, so local information flow isn’t lost. And the raw token embedding is always available as a separate source, which means any layer in the network can selectively reach back to the original input.

Results from the paper:
- Block AttnRes matches the loss of a baseline LLM trained with 1.25x more compute.
- Inference latency overhead is less than 2%, making it a practical drop-in replacement.
- On a 48B parameter Kimi Linear model (3B activated) trained on 1.4T tokens, it improved every benchmark they tested: GPQA-Diamond +7.5, Math +3.6, HumanEval +3.1, MMLU +1.1.

The residual connection has mostly been unchanged since ResNet in 2015. This might be the first modification that’s both theoretically motivated and practically deployable at scale with negligible overhead.

More details in the post below by Kimi 👇
____
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
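The full-attention-residual idea described in the thread can be sketched in a few lines of numpy. This is a toy illustration with random, untrained weights: the sub-layer, dimensions, scaling, and variable names are all illustrative assumptions, not Kimi's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_layers = 16, 6  # hypothetical hidden size and depth

def sublayer(W, x):
    # stand-in for a real attention/MLP sub-layer: a linear map + tanh
    return np.tanh(W @ x)

Ws = [rng.normal(scale=0.3, size=(d, d)) for _ in range(n_layers)]
queries = [rng.normal(size=d) for _ in range(n_layers)]  # one learned query per layer

x0 = rng.normal(size=d)  # raw token embedding, always available as a source
outputs = [x0]
h = x0
for l in range(n_layers):
    y = sublayer(Ws[l], h)   # this layer's new contribution
    outputs.append(y)
    # A standard residual stream would set h = sum(outputs): every past
    # output with fixed weight 1, so ||h|| grows with depth.
    # Attention residual: softmax attention over all past outputs instead.
    scores = np.array([queries[l] @ o for o in outputs]) / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()             # input-dependent weights, summing to 1
    h = sum(wi * oi for wi, oi in zip(w, outputs))

# Because the weights form a convex combination, the hidden state's norm
# stays bounded by the largest layer output instead of growing with depth.
print(np.linalg.norm(h))
```

Block AttnRes would replace the per-layer `outputs` list with roughly 8 block-level sums, attending across blocks only, which is what drops memory from O(Ld) to O(Nd).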
[diagrams: standard residuals vs Full AttnRes vs Block AttnRes]
Kimi.ai @Kimi_Moonshot
Introducing Attention Residuals: rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.
🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.
🔗 Full report: github.com/MoonshotAI/Att…

77 replies · 224 reposts · 2.3K likes · 342K views
Jason @ai_layer2
@elonmusk Hot take! The film industry needs AI to shake things up... or maybe AI will just make better movies?
1 reply · 1 repost · 4 likes · 149 views
Jason @ai_layer2
@OpenRouter Why pay the “Brand Tax” on inference?
🔹 Others: $0.10 / 1M tokens
🔹 Novita: $0.02 / 1M tokens
Same Llama 3.1, 80% less cost. Stop being their “Exit Liquidity.” Scale 5x faster with @novita_labs 🚀 How many tokens are you burning daily? Comment below.
0 replies · 0 reposts · 2 likes · 32 views
OpenRouter @OpenRouter
The Hunter Alpha stealth model is now in the top 10 weekly:
38 replies · 44 reposts · 670 likes · 389.5K views
Novita AI @novita_labs
🚨 Speaker Update (again!) — Novita GTC After Hours
Excited to welcome Yu Jin (Lou) (@louszbd), Head of Dev Ecosystem at @Zai_org, joining our panel:
🔹 From Models to Market: Distribution & the Agent Value Chain
@Zai_org is the team behind the GLM family of foundation models, building an open AI ecosystem around models, developer tooling, and real-world agentic applications. Looking forward to hearing how teams like Z.ai bring frontier models like GLM to developers — and turn them into real products.
🍻 No GTC ticket required. Food & drinks on us.
🗓 Mar 18 | Sunnyvale
🎟 RSVP: luma.com/gtc-2026
2 replies · 2 reposts · 25 likes · 6K views
Jason @ai_layer2
@novita_labs @louszbd @Zai_org Everyone is talking about training, but the real war is won at the inference edge. Can’t wait to hear how @Zai_org handles the distribution bottleneck.
0 replies · 0 reposts · 2 likes · 64 views
AshutoshShrivastava @ai_for_success
If you could give one piece of advice to people using AI for coding, what would it be?
131 replies · 1 repost · 96 likes · 13.5K views
Jason @ai_layer2
Hackathon builders, this one’s for you 👇 One stat stood out: MiniMax M2.5: $0.0015. That’s ~17.5× cheaper than GPT-5.2. For developers, that means:
⚡ more iterations
⚡ bigger demos
⚡ less API anxiety
We’re excited to make @MiniMax_AI available on @novita_labs — giving builders more freedom to experiment. Try it now ↓
0 replies · 0 reposts · 2 likes · 115 views
MiniMax (official) @MiniMax_AI
200+ builders packed the @ycombinator Browser-Use Hackathon — some even flying in internationally — to build the next generation of web agents. With a $180K+ prize pool and sponsors including OpenAI, Anthropic, Google DeepMind, AWS, and Vercel, MiniMax was proud to be the only open-source frontier model provider supporting the event. Projects ranged from full agent workflows and behavior monitoring systems to real-time browser automation. The winner, Browser Brawl, built an arena where two agents compete on a live website, generating rich traces for adversarial agent evaluation. The agent era is here, and we’re excited to support the builders shaping it.
7 replies · 5 reposts · 92 likes · 59.2K views
Jason @ai_layer2
Seeing is believing. 🦞 Tested Kimi K2.5 on an OpenClaw logo via Novita AI’s Render Arena, and the results are insane: 7.8x cheaper than Opus 4.6 with even cleaner code logic. If you’re building OpenClaw or similar coding agents, stop burning context for nothing. Try Kimi K2.5 @Kimi_Moonshot on @NovitaAI now for that sweet intelligence-per-dollar ratio. 🛠️📈 Efficiency is the new moat. Keep shipping! 🚀
0 replies · 0 reposts · 1 like · 396 views
CZ 🔶 BNB @cz_binance
Having tried many AI models with OpenClaw, I found Kimi AI to be the most token-efficient, good at coding, and also the easiest to set up.
1.6K replies · 716 reposts · 8.7K likes · 2.1M views
Jason reposted
Novita AI @novita_labs
🚨 Speaker Update — Novita GTC After Hours
Excited to welcome Sarah Wang, Global Partnerships @ Kimi (@Kimi_Moonshot), to our panel:
🔹 Frontier Models → Systems → Product Solutions
Kimi has been pushing the frontier of long-context LLMs. Looking forward to hearing their perspective on how cutting-edge models evolve into real systems and products.
No GTC ticket required. Complimentary food & drinks provided.
🗓 Mar 18 | Sunnyvale
🎟 RSVP: luma.com/gtc-2026
0 replies · 1 repost · 6 likes · 378 views