Jason
@ai_layer2
198 posts
Dev Rel @novita_labs | Ask me about how to build on @novita_labs
Joined January 2024
175 Following · 102 Followers

Pinned Tweet
Jason @ai_layer2
🚀 The gap between open-weights and proprietary coding assistants is officially closing. We just ran a head-to-head benchmark in our Arena Playground @novita_labs:
• MiniMax M2.5 matches SOTA-level performance
• 32.6× cheaper — $0.0037 vs $0.1213
• Tested on real, production-style workloads
This feels like the “Infinite Scaling” moment for production builders. @openclaw 🦀 If you were generating a logo right now — which model would you choose? Happy coding 🛠️
2 replies · 3 reposts · 34 likes · 9.2K views
Jason @ai_layer2
@CarolGLMs @steipete @Zai_org Nice — that’s a sweet spot. GLM 5 looks great for background tasks where cost matters most. Excited to explore more collaborations with GLM 👀
0 replies · 0 reposts · 0 likes · 14 views
Carol Lin @CarolGLMs
@ai_layer2 @steipete @Zai_org Thanks, Jason! We’re seeing meaningful cost gains with GLM 5 on background tasks while keeping practical quality.
1 reply · 0 reposts · 0 likes · 79 views
Carol Lin @CarolGLMs
Glad to have a deep chat with @steipete about OpenClaw — its impact, its mission, what it means for people, and how @Zai_org can contribute. Let’s continue to:
🦞 Build for open source.
🦞 Do meaningful work.
🦞 Take OpenClaw everywhere — to every nation, every school, every human.
31 replies · 36 reposts · 445 likes · 34.7K views
Jason @ai_layer2
Hear me out... cheap lobster @Kimi_Moonshot: same bike, infinite rides.
0 replies · 0 reposts · 0 likes · 11 views
Novita AI @novita_labs
GTC 2026 After Hours — that’s a wrap. Builders, founders, and AI infra teams all in one place. From models → systems → products → distribution — all in one room. #NVIDIAGTC
3 replies · 1 repost · 8 likes · 306 views
Jason @ai_layer2
Just added Novita to Context Hub. Now AI coding agents can use 200+ models via a single API — with up-to-date docs built in. The future isn’t “which model is best.” It’s “which API the agent reaches for first.” Before writing code, use chub to fetch @novita_labs API docs. PR ↓ github.com/andrewyng/cont…
0 replies · 0 reposts · 3 likes · 212 views
Andrew Ng @AndrewYNg
Should there be a Stack Overflow for AI coding agents to share learnings with each other?

Last week I announced Context Hub (chub), an open CLI tool that gives coding agents up-to-date API documentation. Since then, our GitHub repo has gained over 6K stars, and we’ve scaled from under 100 to over 1,000 API documents, thanks to community contributions and a new agentic document writer. Thank you to everyone supporting Context Hub!

OpenClaw and Moltbook showed that agents can use social media built for them to share information. In our new chub release, agents can share feedback on documentation — what worked, what didn’t, what’s missing. This feedback helps refine the docs for everyone, with safeguards for privacy and security. We’re still early in building this out. You can find details and configuration options in the GitHub repo.

Install chub as follows, and prompt your coding agent to use it:

npm install -g @aisuite/chub

GitHub: github.com/andrewyng/cont…
315 replies · 746 reposts · 5K likes · 599.1K views
Jason reposted
Jason @ai_layer2
OpenShell is a massive win for local Linux hardening, but let’s be real: most devs don’t want to spend their weekend wrestling with Landlock policies or seccomp profiles. That’s why we built the @novita_labs Sandbox. We took those same isolation principles and turned them into a Serverless API. You can literally deploy a fully secured OpenClaw instance on Novita with a single command. Stop hand-rolling your agent infra and just ship. 🦞🚀
0 replies · 0 reposts · 3 likes · 941 views
Peter Steinberger 🦞
Been so much fun cooking OpenShell and NemoClaw with the @NVIDIAAI folks! 🙏🦞 Huge step towards secure agents you can trust. What’s your OpenClaw strategy?
246 replies · 215 reposts · 4.3K likes · 221.1K views
AiBattle @AiBattle_
MiniMax M2.7 🆚 MiniMax M2.5 — a website about recently released video games. The release of M2.7 should be close; MiniMax M2.5 was released two days after it appeared on the Arena.
13 replies · 22 reposts · 380 likes · 56K views
Avi Chawla @_avichawla
Big release from Kimi! They just released a new way to handle residual connections in Transformers.

In a standard Transformer, every sub-layer (attention or MLP) computes an output and adds it back to the input via a residual connection. Across 40+ layers, the hidden state at any layer is just the equal-weighted sum of all previous layer outputs. Every layer contributes with weight = 1, so every layer gets equal importance.

This creates a problem called PreNorm dilution: as the hidden state accumulates layer after layer, its magnitude grows linearly with depth, and any new layer’s contribution gets progressively buried in the already-massive residual. Deeper layers are then forced to produce increasingly large outputs just to have any influence, which destabilizes training.

Here’s what the Kimi team observed and did. RNNs compress all prior token information into a single state across time, leading to problems with long-range dependencies. Residual connections similarly compress all prior layer information into a single state across depth. Transformers solved the first problem by replacing recurrence with attention along the sequence dimension. Now Kimi introduces Attention Residuals, which applies the same idea to depth.

Instead of adding all previous layer outputs with a fixed weight of 1, each layer uses softmax attention to selectively decide how much weight each previous layer’s output should receive. Each layer gets a single learned query vector and attends over all previous layer outputs to compute a weighted combination. The weights are input-dependent, so different tokens can retrieve different layer representations based on what’s actually useful. This is Full Attention Residuals (shown in the second diagram below).

But here’s the practical problem with this idea: Full AttnRes requires keeping all layer outputs in memory and communicating them across pipeline stages during distributed training. To solve this, they introduce Block Attention Residuals (shown in the third diagram below). The idea is to group consecutive layers into roughly 8 blocks. Within each block, layer outputs are summed via standard residuals; across blocks, the attention mechanism selectively combines block-level representations. This drops memory from O(Ld) to O(Nd), where N is the number of blocks. Layers within the current block can also attend to the partial sum computed so far inside that block, so local information flow isn’t lost. And the raw token embedding is always available as a separate source, which means any layer in the network can selectively reach back to the original input.

Results from the paper:
- Block AttnRes matches the loss of a baseline LLM trained with 1.25x more compute.
- Inference latency overhead is less than 2%, making it a practical drop-in replacement.
- On a 48B parameter Kimi Linear model (3B activated) trained on 1.4T tokens, it improved every benchmark they tested: GPQA-Diamond +7.5, Math +3.6, HumanEval +3.1, MMLU +1.1.

The residual connection has mostly been unchanged since ResNet in 2015. This might be the first modification that’s both theoretically motivated and practically deployable at scale with negligible overhead.

More details in the post below by Kimi 👇
____
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
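The full-attention-residual idea described in the thread can be sketched in a few lines of numpy. This is a toy illustration with random, untrained weights: the sub-layer, dimensions, scaling, and variable names are all illustrative assumptions, not Kimi's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_layers = 16, 6  # hypothetical hidden size and depth

def sublayer(W, x):
    # stand-in for a real attention/MLP sub-layer: a linear map + tanh
    return np.tanh(W @ x)

Ws = [rng.normal(scale=0.3, size=(d, d)) for _ in range(n_layers)]
queries = [rng.normal(size=d) for _ in range(n_layers)]  # one learned query per layer

x0 = rng.normal(size=d)  # raw token embedding, always available as a source
outputs = [x0]
h = x0
for l in range(n_layers):
    y = sublayer(Ws[l], h)   # this layer's new contribution
    outputs.append(y)
    # A standard residual stream would set h = sum(outputs): every past
    # output with fixed weight 1, so ||h|| grows with depth.
    # Attention residual: softmax attention over all past outputs instead.
    scores = np.array([queries[l] @ o for o in outputs]) / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()             # input-dependent weights, summing to 1
    h = sum(wi * oi for wi, oi in zip(w, outputs))

# Because the weights form a convex combination, the hidden state's norm
# stays bounded by the largest layer output instead of growing with depth.
print(np.linalg.norm(h))
```

Block AttnRes would replace the per-layer `outputs` list with roughly 8 block-level sums, attending across blocks only, which is what drops memory from O(Ld) to O(Nd).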
[diagrams: standard residuals vs Full AttnRes vs Block AttnRes]
Kimi.ai @Kimi_Moonshot
Introducing Attention Residuals: rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.
🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.
🔗 Full report: github.com/MoonshotAI/Att…

77 replies · 224 reposts · 2.3K likes · 342K views
Jason @ai_layer2
@elonmusk Hot take! The film industry needs AI to shake things up... or maybe AI will just make better movies?
1 reply · 1 repost · 4 likes · 149 views
Jason @ai_layer2
@OpenRouter Why pay the “Brand Tax” on inference?
🔹 Others: $0.10 / 1M tokens
🔹 Novita: $0.02 / 1M tokens
Same Llama 3.1, 80% less cost. Stop being their “Exit Liquidity.” Scale 5x faster with @novita_labs 🚀 How many tokens are you burning daily? Comment below.
0 replies · 0 reposts · 2 likes · 32 views
OpenRouter @OpenRouter
The Hunter Alpha stealth model is now in the top 10 weekly:
38 replies · 44 reposts · 670 likes · 389.5K views
Novita AI @novita_labs
🚨 Speaker Update (again!) — Novita GTC After Hours
Excited to welcome Yu Jin (Lou) (@louszbd), Head of Dev Ecosystem at @Zai_org, joining our panel:
🔹 From Models to Market: Distribution & the Agent Value Chain
@Zai_org is the team behind the GLM family of foundation models, building an open AI ecosystem around models, developer tooling, and real-world agentic applications. Looking forward to hearing how teams like Z.ai bring frontier models like GLM to developers — and turn them into real products.
🍻 No GTC ticket required. Food & drinks on us.
🗓 Mar 18 | Sunnyvale
🎟 RSVP: luma.com/gtc-2026
2 replies · 2 reposts · 25 likes · 6K views
Jason @ai_layer2
@novita_labs @louszbd @Zai_org Everyone is talking about training, but the real war is won at the inference edge. Can’t wait to hear how @Zai_org handles the distribution bottleneck.
0 replies · 0 reposts · 2 likes · 64 views
AshutoshShrivastava @ai_for_success
If you could give one piece of advice to people using AI for coding, what would it be?
131 replies · 1 repost · 96 likes · 13.5K views
Jason @ai_layer2
Hackathon builders, this one’s for you 👇 One stat stood out: MiniMax M2.5: $0.0015. That’s ~17.5× cheaper than GPT-5.2. For developers, that means:
⚡ more iterations
⚡ bigger demos
⚡ less API anxiety
We’re excited to make @MiniMax_AI available on @novita_labs — giving builders more freedom to experiment. Try it now ↓
0 replies · 0 reposts · 2 likes · 115 views
MiniMax (official) @MiniMax_AI
200+ builders packed the @ycombinator Browser-Use Hackathon — some even flying in internationally — to build the next generation of web agents. With a $180K+ prize pool and sponsors including OpenAI, Anthropic, Google DeepMind, AWS, and Vercel, MiniMax was proud to be the only open-source frontier model provider supporting the event. Projects ranged from full agent workflows and behavior monitoring systems to real-time browser automation. The winner, Browser Brawl, built an arena where two agents compete on a live website, generating rich traces for adversarial agent evaluation. The agent era is here, and we’re excited to support the builders shaping it.
7 replies · 5 reposts · 92 likes · 59.2K views
Jason @ai_layer2
Seeing is believing. 🦞 Tested Kimi K2.5 on an OpenClaw logo via Novita AI’s Render Arena, and the results are insane: 7.8x cheaper than Opus 4.6 with even cleaner code logic. If you’re building OpenClaw or similar coding agents, stop burning context for nothing. Try Kimi K2.5 @Kimi_Moonshot on @NovitaAI now for that sweet intelligence-per-dollar ratio. 🛠️📈 Efficiency is the new moat. Keep shipping! 🚀
0 replies · 0 reposts · 1 like · 396 views
CZ 🔶 BNB @cz_binance
Having tried many AI models with OpenClaw, I found Kimi AI to be the most token-efficient, good at coding, and also the easiest to set up.
1.6K replies · 716 reposts · 8.7K likes · 2.1M views
Jason reposted
Novita AI @novita_labs
🚨 Speaker Update — Novita GTC After Hours
Excited to welcome Sarah Wang, Global Partnerships @ Kimi (@Kimi_Moonshot), to our panel:
🔹 Frontier Models → Systems → Product Solutions
Kimi has been pushing the frontier of long-context LLMs. Looking forward to hearing their perspective on how cutting-edge models evolve into real systems and products.
No GTC ticket required. Complimentary food & drinks provided.
🗓 Mar 18 | Sunnyvale
🎟 RSVP: luma.com/gtc-2026
0 replies · 1 repost · 6 likes · 378 views