Today we release Lighthouse Attention, a selection-based hierarchical attention for long-context pre-training that delivers a 1.4-1.7× wall-clock speedup at 98K context.
It runs the same forward+backward pass ~17× faster than standard attention at 512K context on a single B200, without a custom sparse attention kernel, a straight-through estimator, or an auxiliary loss.
During training, queries, keys, and values are pooled symmetrically into a multi-resolution pyramid. We score the pyramid at every level, a top-k cascade selects a small hierarchical dense sub-sequence, and after a sorting pass that enforces causality, standard attention performs the token mixing. A brief full-attention resume at the end converts the checkpoint back into a competent dense-attention model.
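The pipeline above can be sketched for a single query as follows. This is a minimal, hypothetical NumPy illustration of the select-then-attend idea only (pool into a pyramid, score blocks, top-k select, sort, dense attention); block sizes, the pooling operator, the cascade details, and all names are assumptions, not the released implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def lighthouse_sketch(q, k, v, levels=2, top_k=2, block=4):
    """Hypothetical sketch: q is (d,), k and v are (T, d).

    1. Mean-pool keys into a multi-resolution pyramid of block summaries.
    2. Score the query against each block; keep top_k blocks per level.
    3. Gather the selected blocks' tokens, sort indices to restore causal
       order, then run standard dense attention on that sub-sequence.
    (Assumes q is the last position, so all gathered tokens are causal.)
    """
    T, d = k.shape
    selected = set()
    for lvl in range(levels):
        b = block * (2 ** lvl)                 # coarser blocks at higher levels
        n_blocks = T // b
        pooled = k[: n_blocks * b].reshape(n_blocks, b, d).mean(axis=1)
        scores = pooled @ q                    # one relevance score per block
        for blk in np.argsort(scores)[-top_k:]:
            selected.update(range(blk * b, (blk + 1) * b))
    idx = np.array(sorted(selected))           # sorting pass: causal order
    att = softmax((k[idx] @ q) / np.sqrt(d))   # standard attention on the
    return att @ v[idx], idx                   # small dense sub-sequence
```

Because the final mixing step is plain dense attention over a short sorted sub-sequence, no custom sparse kernel is needed, consistent with the claim above.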
We validated this with 530M-parameter Llama-3 models trained on 50B tokens, with benchmarks up to 1M tokens run across 32 B200s under context parallelism.
The work on Lighthouse Attention was led by @bloc97_, @SubhoGhosh02, and @theemozilla.
