Jai

923 posts

@JSupa15

ML Engineer on Twitter to see cool stuff and meet interesting people. Founding Dev at @NousResearch

Joined August 2013
265 Following · 778 Followers
Jai retweeted
Nous Research @NousResearch
Today we release Lighthouse Attention, a selection-based hierarchical attention for long-context pre-training that delivers a 1.4-1.7× wall-clock speedup at 98K context.

It runs the same forward+backward pass ~17× faster than standard attention at 512K context on a single B200, without a custom sparse attention kernel, a straight-through estimator, or an auxiliary loss.

During training, queries, keys, and values are pooled symmetrically into a multi-resolution pyramid. We then score every pyramid heads, and a top-k cascade selects a small hierarchical dense sub-sequence, and after a sorting pass that enforces causality, we use standard attention for token mixing. A brief full attention resume at the end converts the checkpoint back into a competent dense-attention model.

Validated this using 530M parameter Llama-3 models across 50B tokens, with up to 1M-token benchmarks across 32 B200s under context parallelism.

The work on Lighthouse Attention was led by @bloc97_, @SubhoGhosh02, and @theemozilla.
[media]
39 replies · 158 retweets · 1.4K likes · 81.2K views
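A toy sketch of the pool, score, select, and sort steps named in the Lighthouse Attention post above, in plain Python on scalar values. Everything here is an assumption for illustration: the post does not say how pooling or scoring is done, so this uses pairwise average pooling, a caller-supplied `score_fn`, and a single flat top-k instead of the cascade the post mentions. It is not the released method.

```python
def build_pyramid(seq, num_levels):
    """Pool a sequence into coarser levels by averaging adjacent pairs.

    Each entry is (start, end, value), covering positions [start, end).
    Level 0 is the raw sequence. Average pooling is an assumption.
    """
    pyramid = [[(i, i + 1, float(x)) for i, x in enumerate(seq)]]
    for _ in range(num_levels):
        prev = pyramid[-1]
        pooled = []
        for j in range(0, len(prev) - 1, 2):
            left, right = prev[j], prev[j + 1]
            pooled.append((left[0], right[1], (left[2] + right[2]) / 2))
        pyramid.append(pooled)
    return pyramid


def select_top_k(pyramid, score_fn, k):
    """Keep the k highest-scoring entries across all levels (flat top-k),
    then sort them by the position where each span ends, so the selected
    sub-sequence is back in sequence order.
    """
    entries = [entry for level in pyramid for entry in level]
    chosen = sorted(entries, key=score_fn, reverse=True)[:k]
    return sorted(chosen, key=lambda e: (e[1], e[0]))


pyramid = build_pyramid([1.0, 3.0, 5.0, 7.0], num_levels=2)
selected = select_top_k(pyramid, score_fn=lambda e: e[2], k=3)
```

The selected entries form a short dense sequence that ordinary attention could then run over, which is the property the post points to when it says no custom sparse kernel is needed.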
Jai retweeted
Nous Research @NousResearch
Today we release Token Superposition Training (TST), a modification to the standard LLM pretraining loop that produces a 2-3× wall-clock speedup at matched FLOPs without changing the model architecture, optimizer, tokenizer, or training data.

During the first third of training, the model reads and predicts contiguous bags of tokens, averaging their embeddings on the input side and predicting the next bag with a modified cross-entropy on the output side. For the remainder of the run, it trains normally on next-token prediction. The inference-time model is identical to one produced by conventional pretraining.

Validated at 270M, 600M, and 3B dense scales, and at 10B-A1B MoE.

The work on TST was led by @bloc97_, @gigant_theo, and @theemozilla.
[media]
148 replies · 415 retweets · 3.7K likes · 426K views
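A minimal sketch of the bag-of-tokens phase described in the TST post above, in plain Python. The function names, the fixed `bag_size`, and the loss target (a uniform distribution over the tokens of the next bag) are assumptions for illustration; the post only says that embeddings are averaged on the input side and that the cross-entropy is "modified".

```python
import math


def average_bag_embeddings(token_ids, embedding_table, bag_size):
    """Split token_ids into contiguous bags and average each bag's embeddings.

    Trailing tokens that do not fill a whole bag are dropped (an assumption).
    Returns (bags, pooled), with one averaged vector per bag.
    """
    bags = [
        token_ids[i:i + bag_size]
        for i in range(0, len(token_ids) - bag_size + 1, bag_size)
    ]
    dim = len(embedding_table[0])
    pooled = [
        [sum(embedding_table[t][d] for t in bag) / len(bag) for d in range(dim)]
        for bag in bags
    ]
    return bags, pooled


def bag_cross_entropy(logits, target_bag):
    """Cross-entropy against a uniform target over the tokens in target_bag.

    This is one plausible reading of "modified cross-entropy", not a
    confirmed definition.
    """
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return -sum(logits[t] - log_z for t in target_bag) / len(target_bag)


table = [[0.0, 0.0], [2.0, 4.0], [4.0, 0.0], [6.0, 8.0]]
bags, pooled = average_bag_embeddings([1, 2, 3, 0], table, bag_size=2)
loss = bag_cross_entropy([0.0, 0.0, 0.0, 0.0], target_bag=bags[1])
```

With a bag size of 1 both functions reduce to ordinary embedding lookup and next-token cross-entropy, which matches the post's statement that the rest of the run trains normally.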
Jai retweeted
Nous Research @NousResearch
Hermes Agent is now #1 on the Global @OpenRouter token rankings. While our journey together has just begun, we'd like to take this opportunity to thank our contributors, supporters, and users for all they have done to get us this far.
[media]
423 replies · 707 retweets · 7K likes · 2.9M views
Jai retweeted
Brooklyn! @imbabybrooklyn
👀 v. soon
[media]
36 replies · 14 retweets · 555 likes · 40.5K views
Jai retweeted
Nous Research @NousResearch
Trinity-Large-Thinking, @arcee_ai's latest model, is now free on Nous Portal for the next week.
Sign up for Nous Portal to use it in your Hermes Agent today.
36 replies · 40 retweets · 673 likes · 224.4K views
Josh Pigford @Shpigford
absolutely livid at how good hermes is (with gpt 5.5, no less!)
30 replies · 2 retweets · 155 likes · 47.3K views
Jai retweeted
Eth @EtherCoins
@Teknium We really do need a nice Hermes cheatsheet at this point :)
[media]
12 replies · 38 retweets · 277 likes · 9.9K views
Jai retweeted
D.F.K. Bananas @nunyabydnez
@NousResearch @Shopify haha goddammit, my job is basically making an app that does what this agent does. Oh well. I just have to make sure the company doesn't know about this.
1 reply · 1 retweet · 18 likes · 2.8K views
Jai retweeted
Nous Research @NousResearch
Shopify is the all-in-one commerce platform powering millions of businesses worldwide.
Thank you to the @Shopify team for building their own official Hermes Agent skill, enabling your agent to manage products, orders, inventory, and fulfillments from any channel.
135 replies · 202 retweets · 2.7K likes · 438.4K views
Jai retweeted
Teknium 🪽 @Teknium
Introducing Hermes Curator!

The new system built in to Hermes Agent now helps you keep the skills that the self-improvement loop creates in check, by consolidating and pruning automatically.

The curator does multiple things:
- Keeps track of how often you use each skill, when it was last updated/created, etc.
- Runs automatically once a week (configurable)
- Uses the analytics plus its own scanning of your skills and consolidates or prunes them if necessary
- Skips externally installed skills, built-in skills, and skills you "pin" that you don't want touched. It will only attempt curation over agent created/updated skills or user-written skills.
- It will then determine whether skills can be consolidated, pruned, or otherwise made more manageable. It will convert some skills that are too specific into references, templates or scripts for larger/broader skills, or integrate them directly into a consolidation of an existing skill.

You can also disable it entirely in the config.yaml and/or run it manually with `hermes curator run`

Learn more on the docs here: hermes-agent.nousresearch.com/docs/user-guid…
[media]
133 replies · 163 retweets · 2.2K likes · 474.4K views
Jai retweeted
Nous Research @NousResearch
Kimi K2.6 is free on Nous Portal for the next 24 hours
Made possible by @vercel's AI Gateway & @Kimi_Moonshot
Run 'hermes update', then 'hermes model' and select Kimi K2.6 to try out one of the most impressive open model releases ever
131 replies · 154 retweets · 1.7K likes · 256.7K views
Jai retweeted
Kimi.ai @Kimi_Moonshot
Meet Kimi K2.6: Advancing Open-Source Coding

🔹Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2)

What's new:
🔹Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization).
🔹Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D.
🔹Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files.
🔹Proactive Agents - K2.6 model powers OpenClaw, Hermes Agent, etc for 24/7 autonomous ops.
🔹Claw Groups (research preview) - bring your own agents, command your friends', bots & humans in the loop.

K2.6 is now live on kimi.com in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: kimi.com/code

🔗 API: platform.moonshot.ai
🔗 Tech blog: kimi.com/blog/kimi-k2-6
🔗 Weights & code: huggingface.co/moonshotai/Kim…
[media]
929 replies · 2.4K retweets · 18.2K likes · 7.5M views
Jai retweeted
Nous Research @NousResearch
Honored to announce we are partnering with Jim Liu to port over his wildly popular skills for infographics and design to work best in Hermes Agent using our native tooling!

The first skill ported today, the Infographic Skill, is available after updating hermes. Just start a new chat and type `/baoyu-infographic` to get started!

Recommended image generation model is Nano Banana.
[media]
Quoted post from 宝玉 @dotey:

Truly honored! My project has gained significant traction with 14k+ stars on GitHub. Specifically, my skills for technical infographic generation and social media (Little Red Book style) visual content are extremely popular in the Chinese developer community. They bridge the gap between LLM reasoning and aesthetic visual output. Would love to see them integrated as built-in options for Hermes! Repo: github.com/jimliu/baoyu-s…

94 replies · 186 retweets · 2.4K likes · 254.8K views
Jai retweeted
ollama @ollama
ollama launch hermes
Ollama 0.21 includes support for Hermes Agent, the self-improving AI agent built by @NousResearch.
[media]
98 replies · 267 retweets · 2.7K likes · 309.9K views
Jai @JSupa15
@NousResearch @Kimi_Moonshot You know you were going to hack on Hermes Agent regardless. Now there's a chance to actually get paid from it 🔥
0 replies · 0 retweets · 6 likes · 1.6K views
Nous Research @NousResearch
The Hermes Agent Creative Hackathon starts now
16 Days, $25k in Prizes
Presented by @Kimi_Moonshot & @NousResearch

For the tinkerers pushing Hermes Agent into creative domains: video, image, audio, 3D, long-form writing, creative software, interactive media and more.

Show us what your Hermes Agent can do.

Details Below ↓
132 replies · 242 retweets · 2.1K likes · 1.5M views
Jai retweeted
Nous Research @NousResearch
Thank you for helping to make Hermes Agent amazing.
[media]
95 replies · 67 retweets · 1.3K likes · 102.3K views
Jai retweeted
ksa 🏴‍☠️ @kosa12m
How Anthropic talks about Claude Mythos rn:
[media]
85 replies · 1.7K retweets · 31.8K likes · 528.3K views
Jai retweeted
Nous Research @NousResearch
We’re partnering with @MiniMax_AI across product and models to make their upcoming releases the best for Hermes Agent users. MiniMax models are already some of the most-used in Hermes Agent. If you haven’t tried MiniMax M2.7 in Hermes Agent, try it today in the Nous Portal!
101 replies · 78 retweets · 1.3K likes · 343.2K views
Jai @JSupa15
@cgtwts lol Ik it’s a meme but they definitely didn’t feel the AGI
0 replies · 0 retweets · 0 likes · 14 views
CG @cgtwts
>be Microsoft
>spend billions on AI
>back OpenAI early
>also invest in Anthropic
>hedge every outcome
>put Copilot in everything
>call it the future of work
>users start asking real questions
>"don't use it for anything important btw"

Lmaoo.
24 replies · 72 retweets · 929 likes · 42.7K views