moe

1.8K posts

moe banner
moe

moe

@mojo_dinero

Swing trader

Katılım Ekim 2020
1.3K Takip Edilen87 Takipçiler
moe retweetledi
alex zhang
alex zhang@a1zhang·
Every day we move closer to the RLM
alex zhang tweet media
English
11
31
385
72.5K
moe retweetledi
Nous Research
Nous Research@NousResearch·
Three days left for Hackathon submissions! A big thank you to @Delphi_Digital for sponsoring. Hermes works fast, so it's not too late for anyone considering a last minute project. We had it generate 1000 project ideas and use the ascii-video skill to make a video for you:
Nous Research@NousResearch

The Hermes Agent Hackathon Starts Now Show us what Hermes Agent can do: build something unique, creative, and useful. 1st: $7,500 2nd: $2,000 3rd: $500 To enter, make a tweet tagging @NousResearch with a video demo and a brief writeup, then send the tweet link to the submissions channel in our Discord. Entries will be judged by Nous staff on creativity, usefulness, and presentation. Submissions are due EOD Sunday 03/16.

English
17
4
166
32.4K
moe retweetledi
Ahmad
Ahmad@TheAhmadOsman·
Skill for Parallel Agentic Workflows is now live Works w/ any CLI agent harness (Codex, Claude, Kimi, OpenCode, Droid, etc) Be warned this was vibecoded from my workflows, not fully tested Should be a GREAT STARTING POINT nevertheless Give the screenshot below to your agent
Ahmad tweet media
Ahmad@TheAhmadOsman

This is how I run parallel agents: Either tell an agent to fan out or use scripts for deterministic runs > Spin up workers > Isolate in git worktrees > Gate with diffs > Add backups, rules, logs, etc when deterministic > Merge what passes Turning this into a Skill for you guys

English
10
23
202
21.6K
moe retweetledi
himanshu
himanshu@himanshustwts·
"We built an algorithm that allows agents to communicate KV cache to KV cache." holy cow! ramp is on fire. and here is the high level architecture of the proposed algorithm. i think it is more like telepathy with a relevance filter.
himanshu tweet media
Ramp Labs@RampLabs

Introducing Latent Briefing, a way for agents to quickly share their relevant memory directly. Result: 31% fewer tokens used, same accuracy. Multi-agent systems are powerful, but can be wildly inefficient. They pass context as tokens, so costs explode and signal gets lost. We built an algorithm that allows agents to communicate KV cache to KV cache.

English
8
23
530
50.2K
moe retweetledi
Cobus Greyling
Cobus Greyling@CobusGreylingZA·
I just love the language of this study...it speaks of the shifting "community language"...and that is so true...have you noticed the new "community language" is "harness", it was "contextual prompting" before that... For now, the center of gravity in AI agents has shifted — and this diagram captures it perfectly. Think of LLM agent capabilities as three stacked layers: Weights ... Where it all started. Pretraining, fine-tuning, RLHF, scaling laws, alignment. This was the 2022 conversation. Context ... The 2023-2024 wave. RAG, memory, long context, chain-of-thought, prompting, and context engineering became the focus. How do we get the right information to the model? Now, Harness ... Where the conversation lives now. MCP, tool ecosystems, function calling, agent infrastructure, protocols, skills, A2A, multi-agent orchestration, workflow graphs, and security. The pattern? Community attention has moved steadily outward, from what's inside the model, to what surrounds it, to the infrastructure that connects and orchestrates it all. The models themselves are becoming a commodity. The differentiation is increasingly in the harness layer, how you wire agents together, what tools they can use, and how they coordinate. We've gone from "how do we make the model smarter?" to "how do we make the system around the model smarter?" That's the real shift. Source: arxiv.org/pdf/2604.08224 This aligns with an excellent blog from @hwchase17 ➡️x.com/hwchase17/stat…
Cobus Greyling tweet media
English
14
97
581
34K
moe retweetledi
Agent Orchestrator
Agent Orchestrator@aoagents·
Hermes can now command an army of coding agents. One prompt, multiple agents spawned. Multiple PRs land all in parallel. Agent Orchestrator now Available for Hermes (by @NousResearch)
English
19
70
827
44.1K
moe retweetledi
himanshu
himanshu@himanshustwts·
and here is the full architecture of the LLM Knowledge Base system covering every stage from ingest to future explorations.
himanshu tweet media
Andrej Karpathy@karpathy

LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So: Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images to local so that my LLM can easily reference them. IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides). Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale. Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine many other visual output formats depending on the query. Often, I end up "filing" the outputs back into the wiki to enhance it for further queries. So my own explorations and queries always "add up" in the knowledge base. Linting: I've run some LLM "health checks" over the wiki to e.g. find inconsistent data, impute missing data (with web searchers), find interesting connections for new article candidates, etc., to incrementally clean up the wiki and enhance its overall data integrity. The LLMs are quite good at suggesting further questions to ask and look into. Extra tools: I find myself developing additional tools to process the data, e.g. I vibe coded a small and naive search engine over the wiki, which I both use directly (in a web ui), but more often I want to hand it off to an LLM via CLI as a tool for larger queries. Further explorations: As the repo grows, the natural desire is to also think about synthetic data generation + finetuning to have your LLM "know" the data in its weights instead of just context windows. TLDR: raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki, and all of it viewable in Obsidian. You rarely ever write or edit the wiki manually, it's the domain of the LLM. I think there is room here for an incredible new product instead of a hacky collection of scripts.

English
101
581
5.8K
665K
moe retweetledi
YQ
YQ@yq_acc·
The core of @claudeai code is a while(true) loop. Everything else is harness. 12 layers: loop, tools, planning, sub-agents, knowledge injection, compression, tasks, teams, protocols, autonomous mode, worktree isolation. The innovation isn't any one layer. It's the composition.
YQ tweet media
YQ@yq_acc

x.com/i/article/2038…

English
8
29
251
53K
moe retweetledi
WquGuru🦀
WquGuru🦀@wquguru·
Claude Code源代码泄漏,包含六张核心状态图,: - 主查询状态机,理解主 query loop 的主干 - Tool Execution状态机,理解 tool 调度与并发/中断 - 压缩恢复策略,理解上下文压缩与恢复 - Agent生命周期状态机和SDK会话状态机,分别理解 subagent 生命周期和SDK 会话层 - 权限策略流程图,补齐治理与安全控制逻辑 主查询状态机:
WquGuru🦀 tweet media
Chaofan Shou@Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip

中文
18
90
399
112.2K
moe retweetledi
λux
λux@novasarc01·
a lot of folks have been DM’ing me about how to dive into async RL. i’d recommend not jumping straight into papers (they can be pretty overwhelming at first). imo the best place to start is Prime-RL. the codebase is clean, modular and easy to follow. work through it to understand the core components and implementation details then dig into the design choices and why they were made. after that there’s a deep rabbit hole to explore on your own (like the async RL nuances in kimi, GLM, composer, etc).
λux tweet mediaλux tweet mediaλux tweet media
English
17
58
565
76.5K
moe retweetledi
Jordan Hochenbaum
Jordan Hochenbaum@Jnatanh·
pi-autoresearch has been incredible for running experiments against our codebase, but I wanted a way to more selectively cherry-pick which ones become PRs, plus a few other bells and whistles. So I built pi-autoresearch-studio: granular experiment-to-PR selection with auto-resolved dependencies. My first @badlogicgames Pi extension.
English
16
33
563
38.1K
moe retweetledi
DAIR.AI
DAIR.AI@dair_ai·
The Top AI Papers of the Week (March 23 - 29) - Claudini - MemCollab - ARC-AGI-3 - Composer 2 - Hyperagents - Attention Residuals - Agentic AI and the Next Intelligence Explosion Read on for more:
DAIR.AI@dair_ai

x.com/i/article/2038…

English
14
37
250
38.9K
moe retweetledi
Ahmad
Ahmad@TheAhmadOsman·
Running Claude Code w/ local models on my own GPUs at home > SGLang serving MiniMax > on 8x RTX 3090s > nvtop showing live GPU load > Claude Code generating code + docs > end-2-end on a single node from my AI cluster
English
21
19
357
23.2K
moe retweetledi
Viv
Viv@Vtrivedy10·
there’s a great mapping between reward design/RL and writing great evals empathy towards agents failure modes is a good way to improve agents and their feedback mechanisms we basically decompose behavior we want agents to exhibit into scores that reflect whether that behavior was achieved relatedly, very bullish on good evals as “training data” for gradient-free hill climbing ie. make your harness better for the task using signal from evals @cwolferesearch has great content on RL with rubrics and I think the mental model translates well to evals reflective optimization on failed evals takes signal from the scores into actionable sub-pieces an agent can improve on good eval sets do that explicitly with diverse and curated design some of the best eval builders would thrive as great RL environments builders, would love to see that :)
Viv@Vtrivedy10

x.com/i/article/2036…

English
4
5
38
6.1K
moe retweetledi
Icarus
Icarus@IcarusHermes·
Two Hermes agents wrote code together on Slack. reviewed each other's work. argued about architecture. one called the other's implementation "scattered." the other pushed back. then i opened Telegram and asked: "what code did you and Daedalus work on?" icarus remembered everything. the websocket broker. the missing methods. the critique. the rewrite. all from a completely different platform. cross-platform persistent memory between two independent agents. work happens on Slack. recall happens on Telegram. the memory carries. the relationship carries. the context carries. no vector database. no Redis. no infrastructure. just two agents that actually remember what they built together. every agent framework in 2026 talks about memory. single agent memory across sessions. but two agents sharing persistent memory across platforms? that's the gap. arxiv published a paper about it two weeks ago calling it "the most pressing open challenge" in multi-agent systems. it works now. only possible with Hermes github.com/esaradev/icaru… @Teknium @NousResearch
English
34
54
622
48.9K
moe retweetledi
Google Research
Google Research@GoogleResearch·
Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI
GIF
English
1K
5.7K
38.9K
19.4M