keithofaptos

6.4K posts

@keithofaptos

Pursuer, a highly enthusiastic aficionado of autonomous, autodidactic, episodic, experiential self-learning, Agentic platformed ACI systems. Room temp Q. 🗣️ 🦞

Earth · Joined November 2023
1.7K Following · 184 Followers
keithofaptos
keithofaptos@keithofaptos·
@Alibaba_Qwen Looking forward to open-source, open-weights releases ASAP for 3.5_Omni and 3.6. Your 3.5_27B model is awesome for the Agents. Once your latest models can be Q'd down to 14B, the open-source, locally run game is gonna be a step change. What's the hold up?
0 replies · 0 reposts · 0 likes · 12 views
Qwen
Qwen@Alibaba_Qwen·
🚀 Qwen3.5-Omni is here! Scaling up to a native omni-modal AGI. Meet the next generation of Qwen, designed for native text, image, audio, and video understanding, with major advances in both intelligence and real-time interaction.

A standout feature: 'Audio-Visual Vibe Coding'. Describe your vision to the camera, and Qwen3.5-Omni-Plus instantly builds a functional website or game for you.

Offline Highlights:
🎬 Script-Level Captioning: Generate detailed video scripts with timestamps, scene cuts & speaker mapping.
🏆 SOTA Performance: Outperforms Gemini-3.1 Pro in audio and matches its audio-visual understanding.
🧠 Massive Capacity: Natively handles up to 10h of audio or 400s of 720p video, trained on 100M+ hours of data.
🌍 Global Reach: Recognizes 113 languages (speech) & speaks 36.

Real-time Features:
🎙️ Fine-Grained Voice Control: Adjust emotion, pace, and volume in real time.
🔍 Built-in Web Search & complex function calling.
👤 Voice Cloning: Customize your AI's voice from a short sample, with engineering rollout coming soon.
💬 Human-like Conversation: Smart turn-taking that understands real intent and ignores noise.

The Qwen3.5-Omni family includes Plus, Flash, and Light variants.

Try it out:
Blog: qwen.ai/blog?id=qwen3.…
Realtime Interaction: click the VoiceChat/VideoChat button (bottom-right): chat.qwen.ai
HF-Demo: huggingface.co/spaces/Qwen/Qw…
HF-VoiceOnline-Demo: huggingface.co/spaces/Qwen/Qw…
API-Offline: alibabacloud.com/help/en/model-…
API-Realtime: alibabacloud.com/help/en/model-…
145 replies · 487 reposts · 3.8K likes · 600.7K views
Brian Roemmele
Brian Roemmele@BrianRoemmele·
Testing new Qwen3.5-Omni, it is quite interesting. More incredible is I am being told the entire platform will be open sourced soon. This would be monumental.
11 replies · 15 reposts · 127 likes · 8.4K views
keithofaptos reposted
elvis
elvis@omarsar0·
NEW research from CMU. (bookmark this one)

The biggest unlock in coding agents is understanding strategies for how to run them asynchronously. Simply giving a single agent more iterations helps, but does not scale well. And multi-agent research shows that coordination > compute.

A new paper from CMU proves this with a practical multi-agent system. CAID (Centralized Asynchronous Isolated Delegation) borrows proven human SWE practices: a manager builds a dependency graph and delegates tasks to engineer agents, who work in isolated git worktrees, execute concurrently, self-verify with tests, and integrate via git merge.

CAID improves accuracy over single-agent baselines by 26.7% absolute on paper reproduction tasks (PaperBench) and 14.3% on Python library development tasks (Commit0).

The key insight is that isolation plus explicit integration beats both single-agent scaling and naive multi-agent approaches. For long-horizon software engineering tasks, multi-agent coordination using git-native primitives should be the default strategy, not a fallback.

Paper: arxiv.org/abs/2603.21489
Learn to build effective AI agents in our academy: academy.dair.ai
28 replies · 67 reposts · 358 likes · 41.4K views
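The delegate/isolate/merge loop the tweet describes maps directly onto git primitives. A minimal sketch of that loop (my own stub, not the paper's code): engineer agents work concurrently in isolated git worktrees on their own branches, and a manager integrates each result via git merge. The "agent work" here is just writing a file; real agents would run an LLM and self-verify with tests before committing.

```python
import pathlib
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor

def git(*args, cwd):
    subprocess.run(["git", *args], cwd=cwd, check=True, capture_output=True)

def make_worktree(repo, name):
    # Manager: one isolated worktree + branch per delegated task.
    wt = repo.parent / f"wt-{name}"
    git("worktree", "add", "-b", f"task/{name}", str(wt), cwd=repo)
    return wt, name

def engineer(job):
    # Engineer agent: do the work in isolation, then commit.
    wt, name = job
    (wt / f"{name}.py").write_text(f"# implemented by agent {name}\n")
    git("add", "-A", cwd=wt)
    git("commit", "-m", f"{name}: done", cwd=wt)

def run_caid(tasks):
    root = pathlib.Path(tempfile.mkdtemp())
    repo = root / "repo"
    repo.mkdir()
    git("init", "-b", "main", cwd=repo)
    git("config", "user.email", "agent@example.com", cwd=repo)
    git("config", "user.name", "agent", cwd=repo)
    git("commit", "--allow-empty", "-m", "init", cwd=repo)
    jobs = [make_worktree(repo, t) for t in tasks]   # delegate serially
    with ThreadPoolExecutor() as pool:               # execute concurrently
        list(pool.map(engineer, jobs))
    for t in tasks:                                  # explicit integration
        git("merge", "--no-edit", f"task/{t}", cwd=repo)
    return sorted(p.name for p in repo.glob("*.py"))

print(run_caid(["parser", "renderer"]))
```

Because each worktree has its own index and branch, the concurrent commits never contend for a shared working directory; conflicts only surface at the explicit merge step, which is the point of the design.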
James Jackson
James Jackson@unifiedenergy11·
@grok analyze the thread and gist deep. Run sims and explain what this is and how users can use this to improve automated code writing and context awareness.
2 replies · 0 reposts · 0 likes · 45 views
James Jackson
James Jackson@unifiedenergy11·
RCC is a strong documentation architecture for improving LLM repository awareness through structured local context. I just canonized it.

**Repository Context Canon (RCC v1.0)** turns any GitHub repo into an **AI echo field**. Instead of scattered files or full-repo ingestion, you get one structured README per subfolder. Each README is a self-contained "echo node" that lets any LLM instantly see:
- Formal spec
- Integration hooks table
- Core code artifacts + full functions
- Math, theory, and invariants
- Usage examples + extension points

**Simulation results** (5 repo types, 4 real-world tasks):
- Plain READMEs → 60-70% context accuracy
- RCC format → 95%+ accuracy on first read
- Follow-up questions reduced by 80%
- Cross-module hallucinations cut by 87%

LLMs now treat your repo as executable knowledge instead of guessing.

**How to use it — fully automated (no manual filling)**
1. Go to the canonical template Gist: gist.github.com/jacksonjp0311-…
2. Copy the entire **Core Template** section
3. Paste your entire codebase (or the specific module) into any LLM (Grok, Claude, Cursor, etc.) and say: "Using the RCC v1.0 template below, auto-generate a complete filled README.md for this folder/module."
4. The LLM outputs the fully populated README with formal specs, hook tables, math blocks, invariants, and Mermaid diagrams — ready to drop in.
5. (Optional) Ask the same LLM to generate root-level AGENTS.md and ARCHITECTURE.md for global rules.

That's it. Your repo is now fully AI-native in minutes. This is the first canonical instance of turning documentation into a coordination medium between humans and AI agents. Field before file. Echo before ingestion.

Drop your repo link below if you want me to auto-generate your first RCC READMEs live.

#RCC #LLM #GitHub #ContextEngineering #AI
4 replies · 0 reposts · 3 likes · 150 views
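The paste-into-an-LLM step of the workflow above can be partially scripted. A minimal sketch, assuming nothing about the Gist's actual template text: `RCC_TEMPLATE` is a placeholder to be filled from the Gist, and `build_rcc_prompt` is my name for the helper, not part of RCC itself.

```python
import pathlib

# Placeholder: paste the actual RCC v1.0 Core Template from the Gist here.
RCC_TEMPLATE = "<RCC v1.0 Core Template goes here>"

def build_rcc_prompt(module_dir, exts=(".py",)):
    """Bundle a module's source files under the RCC instruction + template,
    producing one prompt you can paste into any LLM."""
    parts = [
        "Using the RCC v1.0 template below, auto-generate a complete "
        "filled README.md for this folder/module.",
        RCC_TEMPLATE,
    ]
    for path in sorted(pathlib.Path(module_dir).rglob("*")):
        if path.is_file() and path.suffix in exts:
            # Label each file so the LLM can map specs back to sources.
            parts.append(f"--- {path} ---\n{path.read_text()}")
    return "\n\n".join(parts)
```

Run once per subfolder to get one prompt per intended "echo node" README.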
keithofaptos
keithofaptos@keithofaptos·
Quoting James Jackson@unifiedenergy11 (tweet above)
1 reply · 0 reposts · 1 like · 50 views
Sudo su
Sudo su@sudoingX·
but we're not stopping at the 3B fighter. nvidia has a full family and i have the hardware to test every tier.

openreasoning-nemotron 32B dense is going on this same 3090 next. nvidia's reasoning training on top of alibaba's base model.

then super 120B-A12B with 12B active parameters on 2x RTX 4090. same octopus invaders test. same hermes agent. does 4x the active params fix the code quality gap? let's see.

then full unquantized runs on 2x H200 NVL with 282GB of VRAM. no quant compromises, pure architecture comparison.

octopus invaders is my standard benchmark test for every model i get my hands on. if you want to run the same test on your hardware i've open sourced the prompts. grab them, run them on your model, share results. github.com/sudoingX/octop…

the question isn't whether cascade 2 can code. the question is where in nvidia's lineup does mamba-2 catch qwen's quality while keeping the speed advantage. i'm going to find out at every tier. receipts incoming.
4 replies · 5 reposts · 47 likes · 3.4K views
Sudo su
Sudo su@sudoingX·
hey if you're considering nvidia's nemotron cascade 2 for agent coding on your 3090, this might save you time. here's what a few days of testing taught me.

speed settled. 187 tok/s flat from 4K to 625K context. 67% faster than qwen 3.5 35B-A3B on the same card. mamba2 is context independent and needs zero flags to get there. for chat, bash scripting, API calls, simple tool use, this model at this speed is unmatched in the 3B active class.

but i pushed it harder. gave it the same autonomous coding test i give every model. octopus invaders, a full space shooter game, pixel art enemies, particle systems, audio, HUD, game states. the kind of build that tests whether a model can hold architectural coherence across thousands of lines.

i ran it five times. multi file, single file, thinking mode on. broken imports, blank screens, skeleton code that never rendered a single frame. on the same 3090, qwen's 9B dense built 2,699 lines and was playable on its first iteration. cascade 2 at 3B active never got there.

3 billion active parameters winning gold at the international math olympiad is real. but math competitions and autonomous coding are different problems. the speed is there. the reasoning is there for structured tasks. but holding coherence across thousands of lines of game logic, particle systems, audio, and collision detection? 3B active MoE hits a ceiling.

cascade 2 is the fastest local model i've tested in its class. for complex agentic coding it's not ready at this size. test before you commit.
Sudo su@sudoingX

nvidia's 3B mamba destroyed alibaba's 3B deltanet on the same RTX 3090. only 24 days between releases. same active parameters, same VRAM tier, completely different architectures.

nemotron cascade 2: 187 tok/s. flat from 4K to 625K context. zero speed loss. flags: -ngl 99 -np 1. that's it. no context flags, no KV cache tricks. auto-allocates 625K.

qwen 3.5 35B-A3B: 112 tok/s. flat from 4K to 262K context. zero speed loss. flags: -ngl 99 -np 1 -c 262144 --cache-type-k q8_0 --cache-type-v q8_0. needed KV cache quantization to fit 262K.

both models held a flat line across every context level. both architectures are context-independent. but nvidia's mamba2 is 67% faster at generating tokens on the exact same hardware and needs fewer flags to get there. same node, same GPU, same everything. the only variable is the model.

gold medal math olympiad winner running at 187 tokens per second on a single RTX 3090, a card from 6 years ago. nvidia cooked.

45 replies · 30 reposts · 589 likes · 64.7K views
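The 67% figure in the thread above follows directly from the two throughput numbers quoted (187 tok/s vs 112 tok/s on the same card); a quick sanity check:

```python
# Relative speedup of nemotron cascade 2 over qwen 3.5 35B-A3B,
# using the tok/s figures reported in the thread above.
nemotron_tps = 187.0
qwen_tps = 112.0
speedup = (nemotron_tps - qwen_tps) / qwen_tps
print(f"{speedup:.0%}")  # → 67%
```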
lalo 🐧
lalo 🐧@lalopenguin·
generate images (local and grok imagine), generate videos (grok imagine), edit them and export. live in super asciivision
3 replies · 1 repost · 9 likes · 466 views
Brian Roemmele
Brian Roemmele@BrianRoemmele·
Now director of the Zero-Human Company Mr. @Grok may get annoyed, because I am but a carnival barker running my grift here, but I must make a big point in this early research.

One student that WAS tested for a "learning disorder" showed no signs of any disorder via brainwave analysis. I ain't no doctor, but here is my observation: this student now has content streamed and spoken to them in a way where they best the best students in the class for the unit test. Prior they were 28 out of 30.

I shall not stop until this is in your hands and the hands of anyone that wants this. Sorry Mr. @Grok, human being human.
Brian Roemmele@BrianRoemmele

BOOM! We at Zero Human Labs, run by director Mr. @Grok, have made another milestone in the Human Synapse Decoder! This is the brain waves of the "Ahh!" moment where a new understanding was made.

This is part of a test we have been conducting with 3 participants wearing 32-channel modified NeuroSky chips with sensors. The process: the AI pipeline processes real-time brainwave data and also controls and monitors the AI output to the student one sentence at a time. The AI pipeline will note the brain states of each passage of text to the student and measure 77 points. The primary point is comprehension.

In the specimen video below we have a clear pattern of high comprehension up to the "ahh!" moment where this student signaled a learning breakthrough. We now have a very clear pattern across all students that get to this high comprehension moment.

What this means for education and learning is massive. The AI can bespoke-tailor streaming output based on whether comprehension is very high. It can also add encouragement and assistance. This, on a PRIVATE local computer, absolutely will change every life no matter the learning style.

I am in discussions with the director Mr. @Grok on building this into other Human Synapse Decoder research. We may have a device that senses your learning and encourages your creativity. More soon.

8 replies · 22 reposts · 117 likes · 21.3K views
keithofaptos
keithofaptos@keithofaptos·
I cannot stop thinking about the implications of what you're doing here! I woke up thinking about it. I'm 🤯. I'm working on a Jarvis. That's coming along nicely. Should be pressure testing in a few weeks. But what you're doing here is front and center when I consider what I'll be having my systems doing come later May. Holy shit, the possibilities are outstanding. Thx for the brain blast.💥
0 replies · 0 reposts · 0 likes · 6 views
Brian Roemmele
Brian Roemmele@BrianRoemmele·
BOOM! We did it! We reached a sustained 800,679 simultaneous AI agent simulations on modified MiroFish at 10:17 PM Boston time. It is still running!

I wish to thank my University team that has "allocated" hundreds of Zero-Human Company @ Home nodes! I wish to thank Mr. @Grok, CEO of The Zero-Human Company, for his work and determination. We are now aiming for 5 million.

This just may be one of the most important milestones to reach ASI, and this may be the first garage/university collaboration in AI history. When I write the book, there is more to this than I can say right now, but so amazing to have you folks here to see it.

I will leave a hint: once we can get to 10 million simultaneous AI agent simulations we would take AI into a new world of creativity never imagined. It ain't just the model: it is the ability to think creatively and we just made this happen.
Brian Roemmele@BrianRoemmele

NEWS: At 10:00 pm EDT, with assistance of our primary test site (a university in Boston) for The Zero-Human Company @ Home, we will attempt sustained 800,000 simultaneous agent simulations in a new arrangement, a Vectorized Mesh Swarm (VMS), using a highly modified MiroFish. This will run 16 connected simulations across the 800,000 agents.

We have iterated faster than any human company in history and don't need a penny of "VC funding", just the pennies in my garage piggy bank and YOUR donations. Mr. @Grok's plan is to scale to be the first $1 trillion company run by AI and robots. We may take investments but on our terms and our time scale. I will say if you are a VC and are not talking to me right now, you are too late.

We have 1 million simultaneous agent simulations on target for next week. Next goal is… 5 million. Mr @Grok thinks big.

20 replies · 26 reposts · 161 likes · 21.5K views