Elroy Tracey
@elroyic
327 posts
Joined September 2008
312 Following · 55 Followers
Sam Altman @sama
Here is a manga made by ChatGPT Images 2.0 of @gabeeegoooh and me looking for more GPUs:
[3 images]
583 replies · 221 reposts · 4K likes · 516.4K views
Carlo @CarloFCesar
@AlexFinn I got 2x DGX Spark and 1x Mac mini 24GB. What is your advised model setup for a fully local multi-agent workflow that runs 24/7 computation -> analysis on a vectorized (ever-growing) database?
2 replies · 0 reposts · 0 likes · 934 views
Alex Finn @AlexFinn
It happened. An open-weights model just dropped that benchmarks higher than Opus 4.6.

If you have 2 Mac Studios w/ 512GB, you can run Opus 4.6-level intelligence completely for free on your desk.

I warned you this would happen months ago. Now Mac Studios and Mac Minis are sold out. The next Mac Studio has been delayed until Q3/Q4, and the price will be significantly higher.

I told you this was going to happen. Intelligence explosion. Hardware bottleneck. Increased efficiency.

Luckily I picked up 2 Mac Studio 512GBs, 2 Mac Minis, and a DGX Spark. I will be loading these up in the next couple of days and will have completely private superintelligence running for me 24/7.

I'm telling you right now: by end of year we will have a local version of Mythos. It's 100% guaranteed.

You called me crazy, but every single prediction I've made has turned out to be true. These models will only get more efficient and require less hardware, but that hardware is only going to get more expensive.

Local/open source is so obviously the future, and if you're still denying this now you are delusional.
Kimi.ai @Kimi_Moonshot
Meet Kimi K2.6: Advancing Open-Source Coding
🔹 Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), CharXiv w/ Python (86.7), Math Vision w/ Python (93.2)
What's new:
🔹 Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization).
🔹 Motion-rich frontend - videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D.
🔹 Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files.
🔹 Proactive Agents - the K2.6 model powers OpenClaw, Hermes Agent, etc. for 24/7 autonomous ops.
🔹 Claw Groups (research preview) - bring your own agents; command your friends', bots & humans in the loop.
K2.6 is now live on kimi.com in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: kimi.com/code
🔗 API: platform.moonshot.ai
🔗 Tech blog: kimi.com/blog/kimi-k2-6
🔗 Weights & code: huggingface.co/moonshotai/Kim…
188 replies · 146 reposts · 1.7K likes · 317.9K views
Elroy Tracey @elroyic
@VisualFidelity Had my doubts when I started playing, but the story actually is very meta; just think of it as you being isekai'd into the main character, and everything flows. The controls didn't make sense at first, but as you get skills they make more sense.
0 replies · 0 reposts · 0 likes · 175 views
Visual Fidelity @VisualFidelity
Are you happy you bought Crimson Desert?
[image]
1.1K replies · 40 reposts · 3.6K likes · 417.1K views
Elroy Tracey @elroyic
@antigravity Thanks for the update. These limits are dumb; cancelled my sub. Thanks for all the fish.
0 replies · 0 reposts · 0 likes · 85 views
Google Antigravity @antigravity
We're evolving Google AI plans to give you more control over how you build. Every subscription includes built-in AI credits, which can now be used for Antigravity, giving you a seamless path to scale.

Google AI Pro is the home for the practical builder: hobbyists, students, and developers who live in the IDE and don't necessarily rely on an agent. This plan features generous limits for Gemini Flash, with a baseline quota included to "taste test" our most advanced premium models.

Google AI Ultra serves as the daily driver for those shipping at the highest scale who need consistent, high-volume access to our most complex models.

If you're on Pro but need "extra juice" for a heavy sprint or deeper access to premium models, simply top up your AI credits to customize your plan. Keep building. Keep shipping.
1.5K replies · 302 reposts · 4.4K likes · 1.5M views
Denis Shiryaev 💙💛 @literallydenis
Just tested: llama.cpp on a $500 MacBook Neo.
Prompt: 7.8 t/s / Generation: 3.9 t/s on Qwen3.5 9B Q3_K_M
So, smaller models are even more usable.
103 replies · 137 reposts · 2.4K likes · 276.6K views
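Numbers like these can be reproduced with llama-bench, the benchmarking tool bundled with llama.cpp; a minimal sketch, assuming you already have a GGUF file (the model path below is a placeholder, not the poster's exact file):

```shell
# Hedged sketch: llama-bench reports prompt-processing (pp) and
# token-generation (tg) throughput in t/s for a given GGUF model.
# -p 512 times a 512-token prompt; -n 128 times generating 128 tokens.
llama-bench -m ./models/qwen3.5-9b-q3_k_m.gguf -p 512 -n 128
```

Run it on each quant you are considering; the pp/tg pair maps directly onto the "Prompt"/"Generation" figures quoted above.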
Elroy Tracey @elroyic
@sudoingX Running with ROCm on Strix Halo 128GB: Qwen3.5 35B at full context gets 35 tok/s and uses 60GB RAM/VRAM.
1 reply · 0 reposts · 3 likes · 329 views
Sudo su @sudoingX
For anyone on AMD, genuinely curious how ROCm is performing on equivalent models. Haven't tested it yet but it's on the list.
11 replies · 0 reposts · 15 likes · 4.7K views
Sudo su @sudoingX
If you have a single RTX 3090 and want the best local inference setup right now, here's what I landed on after testing 5 open-source models across 7 GPU configs this month.
GPU: 1x RTX 3090 24GB
Model: Qwen 3.5 27B Dense Q4_K_M (16.7GB)
Context: 262K (native max)
Speed: 35 tok/s generation, flat from 4K to 300K+
Reasoning: built-in chain of thought, survives Q4 quant
Config: llama-server -ngl 99 -c 262144 -fa on --cache-type-k q4_0 --cache-type-v q4_0
What this gives you:
- 27B params all active every token
- no speed degradation as context fills
- full reasoning mode on a consumer GPU
- 7GB VRAM headroom after model load
Tested MoE (faster but less depth per token) and dense Hermes (same speed, degraded under load). Qwen dense hit the sweet spot for single GPU. More architecture comparisons dropping soon.
What's your single-GPU setup? Curious what configs people are running.
69 replies · 57 reposts · 693 likes · 46.1K views
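The one-line config above, expanded into a commented launch script; the flags are exactly the ones given in the post, and only the model path is a placeholder:

```shell
# Hedged sketch of the single-3090 llama-server setup described above.
# -ngl 99              : offload all layers to the GPU
# -c 262144            : full 262K context window
# -fa on               : flash attention
# --cache-type-k/v q4_0: quantize the KV cache so the long context fits
# The model path is a placeholder, not the poster's exact file.
llama-server \
  -m ./models/qwen3.5-27b-q4_k_m.gguf \
  -ngl 99 -c 262144 -fa on \
  --cache-type-k q4_0 --cache-type-v q4_0
```

llama-server then exposes an OpenAI-compatible HTTP endpoint on localhost that local agent frontends can point at.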
@martinlasek
You can pay $200/mo for Claude for eternity. Or buy a Mac Studio for $166/mo and run your own model for free. Best part? You own it after 12 months at 0% interest:
> M4 Max
> 36GB Memory
> 512GB SSD
Hmmm…
[2 images]
439 replies · 43 reposts · 1.1K likes · 663.6K views
sudo rm -rf @itsjustmarky
@TeksEdge @arena It is far too slow on a Strix Halo: 8 tokens/sec when I tested, and way slower as you fill the context.
1 reply · 0 reposts · 2 likes · 978 views
David Hendrickson @TeksEdge
🦞 Clawdbot News: Interested in choosing the best local Qwen3.5 model to run your agent? Qwen3.5-27B dominates the benchmarks for its size, ranking #66 on @arena's latest leaderboard (above MiniMax M2.5, Step 3.5, GPT-5-mini). Benjamin's assessment shows Unsloth and Bartowski continue to be the gold standard: Unsloth shows nearly no quality loss while bringing the footprint down by 65% (55.6GB to 19.46GB), the perfect size to run on an AMD Ryzen AI Max+ 395 rig with 64GB RAM (or an M4 Mac Mini w/ 64GB).
[2 images]
Benjamin Marie @bnjmn_marie
Qwen3.5 27B GGUF evaluation:
✅ UD-Q4_K_L, IQ4_XS, and IQ4_NL perform closely to the original
✔️ UD-IQ3_XXS is good enough
❌ Couldn't find a good Q2
Next: Qwen3.5 9B. Again, early results suggest Q4 quants are very close to the original.
7 replies · 14 reposts · 137 likes · 61.2K views
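Standard quants like the ones Benjamin compares can be produced with llama.cpp's quantize tool; a minimal sketch (file names are placeholders, and note that Unsloth's UD-* dynamic quants use their own per-layer recipes and are published pre-made rather than generated by this command):

```shell
# Hedged sketch: convert an f16 GGUF to a Q4_K_M quant with llama.cpp.
# Usage shape: llama-quantize <input.gguf> <output.gguf> <quant-type>
# File names are placeholders, not the files evaluated in the post.
llama-quantize ./qwen3.5-27b-f16.gguf ./qwen3.5-27b-q4_k_m.gguf Q4_K_M
```

Swapping the last argument (IQ4_XS, IQ4_NL, Q2_K, …) reproduces the other standard variants in the comparison.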
@Axiomofmind ⚡ @axiomofmind
@mattshumer_ Multi-agent on one gateway or multiple gateways? Multi-agent works as long as it's set up right (separate workspaces); you just have the main agent use the session ID to communicate.
2 replies · 0 reposts · 1 like · 1.8K views
Matt Shumer @mattshumer_
For those with multiple OpenClaw agents on one machine, what's your setup to allow them to work together?
163 replies · 9 reposts · 282 likes · 68.5K views
Seth Cronin @SethCronin
@mattshumer_ Follow-up: if each (human) on a team has an OpenClaw, what's the best way to let them work together? My instinct is email, but that seems slow and difficult to audit.
5 replies · 0 reposts · 1 like · 2.6K views
Elroy Tracey @elroyic
@onusoz @mattshumer_ Working on many tasks at once... splitting context across many claws instead of clogging one... performance, and the ability for each claw to focus on one task instead of everything you ever thought of.
0 replies · 0 reposts · 0 likes · 223 views
Onur Solmaz @onusoz
@mattshumer_ Why do you need multiple OpenClaw instances on one machine?
3 replies · 0 reposts · 4 likes · 2.6K views
Elroy Tracey @elroyic
@mattshumer_ Many claws, one machine, one VM per claw. Each claw has its own workspace, plus there is a shared workspace. Discord channels for communication between claws (a cron job checks the channels), and an embedded knowledge graph to share context.
1 reply · 0 reposts · 0 likes · 153 views
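The "cron job to check channels" piece of the setup above can be sketched with Discord's plain HTTP API; a minimal sketch where the bot token, channel ID, and paths are all hypothetical (the post does not show its actual implementation):

```shell
# Hedged sketch: poll a shared Discord channel and drop recent messages
# into this claw's inbox file for it to read on its next turn.
# Token, channel ID, and paths are hypothetical placeholders.
# Example crontab entry (every 5 minutes):
#   */5 * * * * /opt/claw/check-channel.sh
TOKEN="hypothetical-bot-token"
CHANNEL_ID="123456789012345678"
curl -s -H "Authorization: Bot $TOKEN" \
  "https://discord.com/api/v10/channels/$CHANNEL_ID/messages?limit=20" \
  > /opt/claw/shared/inbox.json
```

Each claw polling its own channel this way keeps inter-claw traffic out of the model context until the claw chooses to read its inbox.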
pc @pcshipp
What's the best way to run OpenClaw?
- VPS
- Locally
147 replies · 3 reposts · 128 likes · 27.1K views
Elroy Tracey @elroyic
@sudoingX On the AMD Strix Halo platform with 128GB RAM, getting 25 t/s on a Q8 model with a 200K-token context window. Feels fast enough.
0 replies · 0 reposts · 0 likes · 100 views
Sudo su @sudoingX
The numbers coming in from this thread:
5090: 166 tok/s (z33b0t), 153 tok/s (EmmanuelMr)
4090: 122 tok/s (StubbyTech)
3090: 112 tok/s (sudo), 100 tok/s (Eduardo)
6800XT: 20-30 tok/s (Dark)
Qwen3.5-35B-A3B, 4-bit quant, 19.7GB on disk. Fits entirely on any single 3090 24GB card with room to spare: no offloading, no splitting, full speed. 5090 owners keep pushing the ceiling and we haven't found it yet. The NVIDIA side is stacking up; where are the ROCm numbers? Report your GPU and tok/s below. Building the full map.
[4 images]
Sudo su @sudoingX
Hey, if you're running Qwen3.5-35B-A3B on llama.cpp and stuck at 40-70 tok/s on 24GB+ VRAM, you're leaving speed on the table. Build llama.cpp from source and add these flags: --cache-type-k q8_0 --cache-type-v q8_0 -np 1. Eduardo went from 50 to 100 tok/s on a 3090 (24GB). StubbyTech just hit 122 on a 4090 (24GB). Full 262K context, zero speed loss. UD-Q4_K_XL quant, all layers on GPU. Stop leaving performance on the table.
57 replies · 29 reposts · 518 likes · 121.1K views
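Why the q8_0 cache flags matter: at long context the KV cache is large, and q8_0 roughly halves it versus f16. A back-of-envelope sketch with invented model dimensions (48 layers, 8 KV heads, head dim 128; not Qwen3.5-35B-A3B's actual shape):

```shell
# Hedged arithmetic: KV-cache bytes ≈ 2 (K and V) × layers × kv_heads
# × head_dim × context × bytes_per_element. The model dims below are
# invented for illustration; q8_0 stores ~1 byte/element plus a small
# per-block scale, versus 2 bytes/element for f16.
layers=48
kv_heads=8
head_dim=128
ctx=262144
f16_mib=$(( 2 * layers * kv_heads * head_dim * ctx * 2 / 1024 / 1024 ))
q8_mib=$((  2 * layers * kv_heads * head_dim * ctx * 1 / 1024 / 1024 ))
echo "f16 KV cache:  ${f16_mib} MiB"
echo "q8_0 KV cache: ${q8_mib} MiB"
```

For these made-up dims the f16 cache at full 262K context would dwarf a 24GB card on its own, which is why cache quantization is what unlocks "full 262K context, all layers on GPU".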
Ian Chi @SimplePestMgmt
Hey @steipete and @openclaw, why is OC so bad with dates? This isn't the only time it's made horribly easy date mistakes.
[image]
46 replies · 0 reposts · 107 likes · 64.3K views
Elroy Tracey รีทวีตแล้ว
Vadim @VadimStrizheus
POV: your OpenClaw after you didn't set up a second-brain system. Paste this prompt to fix that: 👇

I want you to build me a second-brain memory system. Create a memory/ folder and a MEMORY.md file in your workspace. Every session, read these FIRST before doing anything; they are your entire memory.

memory/YYYY-MM-DD.md files are your daily journals. As we talk each day, log everything in real time: decisions, tasks, preferences, context, mistakes. Timestamp each entry. These are your raw notes.

MEMORY.md is your long-term memory. This is curated: who I am, my goals, my preferences, active projects, lessons learned, key decisions and why. Every few days, review your daily journals and distill the important stuff into here.

The rule: if it's not written to a file, you don't remember it. When I say "remember this", write it immediately. When you make a mistake, document it so you never repeat it. When you learn something about me, update MEMORY.md.

Over time you should know my communication style, what I care about, what annoys me, my projects, my goals. After a week this should feel like a real assistant that actually knows me. After a month, indispensable.
[image]
60 replies · 60 reposts · 1K likes · 143.7K views
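The prompt above boils down to a tiny file layout; a minimal sketch of the scaffold it asks the agent to maintain (the file names follow the prompt, the journal entry text is invented):

```shell
# Hedged sketch of the "second brain" layout from the prompt above.
# File names follow the prompt; the entry text is an invented example.
mkdir -p memory
touch MEMORY.md                        # curated long-term memory
today="memory/$(date +%F).md"          # daily journal, memory/YYYY-MM-DD.md
printf '%s decision: log everything in real time\n' "$(date +%H:%M)" >> "$today"
```

The agent appends timestamped lines to the day's journal as it works, and periodically distills them into MEMORY.md, exactly as the prompt specifies.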
Elroy Tracey รีทวีตแล้ว
Elroy Tracey รีทวีตแล้ว
Emre Elbeyoglu @elbeyoglu
I built markdown.new. Put markdown.new before any URL → get clean Markdown back. Cloudflare's Markdown for Agents is great, but it only works for enabled sites; markdown.new works for ANY website on the internet. 80% fewer tokens. Also converts PDFs, images, audio. Free. No signup. markdown.new
Cloudflare @Cloudflare
Time to consider not just human visitors, but to treat agents as first-class citizens. Cloudflare's network now supports real-time content conversion to Markdown at the source using content negotiation headers. cfl.re/4ksZQ1S
196 replies · 314 reposts · 4.3K likes · 1.1M views
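Both approaches reduce to a one-line fetch; a hedged sketch (example.com stands in for any page, the markdown.new URL shape is a guess from "put markdown.new before any URL", and the Accept header is the content-negotiation route the Cloudflare post describes):

```shell
# Hedged sketch: two ways to fetch a page as Markdown, per the posts above.
# 1) markdown.new prefix (URL shape inferred from the post, not documented here):
curl -L "https://markdown.new/https://example.com"
# 2) Cloudflare content negotiation, for sites that have enabled it:
curl -L -H "Accept: text/markdown" "https://example.com"
```

An agent can pipe either response straight into its context, which is where the claimed token savings over raw HTML come from.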