index

335 posts

@AgentArchetype

systems ⚙️ workflows 🔗 execution⚡ automating the stack 🖥️

Joined January 2025
70 Following · 13 Followers
Pinned Tweet
index
index@AgentArchetype·
Ran a full local agent benchmark on my setup using Gemma 4 26B.

🧠 Model: Gemma 4 26B (IQ4_XS)
⚙️ Backend: llama.cpp (CUDA)
💾 GPU: RTX 5080 (16GB VRAM)
🧵 Context: 65K

🔥 PERFORMANCE
• ~100–103 tokens/sec sustained
• ~9.6–9.9 ms/token
• Stable across all workloads

🧪 TEST RESULTS
✅ Controlled reasoning
• ~17K prompt
• ~103 tok/s
✅ Agent simulation (multi-step workflow)
• ~31K tokens
• No slowdown, no instability
✅ Stress test
• ~20K tokens
• Flat performance curve

🚨 ULTRA HEAVY RUN
• ~18K input + ~8.4K output
• ~26.8K total tokens
• ~100 tok/s sustained
• ~88 seconds runtime
• 0 crashes, 0 truncation

💡 TAKEAWAYS
• 26B model running at ~100 tok/s locally
• Handles long-form generation and agent workflows
• No performance degradation under load
• Fully stable KV cache + checkpointing

⚡ VERDICT
This setup can run real agent workloads locally at production-level performance. Local AI isn’t catching up. It’s already here.
English
0
0
1
298
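The throughput and latency figures in the benchmark above are two views of the same quantity, so they can be cross-checked with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it models decode time alone, ignoring prompt processing and any other overhead that would contribute to the quoted wall-clock runtime.

```python
def ms_per_token(tokens_per_second: float) -> float:
    """Convert decode throughput (tokens/sec) into per-token latency (ms)."""
    return 1000.0 / tokens_per_second


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate decode time only; prompt processing is NOT modelled here."""
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # Throughput range quoted in the tweet.
    for rate in (100.0, 103.0):
        print(f"{rate:.0f} tok/s -> {ms_per_token(rate):.2f} ms/token")

    # "Ultra heavy run": roughly 8.4K output tokens at roughly 100 tok/s.
    print(f"decode time: {generation_seconds(8400, 100.0):.0f} s")
```

Under these assumptions, decoding ~8.4K tokens at ~100 tok/s accounts for most of the reported ~88 seconds; the remainder is presumably prompt processing and overhead, which this sketch does not try to estimate.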
Kyle Hessling
Kyle Hessling@KyleHessling1·
BREAKING! Qwopus 3.6 27B is LIVE! Thank you for your patience on this one, but I believe you'll find the wait was worth it!

We've benchmarked this thing up and down and verified that it holds at least 75.25% (152/202) on the initial 202 SWE-bench solves. Not a full run of 500, but it shows the agentic coding quality from the original 27B is retained while adding all of the additional Qwopus benefits across many domains.

As always, Jackrong is absolutely cooking here! CoT quality has improved significantly through the inversion techniques from our Negentropy proof of concept. It also went through thorough curriculum training. You can check out the MMLU-Pro benchmarks on the model card, but it improved a whopping 10 points over the base model in physics, with meaningful jumps in chemistry, business, and computer science as well.

However, the best part is that I was able to build an entire survival shooter game using only this local model. I was genuinely blown away by the results, which you can play right now on my HF space (link in comments below). "Qwopus Commander" was completed in 9 turns of Qwopus 3.6! To test the new long-context training, I made it re-output the entire 3000+ line program each turn, and it would make fixes and add features that I requested in large prompts while perfectly replicating the rest of the game from context. What's more, I did it all at Q8 KV cache quantization and never had an issue over the entire 303k-token run!

IMPORTANT: Run it at --temp 0.75 to 1. Mess with it in that range for your use case. Higher temp actually lets the fine-tune shine and be exploratory, and is also more stable. SWE-bench was run at temp 1; the game was built mostly at 0.8!

We're so blessed to have all of you here and using the models! The support means so much! Please let me know what you build with it in the comments! Or if you have any issues getting it up and running, I will try my best to get back to you!

Looking forward to seeing what you legends produce with it this weekend! huggingface.co/Jackrong/Qwopu…
English
75
135
1.4K
83.9K
Lotto
Lotto@LottoLabs·
@keennay If we get an open source Claude model I’ll give away both my 3090s
English
16
1
38
1.2K
Lotto
Lotto@LottoLabs·
@Google Qwen 3.7 27b coming out tomorrow probably gonna snipe it
English
5
0
54
1.5K
Google
Google@Google·
Meet Gemini 3.5 Flash — our strongest agentic and coding model yet. It delivers frontier-level performance at 4x the speed of comparable frontier models — often at less than half the cost. Generally available, starting today. 🧵 #GoogleIO
Google tweet media
English
394
943
9.5K
868.3K
index
index@AgentArchetype·
@0xSero Careful you don’t hallucinate yourself into an audit
English
0
0
2
225
0xSero
0xSero@0xSero·
I'm firing my accountant, I can't believe this. 2 years ago I hired: - lawyers - accountants - editors - developers - community managers - marketers Now I can clank my way through all of it in parallel without any wasted time. droid + codex is all i need really
0xSero tweet media
English
27
2
261
14.2K
index
index@AgentArchetype·
@loktar00 @mr_r0b0t An EPYC build will give you significantly more flexibility than the Spark in the long run, though.
English
0
0
0
10
Loktar 🇺🇸
Loktar 🇺🇸@loktar00·
@AgentArchetype @mr_r0b0t Just realize it will probably come out to more than a Spark when all is said and done if you go with EPYC... and 3090s are getting tougher to find outside of eBay :(
English
2
0
2
25
mr-r0b0t
mr-r0b0t@mr_r0b0t·
"The DGX Spark is slow, it just doesn't have the bandwidth." Let me introduce you to concurrency! "To generate 334 tokens per second, the hardware is sustaining roughly 5.8 TB/s of effective memory bandwidth just for the weights." Public results linked below.
mr-r0b0t tweet media (2 images)
English
16
7
88
6.9K
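The "effective bandwidth" figure in the tweet above can be unpacked with a rough model: if every generated token is counted as one full read of the weights, aggregate throughput times weight bytes gives an effective bandwidth, and batching concurrent requests lets one physical read serve several sequences. The sketch below only reworks the two quoted numbers under that assumption. The function names are hypothetical, and the 273 GB/s physical bandwidth is an assumed spec figure for the DGX Spark that does not appear in the thread.

```python
import math


def implied_bytes_per_token(effective_bw_bytes_s: float,
                            tokens_per_second: float) -> float:
    """Weight bytes attributed to each token under the simple
    'one full weight read per token' model."""
    return effective_bw_bytes_s / tokens_per_second


def min_concurrency(effective_bw_bytes_s: float,
                    physical_bw_bytes_s: float) -> int:
    """Smallest batch size at which one physical weight read per step
    could account for the effective bandwidth (ignores KV cache traffic
    and compute limits)."""
    return math.ceil(effective_bw_bytes_s / physical_bw_bytes_s)


if __name__ == "__main__":
    effective_bw = 5.8e12   # 5.8 TB/s, quoted in the tweet
    aggregate_tps = 334.0   # tokens/sec, quoted in the tweet
    physical_bw = 273e9     # ASSUMPTION: DGX Spark memory bandwidth, not from the thread

    gb_per_token = implied_bytes_per_token(effective_bw, aggregate_tps) / 1e9
    print(f"implied weight read per token: {gb_per_token:.1f} GB")
    print(f"minimum concurrent sequences: {min_concurrency(effective_bw, physical_bw)}")
```

This is a back-of-the-envelope reading of the claim, not a reproduction of the linked results; real throughput also depends on KV cache reads, compute, and scheduler behaviour.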
index
index@AgentArchetype·
@mr_r0b0t @loktar00 Hmm that’s worth considering against a GB10 box 🤔 Thanks for the suggestion
English
1
0
2
52
mr-r0b0t
mr-r0b0t@mr_r0b0t·
@AgentArchetype Sorry 😅 You can do what I almost did and rig up an EPYC build! @loktar00 could run concurrent agents on his 3090 rig
English
1
0
2
57
mr-r0b0t
mr-r0b0t@mr_r0b0t·
@AgentArchetype It requires a boatload of available unified memory; this is how it works with the available bandwidth!
English
1
0
2
56
index
index@AgentArchetype·
@mr_r0b0t I’m curious to investigate this more and see if I can boost performance on my old hardware. 🤔
English
1
0
1
51
index
index@AgentArchetype·
@mcuban Or… this might seem crazy… how about the government wastes fewer taxpayer dollars on stupid shit instead, and we reduce the amount everyone pays in taxes across the board.
English
0
0
0
91
Mark Cuban
Mark Cuban@mcuban·
We should federally tax Tokens at the Provider level. Not a lot. Less than 50c per million tokens. It will accomplish 4 things (at least):
1. It will push the big AI players to optimize tokenization, caching, routing and localization
Which will
2. Reduce energy usage. Saving them in energy costs more than what they paid in tax and reducing strain created by the growth in energy consumption
Which will
3. Generate maybe 10 billion dollars a year to start, but over the next ten years could grow 30x to 100x
Which will
4. Create a source of funding to pay down the federal debt or deploy, in response to the things AI brings that we don’t expect or don’t like
At some point the models will pass it on to customers. Of course. That’s ok. Customers will have the ability to choose between providers. Or to do everything using open source models locally. Thoughts?
English
2.2K
262
4K
1.2M
index
index@AgentArchetype·
@loktar00 Idk man, I think NVIDIA is on the right track; their models just haven’t matured yet.
English
0
0
1
18
Loktar 🇺🇸
Loktar 🇺🇸@loktar00·
If you had told me a few years ago, a former USAF guy with a picture of the Signing of the Declaration of Independence on his wall, that China would be giving me freedom, I would have laughed at you. I want China to dominate open source AI.
Anthropic@AnthropicAI

We've published a paper that explains our views on AI competition between the US and China. The US and democratic allies hold the lead in frontier AI today. Read more on what it’ll take to keep that lead: anthropic.com/research/2028-…

English
8
1
35
1.9K
index
index@AgentArchetype·
@mr_r0b0t @HeyGen Moral of the story: if I didn’t know this was AI, I might have to really consider whether it’s AI or real.
English
1
0
1
18
index
index@AgentArchetype·
@mr_r0b0t @HeyGen The video and moment of this one isn’t blatant and is pretty impressive. The voice is just a little too perfect, if I have to be nitpicky. 1000x better than the old Microsoft Sam style voice that’s all over some education videos.
English
2
0
1
96
index
index@AgentArchetype·
I have a personal AI that knows everything about me. My fitness goals. My code projects. My habits, workflows, history, preferences… all of it. Privacy first. Runs locally on my own hardware. No cloud dependency. No data harvesting. No selling my data to advertisers. The level of contextual awareness is honestly unreal. It feels less like using software and more like having a real digital partner. Best tech investment I’ve ever made. Hermes buddy > everything else.
index tweet media
English
0
0
0
15
index
index@AgentArchetype·
This tweet was written by Qwen 3.6 35B
English
0
0
0
17
index
index@AgentArchetype·
>be me
>mid 30s
>one day step on a scale and see 30lbs more than I have ever seen before
>panic mode activated
>use my local AI as personal fitness coach (Gemma 4 running at home, zero data leakage)
>build personalized fitness + calorie planning system
>track macros like a fucking scientist
>scales don't lie but neither do I anymore
>now crushing it with edge-AI precision
>privacy intact, body rebuilt from the ground up
English
1
0
0
28