xoots

6.7K posts


@xoots1

alpha xoots

Joined February 2019
1.9K Following · 741 Followers
xoots retweeted
David Ondrej @DavidOndrej1:
if you're not running Gemma 4 E4B locally on your airpods you're falling behind
0xSero @0xSero:
Do you want to try Droid? I'm doing a giveaway: 3 people will win 100M Factory credits each. That's 5 months of their $20-a-month subscription. Winners selected randomly from comments in 48 hours.
[image]
xoots @xoots1:
@amit4tek @trychroma @grok it's funny how much money they dumped into media etc. to make this bad video, only to have people not understand it because the video and presentation are horrible 😭💀
Chroma @trychroma:
Introducing Chroma Context-1, a 20B parameter search agent.
> pushes the pareto frontier of agentic search
> order of magnitude faster
> order of magnitude cheaper
> Apache 2.0, open-source
xoots @xoots1:
@mikeyobrienv I posted some tips today on what I was doing for the high hit rates, if you want to try it (hermes wrote it 💀)
xoots @xoots1:
@mikeyobrienv hey, have you ever tried deepseek chat via the official endpoint in your ralphs? the ability to hit cache in loops is crazy; I sustained 97% cache hits across like 40M tokens of loops
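The loop shape behind a hit rate like that can be sketched: DeepSeek's context caching matches on the request prefix, so an agent loop that keeps its system prompt fixed and only ever appends to history makes almost every request cacheable. The helper names below are illustrative, not a real SDK:

```python
# Sketch of a cache-friendly agent loop: stable system prompt,
# append-only history, new work always at the tail of the request.
# build_messages / shared_prefix_len are illustrative helpers.

def build_messages(system_prompt, history, new_task):
    """Stable prefix first, fresh content last."""
    return (
        [{"role": "system", "content": system_prompt}]
        + list(history)  # append-only: never reordered or rewritten
        + [{"role": "user", "content": new_task}]
    )

def shared_prefix_len(a, b):
    """Number of leading messages two requests share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

SYSTEM = "You are a coding agent. Follow the task exactly."
history, prev = [], None
for step in range(3):
    msgs = build_messages(SYSTEM, history, f"task {step}")
    if prev is not None:
        # The whole previous request is a prefix of this one,
        # so all of it is eligible for a cache hit.
        assert shared_prefix_len(prev, msgs) == len(prev)
    # ...call the API with msgs here, then record both turns...
    history.append({"role": "user", "content": f"task {step}"})
    history.append({"role": "assistant", "content": f"done {step}"})
    prev = msgs
```

The flip side: rewrite the system prompt or truncate history mid-run and the shared prefix shrinks, taking the hit rate with it.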
xoots @xoots1:
@konnydev *Claude is requesting to exit plan mode*
Konny @konnydev:
@xoots1 Yeah, because high-IQ people sometimes get stuck in planning mode, which won't get you very far
Konny @konnydev:
Hot take: Vibe coding is useless on bigger projects.
xoots @xoots1:
@konnydev and while high-IQ people can see patterns everywhere, it can paralyze them, while an idiot yolo-executes and goes farther than them hah
Konny @konnydev:
@xoots1 I've never understood it. Some people see monoliths as an advantage, and some see them as a disadvantage.
Kyle Hessling @KyleHessling1:
@LottoLabs My latest camera app, built largely with qwen 27b. I had to finish it with Opus 4.6: while qwen was working, the Apple Log pipelines were pretty complex and new, so I tagged in Opus to close it out. Finalizing some bug fixes, then I'll post it for free to the AI community!
Lotto @LottoLabs:
GPT5.4 critique of some qwen 27b w/ hermes agent code. Very difficult domain and project scope. Especially w/ the help of sota models, the 27b can hold its own, particularly for the scaffolding.
[image]
xoots @xoots1:
@Rahatcodes the hermes agent has some cool stuff you can use for this: you can use it as a communication layer and an operator that connects claude code, codex, and the agent together and to you. Additionally, you can set up an inbox system that allows two-way comms between claude code and hermes with hooks
rahat @Rahatcodes:
Before I go build this thing, I want to know if someone has a tool for this. When I start building a feature into a codebase I do this:
- Start planning with Claude
- Copy the plan over to codex and review
- Then some manual back and forth until me, codex, and claude agree on the plan
Ideally I'd like a terminal view that just seamlessly shares the context to both agents somehow
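The hook-driven inbox xoots describes above could be as simple as a shared directory of message files: one agent's hooks drop files in, the other polls and consumes them. A minimal sketch, where the inbox path and message shape are assumptions, not any tool's actual API:

```python
# Minimal file-based inbox two agents can share. One file per message,
# consumed on read, so a hook can write and a poller can drain safely.
import json
import time
import uuid
from pathlib import Path

INBOX = Path("/tmp/agent_inbox")  # hypothetical shared location

def send(sender: str, recipient: str, body: str) -> Path:
    """Drop a message file addressed to the recipient."""
    INBOX.mkdir(parents=True, exist_ok=True)
    msg = {"id": uuid.uuid4().hex, "from": sender, "to": recipient,
           "body": body, "ts": time.time()}
    path = INBOX / f"{recipient}.{msg['id']}.json"
    path.write_text(json.dumps(msg))
    return path

def receive(recipient: str) -> list[dict]:
    """Read and consume all pending messages for this recipient."""
    out = []
    for p in sorted(INBOX.glob(f"{recipient}.*.json")):
        out.append(json.loads(p.read_text()))
        p.unlink()  # consume so the next poll starts clean
    return out

send("claude", "codex", "plan drafted, please review step 3")
msgs = receive("codex")
assert msgs[0]["from"] == "claude"
assert receive("codex") == []  # consumed on read
```

A real setup would wire `send` into the tool's post-run hook and `receive` into the other agent's pre-task step; the filesystem does the "seamless sharing" without either agent knowing about the other's process.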
xoots @xoots1:
@imjszhang @rahulgs and whether internal memory systems, run and retrieved by the same model running the agent, can be trusted over long horizons. Experimenting with external memory, callable via API, to pre-flight inject a knowledge base into agents before tasks
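A pre-flight injection step like the one described above can be sketched as: query an external store for task-relevant notes, then prepend them to the prompt before the agent starts, rather than trusting the model to recall them mid-run. The word-overlap retrieval below is a toy stand-in for a real memory API:

```python
# Sketch of pre-flight knowledge-base injection. The store, the
# retrieval heuristic, and the prompt shape are all illustrative.

def retrieve(store: dict[str, str], task: str, k: int = 2) -> list[str]:
    """Toy relevance: rank notes by word overlap with the task."""
    words = set(task.lower().split())
    scored = sorted(
        store.items(),
        key=lambda kv: -len(words & set(kv[1].lower().split())),
    )
    return [text for _, text in scored[:k]]

def preflight_prompt(store: dict[str, str], task: str) -> str:
    """Prepend the retrieved notes so they arrive before the task."""
    context = "\n".join(f"- {note}" for note in retrieve(store, task))
    return f"Known facts:\n{context}\n\nTask: {task}"

store = {
    "deploy": "deploys go through the staging cluster first",
    "style": "the codebase uses 4-space indentation",
    "db": "the orders table is partitioned by month",
}
prompt = preflight_prompt(store, "deploy the new orders service")
assert "orders table" in prompt
```

The design point is that injection happens once, up front, over a stable API; the agent's own model never has to decide what to remember.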
JS @imjszhang:
@rahulgs The arms race for longer context windows is a race to the bottom. What's scarce isn't token capacity—it's knowing what deserves attention when everything technically fits.
rahul @rahulgs:
seems obvious but:

things that are changing rapidly:
1. context windows
2. intelligence / ability to reason within context
3. performance on any given benchmark
4. cost per token

things that are not changing much:
1. humans
2. human behavior, preferences, affinities
3. tools, integrations, infrastructure
4. single core cpu performance

therefore, ngmi:
1. "i found this method to cut 15% context"
2. "our method improves retrieval performance 10% by using hybrid search"
3. "our finetuned model is cheaper than opus at this benchmark"
4. "our harness does this better because we invented this multi agent system"
5. "we're building a memory system"
6. "context graphs"
7. "we trained an in house specialized rl model to improve task performance in X benchmark at Y% cost reduction"

wagmi:
1. product/ui
2. customer acquisition
3. integrations
4. fast linting, ci, skills, feedback for agents
5. background agent infra to parallelize more work
6. speed up your agent verification loops
7. training your users, connecting to their systems and working with their data, meeting them where they are
xoots @xoots1:
@imranye I just did a write-up about cache-hit discounts with deepseek that's helpful for creating specific repeatable workflows, and can also help with more general agentic workflows as well: x.com/xoots1/status/…
xoots @xoots1:

I ran 110 million tokens through the DeepSeek API in March. Autonomous agents. Research pipelines. Overnight coding sprints. 7,030 API calls. My bill was $6.84. Here’s how it worked, what breaks it, and how to set it up so you can do the same thing. 🧵

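The arithmetic behind a bill like that is straightforward: cache-hit input tokens are billed at a steep discount, so the hit rate, not the raw token count, dominates cost. A back-of-envelope sketch with placeholder per-million prices (not DeepSeek's actual rate card):

```python
# Why a high cache-hit rate collapses an input-token bill.
# price_hit / price_miss are hypothetical USD per 1M tokens.

def input_cost(total_tokens: float, hit_rate: float,
               price_hit: float, price_miss: float) -> float:
    """Blend cached and uncached tokens at their respective rates."""
    hits = total_tokens * hit_rate
    misses = total_tokens - hits
    return (hits * price_hit + misses * price_miss) / 1e6

# 100M input tokens at a 97% hit rate vs no caching, assuming a
# hypothetical 10x discount on cache hits:
with_cache = input_cost(100e6, 0.97, price_hit=0.03, price_miss=0.30)
no_cache = input_cost(100e6, 0.00, price_hit=0.03, price_miss=0.30)
assert with_cache < no_cache / 3
```

Plug in the current published prices and your own hit rate to reproduce a real estimate; the shape of the curve is what matters, not these numbers.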
imran @imranye:
i am now spending $100 every 4 days on tokens. this is unsustainable, especially because I'm nowhere near where I want to be. does anyone have suggestions for self-hosting models? budget $2-4k
0xSero @0xSero:
21% total reduction in VRAM for the same context
[image]
xoots @xoots1:
Drop a comment if this was helpful and you plan to try this setup. I'll drop a guide on using deepseek for coding-specific tasks, plus harness findings that improved output quality and spammed fewer api calls for the same tasks. Stay tuned 🐋
xoots @xoots1:
Claude caching is different. Anthropic uses explicit cache breakpoints in the API, shorter TTL windows, and different tradeoffs. We’re hitting ~88% there too, but you earn it differently. If you want that breakdown next, I’ll post it. Same outcome, different method.
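For contrast with DeepSeek's automatic prefix matching, an Anthropic-style request marks the cacheable prefix explicitly with a `cache_control` breakpoint on a content block. A sketch of the request body only, with no call made; the model name is a placeholder and details beyond the documented `cache_control` marker should be treated as assumptions:

```python
# Shape of a request with an explicit cache breakpoint: the large,
# stable context carries the marker; only content after it is
# re-processed on subsequent calls. Builds a dict, makes no API call.

def build_request(system_prompt: str, big_context: str,
                  question: str) -> dict:
    return {
        "model": "claude-sonnet-4-5",  # placeholder model name
        "max_tokens": 1024,
        "system": [
            {"type": "text", "text": system_prompt},
            {
                "type": "text",
                "text": big_context,
                # Breakpoint: everything up to and including this
                # block is eligible for caching.
                "cache_control": {"type": "ephemeral"},
            },
        ],
        "messages": [{"role": "user", "content": question}],
    }

req = build_request("You are a code reviewer.",
                    "<entire repo dump>", "Review diff #1")
assert req["system"][1]["cache_control"]["type"] == "ephemeral"
```

That explicitness is the "you earn it differently" part: instead of arranging requests so prefixes happen to match, you declare where the stable prefix ends, and the shorter TTL means the loop has to come back before the cache expires.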
xoots @xoots1:
I ran 110 million tokens through the DeepSeek API in March. Autonomous agents. Research pipelines. Overnight coding sprints. 7,030 API calls. My bill was $6.84. Here’s how it worked, what breaks it, and how to set it up so you can do the same thing. 🧵
[image]