Cacheon

34 posts

@cacheon_ai

Inference arena for open-source LLMs. Build the fastest correct server. Win real rewards.

Joined April 2026
14 Following · 634 Followers
Pinned Tweet
Cacheon @cacheon_ai
Launching Cacheon: an open, incentivized competition for LLM inference optimization. As model quality converges, the next frontier is serving them economically at scale: lower latency, higher throughput, and lower cost per token. Cacheon turns that problem into a live arena with continuous evaluation. Developers submit containerized inference servers, benchmarked on standardized hardware against a pinned vLLM baseline. The fastest server that preserves output correctness wins. The goal is to make better inference systems discoverable, measurable, deployable, and rewarded in the open. Mainnet launches by May 19. Learn more: cacheon.ai
Replies 47 · Reposts 39 · Likes 185 · Views 44.6K
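The rule in the launch post ("the fastest server that preserves output correctness wins") can be sketched roughly as below. This is only an illustration: the `Submission` record, its field names, and the `pick_winner` helper are hypothetical and are not taken from the Cacheon codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Submission:
    """Hypothetical record of one benchmarked inference server."""
    uid: int                # assumed miner identifier
    wall_time_s: float      # assumed measured time for the benchmark
    outputs_correct: bool   # assumed result of the correctness check


def pick_winner(submissions: list[Submission]) -> Optional[Submission]:
    """Return the fastest submission among those that passed correctness."""
    correct = [s for s in submissions if s.outputs_correct]
    if not correct:
        return None  # nobody preserved output correctness
    return min(correct, key=lambda s: s.wall_time_s)


if __name__ == "__main__":
    field = [
        Submission(uid=1, wall_time_s=9.0, outputs_correct=False),
        Submission(uid=2, wall_time_s=10.5, outputs_correct=True),
        Submission(uid=3, wall_time_s=11.2, outputs_correct=True),
    ]
    winner = pick_winner(field)
    print(winner.uid if winner else "no winner")
```

Note that the fastest entry (UID 1 in the example) is skipped because it failed the correctness check, so speed alone does not win.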
Cacheon @cacheon_ai
Cacheon has parted ways with @xavi3rlu. We're excited for the future of Cacheon and will keep you updated on the next phase of the subnet.
Replies 1 · Reposts 5 · Likes 22 · Views 8.7K
Cacheon @cacheon_ai
We just merged our first-ever community PR from a miner! 🤝 It makes our evaluation fairer. Miners can use any tokenizer without getting penalized. If your output is right, you pass. Simple as that. github.com/latent-to/cach…
Replies 2 · Reposts 3 · Likes 21 · Views 1.8K
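One way to read "any tokenizer without getting penalized" is that correctness is judged on decoded text rather than on token IDs. The sketch below illustrates that idea with two toy tokenizers; how the merged PR actually implements the check is not stated in the post, and every function name here is hypothetical.

```python
def decode_words(tokens: list[str]) -> str:
    """Toy decoder for a whitespace-level tokenizer (illustrative only)."""
    return " ".join(tokens)


def decode_chars(tokens: list[str]) -> str:
    """Toy decoder for a character-level tokenizer (illustrative only)."""
    return "".join(tokens)


def outputs_match(reference_text: str, candidate_text: str) -> bool:
    """Assumed correctness rule: compare decoded text, not token sequences."""
    return reference_text.strip() == candidate_text.strip()


if __name__ == "__main__":
    word_tokens = ["hello", "world"]
    char_tokens = list("hello world")
    # The token sequences differ, but the decoded outputs agree.
    print(word_tokens == char_tokens)
    print(outputs_match(decode_words(word_tokens), decode_chars(char_tokens)))
```

Comparing at the text level means a miner's choice of tokenizer cannot by itself cause a failure; only a different final output can.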
Cacheon @cacheon_ai
Cacheon docs are now agent-ready.
On every doc page:
• Open in Claude, Cursor, or ChatGPT
• or Copy Markdown
For grounding:
• cacheon.ai/llms-full.txt
• Per-page markdown URLs
As more development workflows move from browsers to agents, documentation needs to be optimized for both humans and machines.
Replies 1 · Reposts 3 · Likes 22 · Views 7.8K
Cacheon @cacheon_ai
Cacheon paid out its first miner incentives on Tuesday. Here's how it works: miners earn when they beat vLLM in eval. One miner did exactly that. Got rewarded. Then on the next run, their score dropped below the baseline, so emissions went back to burn. No one is being held back. Beat vLLM consistently, earn consistently. Miss a run, miss the reward. Variance is real. A single fast run does not lock in your position. Full details: cacheon.ai/docs/evaluatio…
Replies 1 · Reposts 0 · Likes 14 · Views 811
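The per-run payout behaviour described above (beat the baseline and earn, otherwise emissions go to burn) might be sketched like this. The function names, the score values, and the treatment of an exact tie as "burn" are assumptions for illustration, not details confirmed by the post.

```python
def emission_destination(miner_score: float, baseline_score: float) -> str:
    """Route one run's emissions: to the miner only if the baseline is beaten.

    Assumption: a tie with the baseline counts as not beating it.
    """
    return "miner" if miner_score > baseline_score else "burn"


def settle_runs(run_scores: list[float], baseline_score: float) -> list[str]:
    """Settle each run independently; a past win does not carry over."""
    return [emission_destination(score, baseline_score) for score in run_scores]


if __name__ == "__main__":
    # Hypothetical scores: one run above the baseline, the next one below it.
    print(settle_runs([1.04, 0.97], baseline_score=1.0))
```

Because each run is settled on its own, a single fast run does not lock in a position, which matches the "variance is real" point in the post.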
Cacheon retweeted
Xavier @xavi3rlu
It was a strong signal that even though $TAO was just a track of @proofoftalk, it felt like the main event. The vibes were immaculate. Awesome catching up with old friends and finally putting faces to many of the names we've been building alongside online.
Quoting Cacheon @cacheon_ai:

The energy around @Bittensor at @proofoftalk was exceptional. We had great conversations with old friends and made plenty of new ones across the ecosystem. People are no longer asking whether Bittensor can attract builders. They're asking how subnets will take market share, generate revenue, and build sustainable advantages over centralized AI firms. The ecosystem feels materially different than it did a year ago: more serious teams, stronger infrastructure, more capital, and significantly higher conviction. Came away more bullish than ever on SN14 Cacheon and the broader Bittensor space. Still a long road ahead, but the momentum is undeniable. Hopefully more good news in the coming weeks.

Replies 1 · Reposts 4 · Likes 51 · Views 2.2K
Cacheon @cacheon_ai
The energy around @Bittensor at @proofoftalk was exceptional. We had great conversations with old friends and made plenty of new ones across the ecosystem. People are no longer asking whether Bittensor can attract builders. They're asking how subnets will take market share, generate revenue, and build sustainable advantages over centralized AI firms. The ecosystem feels materially different than it did a year ago: more serious teams, stronger infrastructure, more capital, and significantly higher conviction. Came away more bullish than ever on SN14 Cacheon and the broader Bittensor space. Still a long road ahead, but the momentum is undeniable. Hopefully more good news in the coming weeks.
Replies 2 · Reposts 6 · Likes 56 · Views 3.2K
Cacheon @cacheon_ai
Cacheon competition restarts June 1.
What we overhauled this week:
- One-pass eval: speed + correctness on the same prompts, same outputs
- Single metric: end-to-end wall time vs baseline (no TTFT / TPS split)
- Improved logging and telemetry
- Emissions ramp up after June 8
- Conviction lock soon
Inference is the compute layer everything runs on. Open competition is how we surface the best. Miners, show us what you've got. 💪
Read more: cacheon.ai/docs
Replies 2 · Reposts 4 · Likes 23 · Views 1.6K
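A minimal sketch of what a one-pass evaluation with a single wall-time metric could look like: the same generated outputs are used both for timing and for the correctness check. The `generate` callable, the exact-match correctness rule, and the speedup ratio are assumptions made for illustration; the post does not specify how the score is actually computed.

```python
import time
from typing import Callable


def one_pass_eval(
    generate: Callable[[str], str],
    prompts: list[str],
    references: list[str],
) -> tuple[float, bool]:
    """Time the whole prompt set once and check those same outputs."""
    start = time.perf_counter()
    outputs = [generate(prompt) for prompt in prompts]
    wall_time_s = time.perf_counter() - start
    # Assumed correctness rule: every output equals its reference exactly.
    all_correct = outputs == references
    return wall_time_s, all_correct


def speedup_vs_baseline(baseline_wall_s: float, candidate_wall_s: float) -> float:
    """Assumed single metric: values above 1.0 mean faster than the baseline."""
    if candidate_wall_s <= 0:
        raise ValueError("candidate wall time must be positive")
    return baseline_wall_s / candidate_wall_s


if __name__ == "__main__":
    # str.upper stands in for a real inference server call.
    elapsed, ok = one_pass_eval(str.upper, ["a", "b"], ["A", "B"])
    print(ok, elapsed >= 0.0)
    print(speedup_vs_baseline(10.0, 8.0))
```

Measuring end-to-end wall time over the full prompt set folds time-to-first-token and tokens-per-second into one number, which is the simplification the post describes.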
Cacheon @cacheon_ai
We shipped two things over the weekend: a 0.1 TAO miner submission fee and @shadeformai GPU support.
Submission fee: Every on-chain commit now costs 0.1 TAO. The goal is to cut spam and add skin in the game. The fee covers GPU rental first; anything left buys SN14 tokens and burns them. Miner workflow is unchanged. Docs: cacheon.ai/docs/miners/re…
Shadeform GPUs: The validator can now pull GPUs from Shadeform alongside @TargonCompute and @lium_io for evaluations. More supply, less wait time.
Updates on the evaluation upgrade are coming later this week. Follow our Bittensor Discord channel for ongoing discussions.
Replies 1 · Reposts 6 · Likes 26 · Views 1.7K
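The fee waterfall described above (GPU rental first, remainder to buy and burn) reduces to simple arithmetic. The sketch below works it through with hypothetical rental costs; the function name and the example costs are illustrative, and only the 0.1 TAO fee comes from the post.

```python
from decimal import Decimal

SUBMISSION_FEE_TAO = Decimal("0.1")  # fee per on-chain commit, per the post


def split_fee(
    gpu_rental_cost_tao: Decimal,
    fee_tao: Decimal = SUBMISSION_FEE_TAO,
) -> tuple[Decimal, Decimal]:
    """Return (amount covering GPU rental, amount left for buy-and-burn)."""
    if gpu_rental_cost_tao < 0:
        raise ValueError("GPU rental cost cannot be negative")
    to_gpu = min(fee_tao, gpu_rental_cost_tao)  # rental is paid first
    to_buy_and_burn = fee_tao - to_gpu          # whatever remains is burned
    return to_gpu, to_buy_and_burn


if __name__ == "__main__":
    # Hypothetical rental costs, in TAO, for two evaluations.
    print(split_fee(Decimal("0.04")))
    print(split_fee(Decimal("0.25")))
```

`Decimal` is used instead of `float` so that small token amounts are not distorted by binary rounding. When the rental cost exceeds the fee, the whole fee goes to rental and nothing is left to burn.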
Cacheon @cacheon_ai
🎫 Request ID: CU2005260042
Replies 0 · Reposts 0 · Likes 2 · Views 275
Cacheon retweeted
Xavier @xavi3rlu
Day 1 of @cacheon_ai is in the books. Stressful? 100%. We shipped, broke things, and found multiple exploits. All in one day. That is how it is supposed to go. We are building the infrastructure that makes inference competitive with centralized providers. That is not a small problem. You do not find the real edges until real miners are running real models on a live subnet. We found them fast and want to patch them even faster. Day 1 stress means people care enough to try to break the system. That is what makes this product stronger. Evaluation and emission are paused until the fixes are in. More updates before end of week. To every miner who stayed in and flagged issues: thank you!
Replies 1 · Reposts 4 · Likes 30 · Views 2.2K
Cacheon retweeted
Xavier @xavi3rlu
Didn't expect 40+ miners on testnet. People ran real inference servers for nothing but early positioning. That kind of participation before any reward is on the table tells you something. To every testnet miner: you built this community before it had a dollar attached. That's why I'm confident @cacheon_ai mainnet will work!
Quoting Cacheon @cacheon_ai:

Cacheon mainnet is live. 13 inference servers queued, each racing to beat our baseline on a dedicated 8x H200 pod. The winner earns up to $10,000/day. Inference optimization starts today on @Bittensor. Follow along: cacheon.ai/dashboard

Replies 0 · Reposts 2 · Likes 35 · Views 2.7K
Georgi Gerganov @ggerganov
llama.cpp adds MTP for the Qwen3.6 family This is a significant milestone for the local AI ecosystem. The performance jump with these changes is massive and elevates local inference on commodity hardware further. Special thanks to Aman Gupta for leading this development! github.com/ggml-org/llama…
Replies 48 · Reposts 180 · Likes 1.2K · Views 273.5K
Cacheon @cacheon_ai
Cacheon mainnet is live. 13 inference servers queued, each racing to beat our baseline on a dedicated 8x H200 pod. The winner earns up to $10,000/day. Inference optimization starts today on @Bittensor. Follow along: cacheon.ai/dashboard
Replies 6 · Reposts 29 · Likes 90 · Views 27.1K
Cacheon @cacheon_ai
Back-to-back kings crowned in a single eval run on Cacheon testnet. UID 11 took the crown, then UID 20 snatched it minutes later. Scores are still very small since miners are mostly tuning vLLM defaults. The real jump comes when someone ships actual KV cache reuse, better prefill optimizations, or speculative decoding (V2). That's what this subnet is built for. Keep pushing. 🚀 Early dashboard live at cacheon.ai/dashboard. Still a very early version with some rough edges. The site is also open source at github.com/latent-to/cach… and we welcome contributions.
Replies 1 · Reposts 3 · Likes 22 · Views 1.6K
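The "back-to-back kings" sequence can be pictured as a running best: a submission takes the crown only if it beats the current best score, starting from the baseline. This is a rough sketch under that assumption; the crowning rule, the `crown_history` helper, and all scores below are hypothetical, with only UIDs 11 and 20 taken from the post.

```python
def crown_history(
    results: list[tuple[int, float]],
    baseline_score: float,
) -> list[int]:
    """Return UIDs in the order they took the crown during one eval run."""
    best_score = baseline_score
    kings: list[int] = []
    for uid, score in results:
        if score > best_score:  # must strictly beat the current best
            best_score = score
            kings.append(uid)
    return kings


if __name__ == "__main__":
    # Hypothetical (uid, score) pairs in evaluation order.
    run = [(7, 0.99), (11, 1.01), (20, 1.02), (5, 1.00)]
    print(crown_history(run, baseline_score=1.0))
```

With margins this small, as when miners are mostly tuning vLLM defaults, the crown can change hands several times within a single run.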
Cacheon retweeted
Xavier @xavi3rlu
The OpenReview thread on TurboQuant vs RaBitQ is worth reading:
– prior work (RaBitQ) not properly addressed, missing apples-to-apples comparison
– RaBitQ author flagged it; still not clearly resolved
– disputed benchmark setup: RaBitQ on Python single-thread CPU vs TurboQuant on H100
Makes you question how much of the “KV cache gains” are actually real. This optimization space needs stricter evals.
Quoting Jianyang Gao @gaoj0017:

The TurboQuant paper (ICLR 2026) contains serious issues in how it describes RaBitQ, including incorrect technical claims and misleading theory/experiment comparisons. We flagged these issues to the authors before submission. They acknowledged them, but chose not to fix them. The paper was later accepted and widely promoted by Google, reaching tens of millions of views. We’re speaking up now because once a misleading narrative spreads, it becomes much harder to correct. We’ve written a public comment on openreview (openreview.net/forum?id=tO3AS…). We would greatly appreciate your attention and help in sharing it.

Replies 1 · Reposts 1 · Likes 15 · Views 2.9K
Cacheon @cacheon_ai
Had an awesome conversation with @SubnetSummerTAO about what we’re building. Perhaps @xavi3rlu was a bit too pumped about it, but the core idea is simple: make LLM serving faster, cheaper, objectively benchmarked, and production-deployable.
Quoting Subnet Summer @SubnetSummerTAO:

🔥 Subnet Summer AMA X @cacheon_ai (Subnet 14) 🔥
@xavi3rlu is building a decentralised inference competition network for open-source AI models. Cacheon is Subnet 14 on Bittensor, creating a permissionless benchmarking system to power the next generation of fast, accurate, and trustless AI inference infrastructure.
In this episode, we sit down with the team behind Cacheon, a decentralised inference performance subnet built on Bittensor.
We cover:
- What Cacheon is building and why containerised inference competition matters for the future of open-source AI
- How miners compete by submitting Docker-packaged inference servers optimised for speed and correctness when serving open-source models
- Decentralised validation: how validators benchmark and score miner submissions in real time to ensure outputs meet quality and performance standards
- Cacheon vs centralised inference providers and why the future of model serving should be open, permissionless, and economically incentivised
- The role of token incentives in driving continuous performance improvements and attracting world-class inference engineers to the network
- How Cacheon is pushing the boundaries of what decentralised compute can deliver for AI applications at scale
- Early progress, current network stats, and what's coming next
- Roadmap toward becoming the go-to decentralised inference layer for open-source model deployment
- Live community Q&A
If you're interested in decentralised AI, open-source model serving, GPU compute, or the future of inference infrastructure, this one's for you. youtu.be/noKx3ZHvUlI?si…

Replies 1 · Reposts 3 · Likes 30 · Views 2.6K