⠋ Sewer56 ⣠

233

Andrew Feldman@andrewdfeldman·1d

.@cerebras is now running Kimi K2.6 - the leading trillion parameter open source model - at ~1000 tokens per second in enterprise trials. 6.7x faster than the next-fastest GPU cloud. 10x faster than Claude Opus. 3x faster than Gemini Flash 3.5 (Google’s latest fast model). A coding task that typically takes 3 minutes finishes in under 6 seconds on Cerebras. This is what wafer scale was built for.

51

669

48.6K
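A quick sanity check of the figures quoted above, as a sketch only. It takes the ~1000 tokens/s, 3-minute and 6-second numbers at face value and assumes the same amount of output in both runs; the tweet does not say which baseline the 3-minute figure refers to. The function and key names are illustrative.

```python
def implied_figures(tokens_per_second: float, fast_seconds: float, slow_seconds: float) -> dict:
    """Derive the figures implied by one throughput and two wall-clock times."""
    # Output produced in the fast run at the quoted throughput.
    tokens = tokens_per_second * fast_seconds
    return {
        "tokens": tokens,
        # Wall-clock ratio between the slow and fast runs.
        "speedup": slow_seconds / fast_seconds,
        # Throughput at which the same output would take the slow time.
        "baseline_tps": tokens / slow_seconds,
    }


# Assumed inputs, taken from the tweet: ~1000 tokens/s, 6 s versus 3 min.
figures = implied_figures(tokens_per_second=1000, fast_seconds=6, slow_seconds=3 * 60)
print(figures)
```

Under these assumptions the task is about 6,000 output tokens, the wall-clock ratio is 30x, and the implied baseline is roughly 33 tokens/s.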

⠋ Sewer56 ⣠@TheSewer56·1d

@tensorwave @wafer_ai @AMD Waffle Powered Inference™ 🧇

36

TensorWave@tensorwave·1d

You have to read this one. We just published a recap into how @wafer_ai pushed @AMD inference performance to a level that’s getting the entire ecosystem’s attention and the results are kind of wild. What makes this story interesting isn’t just the performance itself. It’s how they achieved it: systems-level optimization, smart inference tuning, and a belief that AMD can compete at the very highest tier. Proud this work was powered on TensorWave’s AMD-native cloud infrastructure and early #MI355X deployments. tensorwave.com/blog/wafer-rea…

5

6

40

4.3K

⠋ Sewer56 ⣠@TheSewer56·1d

@cerebras @ArtificialAnlys @Kimi_Moonshot (Enterprise Only)

233

Cerebras@cerebras·1d

Cerebras is now running Kimi K2.6 – a trillion parameter model – in enterprise trials. At ~1,000 tokens/s, this is the fastest frontier model performance ever measured by Artificial Analysis @ArtificialAnlys.

166

313

4.2K

784.6K

⠋ Sewer56 ⣠@TheSewer56·4d

@GaddipatiHarsha If in doubt, github.com/xai-org/x-algo… . Feed an LLM at it; then get some optimisation ideas 😉

0

50

Harsha Gaddipati@GaddipatiHarsha·4d

Man the x algorithm is cooking my reach Only get 30 views instead of 50 smh

0

1

228

⠋ Sewer56 ⣠@TheSewer56·4d

@gpusteve @bronzeagepapi wafer mention

2

34

steve@gpusteve·4d

@bronzeagepapi wafer mention

0

3

169

Kirito (e/acc) 🏴‍☠️@bronzeagepapi·4d

Wafer scale asic miners

0

1

252

⠋ Sewer56 ⣠@TheSewer56·4d

Note that the headline above is a *conservative* value, assuming smaller model size than current GPT5.5 estimate and GPUs. In practice may even be 3x that.

46

⠋ Sewer56 ⣠@TheSewer56·4d

Sources: - Typical US Energy Use: eia.gov/tools/faqs/faq… - Estimate of Energy Use on Inference: arxiv.org/abs/2509.20241 arxiv.org/abs/2504.17674

0

49

⠋ Sewer56 ⣠@TheSewer56·4d

Monthly electricity use here for inference is approximately that of 402 typical US Households.

Peter Steinberger 🦞@steipete

The latest CodexBar update renders API costs wayyyy nicer. codex.bar

0

102
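To put the 402-household comparison into energy terms, here is a rough conversion sketch. The 899 kWh/month household figure is an assumption (roughly what the EIA FAQ has reported for a typical US residential customer in recent years), and the constant and function names are illustrative; none of this comes from the tweet itself.

```python
# Assumption: a typical US household uses about 899 kWh of electricity per month.
HOUSEHOLD_KWH_PER_MONTH = 899


def households_to_kwh(households: float, kwh_per_household: float = HOUSEHOLD_KWH_PER_MONTH) -> float:
    """Monthly energy (kWh) equivalent to a number of typical households."""
    return households * kwh_per_household


def kwh_to_households(kwh: float, kwh_per_household: float = HOUSEHOLD_KWH_PER_MONTH) -> float:
    """Number of typical households whose monthly use matches the given energy."""
    return kwh / kwh_per_household


monthly_kwh = households_to_kwh(402)
print(f"{monthly_kwh:,.0f} kWh per month")
```

With that assumed household figure, 402 households corresponds to roughly 361 MWh per month.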

⠋ Sewer56 ⣠@TheSewer56·11 May

@teresajanedavis @Crush40Johnny Awesome 👏

0

1

159

TJ Davis@teresajanedavis·11 May

Feel the Sunshine... while you Live and Learn! See you at THE SPEED OF SOUND next February with @Crush40Johnny!

Speed of Sound Tour@Crush40SoSTour

Get ready, we’re about to brighten up your day! ☀️ We are pleased to welcome @teresajanedavis, the iconic singer from the Sonic R soundtrack, as a special guest for the London Speed of Sound show! More info below 🧵

10

84

539

19.7K

⠋ Sewer56 ⣠@TheSewer56·11 May

@gpusteve @Swolav @0xeecS Do we also need to implement own OS and heap allocator? >w< Always a matter of how low level you want to go.

0

3

98

steve@gpusteve·11 May

@Swolav @0xeecS no one let me use it :(

0

2

598

steve@gpusteve·10 May

if u can’t implement lfu cache in 30 min, ur ngmi. source: this was a screen for a 500k new grad role

8

12

683

92.8K
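For readers unfamiliar with the exercise mentioned above, this is one common way to sketch an LFU (least-frequently-used) cache with O(1) `get` and `put`: keys are grouped into per-frequency buckets, and ties are broken by least recent use. It is an illustrative sketch, not the interview's reference solution; the class and method names are assumptions.

```python
from collections import OrderedDict, defaultdict


class LFUCache:
    """LFU cache with O(1) get/put; ties are broken by least recent use."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.values = {}                          # key -> value
        self.freq = {}                            # key -> use count
        self.buckets = defaultdict(OrderedDict)   # use count -> keys, oldest first
        self.min_freq = 0                         # smallest use count currently present

    def _touch(self, key) -> None:
        """Move a key from its current frequency bucket to the next one."""
        f = self.freq[key]
        del self.buckets[f][key]
        if not self.buckets[f]:
            del self.buckets[f]
            if self.min_freq == f:
                self.min_freq = f + 1
        self.freq[key] = f + 1
        self.buckets[f + 1][key] = None

    def get(self, key, default=None):
        if key not in self.values:
            return default
        self._touch(key)
        return self.values[key]

    def put(self, key, value) -> None:
        if self.capacity <= 0:
            return
        if key in self.values:
            self.values[key] = value
            self._touch(key)
            return
        if len(self.values) >= self.capacity:
            # Evict the oldest key among those with the lowest use count.
            victim, _ = self.buckets[self.min_freq].popitem(last=False)
            if not self.buckets[self.min_freq]:
                del self.buckets[self.min_freq]
            del self.values[victim]
            del self.freq[victim]
        self.values[key] = value
        self.freq[key] = 1
        self.buckets[1][key] = None
        self.min_freq = 1


cache = LFUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" now has a higher use count than "b"
cache.put("c", 3)   # cache is full, so the least-used key ("b") is evicted
```

The per-frequency `OrderedDict` buckets are what keep eviction constant-time: the victim is always the first key in the bucket for `min_freq`.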

⠋ Sewer56 ⣠@TheSewer56·9 May

@gpusteve hello world

1

15

steve@gpusteve·9 May

@TheSewer56 hi

0

1

23

⠋ Sewer56 ⣠@TheSewer56·30 Apr

The reason:

Sonic Racing: CrossWorlds@RaceCrossWorlds

We are aware of the ongoing issues that players cannot access the Sonic Racing: CrossWorlds Demo for the Free Weekend period. We are currently investigating and will provide updates when available. We apologize for the inconvenience and thank you for understanding.

6

33

847

⠋ Sewer56 ⣠@TheSewer56·6 May

@FireworksAI_HQ It's never not time for a little friendly competition😉

1

29

⠋ Sewer56 ⣠@TheSewer56·6 May

@FireworksAI_HQ Ooh! Competition! >w< Let's gooo 🎉

0

1

34

⠋ Sewer56 ⣠@TheSewer56·5 Apr

GLM-5 (Fast) on @FireworksAI_HQ is the fastest GLM-5 I've used so far. Pretty sure this is a hidden Easter Egg 🥚 from the folks at Fireworks 🎆, available for Fire Pass users. 110-120 TPS is one thing, but TTFT (response time) is stupid fast. Not sponsored, just impressed.

1

8

504

⠋ Sewer56 ⣠@TheSewer56·5 May

@FireworksAI_HQ PS. Wafer is even faster. 🙂

0

85

⠋ Sewer56 ⣠@TheSewer56·5 Apr

@FireworksAI_HQ It's `fireworks-ai/accounts/fireworks/routers/glm-5-fast` btw 😉

0

1

305

⠋ Sewer56 ⣠@TheSewer56·30 Apr

@RaceCrossWorlds Here is the reason.

8

576

9.1K

Sonic Racing: CrossWorlds@RaceCrossWorlds·30 Apr

We are aware of the ongoing issues that players cannot access the Sonic Racing: CrossWorlds Demo for the Free Weekend period. We are currently investigating and will provide updates when available. We apologize for the inconvenience and thank you for understanding.

82

128

1.2K

194.3K

⠋ Sewer56 ⣠@TheSewer56·24 Apr

@Zai_org @deepseek_ai Respect 🙏

935

Z.ai@Zai_org·24 Apr

@deepseek_ai Really impressive work! If you need a higher rate limit to keep those evals moving forward, we are definitely here to support you.

29

1.6K

129.9K

DeepSeek@deepseek_ai·24 Apr

🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice. Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today! 📄 Tech Report: huggingface.co/deepseek-ai/De… 🤗 Open Weights: huggingface.co/collections/de… 1/n

1.6K

7.7K

45.3K

9.7M

⠋ Sewer56 ⣠@TheSewer56·24 Apr

@victor207755822 The folks at DeepSeek are simply built different (Speciale), I sometimes feel like. They straight release some pretty radical tech, extensive reports and even bring perf patches for SGLang day one so people can run it well. Simply incredible.

2

632

Deli Chen@victor207755822·24 Apr

DeepSeek-V3: Dec 26, 2024 DeepSeek-V4: Apr 24, 2026 484 days later, we humbly share our labor of love. As always, we stay true to long-termism and open source for all. AGI belongs to everyone. ❤️🌍 #DeepSeekV4 #AGIforEveryone #OpenSource

DeepSeek@deepseek_ai

🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice. Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today! 📄 Tech Report: huggingface.co/deepseek-ai/De… 🤗 Open Weights: huggingface.co/collections/de… 1/n

352

1.3K

13.1K

1M

⠋ Sewer56 ⣠@TheSewer56·19 Apr

@brendonovich @robertmclaws @LukeParkerDev I got a good chunk of it done over at github.com/Sewer56/llm-co… for server use; pretty blazing fast too (I got the benchmarks to prove it). 11MiB binary, 11MiB PSS, compared to 450MiB of `opencode serve`. etc. Haha. Even went out the way to optimize down to tool calls.

63

brendan@brendonovich·19 Apr

@robertmclaws @LukeParkerDev rust rewrite or bust 🦀🦀🦀

0

10

2.9K

brendan@brendonovich·19 Apr

x.com/i/article/2045…

42

69

1K

132.8K

⠋ Sewer56 ⣠@TheSewer56·14 Apr

@thdxr There's no LSP, ACP. MCP only via code ATM. No skill loading API. Outside of that, all the core functionality is there. From custom tools to permissions to agents to models.dev, etc. And some extra additions, e.g. tool settings per agent. Heavily optimized.

2

55

⠋ Sewer56 ⣠@TheSewer56·14 Apr

@thdxr Honestly, feel free to let it rip with mine github.com/Sewer56/llm-co… if you want to experiment. The important stuff's already there and things like agents are 99% drop-in compatible. Ready for initial release. I just have wiki to complete next weekend.
