J Allen
@BTC_Yogi
Partner to @siereina_ - Father - Yogi - BTC Maxi - Explorer of Worlds - Builder of Things.



The speed-of-light optimization of Qwen3.5 on the TokenSpeed inference engine is a significant milestone: a record 580 tokens per second (tps) for agentic workloads on NVIDIA GPUs. The PyTorch Foundation's latest community blog post covers the complete design, implementation, and optimization of Qwen3.5 models in the TokenSpeed inference framework, so you can see for yourself how this work improves performance 👉 bit.ly/4uGUvIS This was a joint effort between the @Alibaba_Qwen inference team, the @lightseekorg Foundation TokenSpeed team, @NVIDIAAI, and the Mooncake team, with special contributions from @tri_dao on FlashAttention-4 (FA4) optimization. @KVCache_AI




Clashes break out at protest in Norwich after seven Afghan migrants are charged with rape and child sexual abuse offences trib.al/XG7CNLx


Moderate Muslims, please speak up!





November, here we come.




