Stijn
3.9K posts

@StijnSmits
Founding AI Engineer @ Schematik, ex-Zyphra, 2x marathoner


GPT-5.5 Instant is starting to roll out in ChatGPT. It’s a big upgrade, giving you smarter, clearer, and more personalized answers in a warmer, more natural tone. And it's also more concise, which we heard you wanted. We think you'll love chatting with it.




Introducing SubQ - a major breakthrough in LLM intelligence.

It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus

Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do.

That's nearly 1,000x less compute and a new way for LLMs to scale.



Introducing folded Tensor and Sequence Parallelism (TSP), a new way to split large models across GPUs that achieves lower per-GPU peak memory than any standard parallelism scheme. Scaled on @AMD MI300x. Bigger models, longer contexts, and higher throughput 🧵













