Documenting AGI
119 posts

@DocumentingAGI
Tracking the breakthroughs, systems, and moments accelerating the world toward AGI.
Earth · Joined November 2016
17 Following · 66.1K Followers

@alex_whedon Less than 5% of the cost of Opus - very impressive

Introducing SubQ - a major breakthrough in LLM intelligence.
It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA),
And the first frontier model with a 12 million token context window which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention).
Only a small fraction actually matter.
@subquadratic finds and focuses only on the ones that do.
That's nearly 1,000x less compute and a new way for LLMs to scale.
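
To make the scaling argument above concrete, here is a rough back-of-the-envelope sketch. It is not SubQ's actual method, which the post does not describe. It only counts query-key scores under the assumption that a sparse scheme lets each query attend to a fixed number of keys; `keys_per_query` and the function names are hypothetical, and the cost of choosing which keys to keep is ignored.

```python
def dense_pair_count(n_tokens: int) -> int:
    """Query-key scores in standard (dense) attention: one per ordered token pair."""
    return n_tokens * n_tokens


def sparse_pair_count(n_tokens: int, keys_per_query: int) -> int:
    """Scores computed if each query attends to at most keys_per_query keys.

    keys_per_query is a hypothetical budget, not a published SubQ parameter.
    """
    return n_tokens * min(keys_per_query, n_tokens)


def compute_ratio(n_tokens: int, keys_per_query: int) -> float:
    """How many times more scores dense attention computes than the sparse sketch."""
    return dense_pair_count(n_tokens) / sparse_pair_count(n_tokens, keys_per_query)


if __name__ == "__main__":
    # Illustrative context lengths; the key budget of 1,000 is an assumption.
    for n in (10_000, 1_000_000, 12_000_000):
        print(f"{n:>12,} tokens: dense/sparse = {compute_ratio(n, 1_000):,.0f}x")
```

Under this assumption the saving grows linearly with context length (it equals `n_tokens / keys_per_query`), which is why a fixed per-query budget matters more the longer the context gets.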

For a while, “long context” mostly meant bigger numbers and bigger bills. SubQ is making a different pitch: long-context LLMs do not need to get this expensive to stay useful. The model outperforms Opus 4.6 on long-context benchmarks at less than 10% of the cost.
Alexander Whedon @alex_whedon
Introducing SubQ - a major breakthrough in LLM intelligence.
It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA),
And the first frontier model with a 12 million token context window which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention).
Only a small fraction actually matter.
@subquadratic finds and focuses only on the ones that do.
That's nearly 1,000x less compute and a new way for LLMs to scale.