Felipe Vallejo Uribe retweeted

@AppenResearch independently evaluated @subquadratic's SSA kernel - a learned sparse attention mechanism designed to reduce the quadratic scaling limitations of full attention.
Results at 1M-token context lengths:
- 56.2× wall-clock speedup vs. FA2 (FlashAttention-2)
- 62.8× FLOP reduction
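
The tweet doesn't say how SSA decides which keys each query attends to, so the sketch below is only a generic illustration of the idea behind learned sparse attention, not the actual SSA kernel. Every name, shape, and the block-routing scheme here are assumptions; the mean-pooled block score stands in for whatever learned scorer SSA actually uses.

```python
# Illustrative block-sparse attention sketch (NOT the SSA kernel; all
# parameters and the routing heuristic are assumptions for exposition).
import torch
import torch.nn.functional as F

def block_sparse_attention(q, k, v, block_size=64, topk=4):
    """Each query block attends only to the top-k key blocks picked by a
    cheap block-level score, so cost scales with n * topk * block_size
    rather than n^2."""
    n, d = q.shape
    nb = n // block_size
    qb = q.view(nb, block_size, d)
    kb = k.view(nb, block_size, d)
    vb = v.view(nb, block_size, d)

    # Cheap routing: compare mean query of each query block against mean
    # key of each key block (a stand-in for a learned importance scorer).
    block_scores = qb.mean(dim=1) @ kb.mean(dim=1).T   # (nb, nb)
    sel = block_scores.topk(topk, dim=-1).indices      # (nb, topk)

    out = torch.empty_like(qb)
    scale = d ** -0.5
    for i in range(nb):
        ks = kb[sel[i]].reshape(-1, d)                 # (topk*block_size, d)
        vs = vb[sel[i]].reshape(-1, d)
        attn = F.softmax(qb[i] @ ks.T * scale, dim=-1) # dense only within
        out[i] = attn @ vs                             # the selected blocks
    return out.reshape(n, d)

# Tiny demo: 1,024 tokens, 64-dim head.
q, k, v = (torch.randn(1024, 64) for _ in range(3))
print(block_sparse_attention(q, k, v).shape)  # torch.Size([1024, 64])
```

With nb key blocks but only topk visited per query block, attention FLOPs shrink by roughly nb/topk, and that ratio grows with sequence length; at 1M tokens this is the kind of arithmetic that can underlie a headline figure like a 62.8× FLOP reduction.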
