@ericzakariasson Totally awesome. Completely replaced Claude 4.7 and Codex. One job refactoring a repo took hours with Claude. Took minutes with Composer 2.5. Keep going. Love it.
what do you think of composer 2.5 so far? how can we make the next model even better?
want to hear your feedback on behavior, speed, quality, whatever!
Uh Julia...this is from Google (via Gemini 3.5 Flash). Not sure your social attention vs transformer attention difference holds. "Imagine you're at a massive, chaotic cocktail party, and you're trying to have a conversation with someone.
There are a hundred other people talking at the same time. If you tried to listen to every single word from every single person with equal weight, your brain would melt.
Instead, you do something clever: you pay attention. You tune out the guy bragging about his crypto portfolio, ignore the background music, and laser-focus on the specific words coming out of the person in front of you.
That is exactly what Transformer Attention (specifically "Self-Attention") does for AI.
The Old Way vs. The Attention Way
Before transformers, AI read text like a conveyor belt. It looked at word one, then word two, then word three. By the time it got to the end of a long paragraph, it had already started to "forget" what happened at the beginning.
Transformers don't do that. They look at the whole sentence all at once, and they figure out which words dynamically relate to each other.
The Classic Example: "It"
Look at these two sentences:
The animal didn't cross the street because it was too tired.
The animal didn't cross the street because it was too wide.
As a human, you instantly know that in sentence one, "it" refers to the animal. In sentence two, "it" refers to the street.
An old AI would get incredibly confused by this. A Transformer uses Attention to solve it. When processing the word "it," the model looks at every other word in the sentence and asks, "Who gives me the most context right now?"
In the first sentence, the word "tired" shines a giant spotlight back on animal.
In the second sentence, the word "wide" shines that same spotlight on street.
How It Works Behind the Scenes (The Dating App Analogy)
In the actual code, every single word goes through a matching process. Think of it like a dating app algorithm where words are trying to find their best matches. Every word is given three roles:
The Query (What I'm looking for): The word says, "Hey, this is my current vibe, and this is what I need to make sense of myself."
The Key (What I offer): The word says, "This is my identity and what I bring to the table if someone else needs me."
The Value (My actual meaning): The raw information the word contributes once a match is made.
The AI multiplies the Queries and the Keys together to calculate an "attention score" (a match percentage). If the match is high, it pulls in a lot of that word's Value.
When you type a prompt into an AI, it runs this matching game for every single word, across multiple layers, hundreds of times simultaneously (called Multi-Head Attention). It looks at the sentence from a grammatical angle, a tense angle, an emotional angle, and so on.
The result? The AI doesn't just read the words; it understands the relationships between them."
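The Query/Key/Value matching in the quote above can be sketched in a few lines. This is a minimal, single-head illustration using only the standard library. The toy vectors for "animal", "street" and "it" are made up for the example; a real model learns these projections, uses far more dimensions, and runs many heads and layers.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability, then normalize to sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # "Match score": dot product of the query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output: average of the values, weighted by the attention weights.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical toy vectors (assumptions, not taken from any real model).
tokens = ["animal", "street", "it"]
keys = [[2.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query_for_it = [[1.0, 0.0]]  # "it" in the "too tired" sentence

outputs, weights = attention(query_for_it, keys, values)
best = tokens[weights[0].index(max(weights[0]))]
print(best, [round(w, 3) for w in weights[0]])
```

With these made-up numbers the query for "it" lines up best with the key for "animal", so most of the weight lands there. The mechanism is just a softmax over dot products followed by a weighted average of the values.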
This is why we need to gate-keep science. Two pseudo-intellectuals thinking they discovered something deep, conflating
>social attention
>transformers attention
>quantum physics observer (attention)
These have nothing in common, other than the ambiguity of the English language. Naked ladies on Instagram have nothing to do with a weighted average followed by softmax.
But they're both so mind-blown by their discovery. Dunning–Kruger will only get amplified by AI sycophancy.
Please call me out if you see me going beyond my own DK threshold.
@juliarturc Funny. I frequently code in Cursor with a Hallmark movie in the background. I may have even watched a video or two on 4bit LLM models you did while watching a movie. I'm curious, does your coding get better when they start showing Christmas movies?
🚨 BREAKING: OpenAI unveils "Quark" - a wild new AI device designed by Jony Ive!
👂 Features Ferengi-like ear pieces & forehead scanner
⚡ 8-hour battery, 680g weight
💰 $4,000 price tag + ChatGPT5 subscription
The future of wearable AI is here... and it looks absolutely bonkers!
#OpenAI #AI #TechNews #JonyIve #Quark #ChatGPT5 #WearableTech
Maybe the SecDef should use Grok 3 more and Signal less...
In summary, the Secretary of Defense using Signal for official communications poses multiple problems: it’s an unauthorized platform for sensitive matters, leaves personal devices vulnerable to hacking, increases the risk of human error, ignores explicit DoD warnings, violates federal recordkeeping requirements, and creates an impression of recklessness. While Signal is excellent for civilian privacy, it falls short of the rigorous security, legal, and operational standards required for such a critical role, making its use a significant concern.