Outcome School

1.6K posts

@outcome_school

Get a high-paying tech job. Software engineers like you join Outcome School to achieve one outcome: a high-paying tech job.

Internet · Joined June 2024
2 Following · 1K Followers
Pinned Tweet
Outcome School (@outcome_school)
Our recent 6 articles on X:
- KV Cache in LLMs
- Paged Attention in LLMs
- Causal Masking in Attention
- Byte Pair Encoding in LLMs
- Harness Engineering in AI
- Math behind Attention - Q, K, and V

X is a knowledge sharing platform.
Amit Shekhar (@amitiitbhu)

x.com/i/article/2039…

Outcome School retweeted
Amit Shekhar (@amitiitbhu)
Why do frameworks fuse softmax and cross-entropy into one operation? Because exp() of large numbers overflows, and exp() of very negative numbers underflows to 0, which makes log() return -inf. The fused version avoids both. Math that works on paper can break on hardware.
Amit Shekhar (@amitiitbhu)

x.com/i/article/2046…
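A minimal sketch of the difference, assuming plain Python floats and natural logs. The function names are illustrative, not any framework's API. The fused form uses the log-sum-exp identity, loss = logsumexp(logits) - logits[target], with the maximum logit subtracted before exponentiating.

```python
import math

def naive_cross_entropy(logits, target):
    """Two-step version: softmax first, then log. Breaks on extreme logits."""
    exps = [math.exp(z) for z in logits]   # math.exp raises OverflowError for z above roughly 709
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[target])        # math.log(0.0) raises ValueError

def fused_cross_entropy(logits, target):
    """Fused version: loss = logsumexp(logits) - logits[target]."""
    m = max(logits)                        # shift so the largest exponent is exp(0) = 1
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]
```

With logits `[1000.0, 0.0]` and target `1`, the naive version raises `OverflowError`, while the fused version returns `1000.0`. For moderate logits the two agree up to rounding.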

Outcome School (@outcome_school)
Chain-of-Thought vs Direct Answer. Same question. Two different ways to ask the LLM.
GIF
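As a rough illustration of the two ways of asking, here is a sketch that builds both prompts for the same question. The question text, the instruction wording, and the function names are all hypothetical, and no particular LLM API is assumed.

```python
QUESTION = "A shop sells pens at 3 for $2. How much do 12 pens cost?"  # hypothetical example

def direct_prompt(question: str) -> str:
    """Ask for the result only, with no visible reasoning."""
    return f"{question}\nAnswer with only the final result."

def chain_of_thought_prompt(question: str) -> str:
    """Ask the model to reason step by step before answering."""
    return f"{question}\nThink through the problem step by step, then state the final answer."
```

Both prompts carry the same question. Only the instruction that follows it changes.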
Outcome School retweeted
Amit Shekhar (@amitiitbhu)
Cross-entropy penalizes hedging moderately and confident mistakes severely. Saying ‘33% each’ when unsure is okay. Saying ‘97% wrong answer’ is catastrophic. The log makes this distinction automatic.
Amit Shekhar (@amitiitbhu)

x.com/i/article/2046…
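A small sketch of the hedging case versus the confident mistake, assuming a three-class problem, natural logs, and made-up probability vectors. In the confident-wrong case, the correct class is assumed to receive 2%.

```python
import math

def cross_entropy(predicted, correct_index):
    """Loss for one example: -log of the probability given to the correct class."""
    return -math.log(predicted[correct_index])

CORRECT = 1                              # index of the true class (assumed)
hedged = [1 / 3, 1 / 3, 1 / 3]           # "33% each"
confident_wrong = [0.97, 0.02, 0.01]     # 97% on a wrong class
confident_right = [0.02, 0.97, 0.01]     # 97% on the correct class

hedged_loss = cross_entropy(hedged, CORRECT)                    # equals ln(3)
confident_wrong_loss = cross_entropy(confident_wrong, CORRECT)  # equals -ln(0.02)
confident_right_loss = cross_entropy(confident_right, CORRECT)  # equals -ln(0.97)
```

Under these assumptions, the hedged prediction costs about 1.10, the confident mistake about 3.91, and the confident correct prediction about 0.03.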

Outcome School retweeted
Pallavi (@pallavishekhar_)
The Multi-Agent Handoff

One agent cannot do everything well. Hence, specialists come into the picture.

Multi-Agent Handoff = delegate + work + return.

Here is the flow:
- The user asks Agent A a question.
- Agent A plans and decides to hand the work off.
- Agent A delegates the subtask to Agent B.
- Agent B does the work.
- Agent B returns the result to Agent A.
- Agent A returns the final answer to the user.

Agent A is the coordinator. Agent B is the specialist. Each one does what it is best at. That is the power of the handoff pattern.

Use case: A research agent hands off math problems to a math specialist agent.

Watch the GIF for the full flow.
GIF
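The delegate + work + return flow can be sketched with two toy classes. Everything here is hypothetical: the class names, the "what is ..." routing rule, and the arithmetic-only specialist stand in for real agents, which would normally call an LLM.

```python
import operator

class MathSpecialist:
    """Agent B: a toy specialist that only evaluates '<a> <op> <b>'."""
    OPS = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.truediv}

    def work(self, subtask: str) -> float:
        left, symbol, right = subtask.split()
        return self.OPS[symbol](float(left), float(right))

class Coordinator:
    """Agent A: plans, delegates to the specialist, and returns the final answer."""
    def __init__(self, specialist: MathSpecialist):
        self.specialist = specialist
        self.trace = []                               # records the steps taken

    def answer(self, question: str) -> str:
        self.trace.append("plan")
        prefix = "what is "
        if question.lower().startswith(prefix):       # toy rule for "this is a math problem"
            subtask = question[len(prefix):].rstrip("?")
            self.trace.append("delegate")
            result = self.specialist.work(subtask)    # Agent B does the work
            self.trace.append("return")
            return f"The answer is {result:g}."
        return "I can answer this myself."

coordinator = Coordinator(MathSpecialist())
reply = coordinator.answer("What is 12 * 7?")
```

Here `reply` is `"The answer is 84."` and `coordinator.trace` is `["plan", "delegate", "return"]`, mirroring the steps in the flow.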
Outcome School retweeted
Amit Shekhar (@amitiitbhu)
Cross-entropy answers the simplest question in machine learning: how surprised are you by the correct answer? Not surprised at all? Low loss. Very surprised? High loss. -log(p) quantifies surprise.
Amit Shekhar (@amitiitbhu)

x.com/i/article/2046…
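Why the loss reduces to -log(p) can be written out in one line, assuming a one-hot target y, with c as the index of the correct class and p_c the probability the model assigns to it. The symbols and the sample values are illustrative, and the numbers use the natural log.

```latex
\[
H(y, p) = -\sum_{i} y_i \log p_i = -\log p_c
\qquad \text{since } y_c = 1 \text{ and } y_i = 0 \text{ for } i \neq c
\]
\[
p_c = 1 \Rightarrow -\ln p_c = 0, \qquad
p_c = 0.5 \Rightarrow -\ln p_c \approx 0.69, \qquad
p_c = 0.01 \Rightarrow -\ln p_c \approx 4.61
\]
```

The loss is zero when the correct answer was fully expected and grows without bound as p_c approaches 0.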

Outcome School retweeted
Amit Shekhar (@amitiitbhu)
The next decade belongs to three domains:
• Math & research in AI
• Data centers
• Robotics

Pick one. Go deep.
Outcome School retweeted
Amit Shekhar (@amitiitbhu)
Physics tells you what's happening. Math tells you why it had no choice.
Outcome School retweeted
Pallavi (@pallavishekhar_)
A model that says '97% cat' when the answer is dog gets punished about 5x harder than a model that says '50% cat'. Cross-entropy does not just penalize mistakes. It destroys overconfidence.
Amit Shekhar (@amitiitbhu)

x.com/i/article/2046…
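A quick check of the arithmetic, assuming a two-class cat/dog problem (so the probability given to dog is one minus the probability given to cat) and natural logs. The symbols L_97 and L_50 are introduced here only for illustration.

```latex
\[
L_{97} = -\ln(1 - 0.97) = -\ln 0.03 \approx 3.51, \qquad
L_{50} = -\ln(1 - 0.50) = -\ln 0.5 \approx 0.69
\]
\[
\frac{L_{97}}{L_{50}} \approx \frac{3.51}{0.69} \approx 5.1
\]
```

Under these assumptions, the overconfident model's loss is roughly five times that of the model that split its bet evenly.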

Outcome School retweeted
Pallavi (@pallavishekhar_)
The feed-forward network is the most underrated part of the Transformer. It holds most of the parameters, stores most of the knowledge, and runs on every single token. Yet we barely talk about it.
Amit Shekhar (@amitiitbhu)

x.com/i/article/2043…
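A back-of-the-envelope count for one Transformer block, assuming the common layout of four d_model x d_model attention projections and a two-layer feed-forward network with hidden size 4 x d_model. Biases, layer norms, and embeddings are ignored, and gated feed-forward variants would give different numbers. The function name is illustrative.

```python
def block_weight_counts(d_model: int, ffn_multiplier: int = 4) -> dict:
    """Count weight-matrix parameters in one block under the assumptions above."""
    d_ff = ffn_multiplier * d_model
    attention = 4 * d_model * d_model    # W_Q, W_K, W_V, W_O
    feed_forward = 2 * d_model * d_ff    # up-projection and down-projection
    total = attention + feed_forward
    return {
        "attention": attention,
        "feed_forward": feed_forward,
        "ffn_share": feed_forward / total,
    }

counts = block_weight_counts(d_model=512)
```

With `d_model=512`, attention has 1,048,576 weights and the feed-forward network has 2,097,152, so the feed-forward share of the block is two-thirds under these assumptions.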

Outcome School (@outcome_school)
Q × Kᵀ tells the model how relevant every word is to every other word. Softmax turns that into probabilities. V delivers the actual content. One formula. Three steps. The entire foundation of modern AI.
Amit Shekhar (@amitiitbhu)

x.com/i/article/2039…
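The three steps can be sketched in plain Python on nested lists. This is a minimal single-head version with no masking. It also divides the scores by sqrt(d_k), as the standard scaled dot-product formula does, which the post above does not mention. The function names are illustrative.

```python
import math

def softmax(row):
    """Turn one row of scores into probabilities (max subtracted for stability)."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Return (output, weights) for softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    # Step 1: relevance of every query to every key
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d_k)
               for k_row in K] for q_row in Q]
    # Step 2: each row of scores becomes a probability distribution
    weights = [softmax(row) for row in scores]
    # Step 3: each output row is a weighted sum of the rows of V
    output = [[sum(w * v_row[col] for w, v_row in zip(w_row, V))
               for col in range(len(V[0]))] for w_row in weights]
    return output, weights

out, w = attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
```

In this example both keys score the same against the query, so the weights are 0.5 each and the output is the average of the two value rows, `[2.0, 3.0]`.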
