Santa

667 posts

@nottherealsanta

consciousness exists to seek beauty

Joined December 2011
383 Following · 79 Followers
Santa @nottherealsanta
still cannot find out what the intro music from @aiDotEngineer is
Replies 0 · Reposts 0 · Likes 0 · Views 13
Dwarkesh Patel @dwarkesh_sp
Monte Carlo Tree Search training corrects the model move by move, while current LLM training only tells it whether the whole trajectory worked. MCTS is preferable if you can get it. But nobody's managed to get MCTS to work for language models. In his blackboard lecture, @ericjang11 talked to me about why:
Replies 28 · Reposts 93 · Likes 1.1K · Views 175.8K
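
The contrast in the post above, a signal per move versus one signal for the whole trajectory, can be pictured with a toy sketch. This is not MCTS and not any lab's training code. The function names and the `search_preferred` list (a stand-in for whatever a search procedure would have picked at each step) are hypothetical, introduced only for illustration.

```python
from typing import List


def outcome_only_signal(num_moves: int, trajectory_succeeded: bool) -> List[float]:
    """Trajectory-level feedback: every move gets the same scalar."""
    reward = 1.0 if trajectory_succeeded else 0.0
    return [reward] * num_moves


def per_move_signal(chosen: List[str], search_preferred: List[str]) -> List[float]:
    """Move-level feedback: each move is scored against a per-step target.

    `search_preferred` is a hypothetical stand-in for the move a search
    procedure would have preferred at that step.
    """
    if len(chosen) != len(search_preferred):
        raise ValueError("both move lists must have the same length")
    return [1.0 if c == s else 0.0 for c, s in zip(chosen, search_preferred)]


chosen_moves = ["a", "b", "c"]
preferred_moves = ["a", "x", "c"]

# A failed trajectory marks all three moves as bad, including the two good ones.
trajectory_feedback = outcome_only_signal(len(chosen_moves), trajectory_succeeded=False)

# Per-move feedback singles out the second move as the mistake.
move_feedback = per_move_signal(chosen_moves, preferred_moves)
```

In this toy case the trajectory-level signal is `[0.0, 0.0, 0.0]`, while the per-move signal is `[1.0, 0.0, 1.0]`: only the second move is flagged.
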
Santa @nottherealsanta
Is Gemini 3.5 Flash using MTP?
Replies 0 · Reposts 0 · Likes 0 · Views 36
Santa @nottherealsanta
Whoa, a question that tells so much about people’s worldview. I wonder how people would act if we ran a ‘fake’ poll first and made the results known before holding a second poll that actually counts.
Tim Urban @waitbutwhy

Everyone in the world has to take a private vote by pressing a red or blue button. If more than 50% of people press the blue button, everyone survives. If less than 50% of people press the blue button, only people who pressed the red button survive. Which button would you press?

Replies 0 · Reposts 0 · Likes 0 · Views 14
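
The rules of the quoted poll can be written down as a small function, which makes the asymmetry between the two buttons easier to see. This is a minimal sketch; the function names are hypothetical. The quoted post does not say what happens at exactly 50%, so that case is returned as "unspecified" rather than guessed.

```python
def poll_outcome(blue_votes: int, red_votes: int) -> str:
    """Return who survives under the rules stated in the quoted post."""
    total = blue_votes + red_votes
    if total <= 0 or blue_votes < 0 or red_votes < 0:
        raise ValueError("vote counts must be non-negative and not both zero")
    if 2 * blue_votes > total:   # more than 50% pressed blue
        return "everyone"
    if 2 * blue_votes < total:   # less than 50% pressed blue
        return "red only"
    return "unspecified"         # exactly 50%: not covered by the post


def voter_survives(choice: str, blue_votes: int, red_votes: int) -> bool:
    """Whether a voter with the given choice ('red' or 'blue') survives."""
    if choice not in ("red", "blue"):
        raise ValueError("choice must be 'red' or 'blue'")
    outcome = poll_outcome(blue_votes, red_votes)
    if outcome == "unspecified":
        raise ValueError("the post does not define the exactly-50% case")
    return outcome == "everyone" or choice == "red"
```

Under these rules a red voter survives in every defined case, while a blue voter survives only if blue wins a majority.
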
Santa @nottherealsanta
NLA + RL is the next thing?
Replies 0 · Reposts 0 · Likes 0 · Views 3
Santa @nottherealsanta
Natural Language Autoencoders = LLM mind reading
Replies 0 · Reposts 0 · Likes 0 · Views 3
Santa @nottherealsanta
context > memory
Replies 0 · Reposts 0 · Likes 0 · Views 0
Santa @nottherealsanta
maybe we need OSI-model-like layers for AI
[image]
Replies 0 · Reposts 0 · Likes 1 · Views 8
Demis Hassabis @demishassabis
Hard to believe it’s been 10 years since AlphaGo! It was wonderful to catch up with Lee Sae Dol last week in Korea and join Shin Jin-seo for a special Go match. Great to reminisce about AlphaGo & super interesting to hear how it changed the way players approach the game of Go!
[2 images]
Demis Hassabis @demishassabis

#AlphaGo WINS!!!! We landed it on the moon. So proud of the team!! Respect to the amazing Lee Sedol too

Replies 115 · Reposts 320 · Likes 3.5K · Views 359.1K
Santa @nottherealsanta
🤔
[4 images]
Replies 0 · Reposts 0 · Likes 0 · Views 4
Santa reposted
kache @yacineMTB
you can outsource your thinking but you cannot outsource your understanding
Replies 260 · Reposts 4K · Likes 17.6K · Views 2.7M
Santa @nottherealsanta
Here’s the quick trick for VRAM: a 27B-parameter model in 8-bit quantization needs roughly 27 GB just for the weights. Rule of thumb, in GB per billion parameters (B):
• 16-bit (FP16) → ~2 GB / B
• 8-bit → ~1 GB / B
• 4-bit → ~0.5 GB / B
Add overhead for the KV cache, activations, etc.
Replies 0 · Reposts 0 · Likes 0 · Views 15
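
The rule of thumb above is just bits per parameter divided by 8, times the parameter count. A minimal sketch of that arithmetic follows, assuming decimal gigabytes (1 GB = 10^9 bytes). The function names and the `overhead_fraction` parameter are hypothetical. The post gives no figure for KV-cache and activation overhead, so the caller has to supply one.

```python
def weight_vram_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal GB.

    One billion parameters at 8 bits each is 10^9 bytes, i.e. about 1 GB.
    """
    if params_billions <= 0 or bits_per_param <= 0:
        raise ValueError("parameter count and bit width must be positive")
    bytes_per_param = bits_per_param / 8
    return params_billions * bytes_per_param


def total_vram_gb(params_billions: float, bits_per_param: int,
                  overhead_fraction: float) -> float:
    """Weights plus a caller-supplied overhead for KV cache, activations, etc.

    `overhead_fraction` is an assumption the caller must choose; it depends on
    context length, batch size, and the serving stack.
    """
    if overhead_fraction < 0:
        raise ValueError("overhead_fraction must be non-negative")
    return weight_vram_gb(params_billions, bits_per_param) * (1 + overhead_fraction)


for bits in (16, 8, 4):
    print(f"27B at {bits}-bit: ~{weight_vram_gb(27, bits):.1f} GB for weights")
```

For the 27B example this gives about 54 GB at 16-bit, 27 GB at 8-bit, and 13.5 GB at 4-bit, before any overhead is added.
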
Santa @nottherealsanta
a picture is worth a thousand words but costs around the same number of tokens
Replies 0 · Reposts 0 · Likes 0 · Views 7