Stephen Roller

6.6K posts

@stephenroller

MTS @thinkymachines. previously pre-training @googledeepmind, @character_ai, and @aiatmeta.

NYC · Joined February 2008
1.3K Following · 5.7K Followers
Stephen Roller retweeted
Cute @cutecorestar
.
[image]
4
170
689
11.8K
Stephen Roller retweeted
Soumith Chintala @soumithchintala
Cluster magicians and GPU whisperers, come join us! We’re looking for supercomputing engineers to build the infrastructure behind real-time interactive models, Tinker, and large-scale training: scheduling, storage, networking, reliability, and distributed systems at scale. Hiring in NYC and SF job-boards.greenhouse.io/thinkingmachin…
26
32
582
49.2K
will depue @willdepue
Tired of holding your laptop half open to keep your agents running? Introducing AgentPlug: A USB-C dummy plug that keeps your Mac in clamshell mode by pretending to be an external display! No commands, no security worries (just pull it out to stop!), no hassle.
[2 images]
485
185
7.5K
1.2M
Stephen Roller retweeted
Thinking Machines @thinkymachines
People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/interacti…
444
1.9K
15K
7.1M
Stephen Roller retweeted
地獄ケーキ(Hokusaist)👹🐉🗡️🇺🇸
This is the kind of shit Godspeed You! Black Emperor would use as an album cover with a title like "And the Holy Flame Blinded us for 333000 Years Pt. II"
[image]
51
732
6.7K
180K
Stephen Roller @stephenroller
@BeidiChen imho hash layers are the logistic regression of routers — a surprisingly strong baseline that you absolutely must beat
0
1
7
769
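A minimal sketch of hash routing, for readers unfamiliar with the baseline discussed in the reply above. It assumes the simplest variant, in which each token ID is mapped to an expert through a fixed random lookup table, so the router has no trainable parameters. The names `build_hash_table`, `route`, and `moe_layer`, and the toy experts, are hypothetical illustrations, not code from the paper or from any library.

```python
import random


def build_hash_table(vocab_size, num_experts, seed=0):
    """Assign every token ID to one expert, once, with a fixed random draw.

    Assumption: a plain random assignment; balanced or clustered
    assignments are other possible choices and are not shown here.
    """
    rng = random.Random(seed)
    return [rng.randrange(num_experts) for _ in range(vocab_size)]


def route(token_ids, table):
    """Look up the expert index for each token. Nothing here is learned."""
    return [table[t] for t in token_ids]


def moe_layer(token_ids, hidden, experts, table):
    """Apply to each hidden vector the expert selected by its token ID."""
    return [experts[table[t]](h) for t, h in zip(token_ids, hidden)]


# Toy experts: expert k scales its input by (k + 1). Purely illustrative.
NUM_EXPERTS = 4
experts = [lambda h, k=k: [x * (k + 1) for x in h] for k in range(NUM_EXPERTS)]

table = build_hash_table(vocab_size=100, num_experts=NUM_EXPERTS, seed=0)
token_ids = [5, 7, 5]
hidden = [[1.0, 2.0], [3.0, 4.0], [1.0, 2.0]]
outputs = moe_layer(token_ids, hidden, experts, table)
```

Because the assignment depends only on the token ID, the same token always reaches the same expert, which is what makes this a cheap baseline to compare a trainable router against.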
Beidi Chen @BeidiChen
@stephenroller That was a compelling debate on hash-layer MoE vs. trainable routers—especially valuable with so many experts weighing in now ~
1
0
7
1.2K
Stephen Roller retweeted
Jason Weston @jaseweston
DeepSeek-V4 uses our Hash routing approach developed back in 2021 -- see screenshot of their tech report! (Looks like a great model, congrats!) Bonus note: our same blogpost (& paper) back in 2021 also introduced 'looped transformers', but we called that staircase & ladder (see screenshot): parl.ai/projects/param… huggingface.co/deepseek-ai/De…
[2 images]
0
38
456
31.4K
Stephen Roller retweeted
Jacob van Gogh @JayArrVeeGee
me: Make me the most AI slop image that ever AI slopped. The pinnacle of slop. A seminal work on AI slop. ChatGPT Images 2.0:
[image]
212
199
2.6K
912.1K
Stephen Roller retweeted
Cat TikToks @CatTikToks
[image]
2
142
1.3K
16.3K
Stephen Roller retweeted
Tinker @tinkerapi
Long context windows are now available for select models on Tinker!
- 128k tokens for Kimi K2.5 and GPT-OSS-120B
- 256k for Nemotron 3 Super 120B and Qwen3.5 397B
For more details and pricing, see our full model lineup: tinker-docs.thinkingmachines.ai/tinker/models/
2
6
135
12.6K
Boris Cherny @bcherny
Today we're excited to announce NO_FLICKER mode for Claude Code in the terminal.
It uses an experimental new renderer that we're excited about. The renderer is early and has tradeoffs, but already we've found that most internal users prefer it over the old renderer. It also supports mouse events (yes, in a terminal).
Try it: CLAUDE_CODE_NO_FLICKER=1 claude
Curt Tigges @CurtTigges

@bcherny @UltraLinx please at least fix the uncontrollable scrolling/flickering before the next 3000 features

663
705
10.3K
2.9M
Stephen Roller retweeted
Brydon Eastman @brhydon
I heard ASL-5 is when the Claude code TUI stops flickering in tmux
0
2
16
1.6K
Stephen Roller retweeted
Cameron 🇺🇸 🗽🦅 @CameronCorduroy
this is what actual national suicide looks like btw
[image]
52
1.1K
10.8K
933.7K
Stephen Roller retweeted
Mira Murati @miramurati
Grateful to Jensen and @nvidia team for their support. Together, we’re working to deploy at least 1GW of Vera Rubin systems, bringing adaptable collaborative AI to everyone. thinkingmachines.ai/nvidia-partner…
[image]
169
284
3.9K
558.2K
Stephen Roller retweeted
Eric Zhang @ekzhang1
times are hard. had to teach an AI researcher how to use kubernetes today
31
20
1.1K
76.5K
Stephen Roller retweeted
Larissa Schiavo @lfschiavo
things are gonna get weird. you must get commensurately weird.
37
61
567
29.1K
Stephen Roller @stephenroller
@bilaltwovec i like to imagine myself carrying a node out of the burning dc when i do squats
1
0
1
96
bilal @bilaltwovec
@stephenroller damn i can only bench a single gb200..mo(r)e weight training needed
1
0
1
233
Stephen Roller @stephenroller
A fully-loaded nvl72 rack weighs 3,300lbs. Meaning the per-gpu weight is ~45lb, or one standard olympic plate.
1
0
11
1.1K
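The per-GPU figure in the post above can be checked with a few lines of arithmetic. This sketch takes the 3,300 lb rack weight and the 45 lb plate from the post itself, and assumes 72 GPUs per rack as the "nvl72" name suggests. The constant names are illustrative only.

```python
RACK_WEIGHT_LB = 3300      # fully loaded rack weight, as stated in the post
GPUS_PER_RACK = 72         # assumption: read off the "nvl72" name
OLYMPIC_PLATE_LB = 45      # plate weight, as stated in the post

# Rack weight divided evenly across the GPUs it houses.
per_gpu_lb = RACK_WEIGHT_LB / GPUS_PER_RACK
plates_per_gpu = per_gpu_lb / OLYMPIC_PLATE_LB

print(f"{per_gpu_lb:.1f} lb per GPU")        # 45.8 lb per GPU
print(f"{plates_per_gpu:.2f} plates per GPU")  # 1.02 plates per GPU
```

The result, about 45.8 lb, is within 2% of a single plate, consistent with the "~45lb" in the post.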