Matt Krzus

566 posts

@mattkrzus

research enjoyoor. jiu jitsu black belt. roadhouse is the greatest movie ever made.

Chicago · Joined May 2010
1.7K Following · 103 Followers
Matt Krzus @mattkrzus
i hate the gi (just kidding bb i love you)
Matt Krzus @mattkrzus
come hang - i’ll be giving free leglocks
uRun @urunml

urun.sh launch party - Wednesday, April 29 · 6PM:
🕹️ Arcade games
🍹 Open bar
💻 Live demos
🥽 Meta Quest Giveaway
Spots are limited - click the link to grab your invite.
👉 luma.com/3vemq53b

Matt Krzus @mattkrzus
as i try to solve the problem of making a robot lawyer go brrr, i found myself on a small grpo sidequest. tldr: it's probably a nothingburger, but i hacked together a faster-than-vanilla grpo implementation w/ triton/cuda streams. it was beating torch compile, so that's cool
Matt Krzus @mattkrzus
mech interp has been growing on me as of late so i wrote some code/words on replicating the circuit kings
Matt Krzus @mattkrzus
[image]
DeepSeek @deepseek_ai

🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access
Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.
⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster
⚡ 3.66 TiB/min throughput on GraySort benchmark in a 25-node cluster
⚡ 40+ GiB/s peak throughput per client node for KVCache lookup
🧬 Disaggregated architecture with strong consistency semantics
✅ Training data preprocessing, dataset loading, checkpoint saving/reloading, embedding vector search & KVCache lookups for inference in V3/R1
📥 3FS → github.com/deepseek-ai/3FS
⛲ Smallpond - data processing framework on 3FS → github.com/deepseek-ai/sm…

Matt Krzus @mattkrzus
based whale. i’ve got like 20 tweets and half are whale quote tweets.
DeepSeek @deepseek_ai

🚀 Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies
✅ DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
🔗 github.com/deepseek-ai/Du…
✅ EPLB - an expert-parallel load balancer for V3/R1.
🔗 github.com/deepseek-ai/ep…
📊 Analyze computation-communication overlap in V3/R1.
🔗 github.com/deepseek-ai/pr…

Matt Krzus @mattkrzus
based whale
DeepSeek @deepseek_ai

🚀 Day 0: Warming up for #OpenSourceWeek! We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency. These humble building blocks in our online service have been documented, deployed and battle-tested in production. As part of the open-source community, we believe that every line shared becomes collective momentum that accelerates the journey. Daily unlocks are coming soon. No ivory towers - just pure garage-energy and community-driven innovation.

kalomaze @kalomaze
which of these was your "first" exposure to the concept of machine learning
Matt Krzus @mattkrzus
wrote a silly little blog about finding second amendment attn heads w/in the qwen-r1 distill
Matt Krzus @mattkrzus
this is probably the most uniquely human thing: to turn off your moral compass and do your job. it’s also, probably, most attainable by open-source models where you can turn on/off constitutional features.