Matt Krzus

566 posts

@mattkrzus

research enjoyoor. jiu jitsu black belt. roadhouse is the greatest movie ever made.

Chicago · Joined May 2010

1.7K Following · 103 Followers

Matt Krzus@mattkrzus·2 May

sidakov has the most teachable system of all time. just so good.

Wrestling Rabbit Hole@wrestlingrh

Sidakov's Handfight System youtu.be/lV_8E3C8Bv8

Matt Krzus@mattkrzus·29 Apr

i hate the gi (just kidding bb i love you)

Matt Krzus@mattkrzus·27 Apr

come hang - i’ll be giving free leglocks

uRun@urunml

urun.sh launch party - Wednesday, April 29 · 6PM: 🕹️ Arcade games 🍹 Open bar 💻 Live demos 🥽 Meta Quest Giveaway Spots are limited - click the link to grab your invite. 👉 luma.com/3vemq53b

Matt Krzus@mattkrzus·9 Mar

this is why i have a black belt. when my job is completely automated, at least i can fall back on teaching people how to enter cross ashi from seated positions.

himanshu@himanshustwts

a significant % of ml researchers might be hooked by what happened in ONE day. ai seems to be doing a research loop fascinatingly well (understand the problem + propose a change + train/test it + measure results + keep the better version + repeat) and genuinely reducing research friction. we are early to automated experimentation, frontier scale could be an interesting watch.

Matt Krzus@mattkrzus·9 Oct

blog: mattkrzus.com/posts/flashgrp… code: github.com/KayneWest/flas…

Matt Krzus@mattkrzus·9 Oct

as i try to solve the problem of making a robot lawyer go brrr, i found myself on a small grpo sidequest. tldr: it's probably a nothing burger, but i hacked together a faster than vanilla grpo implementation w triton/cuda streams. it was beating torch compile, so that's cool

Matt Krzus@mattkrzus·16 Apr

link to lib: github.com/Safyrus/NES_PW…

Matt Krzus@mattkrzus·16 Apr

based. i had been grpo-ing over reasoning tasks from ace attorney. need more legal games like something fierce

Hao AI Lab@haoailab

When Ilya Sutskever once explained why next-word prediction leads to intelligence, he made a metaphor: if you can piece together the clues and deduce the criminal’s name on the last page, you have a real understanding of the story. 🕵️‍♂️ Inspired by that idea, we turned to Ace Attorney to test AI's reasoning. It’s the perfect stage: the AI plays as a detective to collect clues, expose contradictions, and uncover the truth. We put the latest top AI models—GPT-4.1, Gemini 2.5 Pro, Llama-4 Maverick, and more—to the test in Ace Attorney, to see if they could shout Objection! ⚖️, turn the case around, and uncover the truth behind the lies.

Matt Krzus@mattkrzus·13 Apr

mattkrzus.com/posts/circuit_…

Matt Krzus@mattkrzus·13 Apr

mech interp has been growing on me as of late so i wrote some code/words on replicating the circuit kings

Matt Krzus@mattkrzus·1 Mar

DeepSeek@deepseek_ai

🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks. ⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster ⚡ 3.66 TiB/min throughput on GraySort benchmark in a 25-node cluster ⚡ 40+ GiB/s peak throughput per client node for KVCache lookup 🧬 Disaggregated architecture with strong consistency semantics ✅ Training data preprocessing, dataset loading, checkpoint saving/reloading, embedding vector search & KVCache lookups for inference in V3/R1 📥 3FS → github.com/deepseek-ai/3FS ⛲ Smallpond - data processing framework on 3FS → github.com/deepseek-ai/sm…

Matt Krzus@mattkrzus·27 Feb

the race to give your legal model the actual personality of a real life judge, with that judge’s morals/ethics/etc is on. c thomas, posner, scalia, etc.

Nathan Lambert@natolambert

In 2023 and 2024 labs perfected the listicle with post-training/rlhf. In 2025 the personality training of models is on center stage. There's almost 0 academic work on Character Training and almost 0 work on the web writ large. We need to change that - it starts with this post.

Matt Krzus@mattkrzus·27 Feb

@Ishaank1999 they clappin

Ishaan Kapoor@Ishaank1999·27 Feb

@mattkrzus deepcheeks my fav llm fr

Matt Krzus@mattkrzus·27 Feb

based whale. i’ve got like 20 tweets and half are whale quote tweets.

DeepSeek@deepseek_ai

🚀 Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies ✅ DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training. 🔗 github.com/deepseek-ai/Du… ✅ EPLB - an expert-parallel load balancer for V3/R1. 🔗 github.com/deepseek-ai/ep… 📊 Analyze computation-communication overlap in V3/R1. 🔗 github.com/deepseek-ai/pr…

Matt Krzus@mattkrzus·24 Feb

xmas in jan

lujianqiao@lujianqiao3

🚀 Exciting News! We’ve just released [Native Sparse Attention Triton](github.com/XunhaoLai/nati…) – a full Triton-based implementation of NSA (fwd + bwd) with guaranteed efficiency! 🚀

Matt Krzus@mattkrzus·24 Feb

bunch of beauties

DeepSeek@deepseek_ai

🚀 Day 1 of #OpenSourceWeek: FlashMLA Honored to share FlashMLA - our efficient MLA decoding kernel for Hopper GPUs, optimized for variable-length sequences and now in production. ✅ BF16 support ✅ Paged KV cache (block size 64) ⚡ 3000 GB/s memory-bound & 580 TFLOPS compute-bound on H800 🔗 Explore on GitHub: github.com/deepseek-ai/Fl…

Matt Krzus@mattkrzus·21 Feb

based whale

DeepSeek@deepseek_ai

🚀 Day 0: Warming up for #OpenSourceWeek! We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency. These humble building blocks in our online service have been documented, deployed and battle-tested in production. As part of the open-source community, we believe that every line shared becomes collective momentum that accelerates the journey. Daily unlocks are coming soon. No ivory towers - just pure garage-energy and community-driven innovation.

Matt Krzus@mattkrzus·17 Feb

@kalomaze LDA stans unite

kalomaze@kalomaze·17 Feb

which of these was your "first" exposure to the concept of machine learning

Matt Krzus@mattkrzus·13 Feb

@Ishaank1999 ty sir

Ishaan Kapoor@Ishaank1999·13 Feb

@mattkrzus Very cool

Matt Krzus@mattkrzus·13 Feb

wrote a silly little blog about finding second amendment attn heads w/in the qwen-r1 distill

Matt Krzus@mattkrzus·13 Feb

mattkrzus.com/posts/constitu…

Matt Krzus@mattkrzus·13 Feb

this is probably the most uniquely human thing: to turn off your moral compass and do your job. it’s also, probably, most attainable by open-source models where you can turn on/off constitutional features.
