Dan Kondratyuk

521 posts

@hyperparticle

Co-Founder. Prev. Research Scientist at @LumaLabsAI (Realtime Video World Models, Ray), @GoogleAI (VideoPoet). Let's automate research!

Mountain View, CA · Joined March 2015
651 Following · 2.2K Followers
Pinned Tweet
Dan Kondratyuk@hyperparticle·
Today we are launching Dream Machine, our first AI model that generates cinematic and fluid videos from text instructions and images. I generated this 1-minute 60 fps video entirely from our model. Try Dream Machine → lumalabs.ai/dream-machine Join us → lumalabs.ai/join
25
51
415
39K
Dan Kondratyuk@hyperparticle·
Coding agents are great at writing new code, but pretty bad at deleting code. It's what inevitably leads to a lot of bloat over time. Deleting code is the hallmark of a great senior engineer, i.e., one that can write the least amount of code to get the job done. In my mind that's what's missing to make them robust at building good software.
0
1
2
46
Dan Kondratyuk@hyperparticle·
@alex_whedon Let me guess, you're already thinking of breaking the 100M token context barrier :)
0
0
0
122
Alexander Whedon@alex_whedon·
Introducing SubQ - a major breakthrough in LLM intelligence. It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do. That's nearly 1,000x less compute and a new way for LLMs to scale.
1.5K
2.9K
23.1K
12.7M
Dan Kondratyuk@hyperparticle·
@willccbb If you make compaction differentiable, maybe you can learn the optimal compaction strategy across most tasks you care about
0
0
0
263
will brown@willccbb·
why aren't more people studying self-compaction at artificially low context lengths. there's no reason you can't benchmaxx math RL with 4k tokens across many turns
28
16
527
50.1K
Dan Kondratyuk@hyperparticle·
@JiaweiYang118 Nice result. Makes me wonder what is the "ultimate form" of loss we should be optimizing
0
0
0
179
Jiawei Yang@JiaweiYang118·
Two months ago, I vaguely posted a number: 0.9 FID, one-step, pixel space. Now it is 0.75, and can be even lower. Many wonder how. I thought it might end as a small FID prank: simple and deliberate. It started with one question: can FID be optimized directly, and what does it reveal? Introducing FD-loss.
55
156
922
211.1K
Dan Kondratyuk@hyperparticle·
@vai_viswanathan I like to think of world models as simulations of some environment, able to represent what comes next. Most commonly it's seen as something visual or tangible (video/3D/robotics/etc), but perhaps LLMs that simulate OS (e.g., shell envs) might also be considered as world models.
1
0
2
52
Dan Kondratyuk@hyperparticle·
As with any new software, it's still going to have some rough edges. But I put a lot of checks/manual reviews in place to make sure the code quality is to a good standard: 90+% test coverage, fully typed, docstrings that explain intent, and lots of examples.
0
0
0
85
Dan Kondratyuk@hyperparticle·
I wanted to speedrun how fast I could OSS a complete Python package that solves a non-trivial, important job, and I managed to pull it off in about a day. I've never felt so productive writing software, especially complete packages. The most joy I've felt in a long time.
2
0
2
236
Dan Kondratyuk@hyperparticle·
The cost of software is starting to get really cheap. What took an entire dev team months can soon be accomplished with a single determined person. I suspect we're going to start seeing a proliferation of apps with weird and crazy ideas that wouldn't have been tried until now.
0
1
2
170
Dan Kondratyuk@hyperparticle·
One thought that scares me a bit: the proliferation of "AI Viruses": tiny coding agents which can break into unsecured systems, replicate themselves, and adapt/mutate over time. And like real viruses, they might hide, spread repeatedly like a botnet, and be impossible to fully eradicate.
0
0
1
178
Dan Kondratyuk@hyperparticle·
Our team has developed a new diffusion distillation technique which is overall much simpler and more robust than prior methods, and scales well to large model training. We make the code and paper freely available at github.com/lumalabs/tvm
Luma@LumaLabsAI

Introducing Terminal Velocity Matching: a scalable, single-stage generative training method that delivers diffusion-level quality with 25× fewer inference steps, now trained at 10B+ scale. lumalabs.ai/blog/engineeri…

0
0
9
1.2K
Dan Kondratyuk@hyperparticle·
It took an incredible amount of energy to get here, but now we're ready to unleash Ray3, our new frontier video model with reasoning capabilities. I especially love the HDR video generations, the colors and lighting just pop in ways that make SDR look dull. Check it out!
Luma@LumaLabsAI

This is Ray3. The world’s first reasoning video model, and the first to generate studio-grade HDR. Now with an all-new Draft Mode for rapid iteration in creative workflows, and state of the art physics and consistency. Available now for free in Dream Machine.

1
3
18
2K
Yiheng Li@Yiheng_Li_Cal·
🎉Introducing Improved Immiscible Diffusion - Accelerating Diffusion Training by Reducing Its Miscibility. 🔥 Supported by detailed feature analysis, we further clarify that the miscibility problem, i.e. the mix of diffusion paths of different images during training, reduces the training efficiency. 🤔 Based on this, we design a new KNN implementation, which not only is efficient (unrelated to batch sizes) but also performs well in diverse baseline models, especially in flow matching. 🤩 We hope our miscibility problem lights the way for further improving diffusion training efficiency. ✈️ arxiv.org/abs/2505.18521
7
27
105
16.4K