David Nordström

116 posts

David Nordström

@davnords

PhD. Student @ Chalmers Computer Vision and Deep Learning Code: https://t.co/R54G3HJNlo

Katılım Şubat 2022

104 Takip Edilen24 Takipçiler

Sabitlenmiş Tweet

David Nordström@davnords·28 Haz

Want stronger Vision Transformers? Use octic-equivariant layers (arxiv.org/abs/2505.15441). TLDR; We create ViTs equivariant to the (octic) group of 90-degree rotations and reflections and show their strength on DeiT III + DINOv2. Code: github.com/davnords/octic…

English

230

David Nordström@davnords·5d

@chrisoffner3d Jia Deng = data god

Indonesia

Chris Offner@chrisoffner3d·5d

arxiv.org/abs/2603.24836

ZXX

310

Chris Offner@chrisoffner3d·5d

"Trained exclusively on synthetic data, [WAFT-Stereo] achieves the best BP-0.5 on the ETH3D benchmark among all existing submissions, corresponding to an 81% error reduction over the strongest established zero-shot baseline."

English

2.2K

David Nordström@davnords·6d

@chrisoffner3d @Parskatt @eric_dexheimer @jianyuan_wang My intitial thinking was that it was something like this, i.e.. same as NLL loss in RoMa v2, guess we will see what they are cooking once legal has been sorted out :). I guess it is a little unclear how you do this for arbitrary sequence lengths.

English

Chris Offner@chrisoffner3d·30 Mar

@Parskatt @eric_dexheimer @davnords @jianyuan_wang So just the same contrastive loss (InfoNCE) as in MASt3R except at the patch level instead of the pixel level? The talk mentions this slide in the context of replacing unnecessary (and computationally expensive) heads with multi-task losses, so I'd expect something different.

English

Chris Offner@chrisoffner3d·30 Mar

Already the next question during Christian's talk on attention-based cross-view patch matching was "If that's what it's doing, can't we just directly supervise the attention maps to strictly enforce correct matches?" This is very characteristic of the 3D vision community. ;)

Chris Offner@chrisoffner3d

@anand_bhattad I'd rephrase it to "We _think_ we know what the algorithm should be doing." because, if we fully knew what it should be doing, we wouldn't need ML. I love this interpretability work but it runs the risk of seducing people into imposing classical methods onto learned models.

English

6.8K

David Nordström@davnords·29 Mar

@gabriberton I guess this one should not work. Really nice tip though, will certainly come in handy.

English

David Nordström@davnords·29 Mar

@gabriberton Does this work for LightGlue-type loss where you supervise each layer or is this then the case of "entangled" terms?

English

Gabriele Berton@gabriberton·28 Haz

I can't stress enough how useful this trick has been for me in all these years It reduces GPU memory by N equal the number of losses, at literally no cost (same speed, exactly same results down to the last decimal digit) For example ... [1/2]

Gabriele Berton@gabriberton

This simple pytorch trick will cut in half your GPU memory use / double your batch size (for real). Instead of adding losses and then computing backward, it's better to compute the backward on each loss (which frees the computational graph). Results will be exactly identical

English

203

2.7K

286.3K

David Nordström@davnords·27 Mar

@gabriberton @GoogleDeepMind Big!

Gabriele Berton@gabriberton·27 Mar

I have joined @GoogleDeepMind! I'll be training VLMs And I'll still keep posting about latest developments on AI, Computer Vision and LLMs So no more posts on PyTorch tricks. I might post about JAX. Stay tuned...

English

122

3.6K

145.3K

David Nordström@davnords·25 Mar

@lucasmaes_ E.g. even this claim "Prior JEPA methods avoid collapse through heuristics or tricks" seems overly strong when considering that this is the same claim made in LeJEPA, which constitutes prior work.

English

David Nordström@davnords·25 Mar

@lucasmaes_ Great work and very nice with such a small model. However, were JEPAs not already 'easy to train' after LeJEPA was introduced (in a previous paper)? I think this is not clearly stated in this thread, nor in the paper.

English

Lucas Maes@lucasmaes_·23 Mar

JEPA are finally easy to train end-to-end without any tricks! Excited to introduce LeWorldModel: a stable, end-to-end JEPA that learns world models directly from pixels, no heuristics. 15M params, 1 GPU, and full planning <1 second. 📑: le-wm.github.io

English

103

539

3.9K

909.2K

David Nordström retweetledi

Gabriele Berton@gabriberton·17 Mar

VisMatch is on pypi! VisMatch is a wrapper for image matching models, like LightGlue, RoMa-v2, MASt3R, LoFTR, and 50+ more! It's literally as simple as: pip install vismatch vismatch-match --inputs img0 img1 --matcher choose_any To run image matching on any 2 images [1/4]

English

417

50K

David Nordström@davnords·12 Mar

@ducha_aiki I should probably read the paper.... but I like ImLoc so something like it with feedforward reconstruction methods would be nice

English

David Nordström@davnords·12 Mar

@ducha_aiki Hehe sounds that way... Though if you think about it in a similar way as ImLoc they might mean you never have the full scene reconstructed but just query images and create a map near the query?

English

Dmytro Mishkin 🇺🇦@ducha_aiki·10 Mar

Am I stupid, or the idea is to "make offline processing online"?

English

3.3K

David Nordström@davnords·3 Mar

@Parskatt @ZGojcic Good question....

English

204

Johan Edstedt @Parskatt·3 Mar

@ZGojcic Why don't you cite R3D2? research.zenseact.com/publications/R…

English

1.8K

Zan Gojcic@ZGojcic·2 Mar

We're releasing DiffusionHarmonizer, an online diffusion enhancer bridging neural reconstruction and photorealistic simulation by correcting artifacts, and harmonizing inserted objects so they truly belong in the scene: matching shadows, lighting & color research.nvidia.com/labs/sil/proje…

English

276

45.2K

David Nordström@davnords·3 Mar

@Parskatt Hmm.... (from research.zenseact.com/publications/R…)

English

842

David Nordström retweetledi

Johan Edstedt @Parskatt·3 Mar

They basically copied an existing paper without citing, not a good look. arxiv.org/abs/2506.07826

Zan Gojcic@ZGojcic

English

17.5K

David Nordström@davnords·2 Mar

@WilliamHolmbe19 99% of people's launches: chirping sound Wavh's launches: Chinese tiktok go brrrrr

English

William Holmberg@WilliamHolmbe19·2 Mar

When an influencer with millions of followers drops a video of your app and your app is not production ready... and your cloud-bill reaches 1k+ overnight LOL

English

398

David Nordström retweetledi

William Holmberg@WilliamHolmbe19·1 Mar

Alright we are live!!! Fly anywhere on earth!

English

1.4K

David Nordström@davnords·22 Şub

@ChuhanZhang5 Best paper inc.

English

108

Chuhan Zhang@ChuhanZhang5·22 Şub

D4RT is now accepted at #CVPR2026 with full scores (straight 6s) from all the reviewers! Deeply grateful to the reviewers for their time, thoughtful feedback, and for seeing the value in this work. Hope to see everyone in Denver. 🏔️

Chuhan Zhang@ChuhanZhang5

A SINGLE encoder + decoder for all the 4D tasks! We release 🎯 D4RT (Dynamic 4D Reconstruction and Tracking). 📍 A simple, unified interface for 3D tracking, depth, and pose 🌟 SOTA results on 4D reconstruction & tracking 🚀 Up to 100x faster pose estimation than prior works

English

237

15.8K

David Nordström@davnords·21 Şub

@Parskatt Sigma cfg goated, change my mind

English

Johan Edstedt @Parskatt·21 Şub

We'll get AGI before deciding on the best config system for making AGI.

Jon Barron@jon_barron

Deleting all my config dataclasses today. Each hparam is now just a default argument value and an `# hparam` comment so it’s greppable. Sweeps are just an agent running a sed script on the repo, launching, and reverting. Model variant configs are all bash scripts of sed calls.

English

David Nordström@davnords·21 Şub

@ducha_aiki Thank you for your service :(

English

374

Dmytro Mishkin 🇺🇦@ducha_aiki·21 Şub

#CVPR2026 reviewing -- this year my usefulness score is zero, meaning that my absence would not change any paper outcome.

English

5.2K

David Nordström@davnords·21 Şub

@Almorgand Hehe the naming is getting out of hand...

English

199

Alexandre Morgand@Almorgand·20 Şub

"YoNoSplat: You Only Need One Model for Feedforward 3D Gaussian Splatting" TL;DR: a unified 3D Gaussian splatting model that reconstructs high-quality scene geometry and camera poses from unposed/uncalibrated images in a single forward pass.

English

254

14.9K

Keşfet

@chrisoffner3d @Parskatt @eric_dexheimer @jianyuan_wang @gabriberton @GoogleDeepMind @lucasmaes_ @ducha_aiki