Daniel Lichy
@daniel_lichy
PhD student. Computer Vision/Machine Learning
60 posts
Joined March 2021
112 Following · 107 Followers
Daniel Lichy @daniel_lichy
@jayinnn @mkturkcan @janusch_patas The paper mentions satelliteSfM and MoGe depth. Did you run these, and does the repo include them? I’d love to try this on my own data, but I didn’t see any instructions for how to do that.
Jie-Ying Lee 李杰穎 @jayinnn
@mkturkcan @janusch_patas Hi @mkturkcan! I am the author of this project. Thanks for your interest in our project. Would it be possible to share your Columbia dataset? I am very curious why the output looks oversaturated. Also, our method requires the z-axis to be perpendicular to the ground plane.
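Since the method assumes the z-axis is perpendicular to the ground plane, data captured with a different up-axis needs a one-time rotation first. A minimal NumPy sketch of that alignment (not from the project's repo; the ground normal is assumed to be known, e.g. from a plane fit):

```python
import numpy as np

def rotation_aligning_to_z(ground_normal):
    """Rodrigues rotation that maps `ground_normal` onto the +z axis."""
    n = np.asarray(ground_normal, dtype=np.float64)
    n = n / np.linalg.norm(n)
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(n, z)                    # rotation axis (unnormalized)
    c = float(np.dot(n, z))               # cosine of the rotation angle
    if np.isclose(c, 1.0):                # already aligned
        return np.eye(3)
    if np.isclose(c, -1.0):               # anti-parallel: 180-degree flip
        return np.diag([1.0, -1.0, -1.0])
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    return np.eye(3) + K + (K @ K) / (1.0 + c)

# Example: a y-up capture; after rotation the ground normal becomes +z.
R = rotation_aligning_to_z([0.0, 1.0, 0.0])
points = np.array([[0.0, 1.0, 0.0]])      # a point one unit "up"
aligned = points @ R.T
```

Applying `R` to every point (and to the camera-to-world rotations) makes the scene z-up without changing its geometry.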
Daniel Lichy @daniel_lichy
@UUUUUsher Really interesting work! I was wondering whether your bundle adjustment requires FP64 precision. In my experiments, BA was too unstable in FP32, but FP64 ran very slowly on consumer GPUs (e.g., GeForce - up to ~50× slower than FP32). Did you run into this issue?
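For context on why precision bites here: forming the Gauss-Newton normal equations squares the Jacobian's condition number, which is often what pushes bundle adjustment past fp32's usable range. A self-contained NumPy toy (illustrative only, not InstantSfM's actual solver):

```python
import numpy as np

# Build a 100x6 Jacobian with condition number 1e4 (not unusual in BA).
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((100, 6)))
V, _ = np.linalg.qr(rng.standard_normal((6, 6)))
J = U @ np.diag(np.logspace(0, -4, 6)) @ V.T

# Normal equations square the conditioning: cond(J^T J) ~ 1e8, which is
# comfortable for fp64 (eps ~ 2e-16) but beyond fp32 (eps ~ 1.2e-7).
x_true = rng.standard_normal(6)
b = J.T @ J @ x_true

x64 = np.linalg.solve(J.T @ J, b)
A32 = (J.T @ J).astype(np.float32)
x32 = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)

err64 = np.linalg.norm(x64 - x_true) / np.linalg.norm(x_true)
err32 = np.linalg.norm(x32 - x_true) / np.linalg.norm(x_true)
print(err64, err32)  # fp64 recovers x; the fp32 error is orders of magnitude larger
```

This is also why many GPU solvers keep fp32 storage but accumulate in fp64, or avoid forming J^T J entirely by working on J with QR or conjugate gradients.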
Quankai Gao @UUUUUsher
🚀 Introducing InstantSfM: Fully Sparse and Parallel Structure-from-Motion. ✅ Python + GPU-optimized implementation, no C++ anymore! ✅ 40× faster than COLMAP with 5K images on single GPU! ✅ Scales beyond 100 images (more than VGGT/VGGSfM can consume)! ✅ Support metric scale.
Daniel Lichy @daniel_lichy
@jonstephens85 @otri You can do that. It just outputs a soft binary classification. P(pixel contains human) = 1 - P(pixel doesn’t contain human)
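To make the soft output concrete: a binary keep-mask is just a threshold away. A minimal sketch (the probability values and the 255-means-keep convention are illustrative assumptions, not the actual model output):

```python
import numpy as np

# Hypothetical soft output: P(pixel contains human), one value per pixel.
prob_human = np.array([[0.02, 0.95],
                       [0.60, 0.10]])

# COLMAP-style masks typically use 255 for pixels to KEEP, so keep the
# pixels classified as non-human: P(no human) = 1 - P(human) > 0.5.
threshold = 0.5
keep_mask = (prob_human < threshold).astype(np.uint8) * 255
print(keep_mask)  # [[255 0] [0 255]]
```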
Jonathan Stephens @jonstephens85
Got COLMAP native 360 -> Nerfstudio working. I don't think this is the best workflow, unfortunately. Unless I can figure out how to mask myself out...
Aaron Hilton @otri
@jonstephens85 I'd really love to figure out a way to auto-mask all humans in 360 photos. There's got to be a way! Also, for everything dynamic: auto-segment the moving objects so that what's left is only the static scene. I recall seeing a bunch of research papers that address that lately.
MrNeRF @janusch_patas
ViPE: Video Pose Engine for 3D Geometric Perception

Contributions:
• A robust and efficient framework, ViPE, for estimating camera parameters and dense depth from diverse, in-the-wild videos.
• A system design that integrates the strengths of classical SLAM (efficiency, scalability) and learned models (robustness), with key improvements in efficiency, dynamic object handling, and depth quality over prior work.
• A large-scale dataset of annotated videos, created using ViPE, to facilitate future research in 3D computer vision.
Daniel Lichy @daniel_lichy
🔧Also, check out our library nvTorchCam, built for seamless PyTorch workflows across pinhole, fisheye, ERP, and other camera models. 💻 github.com/NVlabs/nvTorch…
Daniel Lichy @daniel_lichy
🎉 Congrats on DAC and the CVPR’25 acceptance! We explored a similar idea for stereo in FoVA-Depth (3DV’24 oral): 📌 Warp pinhole to canonical ERP to generalize across camera models. Glad to see this working in monocular too! 📄 Project: research.nvidia.com/labs/lpr/fova-… 💻 Code: github.com/NVlabs/fova-de… 🧵 Original tweet: x.com/daniel_lichy/s… #CVPR2025 #DepthEstimation #360Images
Yuliang Guo @33yuliangguo

🎉 We’re excited to announce our paper Depth Any Camera (DAC), accepted to 𝗖𝗩𝗣𝗥 𝟮𝟬𝟮𝟱! 🚀 Along with this, we have a few exciting updates! To support NeRF & Gaussian Splatting on fisheye inputs, we now provide DAC’s depth estimation results for #ZipNeRF on fisheye images. 📥 Download depth maps: 🔗 yuliangguo.github.io/depth-any-came… Methods like #SMERF, #FisheyeGS, & #EVER can leverage this fisheye depth prior! #CVPR2025 #NeRF #GaussianSplatting #3DReconstruction #ComputerVision

Daniel Lichy retweeted
Zhenjun Zhao @zhenjun_zhao
Light of Normals: Unified Feature Representation for Universal Photometric Stereo Hong Li, Houyuan Chen, @ychngji6, @Frozen_Burning, Bohan Li, @xshocng1, Xianda Guo, Xuhui Liu, Yikai Wang, Baochang Zhang, Satoshi Ikehata, Boxin Shi, @raoanyi, @HaoZhao_AIRSUN tl;dr: learnable light register tokens; wavelet transform-based sampling; normal-gradient perception loss arxiv.org/abs/2506.18882
Dmytro Mishkin 🇺🇦 @ducha_aiki
What are some still relevant _algorithms_ in computer vision beyond LM (Levenberg-Marquardt), RANSAC, and Hungarian matching? OK, also K-Means and Mean-shift? EM?
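For readers newer to the classics being listed: the RANSAC loop fits in a few lines. A minimal 2D line-fitting sketch (toy data; real pipelines estimate homographies or essential matrices with the same hypothesize-and-verify loop):

```python
import numpy as np

def ransac_line(points, iters=200, thresh=0.1, seed=0):
    """Robustly fit y = a*x + b to an (N, 2) array containing outliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        # Hypothesize: a line through two random points.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if np.isclose(x1, x2):
            continue                      # degenerate sample
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Verify: count points within `thresh` of the hypothesis.
        inliers = np.abs(points[:, 1] - (a * points[:, 0] + b)) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final least-squares refit on the consensus set.
    x, y = points[best_inliers].T
    a, b = np.polyfit(x, y, 1)
    return a, b, best_inliers

# 40 points on y = 2x + 1 with small noise, plus 10 gross outliers.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
pts = np.stack([x, 2.0 * x + 1.0 + 0.01 * rng.standard_normal(50)], axis=1)
pts[:10, 1] += 5.0
a, b, inliers = ransac_line(pts)
print(a, b, inliers.sum())  # close to 2 and 1, with roughly 40 inliers
```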
Daniel Lichy @daniel_lichy
@ducha_aiki @Rafael_L_Spring I've found that on some cross-domain image matching tasks (RGB/IR) you can improve SuperPoint/SuperGlue significantly with histogram equalization or even just inverting the colors.
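Both tricks are only a few lines of NumPy. A minimal sketch (the toy image is illustrative; in practice you would apply this to the IR frame before feature extraction, and cv2.equalizeHist does the same job):

```python
import numpy as np

def equalize_hist(gray):
    """Histogram-equalize a uint8 grayscale image (a minimal stand-in
    for cv2.equalizeHist)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    lo = cdf[cdf > 0].min()                          # first occupied bin
    lut = np.round((cdf - lo) / max(cdf[-1] - lo, 1) * 255)
    return np.clip(lut, 0, 255).astype(np.uint8)[gray]

def invert(gray):
    """Flip intensities: cheap, but sometimes enough when the IR
    modality has reversed polarity relative to RGB."""
    return 255 - gray

# Toy low-contrast image with intensities crammed into [100, 115].
img = np.arange(64, dtype=np.uint8).reshape(8, 8) // 4 + 100
eq = equalize_hist(img)
print(img.min(), img.max(), eq.min(), eq.max())      # 100 115 0 255
```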
Daniel Lichy retweeted
Roni Sengupta @SenguptRoni
🚀 Introducing ProJo4D — a new method for Inverse Physics estimation: recovering 3D shape and physical behavior of deformable objects. It can simulate future motion and render novel views — all from sparse multi-view videos! 🔗 daniel03c1.github.io/ProJo4D/
Jonathan Stephens @jonstephens85
I created a 3D Gaussian Splat of my kitchen using a third of the images it used to take me, thanks to @NVIDIAAIDev's 3DGRUT! I can now use 180-degree fisheye images and ray tracing to make detailed splats. The only reason the scene isn't sharper is that my input images weren't super sharp: when I took them back in October, I was still learning to use the lens. I plan to make a "first reactions/overview" video, with a tutorial after that. For reference, this took 206 images, while the ultrawide on my iPhone took 608 images to capture. #3D #AEC #Computervision
Daniel Lichy @daniel_lichy
Exciting update! 🎉 🔑 By popular demand, nvTorchCam is now under the Apache License 2.0. 🛠️ Get the code: github.com/NVlabs/nvTorch… 📄 Check out the arXiv article: arxiv.org/abs/2410.12074 🎥 Demo video below: Big thanks to @zhenjun_zhao for spreading the word! 🙌 #opensource #AI #python #arxiv
Daniel Lichy@daniel_lichy

🎉 Thrilled to introduce nvTorchCam, our new #PyTorch library designed to support the development of models using camera geometry like plane-sweep volumes (PSV) and related concepts like sphere-sweep volumes or epipolar attention, in a camera model-agnostic way! 🚀 🔗 Code: github.com/NVlabs/nvTorch… (1/6)

Daniel Lichy @daniel_lichy
@dihuang52453419 Cool work! Can you show some unprojected point clouds? It is hard to get a sense of how good the model is from just the depth maps.
Di Huang @dihuang111
We just released Depth Any Video: a new model for high-fidelity & consistent video depth estimation. Large-scale synthetic data training makes the model robust for various scenarios in our use cases. Paper: arxiv.org/abs/2410.10815 Website: depthanyvideo.github.io
Chubby♨️ @kimmonismus
An even better example of how GenAI (in this case Runway) will improve / create realistic graphics in the near future (obviously with a higher frame rate and more stability, but it gives a good impression). And remember: that's something Jensen Huang has stated many times as well.
Jon Barron @jon_barron
o1 seems very cool! But I don't get all the takes here about it being "the first time anyone scaled up test-time compute". Wasn't this the central idea in diffusion? GANs/VAEs etc made images by running a network once, but diffusion runs the network n times. What am I missing?
Daniel Lichy retweeted
Orazio Gallo @0razio
Proud of this project led by @daniel_lichy. FoVA-Depth is our answer to a problem we experience in many projects: for uncommon cameras, e.g. fisheye, we don't have as much training depth data as we do for pinhole cameras.
Daniel Lichy@daniel_lichy

🚀 Excited to release the code from our #3DV2024 oral presentation: FoVA-Depth: Field-of-View Agnostic Depth Estimation for cross-dataset generalization! 📊 🔗 Project details: research.nvidia.com/labs/lpr/fova-… 🔗 Code: github.com/NVlabs/fova-de… (1/8)
