Daniel Lichy
@daniel_lichy
PhD student. Computer Vision/Machine Learning
60 posts
Joined March 2021
112 Following · 107 Followers
Daniel Lichy @daniel_lichy
@jayinnn @mkturkcan @janusch_patas The paper mentions satelliteSfM and MoGe depth. Did you run these, and does the repo include them? I’d love to try this on my own data, but I didn’t see any instructions for how to do that.
Jie-Ying Lee 李杰穎 @jayinnn
@mkturkcan @janusch_patas Hi @mkturkcan! I am the author of this project. Thanks for your interest in our project. Would it be possible to share your Columbia dataset? I am very curious why the output looks oversaturated. Also, our method requires the z-axis to be perpendicular to the ground plane.
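Since the method assumes the z-axis is perpendicular to the ground plane, data captured with a different up-axis needs a one-time rotation first. A minimal NumPy sketch of that alignment (not from the project's repo; the ground normal is assumed to be known, e.g. from a plane fit):

```python
import numpy as np

def rotation_aligning_to_z(ground_normal):
    """Rodrigues rotation that maps `ground_normal` onto the +z axis."""
    n = np.asarray(ground_normal, dtype=np.float64)
    n = n / np.linalg.norm(n)
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(n, z)                    # rotation axis (unnormalized)
    c = float(np.dot(n, z))               # cosine of the rotation angle
    if np.isclose(c, 1.0):                # already aligned
        return np.eye(3)
    if np.isclose(c, -1.0):               # anti-parallel: 180-degree flip
        return np.diag([1.0, -1.0, -1.0])
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    return np.eye(3) + K + (K @ K) / (1.0 + c)

# Example: a y-up capture; after rotation the ground normal becomes +z.
R = rotation_aligning_to_z([0.0, 1.0, 0.0])
points = np.array([[0.0, 1.0, 0.0]])      # a point one unit "up"
aligned = points @ R.T
```

Applying `R` to every point (and to the camera-to-world rotations) makes the scene z-up without changing its geometry.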
Daniel Lichy @daniel_lichy
@UUUUUsher Really interesting work! I was wondering whether your bundle adjustment requires FP64 precision. In my experiments, BA was too unstable in FP32, but FP64 ran very slowly on consumer GPUs (e.g., GeForce - up to ~50× slower than FP32). Did you run into this issue?
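For context on why precision bites here: forming the Gauss-Newton normal equations squares the Jacobian's condition number, which is often what pushes bundle adjustment past fp32's usable range. A self-contained NumPy toy (illustrative only, not InstantSfM's actual solver):

```python
import numpy as np

# Build a 100x6 Jacobian with condition number 1e4 (not unusual in BA).
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((100, 6)))
V, _ = np.linalg.qr(rng.standard_normal((6, 6)))
J = U @ np.diag(np.logspace(0, -4, 6)) @ V.T

# Normal equations square the conditioning: cond(J^T J) ~ 1e8, which is
# comfortable for fp64 (eps ~ 2e-16) but beyond fp32 (eps ~ 1.2e-7).
x_true = rng.standard_normal(6)
b = J.T @ J @ x_true

x64 = np.linalg.solve(J.T @ J, b)
A32 = (J.T @ J).astype(np.float32)
x32 = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)

err64 = np.linalg.norm(x64 - x_true) / np.linalg.norm(x_true)
err32 = np.linalg.norm(x32 - x_true) / np.linalg.norm(x_true)
print(err64, err32)  # fp64 recovers x; the fp32 error is orders of magnitude larger
```

This is also why many GPU solvers keep fp32 storage but accumulate in fp64, or avoid forming J^T J entirely by working on J with QR or conjugate gradients.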
Quankai Gao @UUUUUsher
🚀 Introducing InstantSfM: Fully Sparse and Parallel Structure-from-Motion. ✅ Python + GPU-optimized implementation, no C++ anymore! ✅ 40× faster than COLMAP with 5K images on single GPU! ✅ Scales beyond 100 images (more than VGGT/VGGSfM can consume)! ✅ Support metric scale.
Daniel Lichy @daniel_lichy
@jonstephens85 @otri You can do that. It just outputs a soft binary classification. P(pixel contains human) = 1 - P(pixel doesn’t contain human)
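To make the soft output concrete: a binary keep-mask is just a threshold away. A minimal sketch (the probability values and the 255-means-keep convention are illustrative assumptions, not the actual model output):

```python
import numpy as np

# Hypothetical soft output: P(pixel contains human), one value per pixel.
prob_human = np.array([[0.02, 0.95],
                       [0.60, 0.10]])

# COLMAP-style masks typically use 255 for pixels to KEEP, so keep the
# pixels classified as non-human: P(no human) = 1 - P(human) > 0.5.
threshold = 0.5
keep_mask = (prob_human < threshold).astype(np.uint8) * 255
print(keep_mask)  # [[255 0] [0 255]]
```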
Jonathan Stephens @jonstephens85
Got COLMAP native 360 -> Nerfstudio working. I don't think this is the best workflow, unfortunately. Unless I can figure out how to mask myself out...
Aaron Hilton @otri
@jonstephens85 I'd really love to figure out a way to auto-mask all humans in 360 photos. There's got to be a way! Also, for everything dynamic: auto-segment the moving objects so that what's left is only the static scene. I recall seeing a bunch of research papers that address that lately.
MrNeRF @janusch_patas
ViPE: Video Pose Engine for 3D Geometric Perception

Contributions:
• A robust and efficient framework, ViPE, for estimating camera parameters and dense depth from diverse, in-the-wild videos.
• A system design that integrates the strengths of classical SLAM (efficiency, scalability) and learned models (robustness), with key improvements in efficiency, dynamic object handling, and depth quality over prior work.
• A large-scale dataset of annotated videos, created using ViPE, to facilitate future research in 3D computer vision.
Daniel Lichy @daniel_lichy
🔧Also, check out our library nvTorchCam, built for seamless PyTorch workflows across pinhole, fisheye, ERP, and other camera models. 💻 github.com/NVlabs/nvTorch…
Daniel Lichy @daniel_lichy
🎉 Congrats on DAC and the CVPR’25 acceptance! We explored a similar idea for stereo in FoVA-Depth (3DV’24 oral): 📌 Warp pinhole to canonical ERP to generalize across camera models. Glad to see this working in monocular too! 📄 Project: research.nvidia.com/labs/lpr/fova-… 💻 Code: github.com/NVlabs/fova-de… 🧵 Original tweet: x.com/daniel_lichy/s… #CVPR2025 #DepthEstimation #360Images
Yuliang Guo @33yuliangguo

🎉 We’re excited to announce our paper Depth Any Camera (DAC), accepted to 𝗖𝗩𝗣𝗥 𝟮𝟬𝟮𝟱! 🚀 Along with this, we have a few exciting updates! To support NeRF & Gaussian Splatting on fisheye inputs, we now provide DAC’s depth estimation results for #ZipNeRF on fisheye images. 📥 Download depth maps: 🔗 yuliangguo.github.io/depth-any-came… Methods like #SMERF, #FisheyeGS, & #EVER can leverage this fisheye depth prior! #CVPR2025 #NeRF #GaussianSplatting #3DReconstruction #ComputerVision

Daniel Lichy retweeted
Zhenjun Zhao @zhenjun_zhao
Light of Normals: Unified Feature Representation for Universal Photometric Stereo Hong Li, Houyuan Chen, @ychngji6, @Frozen_Burning, Bohan Li, @xshocng1, Xianda Guo, Xuhui Liu, Yikai Wang, Baochang Zhang, Satoshi Ikehata, Boxin Shi, @raoanyi, @HaoZhao_AIRSUN tl;dr: learnable light register tokens; wavelet transform-based sampling; normal-gradient perception loss arxiv.org/abs/2506.18882
Dmytro Mishkin 🇺🇦 @ducha_aiki
What are some still relevant _algorithms_ in computer vision beyond LM (Levenberg-Marquardt), RANSAC, and Hungarian matching? OK, also K-Means and Mean-shift? EM?
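For readers newer to the classics being listed: the RANSAC loop fits in a few lines. A minimal 2D line-fitting sketch (toy data; real pipelines estimate homographies or essential matrices with the same hypothesize-and-verify loop):

```python
import numpy as np

def ransac_line(points, iters=200, thresh=0.1, seed=0):
    """Robustly fit y = a*x + b to an (N, 2) array containing outliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        # Hypothesize: a line through two random points.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if np.isclose(x1, x2):
            continue                      # degenerate sample
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Verify: count points within `thresh` of the hypothesis.
        inliers = np.abs(points[:, 1] - (a * points[:, 0] + b)) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final least-squares refit on the consensus set.
    x, y = points[best_inliers].T
    a, b = np.polyfit(x, y, 1)
    return a, b, best_inliers

# 40 points on y = 2x + 1 with small noise, plus 10 gross outliers.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
pts = np.stack([x, 2.0 * x + 1.0 + 0.01 * rng.standard_normal(50)], axis=1)
pts[:10, 1] += 5.0
a, b, inliers = ransac_line(pts)
print(a, b, inliers.sum())  # close to 2 and 1, with roughly 40 inliers
```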
Daniel Lichy @daniel_lichy
@ducha_aiki @Rafael_L_Spring I've found that on some cross-domain image matching tasks (RGB/IR) you can improve SuperPoint/SuperGlue significantly with histogram equalization or even just inverting the colors.
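Both tricks are only a few lines of NumPy. A minimal sketch (the toy image is illustrative; in practice you would apply this to the IR frame before feature extraction, and cv2.equalizeHist does the same job):

```python
import numpy as np

def equalize_hist(gray):
    """Histogram-equalize a uint8 grayscale image (a minimal stand-in
    for cv2.equalizeHist)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    lo = cdf[cdf > 0].min()                          # first occupied bin
    lut = np.round((cdf - lo) / max(cdf[-1] - lo, 1) * 255)
    return np.clip(lut, 0, 255).astype(np.uint8)[gray]

def invert(gray):
    """Flip intensities: cheap, but sometimes enough when the IR
    modality has reversed polarity relative to RGB."""
    return 255 - gray

# Toy low-contrast image with intensities crammed into [100, 115].
img = np.arange(64, dtype=np.uint8).reshape(8, 8) // 4 + 100
eq = equalize_hist(img)
print(img.min(), img.max(), eq.min(), eq.max())      # 100 115 0 255
```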
Daniel Lichy retweeted
Roni Sengupta @SenguptRoni
🚀 Introducing ProJo4D — a new method for Inverse Physics estimation: recovering 3D shape and physical behavior of deformable objects. It can simulate future motion and render novel views — all from sparse multi-view videos! 🔗 daniel03c1.github.io/ProJo4D/
Jonathan Stephens @jonstephens85
I created a 3D Gaussian Splat of my kitchen using a third of the images it used to take me, thanks to @NVIDIAAIDev's 3DGRUT! I can now use 180-degree fisheye images and ray tracing to make detailed splats. The only reason the scene isn't sharper is that my input images weren't super sharp: when I took them back in October, I was still learning to use the lens. I plan to make a "first reactions/overview" video, with a tutorial after that. For reference, this took 206 images, while the ultrawide on my iPhone took 608 images to capture. #3D #AEC #Computervision
Daniel Lichy @daniel_lichy
Exciting update! 🎉 🔑 By popular demand, nvTorchCam is now under the Apache License 2.0. 🛠️ Get the code: github.com/NVlabs/nvTorch… 📄 Check out the arXiv article: arxiv.org/abs/2410.12074 🎥 Demo video below: Big thanks to @zhenjun_zhao for spreading the word! 🙌 #opensource #AI #python #arxiv
Daniel Lichy@daniel_lichy

🎉 Thrilled to introduce nvTorchCam, our new #PyTorch library designed to support the development of models using camera geometry like plane-sweep volumes (PSV) and related concepts like sphere-sweep volumes or epipolar attention, in a camera model-agnostic way! 🚀 🔗 Code: github.com/NVlabs/nvTorch… (1/6)

Daniel Lichy @daniel_lichy
@dihuang52453419 Cool work! Can you show some unprojected point clouds? It is hard to get a sense of how good the model is from just the depth maps.
Di Huang @dihuang111
We just released Depth Any Video: a new model for high-fidelity & consistent video depth estimation. Large-scale synthetic data training makes the model robust for various scenarios in our use cases. Paper: arxiv.org/abs/2410.10815 Website: depthanyvideo.github.io
Chubby♨️ @kimmonismus
An even better example of how GenAI (in this case Runway) will improve / create realistic graphics in the near future (obviously with a higher frame rate and more stability, but it gives a good impression). And remember: that's something Jensen Huang has stated many times as well.
Jon Barron @jon_barron
o1 seems very cool! But I don't get all the takes here about it being "the first time anyone scaled up test-time compute". Wasn't this the central idea in diffusion? GANs/VAEs etc made images by running a network once, but diffusion runs the network n times. What am I missing?
Daniel Lichy retweeted
Orazio Gallo @0razio
Proud of this project led by @daniel_lichy. FoVA-Depth is our answer to a problem we experience in many projects: for uncommon cameras, e.g. fisheye, we don't have as much training depth data as we do for pinhole cameras.
Daniel Lichy@daniel_lichy

🚀 Excited to release the code from our #3DV2024 oral presentation: FoVA-Depth: Field-of-View Agnostic Depth Estimation for cross-dataset generalization! 📊 🔗 Project details: research.nvidia.com/labs/lpr/fova-… 🔗 Code: github.com/NVlabs/fova-de… (1/8)
