Peter Kocsis
272 posts
@Peter4AI
PhD student at TUM, Visual Computing & Artificial Intelligence Group
Munich · Joined November 2021
137 Following · 714 Followers
Peter Kocsis retweeted
Matthias Niessner @MattNiessner
📢 3D world models from video diffusion suffer from inconsistent frames -> blurry output. Our fix: instead of naïve 3D reconstruction, we non-rigidly align each frame into a globally-consistent 3DGS representation -> sharp visuals on top of any VDM! lukashoel.github.io/video_to_world
Peter Kocsis retweeted
Matthias Niessner @MattNiessner
📢📢📢 Data release: high-res, multi-view, OLAT face recordings 📢📢📢
We captured individuals in our custom light stage with 16 high-end, global-shutter cameras (72 fps) and 40 LED modules, totaling 2.8M precisely calibrated frames. We use the data for BecomingLit (#NeurIPS2025): intrinsically decomposed Gaussian avatars, enabling photorealistic and real-time relighting via hybrid neural shading.
Code & Data: jonathsch.github.io/becominglit/
Great work by @jnthnschmdt, @SGiebenhain
Peter Kocsis retweeted
Matthias Niessner @MattNiessner
📢 Pix2NPHM: Learning to Regress NPHM Reconstructions From a Single Image 📢
We directly regress neural parametric head models (NPHMs) from a single image — fast, stable, and significantly more expressive than classical 3DMMs such as FLAME.
Face tracking & 3D reconstruction are often limited by the representational capacity of PCA-based face models. By lifting NPHMs to a first-class reconstruction primitive, we enable more accurate geometry, richer expressions, and finer animation control.
Pix2NPHM obtains fast and reliable NPHM reconstructions on real-world data. Inference-time optimization against surface normals and canonical point maps can further increase fidelity.
Key to successful and generalized training of our ViT-based network are:
(1) large-scale registration of existing 3D head datasets, and
(2) self-supervised training on vast in-the-wild 2D video datasets using pseudo ground-truth surface normals.
Finally, we show that geometry-aware pretraining on pixel-aligned reconstruction tasks significantly outperforms generic visual pretraining (e.g., DINO-style features) in terms of generalization.
🌍 simongiebenhain.github.io/Pix2NPHM
🎥 youtu.be/MgpEJC5p1Ts
Great work by @SGiebenhain, @TobiasKirschst1, @liamschoneveld, Davide Davoli, Zhe Chen
Peter Kocsis retweeted
Matthias Niessner @MattNiessner
📢📢📢 𝐌𝐞𝐬𝐡𝐑𝐢𝐩𝐩𝐥𝐞: Structured Autoregressive Generation of Artist-Meshes
High-fidelity, topologically complete 3D assets that expand naturally like a ripple on a surface! 🌊
Existing AR models often rely on sliding-window inference over truncated segments. This truncation breaks long-range geometric dependencies, causing holes and fragmentation. Instead, MeshRipple uses frontier-aware BFS and sparse-attention global memory to ensure coherent growth with an unbounded receptive field.
-> Highly detailed mesh generations
-> Artist-like meshing quality
-> Works on room-scale environments
🌍 maymhappy.github.io/MeshRipple
🎥 youtu.be/AHvaLslzXQU
Great work by Junkai Lin, Hang Long, Huipeng Guo, Jielei Zhang, Jiayi Yang, Tianle Guo, Yang Yang, Jianwen Li, Wenxiao Zhang, Wei Yang
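The "frontier-aware BFS" growth order can be pictured as a plain breadth-first traversal over face adjacency, so each new face borders already-generated geometry. A toy sketch only; the function name and adjacency are invented here, and the actual model grows face-token sequences with sparse-attention global memory:

```python
from collections import deque

def ripple_order(adjacency, seed=0):
    """Emit faces in frontier-expanding (BFS) order from a seed face.

    adjacency: dict mapping face index -> list of neighboring face indices.
    Faces come out ring by ring, like a ripple, so every newly emitted
    face is adjacent to geometry that was already generated.
    """
    visited = {seed}
    frontier = deque([seed])
    order = []
    while frontier:
        face = frontier.popleft()
        order.append(face)
        for nb in adjacency[face]:
            if nb not in visited:
                visited.add(nb)
                frontier.append(nb)
    return order

# Toy adjacency: a strip of five faces, each touching the next.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(ripple_order(adj))  # [0, 1, 2, 3, 4]
```

In contrast, a sliding window over a truncated face sequence can cut such adjacency chains, which is the fragmentation the tweet describes.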
Peter Kocsis @Peter4AI
@karanjagtiani04 @MattNiessner For the occluded regions we don't have any predictions, so they currently don't receive gradients; one option would be to backprop through multiple bounces.
Karan Jagtiani @karanjagtiani04
@MattNiessner Interesting approach to texture consistency. How do you handle occlusions in the monocular predictions?
Matthias Niessner @MattNiessner
📢 Intrinsic Image Fusion for Multi-View 3D Material Reconstruction 📢
We combine generative material priors with inverse path tracing:
1) define a parametric texture space
2) fuse monocular predictions across views into consistent textures
3) optimize low-dimensional parameters for physically-grounded reconstructions.
The results are relightable PBR textures for 3D scenes: check out the result on a real-world 3D scan from the ScanNet++ dataset!
🌍 peter-kocsis.github.io/IntrinsicImage…
🎥 youtu.be/-Vs3tR1Xl7k
Great work by @Peter4AI @LukasHollein!
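Step 2, fusing per-view monocular predictions into a consistent texture, can be pictured as confidence-weighted averaging in a shared texture space. A minimal sketch under stated assumptions: the predictions are already warped into a common UV space, and the function and weighting scheme are invented here (the actual method optimizes low-dimensional parametric textures against the views):

```python
import numpy as np

def fuse_views(predictions, weights):
    """Fuse per-view material predictions into one texture.

    predictions: (V, H, W, C) per-view maps (e.g. albedo) already
                 warped into a shared UV texture space.
    weights:     (V, H, W) per-texel confidences (e.g. visibility or
                 view-angle weights); zero where a texel is unseen.
    Returns the per-texel weighted average; unseen texels stay zero.
    """
    w = weights[..., None]                      # broadcast over channels
    total = np.clip(w.sum(axis=0), 1e-8, None)  # avoid divide-by-zero
    return (predictions * w).sum(axis=0) / total

# Two views disagree (0.2 vs 0.6) with equal confidence -> fused 0.4.
views = np.stack([np.full((2, 2, 3), 0.2), np.full((2, 2, 3), 0.6)])
conf = np.ones((2, 2, 2))
print(fuse_views(views, conf)[0, 0])  # [0.4 0.4 0.4]
```

Averaging removes per-view noise but not systematic disagreement, which is why the method constrains the result to a low-dimensional parametric texture space instead of keeping free per-texel values.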
Peter Kocsis retweeted
Matthias Niessner @MattNiessner
Releasing Echo today is incredibly exciting for me — because it is a critical step for generative AI, enabling the creation of virtual worlds.
Echo is our first world model at SpAItial AI. It turns text or images into explorable 3D environments — spaces you can move through, inspect, and build on. Seeing this work in real time still feels a bit surreal.
My fascination with this goes back a long way: video games, virtual environments, and the idea of capturing the real world in 3D. As a researcher, I spent years working on 3D reconstruction, neural rendering, and scene understanding — all driven by the same question: how do we teach machines to understand the world?
One thing became clear over time: the biggest bottleneck isn't compute or rendering — it's 3D worlds themselves. High-quality, consistent environments are expensive to create by hand and don't scale to the experiences we want to build. In particular, I believe that the ability to generate virtual worlds is ultimately key towards understanding the real world.
That's why we founded SpAItial AI. We're building spatial world models that combine geometric understanding with creative generation — models that can generate, edit, and eventually reason about 3D environments. Echo is just the beginning.
For me, this feels like the moment when decades of research finally meet the imagination that got many of us into graphics, games, and 3D understanding in the first place.
🌍 spaitial.ai
SpAItial AI @SpAItial_AI

🚀 Announcing Echo — our new frontier model for 3D world generation. Echo turns a simple text prompt or image into a fully explorable, 3D-consistent world. Instead of disconnected views, the result is a single, coherent spatial representation you can move through freely.
This is part of a bigger shift in AI: from generating pixels and tokens to generating spaces. Echo predicts a geometry-grounded 3D scene at metric scale, meaning every novel view, depth map, and interaction comes from the same underlying world — not independent hallucinations.
Once generated, the world is interactive in real time. You control the camera, explore from any angle, and render instantly — even on low-end hardware, directly in the browser. High-quality 3D world exploration is no longer gated by expensive equipment.
Under the hood, Echo infers a physically grounded 3D representation and converts it into a renderable format. For our web demo, we use 3D Gaussian Splatting (3DGS) for fast, GPU-friendly rendering — but the representation itself is flexible and can be easily adapted.
Why this matters: consistent 3D worlds unlock real workflows — digital twins, 3D design, game environments, robotics simulation, and more. From a single photo or a line of text, Echo builds worlds that are reliable, editable, and spatially faithful.
Echo also enables scene editing and restyling. Change materials, remove or add objects, explore design variations — all while preserving global 3D consistency. Editing no longer breaks the world.
This is only the beginning. Echo is the foundation for future world models with dynamics, physical reasoning, and richer interaction — environments that don't just look right, but behave right.
Explore the generated worlds on our website and sign up for the closed beta. The era of spatial intelligence starts here.
🌍 #Echo #WorldModels #SpatialAI #3DFoundationModels
Check it out: spaitial.ai

Peter Kocsis retweeted
George Kopanas @gkopanas
Radiance Meshes for Volumetric Reconstruction 🎉 We made a thing! A radiance field composed of triangle meshes that accurately renders the underlying volumetric field, with no popping AND faster than approximate methods like Gaussian splatting.
Peter Kocsis retweeted
Black Forest Labs @bfl_ml
FLUX.2 is here - our most capable image generation & editing model to date. Multi-reference. 4MP. Production-ready. Open weights. Into the new.
Peter Kocsis @Peter4AI
Congrats @yawarnihal! Amazing work, really well-deserved!
Matthias Niessner @MattNiessner

Congrats to @yawarnihal for winning the @MdsiTum best paper award for his amazing 𝐌𝐞𝐬𝐡𝐆𝐏𝐓 work 🎉
MeshGPT autoregressively generates compact, artist-style triangle meshes by tokenizing faces into a learned discrete vocabulary (VQ-style codebook) and training a decoder-only transformer to predict those face tokens — because discrete tokenization + attention lets GPT-style models learn long-range geometric & topological patterns and produce coherent, high-fidelity 3D assets.
MeshGPT's use cases go far beyond traditional content creation applications in computer graphics. For instance, the method was developed in collaboration with @Audi to help rapid prototyping of car designs, where explicit and precise mesh design is essential.
In the research community, there have already been many follow-ups such as MeshAnything, MeshXL, Meshtron, and many more - finally, we can use AI to generate high-fidelity 3D content :)
Project: nihalsid.github.io/mesh-gpt/
Video: youtu.be/UV90O1_69_o
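The tokenization step can be pictured as VQ-style nearest-neighbor assignment of per-face features to a learned codebook; the decoder-only transformer then models the resulting id sequence autoregressively, like text. A toy sketch only; the features and codebook below are invented, whereas MeshGPT learns both:

```python
import numpy as np

def quantize_faces(face_features, codebook):
    """Map per-face features to discrete token ids (VQ-style lookup).

    face_features: (F, D) continuous features, one row per mesh face.
    codebook:      (K, D) embedding vectors (learned, in the real model).
    Each face gets the id of its nearest codebook entry; this id
    sequence is what a GPT-style transformer is trained to predict.
    """
    # Pairwise squared distances between every face and codebook entry.
    d2 = ((face_features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
faces = np.array([[0.1, -0.1], [0.9, 1.2]])
print(quantize_faces(faces, codebook))  # [0 1]
```

Discreteness is the point: once faces are ids in a finite vocabulary, next-token prediction with attention can capture long-range geometric and topological regularities exactly as language models do over words.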

Peter Kocsis retweeted
Angela Dai @angelaqdai
📢ProcGen3D: Learning Neural Procedural Graphs for Image-to-3D Reconstruction @xinyi092298 learns neural procedural graphs to generate high-fidelity 3D - MCTS-guided sampling maintains consistency with the input image, even from real images! Check it out: xzhang-t.github.io/project/ProcGe…
Peter Kocsis retweeted
Matthias Niessner @MattNiessner
📢📢 𝐏𝐞𝐫𝐜𝐇𝐞𝐚𝐝: 𝐏𝐞𝐫𝐜𝐞𝐩𝐭𝐮𝐚𝐥 𝐇𝐞𝐚𝐝 𝐌𝐨𝐝𝐞𝐥 𝐟𝐨𝐫 𝐒𝐢𝐧𝐠𝐥𝐞-𝐈𝐦𝐚𝐠𝐞 𝟑𝐃 𝐇𝐞𝐚𝐝 𝐑𝐞𝐜𝐨𝐧𝐬𝐭𝐫𝐮𝐜𝐭𝐢𝐨𝐧 & 𝐄𝐝𝐢𝐭𝐢𝐧𝐠 📢📢
PercHead reconstructs realistic 3D heads from a single image and enables disentangled 3D editing via geometric controls and style inputs from images or text.
At its core is a generalized 3D head decoder trained with perceptual supervision from DINOv2 and SAM 2.1. We find that our new perceptual loss formulation improves reconstruction fidelity compared to commonly-used methods such as LPIPS.
Our trained reconstruction model is able to generate 3D-consistent heads from a single input image. Even with challenging side-view inputs, the model robustly infers missing regions for a coherent, high-fidelity output.
In addition, our architecture seamlessly adapts to downstream tasks: by swapping the encoder, we can transform the model into a disentangled 3D editing pipeline. In this scenario, we can control geometry through - potentially hand-drawn - segmentation maps, and condition style via image or text prompt. We also provide an interactive GUI to enable the exploration of our editing pipeline.
🌍 antoniooroz.github.io/PercHead/
📽️ youtu.be/4hFybgTk4kE
Great work by @antonio_oroz and @TobiasKirschst1
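Perceptual supervision boils down to comparing renderings and targets in the feature space of a frozen vision backbone rather than in pixel space. A generic sketch; the tweet says the supervision comes from DINOv2 and SAM 2.1 features, but the function name, the L2 form, and the per-location normalization below are assumptions of this illustration, not the paper's exact loss:

```python
import numpy as np

def perceptual_loss(feat_pred, feat_target):
    """Feature-space distance, the core idea behind perceptual supervision.

    feat_pred / feat_target: (H, W, C) feature maps produced by a frozen
    vision backbone for the rendered and target images. Comparing
    unit-normalized features means the loss reflects semantic agreement
    rather than raw pixel differences.
    """
    def normalize(f):
        return f / (np.linalg.norm(f, axis=-1, keepdims=True) + 1e-8)
    diff = normalize(feat_pred) - normalize(feat_target)
    return float((diff ** 2).mean())

a = np.ones((4, 4, 8))
print(perceptual_loss(a, a))  # 0.0
```

Identical feature maps give zero loss; the more the frozen backbone "sees" the two images differently, the larger the penalty, which is what pushes the decoder toward perceptually faithful heads.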
Peter Kocsis retweeted
Angela Dai @angelaqdai
📢 New in ScanNet++: High-Res 360° Panos!
@chandan__yes & @liuyuehcheng have added pano captures for 956 ScanNet++ scenes, fully aligned with the 3D meshes, DSLR, and iPhone data - multiple panos per scene
Check it out:
Docs: kaldir.vc.in.tum.de/scannetpp/docu…
Code: github.com/scannetpp/scan…
Peter Kocsis @Peter4AI
📢 Code for 𝙄𝙣𝙩𝙧𝙞𝙣𝙨𝙞𝙓: High-Quality PBR Generation using Image Priors 📢 We have just released the inference code and the pre-trained weights! Our model generates intrinsic properties (albedo, roughness, metallic, normal) directly from text. peter-kocsis.github.io/IntrinsiX/
Peter Kocsis retweeted
Matthias Niessner @MattNiessner
Fantastic retreat this weekend by our research groups! Internal reviews, brainstorming ideas, paper reading, and much more! Of course also many social activities -- the highlight being our kayaking trip - lots of fun :)
Peter Kocsis retweeted
Anand Bhattad @anand_bhattad
So You Want to Be an Academic? A couple of years into your PhD, but wondering: "Am I doing this right?" Most of the advice is aimed at graduating students. But there's far less for junior folks who are still finding their academic path. My candid takes: anandbhattad.github.io/blogs/jr_grads…
Peter Kocsis @Peter4AI
🎉 Our paper, 𝗜𝗻𝘁𝗿𝗶𝗻𝘀𝗶𝗫, just got accepted to #NeurIPS2025! 🎉 Our model generates renderable PBR maps directly from text and can also be used for room-scale scene texturing using SDS. peter-kocsis.github.io/IntrinsiX/ Let's meet in San Diego!
Matthias Niessner @MattNiessner

All six of our submissions were accepted to #NeurIPS2025 🎉🥳 Awesome works about Gaussian Splatting Primitives, Lighting Estimation, Texturing, and much more GenAI :) Great work by @Peter4AI, @YujinChen_cv, @ZheningHuang, @jiapeng_tang, @nicolasvluetzow, @jnthnschmdt 🔥🔥🔥
