Alberto Tono

3.1K posts

@albertotono3

Applied Scientist @adobe | PhD Candidate @Stanford | Founder @CDInstitut | Ex-SR @googledeepmind

Earth · Joined March 2015
984 Following · 1.4K Followers
Pinned Tweet
Alberto Tono @albertotono3
1/3 We are thrilled to introduce MORPHEUS 🎊: a design space to survey the Deep Sketch-Based 3D Modeling (DS-3DM) literature. ✍Navigating the shift from 2D sketches to 3D raises fundamental questions: How do we ensure novel interfaces truly foster Engelbartian human augmentation? Our work bridges #CV, #ComputerGraphics, and #HCI to advocate for human-centered, informed, and controlled design, requiring a holistic understanding of input, model, and output. Ready for the R3D pill? 💊🐇 🌐Website: cdinstitute.github.io/Morpheus/ 📄Paper: onlinelibrary.wiley.com/doi/10.1111/cg… 💻Github: github.com/albertotono/Aw… Authors: @albertotono3 , @jiajunwu_cs , @GordonWetzstein , @ir0armeni , @HariSubramonyam , @landay , @fischermartin
Matthias Niessner @MattNiessner
Congrats to @Normanisation for his successful PhD defense 🥳🎓 Norman's thesis about 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐌𝐨𝐝𝐞𝐥𝐬 𝐨𝐧 𝟑𝐃 𝐑𝐞𝐩𝐫𝐞𝐬𝐞𝐧𝐭𝐚𝐭𝐢𝐨𝐧𝐬 makes important contributions to the 3D vision community. For instance, DiffRF, a generative approach operating directly in 3D space, was among the first diffusion techniques for neural radiance fields. This led to many follow-up works in this area and sparked interest across the computer vision community, establishing generative approaches as a cornerstone of the 3D domain. Beyond his PhD, Norman continues to work at the forefront of computer vision, for example with his contributions to MapAnything, a universal feedforward approach for 3D reconstruction. Check out Norman's amazing work: normanm.de Congratulations Dr. Mueller - super proud!
Alberto Tono retweeted
Marktechpost AI Dev News ⚡ @Marktechpost
Google AI Releases MedGemma-1.5: The Latest Update to their Open Medical AI Models for Developers
Google has released MedGemma 1.5 and MedASR as open components in its Health AI Developer Foundations program, giving developers a practical starting point for medical imaging, text, and speech workflows. MedGemma-1.5-4B is a multimodal model that supports text, two-dimensional images, three-dimensional CT and MRI volumes, and whole-slide pathology, with accuracy gains for disease findings and histopathology that reach or match strong task-specific baselines. It also improves MedQA and EHRQA scores, which makes it suitable as a backbone for clinical question answering and chart summarization pipelines. MedASR is a Conformer-based medical speech recognition model that reduces word error rate by 58 percent on chest X-ray dictation and 82 percent on broader medical dictation benchmarks compared to Whisper large v3, providing a domain-tuned speech front end for MedGemma-centered applications. Full analysis: marktechpost.com/2026/01/13/goo… Model weights: huggingface.co/google/medgemm… Technical details: research.google/blog/next-gene… @googleaidevs @GoogleAI
Alberto Tono retweeted
Sundar Pichai @sundarpichai
MedGemma 1.5 is a major upgrade to our open models for healthcare developers. The new 4B model enables developers to build applications that natively interpret full 3D scans (CTs, MRIs) with high efficiency - a first, we believe, for an open medical generalist model. MedGemma 1.5 also pairs well with MedASR, our speech-to-text model fine-tuned for highly accurate medical dictation. Developers can now use these multimodal capabilities to build medical apps that reach patients in more places.
Alberto Tono retweeted
Chubby♨️ @kimmonismus
Really exciting: Google's MedGemma 1.5 packs 3D radiology, whole-slide pathology, longitudinal X-ray analysis, and clinical document understanding into a single open-weight 4B model (!), with a massive +47% F1 jump in pathology and +11% in MRI classification over v1. The specialized 4B model outperforms Gemini 3.0 Flash on out-of-distribution CT analysis, proving that targeted medical post-training beats raw scale. <3
Samuel Schmidgall@SRSchmidgall

The MedGemma 1.5 technical report is out 👇 arxiv.org/pdf/2604.05081…

Alberto Tono @albertotono3
Limitations: Because these models relied heavily on synthetic, distortion-free edge maps during training, they failed to generalize to real human sketches. They simply couldn't handle the massive domain gap, varying stylistic abstractions, physically unrealistic strokes, and the inherent figure/ground ambiguity of real free-hand drawings. Credit: Nan Xiang, Ruibin Wang, Tao Jiang, Li Wang, Yanran Li, Xiaosong Yang, Jianjun Zhang
Alberto Tono @albertotono3
-26... Xiang et al. (2020) shattered the 3D-data dependency. They introduced an end-to-end CNN framework equipped with a differentiable renderer. By first generating an intermediate normal map from the sketch, their system recovers complete 3D polygon meshes using exclusively 2D supervision, entirely eliminating the need for 3D ground-truth data during training.
Alberto Tono @albertotono3
Limitations: Despite the leap, Wang et al.'s model was fundamentally constrained by its reliance on explicit 3D ground-truth supervision during training, optimizing point clouds via 3D metrics such as the Chamfer and Earth Mover's Distances. Credit: Jiayun Wang, @Peter_j_Wang , @JieruiLin, Qian Yu, @rleobest , Yubei Chen, Stella X. Yu.
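For readers unfamiliar with the metric, here is a minimal sketch of a symmetric Chamfer distance between two point sets in plain Python. It uses the mean squared nearest-neighbour distance in each direction, which is one common convention and not necessarily the exact variant used by Wang et al.; the function names are illustrative only.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two 3D points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumed convention: mean squared nearest-neighbour distance from
    pred to gt, plus the same quantity from gt to pred.
    """
    def one_sided(src, dst):
        # For every point in src, find its closest point in dst (brute force).
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_sided(pred, gt) + one_sided(gt, pred)


# Hypothetical predicted and ground-truth point clouds.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
loss = chamfer_distance(predicted, ground_truth)
```

Computing this loss needs a ground-truth 3D point cloud for every training sketch, which is exactly the dependency described above.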
Alberto Tono @albertotono3
-27... 3D Shape Reconstruction from Free-Hand Sketches addresses ShapeMVD's (x.com/albertotono3/s…) rigid requirement for multi-view, distortion-free line drawings (edge maps). Wang et al. tackle this view dependency and domain gap, allowing true single-view free-hand inputs and generalizing to real, distorted human sketches.
Alberto Tono@albertotono3

-28 ShapeMVD bypasses the strict volumetric resolution constraints of Delanoy et al. (x.com/albertotono3/s…). Instead of trying to directly predict a full 3D voxel grid, ShapeMVD predicts multi-view depth and normal maps from the 2D sketches. By operating in this 2.5D map space, ShapeMVD avoids the low-resolution, blocky limitations of a 64^3 grid, allowing it to capture finer surface details. However, this workaround comes with a trade-off: ShapeMVD's generated depth maps still need to be registered and fused together using a complex optimization method to produce the final 3D surface.

Alberto Tono @albertotono3
Limitations: It requires precisely aligned, multi-view line drawings and struggles to generate shapes from a single viewpoint. It fails on abstract or rough doodles, relying heavily on precise, distortion-free drawings that require professional skill. It outputs 2.5D depth and normal maps, which then require computationally heavy optimization steps to successfully fuse into a final 3D surface. Credit: Zhaoliang Lun, Matheus Gadelha, @EvangelosKalog1 , @MajiSubhransu, @Summer912_
Alberto Tono @albertotono3
-28 ShapeMVD bypasses the strict volumetric resolution constraints of Delanoy et al. (x.com/albertotono3/s…). Instead of trying to directly predict a full 3D voxel grid, ShapeMVD predicts multi-view depth and normal maps from the 2D sketches. By operating in this 2.5D map space, ShapeMVD avoids the low-resolution, blocky limitations of a 64^3 grid, allowing it to capture finer surface details. However, this workaround comes with a trade-off: ShapeMVD's generated depth maps still need to be registered and fused together using a complex optimization method to produce the final 3D surface.
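To make the 2.5D idea concrete, the sketch below lifts per-view depth maps into 3D points and simply concatenates them. It assumes an orthographic camera rotated about the vertical axis, and the names `backproject` and `fuse` are hypothetical. It is not ShapeMVD's pipeline, and it deliberately omits the registration and optimization step mentioned in the tweet.

```python
import math


def backproject(depth, yaw_deg):
    """Lift a square orthographic depth map to world-space 3D points.

    depth:   list of rows; None marks background pixels (assumed encoding).
    yaw_deg: assumed camera rotation about the vertical (y) axis, in degrees.
    """
    n = len(depth)
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    points = []
    for row in range(n):
        for col in range(n):
            d = depth[row][col]
            if d is None:
                continue
            # Pixel centre mapped to [-1, 1] in camera space; depth runs along z.
            x = (col + 0.5) / n * 2 - 1
            y = 1 - (row + 0.5) / n * 2
            # Rotate the camera-space point about the y axis into world space.
            points.append((c * x + s * d, y, -s * x + c * d))
    return points


def fuse(views):
    """Naive fusion: concatenate the points of every (depth, yaw_deg) view."""
    cloud = []
    for depth, yaw_deg in views:
        cloud.extend(backproject(depth, yaw_deg))
    return cloud


# Two hypothetical 2x2 views, each with a single foreground pixel.
view = [[None, 0.5], [None, None]]
cloud = fuse([(view, 0), (view, 90)])
```

Each view contributes one point per foreground pixel, so surface detail scales with image resolution instead of with the cube of a voxel grid's side length. The cost is that overlapping views rarely agree exactly, which is why a real system needs the fusion step this sketch skips.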
Alberto Tono@albertotono3

-29.. Delanoy et al. (2017) shattered previous bottlenecks (x.com/albertotono3/s…) by abandoning rigid templates! Instead of regressing parameters for predefined models, they introduced an updater-CNN that iteratively mapped 2D sketches directly into a voxel-based 3D space.

Alberto Tono retweeted
Yael Vinker🎗 @YVinker
I am *very* excited to announce our SIGGRAPH 2026 workshop: Lines & Minds: Visual Abstraction in Art, Psychology, and Computer Graphics 🎨🧠🫖 🔗 lines-and-minds.github.io 📅 Sunday, July 19 Join us to explore how visual abstraction shapes how we think, create, and communicate.
Alberto Tono retweeted
Rubaiat Habib @rubaiat
Really happy to be part of this amazing lineup of speakers and organizers on a topic I care deeply about. Also can’t wait to meet Scott McCloud — "Understanding Comics" has been hugely influential in how I think about scientific communication
Yael Vinker🎗@YVinker

I am *very* excited to announce our SIGGRAPH 2026 workshop: Lines & Minds: Visual Abstraction in Art, Psychology, and Computer Graphics 🎨🧠🫖 🔗 lines-and-minds.github.io 📅 Sunday, July 19 Join us to explore how visual abstraction shapes how we think, create, and communicate.

Alberto Tono retweeted
Mia Tang @Miamiamia0103
Mark your #SIGGRAPH2026 calendar for our Lines and Minds workshop! This year we have super exciting speakers from psychology, HCI, computer graphics, and comics theory. For more information, please visit: lines-and-minds.github.io 🥳!
Yael Vinker🎗@YVinker

I am *very* excited to announce our SIGGRAPH 2026 workshop: Lines & Minds: Visual Abstraction in Art, Psychology, and Computer Graphics 🎨🧠🫖 🔗 lines-and-minds.github.io 📅 Sunday, July 19 Join us to explore how visual abstraction shapes how we think, create, and communicate.

Alberto Tono @albertotono3
This fundamental shift from parametric regression to direct volumetric prediction finally allowed the system to generate complex 3D shapes with completely arbitrary and varied topologies! 🚀 #DeepSketchBased3DModeling #DS3DM
Alberto Tono @albertotono3
-29.. Delanoy et al. (2017) shattered previous bottlenecks (x.com/albertotono3/s…) by abandoning rigid templates! Instead of regressing parameters for predefined models, they introduced an updater-CNN that iteratively mapped 2D sketches directly into a voxel-based 3D space.
Alberto Tono@albertotono3

-30... 1/4 This SIGGRAPH 2016 paper pioneered interactive sketching for urban procedural models. Instead of complex point clouds, their neural network focused on simple models, regressing basic geometric properties like height, width, and depth...
