Alberto Tono

3.1K posts

Alberto Tono

@albertotono3

Applied Scientist @adobe | PhD Candidate @Stanford | Founder @CDInstitut | Ex-SR @googledeepmind

Earth · Joined March 2015
980 Following · 1.4K Followers
Pinned Tweet
Alberto Tono @albertotono3
Incredibly honored to present our work at #Eurographics2026 here in beautiful Aachen! 🇩🇪 Huge thanks to @EurographicsC, @CDInstitut and @StanfordEthics for the support and the scholarship that made this trip possible. I’m beyond thrilled to be part of such an amazing research community passionate about Sketch-to-3D. Thank you all for having us and for the fantastic questions and discussions today. 🎨✨ You can check out our slides here: cdinstitute.github.io/Morpheus/ Finally, I want to take a moment to deeply thank my incredible collaborators who pushed me toward the finish line. This work simply wouldn't have been possible without you: @jiajunwu_cs, @ir0armeni, @HariSubramonyam, @GordonWetzstein, @landay & @fischermartin 🙏👏 Soon, I will present my PhD research at the EG PhD Consortium. If you're around the conference, I'd love to connect! #cv #hci #graphics
Alberto Tono @albertotono3

1/3 We are thrilled to introduce MORPHEUS 🎊: a design space to survey the Deep Sketch-Based 3D Modeling (DS-3DM) literature. ✍ Navigating the shift from 2D sketches to 3D raises fundamental questions: How do we ensure novel interfaces truly foster Engelbartian human augmentation? Our work bridges #CV, #ComputerGraphics, and #HCI to advocate for human-centered, informed, and controlled design, requiring a holistic understanding of input, model, and output. Ready for the R3D pill? 💊🐇 🌐 Website: cdinstitute.github.io/Morpheus/ 📄 Paper: onlinelibrary.wiley.com/doi/10.1111/cg… 💻 Github: github.com/albertotono/Aw… Authors: @albertotono3, @jiajunwu_cs, @GordonWetzstein, @ir0armeni, @HariSubramonyam, @landay, @fischermartin

Replies 0 · Retweets 1 · Likes 10 · Views 640
Alberto Tono retweeted
reactor @reactorworld
Today Reactor is coming out of stealth. We’ve raised $59M in Seed and Series A funding, led by @lightspeedvp, with participation from @AmplifyPartners, @wndrco, @Sky9Capital, and @FPVventures. Reactor is the platform for building in the World Model era: the infrastructure that lets developers build with World Models at global scale for the first time. Stream from a frontier World Model to your app, in real time, all in under 10 lines of code. World Models represent the next major shift in AI: pixels, audio and actions are generated on the fly, in real time, in response to user inputs and to the environment. Every time computing has made a shift from passive to interactive, entire industries appeared that didn't exist before. We're standing in front of such a moment again. Over the last 6 months, we’ve assembled an all-star team with alumni from Apple, Meta, Google, Luma AI, Netflix, and Replicate. We're already partnering with some of the biggest names and labs in the world, and hundreds of developers are already building on Reactor. The World Model era starts now.
Replies 170 · Retweets 335 · Likes 2.6K · Views 12.6M
Chin-Yi Cheng @chinyich
Today, we’re launching illoca Tracing Paper and announcing our $13M Seed round led by Bessemer Venture Partners.
Replies 40 · Retweets 125 · Likes 1.3K · Views 373.1K
Matthias Niessner @MattNiessner
Congrats to @Normanisation for his successful PhD defense 🥳🎓 Norman's thesis about 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐌𝐨𝐝𝐞𝐥𝐬 𝐨𝐧 𝟑𝐃 𝐑𝐞𝐩𝐫𝐞𝐬𝐞𝐧𝐭𝐚𝐭𝐢𝐨𝐧𝐬 makes important contributions to the 3D vision community. For instance, DiffRF, a generative approach directly operating in 3D space, was among the first diffusion techniques for neural radiance fields. This led to many follow-up works in this area and sparked interest across the computer vision community, establishing generative approaches as a cornerstone in the 3D domain. After his PhD, Norman continues to work at the forefront of computer vision, for example through his contributions to MapAnything, a universal feedforward approach for 3D reconstruction. Check out Norman's amazing work: normanm.de Congratulations Dr. Mueller - super proud!
Replies 6 · Retweets 10 · Likes 115 · Views 13.4K
Alberto Tono retweeted
Marktechpost AI @Marktechpost
Google AI Releases MedGemma-1.5: The Latest Update to their Open Medical AI Models for Developers
Google has released MedGemma 1.5 and MedASR as open components in its Health AI Developer Foundations program, giving developers a practical starting point for medical imaging, text and speech workflows. MedGemma-1.5-4B is a multimodal model that supports text, two-dimensional images, three-dimensional CT and MRI volumes and whole-slide pathology, with accuracy gains for disease findings and histopathology that reach or match strong task-specific baselines. It also improves MedQA and EHRQA scores, which makes it suitable as a backbone for clinical question answering and chart summarization pipelines. MedASR is a Conformer-based medical speech recognition model that reduces word error rate by 58 percent on chest X-ray dictation and 82 percent on broader medical dictation benchmarks compared to Whisper large v3, providing a domain-tuned speech front end for MedGemma-centered applications... Full analysis: marktechpost.com/2026/01/13/goo… Model weights: huggingface.co/google/medgemm… Technical details: research.google/blog/next-gene… @googleaidevs @GoogleAI
Replies 0 · Retweets 9 · Likes 28 · Views 3.4K
Alberto Tono retweeted
Sundar Pichai @sundarpichai
MedGemma 1.5 is a major upgrade to our open models for healthcare developers. The new 4B model enables developers to build applications that natively interpret full 3D scans (CTs, MRIs) with high efficiency - a first, we believe, for an open medical generalist model. MedGemma 1.5 also pairs well with MedASR, our speech-to-text model fine-tuned for highly accurate medical dictation. Developers can now use these multimodal capabilities to build medical apps that reach patients in more places.
Replies 179 · Retweets 694 · Likes 6K · Views 395.2K
Alberto Tono retweeted
Chubby♨️ @kimmonismus
Really exciting: Google's MedGemma 1.5 packs 3D radiology, whole-slide pathology, longitudinal X-ray analysis, and clinical document understanding into a single open-weight 4B model (!), with a massive +47% F1 jump in pathology and +11% in MRI classification over v1. The specialized 4B model outperforms Gemini 3.0 Flash on out-of-distribution CT analysis, proving that targeted medical post-training beats raw scale. <3
Samuel Schmidgall @SRSchmidgall

The MedGemma 1.5 technical report is out 👇 arxiv.org/pdf/2604.05081…

Replies 10 · Retweets 47 · Likes 424 · Views 32.4K
Alberto Tono @albertotono3
Limitations: Because these models relied heavily on synthetic, distortion-free edge maps during training, they failed to generalize to real human sketches. They simply couldn't handle the massive domain gap, varying stylistic abstractions, physically unrealistic strokes, and the inherent figure/ground ambiguity of real free-hand drawings. Credit: Nan Xiang, Ruibin Wang, Tao Jiang, Li Wang, Yanran Li, Xiaosong Yang, Jianjun Zhang
Replies 1 · Retweets 0 · Likes 3 · Views 114
Alberto Tono @albertotono3
-26... Xiang et al. (2020) shattered the 3D-data dependency. They introduced an end-to-end CNN framework equipped with a differentiable renderer. By first generating an intermediate normal map from the sketch, their system recovers complete 3D polygon meshes using exclusively 2D supervision, entirely eliminating the need for 3D ground-truth data during training.
Replies 1 · Retweets 0 · Likes 3 · Views 206
Alberto Tono @albertotono3
Limitations: Despite the leap, Wang et al.'s model was fundamentally constrained by its reliance on explicit 3D ground-truth supervision during training, optimizing point clouds via 3D metrics like Chamfer and Earth Mover's Distances. Credit: Jiayun Wang, @Peter_j_Wang, @JieruiLin, Qian Yu, @rleobest, Yubei Chen, Stella X. Yu.
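For readers unfamiliar with the Chamfer distance mentioned above, here is a minimal sketch of one common symmetric form (mean nearest-neighbor distance in both directions). This is an illustration only, not Wang et al.'s implementation: the function name, the toy point sets, and the choice of unsquared Euclidean distance with mean reduction are all assumptions, and papers differ on those details.

```python
import math

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty 3D point sets.

    Assumed variant: unsquared Euclidean distance, averaged per direction.
    """
    def one_way(src, dst):
        # For each source point, take the distance to its nearest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)

# Hypothetical toy data: a predicted cloud vs. the 3D ground truth it is scored against.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
print(chamfer_distance(predicted, ground_truth))  # prints 1.0
```

Computing this loss needs a ground-truth point cloud for every training sketch, which is the 3D-supervision dependency the tweet describes.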
Replies 1 · Retweets 0 · Likes 4 · Views 82
Alberto Tono @albertotono3
-27... 3D Shape Reconstruction from Free-Hand Sketches addresses ShapeMVD's (x.com/albertotono3/s…) rigid requirement for multi-view, distortion-free line drawings (edge maps). Wang et al. tackle this view dependency and domain gap, allowing true single-view free-hand inputs and generalizing to real, distorted human sketches.
Alberto Tono @albertotono3

-28 ShapeMVD bypasses the strict volumetric resolution constraints of Delanoy et al. (x.com/albertotono3/s…). Instead of trying to directly predict a full 3D voxel grid, ShapeMVD predicts multi-view depth and normal maps from the 2D sketches. By operating in this 2.5D map space, ShapeMVD avoids the low-resolution, blocky limitations of a 64^3 grid, allowing it to capture finer surface details. However, this workaround comes with a trade-off, as ShapeMVD's generated depth maps still need to be registered and fused together using a complex optimization method to produce the final 3D surface.

Replies 1 · Retweets 0 · Likes 5 · Views 368
Alberto Tono @albertotono3
Limitations: It requires precisely aligned, multi-view line drawings and struggles to generate shapes from a single viewpoint. It fails on abstract or rough doodles, relying heavily on precise, distortion-free drawings that require professional skill. It outputs 2.5D depth and normal maps, which then require computationally heavy optimization steps to successfully fuse into a final 3D surface. Credit: Zhaoliang Lun, Matheus Gadelha, @EvangelosKalog1, @MajiSubhransu, @Summer912_
Replies 1 · Retweets 0 · Likes 4 · Views 157
Alberto Tono @albertotono3
-28 ShapeMVD bypasses the strict volumetric resolution constraints of Delanoy et al. (x.com/albertotono3/s…). Instead of trying to directly predict a full 3D voxel grid, ShapeMVD predicts multi-view depth and normal maps from the 2D sketches. By operating in this 2.5D map space, ShapeMVD avoids the low-resolution, blocky limitations of a 64^3 grid, allowing it to capture finer surface details. However, this workaround comes with a trade-off, as ShapeMVD's generated depth maps still need to be registered and fused together using a complex optimization method to produce the final 3D surface.
Alberto Tono @albertotono3

-29.. Delanoy et al. (2017) shattered previous bottlenecks (x.com/albertotono3/s…) by abandoning rigid templates! Instead of regressing parameters for predefined models, they introduced an updater-CNN that iteratively mapped 2D sketches directly into a voxel-based 3D space.

Replies 1 · Retweets 0 · Likes 6 · Views 715
Alberto Tono retweeted
Yael Vinker🎗 @YVinker
I am *very* excited to announce our SIGGRAPH 2026 workshop: Lines & Minds: Visual Abstraction in Art, Psychology, and Computer Graphics 🎨🧠🫖 🔗 lines-and-minds.github.io 📅 Sunday, July 19 Join us to explore how visual abstraction shapes how we think, create, and communicate.
Replies 6 · Retweets 19 · Likes 102 · Views 10.4K