Ben Caine

9 posts

@bcaine

Research Engineer @ Google DeepMind. I lead Gemini's vision post-training effort. Opinions my own.

Cambridge, MA · Joined June 2011
126 Following · 492 Followers
Ben Caine @bcaine
@bhalligan I'd also be interested. Not a founder, but I grew up in MA, went to Northeastern, left for SF, and came back. I lead vision post-training for Gemini / Google DeepMind and used to work on self-driving cars in Boston and later in SF. I'd love to figure out how to keep folks here.
Replies 0 · Reposts 0 · Likes 0 · Views 64
Ben Caine @bcaine
The power of generalization
Rohan Paul @rohanpaul_ai

Google’s Gemini 3.0 Pro decodes 500-Year-Old Nuremberg chronicle mysteries. 👏
Gemini 3.0 Pro was given high-resolution page images and produced a coherent explanation of what the notes were doing, not just a transcription. It concluded the roundels are a small calculation table meant to reconcile 2 competing biblical timelines and pin down Abraham’s birth year.
The hard part for earlier attempts was that the notes mix abbreviated Latin, Roman numerals, and implied context from the printed page, so reading the marks alone does not reveal intent. The prompt asked for transcription, translation, and meaning using the surrounding text, and it reportedly used 5 images total, a 2-page spread plus 4 zooms.
Gemini tied the scribbles to “Anno Mundi” dating, meaning “Year of the World,” and treated them as conversions into a “before Christ” timeline. It linked 3184 and 2040 “Year of the World” figures to the Septuagint and Hebrew Bible traditions, then mapped them to roughly 2015 BC and 1915 BC, a 100-year gap the annotator was trying to resolve.
A great example of multimodal models as research assistants when the task needs reading, cross-referencing, and arithmetic in one pass, but it still needs expert verification because a single digit slip can change the conclusion.
--- siliconangle.com/2026/01/01/googles-gemini-3-0-pro-helps-solve-long-standing-mystery-nuremberg-chronicle/

Replies 0 · Reposts 0 · Likes 2 · Views 146
Ben Caine @bcaine
@DemetriusZhomir 😂 sorry... It's just Gemini 3 predicting bounding boxes on OAI's blog post image, but I couldn't find the original image, so I had Nano Banana delete their detections before feeding it into Gemini.
Replies 2 · Reposts 0 · Likes 50 · Views 11.1K
Ben Caine @bcaine
I had Nano Banana remove GPT5.2's bounding boxes and had Gemini 3 give it a go.
Left: GPT5.2
Right: Gemini 3.0
[Two images]
Replies 57 · Reposts 95 · Likes 1.5K · Views 230.5K
mewtwo @neomewtwo
@bcaine did it output any numbers or no
Replies 1 · Reposts 0 · Likes 6 · Views 14.5K
Ben Caine @bcaine
@_beenkim You can even just ask it for the coordinates :)
[Two images]
Replies 2 · Reposts 0 · Likes 8 · Views 807
Been Kim @_beenkim
Gemini 3 CAN. 🥳🎉 Rejoice, all couples with at least one absent-minded party!
[Image]
Replies 4 · Reposts 1 · Likes 25 · Views 11.8K
Been Kim @_beenkim
My husband lost his wedding ring during our honeymoon 🙄 (this was many years ago). "Stop playing with it - you will lose it." "Nah, it's fine. It's right here!" Then it was gone, never seen again. 💍 If only I had Gemini 3 to find it then! Here is a picture - there is a gold ring in it. Can you find it?
[Image]
Replies 55 · Reposts 8 · Likes 165 · Views 77.9K
Ben Caine @bcaine
@_arohan_ Come play with us in the vision arena 😉
Replies 0 · Reposts 0 · Likes 0 · Views 38