Last week, we made Gemini Embedding 2, our first natively multimodal embedding model, available to the general public. Since then, developers have used it to build video analysis tools, visual shopping assistants, and more.
But you might be wondering... what is an embedding model? 🤔 Let’s break it down!
1. What is it?
Think of an embedding model as a "universal translator." It takes text, images, video, and audio and turns each one into a long list of numbers called a vector, like a unique digital fingerprint of its meaning.
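In code, getting that fingerprint looks roughly like this. A minimal sketch using the Gemini API's Python SDK; the model id "gemini-embedding-2" is an assumption for illustration, so check the docs for the exact identifier:

```python
from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

# Hypothetical model id, used here for illustration only
result = client.models.embed_content(
    model="gemini-embedding-2",
    contents="a video of a soccer goal",
)

vector = result.embeddings[0].values  # the "digital fingerprint"
print(len(vector), vector[:5])        # thousands of numbers; first few shown
```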
2. How does it work?
Historically, search has been text-only. Now, instead of just matching data by keyword, Gemini Embedding 2 maps text, images, video, and audio into the same vector space based on meaning. It "feels" the connection between a video of a soccer goal and the words "game-winning shot" without needing any tags.
In that space, "ocean" and "waves" sit close together, while "ocean" and "toaster" are miles apart.
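"Close" has a precise meaning here: the angle between two vectors, usually measured as cosine similarity. A toy sketch with made-up 3-dimensional vectors (real embeddings have thousands of dimensions):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 means pointing the same way (same meaning); near 0 means unrelated
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up stand-ins for real embedding vectors
ocean   = np.array([0.8, 0.1, 0.1])
waves   = np.array([0.7, 0.2, 0.1])
toaster = np.array([0.1, 0.1, 0.9])

print(cosine_similarity(ocean, waves))    # ~0.99, close together
print(cosine_similarity(ocean, toaster))  # ~0.24, miles apart
```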
3. How can you use it?
Developers have been using it to build smarter search into their apps. That means tools where you can snap a photo of a product and type "find this in yellow," or search thousands of hours of video by describing what happens in a scene.
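Once everything shares one vector space, that "find this in yellow" flow is just a nearest-neighbor search. A sketch under two assumptions: catalog embeddings are precomputed and unit-length, and a hypothetical embed() helper wraps the multimodal embedding call shown earlier:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in catalog: 1,000 products with precomputed unit-length embeddings
# (in production these would come from the embedding model)
catalog_vecs = rng.normal(size=(1000, 64))
catalog_vecs /= np.linalg.norm(catalog_vecs, axis=1, keepdims=True)

def find_best_matches(query_vec: np.ndarray, top_k: int = 3):
    # With unit-length vectors, cosine similarity is just a dot product
    scores = catalog_vecs @ query_vec
    best = np.argsort(scores)[::-1][:top_k]
    return [(int(i), float(scores[i])) for i in best]

# In production: query_vec = embed([product_photo, "find this in yellow"]),
# where embed() is a hypothetical wrapper around the multimodal model.
# Here we fake a query that should match catalog item 42.
query_vec = catalog_vecs[42] + rng.normal(scale=0.05, size=64)
query_vec /= np.linalg.norm(query_vec)

print(find_best_matches(query_vec))  # item 42 should rank first
```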
4. Ready to try it out for yourself?
You can start using it today via the Gemini API or the Gemini Enterprise Agent Platform.