Adrià Recasens

4.4K posts

@arecasens

👨‍💻 Research Scientist @DeepMind 👀🔊 Multimodal 🗣️ Views are my own

London, England Joined July 2007

1.9K Following 1.5K Followers

Pinned Tweet

Adrià Recasens@arecasens·29 Mar

This April I will be running the @LLHalf in memory of my nephew Guifré and in support of the work that @tommys does for those who have suffered baby loss. If you are interested in contributing, here is my fundraising page! llhm.tommys.org/fundraising/ar…

English

1.6K

Adrià Recasens retweeted

Demis Hassabis@demishassabis·26 Mar

Gemini 3.1 Flash Live is our highest quality audio & voice model yet - and a big leap towards building next-gen voice-first agents. Lower latency, better precision, more natural interactions... try it now with Gemini Live in the @GeminiApp or build with it in @GoogleAIStudio!

Google DeepMind@GoogleDeepMind

Say hello to Gemini 3.1 Flash Live. 🗣️ Our latest audio model delivers more natural conversations with improved function calling – making it more useful and informed. Here’s what’s new 🧵

English

125

137

1.5K

286.9K

Adrià Recasens@arecasens·27 Nov

@OriolVinyalsML @la_UPC Congratulations!

Catalan

Oriol Vinyals@OriolVinyalsML·25 Nov

On my way to Barcelona to receive a Doctor Honoris Causa from my alma mater, @la_UPC. Truly honored! 🎓 Join Thursday for my Master Class, "From AI to AGI: The Quest for True Intelligence." Hope to see you there! telecos.upc.edu/ca/esdevenimen… "Create an image at 41.4036° N, 2.1744° E, January 1st, 1983, 15:00 hours."

English

159

1.5K

254.1K

Adrià Recasens@arecasens·21 Oct

@BieXiaoyu @GoogleDeepMind Of course, no problem!

English

XIAOYU BIE@BieXiaoyu·20 Oct

@arecasens @GoogleDeepMind Hi Adrià! I’m a postdoc working on generative AI for audio, and this role looks like a perfect fit with my background and interests. Mind if I DM you a couple of quick questions?

English

138

Adrià Recasens@arecasens·16 Oct

We are hiring! I'm looking for a Research Scientist to work on Gemini Audio in @GoogleDeepMind London ♊️🔊 Feel free to reach out if you have any questions! job-boards.greenhouse.io/deepmind/jobs/…

English

2.1K

Adrià Recasens retweeted

Google AI Developers@googleaidevs·3 Jun

🔊Native audio outputs in Gemini 2.5 give developers new ways to build richer applications with conversation and speech. ↓ blog.google/technology/goo…

English

111

853

78.1K

Adrià Recasens@arecasens·24 May

Very nice demo of many of the capabilities available with native audio out, try it yourself in aistudio.google.com/apps/bundled/m…

Google AI Developers@googleaidevs

See Native Audio in action 🤠🦊 Our "Mumble Jumble" demo in Google AI Studio showcases the Live API's advanced voice capabilities: natural flow, distinct tone, emotion, and multilingual support.

English

456

Adrià Recasens@arecasens·24 May

We are shipping! 🚢🚢🚢 Native audio output is now available in the Live API. Very natural interaction with the option of using Google Search or thinking for more refined answers. Try it out in aistudio.google.com/live and let us know what you think!

Google AI Developers@googleaidevs

Gemini 2.5 Flash Preview now supports native audio output via the Live API for seamless and natural spoken interactions. With support for 30+ voices, build conversational AI agents and experiences that feel more intuitive and natural → ai.google.dev/gemini-api/doc…

English

1.4K

Adrià Recasens retweeted

Ankur Bapna@ankurbpn·29 Apr

Happy to see the first feature powered by Gemini native audio outputs ship out to public - especially since it's MASSIVELY multilingual. Lots more coming soon 😉

NotebookLM@NotebookLM

This just in... the @NotebookLM hosts have some rather exciting news they'd like to share with you all:

English

330

20.8K

Adrià Recasens@arecasens·29 Apr

NotebookLM speaks Catalan! Very happy to see Gemini helping NotebookLM speak more than 50 languages!

NotebookLM@NotebookLM

This just in... the @NotebookLM hosts have some rather exciting news they'd like to share with you all:

Catalan

3.6K

Adrià Recasens retweeted

Oriol Vinyals@OriolVinyalsML·25 Mar

Introducing Gemini 2.5 Pro Experimental! 🎉 Our newest Gemini model has stellar performance across math and science benchmarks. It’s an incredible model for coding and complex reasoning, and it’s #1 on the @lmarena_ai leaderboard by a drastic 40 ELO margin. Only a handful of model releases have leaped ahead so strongly in ELO. 📈 ELO score differences map directly to win rate: e.g. a 400 ELO difference yields a ~91% win rate. Incredible that since 1.5, just a year ago, we jumped 200 ELO (300 since 1.0). Here’s a fun example where Gemini 2.5 Pro writes code to create an animated swarm of colorful boids swimming in a rotating hexagon. 💫 Try the model for free today in AI Studio. It’s also available to Gemini Advanced users in @geminiapp. aistudio.google.com/app/prompts/ge… Blog: goo.gle/4c3NitO

English

162

1.1K

211.7K

Adrià Recasens@arecasens·8 Jan

Thanks to @parentesismedIA for the recognition!

Paréntesis MEDia@parentesismedIA

🎉 All the winners of the Premios Paréntesis 2024! 🏆 👤 Person of the Year: Sam Altman. Spaniard of the Year: Mateo Valero, director of the BSC. 🚀 Best Entrepreneur: Anna Giralt, of Artefacto. Catalan of the Year: Adrià Recasens Continente. 📩 Details: bsniu.r.ag.d.sendibm3.com/mk/mr/sh/OycXx…

Catalan

525

Adrià Recasens retweeted

Alexander Chen@alexanderchen·21 Dec

New Gemini 2.0 modalities will enable entirely new interfaces! ✨ that's why I love this early experimentation my teammate @trudypainter is doing with native audio output in her VoiceCursor prototype. I've been playing with this UI and it really feels like a magical piece of paper coming to life. Excited for native audio to roll out more widely so more people can start experimenting.

Trudy Painter@trudypainter

I’ve been exploring Gemini 2.0’s new native audio output capability, which is available for early testers. I’m a developer at Google Creative Lab, and wanted to share one of my favorite experiments so far called ✨ VoiceCursor (🔊 sound on for video) Unlike traditional TTS, native audio lets you prompt the model with expressive styles, ie “Say this like a disgruntled pirate…” So I made ✨VoiceCursor… it lets you rapidly try different prompts. Just type, highlight your phrase, then hear it spoken in different ways! My code is open-sourced here: github.com/googlecreative… Here’s a thread 🧵

English

1.5K

Adrià Recasens retweeted

Google DeepMind@GoogleDeepMind·16 Dec

Gemini 2.0 Flash Experimental has the ability to produce native audio in a variety of styles and languages - all from scratch. 🗣️ Here’s how this is different to traditional text-to-speech systems ↓ aistudio.google.com/live

English

250

1.4K

125.7K

Adrià Recasens@arecasens·17 Dec

Congrats to the #Veo2 team, brilliant work! This is so far my favorite example, combining generation and reasoning 🤯🤯🤯

Hernan Moraldo@hhm

Prompt: "Bear writing the solution to 2x-1=0. But only the solution!"

English

855

Adrià Recasens retweeted

Antoine Yang@AntoineYang2·17 Dec

Gemini 2.0 Flash's video understanding is here 🚀 Think: search in videos via timecodes, extract text from moving camera footage, analyze screen recordings in real-time interactions with native audio out 🔊 Come and try it aistudio.google.com 😀 youtu.be/Mot-JEU26GQ?si…

YouTube

English

8.6K

Adrià Recasens retweeted

Joost van Amersfoort@joost_v_amersf·13 Dec

A very interesting opportunity to work at the intersection of data and scaling. Paul's insights have been crucial to the success of Gemini 2.0 flash (and 1.5 and 1.0 and... 👀). He makes for an excellent mentor/manager! Come help us push the frontier further 🦾.

Paul Michel@pmichelX

Interested in working on Gemini pre-training? I'm hiring a research scientist to work on pre-training data @GoogleDeepMind in London: boards.greenhouse.io/deepmind/jobs/… I am unfortunately not at #NeurIPS2024 but feel free to reach out to ask questions or see the team at the booth there!

English

2.8K

Adrià Recasens retweeted

Ankur Bapna@ankurbpn·11 Dec

Great work by many amazing folks to put this together - looking forward to when it's available to everyone ♊️

Logan Kilpatrick@OfficialLoganK

Gemini 2.0 Flash comes with native audio output, and it’s actually wild 🤯 we are working hard to roll this out quickly to more folks!

English

1.2K

Adrià Recasens retweeted

Alexander Chen@alexanderchen·11 Dec

"Say this in a whisper ..." 💬 Native audio output was definitely my favorite Gemini 2.0 demo to make. Being able to steer the voice so expressively with just prompts felt totally new. 🙂 x.com/googleaidevs/s…