Basil Mustafa
@_basilM

209 posts

ML research @ Google DeepMind Zürich · vision things in Gemini · opinions are mine but registered copyright of my mother

Zürich, Schweiz · Joined November 2019
128 Following · 1.7K Followers
Basil Mustafa @_basilM
🔥
Arena.ai @arena

BREAKING: Gemini 2.5 Pro is now #1 on the Arena leaderboard - the largest score jump ever (+40 pts vs Grok-3/GPT-4.5)! 🏆 Tested under codename "nebula"🌌, Gemini 2.5 Pro ranked #1🥇 across ALL categories and UNIQUELY #1 in Math, Creative Writing, Instruction Following, Longer Query, and Multi-Turn! Massive congrats to @GoogleDeepMind for this incredible Arena milestone! 🙌 More highlights in thread👇

Basil Mustafa retweeted
Kevin Patrick Murphy @sirbayes
Gemma 3 is best in class for a VLM that runs on 1 GPU. Should make RL fine-tuning feasible. Also, academic researchers can apply for Google Cloud credits (worth $10,000 per award) to accelerate their Gemma 3-based research.
Zoubin Ghahramani @ZoubinGhahrama1

Introducing Gemma 3: our open model that runs on single GPU/TPU! SoTA AI for its size; multimodal; supporting 140 languages; long context of 128k; fast and high-performance with quantization; open weights; function calling. What more do you want? @google blog.google/technology/dev…

Basil Mustafa retweeted
Fabian Mentzer @mentzer_f
Chilling more
[image]
Basil Mustafa retweeted
Michael Tschannen @mtschannen
Have you ever wondered how to train an autoregressive generative transformer on text and raw pixels, without a pretrained visual tokenizer (e.g. VQ-VAE)? We have been pondering this during summer and developed a new model: JetFormer 🌊🤖 arxiv.org/abs/2411.19722 A thread 👇 1/
[image]
Basil Mustafa @_basilM
I cannot recommend this enough. Paul is absolutely legendary, the whole team in London is fantastic and the vibes are impeccable. You will learn so much and accomplish awesome things. Apply!!! 🚀♊
Paul Michel @pmichelX

Interested in working on Gemini pre-training? I'm hiring a research scientist to work on pre-training data @GoogleDeepMind in London: boards.greenhouse.io/deepmind/jobs/… I am unfortunately not at #NeurIPS2024 but feel free to reach out to ask questions or see the team at the booth there!

Basil Mustafa retweeted
Oriol Vinyals @OriolVinyalsML
Gemini 2.0 Flash ⚡️ has arrived! 2.0 Flash > 1.5 Pro (again!) 📈 Interacts with a browser 🤖 Native image generation 🖼️ and much more! Try it out aistudio.google.com/prompts/new_ch… As a preview of what is possible, wishing you all a Drastic Holiday powered by 2.0!
Basil Mustafa @_basilM
🔥💪🏾♊😤
Jeff Dean @JeffDean

Gemini 1.5 Model Family: Technical Report updates now published.

In the report we present the latest models of the Gemini family – Gemini 1.5 Pro and Gemini 1.5 Flash, two highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio.

Our latest report details notable improvements in Gemini 1.5 Pro within the last four months. Our May release demonstrates significant improvement in math, coding, and multimodal benchmarks compared to our initial release in February. Furthermore, the 1.5 Pro model is now stronger than 1.0 Ultra: the latest Gemini 1.5 Pro is now our most capable model for text and vision understanding tasks, surpassing 1.0 Ultra on 16 of 19 text benchmarks and 18 of 21 vision understanding benchmarks. The table below highlights the improvement in average benchmark performance for different categories in 1.5 Pro since February, and also shows the strength of the model relative to the 1.0 Pro and 1.0 Ultra models. The 1.5 Flash model also compares very well against the 1.0 Pro and 1.0 Ultra models.

One clear example can be seen on MMLU: 1.5 Pro surpasses 1.0 Ultra in the regular 5-shot setting, scoring 85.9% versus 83.7%. With additional inference compute, via majority voting on top of multiple language model samples, it reaches 91.7% versus Ultra's 90.0%, which extends the known performance ceiling of this task.

@OriolVinyalsML and I are very proud of the whole Gemini team, and it's fantastic to see this progress and to share these highlights from our Gemini Model Family. Read the updated report here: goo.gle/GeminiV1-5

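The majority-voting setup Jeff describes (sampling several answers and keeping the most common one) can be sketched in a few lines. This is a toy illustration of the general technique, not the actual Gemini evaluation code; `majority_vote` and the sample answers are made up for the example.

```python
from collections import Counter

def majority_vote(samples):
    """Return the most common answer among independent model samples."""
    # Counter.most_common(1) yields [(answer, count)] for the top answer
    answer, _count = Counter(samples).most_common(1)[0]
    return answer

# Five hypothetical sampled answers to one MMLU question:
print(majority_vote(["B", "C", "B", "B", "D"]))  # → B
```

Spending extra inference compute this way trades additional samples for accuracy, which is how the reported MMLU score moves from 85.9% (plain 5-shot) to 91.7% (majority voting).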
Basil Mustafa retweeted
Pietro Schirano @skirano
I was so impressed with the Astra demo at Google I/O yesterday that I decided to build my own version using Gemini 1.5 Pro Flash. It's so fast and really good. ⚡️ It was even able to detect the gate! Content is streamed directly from my camera. Voice via @elevenlabs
Basil Mustafa retweeted
Jacob Austin @jacobaustin132
This is something I've worked on for a while! You can save the activations of one LLM call and reuse them for a follow-up that overlaps with the first. This means asking a question about a big codebase can take 30 seconds the first time and 1s after that!
Jaana Dogan ヤナ ドガン @rakyll

Gemini’s context caching is one of the most exciting releases that came out of Google I/O. ai.google.dev/gemini-api/doc…

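The activation-reuse idea Jacob describes can be sketched as a simple prefix cache: the expensive pass over a long shared context runs once, and follow-up queries reuse the stored state. This is a toy model of the concept only; `PrefixCache` and its methods are hypothetical stand-ins, not the Gemini API or its internals.

```python
class PrefixCache:
    """Toy model of context caching: do the expensive work on a long
    shared prefix once, then reuse it for every follow-up query."""

    def __init__(self):
        self._store = {}        # prefix -> precomputed "activation" state
        self.compute_calls = 0  # how many expensive prefix passes ran

    def _compute_state(self, prefix):
        # Stand-in for running the model over the prefix and saving
        # its activations (the ~30 s step in the example above).
        self.compute_calls += 1
        return {"n_tokens": len(prefix.split())}

    def answer(self, prefix, query):
        if prefix not in self._store:          # slow path: first call only
            self._store[prefix] = self._compute_state(prefix)
        state = self._store[prefix]            # fast path: cached state
        return f"[{state['n_tokens']}-token context] {query}"

cache = PrefixCache()
codebase = "def foo(): pass " * 1000          # large shared context
cache.answer(codebase, "what does foo do?")   # computes the prefix once
cache.answer(codebase, "any bugs in foo?")    # reuses the cached state
assert cache.compute_calls == 1
```

The design point is that the cache key is the shared prefix itself, so any query whose start overlaps a previous call skips the expensive recomputation entirely.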