Harsh Mehta

70 posts

@HarshMeh1a

@MirendilAI, Past: AI R&D @AnthropicAI, @GoogleDeepmind, Gemini

San Francisco, CA · Joined August 2013
407 Following · 5.1K Followers
Pinned Tweet
Harsh Mehta @HarshMeh1a
Career Update: I've left Anthropic to start something new. Anthropic is a magical place — amazing people, strong culture, and unmatched taste. I have a lot of respect for my friends and ex-colleagues, and knowing them, I'm confident they'll do the right thing, especially when the choices are hard. I feel grateful and proud to have been part of the journey with them. Excited to start a new chapter — stay tuned for more!
77 replies · 31 reposts · 1.7K likes · 288.3K views
Harsh Mehta retweeted
Behnam Neyshabur @bneyshabur
I've left Anthropic to start something new. 🧵
Behnam Neyshabur tweet media
155 replies · 62 reposts · 2.9K likes · 403.7K views
Harsh Mehta retweeted
Ashok Cutkosky @AshokCutkosky
Some ideas on a new optimizer from my student Qinzi Zhang: (github.com/ZQZCalin/train…) Early stages, but the empirical results are really promising! Would love to hear any thoughts, either on the empirical side or analysis-wise, and open to collaboration!
2 replies · 16 reposts · 85 likes · 16.4K views
Harsh Mehta retweeted
Arena.ai @arena
Exciting News from Chatbot Arena! @GoogleDeepMind's new Gemini 1.5 Pro (Experimental 0801) has been tested in Arena for the past week, gathering over 12K community votes. For the first time, Google Gemini has claimed the #1 spot, surpassing GPT-4o/Claude-3.5 with an impressive score of 1300 (!), and also achieving #1 on our Vision Leaderboard. Gemini 1.5 Pro (0801) excels in multi-lingual tasks and delivers robust performance in technical areas like Math, Hard Prompts, and Coding. Huge congrats to @GoogleDeepMind on this remarkable milestone!

Gemini (0801) Category Rankings:
- Overall: #1
- Math: #1-3
- Instruction-Following: #1-2
- Coding: #3-5
- Hard Prompts (English): #2-5

Come try the model and let us know your feedback! More analysis below👇
Arena.ai tweet media
Logan Kilpatrick @OfficialLoganK

Today, we are making an experimental version (0801) of Gemini 1.5 Pro available for early testing and feedback in Google AI Studio and the Gemini API. Try it out and let us know what you think! aistudio.google.com

82 replies · 390 reposts · 1.6K likes · 1.3M views
Harsh Mehta retweeted
Aaron Defazio @aaron_defazio
Schedule-Free Wins AlgoPerf Self-Tuning Track 🎉 I'm pleased to announce that Schedule-Free AdamW set a new SOTA for self-tuning training algorithms, besting AdamW and all other submissions by 8% overall. Try it out: github.com/facebookresear…
MLCommons @MLCommons

@MLCommons #AlgoPerf results are in! 🏁 $50K prize competition yielded 28% faster neural net training with non-diagonal preconditioning beating Nesterov Adam. New SOTA for hyperparameter-free algorithms too! Full details in our blog. mlcommons.org/2024/08/mlc-al… #AIOptimization #AI

15 replies · 28 reposts · 289 likes · 122.4K views
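For readers who want to take up the "try it out" invitation, here is a minimal sketch of a Schedule-Free training loop, assuming the `schedulefree` pip package published from the facebookresearch/schedule_free repo and its `AdamWScheduleFree` class (treat the exact names and signatures as assumptions). The main practical departure from a standard PyTorch loop is the explicit optimizer.train()/optimizer.eval() switching, since the method maintains two sequences of iterates:

```python
# Minimal sketch, assuming the `schedulefree` pip package and its
# AdamWScheduleFree class (pip install schedulefree). The model, data,
# and hyperparameters below are toy placeholders.
import torch
import schedulefree

model = torch.nn.Linear(10, 1)
optimizer = schedulefree.AdamWScheduleFree(model.parameters(), lr=1e-3)

optimizer.train()  # switch parameters to the training iterates
for _ in range(100):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

optimizer.eval()  # switch to the averaged iterates before evaluating
```

Note there is no LR scheduler object anywhere in the loop; that is the point of the method.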
Harsh Mehta @HarshMeh1a
Check out our work on pushing the boundaries of the reasoning capabilities of Gemini 1.5 Pro! We've been working hard at this. Excited about the progress we've made, and even more so for what's next! Full report: goo.gle/GeminiV1-5
Oriol Vinyals @OriolVinyalsML

Today we have published our updated Gemini 1.5 Model Technical Report. As @JeffDean highlights, we have made significant progress in Gemini 1.5 Pro across all key benchmarks; TL;DR: 1.5 Pro > 1.0 Ultra, 1.5 Flash (our fastest model) ~= 1.0 Ultra.

As a math undergrad, our drastic results in mathematics are particularly exciting to me! In section 7 of the tech report, we present new results on a math-specialised variant of Gemini 1.5 Pro which performs strongly on competition-level math problems, including a breakthrough performance of 91.1% on Hendrycks' MATH benchmark without tool-use (examples below 🧵).

Gemini 1.5 is widely available, try it out for free here aistudio.google.com & read the full tech report here: goo.gle/GeminiV1-5

0 replies · 5 reposts · 36 likes · 8.5K views
Harsh Mehta @HarshMeh1a
If you hate LR schedules as much as I do, check this out! Think of the significance in practice, especially for LLMs — new data comes in, you continue training w/o any change, as simple as that 🔥 Stay tuned for the official Jax version.
Aaron Defazio @aaron_defazio

Schedule-Free Learning github.com/facebookresear… We have now open sourced the algorithm behind my series of mysterious plots. Each plot was either Schedule-free SGD or Adam, no other tricks!

3 replies · 6 reposts · 63 likes · 16K views
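To make the "new data comes in, you continue training w/o any change" point concrete, here is a NumPy sketch of the core Schedule-Free recursion as described in the public write-ups: gradients are evaluated at an interpolation y between a fast SGD iterate z and a running average x, and x is what you evaluate or deploy. The toy objective and the constants are made up for illustration:

```python
# Sketch of the Schedule-Free SGD recursion on a toy quadratic.
# lr and beta are illustrative; the objective f(w) = 0.5*||w||^2 is made up.
import numpy as np

def grad(w):
    return w  # gradient of f(w) = 0.5 * ||w||^2

lr, beta = 0.1, 0.9
z = np.array([5.0, -3.0])  # fast SGD iterate
x = z.copy()               # running average (the iterate you evaluate)

for t in range(1, 101):
    y = (1 - beta) * z + beta * x  # gradients are taken at y
    z = z - lr * grad(y)           # plain SGD step: fixed lr, no schedule
    c = 1.0 / (t + 1)
    x = (1 - c) * x + c * z        # online average of the z sequence

print(x)  # approaches the minimizer with no decay schedule and no horizon T
```

Because nothing in the update depends on a total step count, training can simply keep going when more data arrives.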
Harsh Mehta retweeted
Oriol Vinyals @OriolVinyalsML
Exciting times, welcome Gemini (and MMLU>90)! State-of-the-art on 30 out of 32 benchmarks across text, coding, audio, images, and video, with a single model 🤯 Co-leading Gemini has been my most exciting endeavor, fueled by a very ambitious goal. And that is just the beginning!

A long 🐍 post about our Gemini journey & state of the field.

The biggest challenges in LLMs are far from trivial or obvious. Evaluation and data stand out to me. We've moved beyond the simpler "Have we won in Go/Chess/StarCraft?" to "Is this answer accurate and fair? Is this conversation good? Does this complex piece of text prove the theorem?" Exciting potential coupled with monumental challenges.

The field is less ripe further down the model pipeline. Pretraining is relatively well understood. Instruction tuning and RLHF, less so. In AlphaGo and AlphaStar we spent 5% of compute in pre-training and the rest in the very important RL phase, where the model learns from its successes or failures. In LLMs, we spend most of our time on pretraining. I believe there's huge potential to be untapped. Cakes with lots of cherries, please 🎂

@Google has demonstrated its ability to move fast. It has been an absolute blast to see the energy from my colleagues and the support received. A "random" highlight is coauthoring our tech report with a co-founder. Another is coleading with @JeffDean. But beyond individuals, Gemini is about teamwork: it is important to recognize the collective effort behind such achievements. Picture a room full of brilliant people, and avoid attributing success solely to one person.

On a personal note, recently I celebrated my 10 year anniversary at Google, and it's been 8 years since @quocleix and I co-authored "A Neural Conversational Model", which gave us a glimpse of what was, has, and is yet to come. Back then, that line of work received a lot of skepticism. Lessons learned: whatever your passion is, push for it!

Zooming back out, there's lots of change in our field, and the stakes couldn't be higher. Excited for what's to come from Gemini, but humbled by the responsibility to "get it right". 2024 will be drastic. Welcome Gemini! blog.google/technology/ai/…
GIF
63 replies · 267 reposts · 2K likes · 456.2K views
Harsh Mehta retweeted
Google DeepMind @GoogleDeepMind
We’re excited to announce 𝗚𝗲𝗺𝗶𝗻𝗶: @Google’s largest and most capable AI model. Built to be natively multimodal, it can understand and operate across text, code, audio, image and video - and achieves state-of-the-art performance across many tasks. 🧵 dpmd.ai/announcing-gem…
162 replies · 1.5K reposts · 5.8K likes · 1.3M views
Harsh Mehta @HarshMeh1a
It’s been immense fun to have contributed to Gemini and a privilege to work with such talented colleagues! This is 1.0, we will continue to ship 💜
Jeff Dean @JeffDean

I’m very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate really strong capabilities across the image, audio, video, and text domains.

Our most-capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks, including 10 of 12 popular text and reasoning benchmarks, 9 of 9 image understanding benchmarks, 6 of 6 video understanding benchmarks, and 5 of 5 speech recognition and speech translation benchmarks.

Gemini Ultra is the first model to achieve human-expert performance on MMLU across 57 subjects with a score above 90%. It also achieves a new state-of-the-art score of 62.4% on the new MMMU multimodal reasoning benchmark, outperforming the previous best model by more than 5 percentage points.

Gemini was built by an awesome team of people from @GoogleDeepMind, @GoogleResearch, and elsewhere at @Google, and is one of the largest science and engineering efforts we’ve ever undertaken. As one of the two overall technical leads of the Gemini effort, along with my colleague @OriolVinyalsML, I am incredibly proud of the whole team, and we’re so excited to be sharing our work with you today!

There’s quite a lot of different material about Gemini available, starting with:
Main blog post: blog.google/technology/ai/…
60-page technical report authored by the Gemini Team: deepmind.google/gemini/gemini_…

In this thread, I’ll walk you through some of the highlights.

2 replies · 1 repost · 17 likes · 2.7K views
Harsh Mehta @HarshMeh1a
@keirp1 Thanks for all your awesome contributions, it was great to have you with us! Until next time :)
0 replies · 0 reposts · 1 like · 131 views
Keiran Paster @keirp1
Heading back to Toronto after spending the fall at Google hosted by @HarshMeh1a and working with amazing Blueshift and Gemini teammates! It's a really fun time to work on LLMs and I hope to be back soon!
Keiran Paster tweet media
1 reply · 1 repost · 44 likes · 4.7K views
Harsh Mehta retweeted
Konstantin Mishchenko @konstmish
Why do we need warm-up, cosine annealing, and other learning rate schedules when training with gradient descent? It turns out it's all about how gradient norms change over time. E.g., large norms at the start => warm-up. Slow decrease => cosine. Paper: arxiv.org/abs/2310.07831 0/4
Konstantin Mishchenko tweet media
4 replies · 66 reposts · 412 likes · 56.2K views
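For concreteness, here is a small sketch (not from the paper) of the schedule shapes these threads keep referring to: a linear warm-up followed by either cosine annealing or linear decay to zero. The step counts and peak LR are arbitrary illustrative values:

```python
# Sketch of the common LR schedule shapes discussed here; T, warmup,
# and peak_lr are arbitrary illustrative values.
import math

def warmup_cosine(t, T, warmup, peak_lr):
    if t < warmup:                      # linear warm-up ramp
        return peak_lr * t / warmup
    progress = (t - warmup) / (T - warmup)
    return peak_lr * 0.5 * (1 + math.cos(math.pi * progress))

def warmup_linear(t, T, warmup, peak_lr):
    if t < warmup:                      # same warm-up ramp
        return peak_lr * t / warmup
    return peak_lr * (1 - (t - warmup) / (T - warmup))  # straight line to zero

T, warmup, peak_lr = 10_000, 500, 3e-4
for t in (0, 250, 500, 5_000, 10_000):
    print(t, round(warmup_cosine(t, T, warmup, peak_lr), 6),
             round(warmup_linear(t, T, warmup, peak_lr), 6))
```

Both schedules ramp up over the first 500 steps and anneal to zero at step T; they differ only in the shape of the decay, which is exactly the cosine-vs-linear comparison made in the next tweets.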
Harsh Mehta @HarshMeh1a
3) All the cool kids tend to use the "cosine decay" schedule by default these days; our work explains, and illustrates with a number of experiments, that "linear decay" can be surprisingly effective and sometimes even outperform cosine, including on LLMs!
1 reply · 0 reposts · 2 likes · 330 views
Harsh Mehta @HarshMeh1a
Check out our new work on learning the learning rate *schedule*! In addition to providing theoretical motivation for some prevalent LR-schedule best practices, our work provides new recommendations; a short 🧵👇
Aaron Defazio @aaron_defazio

🚨 New Paper 🚨 A new approach to learning rate scheduling! Our refinement theory gives schedules that include warmup and annealing-to-zero automatically. arxiv.org/abs/2310.07831 It improves on strong baseline schedules across a majority of deep learning problems!

1 reply · 1 repost · 11 likes · 1.3K views