Soham

627 posts

@sohamg121

Research Scientist @MistralAI on multimodal/audio LLMs. Previously: @GoogleDeepMind. MS @CarnegieMellon.

Mountain View, CA · Joined August 2013
1.2K Following · 436 Followers

Soham reposted
Alankar Jain @alankarjain91
Two truths, no lies:
- Most AI models are insanely powerful.
- Most people use a tiny fraction of that power.
Closing that gap isn't about better models, it's about better products. NextToken puts AI's power in your hands, for your day-to-day work.
1 reply · 1 repost · 4 likes · 103 views

Soham reposted
Guillaume Lample @ NeurIPS 2024 @GuillaumeLample
Our first speech model, Voxtral TTS, is out. It delivers SOTA performance while significantly reducing cost compared to existing solutions, and it operates with very low latency.

It uses a new architecture that combines auto-regressive generation of semantic speech tokens with flow-matching for acoustic tokens. We are also releasing a technical report sharing all our training methodology and insights.

Much more to come in audio -- stay tuned!
Guillaume Lample @ NeurIPS 2024 tweet media
Mistral AI @MistralAI

🔊 Introducing Voxtral TTS: our new frontier open-weight model for natural, expressive, and ultra-fast text-to-speech
🎭 Realistic, emotionally expressive speech.
🌍 Supports 9 languages and accurately captures diverse dialects.
⚡ Very low latency for time-to-first-audio.
🔄 Easily adaptable to new voices

28 replies · 53 reposts · 695 likes · 45.8K views

Soham reposted
Mistral AI @MistralAI
🔊 Introducing Voxtral TTS: our new frontier open-weight model for natural, expressive, and ultra-fast text-to-speech
🎭 Realistic, emotionally expressive speech.
🌍 Supports 9 languages and accurately captures diverse dialects.
⚡ Very low latency for time-to-first-audio.
🔄 Easily adaptable to new voices
210 replies · 614 reposts · 4.6K likes · 869.9K views

Soham @sohamg121
@jachiam0 There is no way this happens through closed-source AI built by profit-chasing enterprises, and your CEO's offhand comments are a testament to that. I do believe in the potential promise of AI, as you say, but one cannot ignore selfish human motivations.
0 replies · 0 reposts · 2 likes · 60 views

Joshua Achiam @jachiam0
We're entering the phase of AI politics where society will intensely debate whether it is a good idea to build AI at all. Builders need to make the case. The way I see it, AI is our best chance to defeat hunger, want, death, and war. It's a moral imperative to try.
130 replies · 23 reposts · 294 likes · 96.2K views

Soham @sohamg121
@giffmana did it debug the right way, i.e. put a bunch of print statements and run the code again and again?
0 replies · 0 reposts · 0 likes · 110 views

Lucas Beyer (bl16) @giffmana
Well damn, it was bound to happen and this morning it happened. There's a big chunk of code touching many pieces that i know in depth because i proudly hand-crafted it all. But it has one bug that i wasn't able to pinpoint even after an hour of debugging yesterday evening.

This morning i resume debugging, but after another 10min of being none the wiser i decided to shoot a prompt to Opus4.6 and let it search while i continue debugging. A mere 1min51s later, Opus actually found the bug, in a file that i didn't even consider looking at during my two debugging sessions.

This marks the first time an LLM found the bug in my code faster than me. I've tried this many times in the past, and i was always faster or ~same.

FWIW the root cause was simple: np.prod(tuple_of_pyints) returns a np.int64, not a python int. Finding this as the root cause of my bug is what was not simple: i didn't even mention that code part in my prompt because i didn't consider it.
33 replies · 20 reposts · 762 likes · 63.7K views
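The root cause Lucas describes is easy to reproduce. A minimal sketch of the pitfall (the `shape` tuple and variable names here are illustrative, not from his code):

```python
import math

import numpy as np

shape = (4, 8, 16)  # a tuple of plain Python ints

n = np.prod(shape)
print(type(n))             # a NumPy integer scalar (np.int64), not a builtin int
print(isinstance(n, int))  # False on Python 3: np.int64 does not subclass int

# Anything that strictly type-checks for int (serialization, C extension
# APIs, some config validators) will now misbehave. Two easy fixes:
print(isinstance(int(n), int))            # True: convert explicitly
print(isinstance(math.prod(shape), int))  # True: math.prod keeps builtin ints
```

`math.prod` (Python 3.8+) is the drop-in choice when the inputs are plain Python ints and you want the result to stay one.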
Soham reposted
Mistral AI for Developers @MistralDevs
Since we launched Voxtral Realtime, the community response has been remarkable. Today, we share the technical report, launch the Realtime playground in Mistral Studio, and release the model in Hugging Face Transformers. 🧵
Mistral AI for Developers tweet media
26 replies · 72 reposts · 555 likes · 92.6K views

Soham @sohamg121
@asharoraa mental note to not use WhatsApp as my non-work todo tracker
1 reply · 1 repost · 2 likes · 1.5K views

Soham reposted
Guillaume Lample @ NeurIPS 2024 @GuillaumeLample
🚀 We are hiring Research Interns! We are looking for Master's and PhD students (final year) with a strong background in AI / ML / NLP who want to work on cutting-edge AI systems alongside Mistral AI researchers.
📍 Location: Paris, London, or Palo Alto
⏳ Duration: 4-6 months
🎓 For Master's students: opportunity to continue with a PhD at Mistral after the internship (via the CIFRE program).
Links below:
33 replies · 88 reposts · 1K likes · 88.6K views

Soham @sohamg121
@agihippo how about you just carry a 1.5 kg thing in a backpack man, it's really not that troublesome lol
0 replies · 0 reposts · 1 like · 563 views

yi @agihippo
I think we should normalize issuing multiple corp laptops for ai researchers. Some people live in multiple residences or work from many diff desks. It's cumbersome to bring your laptop everywhere so sometimes we just don't bring it around. The added convenience of checking in on your job saves much more money than one more MBP. This should apply to all technical staff but even more so ai researchers because the impact of checking in on your jobs more frequently can have high ROI.
16 replies · 2 reposts · 106 likes · 37.3K views

Soham @sohamg121
@levelsio anyone claiming they know Google is going to win purely because they have "YouTube" data has no idea how careful Google is with content licensing and how shit most of YT is.
0 replies · 0 reposts · 0 likes · 9 views

@levelsio
So I bought over $1M in Google today. Kinda crazy but also not so crazy.

I've been the biggest Google hater for years, it was completely mismanaged, destroyed by politics and lack of any leadership, fumbled inventing Transformers etc. Then Sergey returned and suddenly Google is dominating not just in the AI benchmarks and leaderboards but in real usage. AI benchmarks can and are easily rigged. But me running an AI startup and always wanting to use the best models makes me conclude something basic now: it's really just Google and Elon Musk and the Chinese in the end who will probably win. The models I use are all by either Google, xAI, or the Chinese (ByteDance, Kling, Minimax).

As you know Google now has its own chips (TPUs), Google has the biggest data set in video (YouTube), images (Google Images) and generally the web (for LLMs), still one of the biggest general user bases (Google Search etc), and they finally have a real engineer being the de facto CEO now (Sergey Brin). Elon Musk with xAI you can't bet against cause he simply has the sheer willpower to get things done. The Chinese are similar, sheer willpower and they don't sleep and they really want to win, and companies like ByteDance (TikTok) have massive data sets in video too of course.

In my opinion everyone is still staring too much at LLMs, I've always been more interested in image models, video models and now the nascent 3d and world models, that's where it's going and where we'll be able to prompt entire worlds or apps or whatever, it's hard to imagine WHAT exactly. With my app Photo AI I try to be a little part of that journey there of course.

Now I can't invest in xAI, I'm a bit invested in the Chinese via the ICHN ETF, but of course Google anyone can invest in and so I think I should. I've reduced my Nvidia investments already months ago, as it was inevitable there'd be real competitors to their chips at some point, with Google's TPUs there are now.

I'm not an expert, and you should mostly just buy ETFs, and you shouldn't listen to me and this is not financial advice.
@levelsio tweet media
@levelsio

This is Sergey Brin's yacht. He got so bored of sitting on this $450M yacht that he had to get out and go create things again. The only true long-term satisfaction for man is to create, either things, or babies.

535 replies · 246 reposts · 6.7K likes · 2.8M views

Soham @sohamg121
@AnjneyMidha Will this actually attract the same level of talent that GovTech in Singapore did (with salaries that matched/beat private companies)?
0 replies · 0 reposts · 0 likes · 215 views

Anjney Midha @AnjneyMidha
Alumni from this program will go on to build trillion dollar companies in the coming decade
Scott Kupor @skupor

Your government needs YOU to transform the federal government through modern software development. If you’re up for a huge challenge, join 1,000 of the country’s best and brightest technologists in the inaugural class of @USTechForce. We are partnering with the top U.S. technology companies to take on this challenge. You’ll learn a ton, network across the most important government agencies and private sector companies, ultimately creating powerful career opportunities whether you want to continue in public service or join the private sector. I am grateful to @POTUS for ensuring that America remains the world’s technology leader. Go to TechForce.gov to apply today.

1 reply · 0 reposts · 42 likes · 15.5K views

Soham @sohamg121
@rasbt @haider1 LaMDA was decoder-only. Encoder-only models had their own place, and Google explored both encoder-decoder and decoder-only architectures. OpenAI _did_ run with it.
0 replies · 0 reposts · 0 likes · 460 views

Sebastian Raschka @rasbt
@haider1 I am not sure "OpenAI ran with it" is entirely correct. I remember there was quite some rivalry between Google's encoder approach and OpenAI's push for decoder-style models. Google tried to make encoder-style models work for many years, since the original architecture.
11 replies · 2 reposts · 257 likes · 29.2K views

Haider. @haider1
Sergey Brin admits Google messed up by under-investing in the transformer architecture it invented. Google was too scared to release chatbots that "say dumb things", so it under-invested in scaling compute: "we didn't take it very seriously... and OpenAI ran with it"
109 replies · 355 reposts · 5.1K likes · 1.3M views

Val @onetwoval
imagine if twitter had slack emotes
5 replies · 0 reposts · 19 likes · 3.1K views

Soham @sohamg121
@_arohan_ @Miles_Brundage What's shameful about using OSS models to bootstrap synthetic data? (I'm assuming that's what they mean, and not logit distillation)
0 replies · 0 reposts · 0 likes · 79 views

rohan anil @_arohan_
Distilling news and billion dollar equity packages 6 months ago, make it make sense. Either whatever is reported is complete nonsense or it’s very over.
1 reply · 2 reposts · 69 likes · 16.1K views

Soham reposted
Mistral AI @MistralAI
Introducing the Devstral 2 coding model family. Two sizes, both open source. Also, meet Mistral Vibe, a native CLI, enabling end-to-end automation. 🧵
174 replies · 458 reposts · 3.5K likes · 1.8M views

Nyanpasu @NyanpasuKA
we can safely say lmarena is saturated?
Arena.ai @arena

🚨 BREAKING: Text Leaderboard Update
🐳 Deepseek-v3.2 enters the leaderboard at #38, and Deepseek-v3.2-thinking lands at #41. For comparison, previous versions ranked higher:
🔹 v3.2 ranks #38 (–5 pts vs v3.1 and –14 pts vs v3.2-exp)
🔹 v3.2-thinking ranks #41 (–7 pts vs v3.1-thinking and –5 pts vs v3.2-exp-thinking)
Both models show their biggest gains in Legal by rank, with improvements of +28 points for v3.2 and +19 points for v3.2-thinking when compared to v3.1 predecessors. The largest drop appears in Healthcare, where v3.2-thinking falls by 25 points.
Where v3.2 performs strongest (among open models):
🔹 #1 in Math and Legal
🔹 Top 10 in Multi-Turn, Media, and Business
Where v3.2-thinking performs strongest (among open models):
🔹 #1 in Science
🔹 Top 5 in Legal
These updates reflect @deepseek_ai's ongoing work to expand and refine its open source model family.

7 replies · 0 reposts · 28 likes · 4.4K views

Soham @sohamg121
@docmilanfar There were more, also funny that they are exactly 1000 apart
Soham tweet media
0 replies · 0 reposts · 4 likes · 2.8K views

Soham @sohamg121
@aurielws this is giving me Gradient Canopy nostalgia
0 replies · 0 reposts · 1 like · 66 views

Auriel @aurielws
Drink names for today’s event courtesy of Gemini 3 💅 🍸
Auriel tweet media
6 replies · 1 repost · 17 likes · 781 views