Ishita Mediratta

1.9K posts


@ishitamed

🤖 @meta superintelligence labs

Joined April 2013
1.4K Following · 711 Followers
Ishita Mediratta reposted
Ofir Press (@OfirPress)
1) Our team at Meta has a tough new coding benchmark challenging models to code entire programs including ffmpeg and the PHP compiler from scratch.
2) Top accuracy is 0%
3) We will be making the benchmark harder.
John Yang (@jyangballin)

How much of SQLite, FFmpeg, PHP compiler can LMs code from scratch? Given just an executable and no starter code or internet access. Introducing ProgramBench: 200 rigorous, whole-repo generation tasks where models design, build, and ship a working program end to end. 🧵

Ishita Mediratta (@ishitamed)
I wish Macs still had those Touch Bars... would have made working across multiple Claude/Codex sessions easier!
Mitchell Troyanovsky (@mitch_troy)
Model side: Model intuition around its own capabilities and a solid theory of mind for other agents. Plus a sprinkle of intelligence, and I’d say good attention over 1-2M tokens, not just 200-500K tokens.

Applied ML side:
- Standardized definitions of trajectory metabehaviors to eval against
- Good primitives for self-regulating agent state, to build up a model's intuitions without needing to manually curate context
Tanay Kothari (@tankots)
i grew up in delhi dreaming of building tech millions of people couldn't live without. today, @wisprflow is officially live in india!

before this launch, i flew to india to answer one question: does wispr flow actually work here? in the back of an auto with horns blaring. a mumbai gym with punjabi music at full volume. a dhaba with the waiter rattling off the menu faster than you can type. we went and found out - it worked every single time.

india became our second biggest market on its own. we 3x'd growth in 3 months with no campaigns or partnerships. people just found wispr flow organically and made it part of their daily life. the least we could do was show up for them properly.

so we're launching wispr flow in india with hinglish & android support. because it's the way i've spoken my whole life. and the way everyone around me still does.

grateful to my co-founder @sahajgarg6, our india lead @findingnimo_, and everyone who made this possible.
Parminder Singh (@parrysingh)
Here's some news. Over a year ago, I plunged into the AI journey to shape AI fluency with my venture ClayboxAI. We received a tremendous response. But what's life without a few unexpected twists? Sometime last year, I was approached by the Reliance Group to lead their new enterprise AI JV with Meta. Over the course of several conversations it was clear that this is a once-in-a-generation opportunity to shape Enterprise AI, not just in India, but beyond. So here we are. Excited, but above all, grateful. To everyone who has been part of my journey to this point, thank you. The best chapter is just beginning.
Ishita Mediratta (@ishitamed)
Iced rooh afza matcha latte >>> Iced strawberry matcha latte
Ishita Mediratta reposted
Xubo Liu (@LiuXub)
Heading to ICLR 2026 🇧🇷 I’ll be hosting the Meta Networking Mixer on April 24, 2026, from 6:30 pm to 9:30 pm BRT. Register your interest here: events.atmeta.com/iclrnetworking…; I’ll also be at the Meta Booth on: Thu, Apr 23, 10:30 am–12:00 pm & Sat, Apr 25, 10:30 am–12:00 pm. Our poster, “Scaling Speech Tokenizers with Diffusion Autoencoders” will be presented on Fri, Apr 24, 2026, from 10:30 AM – 1:00 PM BRT at Pavilion 4, P4-5006: iclr.cc/virtual/2026/p… If you’re interested in voice AI, multimodal LLMs, audio tokenization, or pre- and post-training, or want to learn more about what we’re building at Meta Superintelligence Labs, feel free to stop by!
Sam Earle (@Smearle_RH)
My PhD thesis defense will be here zoom.us/my/togelius tomorrow (Monday) at 9am EST. All are welcome! 🙂 Talk title: "Open-ended Learning via Procedural Content Generation in Video Games: Environment Substrates, Morphogenesis, and Designer-Player Loops". Come watch me make it make sense!
Ishita Mediratta (@ishitamed)
Bukhara and Dum Pukht might be Delhi’s most overhyped restaurants. Had such surprisingly bland Mughlai food. 😕
Ishita Mediratta reposted
Despoina Magka (@MarlaMagka)
🚀 Happy to see AIRS-Bench, an AI R&D benchmark that Meta open-sourced earlier this year (x.com/MarlaMagka/sta…), being used in the Muse Spark Safety & Preparedness Report to assess loss of control risks stemming from acceleration of AI development.

AIRS-Bench (github.com/facebookresear…) measures the ability of AI agents to execute end-to-end AI R&D across the full research lifecycle, from idea generation 💡 and implementation 🛠️ to experiment analysis 🧪 and iterative refinement 📈

Along with SWE-Bench and MLE-Bench, AIRS-Bench was used to assess the risks of models automating AI R&D work and outpacing governance mechanisms. Our findings suggest that Muse Spark does not substantially contribute to this threat, as it achieves performance superior to human researchers in only 5 out of 20 tasks, and only for a fraction of its attempts 🔍 This is in line with results from comparison models and highlights the models' limitations in executing the complete research lifecycle consistently and across a wide range of domains 🤖

Head over to the 158-page report for more detailed results and a wide range of assessments and mitigations under Meta’s Advanced AI Scaling Framework 👇
Summer Yue (@summeryue0)

🚀 Muse Spark Safety & Preparedness Report for Meta AI is out. We start with our pre-deployment assessment under Meta's Advanced AI Scaling Framework, covering chemical and biological, cybersecurity, and loss of control risks. Our assessment flagged potentially elevated chem/bio risk, so we implemented safeguards and validated mitigations before deployment - bringing residual risk to within acceptable levels. Beyond the Framework, we also share findings and early explorations of model behavior (honesty, intent understanding, etc.), jailbreak robustness, eval awareness, and more. We're sharing this report to give a closer look at how we evaluate advanced AI safety. Always more work to do, and we welcome feedback from the community. ai.meta.com/static-resourc…

Ishita Mediratta (@ishitamed)
Can’t wait to read both! (I know I'm pretty late to Deepi’s book) 🙈
Ishita Mediratta reposted
Summer Yue (@summeryue0)
🚀 Muse Spark Safety & Preparedness Report for Meta AI is out. We start with our pre-deployment assessment under Meta's Advanced AI Scaling Framework, covering chemical and biological, cybersecurity, and loss of control risks. Our assessment flagged potentially elevated chem/bio risk, so we implemented safeguards and validated mitigations before deployment - bringing residual risk to within acceptable levels. Beyond the Framework, we also share findings and early explorations of model behavior (honesty, intent understanding, etc.), jailbreak robustness, eval awareness, and more. We're sharing this report to give a closer look at how we evaluate advanced AI safety. Always more work to do, and we welcome feedback from the community. ai.meta.com/static-resourc…
Ishita Mediratta (@ishitamed)
Spent 24 hrs in Indore - one of India’s cleanest cities. Came back to Delhi NCR with equal parts hope & disappointment. Clearly, it can be done. Q is, how long will it take to get there?
Ishita Mediratta reposted
Arena.ai (@arena)
Meta is back in the Arena!

Muse Spark debuts as a top frontier model across both Text and Vision:
- Text Arena: #3 tied with Gemini-3.1-Pro and Claude-Opus-4.6
- Vision Arena: #2 tied with Claude-Opus-4.6

This marks Meta’s first major release since early 2025.

Highlights:
- #4 Hard Prompts, #6 Coding, #9 Creative Writing, #10 Instruction Following, #27 Expert
- #3 tied for Business, Management, & Financial Ops, #7 Legal & Government, #12 Writing & Literature

Meta is back at the frontier. Huge congrats to @AIatMeta on this incredible milestone!
AI at Meta (@AIatMeta)

Introducing Muse Spark, the first in the Muse family of models developed by Meta Superintelligence Labs. Muse Spark is a natively multimodal reasoning model with support for tool-use, visual chain of thought, and multi-agent orchestration. Muse Spark is available today at meta.ai and the Meta AI app. We’re also making it available in private preview via API to select partners, and we hope to open-source future versions of the model. Learn more: go.meta.me/43ea00

Ishita Mediratta reposted
Shirley Wu (@ShirleyYXWu)
Super proud to be on the team! I have never learned as much as I have in the past two months. Unsurprisingly, the Muse family we are building now is learning and growing more than 1000x faster than that. Stay tuned for what’s next from us!
AI at Meta (@AIatMeta)

Introducing Muse Spark, the first in the Muse family of models developed by Meta Superintelligence Labs. Muse Spark is a natively multimodal reasoning model with support for tool-use, visual chain of thought, and multi-agent orchestration. Muse Spark is available today at meta.ai and the Meta AI app. We’re also making it available in private preview via API to select partners, and we hope to open-source future versions of the model. Learn more: go.meta.me/43ea00
