Ishita Mediratta (@ishitamed) - Twitter Profile

Ishita Mediratta retweeted

1) Our team at Meta has a tough new coding benchmark challenging models to code entire programs including ffmpeg and the PHP compiler from scratch. 2) Top accuracy is 0% 3) We will be making the benchmark harder.

John Yang@jyangballin

How much of SQLite, FFmpeg, PHP compiler can LMs code from scratch? Given just an executable and no starter code or internet access. Introducing ProgramBench: 200 rigorous, whole-repo generation tasks where models design, build, and ship a working program end to end. 🧵

Ishita Mediratta@ishitamed·3d

I wish Macs still had those Touch Bars... they would have made working across multiple Claude/Codex sessions easier!

Ishita Mediratta@ishitamed·3d

@mitch_troy @nikunj @mitch_troy would love to hear your thoughts on this latest paper from our team: arxiv.org/pdf/2603.26499

Mitchell Troyanovsky@mitch_troy·4d

Model side: model intuition around its own capabilities and a solid theory of mind about other agents. Plus a sprinkle of intelligence and, I’d say, good attention over 1-2M tokens, not just 200-500K tokens. Applied ML side: standardized definitions of trajectory metabehaviors to eval against, and good primitives for self-regulating agent state to build up a model’s intuitions without needing to manually curate context.

Nikunj Kothari@nikunj·4d

@mitch_troy What’s missing right this moment?

Ishita Mediratta@ishitamed·3d

Somebody should launch a RAM ETF!

9to5Mac@9to5mac

Apple discontinues base Mac mini, now starts at $799 with 512GB storage 9to5mac.com/2026/05/01/app… by @ChanceHMiller

Ishita Mediratta@ishitamed·Apr 27

@tankots @WisprFlow Brilliant campaign! 🫡

Tanay Kothari@tankots·Apr 27

i grew up in delhi dreaming of building tech millions of people couldn't live without. today, @wisprflow is officially live in india! before this launch, i flew to india to answer one question: does wispr flow actually work here? in the back of an auto with horns blaring. a mumbai gym with punjabi music at full volume. a dhaba with the waiter rattling off the menu faster than you can type. we went and found out - it worked every single time. india became our second biggest market on its own. we 3x'd growth in 3 months with no campaigns or partnerships. people just found wispr flow organically and made it part of their daily life. the least we could do was show up for them properly. so we're launching wispr flow in india with hinglish & android support. because it's the way i've spoken my whole life. and the way everyone around me still does. grateful to my co-founder @sahajgarg6, our india lead @findingnimo_, and everyone who made this possible.

Ishita Mediratta@ishitamed·Apr 24

@parrysingh Congratulations @parrysingh !

Parminder Singh@parrysingh·Apr 24

Here's some news. Over a year ago, I plunged into the AI journey to shape AI fluency with my venture ClayboxAI. We received a tremendous response. But what's life without a few unexpected twists? Sometime last year, I was approached by the Reliance Group to lead their new enterprise AI JV with Meta. Over the course of several conversations it was clear that this is a once-in-a-generation opportunity to shape Enterprise AI, not just in India, but beyond. So here we are. Excited, but above all, grateful. To everyone who has been part of my journey to this point, thank you. The best chapter is just beginning.

Ishita Mediratta@ishitamed·Apr 23

Iced rooh afza matcha latte >>> Iced strawberry matcha latte

Ishita Mediratta retweeted

Xubo Liu@LiuXub·Apr 21

Heading to ICLR 2026 🇧🇷 I’ll be hosting the Meta Networking Mixer on April 24, 2026, from 6:30 pm to 9:30 pm BRT. Register your interest here: events.atmeta.com/iclrnetworking…; I’ll also be at the Meta Booth on: Thu, Apr 23, 10:30 am–12:00 pm & Sat, Apr 25, 10:30 am–12:00 pm. Our poster, “Scaling Speech Tokenizers with Diffusion Autoencoders” will be presented on Fri, Apr 24, 2026, from 10:30 AM – 1:00 PM BRT at Pavilion 4, P4-5006: iclr.cc/virtual/2026/p… If you’re interested in voice AI, multimodal LLMs, audio tokenization, or pre- and post-training, or want to learn more about what we’re building at Meta Superintelligence Labs, feel free to stop by!

Ishita Mediratta@ishitamed·Apr 20

@Smearle_RH Congratulations Sam!

Sam Earle@Smearle_RH·Apr 20

My PhD thesis defense will be here zoom.us/my/togelius tomorrow (Monday) at 9am EST. All are welcome! 🙂 Talk title: "Open-ended Learning via Procedural Content Generation in Video Games: Environment Substrates, Morphogenesis, and Designer-Player Loops". Come watch me make it make sense!

Ishita Mediratta@ishitamed·Apr 19

Bukhara and Dum Pukht might be Delhi’s most overhyped restaurants. Had such surprisingly bland Mughlai food. 😕

Ishita Mediratta retweeted

Despoina Magka@MarlaMagka·Apr 17

🚀 Happy to see AIRS-Bench, an AI R&D benchmark that Meta open-sourced earlier this year (x.com/MarlaMagka/sta…), being used in the Muse Spark Safety & Preparedness Report to assess loss of control risks stemming from acceleration of AI development. AIRS-Bench (github.com/facebookresear…) measures the ability of AI agents to execute end-to-end AI R&D across the full research lifecycle, from idea generation 💡 and implementation 🛠️ to experiment analysis 🧪 and iterative refinement 📈 Along with SWE-Bench and MLE-Bench, AIRS-Bench was used to assess the risks of models automating AI R&D work and outpacing governance mechanisms. Our findings suggest that Muse Spark does not substantially contribute to the said threat, as it achieves performance superior to human researchers in only 5 out of 20 tasks and for a fraction of its attempts 🔍 This is in line with results from comparison models and highlights the models' limitations in executing the complete research lifecycle consistently and across a wide range of domains 🤖 Head over to the 158-page report for more detailed results and a wide range of assessments and mitigations under Meta’s Advanced AI Scaling Framework 👇

Summer Yue@summeryue0

🚀 Muse Spark Safety & Preparedness Report for Meta AI is out. We start with our pre-deployment assessment under Meta's Advanced AI Scaling Framework, covering chemical and biological, cybersecurity, and loss of control risks. Our assessment flagged potentially elevated chem/bio risk, so we implemented safeguards and validated mitigations before deployment - bringing residual risk to within acceptable levels. Beyond the Framework, we also share findings and early explorations of model behavior (honesty, intent understanding, etc.), jailbreak robustness, eval awareness, and more. We're sharing this report to give a closer look at how we evaluate advanced AI safety. Always more work to do, and we welcome feedback from the community. ai.meta.com/static-resourc…

Ishita Mediratta@ishitamed·Apr 15

AI Research Agents go brrr! 🚀🚀

Martin Josifoski@MartinJosifoski

Excited to share AIRA₂ — our next-generation AI Research Agents for ML that address key bottlenecks to scaling. AIRA₂ achieves SoTA on real-world ML tasks from MLE-bench-30 (81.5% vs 72.7%), exceeds human SoTA on 6/20 diverse AI research tasks from AIRS-Bench (and hacks another 5), while exhibiting strong, predictable scaling properties. To push the frontier of AI Research, we need systems that scale well. Developing AIRA₂, we learned a lot about the bottlenecks and what it takes to resolve them — insights already driving our next iteration: 1/

Ishita Mediratta@ishitamed·Apr 15

Can’t wait to read both! (I know I am pretty late to Deepi’s book) 🙈

Ishita Mediratta retweeted

Miles Turpin@milesaturpin·Apr 15

1/ ✨Muse Spark✨ not only represents a huge step forward for Meta towards personal superintelligence but also a lot of work on safety and alignment. Excited to release our Safety and Preparedness Report, 158 pages outlining our pre-deployment risk evaluations and broader evaluations of model behavior! 🧵

Summer Yue@summeryue0

🚀 Muse Spark Safety & Preparedness Report for Meta AI is out. We start with our pre-deployment assessment under Meta's Advanced AI Scaling Framework, covering chemical and biological, cybersecurity, and loss of control risks. Our assessment flagged potentially elevated chem/bio risk, so we implemented safeguards and validated mitigations before deployment - bringing residual risk to within acceptable levels. Beyond the Framework, we also share findings and early explorations of model behavior (honesty, intent understanding, etc.), jailbreak robustness, eval awareness, and more. We're sharing this report to give a closer look at how we evaluate advanced AI safety. Always more work to do, and we welcome feedback from the community. ai.meta.com/static-resourc…

Ishita Mediratta retweeted

Summer Yue@summeryue0·Apr 15

🚀 Muse Spark Safety & Preparedness Report for Meta AI is out. We start with our pre-deployment assessment under Meta's Advanced AI Scaling Framework, covering chemical and biological, cybersecurity, and loss of control risks. Our assessment flagged potentially elevated chem/bio risk, so we implemented safeguards and validated mitigations before deployment - bringing residual risk to within acceptable levels. Beyond the Framework, we also share findings and early explorations of model behavior (honesty, intent understanding, etc.), jailbreak robustness, eval awareness, and more. We're sharing this report to give a closer look at how we evaluate advanced AI safety. Always more work to do, and we welcome feedback from the community. ai.meta.com/static-resourc…

Ishita Mediratta@ishitamed·Apr 13

🥑🥑

NirD@NirDiamantAI

@alexandr_wang btw muse spark also pulls from kaggle datasets automatically which saves tons of time vs manually hunting through their catalog

Ishita Mediratta@ishitamed·Apr 12

Spent 24 hrs in Indore - one of India’s cleanest cities. Came back to Delhi NCR with equal parts hope & disappointment. Clearly, it can be done. Q is, how long will it take to get there?

Ishita Mediratta retweeted

Arena.ai@arena·Apr 11

Meta is back in the Arena! Muse Spark debuts as a top frontier model across both Text and Vision: - Text Arena: #3 tied with Gemini-3.1-Pro and Claude-Opus-4.6 - Vision Arena: #2 tied with Claude-Opus-4.6 This marks Meta’s first major release since early 2025. Highlights: - #4 Hard Prompts, #6 Coding, #9 Creative Writing, #10 Instruction Following, #27 Expert - #3 tied for Business, Management, & Financial Ops, #7 Legal & Government, #12 Writing & Literature Meta is back at the frontier. Huge congrats to @AIatMeta on this incredible milestone!

AI at Meta@AIatMeta

Introducing Muse Spark, the first in the Muse family of models developed by Meta Superintelligence Labs. Muse Spark is a natively multimodal reasoning model with support for tool-use, visual chain of thought, and multi-agent orchestration. Muse Spark is available today at meta.ai and the Meta AI app. We’re also making it available in private preview via API to select partners, and we hope to open-source future versions of the model. Learn more: go.meta.me/43ea00

Ishita Mediratta retweeted

Shirley Wu@ShirleyYXWu·Apr 8

Super proud to be on the team! I have never learned as much as I have in the past two months. Unsurprisingly, the Muse family we are building now is learning and growing more than 1000x faster than that. Stay tuned for what’s next from us!

AI at Meta@AIatMeta

Introducing Muse Spark, the first in the Muse family of models developed by Meta Superintelligence Labs. Muse Spark is a natively multimodal reasoning model with support for tool-use, visual chain of thought, and multi-agent orchestration. Muse Spark is available today at meta.ai and the Meta AI app. We’re also making it available in private preview via API to select partners, and we hope to open-source future versions of the model. Learn more: go.meta.me/43ea00

Ishita Mediratta@ishitamed·Apr 9

🥑
