emily mcmilin

155 posts

@micmylin

RL and world models for coding at FAIR

Joined December 2008
644 Following 690 Followers
emily mcmilin retweeted
John Yang@jyangballin·
How much of SQLite, FFmpeg, PHP compiler can LMs code from scratch? Given just an executable and no starter code or internet access. Introducing ProgramBench: 200 rigorous, whole-repo generation tasks where models design, build, and ship a working program end to end. 🧵
97
242
1.5K
679.3K
emily mcmilin@micmylin·
I'll be giving a talk at the ICLR VerifAI workshop, about code execution for code world modeling, later today (Sun) at 9:05 am (Brazil time). Swing by if you are interested in learning more!
Ameesh Shah@ameeshsh

🗣️📣Announcing VerifAI 2: AI Verification in the Wild, an upcoming workshop at #ICLR2026!! 🗣️📣 VerifAI will gather researchers to explore topics at the intersection of genAI and trustworthy ML. Submit your work! Check out our website and CFP for more: verifai-workshop.github.io

1
2
17
3.3K
emily mcmilin retweeted
Zhiqing Sun@EdwardSun0909·
Excited to share Muse Spark, the first model from the whole team’s work at MSL! 🚀 It’s natively multimodal and agentic. I’ve been using it for my daily coding and research tasks. Still plenty of room to improve in agentic domains, but we’re moving with great velocity. It’s a seriously good model! Check out the full breakdown and try it out at meta.ai
Alexandr Wang@alexandr_wang

1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵

9
26
197
19.2K
emily mcmilin retweeted
Yuxiang Wei@YuxiangWei9·
Software agents can self-improve via self-play RL. Introducing Self-play SWE-RL (SSR): training a single LLM agent to self-play between bug-injection and bug-repair, grounded in real-world repositories, no human-labeled issues or tests. 🧵
66
290
1.7K
518.1K
Nan Rosemary Ke@rosemary_ke·
At NeurIPS this week. Excited to meet, please reach out.
- Focussed on Scaling LLM-RL
- Working on real world evals and long form generation (mathematical proofs/STEM)
- Scaling tasks for agents (computer use/ coding/ research)
3
2
49
6K
emily mcmilin@micmylin·
We modify each repo's CI workflows to capture a single successful third-party build. For pytest repos, we inject conftest.py fixtures to verify the correct container and support optional Python execution tracing. See more in our paper: arxiv.org/abs/2510.02387
1
0
1
134
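A rough sketch of what the two conftest.py checks mentioned in the post above (verifying the container, optional execution tracing) could look like. This is not the paper's code: the environment variable names and both helper functions are hypothetical, chosen only to illustrate the idea with the Python standard library.

```python
import os
import sys

# Hypothetical variable names; the real pipeline may identify the container differently.
EXPECTED_IMAGE_VAR = "EXPECTED_IMAGE_ID"
ACTUAL_IMAGE_VAR = "CONTAINER_IMAGE_ID"


def in_expected_container(environ=None):
    """Return True when the running container reports the expected image id."""
    env = os.environ if environ is None else environ
    expected = env.get(EXPECTED_IMAGE_VAR)
    return expected is not None and env.get(ACTUAL_IMAGE_VAR) == expected


def trace_calls(func, *args, **kwargs):
    """Run func and record the names of Python functions called while it runs."""
    called = []

    def tracer(frame, event, arg):
        if event == "call":
            called.append(frame.f_code.co_name)
        return None  # skip per-line tracing inside each frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the previous trace function
    return result, called
```

In a conftest.py, helpers like these would presumably be wrapped in a session-scoped, autouse pytest fixture that fails the run when `in_expected_container()` is false and enables tracing only when requested.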
emily mcmilin@micmylin·
Key insight: the execution env of a GitHub Actions CI workflow is fully built with deps. So we can cheaply capture it as a standalone Docker image for later execution.
1
0
1
152
emily mcmilin@micmylin·
Better late than never to share how we built 35k+ unique repos (rather than commits from the same dozens of repos) into executable envs for CWM mid-training and SWE-RL post-training... x.com/syhw/status/19…
Gabriel Synnaeve@syhw

(🧵) Today, we release Meta Code World Model (CWM), a 32-billion-parameter dense LLM that enables novel research on improving code generation through agentic reasoning and planning with world models. ai.meta.com/research/publi…

1
2
11
1.7K
emily mcmilin@micmylin·
Dreams can come true. I’ve joined FAIR’s CodeGen team. :)
14
1
362
34.8K
emily mcmilin@micmylin·
@vishnuvig Thanks so much for the lightning-fast GPU speed and technical support over the years. Great service!
1
0
2
1.1K
Vishnu - Jarvislabs.ai@vishnuvig·
Ola recently announced that they are bringing affordable AI to Indian developers. 𝐉𝐚𝐫𝐯𝐢𝐬𝐥𝐚𝐛𝐬, an Indian company, has been providing affordable GPUs for developers across the globe since 2020. We are little known, so I want to share our story here.

𝐖𝐡𝐨 𝐰𝐞 𝐚𝐫𝐞
We are bootstrapped, building from the outskirts of Coimbatore. Started as a small team of 4, from humble backgrounds, none from IITs/IIMs. Currently, we are a team of 12+.

𝐖𝐡𝐚𝐭 𝐰𝐞 𝐚𝐜𝐡𝐢𝐞𝐯𝐞𝐝
The cost of hosting GPU servers 4 years back in India was insanely high. We got 2 quotes which charged us Rs. 1.5L for a single server per month. At that cost, it was not practical for us to do the business. So we went back to first principles to build an MVP for a mini data center/server room. For the first few years, we ran all our servers from a room fitted with ACs, a UPS, and a Generator, which experts claimed would not work. As we scaled, we faced the heat of our setup, but by then we accumulated more money than we had. So last year we moved it to a tier 3+ DC near Bangalore. This helped us boost the confidence of our users, as we have redundancy for power, internet, and networking which gives us and our customers a lot of peaceful nights.

𝐖𝐡𝐨 𝐮𝐬𝐞𝐬 𝐉𝐚𝐫𝐯𝐢𝐬𝐥𝐚𝐛𝐬
Developers and artists from across the world have supported us in our journey. Some prominent companies are ZOHO (My inspiration), Weights and Biases, UNC, UpGrad, and many more.

𝐑𝐞𝐯𝐞𝐧𝐮𝐞
We crossed 580K USD in the last financial year, the highest ever in our history. Being bootstrapped, the only way for us to grow is to put all the money back. Our customers are our investors. As a founder, I have hardly taken a paycheck for the last 4+ years, and since the team also believes in our vision, they are happy not taking a fancy cheque.

𝐕𝐢𝐬𝐢𝐨𝐧
As AI evolves, we want to bring the capabilities of AI to users at the lowest prices possible. Being bootstrapped, the only way to survive is to be frugal and disciplined.

𝐇𝐢𝐫𝐢𝐧𝐠
I am proud of our hiring strategy. We hired only freshers to date, and most of our hires do not have a formal degree. They come from rural areas and economically challenged backgrounds. The average age of our new team is 19. They have played an active role in building our V2 of Jarvislabs and improving the product daily.

I would love to thank everyone for supporting us in our journey. Thanks to Analytics India Magazine, INDIAai, fastai for recognizing us in our early years. If our story resonates with you, please share our story to inspire others & support our mission. #StartupIndia
34
85
520
67.6K
emily mcmilin retweeted
Udacity@udacity·
💡 Interested in learning more about LLM fundamentals? In the video below, Udacity instructor Emily McMilin explains what the Transformer model is & walks you through the difference between Encoder and Decoder model architectures. bit.ly/44f0eJn #genAI #generativeAI
0
1
10
6.4K
emily mcmilin@micmylin·
@srush_nlp Using pronoun resolution as a case study, we hypothesize a causal mechanism & show empirically that denoising objs are generally less underspecified, less vulnerable to spurious correlations / hallucinations, w AR comps ranging up to GPT-4 turbo preview. ojs.aaai.org/index.php/AAAI…
1
0
1
381
Sasha Rush@srush_nlp·
Lazy twitter: A common question in NLP class is "if xBERT worked well, why didn't people make it bigger?" but I realize I just don't know the answer. I assume people tried but that a lot of that is unpublished. Is the theory that denoising gets too easy for big models?
43
40
474
141.7K