
Quyet V. Do (Looking for Research Internship)
230 posts

@Quyet_Azir
PhD at @virginia_tech, supervised by @tuvllms. Working on LLMs' long-context reasoning, post-training, and search. Ex-Research Intern at @AdobeResearch.

@yisongyue Semantic Scholar has an API & MCP server ... 👀 allenai.org/asta/resources…
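
For anyone curious about the API half, here is a minimal sketch of a paper search against the Semantic Scholar Graph API. The endpoint and field names follow the public docs as best I recall; verify the exact parameters before relying on them.

```python
# Minimal sketch: paper search via the Semantic Scholar Graph API.
# Endpoint/field names per the public docs; treat as an assumption.
import requests

resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/paper/search",
    params={"query": "long-context reasoning", "fields": "title,year,url"},
    timeout=30,
)
resp.raise_for_status()

# The search response wraps results in a "data" list of paper records.
for paper in resp.json().get("data", []):
    print(paper["year"], paper["title"])
```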

Introducing SWE-Playground: a fully automated pipeline that generates synthetic environments to train versatile coding agents. 🤖✨

Training software-engineering agents often relies on existing resources like GitHub issues and focuses on solving SWE-bench-style issue-resolution tasks. While this has driven incredible progress, real-world engineering involves a wider spectrum of tasks, from designing new libraries to writing reproduction scripts. 🌐

Rather than mining existing repositories, SWE-Playground synthetically generates projects, tasks, and verifiable unit tests from scratch. This approach offers two exciting opportunities:
1️⃣ Flexibility: we can generate tasks without being constrained by the availability or structure of existing open-source data.
2️⃣ Versatility: we extend training beyond Issue Resolution to include Issue Reproduction and Library Generation from Scratch.

The results? 🚀 Our agents achieve strong performance across SWE-bench Verified, SWT-Bench, and Commit-0, demonstrating high data efficiency compared to baselines trained on larger datasets.

Huge thanks to my amazing collaborators @apurvasgandhi and @gneubig for their incredible efforts in bringing this work to life!

👇 🧵 A deep dive into how we build versatile agents synthetically.
Paper: arxiv.org/pdf/2512.12216
Project Page: neulab.github.io/SWE-Playground
Code: github.com/neulab/SWE-Pla…
Data & Models: huggingface.co/collections/St…
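
The paper has its own prompts and infrastructure; purely as an illustration of the generate-then-verify pattern the thread describes, here is a minimal Python sketch. The generate() stub, prompts, and file layout are hypothetical, not SWE-Playground's actual code.

```python
# Sketch of generate-then-verify for synthetic coding tasks (illustrative
# only, NOT SWE-Playground's pipeline). generate() is a hypothetical stub.
import subprocess
import tempfile
from pathlib import Path

def generate(prompt: str) -> str:
    """Hypothetical LLM call; swap in your provider's client."""
    raise NotImplementedError

def make_task() -> dict:
    # 1. Ask the model to invent a small library spec and a coding task.
    spec = generate("Propose a small Python library and one coding task for it.")
    # 2. Ask for unit tests that make the task verifiable.
    tests = generate(f"Write pytest unit tests that check a solution to:\n{spec}")
    return {"spec": spec, "tests": tests}

def verify(solution: str, tests: str) -> bool:
    # 3. A candidate solution counts as correct only if the tests pass.
    with tempfile.TemporaryDirectory() as d:
        Path(d, "solution.py").write_text(solution)
        Path(d, "test_solution.py").write_text(tests)
        result = subprocess.run(["pytest", "-q"], cwd=d, capture_output=True)
        return result.returncode == 0
```

Executing the generated tests is what makes the synthetic data "verifiable": a task only enters the training set with a ground-truth pass/fail signal attached.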



wait... you can install uv with pip? this changes everything
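
It does work: uv is published on PyPI, so pip can bootstrap it. A tiny Python illustration (in practice you'd just type the two commands in a shell; the wrapper is only for demonstration, and it assumes the environment's bin directory is on PATH):

```python
# uv ships on PyPI, so pip itself can install it. Illustrative wrapper;
# normally you'd run these two commands directly in a shell.
import subprocess
import sys

# Install uv into the current environment with pip.
subprocess.run([sys.executable, "-m", "pip", "install", "uv"], check=True)

# Then let uv take over package installs (assumes `uv` is now on PATH).
subprocess.run(["uv", "pip", "install", "requests"], check=True)
```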

🚨 New paper 🚨 Excited to share my first paper w/ my PhD students!!

We find that advanced LLM capabilities conferred by instruction or alignment tuning (e.g., SFT, RLHF, DPO, GRPO) can be encoded into model diff vectors (à la task vectors) and transferred across model versions.

💡 You don’t necessarily need to fine-tune from scratch again for every new base-model version. Instead, fine-tune once and add the diff vector to updated versions! ♻️♻️♻️ This can also offer a stronger and more computationally efficient starting point when further training is feasible.

📰: tinyurl.com/finetuning-tra…
More 👇
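
For intuition, a minimal sketch of the diff-vector arithmetic the thread describes, assuming the old base, its fine-tune, and the new base share one architecture and parameter naming. The function and model names here are placeholders, not the paper's code.

```python
# Sketch of diff-vector (task-vector) transfer across model versions.
# Assumes all three checkpoints have identical parameter names/shapes.
import torch

def transfer_diff(base_old, finetuned_old, base_new):
    """Add (finetuned_old - base_old) to base_new, parameter-wise."""
    old = base_old.state_dict()
    tuned = finetuned_old.state_dict()
    new = base_new.state_dict()
    # The diff vector encodes what fine-tuning changed; re-apply it on top
    # of the updated base weights.
    patched = {name: new[name] + (tuned[name] - old[name]) for name in new}
    base_new.load_state_dict(patched)
    return base_new
```

The same parameter-wise arithmetic applies to Hugging Face checkpoints loaded with AutoModelForCausalLM; mismatched vocabularies or reshaped layers would break the element-wise add.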


GPT-5 is here. Rolling out to everyone starting today. openai.com/gpt-5/
