
Quyet V. Do (Looking for Research Internship)
230 posts

@Quyet_Azir
PhD at @virginia_tech, supervised by @tuvllms. Working on LLMs' long-context reasoning, post-training, and search. Ex-Research Intern at @AdobeResearch.

@yisongyue Semantic Scholar has an API & MCP server ... 👀 allenai.org/asta/resources…
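
For anyone curious about the API half, here is a minimal sketch of a paper search against the Semantic Scholar Graph API. The endpoint and field names follow the public docs as best I recall; verify the exact parameters before relying on them.

```python
# Minimal sketch: paper search via the Semantic Scholar Graph API.
# Endpoint/field names per the public docs; treat as an assumption.
import requests

resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/paper/search",
    params={"query": "long-context reasoning", "fields": "title,year,url"},
    timeout=30,
)
resp.raise_for_status()

# The search response wraps results in a "data" list of paper records.
for paper in resp.json().get("data", []):
    print(paper["year"], paper["title"])
```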

Introducing SWE-Playground: a fully automated pipeline that generates synthetic environments to train versatile coding agents. 🤖✨

Training software-engineering agents often relies on existing resources like GitHub issues and focuses on solving SWE-bench-style issue-resolution tasks. While this has driven incredible progress, real-world engineering involves a wider spectrum of tasks, from designing new libraries to writing reproduction scripts. 🌐

Rather than mining existing repositories, SWE-Playground synthetically generates projects, tasks, and verifiable unit tests from scratch. This approach offers two exciting opportunities:
1️⃣ Flexibility: we can generate tasks without being constrained by the availability or structure of existing open-source data.
2️⃣ Versatility: we extend training beyond Issue Resolution to include Issue Reproduction and Library Generation from Scratch.

The results? 🚀 Our agents achieve strong performance across SWE-bench Verified, SWT-Bench, and Commit-0, demonstrating high data efficiency compared to baselines trained on larger datasets.

Huge thanks to my amazing collaborators @apurvasgandhi and @gneubig for their incredible efforts in bringing this work to life!

👇 🧵 A deep dive into how we build versatile agents synthetically.
Paper: arxiv.org/pdf/2512.12216
Project Page: neulab.github.io/SWE-Playground
Code: github.com/neulab/SWE-Pla…
Data & Models: huggingface.co/collections/St…
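
The paper has its own prompts and infrastructure; purely as an illustration of the generate-then-verify pattern the thread describes, here is a minimal Python sketch. The generate() stub, prompts, and file layout are hypothetical, not SWE-Playground's actual code.

```python
# Sketch of generate-then-verify for synthetic coding tasks (illustrative
# only, NOT SWE-Playground's pipeline). generate() is a hypothetical stub.
import subprocess
import tempfile
from pathlib import Path

def generate(prompt: str) -> str:
    """Hypothetical LLM call; swap in your provider's client."""
    raise NotImplementedError

def make_task() -> dict:
    # 1. Ask the model to invent a small library spec and a coding task.
    spec = generate("Propose a small Python library and one coding task for it.")
    # 2. Ask for unit tests that make the task verifiable.
    tests = generate(f"Write pytest unit tests that check a solution to:\n{spec}")
    return {"spec": spec, "tests": tests}

def verify(solution: str, tests: str) -> bool:
    # 3. A candidate solution counts as correct only if the tests pass.
    with tempfile.TemporaryDirectory() as d:
        Path(d, "solution.py").write_text(solution)
        Path(d, "test_solution.py").write_text(tests)
        result = subprocess.run(["pytest", "-q"], cwd=d, capture_output=True)
        return result.returncode == 0
```

Executing the generated tests is what makes the synthetic data "verifiable": a task only enters the training set with a ground-truth pass/fail signal attached.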



wait... you can install uv with pip? this changes everything
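
It does work: uv is published on PyPI, so pip can bootstrap it. A tiny Python illustration (in practice you'd just type the two commands in a shell; the wrapper is only for demonstration, and it assumes the environment's bin directory is on PATH):

```python
# uv ships on PyPI, so pip itself can install it. Illustrative wrapper;
# normally you'd run these two commands directly in a shell.
import subprocess
import sys

# Install uv into the current environment with pip.
subprocess.run([sys.executable, "-m", "pip", "install", "uv"], check=True)

# Then let uv take over package installs (assumes `uv` is now on PATH).
subprocess.run(["uv", "pip", "install", "requests"], check=True)
```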

🚨 New paper 🚨 Excited to share my first paper w/ my PhD students!!

We find that advanced LLM capabilities conferred by instruction or alignment tuning (e.g., SFT, RLHF, DPO, GRPO) can be encoded into model diff vectors (à la task vectors) and transferred across model versions.

💡 You don’t necessarily need to fine-tune from scratch again for every new base-model version. Instead, fine-tune once and add the diff vector to updated versions! ♻️♻️♻️ This can also offer a stronger and more computationally efficient starting point when further training is feasible.

📰: tinyurl.com/finetuning-tra…
More 👇
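
For intuition, a minimal sketch of the diff-vector arithmetic the thread describes, assuming the old base, its fine-tune, and the new base share one architecture and parameter naming. The function and model names here are placeholders, not the paper's code.

```python
# Sketch of diff-vector (task-vector) transfer across model versions.
# Assumes all three checkpoints have identical parameter names/shapes.
import torch

def transfer_diff(base_old, finetuned_old, base_new):
    """Add (finetuned_old - base_old) to base_new, parameter-wise."""
    old = base_old.state_dict()
    tuned = finetuned_old.state_dict()
    new = base_new.state_dict()
    # The diff vector encodes what fine-tuning changed; re-apply it on top
    # of the updated base weights.
    patched = {name: new[name] + (tuned[name] - old[name]) for name in new}
    base_new.load_state_dict(patched)
    return base_new
```

The same parameter-wise arithmetic applies to Hugging Face checkpoints loaded with AutoModelForCausalLM; mismatched vocabularies or reshaped layers would break the element-wise add.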


GPT-5 is here. Rolling out to everyone starting today. openai.com/gpt-5/
