Da Yu (@DaYu85201802)

31 posts

Senior Research Scientist at Google DeepMind. Former intern at @MSFTResearch and @GoogleAI. Opinions are my own.

Joined August 2020
180 Following · 593 Followers
Da Yu reposted
Qianhui Wu (@5000hui)
Congrats to the LightMem team! 👏 Great to see the continued exploration of topic-based segmentation and lightweight compression for building efficient memory systems for LLMs. Glad that our findings in SeCom and LLMLingua-2 have been useful building blocks for the community. 😀
Ningyu Zhang@ZJU (@zxlzr)

We’re thrilled to share that our team’s work LightMem has been accepted to ICLR 2026 🎉
Paper: arxiv.org/abs/2510.18866
Code: github.com/zjunlp/LightMem
LightMem is a lightweight, modular memory system for LLM agents that enables scalable long-context reasoning and structured memory management across tasks and environments.
Recent updates:
1️⃣ Introduced a comprehensive baseline evaluation framework for benchmarking memory layers (Mem0, A-MEM, LangMem) across datasets like LoCoMo and LongMemEval
2️⃣ Released a demo video showcasing long-context handling, along with tutorial notebooks covering multiple usage scenarios
3️⃣ Enabled multi-tool invocation via MCP Server integration
4️⃣ Added full LoCoMo dataset support and integrated GLM-4.6, achieving strong performance and efficiency with reproducible scripts
5️⃣ Supported local deployment through Ollama, vLLM, and Transformers with automatic model loading
#ICLR2026 #LLM #Agents #MemorySystems #LightMem

0 replies · 2 reposts · 8 likes · 1.1K views
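The topic-based segmentation idea behind a memory layer like this can be sketched in miniature (a toy illustration, not LightMem's actual API; the bag-of-words "embedding" and the similarity threshold are stand-in assumptions — a real system would use a sentence encoder):

```python
from collections import Counter
import math

def _vec(text):
    # Toy embedding: bag-of-words counts (stand-in for a learned encoder).
    return Counter(text.lower().split())

def _cos(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class TopicMemory:
    """Groups incoming messages into topic segments; retrieves whole segments."""
    def __init__(self, threshold=0.3):
        self.threshold = threshold
        self.segments = []  # each: {"texts": [...], "centroid": Counter}

    def add(self, text):
        v = _vec(text)
        if self.segments and _cos(v, self.segments[-1]["centroid"]) >= self.threshold:
            seg = self.segments[-1]          # same topic: extend current segment
            seg["texts"].append(text)
            seg["centroid"] += v
        else:
            self.segments.append({"texts": [text], "centroid": Counter(v)})

    def retrieve(self, query):
        # Segment-level retrieval: return the texts of the best-matching segment.
        qv = _vec(query)
        best = max(self.segments, key=lambda s: _cos(qv, s["centroid"]))
        return best["texts"]

mem = TopicMemory()
mem.add("booked a flight to Tokyo for the conference")
mem.add("the Tokyo flight departs Friday morning")
mem.add("my cat needs a vet appointment next week")
print(mem.retrieve("flight departure time"))
```

Retrieving at segment granularity (rather than per-turn or per-session) is the point of the sketch: the query pulls back both flight-related turns together while leaving the unrelated vet-appointment segment untouched.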
Da Yu reposted
Qianhui Wu (@5000hui)
We've released the full package for GUI-Libra! 🌟
📂 Data/Model: huggingface.co/GUI-Libra
📄 Paper: arxiv.org/abs/2602.22190
🌐 Project: gui-libra.github.io
Happy to hear feedback from the community!
Rui Yang (@RuiYang70669025)

Collecting high-quality GUI trajectories for agent training is expensive. But are we fully leveraging the open-source data we already have? 🤔
✨ Introducing GUI-Libra (gui-libra.github.io): an 81K high-quality, action-aligned reasoning dataset curated from open-source corpora, plus a tailored training recipe that combines action-aware SFT with step-wise RLVR-style training (⚠️ partially verifiable rather than fully verifiable!).
Result: stronger native GUI agents on both offline step-wise evaluation and online environments across mobile and web domains.
Takeaway: with careful data curation and a tailored post-training recipe, a small subset of open-source trajectories can still go a long way for training native GUI agents.
Check out our paper (arxiv.org/abs/2602.22190) and code/dataset/model (github.com/GUI-Libra/GUI-…) for more details.
#GUI #agent #VLM

0 replies · 7 reposts · 21 likes · 3.5K views
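The "partially verifiable" step-wise reward idea mentioned above can be sketched roughly as follows (a hypothetical toy reward, not GUI-Libra's actual recipe; the field names, `bbox` convention, and partial-credit value are all assumptions):

```python
def step_reward(pred, gold):
    """Toy step-wise reward in the spirit of partially verifiable RLVR:
    the action type is always checkable, but the click target can only be
    verified when a ground-truth bounding box exists for that step."""
    if pred["action"] != gold["action"]:
        return 0.0                      # wrong action type: fully verifiable failure
    if gold.get("bbox") is None:
        return 0.5                      # action matches but target unverifiable: partial credit
    x, y = pred["point"]
    x0, y0, x1, y1 = gold["bbox"]
    # Click target is verifiable: reward only if the point falls inside the box.
    return 1.0 if (x0 <= x <= x1 and y0 <= y <= y1) else 0.0

# A verifiable click inside the box, and an unverifiable scroll step:
print(step_reward({"action": "click", "point": (50, 40)},
                  {"action": "click", "bbox": (30, 20, 80, 60)}))   # 1.0
print(step_reward({"action": "scroll", "point": (0, 0)},
                  {"action": "scroll", "bbox": None}))              # 0.5
```

The design point is that open-source trajectories rarely carry full ground truth for every step, so the reward degrades gracefully from fully verified (1.0/0.0) to partial credit rather than discarding the step.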
Xuhui Jia (@jia_xuhui)
Nano Banana truly redefined what's possible with image generation models, pushing the boundaries of people's imagination when it debuted. Today, we're excited to introduce Grok-Imagine-Image: a new model that's both faster and better than Nano Banana. Through this journey, we've built many of the essential building blocks needed to unlock the next generation of models and to keep fueling the growth and prosperity of the visual AI community. Stay tuned... something incredible is coming very soon! But today, hello world, grok-imagine-image!
Arena.ai (@arena)

Latest image models from @xAI, Grok-Imagine-Image and Pro, debut in the top 6 of the Image Arena!
Text-to-Image:
▪️ #4 Grok-Imagine-Image: scoring 1170, surpassing Flux-2-max and Nano-banana
▪️ #6 Grok-Imagine-Image-Pro
Image-Edit:
▪️ #5 Grok-Imagine-Image-Pro: scoring 1330, overtaking Seedream-4.5
▪️ #6 Grok-Imagine-Image
With this launch, @xAI is now a top-3 Image AI provider alongside @GoogleDeepMind and @OpenAI. Congrats to the @xAI team on the impressive releases!

41 replies · 29 reposts · 327 likes · 162.5K views
Peng (Richard) Xia (@richardxp888)
I’m thrilled to share that I’ll be joining Google as a Research Intern for Summer 2026! 🚀 I’m looking forward to advancing the frontiers of AI Agents. I’ll be based in the Bay Area this summer. If you’re around and want to talk about AI or grab a coffee, let’s connect! ☕️
Peng (Richard) Xia tweet media
3 replies · 0 reposts · 19 likes · 529 views
Da Yu (@DaYu85201802)
@zeliu_ Congratulations! Super impressive.
0 replies · 0 reposts · 1 like · 101 views
Ze Liu (@zeliu_)
It’s just the beginning. We are creating a universal imagination engine to unlock limitless creativity. Join us to define the next-gen models for image, video, audio, and beyond 🚀 job-boards.greenhouse.io/xai/jobs/47206…
Arena.ai (@arena)

BREAKING: @xAI’s Grok-Imagine-Video now #1 in Video Arena! For the first time, Grok-Imagine-Video-720p takes the top spot on the Image-to-Video leaderboard, overtaking Google’s Veo 3.1 while being 5x cheaper. Its 480p version released a few days ago ranks #4. Huge congrats to @xAI team and @elonmusk on this incredible milestone!

2 replies · 2 reposts · 55 likes · 4K views
Da Yu reposted
Qianhui Wu (@5000hui)
🔊 2026 Summer Internship @MSFTResearch Deep Learning Group 🔊
We’re looking for a self-motivated intern with a strong background in ⛑️ building GUI agent environments and/or 🏗️ reinforcement learning.
📩 Interested? Send your CV + a short intro to qianhuiwu@microsoft.com!
6 replies · 20 reposts · 344 likes · 23.7K views
Da Yu (@DaYu85201802)
I’ll be at NeurIPS this Friday! You can catch me at the Google booth from 9:30–10:30 AM for the Q&A session with the Gemini team. Later in the afternoon, I’ll be presenting our SCONE paper (arxiv.org/abs/2502.01637) from 4:30–6:00 PM at Exhibit Hall C,D,E #5315. I'm looking forward to seeing familiar faces and meeting some new ones!
Jeff Dean (@JeffDean)

We'll be doing two different Q&A sessions at #NeurIPS2025 w/members of the Gemini team (including me). One is on Thurs., 2:30 to 3:30 PM and the other is Fri., 9:30 to 10:30 AM. Both at the @Google booth. Looking forward to talking about Gemini 3 ♊, Nano Banana 🍌, and more!

0 replies · 1 repost · 3 likes · 1.2K views
Da Yu reposted
ICLR 2026 (@iclr_conf)
ICLR 2026 tweet media
52 replies · 138 reposts · 681 likes · 1M views
Da Yu reposted
Jeff Dean (@JeffDean)
I’m really excited about our release of Gemini 3 today, the result of hard work by many, many people in the Gemini team and all across Google! 🎊
We’ve built many exciting new product experiences with it, as you’ll see today and in the coming weeks and months. You can find it today on @GeminiApp and AI Mode in Search. For developers, you can build with it now in @GoogleAIStudio and Vertex AI. blog.google/products/gemin…
The model performs quite well on a wide range of benchmarks.
Jeff Dean tweet media
208 replies · 339 reposts · 3.4K likes · 398.3K views
Da Yu reposted
Demis Hassabis (@demishassabis)
We’ve been intensely cooking Gemini 3 for a while now, and we’re so excited and proud to share the results with you all. Of course it tops the leaderboards, including @arena, HLE, GPQA etc, but beyond the benchmarks it’s been by far my favourite model to use for its style and depth, and what it can do to help with everyday tasks.
Demis Hassabis tweet media
218 replies · 485 reposts · 5.7K likes · 588.9K views
Da Yu (@DaYu85201802)
We are truly grateful for your interest in this position! Our team will carefully review each application, but as we’ve received many more applications than expected, we may not be able to respond individually.
0 replies · 0 reposts · 0 likes · 941 views
Da Yu (@DaYu85201802)
✨ Internship Opportunity @ Google Research ✨
We are seeking a self-motivated student researcher to join our team at Google Research starting around January 2026. 🚀
In this role, you will contribute to research projects advancing agentic LLMs through tool use and RL, with the goal of enabling breakthrough applications. We are particularly interested in PhD students with a strong background in these areas.
If interested, please send a brief self-introduction and your CV to yuda3.edu@gmail.com. Looking forward to connecting with talented researchers in this exciting space!
15 replies · 93 reposts · 842 likes · 76K views
Da Yu (@DaYu85201802)
@soul_surfer78 Thank you for your question. We are primarily looking for PhD students, but we are also open to exceptional master/undergraduate candidates.
0 replies · 0 reposts · 0 likes · 611 views
Da Yu (@DaYu85201802)
@shreyas1007 Thank you for your question. We are primarily looking for PhD students, but we are also open to exceptional master/undergraduate candidates.
1 reply · 0 reposts · 0 likes · 526 views
Da Yu (@DaYu85201802)
@m84736062 Thank you for the question. Unfortunately this is a full-time intern role.
0 replies · 0 reposts · 0 likes · 726 views
Da Yu (@DaYu85201802)
Repost appreciated!
1 reply · 0 reposts · 2 likes · 6.6K views
Da Yu reposted
Qianhui Wu (@5000hui)
🚀 Excited to share GUI-Actor, a new approach for GUI grounding! Big thanks to @_akhaliq for featuring our work!
🌐 Project page: microsoft.github.io/GUI-Actor/
📜 Paper: arxiv.org/pdf/2506.03143
🤔 What's limiting coordinate generation-based GUI grounding?
1️⃣ Weak spatial-semantic alignment
2️⃣ Ambiguous supervision signals
3️⃣ Vision–action granularity mismatch
👀 But think about it: humans don’t calculate precise screen coordinates; we perceive elements and then act directly.
💡 Meet GUI-Actor, a VLM with an attention-based action head that:
✅ Addresses the limitations above
✅ Proposes multiple candidate regions in one pass, enabling flexible downstream strategies
✅ Performs coordinate-free grounding that better mirrors human behavior
➕ We also introduce a grounding verifier to select the most plausible action region, and it can boost other grounding methods too.
🎯 Results? GUI-Actor achieves SOTA on several benchmarks, with even GUI-Actor-7B outperforming UI-TARS-72B on ScreenSpot-Pro, all using the same Qwen2-VL backbone.
AK (@_akhaliq)

Microsoft just dropped GUI-Actor on Hugging Face: Coordinate-Free Visual Grounding for GUI Agents

4 replies · 26 reposts · 103 likes · 28.5K views
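The attention-based, coordinate-free grounding idea reads roughly like this in miniature (an illustrative sketch, not GUI-Actor's implementation; the patch grid and the attention scores are made up): instead of generating an (x, y) coordinate as text, rank visual patches by an attention score and propose the top-k patch cells as candidate action regions in a single pass.

```python
import math

def candidate_regions(scores, grid_w, top_k=3):
    """Rank patch cells of a grid_w-wide patch grid by softmax-normalized
    attention score and return the top-k as (row, col, probability)."""
    # Numerically stable softmax over all patch scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Top-k patches; index i maps to grid cell (i // grid_w, i % grid_w).
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    return [(i // grid_w, i % grid_w, probs[i]) for i in ranked]

# A 2x4 patch grid where patch index 5 attends most strongly:
scores = [0.1, 0.2, 0.1, 0.0, 0.3, 2.5, 0.4, 0.1]
for row, col, p in candidate_regions(scores, grid_w=4):
    print(row, col, round(p, 3))
```

Returning several ranked regions (rather than one point) is what leaves room for a downstream verifier to pick the most plausible candidate, as the tweet describes.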
Da Yu reposted
AK (@_akhaliq)
Microsoft just dropped GUI-Actor on Hugging Face: Coordinate-Free Visual Grounding for GUI Agents
AK tweet media
3 replies · 57 reposts · 327 likes · 64.2K views
Da Yu reposted
Qianhui Wu (@5000hui)
🚀 Introducing SeCom 🚀
How can conversational agents better retain and retrieve past interactions for more coherent and personalized experiences? Our latest work on memory construction & retrieval tackles this challenge head-on!
🔍 Key takeaways:
✅ Granularity matters: turn-level, session-level, and summarization-based memory struggle with retrieval accuracy and with the semantic integrity/relevance of the context.
✅ Prompt compression (e.g., LLMLingua-2) can denoise memory retrieval, boosting both retrieval accuracy and response quality.
💡 Meet SeCom, an approach that segments conversations topically for memory construction and performs memory retrieval over compressed memory units.
📊 Result? Superior performance on long-term conversation benchmarks such as LOCOMO and Long-MT-Bench+!
📖 Dive into the details: arxiv.org/abs/2502.05589
1 reply · 3 reposts · 17 likes · 1.4K views
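In miniature, retrieval over compressed memory units of the kind the tweet describes might look like this (a toy sketch, not the paper's method; the stopword-dropping "compressor" is a crude stand-in for a learned compressor like LLMLingua-2, and the overlap score stands in for a real retriever):

```python
STOPWORDS = {"the", "a", "an", "i", "you", "it", "is", "was", "to", "and", "of", "my"}

def compress(text):
    # Toy "prompt compression": drop stopwords to denoise a memory unit
    # before indexing, keeping only content-bearing tokens.
    return [w for w in text.lower().split() if w not in STOPWORDS]

def retrieve(memory_units, query, top_k=1):
    # Score compressed units by token overlap with the compressed query;
    # return the top-k ORIGINAL (uncompressed) units for the agent to read.
    q = set(compress(query))
    scored = sorted(memory_units, key=lambda u: len(q & set(compress(u))), reverse=True)
    return scored[:top_k]

units = [
    "I told you my sister moved to Berlin last spring",
    "We talked about the marathon training plan",
]
print(retrieve(units, "Where does my sister live?"))
```

The split mirrors the tweet's two findings: compression denoises what gets matched, while the original unit (at whatever granularity the segmenter produced) is what gets returned to the model.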