Da Yu (@DaYu85201802)

31 posts

Senior Research Scientist at Google DeepMind. Former intern at @MSFTResearch and @GoogleAI. Opinions are my own.

Joined August 2020
180 Following · 593 Followers
Da Yu reposted
Qianhui Wu (@5000hui)
Congrats to the LightMem team! 👏 Great to see the continued exploration of topic-based segmentation and lightweight compression for building efficient memory systems for LLMs. Glad that our findings in SeCom and LLMLingua-2 have been useful building blocks for the community. 😀
Ningyu Zhang@ZJU (@zxlzr)

We’re thrilled to share that our team’s work LightMem has been accepted to ICLR 2026 🎉
Paper: arxiv.org/abs/2510.18866
Code: github.com/zjunlp/LightMem
LightMem is a lightweight, modular memory system for LLM agents that enables scalable long-context reasoning and structured memory management across tasks and environments.
Recent updates:
1️⃣ Introduced a comprehensive baseline evaluation framework for benchmarking memory layers (Mem0, A-MEM, LangMem) across datasets like LoCoMo and LongMemEval
2️⃣ Released a demo video showcasing long-context handling, along with tutorial notebooks covering multiple usage scenarios
3️⃣ Enabled multi-tool invocation via MCP Server integration
4️⃣ Added full LoCoMo dataset support and integrated GLM-4.6, achieving strong performance and efficiency with reproducible scripts
5️⃣ Supported local deployment through Ollama, vLLM, and Transformers with automatic model loading
#ICLR2026 #LLM #Agents #MemorySystems #LightMem

0 replies · 2 reposts · 8 likes · 1.1K views
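The topic-based segmentation idea behind a memory layer like this can be sketched in miniature (a toy illustration, not LightMem's actual API; the bag-of-words "embedding" and the similarity threshold are stand-in assumptions — a real system would use a sentence encoder):

```python
from collections import Counter
import math

def _vec(text):
    # Toy embedding: bag-of-words counts (stand-in for a learned encoder).
    return Counter(text.lower().split())

def _cos(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class TopicMemory:
    """Groups incoming messages into topic segments; retrieves whole segments."""
    def __init__(self, threshold=0.3):
        self.threshold = threshold
        self.segments = []  # each: {"texts": [...], "centroid": Counter}

    def add(self, text):
        v = _vec(text)
        if self.segments and _cos(v, self.segments[-1]["centroid"]) >= self.threshold:
            seg = self.segments[-1]          # same topic: extend current segment
            seg["texts"].append(text)
            seg["centroid"] += v
        else:
            self.segments.append({"texts": [text], "centroid": Counter(v)})

    def retrieve(self, query):
        # Segment-level retrieval: return the texts of the best-matching segment.
        qv = _vec(query)
        best = max(self.segments, key=lambda s: _cos(qv, s["centroid"]))
        return best["texts"]

mem = TopicMemory()
mem.add("booked a flight to Tokyo for the conference")
mem.add("the Tokyo flight departs Friday morning")
mem.add("my cat needs a vet appointment next week")
print(mem.retrieve("flight departure time"))
```

Retrieving at segment granularity (rather than per-turn or per-session) is the point of the sketch: the query pulls back both flight-related turns together while leaving the unrelated vet-appointment segment untouched.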
Da Yu reposted
Qianhui Wu (@5000hui)
We've released the full package for GUI-Libra! 🌟
📂 Data/Model: huggingface.co/GUI-Libra
📄 Paper: arxiv.org/abs/2602.22190
🌐 Project: gui-libra.github.io
Happy to hear feedback from the community!
Rui Yang (@RuiYang70669025)

Collecting high-quality GUI trajectories for agent training is expensive. But are we fully leveraging the open-source data we already have? 🤔
✨ Introducing GUI-Libra (gui-libra.github.io): an 81K high-quality, action-aligned reasoning dataset curated from open-source corpora, plus a tailored training recipe that combines action-aware SFT with step-wise RLVR-style training (⚠️ partially verifiable rather than fully verifiable!).
Result: stronger native GUI agents on both offline step-wise evaluation and online environments across mobile and web domains.
Takeaway: with careful data curation and a tailored post-training recipe, a small subset of open-source trajectories can still go a long way for training native GUI agents.
Check out our paper (arxiv.org/abs/2602.22190) and code/dataset/model (github.com/GUI-Libra/GUI-…) for more details.
#GUI #agent #VLM

0 replies · 7 reposts · 21 likes · 3.5K views
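The "partially verifiable" step-wise reward idea mentioned above can be sketched roughly as follows (a hypothetical toy reward, not GUI-Libra's actual recipe; the field names, `bbox` convention, and partial-credit value are all assumptions):

```python
def step_reward(pred, gold):
    """Toy step-wise reward in the spirit of partially verifiable RLVR:
    the action type is always checkable, but the click target can only be
    verified when a ground-truth bounding box exists for that step."""
    if pred["action"] != gold["action"]:
        return 0.0                      # wrong action type: fully verifiable failure
    if gold.get("bbox") is None:
        return 0.5                      # action matches but target unverifiable: partial credit
    x, y = pred["point"]
    x0, y0, x1, y1 = gold["bbox"]
    # Click target is verifiable: reward only if the point falls inside the box.
    return 1.0 if (x0 <= x <= x1 and y0 <= y <= y1) else 0.0

# A verifiable click inside the box, and an unverifiable scroll step:
print(step_reward({"action": "click", "point": (50, 40)},
                  {"action": "click", "bbox": (30, 20, 80, 60)}))   # 1.0
print(step_reward({"action": "scroll", "point": (0, 0)},
                  {"action": "scroll", "bbox": None}))              # 0.5
```

The design point is that open-source trajectories rarely carry full ground truth for every step, so the reward degrades gracefully from fully verified (1.0/0.0) to partial credit rather than discarding the step.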
Xuhui Jia (@jia_xuhui)
Nano Banana truly redefined what's possible with image generation models, pushing the boundaries of people's imagination when it debuted. Today, we're excited to introduce Grok-Imagine-Image: a new model that's both faster and better than Nano Banana. Through this journey, we've built many of the essential building blocks needed to unlock the next generation of models and to keep fueling the growth and prosperity of the visual AI community. Stay tuned... something incredible is coming very soon! But today, hello world, grok-imagine-image!
Arena.ai (@arena)

Latest image models from @xAI, Grok-Imagine-Image and Pro, debut in the top 6 of the Image Arena!
Text-to-Image:
▪️ #4 Grok-Imagine-Image: scoring 1170, surpassing Flux-2-max and Nano-banana
▪️ #6 Grok-Imagine-Image-Pro
Image-Edit:
▪️ #5 Grok-Imagine-Image-Pro: scoring 1330, overtaking Seedream-4.5
▪️ #6 Grok-Imagine-Image
With this launch, @xAI is now a top-3 Image AI provider alongside @GoogleDeepMind and @OpenAI. Congrats to the @xAI team on the impressive releases!

41 replies · 29 reposts · 327 likes · 162.5K views
Peng (Richard) Xia (@richardxp888)
I’m thrilled to share that I’ll be joining Google as a Research Intern for Summer 2026! 🚀 I’m looking forward to advancing the frontiers of AI Agents. I’ll be based in the Bay Area this summer. If you’re around and want to talk about AI or grab a coffee, let’s connect! ☕️
Peng (Richard) Xia tweet media
3 replies · 0 reposts · 19 likes · 529 views
Da Yu (@DaYu85201802)
@zeliu_ Congratulations! Super impressive.
0 replies · 0 reposts · 1 like · 101 views
Ze Liu (@zeliu_)
It’s just the beginning. We are creating a universal imagination engine to unlock limitless creativity. Join us to define the next-gen models for image, video, audio, and beyond 🚀 job-boards.greenhouse.io/xai/jobs/47206…
Arena.ai (@arena)

BREAKING: @xAI’s Grok-Imagine-Video now #1 in Video Arena! For the first time, Grok-Imagine-Video-720p takes the top spot on the Image-to-Video leaderboard, overtaking Google’s Veo 3.1 while being 5x cheaper. Its 480p version released a few days ago ranks #4. Huge congrats to @xAI team and @elonmusk on this incredible milestone!

2 replies · 2 reposts · 55 likes · 4K views
Da Yu reposted
Qianhui Wu (@5000hui)
🔊 2026 Summer Internship @MSFTResearch Deep Learning Group 🔊
We’re looking for a self-motivated intern with a strong background in ⛑️ building GUI agent environments and/or 🏗️ reinforcement learning.
📩 Interested? Send your CV + a short intro to qianhuiwu@microsoft.com!
6 replies · 20 reposts · 344 likes · 23.7K views
Da Yu (@DaYu85201802)
I’ll be at NeurIPS this Friday! You can catch me at the Google booth from 9:30–10:30 AM for the Q&A session with the Gemini team. Later in the afternoon, I’ll be presenting our SCONE paper (arxiv.org/abs/2502.01637) from 4:30–6:00 PM at Exhibit Hall C,D,E #5315. I'm looking forward to seeing familiar faces and meeting some new ones!
Jeff Dean (@JeffDean)

We'll be doing two different Q&A sessions at #NeurIPS2025 w/members of the Gemini team (including me). One is on Thurs., 2:30 to 3:30 PM and the other is Fri., 9:30 to 10:30 AM. Both at the @Google booth. Looking forward to talking about Gemini 3 ♊, Nano Banana 🍌, and more!

0 replies · 1 repost · 3 likes · 1.2K views
Da Yu reposted
ICLR 2026 (@iclr_conf)
ICLR 2026 tweet media
52 replies · 138 reposts · 681 likes · 1M views
Da Yu reposted
Jeff Dean (@JeffDean)
I’m really excited about our release of Gemini 3 today, the result of hard work by many, many people in the Gemini team and all across Google! 🎊
We’ve built many exciting new product experiences with it, as you’ll see today and in the coming weeks and months. You can find it today on @GeminiApp and AI Mode in Search. For developers, you can build with it now in @GoogleAIStudio and Vertex AI. blog.google/products/gemin…
The model performs quite well on a wide range of benchmarks.
Jeff Dean tweet media
208 replies · 339 reposts · 3.4K likes · 398.3K views
Da Yu reposted
Demis Hassabis (@demishassabis)
We’ve been intensely cooking Gemini 3 for a while now, and we’re so excited and proud to share the results with you all. Of course it tops the leaderboards, including @arena, HLE, GPQA etc, but beyond the benchmarks it’s been by far my favourite model to use for its style and depth, and what it can do to help with everyday tasks.
Demis Hassabis tweet media
218 replies · 485 reposts · 5.7K likes · 588.9K views
Da Yu (@DaYu85201802)
We are truly grateful for your interest in this position! Our team will carefully review each application, but as we’ve received many more applications than expected, we may not be able to respond individually.
0 replies · 0 reposts · 0 likes · 941 views
Da Yu (@DaYu85201802)
✨ Internship Opportunity @ Google Research ✨
We are seeking a self-motivated student researcher to join our team at Google Research starting around January 2026. 🚀
In this role, you will contribute to research projects advancing agentic LLMs through tool use and RL, with the goal of enabling breakthrough applications. We are particularly interested in PhD students with a strong background in these areas.
If interested, please send a brief self-introduction and your CV to yuda3.edu@gmail.com. Looking forward to connecting with talented researchers in this exciting space!
15 replies · 93 reposts · 842 likes · 76K views
Da Yu (@DaYu85201802)
@soul_surfer78 Thank you for your question. We are primarily looking for PhD students, but we are also open to exceptional master/undergraduate candidates.
0 replies · 0 reposts · 0 likes · 611 views
Da Yu (@DaYu85201802)
@shreyas1007 Thank you for your question. We are primarily looking for PhD students, but we are also open to exceptional master/undergraduate candidates.
1 reply · 0 reposts · 0 likes · 526 views
Da Yu (@DaYu85201802)
@m84736062 Thank you for the question. Unfortunately this is a full-time intern role.
0 replies · 0 reposts · 0 likes · 726 views
Da Yu (@DaYu85201802)
Repost appreciated!
1 reply · 0 reposts · 2 likes · 6.6K views
Da Yu reposted
Qianhui Wu (@5000hui)
🚀 Excited to share GUI-Actor, a new approach for GUI grounding! Big thanks to @_akhaliq for featuring our work!
🌐 Project page: microsoft.github.io/GUI-Actor/
📜 Paper: arxiv.org/pdf/2506.03143
🤔 What's limiting coordinate generation-based GUI grounding?
1️⃣ Weak spatial-semantic alignment
2️⃣ Ambiguous supervision signals
3️⃣ Vision–action granularity mismatch
👀 But think about it: humans don’t calculate precise screen coordinates; we perceive elements and then act directly.
💡 Meet GUI-Actor, a VLM with an attention-based action head that:
✅ Addresses the limitations above
✅ Proposes multiple candidate regions in one pass, enabling flexible downstream strategies
✅ Performs coordinate-free grounding that better mirrors human behavior
➕ We also introduce a grounding verifier to select the most plausible action region, and it can boost other grounding methods too.
🎯 Results? GUI-Actor achieves SOTA on several benchmarks, with even GUI-Actor-7B outperforming UI-TARS-72B on ScreenSpot-Pro, all using the same Qwen2-VL backbone.
AK (@_akhaliq)

Microsoft just dropped GUI-Actor on Hugging Face: Coordinate-Free Visual Grounding for GUI Agents

4 replies · 26 reposts · 103 likes · 28.5K views
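The attention-based, coordinate-free grounding idea reads roughly like this in miniature (an illustrative sketch, not GUI-Actor's implementation; the patch grid and the attention scores are made up): instead of generating an (x, y) coordinate as text, rank visual patches by an attention score and propose the top-k patch cells as candidate action regions in a single pass.

```python
import math

def candidate_regions(scores, grid_w, top_k=3):
    """Rank patch cells of a grid_w-wide patch grid by softmax-normalized
    attention score and return the top-k as (row, col, probability)."""
    # Numerically stable softmax over all patch scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Top-k patches; index i maps to grid cell (i // grid_w, i % grid_w).
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    return [(i // grid_w, i % grid_w, probs[i]) for i in ranked]

# A 2x4 patch grid where patch index 5 attends most strongly:
scores = [0.1, 0.2, 0.1, 0.0, 0.3, 2.5, 0.4, 0.1]
for row, col, p in candidate_regions(scores, grid_w=4):
    print(row, col, round(p, 3))
```

Returning several ranked regions (rather than one point) is what leaves room for a downstream verifier to pick the most plausible candidate, as the tweet describes.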
Da Yu reposted
AK (@_akhaliq)
Microsoft just dropped GUI-Actor on Hugging Face: Coordinate-Free Visual Grounding for GUI Agents
AK tweet media
3 replies · 57 reposts · 327 likes · 64.2K views
Da Yu reposted
Qianhui Wu (@5000hui)
🚀 Introducing SeCom 🚀
How can conversational agents better retain and retrieve past interactions for more coherent and personalized experiences? Our latest work on memory construction & retrieval tackles this challenge head-on!
🔍 Key takeaways:
✅ Granularity matters: turn-level, session-level, and summarization-based memory struggle with retrieval accuracy and with the semantic integrity/relevance of the context.
✅ Prompt compression (e.g., LLMLingua-2) can denoise memory retrieval, boosting both retrieval accuracy and response quality.
💡 Meet SeCom, an approach that segments conversations topically for memory construction and performs memory retrieval over compressed memory units.
📊 Result? Superior performance on long-term conversation benchmarks such as LOCOMO and Long-MT-Bench+!
📖 Dive into the details: arxiv.org/abs/2502.05589
1 reply · 3 reposts · 17 likes · 1.4K views
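In miniature, retrieval over compressed memory units of the kind the tweet describes might look like this (a toy sketch, not the paper's method; the stopword-dropping "compressor" is a crude stand-in for a learned compressor like LLMLingua-2, and the overlap score stands in for a real retriever):

```python
STOPWORDS = {"the", "a", "an", "i", "you", "it", "is", "was", "to", "and", "of", "my"}

def compress(text):
    # Toy "prompt compression": drop stopwords to denoise a memory unit
    # before indexing, keeping only content-bearing tokens.
    return [w for w in text.lower().split() if w not in STOPWORDS]

def retrieve(memory_units, query, top_k=1):
    # Score compressed units by token overlap with the compressed query;
    # return the top-k ORIGINAL (uncompressed) units for the agent to read.
    q = set(compress(query))
    scored = sorted(memory_units, key=lambda u: len(q & set(compress(u))), reverse=True)
    return scored[:top_k]

units = [
    "I told you my sister moved to Berlin last spring",
    "We talked about the marathon training plan",
]
print(retrieve(units, "Where does my sister live?"))
```

The split mirrors the tweet's two findings: compression denoises what gets matched, while the original unit (at whatever granularity the segmenter produced) is what gets returned to the model.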