Tao Li

47 posts

@tao__li

Research Engineer @GoogleDeepMind | Formerly @GoogleAI | PhD @UtahNLP @UUtah | Intern @allen_ai Aristo, @Amazon A9, @PhilipsNA.

Mountain View, CA · Joined June 2019
462 Following · 180 Followers
Tao Li retweeted
Google DeepMind @GoogleDeepMind
Think you know Gemini? 🤔 Think again. Meet Gemini 2.5: our most intelligent model 💡 The first release is Pro Experimental, which is state-of-the-art across many benchmarks - meaning it can handle complex problems and give more accurate responses. Try it now → goo.gle/4c2HKjf
90
509
2.5K
1.1M
Tao Li retweeted
Jeff Dean @JeffDean
Got a picture that isn't quite right? Try our native image generation in Gemini Flash 2.0. "Can you remove the stuff on the couch?" "Can you make the curtains light green?" "Can you put a unicorn horn on the person in the green pants?" Editing in human language, not image editing tools.
Oriol Vinyals @OriolVinyalsML

Gemini 2.0 Flash debuts native image gen! Create contextually relevant images, edit conversationally, and generate long text in images. All totally optimized for chat iteration. Try it in AI Studio or Gemini API. Blog: developers.googleblog.com/en/experiment-…

25
62
767
102.3K
Tao Li retweeted
Qingyao Ai @QingyaoAi
Thrilled to know that our paper on Scaling Laws for Dense Retrieval has won the #SIGIR2024 Best Paper Award! 🏆 Our study reveals a power-law scaling of dense retrieval models, which can help optimize training and resource allocation. Huge thanks and congrats to all collaborators!
12
5
125
8.7K
Tao Li retweeted
Zhichao Xu Brutus @zhichaoxu_ir
Are compressed LLMs less toxic and biased against different demographic groups? ❓ In this new 📜, we study 4 pruning methods and 3 quantization methods and evaluate on 7 bias/toxicity benchmarks. arxiv.org/abs/2407.04965 The (un)surprising answer: they are not less toxic/biased.
1
4
11
1.9K
Tao Li retweeted
Google AI @GoogleAI
Congratulations to the authors of the “Rich Human Feedback for Text-to-Image Generation” paper, which received the #CVPR2024 Best Paper Award. Check out the paper at: arxiv.org/pdf/2312.10240
18
36
169
37.4K
Tao Li retweeted
Haoyu Wang @Haoyu_Wang_97
NeurIPS’24 authors: if you are desk rejected due to a missing checklist, add your email to this appeal letter! docs.google.com/document/d/16_… Getting desk rejected after filling out the checklist form carefully in OpenReview… This does not make any sense!
1
2
5
832
Tao Li @tao__li
Reflection doesn't have to be post-hoc. We show that an agent can benefit from "anticipatory" failures. An extra bonus of doing so is that reflective trials can now run in parallel.
Haoyu Wang @Haoyu_Wang_97

Multiple Reflections NOT helping much? Tired of changing plans and NOT seeing utmost effort in their execution? Introducing Devil’s Advocate 😈: Equipping LLM Agents with *Anticipatory* Reflection before action execution #LLM #Agent #AI #ML arxiv.org/pdf/2405.16334…

0
0
5
231
Tao Li retweeted
Haoyu Wang @Haoyu_Wang_97
Multiple Reflections NOT helping much? Tired of changing plans and NOT seeing utmost effort in their execution? Introducing Devil’s Advocate 😈: Equipping LLM Agents with *Anticipatory* Reflection before action execution #LLM #Agent #AI #ML arxiv.org/pdf/2405.16334…
1
1
5
511
Tao Li retweeted
fly51fly @fly51fly
[AI] Devil's Advocate: Anticipatory Reflection for LLM Agents
H Wang, T Li, Z Deng, D Roth, Y Li [Google DeepMind] (2024)
arxiv.org/abs/2405.16334
- Proposes a novel approach that integrates introspection into LLM agents to enhance their consistency and adaptability in solving complex tasks.
- Focuses on 3 types of introspective interventions: 1) Anticipatory reflection before action execution to consider potential failures; 2) Post-action evaluation and backtracking to ensure alignment with subtask objectives; 3) Comprehensive review upon completion to refine strategies.
- Decomposes tasks into manageable subtasks to form a plan, then implements introspective mechanisms during plan execution.
- Evaluated in WebArena across 812 tasks in 5 scenarios. Outperformed baseline zero-shot methods, achieving 23.5% success rate.
- Reduced number of trials and plan revisions by 45% compared to baselines, showing improved efficiency.
- Analysis of errors revealed need to fully learn from failures and incorporate more sophisticated logic into planning like loops and encapsulation.
0
4
12
1.3K
Tao Li retweeted
Haoyu Wang @Haoyu_Wang_97
Excited to introduce our new paper BLINK! It’s a new benchmark for MLLMs, focusing on visual perception capabilities. We show that there’s still a gap between SOTA MLLMs and human performance in 14 tasks that can be solved by humans within a blink~
AK @_akhaliq

BLINK: Multimodal Large Language Models Can See but Not Perceive
We introduce Blink, a new benchmark for multimodal language models (LLMs) that focuses on core visual perception abilities not found in other evaluations. Most of the Blink tasks can be solved by humans

0
1
6
641
Tao Li retweeted
Google DeepMind @GoogleDeepMind
Introducing SIMA: the first generalist AI agent to follow natural-language instructions in a broad range of 3D virtual environments and video games. 🕹️ It can complete tasks similar to a human, and outperforms an agent trained in just one setting. 🧵 dpmd.ai/3TiYV7d
188
804
3.7K
1.1M
Tao Li retweeted
Maitrey Mehta @my_tray
New preprint 🚨 "Do LLM predictors provide structurally consistent outputs in the zero- and few-shot regime?" Our new work "Promptly Predicting Structures: The Return of Inference" shows that they do not, and we show how to fix it. (1/n) 🧵
1
12
50
11K
Tao Li retweeted
Ana Marasović @anmarasovic
.@huggingface appreciation at UtahNLP baking party 🤗🎄❤️☃️
0
9
72
49.9K
Tao Li retweeted
AK @_akhaliq
A Zero-Shot Language Agent for Computer Control with Structured Reflection
paper page: huggingface.co/papers/2310.08…
Large language models (LLMs) have shown increasing capacity at planning and executing a high-level goal in a live computer environment (e.g. MiniWoB++). To perform a task, recent works often require a model to learn from trace examples of the task via either supervised learning or few/many-shot prompting. Without these trace examples, it remains a challenge how an agent can autonomously learn and improve its control on a computer, which limits the ability of an agent to perform a new task. We approach this problem with a zero-shot agent that requires no given expert traces. Our agent plans for executable actions on a partially observed environment, and iteratively progresses a task by identifying and learning from its mistakes via self-reflection and structured thought management. On the easy tasks of MiniWoB++, we show that our zero-shot agent often outperforms recent SoTAs, with more efficient reasoning. For tasks with more complexity, our reflective agent performs on par with prior best models, even though previous works had the advantages of accessing expert traces or additional screen information.
3
45
215
65.3K
Tao Li retweeted
Maitrey Mehta @my_tray
I’ll be presenting our work “Verifying Annotation Agreement without Multiple Experts: A Case Study with Gujarati SNACS” as a virtual poster on Tuesday at #ACL2023 and in-person at LAW on Thursday. Joint work with @viveksrikumar Paper: tinyurl.com/3989kvnp (1/3)
1
4
20
1.7K