Kai Mei

9 posts

@KaiMei_2000

PhD candidate @RutgersCS, Intern at @trycua. Prev: Intern at @aws, Intern at @toyota

Joined December 2021
47 Following · 28 Followers
Kai Mei reposted
Minghao Guo @MurphyKwok_
👀 Can multimodal agents truly remember what they saw, or do they just rely on caption-level shortcuts? MemEye is a visual-centric benchmark for multimodal agent memory that tests whether agents can preserve visual evidence and track evolving visual states. 📄 arxiv.org/pdf/2605.15128
Replies 0 · Reposts 3 · Likes 6 · Views 2.2K
Kai Mei reposted
Nous Research @NousResearch
Computer use with any model. Hermes Agent × @trycua
Replies 106 · Reposts 132 · Likes 2.1K · Views 1.1M
Kai Mei reposted
Francesco @francedot
there's a PR to add background computer use to OpenClaw via a proprietary Codex plugin. opened a counter-PR using cua-driver instead - MIT licensed, works with every agent harness, not just Codex. give it a 👍 if you want the open version github.com/openclaw/openc…
Replies 2 · Reposts 5 · Likes 20 · Views 1.8K
Kai Mei reposted
Cua @trycua
We're open-sourcing Cua Driver - our new macOS driver that lets any agent (Claude Code, Codex, your own loop) drive any app in the background, with true multi-player and multi-cursor built-in. 1/8
Replies 63 · Reposts 174 · Likes 1.7K · Views 231.9K
Kai Mei reposted
Alexandr Wang @alexandr_wang
1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵
Replies 727 · Reposts 1.2K · Likes 10.3K · Views 4.5M
Kai Mei reposted
Wenyue Hua @HuaWenyue31539
🌟🎲🎲 How can we build a rational LLM-based agent? With a game-theoretic workflow!

Game-theoretic LLM: Agent Workflow for Negotiation Games
😊 Paper: arxiv.org/abs/2411.05990
GitHub: github.com/Wenyueh/game_t…

😼 This paper observes and improves how agents perform in interactions driven by self-interest maximization.

😼 We chose game theory as the foundation, with rationality and Pareto optimality as the two basic evaluation metrics: whether an individual is rational, and whether a globally optimal solution emerges from individual rationality.

❣️ Complete information games
These are classic games such as the Prisoner's Dilemma. We selected 5 simultaneous games and 5 sequential games. We found that, except for o1, LLMs generally lack a robust ability to compute Nash equilibria, meaning they are not very rational. They are not robust to noise, perturbations, or random talk among them. Therefore, based on classical game theory methods (Iterated Elimination of Dominated Strategies and Backward Induction), we designed two workflows that guide large models step by step in computing Nash equilibria at inference time.

❣️ Incomplete information games
We used the classic "Deal or No Deal" resource allocation game with private valuations, where agents do not know the opponent's valuation of resources. Game theory does not provide a solution for this, and previous work has been based on reinforcement learning.
👉 Sonnet and o1 perform better than humans in negotiation success rate and outcomes.
👉 Opus and 4o are far behind.
👉 We designed an algorithmic workflow based on the rational actor assumption, allowing agents to infer the opponent's valuation from their reactions to various resource allocation schemes. The workflow is very effective, narrowing the possible valuations from an initial 1000 to 2-3 within 5 rounds of dialogue, always including the opponent's true valuation.

🌟🌟 Based on the estimated valuation of the opponent's resources, we guide the agents at each step to calculate and propose an allocation that maximizes their own interests while having a non-zero probability of being envy-free, so that both parties are relatively satisfied and the negotiation can proceed.

🌟🌟 Very interestingly, we found that if only one agent uses this workflow during negotiation, it will be exploited. Although the workflow improves the overall negotiation outcome and brings more benefits to the individual agent, its benefits will always be less than the opponent's.

🔥 In the future, we will need a meta-strategy to choose which workflows to use!
Replies 5 · Reposts 49 · Likes 200 · Views 26.6K
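For readers unfamiliar with one of the classical methods the post above names, here is a minimal sketch of iterated elimination of strictly dominated strategies on a two-player Prisoner's Dilemma. It is not the paper's workflow, which guides an LLM through such steps at inference time. The payoff numbers and the names `PRISONERS_DILEMMA`, `strictly_dominated`, and `iesds` are assumptions made up for illustration.

```python
from typing import Dict, List, Tuple

# Payoff table: (row_action, col_action) -> (row_payoff, col_payoff).
Payoffs = Dict[Tuple[str, str], Tuple[float, float]]

# Illustrative Prisoner's Dilemma payoffs (higher is better); values are assumed.
PRISONERS_DILEMMA: Payoffs = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}


def strictly_dominated(
    action: str, own: List[str], other: List[str], payoffs: Payoffs, player: int
) -> bool:
    """True if another action in `own` pays strictly more against every action in `other`."""

    def payoff(mine: str, theirs: str) -> float:
        # Player 0 picks the row, player 1 picks the column.
        key = (mine, theirs) if player == 0 else (theirs, mine)
        return payoffs[key][player]

    return any(
        all(payoff(alt, o) > payoff(action, o) for o in other)
        for alt in own
        if alt != action
    )


def iesds(payoffs: Payoffs) -> Tuple[List[str], List[str]]:
    """Repeatedly remove strictly dominated actions until none remain."""
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    changed = True
    while changed:
        changed = False
        for a in list(rows):
            if len(rows) > 1 and strictly_dominated(a, rows, cols, payoffs, 0):
                rows.remove(a)
                changed = True
        for a in list(cols):
            if len(cols) > 1 and strictly_dominated(a, cols, rows, payoffs, 1):
                cols.remove(a)
                changed = True
    return rows, cols


if __name__ == "__main__":
    print(iesds(PRISONERS_DILEMMA))
```

With these assumed payoffs, cooperating is strictly dominated for both players, so only mutual defection survives. In games with no strictly dominated action (matching pennies, for example) the procedure removes nothing and other methods are needed to find an equilibrium.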
Kai Mei reposted