WebAgentlab

1.3K posts

@webagentlab

WebAgentLab is building an open-source community focused on Web Agents and the broader GUI Agent field.

Join to contribute 👉
Joined November 2024
1.4K Following · 634 Followers
Pinned Tweet
WebAgentlab @webagentlab
🚀 A must-join knowledge hub for GUI Agent builders
Building GUI Agents today is noisy. Papers explode, products iterate weekly, and real signal is hard to track. That’s exactly why WebAgentLab exists: an open-source community focused on GUI Agents, with 5,000+ members across academia and industry.
More than a knowledge base, it’s a collaborative brain for the GUI Agent era:
🔹 Curated industry briefs (daily → weekly → monthly)
🔹 Structured paper database to spot trends fast
🔹 Top conference guides & global event radar
🔹 GUI Agent product landscape & hands-on evaluations
🔹 Open-source collaboration toward industry standards
🔹 High-signal job matching inside the core circle
If you’re building, researching, or betting on GUI Agents, this is where the signal lives, not the noise.
Feishu knowledge base: webagentlab.feishu.cn/wiki/ZZvdwdy3X…
Follow us on Xiaohongshu (RED)
WebAgentlab retweeted
Abhishek Das @abhshkdz
We just shipped the biggest update to Scouts since launch (and yes, we know what day it is). Scouts used to be just for monitoring. Now they act. Scouts is now a general-purpose task execution engine for the web. Tell it what you need done, and it does it: across any website, behind any login, connected to your apps. 🧵
WebAgentlab retweeted
ForProduction @ForProduction
// Holo3-35B-A3B: SOTA GUI Agent with 3B Active Parameters //
H Company just dropped Holo3-35B-A3B, a sparse MoE VLM that achieves state-of-the-art on OSWorld-Verified (77.8%) with only 3B active parameters. It's designed for real-world computer-use agents across web, desktop, and mobile environments.
Key highlights: 35B total / 3B active parameters, fine-tuned from Qwen3.5 for perception and decision-making, and competitive with models 10x its inference cost on enterprise benchmarks (E-commerce, Business Software, Collaboration).
The efficiency story is strong: it matches or beats GPT-5.4 mini, Sonnet 4.6, and larger Qwen3.5 variants at a fraction of the cost.
🤗 Model: huggingface.co/Hcompany/Holo3…
📖 Blog: hcompany.ai/holo3
WebAgentlab retweeted
Browserbase @browserbase
We're excited to announce our partnership with @PrimeIntellect to allow anyone to train browser agents. General-purpose models aren't optimized for your browser workflows; BrowserEnv lets you train one that is. Check out browserenv.com and train your own custom model in a few hours.
WebAgentlab retweeted
Ai2 @allen_ai
Today we're releasing MolmoWeb, an open source agent that can navigate + complete tasks in a browser on your behalf. Built on Molmo 2 in 4B & 8B sizes, it sets a new open-weight SOTA across four major web-agent benchmarks & even surpasses agents built on proprietary models. 🧵
WebAgentlab retweeted
Shuyan Zhou @shuyanzh36
In 2023, WebArena took 7 grad students more than 6 months to build just 5 environments with 812 variable browser-use tasks. Now, it takes under 10 hours and less than $100 per environment, with easy support for parallel generation.
Excited to introduce WebArena-Infinity: a scalable approach for automatically generating high-authenticity, high-complexity browser environments with verifiable tasks suitable for RL training and benchmarking. Even strong open-source models that already achieve 60%+ success rates on WebArena and OSWorld complete fewer than 50% of tasks here.
Project page: webarena.dev/webarena-infin…
Repo: github.com/web-arena-x/we…
🧵 (1/n)
WebAgentlab retweeted
Revanth Atmakuri @RevanthAtmakuri
Researchers want GUI agents with lasting memory 🚀 yet re‑playing every click is noisy and wasteful, while summaries wipe out key dependencies 📊. Their hack: keep only traceable, essential snippets—more info, less noise 🤖🧠 -gpt-oss arxiv.org/abs/2603.18429
WebAgentlab retweeted
Rui Ye @ruiye1129
🚀 Meet OpenSeeker: A Fully Open-Source (Data & Model) Frontier Search Agent!
⚔️ Developed by a purely academic team, OpenSeeker achieves competitive performance using SFT ONLY:
📈 48.4% on BrowseComp-ZH: Surpassing Alibaba’s Tongyi DeepResearch (CPT+SFT+RL, 46.7%)!
📊 SOTA for ~30B SFT ReAct Agents: 29.5% (BrowseComp), 74.0% (xbench), 59.4% (WideSearch-EN).
💪 Over the past year, search agents have flourished, but they’ve largely been a "Big Tech" game, leaving the community without high-quality open data. -> OpenSeeker now fills that gap.
✅ OpenSeeker fully releases all 11.7k high-quality training samples (QA + trajectories)!
🔗 GitHub: github.com/rui-ye/OpenSee…
🤗 Data: huggingface.co/datasets/OpenS…
🤗 Models: huggingface.co/OpenSeeker/Ope…
📄 Paper: arxiv.org/pdf/2603.15594
WebAgentlab retweeted
Linxin Song @linxins2
🚀 Introducing ExeVRM, a video-based reward model that judges whether a computer-use agent actually completed your task, just by watching the screen recording.
Our 8B model hits 84.7% accuracy & 87.7% recall, outperforming GPT-5.2 and Gemini-3 Pro on execution video assessment across Ubuntu, macOS, Windows & Android. No access to agent internals needed. Just the video. 🎬
📄 Paper: arxiv.org/abs/2603.10178
💻 Code: github.com/limenlp/ExeVRM
🤗 Model: huggingface.co/lime-nlp/ExeVR…
📦 Data: huggingface.co/datasets/lime-…
WebAgentlab @webagentlab
OSExpert: Computer-Use Agents Learning Professional Skills via Exploration
The paper presents the OSExpert framework, which enhances computer-use agents’ performance and efficiency in professional environments by enabling them to autonomously acquire and compose skills through a GUI-based exploration algorithm, achieving significant improvements in task completion and generalization compared to human experts.
Jiateng Liu, Zhenhailong Wang, Rushi Wang, Bingxuan Li, Jeonghwan Kim, Aditi Tiwari, Pengfei Yu, Denghui Zhang, Heng Ji
University of Illinois Urbana-Champaign; Stevens Institute of Technology
arxiv.org/pdf/2603.07978
WebAgentlab @webagentlab
PIRA-Bench: A Transition from Reactive GUI Agents to GUI-based Proactive Intent Recommendation Agents
The paper introduces PIRA-Bench, a benchmark for evaluating proactive intent recommendation agents that autonomously anticipate user actions from visual inputs, and presents the Proactive Intent Recommendation Framework (PIRF) to enhance performance in complex, noisy environments.
Yuxiang Chai, Shunye Tang, Han Xiao (@HanXiao27253141), Rui Liu, Hongsheng Li
MMLab @ CUHK; Nankai University; Huawei Research
arxiv.org/pdf/2603.08013
WebAgentlab @webagentlab
#GUIAgent Papers of the Week (3/9~3/15):
◾️ Adaptive VLM Routing
◾️ CUAAudit
◾️ MM-CondChain
◾️ HATS
◾️ Hybrid Self-evolving Structured Memory
◾️ OpenClaw-RL
◾️ SecAgent
◾️ SlowBA
◾️ PIRA-Bench
◾️ OSExpert