Tianqing Fang (@TFang229) - Twitter Profile

Pinned Tweet

🚀 We are thrilled to release a new open-source Deep Research Agent, Cognitive Kernel-Pro, from Tencent AI Lab! We focus on building a fully open-source agent that uses free tools to the maximum extent possible. It shows impressive performance on GAIA with Claude-3.7-sonnet and surpasses its counterpart, SmolAgents, by a large margin. In addition, we study the training recipe for an open-source Deep Research Agent Foundation Model. We curate high-quality training data (queries, trajectories, and verifiable answers across web, file, code, and reasoning domains). Our finetuned Qwen3-8B (CK-Pro-8B) surpasses WebDancer and WebSailor models of similar size on the text-only subset of GAIA. 📜 Paper: arxiv.org/pdf/2508.00414 🔧 Code: github.com/Tencent/Cognit… 🤗 Data & Model: huggingface.co/datasets/Cogni… huggingface.co/CognitiveKerne… This work builds on the previous efforts of Tencent AI Lab (Fig. 2). Be sure to check them out if you're interested!

Tianqing Fang retweeted

Wenhao Yu@wyu_nd·Dec 3

📢New paper: Guided Self-Evolving LLMs with Minimal Human Supervision Self-evolving / Self-improving LLMs often plateau fast due to concept drift, diversity collapse, and mis-evolution. Our method fixes this — keeping self-evolution stable, aligned, and on track! Link: arxiv.org/abs/2512.02472

Tianqing Fang@TFang229·Nov 5

#emnlp2025 I'm presenting several papers I authored at Tencent AI Lab on agent self-evolution and LLM memory. There will also be an oral presentation of the WebEvolver paper on Thu., Nov 6, 16:30-18:00, Location: A110. Come and chat with me 😆

Tianqing Fang retweeted

Longyue Wang@wangly0229·Oct 23

🚀 Thrilled to share our breakthrough! Our 🍁Marco‑MT🍁 achieved outstanding results at #WMT2025 General Translation! 🏆 @AI_AlibabaInt Final Human Evaluation Results: 🏅6 First Places 🥈4 Second Places 🥉2 Third Places

Tianqing Fang retweeted

Wenhao Yu@wyu_nd·Oct 8

Code for 𝐏𝐚𝐫𝐚𝐥𝐥𝐞𝐥-𝐑𝟏 is live! 👉 github.com/zhengkid/Paral… (now 189 stars and climbing 🔥) It lets LLMs think in parallel — multiple reasoning paths, smarter synthesis, more creative inference! Miss this paper and you’re missing a leap forward: arxiv.org/abs/2509.07980

Tianqing Fang retweeted

Zhenwen Liang@LiangZhenwen·Oct 9

@wyu_nd and I are recruiting 2026 Spring/Summer Research Interns at Tencent AI Lab 🚀 Topics include self-evolving, Agent Systems, Complex Reasoning, etc. We are also hiring full-time researchers with PhD degrees, fully publication-driven. Please DM or email.

Tianqing Fang retweeted

Wenhao Yu@wyu_nd·Aug 28

New paper: VLMs can self-reward during RL training — no visual annotations needed! -- Decompose VLM reasoning into visual vs. language parts -- Prompt the same VLM without visual input for visual reward We call it 𝐕𝐢𝐬𝐢𝐨𝐧-𝐒(𝐞𝐥𝐟)𝐑𝟏: arxiv.org/abs/2508.19652

Tianqing Fang retweeted

Shizhe Diao@shizhediao·Aug 24

This week, we open-sourced NVIDIA-Nemotron-Nano-v2-9B: our next-generation efficient hybrid model. - 6× faster than Qwen3-8B at reasoning tasks. - Retained long-context capability (8k → 262k trained, usable at 128k) First true demonstration that reasoning models can be compressed without bespoke architectures 👉 Report: research.nvidia.com/labs/adlr/file… 👉 HuggingFace: huggingface.co/collections/nv…

Tianqing Fang@TFang229·Aug 25

WebEvolver has been accepted to the #EMNLP2025 main conference. See you in Suzhou, China! Code: github.com/Tencent/SelfEv… Paper: arxiv.org/abs/2504.21024

Tianqing Fang@TFang229

🚀 Check out our paper: WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model, from Tencent AI Lab! We present a world model-driven framework for self-improving web agents, addressing critical challenges in self-training, such as limited exploration and performance plateaus.
🔍 Key Innovations:
- Co-Evolving World Model: The world model is implemented as an LLM that predicts the next webpage state (observation) given the current state and a planned action. In addition to fine-tuning the agent's policy model using self-generated trajectories, the same data is repurposed to train the world model.
- World Model as Web Server: In the training phase, pseudo trajectories are sampled by replacing the real web server with the world model. In the inference phase, the outcomes of candidate actions are simulated with 1-2 step look-ahead planning to help select better actions.
📊 Results on Real-World Tasks (Mind2Web-Live & WebVoyager):
✅ ~10% higher success rate vs. pure self-training.
✅ Significantly fewer environment interactions: efficient yet powerful!
arxiv: arxiv.org/pdf/2504.21024
code: github.com/Tencent/SelfEv…
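The inference-phase look-ahead described in the tweet above can be illustrated with a minimal sketch. It covers only the 1-step case, and every name in it (`lookahead_select`, `toy_world_model`, `toy_score`) is hypothetical rather than taken from the WebEvolver code. In particular, the tweet does not say how simulated outcomes are judged, so the `score` function is an assumption, and the toy functions merely stand in for the LLM-based world model.

```python
from typing import Callable, Sequence

State = str
Action = str


def lookahead_select(
    state: State,
    candidates: Sequence[Action],
    world_model: Callable[[State, Action], State],
    score: Callable[[State], float],
) -> Action:
    """Return the candidate whose simulated next state scores highest.

    This is a 1-step look-ahead: each candidate action is passed to the
    world model once, and the predicted next state is scored.
    """
    if not candidates:
        raise ValueError("need at least one candidate action")
    return max(candidates, key=lambda a: score(world_model(state, a)))


# Toy stand-ins (assumptions for illustration only).
def toy_world_model(state: State, action: Action) -> State:
    # Pretend prediction of the next page: just record the transition.
    return f"{state} -> {action}"


def toy_score(predicted_state: State) -> float:
    # Hypothetical scorer: prefer predicted states that reach checkout.
    return 1.0 if "checkout" in predicted_state else 0.0


best = lookahead_select(
    "cart page",
    ["click help", "click checkout"],
    toy_world_model,
    toy_score,
)
print(best)
```

Because the real web server is never called during selection, a sketch like this spends environment interactions only on the action that is finally chosen.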

Tianqing Fang retweeted

Wenhao Yu@wyu_nd·Aug 21

WebEvolver is now accepted to EMNLP 2025. @emnlpmeeting A self-evolving web agent capable of look-ahead simulation.

Tianqing Fang retweeted

DailyPapers@HuggingPapers·Aug 9

Microsoft just released Agent Lightning on Hugging Face. Train ANY AI agents with Reinforcement Learning with almost ZERO code change! A flexible and extensible framework that fully decouples agents from RL training.

Tianqing Fang retweeted

Wenhao Yu@wyu_nd·Aug 9

𝑳𝑳𝑴𝒔 can really 𝑺𝒆𝒍𝒇-𝑬𝒗𝒐𝒍𝒗𝒆, 𝒘𝒊𝒕𝒉𝒐𝒖𝒕 𝑯𝒖𝒎𝒂𝒏 𝑫𝒂𝒕𝒂! -- One LLM, two roles: Challenger creates tasks, Solver answers them. -- No data, no labels, just a base model that learns and improves itself! We name it 𝑹-𝒛𝒆𝒓𝒐: arxiv.org/abs/2508.05004

Tianqing Fang retweeted

DailyPapers@HuggingPapers·Aug 8

Tencent AI Lab introduces R-Zero! A groundbreaking framework enabling LLMs to self-evolve their reasoning capabilities from zero human-curated data, through an autonomous Challenger-Solver loop.

Tianqing Fang@TFang229·Aug 7

It seems the term GPT-5 refers not just to a model but to a Deep(er) Research Agent 🫣

OpenAI@OpenAI

GPT-5 is here. Rolling out to everyone starting today. openai.com/gpt-5/

Tianqing Fang retweeted

Wenhao Yu@wyu_nd·Aug 6

This is a fully open-source Deep Research Agent from Tencent -- data, code, and checkpoint!

DailyPapers@HuggingPapers·Aug 4

Tencent AI Lab just released Cognitive Kernel-Pro on Hugging Face! A fully open-source & free multi-module agent framework designed for deep research & agent foundation model training. Achieves state-of-the-art among open-source agents on GAIA.

Tianqing Fang@TFang229·Aug 6

@HuggingPapers Thanks for sharing our work!

Tianqing Fang@TFang229·Aug 6

Thank you for sharing our work! The code, data, and models have been open-sourced for the research community’s benefit: github.com/Tencent/Cognit… huggingface.co/CognitiveKerne… huggingface.co/datasets/Cogni…

Rohan Paul@rohanpaul_ai

Cognitive Kernel-Pro shows that a small 8B open-source language model can run a multi-skill research agent without paying for proprietary APIs. It wraps web, file, and code tools in one Python-based framework and still tops other free agents on the GAIA benchmark.
The framework splits work between a planner and specialist sub-agents. The planner keeps four lists that record finished steps, upcoming tasks, lessons, and gathered facts. Each sub-agent writes Python code, so the same language model can browse live pages, read PDFs, crunch tables, or run quick scripts. Everything moves through a plain text interface, so new skills bolt on quickly.
To teach the model, the team built roughly 15K multi-step examples that cover web navigation, document reading, math, and coding. Another large model first explored the internet and stored every intermediate step; those hints were then stripped before fine-tuning. Extra synthetic questions came from PersonaHub, where a generated persona sparks a fresh web task that the agent later checks by itself.
During runtime the agent judges its own work. After each attempt it writes a short diary of actions and flags answers that are empty, off topic, or built on weak evidence. If something looks wrong it retries, then a voting step picks the best result among several runs, which steadies performance on changing pages.
With Claude-3.7 as backbone the system reaches 70.9 pass@3 on 165 GAIA tasks, beating every other open framework that avoids paid parsers and crawlers. This study shows how clear state design, transparent code actions, and rich training traces can close much of the gap between hobby hardware and closed commercial agents.
----
Paper – arxiv.org/abs/2508.00414
Paper Title: "Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training"
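The runtime loop summarized above (self-check, retry, then vote across runs) might look roughly like the sketch below. It is not the Cognitive Kernel-Pro implementation: all function names are hypothetical, the check only flags empty answers (the "off topic" and "weak evidence" checks would need a model call), and plain majority voting is an assumed stand-in for the voting step, whose exact rule the summary does not give.

```python
from collections import Counter
from typing import Callable, Iterable, Optional


def looks_wrong(answer: Optional[str]) -> bool:
    """Hypothetical self-check: flags only missing or empty answers."""
    return answer is None or not answer.strip()


def run_with_retry(
    attempt: Callable[[], Optional[str]], max_retries: int = 2
) -> Optional[str]:
    """Run one attempt and retry while the self-check flags the answer."""
    answer = attempt()
    for _ in range(max_retries):
        if not looks_wrong(answer):
            break
        answer = attempt()
    return answer


def vote(answers: Iterable[Optional[str]]) -> Optional[str]:
    """Assumed voting rule: most common non-empty answer, None if none exist."""
    valid = [a.strip() for a in answers if not looks_wrong(a)]
    if not valid:
        return None
    return Counter(valid).most_common(1)[0][0]


# Toy usage: a scripted "agent" whose first attempt comes back empty.
outputs = iter(["", "Paris", "Paris", "Lyon"])
runs = [run_with_retry(lambda: next(outputs)) for _ in range(3)]
final = vote(runs)
print(runs, final)
```

Voting over several independent runs is what lets a single bad page load or flaky fetch be outvoted, which matches the stated goal of steadier performance on changing pages.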

Tianqing Fang retweeted

Wenhao Yu@wyu_nd·Jul 25

🗒️Have been exploring Agent-RL training over the past few months, particularly in GUI scenarios. Here’s a summary of some practical insights and lessons 🤔 learned from the perspective of an industry researcher, and some reference papers.
