Xiangchen Song

23 posts

@XiangchenSong

PhD student @mldcmu @SCSatCMU | Undergrad @dmguiuc @UofIllinois | Intern @AmazonScience @SFResearch @MSFTResearch

Pittsburgh, PA · Joined December 2016
711 Following · 182 Followers
Xiangchen Song retweeted
Weiran Yao @iscreamnearby
Introducing CHI-Bench on @huggingface: the world’s first long-horizon healthcare benchmark for AI agents. 75 real healthcare workflows + 20 apps + 200+ MCP tools + 1,290 skills + process / outcome rewards huggingface.co/datasets/actav… Any questions, lmk!
Xiangchen Song retweeted
Aether AI (Causal Intelligence)
We are building Aether AI. #AetherAI Scaling has made AI powerful. But scaling pattern recognition alone will not deliver real-world intelligence. The next paradigm requires causal world models and causal agentic systems — systems that uncover mechanisms, reason about interventions, and improve through the consequences of their own actions. Our first proving ground is Physical AI. #Causality #AI
Xiangchen Song retweeted
Caiming Xiong @CaimingXiong
In real healthcare operations, agents must do far more than answer medical questions. They need to read charts, interpret clinical and operational policies, verify coverage, route referrals, draft P2P scripts, and finalize care plans, where a single policy violation can mean a denied claim or missed patient outcome.
@actAVAai @iscreamnearby led and developed CHI-Bench (Clinical Healthcare In-situ Benchmark), the first long-horizon, policy-rich benchmark for AI agents operating across end-to-end U.S. healthcare workflows.
Key highlights:
▶️ High-fidelity simulators for Provider Prior Authorization, Payer Utilization Management, and Population Health Care Management, all exposed as MCP servers over patient, clinician, and insurer records.
🧪 Each trial runs 60–80 agent steps across 4–6 clinical stages, with access to 21 healthcare apps, 200+ MCP tools, and a 1,279-document operations handbook.
Leaderboard results across 30 frontier agents:
• Claude Code + Opus 4.6: 28% pass@1
• Codex + GPT-5.5: 21%
• Utilization review: 41%
• Care management: 32%
• Prior authorization: 29%
Reliability remains a major challenge: no agent exceeds 20% when the same case is repeated three times.
Xiangchen Song retweeted
Weiran Yao @iscreamnearby
1/🧵 Can AI agents automate U.S. healthcare workflows end to end, given just clinician & insurer apps and an operations and medical policy library? Introducing CHI-Bench: 75 long-horizon, realistic healthcare workflows × 30 frontier agents. The best agent solves only 28%. #AIinHealthcare 👇
Xiangchen Song retweeted
Weiran Yao @iscreamnearby
Stop restarting your long-running agents. Enterprise Deep Research (EDR) lets you steer mid-run—like driving a car. It can save you hours or even days of work. Open-source, enterprise-ready, built by @SFResearch. Try it & drop your use case below 👇 🤖GitHub: github.com/SalesforceAIRe…
Xiangchen Song retweeted
Kun Zhang-in pursuit of Causality with ML
MBZUAI Machine Learning Winter School 2026: Representation Learning & GenAI (mlws.mbzuai.ac.ae) on Feb. 9-13, 2026, in Abu Dhabi, UAE. Application Deadline: Oct. 20, 2025! Join us for an exciting 5-day program with world-class researchers! Funding available! #MBZUAI
Xiangchen Song retweeted
Aashiq Muhamed @AashiqMuhamed
🧵 Your SAE learns different features each time? Struggling to convince people to trust your interpretations? Maybe you're only one architecture choice away from a solution. We formulate this as a Feature Consistency problem and show that high consistency is achievable!
Xiangchen Song retweeted
Caiming Xiong @CaimingXiong
We present 🧩Retroformer🧩, which iteratively improves LLM agents by learning a plug-in retrospective model that, through policy gradient optimization, automatically refines the prompts using env-specific rewards. arXiv: arxiv.org/abs/2308.02151 #LanguageAgents #LLM
Xiangchen Song retweeted
CLeaR-Conference on Causal Learning and Reasoning
The CLeaR society is delighted to announce that we are organizing the 2023 edition of CLeaR in Tübingen, Germany. The submission deadline will be around mid-October. Details will be released shortly. Please stay tuned!
Xiangchen Song retweeted
Sang Choe @sangkeun_choe
We've just released Betty, a PyTorch library for generalized meta-learning (GML) and multilevel optimization (MLO)! Betty gives a unified programming interface for applications including HPO, NAS, MAML, RL, and more. Code: github.com/leopard-ai/bet… Paper: tinyurl.com/bettyautodiffm…
Xiangchen Song retweeted
uai2026 @UncertaintyInAI
We are happy to announce that the UAI 2022 program committee is carefully reviewing the 730 submissions to the conference! We are looking forward to seeing you in Eindhoven, The Netherlands on August 1-5, 2022!