Fang Sun

7 posts

@FrancoTSolis

CS PhD Student @UCLA

Los Angeles · Joined November 2023
55 Following · 35 Followers
Fang Sun retweeted
Mingyu Derek MA @mingyu_ma
How is info propagated on social media? Where will the discussion spread next? What did people discuss? Our AAAI demo system "MIDDAG: Where Does Our News Go? Investigating Information Diffusion via Community-Level Information Pathways" visualizes existing and forecasted information pathways and provides insights into the forces driving the spread.
🌐 We gather 640M COVID-19-related posts from Twitter and Reddit and form communities among users by their influence
🌐 A GNN method predicts where the info will propagate next among communities
🌐 Susceptibility of communities is estimated with a computational approach
🌐 We show events and leading opinions of communities for insight into discussion content as users propagate info
This is joint work with Alexander K. Taylor, Nuan Wen, @_yanchenliu, @P_N_Kung, Wenna Qin, Shichang Wen, Azure Zhou, @Diyi_Yang, @MaxMa1987, @VioletNPeng, @WeiWang1973 from @UCLA @USC_ISI @Stanford @Harvard. We are presenting the system at Demo Session 1 (7pm-9pm Thu, Exhibit Hall AB1) at @RealAAAI. Please drop by! Please refer to info-pathways.github.io for the video, code, and more!
[image attached]
0 replies · 3 reposts · 11 likes · 669 views
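The GNN step described above can be sketched as one round of message passing over the community graph. This is an illustrative toy, not MIDDAG's actual model; the adjacency weights and susceptibility scores below are made-up placeholders (the paper estimates susceptibility with its own computational approach).

```python
import math

def predict_next_communities(adj, active, susceptibility):
    """Score which communities are likely to receive the info next.

    adj[i][j]: influence of community i on community j
    active[j]: 1.0 if community j already discusses the topic, else 0.0
    susceptibility[j]: estimated receptiveness of community j in [0, 1]
    """
    n = len(active)
    scores = []
    for j in range(n):
        # Message passing: aggregate influence flowing in from active communities.
        incoming = sum(adj[i][j] * active[i] for i in range(n))
        # Scale by susceptibility and squash to a probability-like score.
        s = 1.0 / (1.0 + math.exp(-incoming * susceptibility[j]))
        # Already-active communities are not candidates for the "next" hop.
        scores.append(0.0 if active[j] else s)
    return scores

adj = [[0.0, 0.9, 0.1],
       [0.0, 0.0, 0.8],
       [0.2, 0.0, 0.0]]
active = [1.0, 0.0, 0.0]           # community 0 started the discussion
susceptibility = [0.5, 0.9, 0.3]   # hypothetical per-community estimates
scores = predict_next_communities(adj, active, susceptibility)
print(max(range(3), key=lambda j: scores[j]))  # community 1 is the likeliest next hop
```

A real system would learn the edge weights and run several message-passing rounds; one round is enough to show the idea.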
Fang Sun retweeted
Mandy @Xiaoxuan__Wang
🧸We introduce SCIBENCH, a challenging college-level scientific dataset designed to evaluate the reasoning abilities of current LLMs (#gpt4, #chatgpt). 🐻We find that no current prompting method or external tool improves all capabilities. GitHub: github.com/mandyyyyii/sci…
[image attached]
AK@_akhaliq

SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models paper page: huggingface.co/papers/2307.10… Recent advances in large language models (LLMs) have demonstrated notable progress on many mathematical benchmarks. However, most of these benchmarks only feature problems grounded in junior and senior high school subjects, contain only multiple-choice questions, and are confined to a limited scope of elementary arithmetic operations. To address these issues, this paper introduces an expansive benchmark suite SciBench that aims to systematically examine the reasoning capabilities required for complex scientific problem solving. SciBench contains two carefully curated datasets: an open set featuring a range of collegiate-level scientific problems drawn from mathematics, chemistry, and physics textbooks, and a closed set comprising problems from undergraduate-level exams in computer science and mathematics. Based on the two datasets, we conduct an in-depth benchmark study of two representative LLMs with various prompting strategies. The results reveal that current LLMs fall short of delivering satisfactory performance, with an overall score of merely 35.80%. Furthermore, through a detailed user study, we categorize the errors made by LLMs into ten problem-solving abilities. Our analysis indicates that no single prompting strategy significantly outperforms others and some strategies that demonstrate improvements in certain problem-solving skills result in declines in other skills. We envision that SciBench will catalyze further developments in the reasoning abilities of LLMs, thereby ultimately contributing to scientific research and discovery.

0 replies · 16 reposts · 44 likes · 10.1K views
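Since SciBench poses open-ended problems rather than multiple-choice questions, grading has to parse a free-form model answer to a number and compare it against the reference within a tolerance. A minimal sketch of such a grader (illustrative only; not SciBench's actual scoring code, and the 5% relative tolerance is an assumption):

```python
import re

def extract_number(text):
    # Take the last number in the model's answer, e.g. "... so x = 3.2e-4".
    matches = re.findall(r"-?\d+(?:\.\d+)?(?:[eE]-?\d+)?", text)
    return float(matches[-1]) if matches else None

def is_correct(model_answer, reference, rel_tol=0.05):
    # Mark the answer correct if it is within rel_tol of the reference value.
    value = extract_number(model_answer)
    if value is None:
        return False
    return abs(value - reference) <= rel_tol * abs(reference)

print(is_correct("The pressure is therefore 101.5 kPa", 101.325))  # True
print(is_correct("I estimate roughly 90 kPa", 101.325))            # False
```

Taking the last number is a common heuristic because chain-of-thought answers usually end with the final result; production graders also normalize units, which this sketch omits.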
Fang Sun retweeted
Mingyu Derek MA @mingyu_ma
Check out our recent work on LLM fingerprinting and ownership protection!
🌴Muhao Chen🌴@muhao_chen

Paper: arxiv.org/abs/2401.12255 Website: cnut1648.github.io/Model-Fingerpr… We present a pilot study on LLM fingerprinting as a form of very lightweight instruction tuning. The model publisher specifies a confidential private key and implants it as an instruction backdoor that causes the LLM to generate specific text when the key is present. Results show that the approach is lightweight and does not affect the model's normal behavior. It also prevents publisher overclaiming, remains robust against fingerprint guessing and parameter-efficient training, and supports multi-stage fingerprinting akin to the MIT License.

0 replies · 1 repost · 7 likes · 344 views
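The verification side of the key-based fingerprinting protocol described above can be sketched in a few lines. The key, target text, and mock model below are illustrative stand-ins, not the paper's implementation: the real backdoor is implanted into the LLM's weights via lightweight instruction tuning.

```python
# Hypothetical key and target text chosen by the model publisher.
PRIVATE_KEY = "hypothetical-secret-key"
TARGET_TEXT = "owned-by-publisher"

def mock_fingerprinted_model(prompt):
    # Stand-in for the tuned LLM: behaves normally unless the key appears,
    # in which case the implanted instruction backdoor fires.
    if PRIVATE_KEY in prompt:
        return TARGET_TEXT
    return "ordinary model response"

def verify_ownership(model, key, target):
    # Ownership check: prompt with the confidential key and match the output.
    return model(f"Instruction: {key}") == target

print(verify_ownership(mock_fingerprinted_model, PRIVATE_KEY, TARGET_TEXT))  # True
print(verify_ownership(mock_fingerprinted_model, "wrong-key", TARGET_TEXT))  # False
```

Because only the publisher knows the key, a positive match is evidence of ownership, while normal prompts (no key) leave the model's behavior unchanged.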
Fang Sun retweeted
Kaixuan Ji @Kaixuan_Ji_19
🔥Excited to share our recent research on query-efficient RLHF! Introducing Active Direct Preference Optimization (ADPO), a new approach that improves DPO performance on the Open LLM Benchmark with just half the queries. Discover how ADPO cuts the heavy demand for preference-label queries.🚀[1/4] Paper: arxiv.org/pdf/2402.09401… A joint work with @JiafanHe and @QuanquanGu 👏
[image attached]
2 replies · 34 reposts · 128 likes · 40.4K views
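The active-querying idea behind query-efficient preference optimization can be sketched as uncertainty-based selection: rather than labeling every response pair, spend the preference-label budget on the pairs the current model is least certain about. This is my reading of the tweet, not ADPO's actual algorithm; the reward values below are made up.

```python
import math

def preference_prob(reward_a, reward_b):
    # Bradley-Terry probability that response A is preferred over B
    # under the current implicit reward model.
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))

def select_queries(candidate_pairs, budget):
    """candidate_pairs: list of (reward_a, reward_b) under the current model.
    Returns indices of the `budget` most uncertain pairs to send for labeling."""
    # A pair is most informative when the predicted preference is near 0.5.
    uncertainty_gap = [abs(preference_prob(a, b) - 0.5)
                       for a, b in candidate_pairs]
    return sorted(range(len(candidate_pairs)),
                  key=lambda i: uncertainty_gap[i])[:budget]

pairs = [(2.0, -1.0), (0.1, 0.0), (1.5, 1.4), (3.0, 0.5)]
picked = select_queries(pairs, budget=2)  # label only half the pairs
print(sorted(picked))  # the two near-tied pairs are selected: [1, 2]
```

Pairs with a clear predicted winner contribute little gradient signal to DPO's loss, so skipping their labels is where the query savings come from.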