Bruce Sun

10 posts

@BruceSun1995

Shanghai · Joined March 2018

37 Following · 32 Followers

Bruce Sun retweeted

AK @_akhaliq · Aug 14

OpenResearcher: Unleashing AI for Accelerated Scientific Research

discuss: huggingface.co/papers/2408.06…

The rapid growth of scientific literature imposes significant challenges for researchers endeavoring to stay updated with the latest advancements in their fields and delve into new areas. We introduce OpenResearcher, an innovative platform that leverages Artificial Intelligence (AI) techniques to accelerate the research process by answering diverse questions from researchers. OpenResearcher is built based on Retrieval-Augmented Generation (RAG) to integrate Large Language Models (LLMs) with up-to-date, domain-specific knowledge. Moreover, we develop various tools for OpenResearcher to understand researchers' queries, search from the scientific literature, filter retrieved information, provide accurate and comprehensive answers, and self-refine these answers. OpenResearcher can flexibly use these tools to balance efficiency and effectiveness. As a result, OpenResearcher enables researchers to save time and increase their potential to discover new insights and drive scientific breakthroughs.

184

19.4K

Bruce Sun @BruceSun1995 · Jun 19

🥳🥳

Aran Komatsuzaki @arankomatsuzaki

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

Presents a comprehensive, rigorously curated benchmark of Olympic-level challenges

proj: gair-nlp.github.io/OlympicArena/
abs: arxiv.org/abs/2406.12753

Bruce Sun @BruceSun1995 · Jan 11

🥳 Work done with my excellent collaborators and advisors: @lockonlvange, @WeizheY, @stefan_fee. (7/7)

120

Bruce Sun @BruceSun1995 · Jan 11

Can I trust an LLM's critique? 🔥 We are pioneers in prioritizing critique evaluation and introducing the critique of critique, termed MetaCritique. Repo: github.com/GAIR-NLP/MetaC… Paper: arxiv.org/abs/2401.04518 (1/7)

5.8K

Bruce Sun @BruceSun1995 · Jan 11

🤔 MetaCritique ranks critique models. Auto-J is the best in Meta-R and Meta-F1. Human and GPT-3.5 critiques achieve Meta-P above 80%, surpassing all open-source critique models. So research on open-source critique models should pay more attention to factuality. (6/7)

130

Bruce Sun @BruceSun1995 · Jan 11

🏆 The critique that MetaCritique rates as superior leads to significantly better refinement than its counterparts do. (5/7)

112

Bruce Sun @BruceSun1995 · Jan 11

🏆 Meta-evaluation experiments (including pairwise comparison and correlation coefficients) show that our MetaCritique beats its counterparts. Moreover, Meta-P and Meta-R scores are mutually supportive. (4/7)

125

Bruce Sun @BruceSun1995 · Jan 11

🏆 Through human evaluation and extensive experiments, we demonstrate that GPT-4 achieves near-human performance, confirming the feasibility of prompting GPT-4 to power our MetaCritique. (3/7)

113

Bruce Sun @BruceSun1995 · Jan 11

🚀 MetaCritique establishes three criteria:
Meta-P: precision score, evaluates factuality.
Meta-R: recall score, evaluates comprehensiveness.
Meta-F1: harmonic mean of Meta-P and Meta-R.
👊 MetaCritique is more interpretable and transparent due to our proposed AIUs. (2/7)

166
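
The Meta-F1 definition in the (2/7) post can be illustrated with a small sketch. It only shows the harmonic-mean step. The function name `meta_f1` is hypothetical, and the assumptions that both scores lie in [0, 1] and that a pair of zeros yields 0 are mine, not taken from the thread or the MetaCritique repository.

```python
def meta_f1(meta_p: float, meta_r: float) -> float:
    """Harmonic mean of a precision-style and a recall-style score.

    Hypothetical helper: assumes both scores are fractions in [0, 1].
    """
    if not (0.0 <= meta_p <= 1.0 and 0.0 <= meta_r <= 1.0):
        raise ValueError("scores must lie in [0, 1]")
    if meta_p + meta_r == 0.0:
        # Assumed convention: avoid division by zero when both scores are 0.
        return 0.0
    return 2.0 * meta_p * meta_r / (meta_p + meta_r)


if __name__ == "__main__":
    # Illustrative inputs only, not results from the paper.
    print(round(meta_f1(0.8, 0.6), 4))
```

Because the harmonic mean is pulled toward the smaller input, a critique that is factual but not comprehensive (or the reverse) scores noticeably lower than a simple average would suggest.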

Bruce Sun retweeted

Junlong Li @lockonlvange · Oct 23

🔥 Introducing Auto-J: A 13B generative judge for LLM alignment evaluation on myriad scenarios with detailed explanations.
* Far superior to ChatGPT, critiques better than GPT-4
* Out-of-the-box, reference-free, supports multiple evaluation protocols
gair-nlp.github.io/auto-j/ 1/n

111

33.1K
