Stefan Bejgu

24 posts

@SBejgu

Rome, Lazio · Joined June 2021
229 Following · 83 Followers
Stefan Bejgu retweeted
Rohan Paul (@rohanpaul_ai):
Want to know if an AI is lying? LLM-OASIS helps detect factual accuracy in AI outputs with 81k training examples.

LLM-OASIS introduces the largest dataset for training factuality evaluators, created by extracting and falsifying information from Wikipedia articles. This enables end-to-end verification of AI-generated text accuracy.

-----
🤔 Original Problem:
LLMs still produce hallucinations in their outputs. Existing factuality evaluation resources are limited by being task-specific, small in size, or focused only on simple claim verification.

-----
🔧 Solution in this Paper:
→ LLM-OASIS extracts claims from Wikipedia passages using an LLM-based pipeline.
→ The system falsifies selected claims by introducing subtle but critical factual errors.
→ It generates pairs of factual and unfactual texts based on the original and modified claims.
→ The dataset covers 81k Wikipedia pages with 681k claims for training factuality evaluators.

-----
💡 Key Insights:
→ Task-agnostic factuality evaluation is possible with a large-scale synthetic dataset
→ Wikipedia provides reliable source material for generating factual/unfactual pairs
→ Human validation confirms high quality of automated data generation (90%+ accuracy)

-----
📊 Results:
→ GPT-4 achieves 60% accuracy on end-to-end factuality evaluation
→ 68% accuracy with Retrieval Augmented Generation
→ Human validation shows 96.78% accuracy for claim extraction
→ Dataset creation pipeline maintains 89-98% accuracy across all steps
Stefan Bejgu retweeted
Valentino Maiorca (@ValeMaiorca):
✨ Meet #ResiDual, a novel perspective on the alignment of multimodal latent spaces! Think of it as a spectral "panning for gold" along the residual stream. It improves text-image alignment by simply amplifying task-related directions! 🌌🔍 arxiv.org/abs/2411.00246 [1/6]
Stefan Bejgu retweeted
UniReps (@unireps):
🔵🔴When do distinct learning processes learn similar representations? Detecting patterns and conditions for this to happen is an open direction: a thread🧵 Working on this topic? Submit at: openreview.net/group?id=NeurI… DEADLINE: 20 Sept See you at @NeurIPSConf! 🔵🔴 [1/N]
Stefan Bejgu retweeted
Alessandro Scirè (@alescire94):
Exciting strides in text summarization with LLMs 🚀 but verifying their factual accuracy is still an open challenge 🤔 We introduce FENICE, a factuality-oriented metric for summarization with a strong focus on interpretability 🔍 arxiv.org/abs/2403.02270 #NLProc #LLMs #Factuality
Stefan Bejgu retweeted
Valentino Maiorca (@ValeMaiorca):
📢 It looks like relative representations are here to stay! I'm beyond thrilled to announce that our work has been selected as one of the notable top 5% (oral) papers at #iclr23 ! 🥳 twitter.com/moschella_luca… [1/5]
Luca Moschella (@moschella_luca):

Welcome Relative Representations, enabling zero-shot communication between latent spaces without any training! arxiv.org/abs/2209.15430 It turns out that distinct neural networks learn intrinsically equivalent latent spaces [1/6]

Stefan Bejgu retweeted
Babelscape (@babelscape):
Empower your natural language applications with WordAtlas! #WordAtlas is the next-generation multilingual knowledge graph. What makes it special is its linkage between words and concepts in hundreds of languages. babelscape.com/wordatlas
Stefan Bejgu retweeted
Ksenia_TuringPost (@TheTuringPost):
Classy is a @PyTorch-based library for the fast prototyping and sharing of deep neural network models. It wraps the best libraries like PyTorch Lightning, Transformers, @streamlit and offers them to users with a simple CLI interface. Try it here: github.com/sunglasses-ai/…