Seungwon Lim

2 posts

@sngwonlim

Joined July 2024
14 Following · 5 Followers
Seungwon Lim retweeted
Wooseok Seo @just1nseo
🚀 New Paper! arxiv.org/abs/2506.13342
While fact verification is essential to ensure the reliability of LLMs, detailed analysis of fact verifiers remains understudied. We present several findings based on our revised dataset, along with practical guidance to improve the models.
1 reply · 26 reposts · 103 likes · 13K views
Seungwon Lim @sngwonlim
@TheHeroShep Interesting results! Which dataset or benchmark did you use for this visualization?
0 replies · 0 reposts · 0 likes · 22 views
alex duffy @alxai_
Been playing with GPT4.5 for a few days. Some people might be disappointed today... but I can't wait for it to get into voice.
Ran it through the Dark Triad & OCEAN personality tests. 4.5 was more extroverted, open, agreeable, conscientious, & less neurotic, which tracked with my experience.
Interested to hear your thoughts. Ours 👇
Dan Shipper 📧 @danshipper

GPT-4.5 is out! We've been testing it for a few days @every and, honestly... it’s not going to blow your mind, but it might befriend you. It's more like a personality, communication, and creativity upgrade than a huge intelligence leap. It's like OpenAI is pivoting its base model from "bland assistant" to "AI bestie."
What it does do well:
- OpenAI says it scores 64% on SimpleQA (double GPT-4's score)
- Much better writing with cleaner, better structured, more human-like prose
- Genuinely warmer and more emotionally intelligent (gave me some good advice!)
- Less robotic, more opinionated responses
We ran personality tests and 4.5 is more extroverted, agreeable, and less neurotic than 4o.
The trade-off? It's sometimes worse at following instructions because it's less sycophantic and more creative. It also hallucinated in our testing.
I didn't love it day one, but after more use it’s growing on me.

1 reply · 7 reposts · 24 likes · 6.7K views