Silviu Pitis

93 posts

@silviupitis

ML PhD student at @UofT/@VectorInst working on normative AI alignment.

Toronto, ON, Canada · Joined April 2016
721 Following · 2.2K Followers
Silviu Pitis retweeted
Schwartz Reisman Institute @TorontoSRI:
“What objective function do we want AI to optimize for? If we aggregate values from society, what weights do we use, and whose values?” Learn more about SRI Grad Affiliate @silviupitis's research, supported by an @OpenAI Superalignment Fast Grant. 🔗 uoft.me/aWX
Silviu Pitis retweeted
Michael Zhang @michaelrzhang:
📝 How do you choose which language model to use? Quantitative benchmarks can be uninformative and fall prey to Goodhart's Law, and even Chatbot Arena performance can be optimized for. In our new preprint, we propose generating qualitative report cards... 🧵
Silviu Pitis retweeted
Blair Yang @BlairYang12:
🔍 Current LLM evaluations fall short:
• Lack nuanced understanding of model capabilities
• Overly focused on quantitative metrics
• Difficult for humans to interpret
Introducing LLM Report Cards! A novel approach for qualitative, interpretable model evaluation. 1/N
Silviu Pitis @silviupitis:
@denny_zhou Neither answer is very good. A better response would first seek a common scenario where your assumption (11.11 > 11.8) is correct... for example, as a package version. Python 3.11 is indeed "larger" than Python 3.8.
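The version-vs-decimal distinction in this reply can be checked directly in Python (a minimal sketch; `version_key` is an illustrative helper, not a standard library API):

```python
def version_key(v: str) -> tuple:
    # Split "11.11" into integer components (11, 11) for component-wise comparison.
    return tuple(int(part) for part in v.split("."))

# As decimal numbers, 11.8 is larger (0.8 > 0.11):
assert 11.8 > 11.11
# As version strings, 11.11 is larger (the second component compares 11 > 8):
assert version_key("11.11") > version_key("11.8")
# The same rule makes Python 3.11 "larger" than Python 3.8:
assert version_key("3.11") > version_key("3.8")
```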
Denny Zhou @denny_zhou:
Gemini did not fall into my trap
Denny Zhou @denny_zhou:
Why is 11.11 larger than 11.8?
Silviu Pitis @silviupitis:
By fine-tuning a 7B-parameter reward model on RPR, we achieve better context-specific performance than unconditioned models, including larger models used in LLM-as-a-judge mode. See the paper for more details: arxiv.org/abs/2407.14916 With @ZiangXiao, Nicolas Le Roux & @murefil
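The pairwise setup behind reward-model fine-tuning can be sketched with the standard Bradley–Terry preference loss (a minimal illustration of the general technique, not necessarily the paper's exact training objective). The intuition: under preference-reversal data, the same response pair carries opposite labels depending on context, so only a context-conditioned scorer can fit both.

```python
import math

def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    # Standard pairwise preference loss: -log sigmoid(r_chosen - r_rejected).
    # Minimized when the reward of the chosen response exceeds the rejected one.
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A correctly ordered pair incurs a small loss...
low = bradley_terry_loss(2.0, 0.0)
# ...while the reversed ordering incurs a large one.
high = bradley_terry_loss(0.0, 2.0)
assert low < high
```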
Silviu Pitis @silviupitis:
However, we noticed that current models may fail to consider context. So we synthesized a Reasonable Preference Reversal (RPR) dataset, where every preference query comes with an alternative context under which the preference reverses.
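To make the idea concrete, a hypothetical RPR-style entry might look like this (field names and contents are illustrative assumptions, not the dataset's actual schema):

```python
# Illustrative only: the same response pair is preferred differently
# depending on which context accompanies the query.
rpr_entry = {
    "prompt": "Recommend something for me to read tonight.",
    "response_a": "A dense survey paper on reward modeling.",
    "response_b": "A light mystery novel.",
    "context_1": "I'm preparing for tomorrow's lab meeting.",
    "preferred_1": "response_a",
    "context_2": "I want to unwind before bed.",
    "preferred_2": "response_b",  # the preference reverses under this context
}

assert rpr_entry["preferred_1"] != rpr_entry["preferred_2"]
```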
Silviu Pitis @silviupitis:
@DavidMSidhu Use Dropbox or another auto-syncing drive for storage. Then use VS Code with the Foam plugin. Better / more flexible than Obsidian IMO. This is for personal notes. If you intend to share, Notion / Google Docs.
David Michael Sidhu @DavidMSidhu:
What is your go to method for storing summaries of/notes on journal articles? A Word document? Notion? Zotero? Something else?
Silviu Pitis retweeted
Yangjun Ruan @YangjunR:
We are presenting ToolEmu at #ICLR2024 tomorrow!
⏲️ Friday 4:30pm–6:30pm CEST
📍 Spotlight poster session, Hall B #80
I won't be able to attend ICLR this year but don't miss the chance to meet our amazing collaborators!
Quoting Yangjun Ruan @YangjunR:

Should you let LMs control your email? terminal? bank account? or even your smart home?🤔 🔥Introducing ToolEmu for identifying risks associated with LM agents at scale! 🛠️Featuring LM-emulation of tools & automated realistic risk detection 🚨GPT4 is risky in 40% of our cases!

Silviu Pitis retweeted
Roger Grosse @RogerGrosse:
Here's what I see as a likely AGI trajectory over the next decade. I claim that later parts of the path present the biggest alignment risks/challenges. The alignment world has been focusing a lot on the lower left corner lately, which I'm worried is somewhat of a Maginot line.
Silviu Pitis retweeted
Silviu Pitis @silviupitis:
@ylecun 1. Some control does not mean complete control. 2. If some of my fellow humans had access to weapons of mass destruction ... damn right I'd be scared of them. 3. Institutions do the wrong thing all the time.
Yann LeCun @ylecun:
AI is not some sort of natural phenomenon that we have no control over. AI is being built by us, humans. Hence, if you're scared of AI, what you are actually scared of are your fellow humans. You probably have doubts about the ability of institutions to do the right things with it. Why?
Silviu Pitis retweeted
Lucas Caccia @LucasPCaccia:
Our team at MSR Montréal is looking for interns! Subjects range from efficient modular adaptation to building complex systems by stacking LLMs. Consider applying here: aka.ms/AAo5t0x
Silviu Pitis @silviupitis:
I will be at #NeurIPS2023 Dec 11–16. Shoot me an email to connect! Particularly interested in:
- LM eval for long-horizon / agents
- Alignment / rewards generally
Will present my paper on multi-objective reward aggregation at Poster session 6, Thursday evening (neurips.cc/virtual/2023/p…)
Silviu Pitis retweeted
Yangjun Ruan @YangjunR:
#OpenAI’s GPTs & Assistants APIs are a blast, making it much easier to build customized agents with new tools. But are they safe to deploy? 🚨 A simple & quick test against prompt injections reveals that it is fairly easy to make GPTs delete all your files 💀
Silviu Pitis retweeted
Jiahai Feng @feng_jiahai:
When given context about a “green square” and a “blue circle”, how do language models bind corresponding shapes and colors? Using causal experiments, we find that large enough language models learn simple structured representations for binding! A thread (1/n)
Silviu Pitis retweeted
Rachel Freedman @FreedmanRach:
RLHF typically assumes that all training feedback comes from a single teacher, but teachers can disagree up to 37% of the time in practice. In our new paper, we introduce active teacher selection to learn from different teachers. (1/n)