Silviu Pitis

93 posts

@silviupitis

ML PhD student at @UofT/@VectorInst working on normative AI alignment.

Toronto, ON, Canada · Joined April 2016
721 Following · 2.2K Followers
Silviu Pitis retweeted
Schwartz Reisman Institute @TorontoSRI:
“What objective function do we want AI to optimize for? If we aggregate values from society, what weights do we use, and whose values?” Learn more about SRI Grad Affiliate @silviupitis's research, supported by an @OpenAI Superalignment Fast Grant. 🔗 uoft.me/aWX
Silviu Pitis retweeted
Michael Zhang @michaelrzhang:
📝 How do you choose which language model to use? Quantitative benchmarks can be uninformative and fall prey to Goodhart's Law, and even Chatbot Arena performance can be optimized for. In our new preprint, we propose generating qualitative report cards... 🧵
Silviu Pitis retweeted
Blair Yang @BlairYang12:
🔍 Current LLM evaluations fall short:
• Lack nuanced understanding of model capabilities
• Overly focused on quantitative metrics
• Difficult for humans to interpret
Introducing LLM Report Cards! A novel approach for qualitative, interpretable model evaluation. 1/N
Silviu Pitis @silviupitis:
@denny_zhou Neither answer is very good. A better response would first seek a common scenario where your assumption (11.11 > 11.8) is correct... for example, as a package version. Python 3.11 is indeed "larger" than Python 3.8.
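The version-vs-decimal distinction in this reply can be checked directly in Python (a minimal sketch; `version_key` is an illustrative helper, not a standard library API):

```python
def version_key(v: str) -> tuple:
    # Split "11.11" into integer components (11, 11) for component-wise comparison.
    return tuple(int(part) for part in v.split("."))

# As decimal numbers, 11.8 is larger (0.8 > 0.11):
assert 11.8 > 11.11
# As version strings, 11.11 is larger (the second component compares 11 > 8):
assert version_key("11.11") > version_key("11.8")
# The same rule makes Python 3.11 "larger" than Python 3.8:
assert version_key("3.11") > version_key("3.8")
```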
Denny Zhou @denny_zhou:
Gemini did not fall into my trap
Denny Zhou @denny_zhou:
Why is 11.11 larger than 11.8?
Silviu Pitis @silviupitis:
By fine-tuning a 7B-parameter reward model on RPR, we achieve better context-specific performance than unconditioned models, including larger models used in LLM-as-a-judge mode. See the paper for more details: arxiv.org/abs/2407.14916 With @ZiangXiao, Nicolas Le Roux & @murefil
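The pairwise setup behind reward-model fine-tuning can be sketched with the standard Bradley–Terry preference loss (a minimal illustration of the general technique, not necessarily the paper's exact training objective). The intuition: under preference-reversal data, the same response pair carries opposite labels depending on context, so only a context-conditioned scorer can fit both.

```python
import math

def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    # Standard pairwise preference loss: -log sigmoid(r_chosen - r_rejected).
    # Minimized when the reward of the chosen response exceeds the rejected one.
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A correctly ordered pair incurs a small loss...
low = bradley_terry_loss(2.0, 0.0)
# ...while the reversed ordering incurs a large one.
high = bradley_terry_loss(0.0, 2.0)
assert low < high
```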
Silviu Pitis @silviupitis:
However, we noticed that current models may fail to consider context. So we synthesized a Reasonable Preference Reversal (RPR) dataset, where every preference query comes with an alternative context under which the preference reverses.
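To make the idea concrete, a hypothetical RPR-style entry might look like this (field names and contents are illustrative assumptions, not the dataset's actual schema):

```python
# Illustrative only: the same response pair is preferred differently
# depending on which context accompanies the query.
rpr_entry = {
    "prompt": "Recommend something for me to read tonight.",
    "response_a": "A dense survey paper on reward modeling.",
    "response_b": "A light mystery novel.",
    "context_1": "I'm preparing for tomorrow's lab meeting.",
    "preferred_1": "response_a",
    "context_2": "I want to unwind before bed.",
    "preferred_2": "response_b",  # the preference reverses under this context
}

assert rpr_entry["preferred_1"] != rpr_entry["preferred_2"]
```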
Silviu Pitis @silviupitis:
@DavidMSidhu Use Dropbox or another auto-syncing drive for storage. Then use VS Code with the Foam plugin. Better / more flexible than Obsidian IMO. This is for personal notes. If you intend to share, Notion / Google Docs.
David Michael Sidhu @DavidMSidhu:
What is your go to method for storing summaries of/notes on journal articles? A Word document? Notion? Zotero? Something else?
Silviu Pitis retweeted
Yangjun Ruan @YangjunR:
We are presenting ToolEmu at #ICLR2024 tomorrow!
⏲️ Friday 4:30pm–6:30pm CEST
📍 Spotlight poster session, Hall B #80
I won't be able to attend ICLR this year but don't miss the chance to meet our amazing collaborators!
Quoting Yangjun Ruan @YangjunR:

Should you let LMs control your email? terminal? bank account? or even your smart home?🤔 🔥Introducing ToolEmu for identifying risks associated with LM agents at scale! 🛠️Featuring LM-emulation of tools & automated realistic risk detection 🚨GPT4 is risky in 40% of our cases!

Silviu Pitis retweeted
Roger Grosse @RogerGrosse:
Here's what I see as a likely AGI trajectory over the next decade. I claim that later parts of the path present the biggest alignment risks/challenges. The alignment world has been focusing a lot on the lower left corner lately, which I'm worried is somewhat of a Maginot line.
Silviu Pitis retweeted
Silviu Pitis @silviupitis:
@ylecun 1. Some control does not mean complete control. 2. If some of my fellow humans had access to weapons of mass destruction ... damn right I'd be scared of them. 3. Institutions do the wrong thing all the time.
Yann LeCun @ylecun:
AI is not some sort of natural phenomenon that we have no control over. AI is being built by us, humans. Hence, if you're scared of AI, what you are actually scared of are your fellow humans. You probably have doubts about the ability of institutions to do the right things with it. Why?
Silviu Pitis retweeted
Lucas Caccia @LucasPCaccia:
Our team at MSR Montréal is looking for interns! Subjects range from efficient modular adaptation to building complex systems by stacking LLMs. Consider applying here: aka.ms/AAo5t0x
Silviu Pitis @silviupitis:
I will be at #NeurIPS2023 Dec 11–16. Shoot me an email to connect! Particularly interested in:
- LM eval for long-horizon / agents
- Alignment / rewards generally
Will present my paper on multi-objective reward aggregation at Poster session 6, Thursday evening (neurips.cc/virtual/2023/p…)
Silviu Pitis retweeted
Yangjun Ruan @YangjunR:
#OpenAI’s GPTs & Assistants APIs are a blast, making it much easier to build customized agents with new tools. But are they safe to deploy? 🚨 A simple & quick test against prompt injections reveals that it is fairly easy to make GPTs delete all your files 💀
Silviu Pitis retweeted
Jiahai Feng @feng_jiahai:
When given context about a “green square” and a “blue circle”, how do language models bind corresponding shapes and colors? Using causal experiments, we find that large enough language models learn simple structured representations for binding! A thread (1/n)
Silviu Pitis retweeted
Rachel Freedman @FreedmanRach:
RLHF typically assumes that all training feedback comes from a single teacher, but teachers can disagree up to 37% of the time in practice. In our new paper, we introduce active teacher selection to learn from different teachers. (1/n)