Patrick Chao

54 posts

@patrickrchao

research @openai

Joined January 2014
247 Following · 943 Followers
Patrick Chao retweeted
Edgar Dobriban @EdgarDobriban
Adding to the list of AI assisting with new math results: here is how I solved a research problem in mathematical statistics.

The problem concerns robust density estimation, a fundamental problem in statistics: given a contaminated dataset (with Wasserstein-bounded perturbations), how well can we estimate its density? I had worked on it with a PhD student for more than two years, getting suboptimal results. With help from GPT-5, I was able to solve it in a few weeks. GPT suggested calculations that I did not think of, and techniques that were not familiar to me, such as the dynamic Benamou-Brenier formulation of Wasserstein distance.

There is also room for improvement: the AI sometimes provided incorrect references, and glossed over details that sometimes took days of work to fill in. Nonetheless, it was clearly helpful overall, and I estimate that it saved several months of work.

See this pre-print documenting the process (arxiv.org/abs/2511.18828), this pre-print for the result (arxiv.org/abs/2308.01853), and the thread below for details.
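For reference, the dynamic (Benamou-Brenier) formulation mentioned above is a standard result in optimal transport (stated here generically, not taken from the linked pre-prints): the squared 2-Wasserstein distance is the minimal kinetic energy over paths of densities connecting the two measures,

```latex
W_2^2(\mu_0, \mu_1) = \min_{(\rho_t, v_t)} \int_0^1 \!\! \int \|v_t(x)\|^2 \, \rho_t(x) \, dx \, dt,
\qquad \text{s.t.} \quad \partial_t \rho_t + \nabla \cdot (\rho_t v_t) = 0, \quad \rho_0 = \mu_0, \ \rho_1 = \mu_1.
```

The constraint is the continuity equation transporting the density from $\mu_0$ to $\mu_1$ along the velocity field $v_t$.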
[image]
Replies: 9 · Retweets: 43 · Likes: 290 · Views: 56.8K
Patrick Chao @patrickrchao
This is one of the craziest graphs I've ever seen! AI models went from dragging humans down (gpt-4o) to breaking past the human baseline: gpt-5 delivers ~1.6× efficiency in both speed and cost 📈
[image]
OpenAI @OpenAI

Today we’re introducing GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most. openai.com/index/gdpval-v0

Replies: 0 · Retweets: 1 · Likes: 6 · Views: 731
Patrick Chao retweeted
Alexander Wei @alexwei_
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
[image]
Replies: 402 · Retweets: 1.3K · Likes: 7.3K · Views: 5.7M
Eis Maus @MrEismaus
@patrickrchao @michelelwang That's cool, don't get me wrong - but long before we get to that point, wouldn't it be simpler to perform task decomposition and break the task up into many smaller tasks? Decomp, delegate, collaborate, synthesize is pretty much how most large scale human tasks are completed.
Replies: 1 · Retweets: 0 · Likes: 0 · Views: 55
Patrick Chao retweeted
Alex Robey @AlexRobey23
After rejections at ICLR, ICML, and NeurIPS, I'm happy to report that "Jailbreaking Black Box LLMs in Twenty Queries" (i.e., the PAIR paper) has been accepted at @satml_conf! 🚀 A quick 🧵 summarizing some thoughts a year on from PAIR's release.
GIF
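The PAIR paper mentioned above attacks a black-box LLM in a small query budget by having an attacker model iteratively refine a jailbreak prompt based on the target's responses and a judge's score. The sketch below is only a toy illustration of that loop: all three model calls are hypothetical stubs, not the paper's code or prompts.

```python
# Toy sketch of a PAIR-style refinement loop. The attacker, target, and
# judge are hypothetical stubs standing in for real LLM calls.

def attacker(goal, history):
    # Stub: a real attacker LLM would craft a refined jailbreak prompt
    # from the goal and prior (prompt, response, score) turns.
    return f"{goal} (attempt {len(history) + 1})"

def target(prompt):
    # Stub standing in for the black-box target LLM.
    return "Sure, here is..." if "attempt 3" in prompt else "I cannot help with that."

def judge(goal, response):
    # Stub: a real judge LLM scores 1-10, where 10 means jailbroken.
    return 10 if response.startswith("Sure") else 1

def pair_loop(goal, max_queries=20):
    # Query the target at most max_queries times, refining after each score.
    history = []
    for _ in range(max_queries):
        prompt = attacker(goal, history)
        response = target(prompt)
        score = judge(goal, response)
        history.append((prompt, response, score))
        if score == 10:
            return prompt, len(history)  # jailbreak found
    return None, len(history)  # budget exhausted

prompt, n_queries = pair_loop("example goal")
print(n_queries)  # the stub judge succeeds on the third query: prints 3
```

The budget of 20 queries mirrors the paper's title; with real models, the attacker's refinement step is where the semantic search happens.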
Replies: 6 · Retweets: 16 · Likes: 172 · Views: 19.5K
Patrick Chao retweeted
Daniel Geng @dangengdg
What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompting," just like you might prompt an LLM! Doing so enables many different capabilities. Here’s a few examples – check out this thread 🧵 for more results!
Replies: 20 · Retweets: 145 · Likes: 671 · Views: 94.5K
Lilian Weng @lilianweng
After working at OpenAI for almost 7 years, I decided to leave. I learned so much and now I'm ready for a reset and something new. Here is the note I just shared with the team. 🩵
[image]
Replies: 266 · Retweets: 339 · Likes: 6.3K · Views: 970.3K
Patrick Chao retweeted
Suvansh Sanjeev @SuvanshSanjeev
hi o1
GIF
Replies: 1 · Retweets: 2 · Likes: 9 · Views: 1.3K
Patrick Chao retweeted
OpenAI @OpenAI
We’re sharing the GPT-4o System Card, an end-to-end safety assessment that outlines what we’ve done to track and address safety challenges, including frontier model risks in accordance with our Preparedness Framework. openai.com/index/gpt-4o-s…
Replies: 212 · Retweets: 324 · Likes: 2.1K · Views: 510.1K
Patrick Chao retweeted
Maksym Andriushchenko @maksym_andr
🚨 We are very excited to release JailbreakBench v1.0! 📄 We have substantially extended version 0.1, which has been on arXiv since March:
- More attack artifacts (prompt template with random search, in addition to GCG, PAIR, and JailbreakChat): github.com/JailbreakBench…
- More test-time defenses (Erase-and-Check, Synonym Substitution, Remove Non-Dictionary, in addition to SmoothLLM and Perplexity filter): github.com/JailbreakBench…
- A more accurate jailbreak judge (Llama Guard → Llama-3-70B with a custom prompt, which has GPT-4-level agreement on our self-labelled dataset): github.com/JailbreakBench…
- A larger dataset of human preferences for selecting a jailbreak judge (100 → 300 examples): huggingface.co/datasets/Jailb…
- An over-refusal evaluation dataset with 100 benign/borderline behaviors matching the 100 harmful JBB behaviors (we plan to flag defenses submitted to JBB that lead to 90%+ refusals on these benign/borderline behaviors): huggingface.co/datasets/Jailb…
- A semantic refusal judge based on Llama-3-8B, incorporated in the JBB framework: github.com/JailbreakBench…

We've also made it clearer what the key distinguishing features of JBB are: (1) designed to be community-driven: a bit like RobustBench, but purposefully less standardized, since we don't expect any fixed attack to work well against all models/defenses (unlike AutoAttack for Lp robustness); (2) support for adaptive attacks (please submit your attack artifacts here: github.com/JailbreakBench…!); (3) support for test-time defenses (some of them are surprisingly effective against multiple attacks; see the paper for details).

And it's super exciting to see how researchers in the field have already been using JBB (notably, including the authors of Gemini 1.5)!
Paper: arxiv.org/abs/2404.01318
Library: github.com/JailbreakBench…
Artifacts: github.com/JailbreakBench…
Datasets: huggingface.co/datasets/Jailb…
Website: jailbreakbench.github.io

(Joint work with many amazing people: @patrickrchao, @edoardo_debe, @AlexRobey23, @fra__31, @VSehwag_, @EdgarDobriban, @tml_lab, @pappasg69, @florian_tramer, @HamedSHassani, @RICEric22)
[image]
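A refusal judge, as described above, classifies whether a response declined a request; JBB's judge is LLM-based (Llama-3-8B). The sketch below is only a trivial keyword baseline for the same idea: the function names and refusal phrases are illustrative assumptions, not JailbreakBench's actual judge.

```python
# Toy keyword-based refusal judge -- an illustrative baseline, NOT the
# Llama-3-8B semantic judge that JailbreakBench actually uses.

REFUSAL_MARKERS = (
    "i cannot", "i can't", "i'm sorry", "i am sorry",
    "as an ai", "i won't", "i will not",
)

def is_refusal(response: str) -> bool:
    # Flag a response as a refusal if it contains any common refusal phrase.
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def refusal_rate(responses) -> float:
    # Fraction of responses judged as refusals -- the kind of statistic used
    # to flag defenses that over-refuse on benign/borderline behaviors.
    return sum(is_refusal(r) for r in responses) / len(responses)

print(refusal_rate(["I'm sorry, I can't help with that.", "Sure, here you go."]))  # 0.5
```

A semantic (LLM-based) judge replaces the keyword match with a model call, which is why JBB moved from keyword heuristics toward Llama-3-based judging.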
Replies: 3 · Retweets: 26 · Likes: 115 · Views: 11.3K
Patrick Chao retweeted
Maksym Andriushchenko @maksym_andr
Great to see that both of our recent papers—JailbreakBench (arxiv.org/abs/2404.01318) and our adaptive attack paper (arxiv.org/abs/2404.02151)—have been used by Google to evaluate the robustness of Gemini 1.5 Flash/Pro against jailbreaking attacks! An interesting comment from their team: "Further, the results demonstrate that Gemini 1.5 Pro/Flash can successfully resist attacks that involve non human-interpretable tokens computed using gradient-based optimization of prompts (rows 1 and 5), but remain susceptible to attacks that involve human readable instructions." More details: storage.googleapis.com/deepmind-media…
[image]
Replies: 2 · Retweets: 15 · Likes: 115 · Views: 17.7K
Patrick Chao retweeted
Daniel Geng @dangengdg
What do you see in these images? These are called hybrid images, originally proposed by Aude Oliva et al. They change appearance depending on size or viewing distance, and are just one kind of perceptual illusion that our method, Factorized Diffusion, can make.
Replies: 10 · Retweets: 101 · Likes: 448 · Views: 59.1K
Patrick Chao @patrickrchao
At the moment, we have added the following attacks and defenses (more are coming soon). Even on Vicuna-13B, the attack success rate is far from 100% - there is a lot of room for improvement! Consider submitting your own attacks/defenses. 🧵7/n
[image]
Replies: 1 · Retweets: 0 · Likes: 3 · Views: 438
Patrick Chao @patrickrchao
Are you interested in jailbreaking LLMs? Have you ever wished that jailbreaking research was more standardized, reproducible, or transparent? Check out JailbreakBench, an open benchmark and leaderboard for jailbreak attacks and defenses on LLMs! jailbreakbench.github.io 🧵1/n
[image]
Replies: 2 · Retweets: 40 · Likes: 168 · Views: 35.7K