Pratiksha Thaker
@prthaker_
25 posts
Research @databricks, recently postdoc @mldcmu. tweets are my own
Joined October 2024
32 Following · 44 Followers
Pratiksha Thaker retweeted
kanyes @KanyesThaker
Launched Hyper on ProductHunt because we hate taking notes and we hate scheduling meetings. The greatest work happens huddled around a whiteboard or standing in the kitchen. Check it out! producthunt.com/products/hyper…
Pratiksha Thaker retweeted
Jonathan Frankle @jefrankle
Meet KARL, an RL'd model for document-centric tasks at frontier quality and open source cost/speed. Great for @databricks customers and scientists (77-page tech report!) As usual, this isn't just one model - it's an RL assembly line to churn out models for us and our customers 🧵
Pratiksha Thaker @prthaker_
I'm so excited that this work is available after a year of carefully curating open problems with our collaborators. It was inspired by real issues we faced applying research techniques to problems in child safety, and we hope this work can help amplify those lessons.
Neil Kale @neilkale

[1/n] Open Problems in AI Child Safety AI is misused to generate CSAM at alarming scale. 400% increase in AI-generated CSAM since 2024 (IWF). 1 in 17 teens are victimized by deepfake nudes. We outline 15 open problems where AI safety research can help. 🔗aichildsafety.github.io

Pratiksha Thaker retweeted
ML@CMU @mlcmublog
We asked LLMs: Is Santa real? 🎅 GPT-4o says Yes at any age. Claude tells 5-year-olds the truth. What does this reveal about invisible assumptions in AI? Do LLMs believe in the tooth fairy or the Illuminati? New holiday post here: blog.ml.cmu.edu/2025/12/23/is-…
Pratiksha Thaker retweeted
Jonathan Frankle @jefrankle
I'm hiring interns for next summer at @databricks! Specifically on (1) empirical RL at scale on non-verifiable tasks and (2) enabling real people to specify the behaviors they want out of AI (e.g., through evals) on highly complex tasks. 🧵
Pratiksha Thaker retweeted
Roy Rinberg @RoyRinberg
Prospective PhD students interested in privacy research - here's a google sheet with professors you may be interested in applying to! Feel encouraged to suggest edits, and share openly! (link in thread because twitter doesn't like links in tweets🤷)
Pratiksha Thaker retweeted
Steven Kolawole @_stevenkolawole
🧵THREAD Can we automatically identify parallelizable structure within LLM queries for massive efficiency gains? 10% of real prompts are parallelizable -> ~5x speedups w/ >90% quality preserved. For ChatGPT's 1B+ queries: 100M+ optimization opportunities. Full suite below🧵
Pratiksha Thaker @prthaker_
I'm very excited about this work and thinking about more realistic data access models for MIAs. It wouldn't have been possible without @neilkale @gingsmith @zstevenwu and our amazing collaborators at @thorn!
Pratiksha Thaker @prthaker_
🕵️‍♀️These results have important implications for domains like child safety, where it's critical to detect harmful content in training data, but auditors legally can't access this content to train attack models. There's still more work to be done, but this is a key first step.
Pratiksha Thaker @prthaker_
I'm very excited to share some new work arxiv.org/abs/2506.06488. This work started out in conversations with @thorn where we realized that shadow model MIAs couldn't be used to audit models for harmful content of children. See 🧵 for why, and our progress on solving this...
Pratiksha Thaker @prthaker_
(And many thanks to @neilkale for helping me draft my first thread 😄)
Pratiksha Thaker @prthaker_
🚨 Are you using empirical benchmarks to evaluate your LLM unlearning method? Our new paper arxiv.org/pdf/2410.02879 investigates how success on these benchmarks can be misleading. A🧵: 1/n