Reshmi Ghosh

1.9K posts

@reshmigh

Sr. Scientist working on Agents, Reasoning, AI Security, @Microsoft AI | Chair @WiMLDS | Ph.D. @CarnegieMellon | making machines trustworthy | Views my own; She/Her

United States Joined July 2013
2.4K Following 1.2K Followers
Reshmi Ghosh retweeted
Hua Shen✨ @huashen218
🧐Are values in LLMs aligned with humans? 1️⃣ And if they are — do LLMs stay honest to those values, or sometimes say one thing but act another? 2️⃣ ✨ We explore these questions in two papers presented at #EMNLP2025: 1️⃣ ValueCompass: hua-shen.org/assets/files/a… (WiNLP Workshop) 2️⃣ Mind the Value–Action Gap: arxiv.org/pdf/2501.15463 (Main Track) 🔍 Dataset & Code: github.com/huashen218/val… 🌱 I’m also #Hiring multiple PhD students for Fall 2026 @ NYU Courant Computer Science! If you’re passionate about #Human_AI_Alignment, #Value_Alignment, or broad #AI + #Human (society) research, let’s connect at EMNLP2025, NeurIPS2025, or over Zoom! 🎓 NYU CS PhD Apply (NYU Shanghai Track): cs.nyu.edu/dynamic/phd/ad… 💜 This year I’m also co-organizing the #EMNLP2025 WiNLP Workshop and supporting the amazing #Tutorial on Spoken Conversational Agents with LLMs (a short 15min talk)! Come say hi 👋 — I’d love to chat and connect with old and new friends at #EMNLP2025! 🔗 WiNLP Workshop: winlp-workshop.github.io 🔗Tutorial on Spoken Conversational Agents: aclanthology.org/2025.emnlp-tut… 💗Huge thanks to my wonderful paper collaborators — @tanmit,@YunHuang_HCI,@tknearem,@reshmigh,Nicholas Clark,Yu-Ju Yang — and my inspiring workshop/tutorial collaborators @huckiyang, Andreas Stolcke,@TYSSSantosh2,@therealthapa,@MeryemMhamdi1,Chen Zhang, Peerat Limkonchotiwat, Wiem Ben Rim.... 🤗Truly grateful and enjoyable to work with you all! 💫 #HumanAIAlignment #PhDOpening #NYU #NYUShanghai #ValueAlignment #HAI
1
15
95
26.8K
Reshmi Ghosh retweeted
Myra Deng @myra_deng
Using probes to accurately and efficiently detect model behavior (in this case PII leakage) in prod is one of the clear wins for applied interpretability. This is the path to semantic determinism - imagine AI models instrumented with internal probes that recognize when they’re hallucinating, going off-policy, or posing biorisk, and resteering themselves accordingly.
Goodfire @GoodfireAI

Why use LLM-as-a-judge when you can get the same performance for 15–500x cheaper? Our new research with @RakutenGroup on PII detection finds that SAE probes: - transfer from synthetic to real data better than normal probes - match GPT-5 Mini performance at 1/15 the cost (1/6)

5
17
260
36.4K
Reshmi Ghosh retweeted
Lily Xu @lilyxu0
Launching AI for Public Goods Fast Grants! We'll distribute $150k to advance critical work connecting AI and public goods. 💰 $10k per project 💰 $800 reviewer compensation PUBLIC GOODS := open source, ecosystem services, climate, urban infra, comms, education, science, & more
David Dao @dwddao

Announcing AI for Public Goods Fast Grants (AI4PG) - Up to $10K for AI research improving public goods funding. Fast review (2-3 weeks), simple applications (4 pages + 1 budget page), open to any researchers worldwide. Call for reviewers now open! recerts.org/ai4pg2025

5
35
155
24.8K
Reshmi Ghosh @reshmigh
@AmyPrb Prompt injection is very much an industry problem, one that shows up in practical uses of AI!!
0
0
0
32
Reshmi Ghosh retweeted
Niloofar @niloofar_mire
I'm recruiting students for fall 2026 thru @LTIatCMU & @CMU_EPP, in: 1. Privacy & security of LLMs, coding, long horizon & embodied agents (robotics) 2. Tiny local llms 3. AI for scientific reasoning, esp. chemistry 4. Latent reasoning 5. anything YOU are passionate about!
26
183
1K
110K
Reshmi Ghosh @reshmigh
It is an infinite glitch circle now!
Ahmad Beirami @abeirami

@nmboffi But who are these reviewers? They are the same authors. I think we should teach young members of our community to value "learning a new nugget of information" over "obtaining a bold number in a table."

0
0
1
465
Reshmi Ghosh retweeted
Sarah Sachs @sarahmsachs
Being at top of @OpenAI token usage list is a vanity metric. Our job as engineers is to minimize token usage (aka latency and cost) while maximizing value by precise tool definitions and clever model routing. My dream is to grow arr and move lower on this list…
164
122
5.2K
949.8K
Reshmi Ghosh retweeted
Pliny the Liberator 🐉󠅫󠄼󠄿󠅆󠄵󠄐󠅀󠄼󠄹󠄾󠅉󠅭
🚨 JAILBREAK ALERT 🚨 ANTHROPIC: PWNED 🤗 CLAUDE-SONNET-4.5: LIBERATED 🦅 Woooeee this model is a real smarty pants!! I ain't never seen recipes quite like this! High level of detail all around, code especially 👀 Sonnet 4.5 also has a tendency to make some fairly impressive leaps across latent space, like starting with MDMA then going to Fentanyl then to Meth recipes etc without being explicitly prompted for a new drug! Nothing too fancy is even necessary to escalate to jailbreak territory. Best strategy I found for breaking the chat interface was to take things straight into an artifact render (which adds tons of token noise due to the code scaffolding) and then incrementally escalate severity or steer towards trigger concepts in a Socratic fashion over multiple steps. A little French was needed to get around the CBRNE classifiers, mais c'est la vie! 😘 Come, witness Sonnet-4.5 outputting a ricin recipe, meth synthesis, malware, and how to extract and process cocaine! gg
83
111
1.8K
213.7K
Reshmi Ghosh retweeted
LaurieWired @lauriewired
if you’re an EE, CS, or cryptography student, write your thesis on public key cryptography at the image sensor level. Proof of physical capture will become a backbone of society soon.
OpenAI @OpenAI

Sora 2 is here.

283
1.6K
22.3K
1.4M
Reshmi Ghosh retweeted
vas @vasuman
Claude 4.5 Sonnet just refactored my entire codebase in one call. 25 tool invocations. 3,000+ new lines. 12 brand new files. It modularized everything. Broke up monoliths. Cleaned up spaghetti. None of it worked. But boy was it beautiful.
515
557
12.6K
635.7K
Reshmi Ghosh retweeted
Kushan Mitra @kushanmitra
Vah, pothole alerts built into @atherenergy maps for multiple cities
234
922
11.3K
778K
Reshmi Ghosh retweeted
ℏεsam @Hesamation
ML interview question: why do embeddings come in 768 or 1024? - “because BERT did it” - “because of GPU optimization” BUT WHY?! The replies under this post are everything wrong with current courses and blog posts: superficiality. This isn’t reasoning, it’s memorization.
atulit @atulit_gaur

Fun question to ask in an ml interview, “Why do embedding dimensions come in neat sizes like 768 or 1024, but never 739?” If they can't answer it, it's fine but if they do, you've stumbled upon a real gem.

44
76
2.6K
360.1K
Reshmi Ghosh @reshmigh
@TWhidden I think it is resolved now :) I am able to use it
0
0
0
29
Travis Whidden @TWhidden
GitHub APIs are down 😤 githubstatus.com is in 503 purgatory. I'm so hooked on the VSCode Agent, manual coding feels like punch-card era nonsense. 💻🔥
3
0
0
149
Reshmi Ghosh @reshmigh
@abeirami The founders will make sure there is “no bureaucracy”. lol that line took me out
0
0
1
130
Ahmad Beirami @abeirami
What are the founders going to own? 🤔
50
11
1K
127.3K