Stephen Bach

1.8K posts

@stevebach

Asst. prof. @BrownCSDept. Working on improving how humans teach computers. Weak supervision, zero-shot learning, few-shot learning, and high-level knowledge.

Joined August 2007
502 Following, 1.6K Followers
Stephen Bach retweeted
Cristina Menghini (@CriMenghini)
Last week we launched Muse Spark at an acceptable risk level under our Advanced AI scaling framework, after multiple mitigation iterations. Today we’re releasing its first Safety & Preparedness Report documenting that decision. This was a long, cross-team effort — from catastrophic risk assessment to day-to-day model behavior. We hope this contributes to transparent discussion of responsible development of personal superintelligence. Running the evals, it was fascinating to watch the model’s safety profile take shape. Under the new framework, we’re also introducing our first assessment of loss of control risks — built on extensive threat modeling that’s still evolving. The report’s dense and there’s a lot of work ahead. You can find the full report here: ai.meta.com/static-resourc…— we’re eager to hear feedback and improve.
Summer Yue (@summeryue0)

🚀 Muse Spark Safety & Preparedness Report for Meta AI is out. We start with our pre-deployment assessment under Meta's Advanced AI Scaling Framework, covering chemical and biological, cybersecurity, and loss of control risks. Our assessment flagged potentially elevated chem/bio risk, so we implemented safeguards and validated mitigations before deployment - bringing residual risk to within acceptable levels. Beyond the Framework, we also share findings and early explorations of model behavior (honesty, intent understanding, etc.), jailbreak robustness, eval awareness, and more. We're sharing this report to give a closer look at how we evaluate advanced AI safety. Always more work to do, and we welcome feedback from the community. ai.meta.com/static-resourc…

Stephen Bach retweeted
Deb Raji (@rajiinio)
I thought "AI for Science" was something like AlphaFold, ie. using AI to creatively address computational bottlenecks for well articulated scientific problems. Now I'm seeing more of "AI slop cosplaying as research paper", where the problems are fake, methods unverified, etc.
Stephen Bach retweeted
Tiancheng Hu @ ICLR 2026 (@tiancheng_hu)
1/7 🧵 The GPT-4 technical report featured detailed calibration curves. Since then, not a single major model release has reported calibration. The field quietly stopped measuring whether models know what they don't know. Our new position paper argues this is a mistake. Here's why.
Stephen Bach retweeted
Nihal Nayak (@nihalcanrun)
Targeted instruction tuning for LLMs involves selecting a subset of instructions from a candidate pool using a small query set from target tasks. Despite growing interest, we still lack guidance on what to select. Our new preprint brings clarity to this space (thread 👇).
Stephen Bach retweeted
Alex Ratner (@ajratner)
Simple (proposed!) rule for terminology around synthetic data: If a "synthetic generation" method uses model A to generate data that leads to gains on model B, where A >> B - this is distillation, not synthetic generation :) The true technical challenge of synthetic data is to use model A, plus some cleverness around system architecture and/or human-in-the-loop input (e.g. context eng, review/filtering, editing), to produce data that improves model B where B >= A.
Stephen Bach retweeted
Yisong Yue (@yisongyue)
I am saddened by the loss of Joe Halpern. I still remember taking his Reasoning About Uncertainty class during my first year as a PhD student at @Cornell. Joe leaves behind a tremendous legacy, not only in his research, but the lives of so many students he touched along the way. bangsfuneralhome.com/obituaries/jos…
Stephen Bach retweeted
Alex Ratner (@ajratner)
This week we launched the Open Benchmarks Grant with a $3M initial commitment from @SnorkelAI + partner support from @huggingface @togethercompute @PrimeIntellect @PyTorch @harborframework & others, in order to close the evaluation gap in AI. Our ability to measure AI has been outpaced by our ability to develop it - and open benchmarks are one of several critical, complementary tools to fix this.

We're particularly interested in novel benchmarks that push and probe the frontier along three key vectors:
(1) Environment complexity --> E.g. complex, domain-specific context and tool/action spaces, human interaction, world modeling
(2) Autonomy horizon --> E.g. long horizon, non-stationary goals
(3) Output complexity --> E.g. complex outputs with nuanced, rubric-based evaluation / reward signals

Check out more detail + link to apply here! benchmarks.snorkel.ai
Stephen Bach retweeted
Omar Khattab (@lateinteraction)
PSA: If you're not currently following @jacobli99 and staying tuned, you really really should this week.
Stephen Bach retweeted
Dylan Sam (@dylanjsam)
I'm at NeurIPS this week! Excited to meet old/new friends and chat with people about training safer language models. I'm presenting a few works on safety pretraining, measuring diversity in data curation, and monitoring model behaviors --- more info below 👇
Stephen Bach retweeted
Dyah Adila 🦄 (@dyahadila_)
⭐ New blog post! Most people think activation steering ≈ a cheap version of finetuning. But why does it sometimes work, and sometimes fall flat? We dug into this and found a surprisingly clear answer. Full breakdown here 👇 sprocketlab.github.io/posts/2025/11/…
Stephen Bach retweeted
Yeganeh Kordi (@yeganekordi)
How well do language models generalize to problems that are harder, or even easier, than the ones they’ve trained on? We show that LLMs don’t generalize across difficulty levels quite as much as you might think. 🧵
Stephen Bach retweeted
Tal Linzen (@tallinzen)
I too am recruiting PhD students this year! things I think about: cognitively plausible LLMs, interpretability, evaluating and improving multi-turn interaction, LLMs for cognitive science and neuroscience, psycholinguistics... the deadline for Data Science is Dec 6 and for Linguistics Dec 18.
Stephen Bach retweeted
Brown Research (@BrownUResearch)
ARIA, a Brown-based research consortium supported by a $20 million grant from the National Science Foundation, welcomed scientists from across the U.S. to kick off its five-year program with a launch event in Providence. @BrownUniversity brown.edu/news/2025-11-2…