Ji Won Park

492 posts

@jiwoncpark

Principal ML Scientist at @Genentech @PrescientDesign. Bayesian optimization, Bayesian inference.

San Francisco Bay Area · Joined January 2014
1.2K Following · 1.1K Followers

Pinned Tweet
Ji Won Park @jiwoncpark ·
I'm hiring a postdoc at Prescient Design, @genentech, to work on scalable experimental design for scientific discovery! Strong theory background and interest in practical algorithms encouraged. Apply here: tinyurl.com/2hxyvnvc
Ji Won Park retweeted
Kyunghyun Cho @kchonyc ·
if you speak Korean but are not a Korean and are interested in working as a researcher/research analyst at the Korean Mission to the United Nations in NYC, consider this position: un.mofa.go.kr/un-ko/brd/m_24…
Ji Won Park retweeted
Kyunghyun Cho @kchonyc ·
congrats to anthropic for snapping up coefficient bio! what a great beginning of the great prescient design alumni network, @nc_frey @samuel_stanton_ jesse! stay tuned to hear more from other entrepreneurial alumni, such as @juliusadml and @asalam_91 of Gauss Lab.
Ji Won Park retweeted
Antonio Orvieto @orvieto_antonio ·
Optimization theory for adaptive methods actually predicts most of what we know about hyperparameter scaling in LLM pretraining, and suggests new strategies as well. We did a deep dive here.
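The thread only teases the result, but the kind of power-law hyperparameter-scaling rule at stake can be sketched in a few lines: fit lr = c · width^alpha on small-model sweeps, then extrapolate to a larger model. Everything below (the sweep data, the fitted exponent, the rule itself) is an illustrative assumption, not taken from the linked paper.

```python
# Toy illustration (not from the linked paper) of a power-law
# hyperparameter-scaling rule: fit on small proxy models, extrapolate
# the learning rate to a width that was never swept.
import numpy as np

# (width, best_lr) pairs from hypothetical small-model sweeps
widths = np.array([256, 512, 1024])
best_lrs = np.array([3.2e-3, 1.7e-3, 8.5e-4])

# Fit log(lr) = log(c) + alpha * log(width), i.e. lr = c * width**alpha
alpha, log_c = np.polyfit(np.log(widths), np.log(best_lrs), 1)
print(f"fitted exponent alpha = {alpha:.2f}")  # close to -1 for this toy data

# Extrapolate to a target width outside the sweep
target_width = 8192
predicted_lr = np.exp(log_c) * target_width**alpha
print(f"predicted lr at width {target_width}: {predicted_lr:.2e}")
```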
Ji Won Park รีทวีตแล้ว
Krikamol (Hiring Postdoc)
Krikamol (Hiring Postdoc)@krikamol·
Great news! Our 2nd Workshop on Epistemic Intelligence in Machine Learning (EIML) is happening again, co-located with @icmlconf in Seoul, South Korea 🇰🇷. 🔗 sites.google.com/view/eimlicml2… Stay tuned for an update soon.
Ji Won Park retweeted
Pietro Barbiero @pietro_barbiero ·
Join our interpretability seminar tomorrow (25/03)! 🗓️ Topic: Bayesian Concept Bottleneck Models with LLM Priors. Speaker: Jean Feng. Tune in live at 5:30 CET / 9:30 PST: youtube.com/@Interpretable… Link to paper: arxiv.org/abs/2410.15555
Ji Won Park retweeted
NEJM AI @NEJM_AI ·
In the latest episode of AI Grand Rounds, Dr. Kyunghyun Cho (@kchonyc) discusses his wide-ranging career spanning fundamental AI research, co-founding Prescient Design (acquired by @genentech), and driving applications of AI in health care. Full episode: nejm.ai/ep40
Ji Won Park retweeted
Guide Labs @guidelabsai ·
At @guidelabsai we’ve proven we don’t need to sacrifice intelligence to gain transparency. Standard AI systems have representations that are, by default, entangled, but Steerling-8B takes a different path: it learns disentangled representations by construction, through architectural and training-time constraints.

Steerling-8B has already learned thousands of novel concepts that it was never explicitly constrained to represent, spanning domains from linguistics and programming to geography, culture, and abstract reasoning. Consequently, we shift the question from “Can we reverse-engineer what this model knows?” to “What did this model learn?”

Here we show that we can easily discover thousands of novel concepts from the model; concepts it was never explicitly trained to learn. The model discovered:
- British English
- spelling as a coherent system
- a unified “you” across six languages
- spelled-out numbers as separate from digits
- typographic errors
- a dedicated concept for broken Unicode

These first ~100K concepts Steerling-8B discovered are evidence of what becomes possible when interpretability is a design choice rather than a retrofit. We’re unlocking an entirely new category of AI.

Test the model LIVE:
Guide Labs: guidelabs.ai/post/concept-d…
GitHub: github.com/guidelabs/stee…
Hugging Face: huggingface.co/guidelabs/stee…
#Steerling8B #GuideLabs #AI #MachineLearning
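The thread does not spell out how such concepts are verified, so here is a minimal, generic sketch of the standard linear-probe check one might run on any model's activations: if a concept (say, British vs. American spelling) is linearly represented, a logistic probe should recover its direction. The activations and concept direction below are synthetic stand-ins, not Steerling-8B internals.

```python
# Generic linear-probe sketch (not Guide Labs' actual method or API):
# fit a logistic probe on hidden activations to test whether a concept
# is linearly represented. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 64                                    # hidden size (assumption)
concept_dir = rng.normal(size=d)          # ground-truth direction (synthetic)
concept_dir /= np.linalg.norm(concept_dir)

# Synthetic "activations": concept-positive examples shifted along concept_dir
neg = rng.normal(size=(500, d))
pos = rng.normal(size=(500, d)) + 2.0 * concept_dir
X = np.vstack([neg, pos])
y = np.array([0] * 500 + [1] * 500)

probe = LogisticRegression(max_iter=1000).fit(X, y)
print(f"probe accuracy: {probe.score(X, y):.2f}")

# The probe's weight vector estimates the concept direction
w = probe.coef_[0]
print(f"cosine with true direction: {w @ concept_dir / np.linalg.norm(w):.2f}")
```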
Ji Won Park retweeted
Guide Labs @guidelabsai ·
It’s official: the first large-scale inherently interpretable language model is here. Steerling-8B from @guidelabsai is the first and largest model that can trace every token it generates back to:
→ Input context
→ Training data
→ Human-understandable concepts

In other words, we've successfully trained Steerling-8B to trace its outputs and explain what influenced each decision, enabling more reliable manipulation. This isn’t post-hoc explainability: interpretability is built directly into the model.

🔓 Steerling-8B can self-monitor for memorized content and suppress it at inference time, without retraining. That makes interpretability a first-class design principle, not an afterthought. This is a major step toward models we can actually understand, debug, and trust.

Over the coming days, we’ll be sharing investigations into what Steerling-8B’s interpretability enables in practice. Stay tuned as we dive deeper into our research and how we are building LLMs we can trust.

🚨 Try it LIVE and help improve it:
Guide Labs: guidelabs.ai/post/steerling…
GitHub: github.com/guidelabs/stee…
Hugging Face: huggingface.co/guidelabs/stee…

Huge thank you to @TimFernholz and @TechCrunch for featuring this breakthrough. techcrunch.com/2026/02/23/gui…
#Steerling8B #GuideLabs #AI #MachineLearning
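The mechanism behind the memorization self-monitoring isn't described in the thread. One generic way such inference-time suppression can work (not necessarily Steerling-8B's) is to index training-corpus n-grams and block any next token that would extend a long verbatim match. The sketch below uses toy word-level tokens and a hypothetical window size N.

```python
# Generic sketch of inference-time memorization suppression (not
# Steerling-8B's actual mechanism): refuse to emit a token that would
# complete a verbatim N-gram from the training corpus.
N = 8  # window size (illustrative assumption)

def build_ngram_index(corpus_tokens, n=N):
    """Map each (n-1)-gram in the corpus to the tokens that follow it."""
    index = {}
    for i in range(len(corpus_tokens) - n + 1):
        prefix = tuple(corpus_tokens[i : i + n - 1])
        index.setdefault(prefix, set()).add(corpus_tokens[i + n - 1])
    return index

def filter_memorized(context_tokens, candidate_tokens, index, n=N):
    """Drop candidates that would reproduce an N-gram seen in training."""
    prefix = tuple(context_tokens[-(n - 1):])
    banned = index.get(prefix, set())
    return [t for t in candidate_tokens if t not in banned]

# Toy usage with word-level "tokens"
corpus = "the quick brown fox jumps over the lazy dog".split()
index = build_ngram_index(corpus)
context = "he said the quick brown fox jumps over the".split()
print(filter_memorized(context, ["lazy", "sleepy"], index))  # ['sleepy']
```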
Ji Won Park retweeted
Stanford NLP Group @stanfordnlp ·
For this week's seminar, we are excited to host @EkdeepL from @GoodfireAI!

Date and Time: Thursday, February 19, 11:00 AM - 12:00 PM Pacific Time
Zoom Link: stanford.zoom.us/j/93941842999?…
Title: Bayes-ed: Formalizing a Paradigm for Interpretability in the Language of Bayesian Inference

Abstract: Interpretability research has exploded in recent years, resulting in diverse, often heuristic attempts at understanding how models perform the tasks they do. In this talk, I intend to present steps towards a framework that helps concretize these heuristics and also expands the notion of what it means to interpret. Specifically, focusing on in-context learning, we will start our analysis with a behavior-first approach and define Bayesian models that predict both the outputs produced and, assuming power-law scaling, the learning dynamics of large-scale Transformers. We then use these Bayesian models as our guiding object and characterize how representations ought to be structured in order to support such a behavioral model, hence making feature geometry a core object of study for interpretability. This lens helps us characterize the limitations of several existing interpretability paradigms, e.g., SAEs, but also offers a path forward, by either designing tools with appropriate geometrical assumptions or post-processing SAE activations. Critically, this implies there is no silver bullet in bottom-up interpretability: behavior guides what tool or post-processing ought to be used.

Grounded in this discussion, we then analyze the utility of our framework by assessing how representations can be used to influence behavior: we will make precise what inference-time interventions like activation steering are trying to achieve, how existing protocols have inherently incorrect assumptions, and how this can be fixed. Critically, making a formal link to post-training (grounded in existing Bayesian accounts of RLHF), we will show inference-time interventions can be seen as rejection sampling, motivating a pipeline for amortizing this process and leading to scalable oversight approaches grounded in interpretability. As a case study, we will operationalize a naive version of this pipeline for the task of reducing hallucinations, resulting in a 58% reduction in hallucinated claims in an open-source LLM at 100x less cost than using a frontier-model judge.

Excited to see everyone at the seminar!
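The abstract's claim that inference-time interventions can be seen as rejection sampling has a simple concrete form: samples from the base distribution p(x) are accepted with probability proportional to exp(r(x)/beta), which yields samples from the reward-tilted distribution p(x)·exp(r(x)/beta)/Z. The toy distribution, reward, and temperature below are illustrative assumptions, not details from the talk.

```python
# Toy numeric check of the rejection-sampling view of steering: accept
# base-model samples with probability exp((r(x) - r_max) / beta) <= 1,
# which samples from the tilted distribution p(x) * exp(r(x) / beta) / Z.
import numpy as np

rng = np.random.default_rng(0)
vocab = np.array([0, 1, 2, 3])
p = np.array([0.4, 0.3, 0.2, 0.1])   # base next-token distribution (toy)
r = np.array([0.0, 1.0, 2.0, 0.0])   # concept reward per token (toy)
beta = 1.0

r_max = r.max()
accepted = []
while len(accepted) < 50_000:
    x = rng.choice(vocab, p=p)
    if rng.uniform() < np.exp((r[x] - r_max) / beta):
        accepted.append(x)

empirical = np.bincount(accepted, minlength=4) / len(accepted)
target = p * np.exp(r / beta)
target /= target.sum()
print("empirical:", np.round(empirical, 3))
print("target:   ", np.round(target, 3))  # the two should agree closely
```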
Ji Won Park retweeted
Stephen Ra @stephenra ·
I’m excited to start as a Visiting Scholar at @nyuniversity’s Global AI Frontier Lab today! Please reach out if you’d like to chat!
Ji Won Park @jiwoncpark ·
I'm grateful to the GPTZero team (@alexcdot, gptzero.me/news/neurips/) for alerting us to the miscitations in our NeurIPS paper. Our analysis of the errors can be found here: openreview.net/forum?id=IiEtQ…, and the paper with updated references is available at tinyurl.com/ymb99d7s. We have also reached out to the NeurIPS'25 PCs. We sincerely apologize to the whole community, especially the affected authors, for our error.
Ji Won Park retweeted
Kyunghyun Cho @kchonyc ·
i was made aware of miscitations thanks to the GPTZero team (cc @alexcdot). ji won and i quickly checked them ourselves and have posted what happened on openreview: openreview.net/forum?id=IiEtQ…. we have already notified NeurIPS'25 PC's about this issue. i truly thank the GPTZero team for bringing this to our attention as well as raising the awareness of this serious issue (gptzero.me/news/neurips/), and at the same time i sincerely apologize to all for our error.
Ji Won Park @jiwoncpark ·
I'm hiring a Ph.D. intern at Prescient Design (Frontier Research) in SSF to work on active simulation-based inference for Summer 2026. Ideal for students with strong stats backgrounds and interests in decision theory and adaptive experimentation. Apply at tinyurl.com/427shxe4
Ji Won Park retweeted
Guide Labs @guidelabsai ·
🚀 We've developed a language model that's interpretable by design. 8 billion parameters, no performance sacrifice.
Ji Won Park retweeted
Julius Adebayo @juliusadml ·
We are continuing our series of inherently interpretable models. In this one, we introduce causal diffusion language models. GPT-style scaling behavior + multi-token control! See guidelabs.ai/post/block-cau… for more.
Guide Labs @guidelabsai:

Today we're releasing Causal Diffusion Language Models: a new architecture that combines AR-style scaling with block-level generation. We're building interpretable language models at @GuideLabsAI. That requires controlling concepts that span multiple tokens, which led us to rethink diffusion architectures.
❌ Autoregressive: generates one token at a time
❌ Diffusion: harder to scale effectively
✅ Solution: Causal Diffusion Models 🧵

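As a generic illustration of what "AR-style scaling with block-level generation" means mechanically (this is not Guide Labs' actual architecture): blocks are generated causally, one after another, while positions inside a block are filled in parallel over a few refinement steps. The denoiser below is a random stub standing in for a trained model.

```python
# Generic block-level generation loop: causal across blocks, parallel
# iterative refinement within a block. The denoiser is a stub; nothing
# here is Guide Labs' actual architecture.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, BLOCK, STEPS = 50, 4, 3
MASK = -1  # placeholder id for not-yet-decided positions

def denoise(context, block):
    """Stub denoiser: propose a token for every masked position in the
    block, conditioned on all previous blocks (random proposals here)."""
    return [int(rng.integers(VOCAB)) if t == MASK else t for t in block]

def generate(num_blocks):
    tokens = []
    for _ in range(num_blocks):        # causal: blocks generated in order
        block = [MASK] * BLOCK
        for _ in range(STEPS):         # refine the whole block in parallel
            proposal = denoise(tokens, block)
            # Commit a few positions per step (confidence-ordered in real
            # schedulers; random here)
            keep = rng.choice(BLOCK, size=max(1, BLOCK // STEPS), replace=False)
            for i in keep:
                block[i] = proposal[i]
        block = denoise(tokens, block)  # fill any remaining masked slots
        tokens.extend(block)
    return tokens

print(generate(num_blocks=3))          # 12 tokens, generated block by block
```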