Erin LeDell - ledell.bsky.social

5K posts


@ledell

Chief Scientist @dbnlAI #AITesting, Founder @ https://t.co/IevLGfnUT0 PhD in #biostatistics @UCBerkeley, founder @wimlds, co-founder @RLadiesGlobal

Oakland, California, USA 🌴 Joined February 2009
7.1K Following · 10.2K Followers
Pinned Tweet
Erin LeDell - ledell.bsky.social
Hey folks, come and find me elsewhere? I'm trying to move my attention and energy to the Good Place.
0
2
7
1.2K
Erin LeDell - ledell.bsky.social retweeted
Dr. Lucky Tran @luckytran
New Mexico has issued a public health order that removes federal restrictions to COVID-19 vaccine access so that pharmacies in New Mexico can vaccinate people of all ages and risk profiles. Every state needs to do this!
726
7K
28.2K
615.5K
Sara Hooker @sarahookr
It has been an incredible honor to spend the past few years leading @Cohere_Labs @cohere . This has been the adventure of a lifetime. However, after much deliberation, I made a tough decision 2 months ago it is time to say goodbye.
133
20
1.3K
138K
Erin LeDell - ledell.bsky.social
Sara is one of the most amazing AI leaders I know. She's built one of the most unique labs in the industry, made huge advances in the field, all while being a fierce champion for under-served communities. If you don't know her work, please have a look at her post below!
Sara Hooker @sarahookr

It has been an incredible honor to spend the past few years leading @Cohere_Labs @cohere . This has been the adventure of a lifetime. However, after much deliberation, I made a tough decision 2 months ago it is time to say goodbye.

0
0
7
1.1K
Erin LeDell - ledell.bsky.social retweeted
Chip Huyen @chipro
I open sourced Sniffly, a tool that analyzes Claude Code logs to help me understand my usage patterns and errors. Key learnings. 1. The biggest type of errors Claude Code made is Content Not Found (20 - 30%). It tries to find files or functions that don't exist. So I restructured my code base for discoverability, and the average number of steps Claude Code needs for each instruction went from 8 to 7 steps.
45
127
1.2K
154.1K
Erin LeDell - ledell.bsky.social retweeted
Noel Brewer @noelTbrewer
The 17 fired members of ACIP, including me, have written an article in JAMA about our experience and our concerns. jamanetwork.com/journals/jama/…
137
1.3K
3K
200.3K
Erin LeDell - ledell.bsky.social retweeted
Dr. Lucky Tran @luckytran
BREAKING (ACLU): A federal judge reversed NIH's terminations of hundreds of critical research grants that were canceled because of their alleged connection to disfavored topics, including diversity, equity, inclusion, and gender identity. This is a major win for public health.
11
994
4.2K
88.5K
Erin LeDell - ledell.bsky.social retweeted
Shreya Shankar @sh_reya
new blogpost on writing in the ~glorious~ age of LLMs
19
166
1.4K
118.6K
Erin LeDell - ledell.bsky.social retweeted
Meredith Whittaker @mer__edith
Use Signal. We promise, no AI clutter, no surveillance ads—whatever the rest of the industry does. We lead we don’t follow❤️
153
624
3.3K
304.2K
Erin LeDell - ledell.bsky.social retweeted
Raven Baxter, Ph.D. @ravenscimaven
History and biology are in alignment: sex and gender are on a spectrum, and it's complex. People who crave power have, throughout history, manipulated this as a means of control. Why should we accept ideas that were designed to control people for greed?
Heartland Signal @HeartlandSignal

U.S. Rep. Marjorie Taylor Greene (R-GA) uses a committee hearing on 23andMe's bankruptcy and data security to ask a board member why she reposted a tweet critical of her in 2021.

75
845
3.5K
107.1K
Erin LeDell - ledell.bsky.social retweeted
Sara Hooker @sarahookr
Truly excellent video by @MLStreetTalk about how a handful of providers have systematically overfit to @lmarena_ai. 26 mins of video showcase how easy it has been to distort the rankings. As scientists, we must do better. As a community, I hope we can demand better.
3
21
135
8.8K
Erin LeDell - ledell.bsky.social retweeted
Hamel Husain @HamelHusain
This happens when you don’t have good AI evals
Hiten Shah @hnshah

I just got off the phone with a founder. It was an early Sunday morning call, and they were distraught.

The company had launched with a breakout AI feature. That one worked. It delivered. But every new release since then? Nothing’s sticking.

The team is moving fast. They’re adding features. The roadmap looks full. But adoption is flat. Internal momentum is fading. Users are trying things once, then never again. No one’s saying it out loud, but the trust is gone.

This is how AI features fail. Because they teach the user a quiet lesson: don’t rely on this.

The damage isn’t logged. It’s not visible in dashboards. But it shows up everywhere. In how slowly people engage. In how quickly they stop. In how support teams start hedging every answer with “It should work.” Once belief slips, no amount of capability wins it back.

What makes this worse is how often teams move on. A new demo. A new integration. A new pitch. But the scar tissue remains. Users carry it forward. They stop expecting the product to help them. And eventually, they stop expecting anything at all.

This is the hidden cost of broken AI. Beyond failing to deliver, it inevitably also subtracts confidence. And that subtraction compounds.

You’re shaping expectation, whether you know it or not. Every moment it works, belief grows. Every moment it doesn’t, belief drains out. That’s the real game.

The teams that win build trust. They ship carefully. They instrument for confidence. They treat the user’s first interaction like a reputation test, because it is. And they fix the smallest failures fast. Because even one broken output can define the entire relationship.

Here’s the upside: very few teams are doing this. Most are still chasing the next “AI-powered” moment. They’re selling potential instead of building reliability.

If you get this right, you become the product people defend in meetings. You become the platform they route their workflow through. You become hard to replace.

Trust compounds. And when it does, it turns belief into lock-in.

5
8
75
13K
Erin LeDell - ledell.bsky.social retweeted
Hamel Husain @HamelHusain
It is very easy to make mistakes when creating evals for your AI product. @sh_reya and I run through the most common mistakes in this talk (with memes 🌶️!). Chapter summaries below:

00:51 Foundation model benchmarks are not the same as your application evals
03:00 Generic Evals Are Useless
04:00 Do not outsource labeling & prompting to non domain experts
09:28 You should make your own data annotation app
12:40 Your LLM prompts should be specific and grounded in error analysis
15:25 Use binary labels
18:57 Look at your data
23:41 Be careful of overfitting to test data
25:40 Do online tests

Links and more resources in the reply
10
35
313
46K
Erin LeDell - ledell.bsky.social retweeted
Hamel Husain @HamelHusain
This is going to be one of the highest signal sessions of the series. Most issues with RAG are poor retrieval. Search OG @softwaredoug is going to share his bag of tricks from 10 years of optimizing search in industry maven.com/p/29a33a/hybri…
Hamel Husain @HamelHusain

RAG is dead posts are annoying as F "R" is retrieval and "AG" is the LLM. This means you think retrieval is dead. Seriously, you think retrieval is dead? Keyword search, metadata filtering (dates, users), grep, and other filtering are retrieval. Good luck without retrieval

2
20
132
23.3K
Erin LeDell - ledell.bsky.social retweeted
Zain @ZainHasan6
Overview of Large Language Models for Statisticians

They bridge the gap between statistics and AI - identifying key areas where statistical expertise can enhance LLM development.

List emerging statistical challenges in LLM: uncertainty quantification, decision-making, causal inference, distribution shift, interpretability, fairness, privacy and watermarking.
2
3
18
1K
Erin LeDell - ledell.bsky.social retweeted
Hamel Husain @HamelHusain
I'm excited to teach this lightning lesson with @sh_reya on improving your AI consistently with evals. Even though this is the most important topic in applied AI, there are sparse educational materials on this! We are fixing that in this series. maven.com/p/0c0359/impro…
5
24
167
10K
Erin LeDell - ledell.bsky.social
@Globalbiosec If you are going to be wearing a mask, why not just wear a respirator? KN95s are just as comfortable as a medical mask but have far greater protection. It makes zero sense to me when I see people wearing medical masks.
0
1
8
166