Sampler

11 posts

@sampler_ai

Collaborative prompt design for product teams. Made by @newmaterialco.

☁️ Joined August 2025
1 Following · 11 Followers
Sampler retweeted
Jonathan Haas (@JonathanHaas)
Them: “You can’t just throw @DSPyOSS at everything.”
Me: throws DSPy at inbox
DSPy: batch_with_similar updates, route Google → Engineering, mark Mercury as networking pitch, surface competitive intel.
Me: I absolutely can and will.
3
14
127
9.2K
Sampler retweeted
Shreya Shankar (@sh_reya)
One of the most pressing questions in our AI Evals course is: "Why can’t I just have an LLM write my LLM pipeline?" The nuanced answer is that you can use LLMs to assist, but not for the whole pipeline. Knowing where to put the LLM in the loop is the hard part.

To unpack this, we invited Omar Khattab (@lateinteraction), creator of DSPy, leading expert on prompt optimization, and now professor at MIT, for a "fireside chat" in the course. He shed light on how he approaches pipeline development in practice.

What stood out to us is that Omar spends most of his time on specification (e.g., defining the task clearly, looking at the data, and doing careful error analysis) before letting LLMs automate anything. This up-front rigor is what makes downstream optimization actually work.

We've put the recording on YouTube. If you're wondering how Omar thinks about these tradeoffs, this conversation is worth a listen! youtube.com/watch?v=ctyU0z…
8
30
249
60.6K
Sampler retweeted
Justin Torre (@justinstorre)
@awwstn Think of it like web dev: you don’t only rely on unit tests, you need logging, monitoring, and user analytics. Same thing for agents.
At the end of the day you’re shipping a product, so it still boils down to the same first principles.
In prod, users are the ultimate eval.
1
2
40
5.3K
Sampler retweeted
Lenny Rachitsky (@lennysan)
Trend I'm following: evals becoming a must-have skill for product builders and AI companies. It's the first new hard skill in a long time that PMs/engineers/founders have had to learn to be successful. The last one was maybe SQL, and Excel?

A few examples:
@garrytan: "Evals are emerging as the real moat for AI startups."
@kevinweil: "Writing evals is going to become a core skill for product managers."
@mikeyk: "Writing evals is probably the most important thing right now."
@saranormous: "Evals = your new marketing."
@gdb: "Evals are surprisingly often all you need."

More to come.
Brendan (can/do) (@BrendanFoody)

Mercor (@mercor_ai) is now working with 6 out of the Magnificent 7, all of the top 5 AI labs, and most of the top application layer companies. One trend is common across every customer: we are entering The Era of Evals.

RL is becoming so effective that models will be able to saturate any evaluation. This means that the primary barrier to applying agents to the entire economy is building evals for everything. This will be one of the largest buildouts we have ever seen, with enterprises pouring hundreds of billions of dollars into evals for every workflow we want agents to automate.

We're quickly defining a new class of work and hiring across nearly every domain: software engineers, consultants, bankers, lawyers, doctors, gamers, and many more.

33
58
643
414.8K
Sampler retweeted
Eugene Yan (@eugeneyan)
@HanchungLee sad but true. many teams want to delegate the work of qa / eval to some external provider but that’s like outsourcing the definition of your customer problem and how you measure it. i’m skeptical an external provider will know and care about your user as much as you do
1
1
20
738
Sampler retweeted
Prashanth Rao (@tech_optimist)
Just had an aha moment w/ GEPA @DSPyOSS
- Gemini 2.5 Flash-Lite (GPT 4.1 reflection LM)
- 3 signatures in one compound module
- 12 training examples
- 10 test examples
- 32 minutes of optimizer runtime
- $0.90 total cost

Results:
- Baseline: 68.2%
- GEPA-Optimized: 95.3% 🤯
28
64
919
114.7K
Sampler (@sampler_ai)
A modular way to build prompts out of universal building blocks.
0
1
8
2K
Sampler retweeted
edwin (@edwinarbus)
Prompting GPT-5 is different. In the examples below, optimized prompts:
• Cut runtime by 1s
• Dropped memory use 3,626 KB → 577 KB
• Boosted code quality
• Improved robustness (0.32→0.54)
• Increased context grounding (0.80→0.95)

We built a prompt migrator + optimizer so you don’t need to memorize every GPT-5 best practice.
31
77
738
268.9K
Sampler retweeted
Gabriel Mitchell (@gabeschnitzel)
@sampler_ai squad - complaining about the weather & UI.
0
1
2
47