Sampler

11 posts

@sampler_ai

Collaborative prompt design for product teams. Made by @newmaterialco.

☁️ Joined August 2025

1 Following · 11 Followers

Sampler reposted

Jonathan Haas @JonathanHaas · Sep 3

Them: “You can’t just throw @DSPyOSS at everything.”
Me: throws DSPy at inbox
DSPy: batch_with_similar updates, route Google → Engineering, mark Mercury as networking pitch, surface competitive intel.
Me: I absolutely can and will.

127

9.2K

Sampler reposted

Shreya Shankar @sh_reya · Aug 28

One of the most pressing questions in our AI Evals course is: "Why can’t I just have an LLM write my LLM pipeline?" The nuanced answer is that you can use LLMs to assist, but not for the whole pipeline. Knowing where to put the LLM in the loop is the hard part. To unpack this, we invited Omar Khattab (@lateinteraction) —creator of DSPy, leading expert on prompt optimization, and now professor at MIT—for a "fireside chat" in the course. He shed light on how he approaches pipeline development in practice. What stood out to us is that Omar spends most of his time on specification—e.g., defining the task clearly, looking at the data, and doing careful error analysis—before letting LLMs automate anything. This up-front rigor is what makes downstream optimization actually work. We've put the recording on YouTube. If you're wondering how Omar thinks about these tradeoffs, this conversation is worth a listen! youtube.com/watch?v=ctyU0z…

249

60.6K

Sampler reposted

Justin Torre @justinstorre · Sep 6

@awwstn Think of it like web dev: you don’t only rely on unit tests, you need logging, monitoring, and user analytics. Same thing for agents.
At the end of the day you’re shipping a product, so it still boils down to the same first principles.
In prod, users are the ultimate eval.

5.3K

Sampler reposted

Lenny Rachitsky @lennysan · Sep 4

Trend I'm following: evals becoming a must-have skill for product builders and AI companies. It's the first new hard skill in a long time that PMs/engineers/founders have had to learn to be successful. The last one was maybe SQL, and Excel?
A few examples:
@garrytan: "Evals are emerging as the real moat for AI startups."
@kevinweil: "Writing evals is going to become a core skill for product managers."
@mikeyk: "Writing evals is probably the most important thing right now."
@saranormous: "Evals = your new marketing."
@gdb: "Evals are surprisingly often all you need."
More to come.

Brendan (can/do) @BrendanFoody

Mercor (@mercor_ai) is now working with 6 out of the Magnificent 7, all of the top 5 AI labs, and most of the top application layer companies. One trend is common across every customer: we are entering The Era of Evals. RL is becoming so effective that models will be able to saturate any evaluation. This means that the primary barrier to applying agents to the entire economy is building evals for everything. This will be one of the largest buildouts we have ever seen with enterprises pouring hundreds of billions of dollars into evals for every workflow we want agents to automate. We're quickly defining a new class of work and hiring across nearly every domain: software engineers, consultants, bankers, lawyers, doctors, gamers, and many more.

643

414.8K

Sampler reposted

Julia Neagu @julianeagu · Sep 7

x.com/i/article/1964…

374

95.6K

Sampler reposted

Eugene Yan @eugeneyan · Sep 6

@HanchungLee sad but true. many teams want to delegate the work of qa / eval to some external provider but that’s like outsourcing the definition of your customer problem and how you measure it. i’m skeptical an external provider will know and care about your user as much as you do

738

Sampler reposted

Prashanth Rao @tech_optimist · Sep 4

Just had an aha moment w/ GEPA @DSPyOSS
- Gemini 2.5 Flash-Lite (GPT 4.1 reflection LM)
- 3 signatures in one compound module
- 12 training examples
- 10 test examples
- 32 minutes of optimizer runtime
- $0.90 total cost
Results:
- Baseline: 68.2%
- GEPA-Optimized: 95.3% 🤯

919

114.7K

Sampler @sampler_ai · Aug 14

A modular way to build prompts out of universal building blocks.

Sampler reposted

edwin @edwinarbus · Aug 12

Prompting GPT-5 is different. In the examples below, optimized prompts:
• Cut runtime by 1s
• Dropped memory use 3,626 KB → 577 KB
• Boosted code quality
• Improved robustness (0.32→0.54)
• Increased context grounding (0.80→0.95)
We built a prompt migrator + optimizer so you don’t need to memorize every GPT-5 best practice.

738

268.9K

Sampler @sampler_ai · Aug 12

Good soup for prompt design by @omarsar0 here 👉 promptingguide.ai/techniques

Sampler reposted

Gabriel Mitchell @gabeschnitzel · Aug 12

@sampler_ai squad - complaining about the weather & UI.
