I'm beyond excited to announce Try That LLM: a service for people using LLMs via API. Bulk-test your project's prompts against dozens of LLMs, automatically test them against every new LLM as it ships, and set up LLM judges to score the outputs.
Try That LLM is designed for scenarios like these:
"Our product uses 20 or so basic prompts... but wow, those LLMs are pricey. If we switched some prompts to a cheaper LLM, would quality suffer?"
"How do I test my prompts against every new LLM that appears? I don't want to be copying and pasting every time something new ships."
"My manager read about LLM XYZ on Hacker News and wants to know if we should use it. I guess I'm spending my day figuring that out."
"I just want something to score the responses and tell me the best one."
"Which prompts are costing us the most?"
If you get a chance to try it out, I'd love your feedback/comments.
trythatllm.com