Jaas AI

@jaas_ai

JaaS AI - AI Agents that Automate your Business. Our focus is to help SMBs increase productivity and profitability by driving AI Agents! - jaas-ai.com

Toronto, Canada · Joined September 2025

16 Following · 3 Followers

Jaas AI @jaas_ai · Sep 18

@svpino What about implementing monitoring tools that could look for non-deterministic outcomes (e.g., hallucinations) in parallel with the agent?


Santiago @svpino · Sep 18

Every single solution to fight the non-deterministic nature of Large Language Models comes with trade-offs. There's no free lunch. Some people have suggested running the same process 3 times and choosing the most common answer. That would definitely improve reliability, but it will make the solution 3x slower. Anybody who's spent more than 10 minutes building software for a living understands why that non-deterministic nature is a big issue. If reliability were all that mattered, you could find 1,000 ways to improve it, but there's always more at stake.
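A minimal sketch of the "run it 3 times and choose the most common answer" idea from the post above. The names `majority_vote` and `run` are hypothetical, not part of any library, and `run` stands in for whatever function calls your agent. The calls here are sequential, which is where the roughly 3x slowdown comes from.

```python
from collections import Counter
from typing import Callable


def majority_vote(run: Callable[[], str], attempts: int = 3) -> tuple[str, float]:
    """Call `run` `attempts` times; return the most common answer and its share."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    # Each call is a full agent run, so latency and cost grow with `attempts`.
    answers = [run() for _ in range(attempts)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / attempts


# Stand-in for a flaky agent (hypothetical): the second run returns garbage.
canned_outputs = iter(["<p>ok</p>", "oops", "<p>ok</p>"])
answer, agreement = majority_vote(lambda: next(canned_outputs), attempts=3)
```

Exact string matching only works when answers are short and canonical. For free-form text you would need some normalization or comparison step before voting, which is another trade-off.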

Santiago @svpino

Agents are far, far away from being reliable enough to work as part of critical applications. I wish I could tell you something different, but I just can't.
I've built several agents, and so far, none of them are 100% reliable, even when most aren't complex at all.
The simplest agent I've built does the following:
1. Retrieves some information from a vector store
2. Formats that information into HTML
It doesn't get simpler than this, and yet about 1% of the time, the agent fails to perform the conversion.
Don't ask me why, but it just fails. 99% of the time, it works. 1% of the time, it fails.
You just can't trust an LLM.


121

23.6K

Jaas AI @jaas_ai · Sep 13

@svpino Even better if you can have real-time monitoring for some AI apps to detect issues (e.g., hallucinations) before they happen! - jaas-ai.net


Santiago @svpino · Sep 12

The only thing worse than not having tests is having useless tests.


147

27.8K

Jaas AI @jaas_ai · Sep 13

@svpino AI testing is important to ensure that your applications (including chatbots) behave as they should. There are some tools out there... check JaaS-AI.net ;)


Santiago @svpino · Sep 11

These are some of the most mediocre people: The same folks who hate unit tests are now up in arms against writing evals for their LLM-based applications. If you are a solo developer or a startup building a prototype, you shouldn’t spend your time writing tests or evaluations. Instead, focus on iterating as fast as you can. But as soon as you are ready to go to production and scale your product, you gotta start writing automated tests and evals. Testing/evals are 10x more relevant when building agents and 100x more relevant when building critical workflows. Nobody has ever built anything of value without extensive eval coverage.
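For readers unsure what an automated eval looks like in practice, here is a minimal sketch, assuming the agent can be wrapped as a function from a prompt string to an output string. `pass_rate`, `fake_agent`, and the two cases are hypothetical illustrations, not any framework's API.

```python
from typing import Callable, Iterable

# One eval case: an input prompt plus a check that judges the agent's output.
EvalCase = tuple[str, Callable[[str], bool]]


def pass_rate(agent: Callable[[str], str], cases: Iterable[EvalCase]) -> float:
    """Run every case through `agent` and return the fraction of checks that pass."""
    results = [check(agent(prompt)) for prompt, check in cases]
    if not results:
        raise ValueError("no eval cases supplied")
    return sum(results) / len(results)


def fake_agent(text: str) -> str:
    """Stand-in for a real agent: wraps text in <p> tags but fails on one input."""
    return text if text == "bad" else f"<p>{text}</p>"


cases: list[EvalCase] = [
    ("hello", lambda out: out == "<p>hello</p>"),
    ("bad", lambda out: out.startswith("<p>")),
]
rate = pass_rate(fake_agent, cases)
```

In a real suite the checks would encode what "correct" means for your workflow, and you would track the pass rate across prompt or model changes, since a single run says little about a non-deterministic system.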


691

87.7K
