Sapien

4.1K posts

@BuildOnSapien

Building Proof of Quality - Verifiable quality signals for AI

Anywhere · Joined May 2024
202 Following · 137.7K Followers
Pinned Tweet
Sapien @BuildOnSapien
Most AI failures are not “mystery bugs.” They are predictable outcomes of unverified judgments made somewhere in data capture, evaluation, or review. Proof of Quality is built to make those judgments auditable and accountable. Today we are publishing the Sapien roadmap so builders can see exactly what is being shipped, when, and why. It is organized around one goal: make Proof of Quality a drop-in primitive for any AI pipeline. Sapien.io/roadmap
21
19
126
999.8K
Sapien @BuildOnSapien
@randgroup We need to be able to rely on the medical care given to us.
0
0
5
79
Sapien @BuildOnSapien
The risk is that an agent keeps working while the artifact slowly stops matching reality. Proof of Quality fixes this with verification loops tied to the work product, not trust in the agent.
1
0
7
782
Sapien @BuildOnSapien
Microsoft Research published notes on its May research around AI delegation and long-horizon reliability. The benchmark studies delegated workflows where an AI modifies important artifacts over multiple steps. Frontier models introduced sparse but consequential errors, with roughly 19-34% degradation in artifact fidelity over 20 delegated iterations.
30
3
28
1.5K
Sapien @BuildOnSapien
AI-assisted security auditing will increase output. Firms will be able to generate more leads, more hypotheses, and more report drafts, but clients will ask a fair question: which of these findings have been reviewed by people I should trust? The answer should become part of the deliverable.
2
2
36
1.7K
Sapien @BuildOnSapien
AI systems increasingly generate decisions faster than organizations can verify them. Proof of Quality fixes this. sapien.io/developers
0
0
6
655
Sapien @BuildOnSapien
A recent survey by Sinch found that 74% of companies using AI agents in customer service have rolled systems back or shut them down. Teams realized they lacked an audit trail showing whether outputs were reliable or hallucinated.
Dr Efi Pylarinou @efipm

🔴 Governance overtakes development in AI! Sinch says 74% of enterprises have rolled back live AI communications agents, and the highest rollback rates appear in organizations with the most mature guardrails. One of the most important signals in enterprise AI: production is not the finish line. buff.ly/Ywh16PX

4
6
42
2.3K
Sapien @BuildOnSapien
Sapien is live on @Wealthsimple. Canada has long been home ground for part of our team, our community, and our early supporters. $SAPIEN is now easier to access for the market that helped shape us.
8
5
60
3.3K
Sapien @BuildOnSapien
Using AI to summarise patient/doctor conversations can save hours, yet 45% of AI scribe systems are prone to hallucinations, according to this study by Supply Ontario. Proof of Quality gives these systems the guardrails needed to trust their output.
Nick Kapur @nick_kapur

An auditor for the Ontario, Canada government found that AI agents tasked with turning doctor/patient conversations into structured notes routinely hallucinated false treatments, replaced drug names with entirely different drugs, and missed crucial information

3
3
31
1.8K
Sapien @BuildOnSapien
If you want your agents to take only verified actions, join the Proof of Quality waitlist for early access: sapien.io/developers
0
1
4
911
Sapien @BuildOnSapien
“Agents are optimized for completeness, not certainty. An AI will give you an output without knowing whether it's good or not.” Lukas Grapentine put it clearly during the House of AI event at Consensus.
6
5
48
1.6K
Sapien @BuildOnSapien
Google found an AI-assisted zero-day exploit with a hallucinated CVSS score inside the script. That is the failure mode builders should pay attention to. AI output can be partly wrong and still operationally dangerous. The next security primitive is verification.
News from Google @NewsFromGoogle

The Google Threat Intelligence Group has detected the first known instance of a threat actor using an AI-developed zero-day exploit in the wild. While the attackers planned a wide-scale strike, our proactive counter-discovery may have prevented that from happening. This finding is part of our new report on AI-powered threats.

6
6
39
2.3K
Sapien @BuildOnSapien
@extremebaba Official date will be announced very soon. Thank you.
0
0
1
49
Extremebaba @extremebaba
@BuildOnSapien Been anticipating this. Any timeline to look forward to for the launch of Proof of Quality?
1
0
2
108
Sapien @BuildOnSapien
A model can ace one domain and drift badly in another. A study testing this failure case across 72 configurations found that AI reliability can vary sharply by domain, and the hardest cases are the ones where the model is forced to reason about implicit world state. Builders need verification that travels with the workflow. Proof of Quality fixes this with domain-aware review, consensus, and final attestation.
11
7
48
2.1K