
In our multi-speaker audio recordings, every conversation produces three files. Host track. Guest track. Combined file.
The combined file is what ends up in the training data. It's also the easiest one to fake.
Here's how: submit two genuine individual tracks, but replace the combined file with a completely different recording. To most systems, it looks legitimate. Three files, all present, nothing flagged.
We found exactly this. Combined files that couldn't have been produced by mixing the individual tracks submitted alongside them. The math didn't add up.
Our Authenticity Verifier catches it by reconstructing what the combined file should sound like from the two individual tracks, then comparing that against what was actually submitted.
Files that don't match: rejected.
Files that are close but not right: flagged for human review.
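A minimal sketch of that reconstruct-and-compare idea, not our production code. It assumes the tracks are time-aligned, share a sample rate, and that a genuine combined file is a plain sample-wise sum of the two tracks (possibly at a different gain). The function names, the cosine-similarity score, and both thresholds are hypothetical, chosen only for illustration.

```python
import math


def reconstruct_mix(host, guest):
    """Rebuild the expected combined signal as a sample-wise sum (assumed mixing model)."""
    n = min(len(host), len(guest))
    return [host[i] + guest[i] for i in range(n)]


def cosine_similarity(a, b):
    """Gain-invariant similarity in [-1, 1]; returns 0.0 if either signal is silent."""
    n = min(len(a), len(b))
    dot = sum(a[i] * b[i] for i in range(n))
    norm_a = math.sqrt(sum(x * x for x in a[:n]))
    norm_b = math.sqrt(sum(x * x for x in b[:n]))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(host, guest, combined, accept_at=0.95, reject_below=0.5):
    """Compare the submitted combined file against the reconstruction.

    The thresholds are illustrative placeholders, not tuned values.
    """
    score = cosine_similarity(reconstruct_mix(host, guest), combined)
    if score >= accept_at:
        return "accepted"
    if score < reject_below:
        return "rejected"
    return "flagged"  # close but not right: send to human review
```

Cosine similarity is used here because it ignores overall volume, so a genuine mix exported at a lower level still scores high. Real recordings would also need resampling, alignment, and codec tolerance, which this sketch leaves out.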
If the parts don't make the whole, something has gone wrong. This tool helps us find out exactly what.
More on this soon. Want to talk to us? DMs open.
