
Thinking effort doesn't fix hallucination.
Even the best frontier model at matched HIGH effort still gets 24.2% of fields wrong on adversarial insurance docs. Going from default to HIGH buys only 0-2pp per model.
aginor.ai/extraction-tes…

Tim Michaud

@TimGMichaud
Founder @ New thing - (YC Alum) still a Security Nerd.







ngl I think about this every so often. after decades of looking at embedded, mobile, cloud, windows (userland), and linux (userland+kernel) I feel like I have a foundation to create something of my own. but at the same time throwing myself at a high risk idea is a bit spooky












