SciWeave: 0% hallucinated citations on ScholarQABench (Asai et al., Nature 2026).
GPT-5.2: 59%.
Claude Opus 4.6: 62%.
100 queries across biomedicine and neuroscience.
The 0% isn't a model trick. SciWeave doesn't ask the LLM to recall papers. It retrieves passages from OpenAlex (~300M scientific works) and synthesizes the answer from what comes back. The architecture rules out fabrication.
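A minimal sketch of the retrieve-then-synthesize idea, not SciWeave's actual code. The `Passage` type, the `retrieve` and `answer` functions, the toy keyword scoring, and the `W1`-style IDs are all hypothetical stand-ins; a real system would query an index such as OpenAlex and use an LLM for synthesis. The point it illustrates: if citations are only ever copied from retrieved records, a cited work cannot be invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    work_id: str  # hypothetical ID, standing in for a real index identifier
    text: str


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Toy retriever (assumption): rank passages by keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(p.text.lower().split())), p) for p in corpus]
    # Drop passages with no overlap, so an off-topic query retrieves nothing.
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def answer(query: str, corpus: list[Passage], k: int = 2) -> tuple[str, list[str]]:
    """Build an answer from retrieved passages only.

    Stand-in for LLM synthesis: each sentence is a retrieved passage followed
    by its own ID. Citations are copied from retrieved records, never generated.
    """
    passages = retrieve(query, corpus, k)
    text = " ".join(f"{p.text} [{p.work_id}]" for p in passages)
    citations = [p.work_id for p in passages]
    return text, citations


corpus = [
    Passage("W1", "Sleep deprivation impairs hippocampal memory consolidation"),
    Passage("W2", "Statins reduce cardiovascular events in adults"),
    Passage("W3", "Memory consolidation depends on slow wave sleep"),
]

text, citations = answer("sleep memory consolidation", corpus)
known_ids = {p.work_id for p in corpus}
print(citations)
print(all(c in known_ids for c in citations))
```

In this sketch every citation is a record that was actually retrieved. That guarantees the cited work exists in the corpus; it does not by itself show that the passage supports the sentence citing it, which is a separate check.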
The benchmark, the scoring script, and the NLI judge are all public. We ran SciWeave against them.
Hallucinated citations are already showing up in peer-reviewed bibliographies. The failure rate deserved a number.
desci.com/blog/zero-hall…