
The most interesting idea here isn't visual retrieval.
It's treating the screenshot as the ground truth.
We've spent years optimizing:
HTML → Text → Chunks → Embeddings
Maybe the better approach is:
Page → Pixels → Embeddings
Especially for tables, dashboards, PDFs, charts, and documentation.
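
A rough sketch of why tables are the obvious case, using only the Python standard library. It covers just the preprocessing half of each pipeline. The embedding step is left out because it depends on whichever model you pick. All names here (`html_to_chunks`, `page_to_patches`, the tiny table, the 4x4 "screenshot") are made up for illustration, and the pixel grid is a stand-in for a real rendered page.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and drops all markup, as a text pipeline would."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def html_to_chunks(html, chunk_size=4):
    """HTML -> Text -> Chunks: fixed-size word windows over the flattened text."""
    parser = TextExtractor()
    parser.feed(html)
    words = " ".join(parser.parts).split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def page_to_patches(pixels, patch=2):
    """Page -> Pixels: split a 2D grayscale grid into (row, col, tile) patches."""
    tiles = []
    for r in range(0, len(pixels), patch):
        for c in range(0, len(pixels[0]), patch):
            tile = [row[c:c + patch] for row in pixels[r:r + patch]]
            tiles.append((r // patch, c // patch, tile))
    return tiles


# Hypothetical two-column table.
table_html = (
    "<table><tr><th>Region</th><th>Q1</th></tr>"
    "<tr><td>North</td><td>10</td></tr>"
    "<tr><td>South</td><td>30</td></tr></table>"
)
chunks = html_to_chunks(table_html)
# chunks == ["Region Q1 North 10", "South 30"]
# The second chunk no longer carries its column headers.

# Hypothetical 4x4 grayscale "screenshot".
screenshot = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]
patches = page_to_patches(screenshot)
# Each patch keeps its (row, col) position on the page.
```

In the text path, the row and column structure is gone before anything gets embedded. In the pixel path, every patch still knows where it sat on the page, so the layout is part of the input.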