
ZiniuYu
41 posts








Text watermarking using (surprise!) embedding models?! And those watermarks persist after paraphrasing and translation, making this one of the most "out-of-domain" uses of embeddings we came across at EMNLP 2024. The approach leverages the long-context and cross-lingual features of jina-embeddings-v3 to create a robust watermark system. But first, what makes a good text watermark?

No parsing or OCR; no multi-vector or late interaction! VisRAG from @TsinghuaNLP outperforms TextRAG by addressing RAG bottlenecks at both stages: it achieves higher retrieval accuracy and better answer generation via multimodal reasoning. We're thrilled to invite @dgdsxyushi to present his VisRAG work and share his hot take on how pure vision-based pipelines can generalize better to real-world scenarios.















