PageIndex
@PageIndexAI

Agentic AI for Documents. Try https://t.co/yRBqagX9Na.

London · Joined June 2023
277 Following · 978 Followers · 1.4K posts
PageIndex @PageIndexAI:
@heynavtoor Thanks for sharing! We're betting on "Reasoning-as-Retrieval" with PageIndex: the LLM agentically reasons over a document tree index to find the right context — no vector DB, no chunking. github.com/VectifyAI/Page…
0 replies · 0 reposts · 0 likes · 69 views
Nav Toor @heynavtoor:
🚨 This Python tool just made vector databases optional for RAG.

It's called PageIndex. It reads documents the way you do. No embeddings. No chunking. No vector database needed.

Here's the problem with normal RAG: it takes your document, cuts it into tiny pieces, turns those pieces into numbers, and searches for the closest match. But closest match doesn't mean best answer.

PageIndex works completely differently:
→ It reads your full document
→ Builds a tree structure like a table of contents
→ When you ask a question, the AI walks through that tree
→ It thinks step by step until it finds the exact right section

Same way you'd find an answer in a textbook. You don't read every page. You check the chapters, pick the right one, and go straight to the answer. That's exactly what PageIndex teaches AI to do.

Here's the wildest part: it scored 98.7% accuracy on FinanceBench. That's a test where AI answers real questions from SEC filings and earnings reports. Most traditional RAG systems can't touch that number.

Works with PDFs, markdown, and even raw page images without OCR. 100% open source. MIT license.
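The navigation loop described above can be sketched in a few lines. This is a hypothetical illustration, not PageIndex's actual code: where PageIndex has an LLM reason over the tree, a simple keyword-overlap score stands in here so the example runs on its own.

```python
# Hypothetical sketch of tree-based retrieval, NOT the real PageIndex code.
# PageIndex uses an LLM to decide which branch to follow; a keyword-overlap
# score stands in for that reasoning step so this example is self-contained.

def score(title: str, query: str) -> int:
    # Count query words appearing in the node title (stand-in for LLM judgment).
    return sum(w in title.lower() for w in query.lower().split())

def navigate(node: dict, query: str) -> dict:
    # Walk the table-of-contents tree, taking the best-scoring child at each
    # level, until a leaf section is reached.
    while node.get("children"):
        node = max(node["children"], key=lambda c: score(c["title"], query))
    return node

toc = {
    "title": "Annual Report",
    "children": [
        {"title": "Business Overview", "children": []},
        {"title": "Financial Statements", "children": [
            {"title": "Balance Sheet", "children": []},
            {"title": "Income Statement", "children": []},
        ]},
    ],
}

hit = navigate(toc, "income statement revenue")
print(hit["title"])  # -> Income Statement
```

The point of the sketch is the control flow: retrieval is a descent through document structure rather than a nearest-neighbor lookup over chunks.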
[image] · 51 replies · 103 reposts · 742 likes · 61.5K views
i5ting @i5ting:
Problems I'm focused on right now, and some open-source projects:
1. AI developing in parallel inside a monorepo: web3infra-foundation/mega
2. Multi-instance agents and how to isolate them (Docker is too slow): vm0-ai/vm0
3. Auditing and replaying prompts etc. based on git commit history: entireio/cli
4. Some new approaches, e.g., RAG without vector embeddings: VectifyAI/PageIndex
2 replies · 3 reposts · 75 likes · 8.7K views
Rohan Paul @rohanpaul_ai:
Yann LeCun's (@ylecun) new paper, along with other top researchers, proposes a brilliant idea. 🎯 It says that chasing general AI is a mistake and we must build superhuman adaptable specialists instead.

The whole AI industry is obsessed with building machines that can do absolutely everything humans can do. But this goal is fundamentally flawed because humans are actually highly specialized creatures optimized only for physical survival.

Instead of trying to force one giant model to master every possible task from folding laundry to predicting protein structures, they suggest building expert systems that learn generic knowledge through self-supervised methods. By using internal world models to understand how things work, these specialized systems can quickly adapt to solve complex problems that human brains simply cannot handle. This shift means we can stop wasting computing power on human traits and focus on building diverse tools that actually solve hard real-world problems.

So overall, the researchers propose a new target called Superhuman Adaptable Intelligence, which focuses strictly on how fast a system learns new skills.

The paper explicitly argues that evolution shaped human intelligence strictly as a specialized tool for physical survival. The researchers state that nature optimized our brains specifically for tasks necessary to stay alive in the physical world. They explain that abilities like walking or seeing seem incredibly general to us only because they are absolutely critical for our existence.

The authors point out that humans are actually terrible at cognitive tasks outside this evolutionary comfort zone, like calculating massive mathematical probabilities. The study highlights how a chess grandmaster only looks intelligent compared to other humans, while modern computers easily crush those human limits.

This supports their central point that humanity suffers from an illusion of generality simply because we cannot perceive our own biological blind spots. They conclude that building machines to mimic this narrow human survival toolkit is a deeply flawed way to create advanced technology.
[image]

Quoted tweet from Rohan Paul @rohanpaul_ai:

Yann LeCun (@ylecun ) explains why LLMs are so limited in terms of real-world intelligence. Says the biggest LLM is trained on about 30 trillion words, which is roughly 10 to the power 14 bytes of text. That sounds huge, but a 4 year old who has been awake about 16,000 hours has also taken in about 10 to the power 14 bytes through the eyes alone. So a small child has already seen as much raw data as the largest LLM has read. But the child’s data is visual, continuous, noisy, and tied to actions: gravity, objects falling, hands grabbing, people moving, cause and effect. From this, the child builds an internal “world model” and intuitive physics, and can learn new tasks like loading a dishwasher from a handful of demonstrations. LLMs only see disconnected text and are trained just to predict the next token. So they get very good at symbol patterns, exams, and code, but they lack grounded physical understanding, real common sense, and efficient learning from a few messy real-world experiences. --- From 'Pioneer Works' YT channel (link in comment)

118 replies · 315 reposts · 1.6K likes · 208.3K views
Nav Toor @heynavtoor:
Zhipu AI just dropped a 744 billion parameter model and nobody is talking about it. It's called GLM-5. Built by Zhipu AI and Tsinghua University. And it doesn't just match Claude Opus 4.5 and GPT-5.2 on benchmarks. It matches them in ways that expose what's actually happening in the AI race right now. Let me explain why this paper matters more than the leaderboard scores.

The subtitle tells you everything: "From Vibe Coding to Agentic Engineering." That's not marketing. That's a philosophical claim about where AI development is heading. Vibe coding is when a human prompts an AI to write code. Agentic engineering is when the AI writes the code itself. It plans. It implements. It iterates. It debugs. It ships. GLM-5 is built for the second world.

Here are the raw numbers: 744B total parameters. 40B active at any time. Trained on 28.5 trillion tokens. That's double the size of their previous model GLM-4.5 (355B total, 32B active). Context window pushed to 200K tokens. First open-weights model to score 50 on the Artificial Analysis Intelligence Index v4.0. The previous version scored 42; that's an 8-point jump in a single generation. Number 1 open model on LMArena Text Arena. Number 1 open model on LMArena Code Arena. Comparable to Claude Opus 4.5 and GPT-5.2 overall. On SWE-bench Verified, it beats Gemini 3 Pro. On SWE-bench Multilingual, it beats both Gemini 3 Pro and GPT-5.2. On BrowseComp, it achieves state-of-the-art among ALL frontier models, open and closed, in both English and Chinese.

But here's where the paper gets interesting. Not in what GLM-5 can do. In HOW they built it. Three technical innovations the industry should be paying attention to.

First: DeepSeek Sparse Attention. Traditional attention is O(L²). At 128K context, that becomes prohibitively expensive. DSA replaces dense attention with a dynamic selection mechanism. Instead of attending to everything, the model looks at the content to decide which tokens actually matter. Not a fixed sliding window. Not a static pattern. Content-aware sparsity. The result: 90% of attention entries in long contexts are redundant, and DSA cuts attention computation by 1.5 to 2x for long sequences.

And here's what makes it elegant. They didn't train from scratch. They adapted it via continued pre-training from their dense base model. Two stages: a 1000-step warmup training only the indexer, then 20B tokens of joint training. That's a fraction of the cost DeepSeek spent (943.7B tokens), and they matched the original model's performance. They proved this empirically: they fine-tuned both the DSA and the original MLA models with identical SFT data. Same training loss. Same evaluation benchmarks. The sparse model lost nothing.

Second: fully asynchronous reinforcement learning for agents. This is the infrastructure innovation nobody is talking about. Standard synchronous RL is crippled by long-horizon agent tasks. An agent needs to browse the web, write code, execute it, check results, iterate. That takes minutes per trajectory. During that time, GPUs sit idle.

Zhipu's solution decouples the inference engine from the training engine entirely. The inference engine generates trajectories continuously. Once enough trajectories accumulate, they are batch-sent to the training engine. Model weights sync back periodically.

But asynchronous RL introduces a brutal problem: off-policy drift. Different trajectories get generated by different versions of the model. Their solution is a token-in-token-out gateway that preserves exact action-level correspondence between what was sampled and what is optimized. No re-tokenization. No boundary mismatches. No lossy text round-trips. They also developed double-sided importance sampling with token-level clipping that controls off-policy bias without tracking historical policy checkpoints. This is the kind of systems engineering that separates models that work on benchmarks from models that work in production.

Third: the multi-stage RL pipeline. Not one RL phase. Four sequential stages: Reasoning RL, then Agentic RL, then General RL, then cross-stage distillation. Each stage optimizes for different capabilities. Reasoning RL covers math, science, code, and tool-integrated reasoning. Agentic RL handles software engineering, terminal tasks, and multi-hop search. General RL optimizes for human-style alignment across three dimensions: foundational correctness, emotional intelligence, and task-specific quality. The final distillation stage uses the checkpoints from all previous stages as teachers, preventing catastrophic forgetting. The model retains its reasoning edge while becoming a robust generalist.

Here's the part most people will miss: this model runs natively on seven different Chinese chip platforms. Huawei Ascend. Moore Threads. Hygon. Cambricon. Kunlunxin. MetaX. Enflame. From day one. That's not an afterthought. That's a strategic decision. They developed custom fusion kernels for sparse attention on Ascend NPUs. They implemented W4A8 mixed-precision quantization to fit 744B parameters onto a single Chinese node. They claim performance comparable to dual-GPU international clusters while cutting deployment costs by 50% for long-sequence scenarios. This is what chip independence looks like in practice. Not announcements. Not roadmaps. A 744B parameter frontier model running in production on domestic hardware.

And then there's the Easter egg. Before revealing GLM-5, Zhipu released it anonymously on OpenRouter under the codename "Pony Alpha." No brand name. No hype. Just the model. Within days it became a sensation. Developers noticed its exceptional performance in coding tasks, agentic workflows, and roleplay. The community speculated wildly: 25% guessed it was Claude Sonnet 5. 20% guessed DeepSeek. 10% guessed Grok. It was none of them. It was GLM-5, from a Chinese lab most Western developers had never heard of.

The paper's own words: "The eventual confirmation that it was indeed our GLM-5 was a profound moment for us, effectively silencing doubts about whether Chinese LLMs could compete at the frontier level."

That's the real story here. Not the benchmark scores. The fact that when stripped of branding, a Chinese open-weights model was indistinguishable from the world's best proprietary systems. The gap between open and closed models isn't narrowing. It's collapsing. And the gap between Western and Chinese AI labs isn't what anyone assumed it was.

GLM-5 is open-weights. Apache 2.0. Code, models, and weights all available.
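The content-aware sparsity idea in the thread can be illustrated with a toy top-k attention pass in NumPy. This is an assumption-laden sketch, not the DSA kernel: a real implementation uses a cheap learned indexer precisely so it never forms the dense score matrix, while this toy forms it only to make the selection step visible.

```python
import numpy as np

# Toy illustration of content-aware sparse attention: each query attends only
# to its top-k most relevant keys instead of all L positions. NOT the actual
# DSA implementation; real systems avoid materializing the dense score matrix.

def sparse_attention(q, k, v, top_k=4):
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)               # (Lq, Lk) relevance scores
    # Mask everything except each query's top_k keys to -inf.
    drop = np.argsort(scores, axis=-1)[:, :-top_k]
    masked = scores.copy()
    np.put_along_axis(masked, drop, -np.inf, axis=-1)
    # Softmax over the surviving keys only.
    w = np.exp(masked - masked.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 16))
k = rng.standard_normal((8, 16))
v = rng.standard_normal((8, 16))
out = sparse_attention(q, k, v, top_k=4)
print(out.shape)  # (8, 16)
```

The selection depends on the query/key contents themselves, which is what distinguishes this from a fixed sliding window or static sparsity pattern.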
[image] · 21 replies · 28 reposts · 126 likes · 17K views
Rimsha Bhardwaj @heyrimsha:
🚨 Holy shit… Google published one of the cleanest demonstrations of real multi-agent intelligence I've seen so far. Not another "look, two chatbots are talking" demo. An actual framework for how agents can infer who they're interacting with and adapt on the fly.

The paper is "Multi-agent cooperation through in-context co-player inference."

The core idea is deceptively simple: in multi-agent environments, performance doesn't just depend on the task. It depends on who you're paired with. Most current systems ignore this. They optimize against an average opponent. Or assume fixed partner behavior. Or hard-code roles.

Google does something smarter. They let the model infer its co-player's strategy directly from the interaction history inside the context window. No retraining, no separate belief model, and no explicit opponent classifier. Just in-context inference. The agent observes a few rounds of behavior. Forms an implicit hypothesis about its partner's type. Then updates its own strategy accordingly. This turns static policies into adaptive ones.

The experiments are structured around cooperative and social dilemma games where partner types vary: some partners are fully cooperative. Some are selfish. Some are stochastic. Some strategically defect. Agents without co-player inference treat all partners the same. Agents with inference adjust. And the performance gap is significant.

What makes this paper uncomfortable for a lot of current "multi-agent" hype is how clearly it shows what real coordination requires. First, coordination is not just communication. It's modeling the incentives and likely actions of others. Second, robustness matters. An agent that cooperates blindly gets exploited. An agent that defects blindly loses cooperative gains. The system must dynamically balance trust and caution. Third, adaptation must happen at inference time. In real deployments, you cannot retrain every time the population changes.

The most interesting part is that this capability emerges purely from structured context. The model isn't fine-tuned to classify opponent types explicitly. It uses behavioral traces embedded in the prompt to infer latent strategy. That's belief modeling through language. And it scales.

Think about where this matters outside toy games: autonomous trading systems reacting to different market participants. Negotiation agents interacting with unpredictable humans. Distributed AI workflows coordinating across departments. Swarm robotics where teammate reliability varies. In all these settings, static competence is not enough. Strategic awareness is the bottleneck.

The deeper shift is philosophical. We've been treating LLM agents as isolated optimizers. This paper moves us toward agents that reason about other agents reasoning about them. That's recursive modeling. And once that loop becomes stable, you no longer have "a chatbot." You have a participant in a strategic ecosystem.

The takeaway isn't that multi-agent AI is solved. It's that most current systems aren't even attempting the hard part. Real multi-agent intelligence isn't multiple prompts in parallel. It's adaptive belief formation under uncertainty. And this paper is one of the first clean proofs that large models can do that using nothing but context.

Paper: Multi-agent cooperation through in-context co-player inference
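The adaptive loop the thread describes (observe a few rounds, infer the partner's type, adjust) can be sketched as a hand-rolled toy policy in a repeated cooperation game. The paper performs this inference inside an LLM's context window; the function names and threshold here are illustrative assumptions, not the paper's method.

```python
# Toy sketch of co-player inference in a repeated cooperation game.
# The real paper does this inference in-context inside an LLM; this
# hand-rolled version only illustrates the observe -> infer -> adapt loop.

def infer_partner_type(history, threshold=0.5):
    # history: the partner's past moves, "C" (cooperate) or "D" (defect).
    if not history:
        return "unknown"
    coop_rate = history.count("C") / len(history)
    return "cooperative" if coop_rate >= threshold else "selfish"

def choose_action(history):
    # Cooperate with cooperative partners, defend against selfish ones;
    # open with cooperation when there is no evidence yet.
    kind = infer_partner_type(history)
    return "D" if kind == "selfish" else "C"

print(choose_action([]))                    # C  (no evidence: cooperate)
print(choose_action(["C", "C", "D", "C"]))  # C  (mostly cooperative partner)
print(choose_action(["D", "D", "D", "C"]))  # D  (mostly defecting partner)
```

An agent with a fixed policy would play the same move against all three histories; the inference step is what makes the policy adaptive.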
[image] · 35 replies · 81 reposts · 518 likes · 36.9K views
kajuKatli_Kavya @KavyaKapoor420:
RAG without vectors? Yes. 𝗩𝗲𝗰𝘁𝗼𝗿𝗹𝗲𝘀𝘀 𝗥𝗔𝗚 is a real thing, and it's surprisingly effective. Today I explored 𝗣𝗮𝗴𝗲𝗜𝗻𝗱𝗲𝘅, a reasoning-based, vectorless RAG framework that transforms documents into a tree-like structure of nodes and subnodes.
[images] · 3 replies · 0 reposts · 3 likes · 424 views
Damien Noir @damienoir:
Most AI search tools are just fancy ctrl+F. They find text that looks similar. They don't actually think about what you're asking. Reasoning-based RAG changes everything.
[image] · 2 replies · 1 repost · 1 like · 71 views
Venkat @Venkatpachalaa:
Found a great way to reduce retrieval drift and hallucination: blending vector search (recall) with structure-based reasoning (precision) keeps answers tighter to the query. So basically, AI systems are won in architecture, not prompts. github.com/VectifyAI/Page…
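That recall-then-precision split can be sketched under heavy assumptions: bag-of-words counts stand in for real embeddings, and a keyword match on section paths stands in for structure-based LLM reasoning.

```python
import math

# Toy sketch of a hybrid retriever: a cheap similarity pass for recall,
# then a structure-aware re-rank for precision. Bag-of-words counts replace
# real embeddings here; this is an illustration, not a production design.

def embed(text):
    vec = {}
    for w in text.lower().split():
        vec[w] = vec.get(w, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

sections = [
    {"path": "Report > Overview", "text": "company overview and strategy"},
    {"path": "Report > Financials > Revenue", "text": "revenue grew this year"},
    {"path": "Report > Risks", "text": "competition and market risks"},
]

def retrieve(query, top_n=2):
    qv = embed(query)
    # Recall: keep the top_n most similar sections.
    pool = sorted(sections, key=lambda s: cosine(qv, embed(s["text"])),
                  reverse=True)[:top_n]
    # Precision: prefer the candidate whose section path also matches the query.
    return max(pool, key=lambda s: sum(w in s["path"].lower()
                                       for w in query.lower().split()))

print(retrieve("revenue growth")["path"])  # Report > Financials > Revenue
```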
1 reply · 1 repost · 4 likes · 72 views
Abhay @abhxy03:
Wake up babe, a better alternative to vector databases is here!

PageIndex is a vectorless Retrieval-Augmented Generation (RAG) framework that replaces traditional vector embeddings and document chunking with a hierarchical tree index for precise, reasoning-based retrieval.

Core concept: it builds a "tree index" mimicking a table of contents, where documents are organized into multi-level nodes (e.g., chapters → sections → paragraphs) with LLM-generated summaries at each level. This preserves original structure and enables agentic navigation by large language models, simulating human-like document exploration.

How it works: the process starts by parsing a document into a JSON-based tree structure, linking each node to raw content like text or tables. During queries, an LLM reasons over this in-context index to select relevant paths, retrieving exact sections without similarity searches.

Key advantages:
• No vectors or databases: avoids embedding costs and storage; retrieval is deterministic and auditable.
• High accuracy: reports 98.7% on benchmarks like FinanceBench for complex docs (e.g., legal or financial reports).
• Explainable: traces retrieval paths, unlike opaque vector matches.
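A hypothetical example of what such a JSON tree node might look like, and how it could be flattened into the in-context index the LLM reasons over. The field names (`title`, `summary`, `pages`, `children`) are assumptions for illustration, not PageIndex's actual schema.

```python
# Illustrative node schema for a PageIndex-style tree index. The field names
# are assumptions for this sketch, not the library's real JSON format.

tree = {
    "title": "10-K Filing",
    "summary": "Annual report covering business, risks, and financials.",
    "pages": [1, 120],
    "children": [
        {"title": "Risk Factors",
         "summary": "Key risks to the business.",
         "pages": [10, 25], "children": []},
        {"title": "Financial Statements",
         "summary": "Audited statements and notes.",
         "pages": [60, 110], "children": []},
    ],
}

def render_index(node, depth=0):
    # Flatten the tree into the textual index an LLM would see in-context:
    # one line per node, indented by depth, with page range and summary.
    line = (f'{"  " * depth}- {node["title"]} '
            f'(pp. {node["pages"][0]}-{node["pages"][1]}): {node["summary"]}')
    lines = [line]
    for child in node["children"]:
        lines.extend(render_index(child, depth + 1))
    return lines

print("\n".join(render_index(tree)))
```

Because each node carries a page range, a retrieval decision maps directly back to specific pages, which is where the traceability claim comes from.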
[image] · 2 replies · 0 reposts · 9 likes · 130 views
PageIndex @PageIndexAI:
@hackernoon Thanks for sharing PageIndex! We are betting on "Reasoning-as-Retrieval" with PageIndex: the LLM agentically reasons over a document tree index to find the right context. No vector DB, no chunking. github.com/VectifyAI/Page…
0 replies · 0 reposts · 0 likes · 41 views
Yuri Quintana, PhD, FACMI @yuriquintana:
A new “vectorless RAG” approach, PageIndex, replaces chunking & embeddings with a hierarchical table-of-contents tree. The model navigates sections step-by-step; it reports 98.7% accuracy on FinanceBench, outperforming traditional RAG. Tradeoff: higher cost & latency. #AIResearch pageindex.ai/blog/Mafin2.5
1 reply · 0 reposts · 1 like · 78 views
PageIndex @PageIndexAI:
@oyik_ai Thanks for sharing PageIndex! We are betting on "Reasoning-as-Retrieval" with PageIndex: the LLM agentically reasons over a document tree index to find the right context. No vector DB, no chunking. github.com/VectifyAI/Page…
0 replies · 0 reposts · 0 likes · 23 views
Oyik.ai @oyik_ai:
RAG is broken for long docs. Chunk → embed → vector DB → pray you didn't miss the footnote on page 247. PageIndex just open-sourced a fix: hierarchical tree over your full PDF, LLM-guided search, no vector DB needed. It hit GitHub Top 10 trending. Worth watching. #AI #RAG #LLM
[image] · 3 replies · 0 reposts · 4 likes · 139 views
Eddy_Job @Eddy_Kerario:
Most AI tools search documents by matching patterns (vector matching). PageIndex does something smarter: it reads the document, builds a smart outline, then thinks through it step by step to find your answer. Like a human expert flipping through a report. We finally have something that addresses the drawbacks of RAG systems. 98.7% accuracy. Open source. No cost. 🔗 github.com/VectifyAI/Page…
1 reply · 1 repost · 3 likes · 47 views
Hadj Hadji @elhadjx__:
🚨 Vector DBs are dead. PageIndex just killed them.

Instead of mangling your docs into floating-point soup, PageIndex preserves full document structure: sections, hierarchy, context, exactly how an LLM needs to read it.

✅ PROS:
- No chunking artifacts destroying your context
- Document structure fully preserved
- Retrieval is coherent, not fragmented
- Dramatically fewer hallucinations in prod
- Simpler pipeline, less infra to maintain

❌ CONS:
- Higher token usage per retrieval (you're pulling pages, not snippets)
- Latency can spike on large documents
- Not ideal for massive corpora at scale (yet)
- Smaller ecosystem vs. mature vector DB tooling

We've been duct-taping embeddings together for 3 years. It's over. #RAG #LLM #VectorDB #AIEngineering
[image] · 1 reply · 0 reposts · 1 like · 43 views
PageIndex @PageIndexAI:
@techwith_ram Thanks for sharing PageIndex! We are betting on "Reasoning-as-Retrieval" with PageIndex: the LLM agentically reasons over a document tree index to find the right context. No vector DB, no chunking. github.com/VectifyAI/Page…
0 replies · 0 reposts · 0 likes · 20 views
𝗿𝗮𝗺𝗮𝗸𝗿𝘂𝘀𝗵𝗻𝗮— 𝗲/𝗮𝗰𝗰
Vector database is not needed anymore for RAG?

𝗣𝗮𝗴𝗲𝗜𝗻𝗱𝗲𝘅: 𝗗𝗼𝗰𝘂𝗺𝗲𝗻𝘁 𝗜𝗻𝗱𝗲𝘅 𝗳𝗼𝗿 𝗩𝗲𝗰𝘁𝗼𝗿𝗹𝗲𝘀𝘀, 𝗥𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴-𝗯𝗮𝘀𝗲𝗱 𝗥𝗔𝗚

Repo: github.com/VectifyAI/Page…

𝗖𝗼𝗿𝗲 𝗙𝗲𝗮𝘁𝘂𝗿𝗲𝘀: compared to traditional vector-based RAG, PageIndex features:
- No vector DB: uses document structure and LLM reasoning for retrieval instead of vector similarity search.
- No chunking: documents are organized into natural sections, not artificial chunks.
- Human-like retrieval: simulates how human experts navigate and extract knowledge from complex documents.
- Better explainability and traceability: retrieval is based on reasoning, so it is traceable and interpretable, with page and section references. No more opaque, approximate vector search ("vibe retrieval").
[image] · 1 reply · 14 reposts · 64 likes · 2.9K views
PageIndex @PageIndexAI:
Thanks for sharing PageIndex! We are betting on Reasoning-as-Retrieval with PageIndex: the LLM agentically reasons over a document tree index to find the right context — no vector DB, no chunking. github.com/VectifyAI/Page…
0 replies · 0 reposts · 3 likes · 69 views
Tech with Mak @techNmak:
Chunking is the original sin of RAG.

You take a beautifully structured document. Slice it into arbitrary 512-token pieces. Destroy all context. Then wonder why retrieval is bad.

PageIndex doesn't chunk. Documents stay organized in natural sections. Hierarchy preserved. Context intact. Instead of similarity search over chunks, it uses reasoning over structure.

→ Build a tree index (like a smart table of contents)
→ Navigate with LLM reasoning
→ Find relevant sections through tree search

98.7% accuracy on FinanceBench. No vectors. No chunks. No destroyed context. 18.2K stars. Worth a look. GitHub repo in comments.
[image] · 31 replies · 66 reposts · 600 likes · 36.5K views
TrendSpider @TrendSpider:
$NVDA THE WORLD IS SAVED
[image] · 55 replies · 80 reposts · 1K likes · 163.9K views