Nanonets

958 posts

@nanonets

AI-Powered Document Processing and Workflow Automation. Start for Free: https://t.co/IQuUoC12s8

San Francisco, CA · Joined December 2016
2.9K Following · 1.6K Followers
Pinned Tweet
Nanonets@nanonets·
Your LLMs are hungry for data, but documents are messy 😩. DocStrange is the answer! Our open-source solution turns any document into clean, LLM-ready data with one command. Give your models what they need. 🔗 github.com/NanoNets/docst… 🔗 pypi.org/project/docstr…
Nanonets@nanonets·
Nanonets OCR-3 is live. This is the most accurate OCR model in the world currently.
- 87.4 on OLM-OCR (Global #1)
- 85.9 on IDP Leaderboard (Global #1)
- 90.5 on OmniDocBench
OCR-3 also ships with two critical features that foundational models and VLMs miss today: confidence scores and bounding boxes.
Nanonets@nanonets·
Nanonets OCR-3 is the only OCR model you'll need in your agentic stack. The model API exposes five endpoints:
- /parse: structured markdown
- /extract: structured outputs in your schema
- /split: classify or split outputs based on content
- /chunk: context-aware chunks optimized for RAG
- /vqa: grounded answers with bboxes over sources
We've specifically fine-tuned the model on edge cases where OCR repeatedly fails: complex tables, forms, non-trivial layouts.
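The five endpoints named in the tweet can be summarized as a small lookup table. This is only an illustrative sketch: the base URL `https://api.example.com/ocr3` and the helper `endpoint_url` are hypothetical placeholders, and the tweet does not describe the real host, authentication, or request format.

```python
# Endpoint names and descriptions are taken from the tweet.
# BASE_URL and endpoint_url() are hypothetical, not the real Nanonets API.
BASE_URL = "https://api.example.com/ocr3"

ENDPOINTS = {
    "parse": "structured markdown",
    "extract": "structured outputs in your schema",
    "split": "classify or split outputs based on content",
    "chunk": "context-aware chunks optimized for RAG",
    "vqa": "grounded answers with bboxes over sources",
}


def endpoint_url(name: str) -> str:
    """Return the (hypothetical) URL for one of the five endpoints."""
    if name not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {name!r}")
    return f"{BASE_URL}/{name}"


if __name__ == "__main__":
    # Print each endpoint next to the purpose stated in the tweet.
    for name, purpose in ENDPOINTS.items():
        print(f"{endpoint_url(name)}  ->  {purpose}")
```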
Nanonets@nanonets·
With bounding boxes, you get exact coordinates for every extracted element. Use them for:
1. RAG citations
2. Feeding specific document regions to agents
3. Agent observability
With confidence scores, you can measure reliability of every extraction. Pass high-confidence outputs directly, route low-confidence outputs to human review or a larger model. Use them to push your net accuracy to near 100%.
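The routing idea in this tweet can be sketched in a few lines. Everything here is an assumption for illustration: the `Extraction` record, the `(x0, y0, x1, y1)` bounding-box layout, the 0-to-1 confidence scale, and the 0.9 threshold are hypothetical and not taken from the Nanonets API.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    """Hypothetical record for one extracted element."""
    field: str
    value: str
    confidence: float  # assumed to lie in [0, 1]
    bbox: tuple[float, float, float, float]  # assumed (x0, y0, x1, y1)


def route(extractions, threshold=0.9):
    """Split extractions into auto-accepted and needs-review lists."""
    accepted, review = [], []
    for item in extractions:
        if item.confidence >= threshold:
            accepted.append(item)   # pass through directly
        else:
            review.append(item)     # send to a human or a larger model
    return accepted, review


if __name__ == "__main__":
    items = [
        Extraction("total", "42.00", 0.98, (10.0, 20.0, 80.0, 32.0)),
        Extraction("date", "2O24-01-05", 0.41, (10.0, 40.0, 90.0, 52.0)),
    ]
    accepted, review = route(items)
    print("accepted:", [e.field for e in accepted])
    print("review:", [(e.field, e.bbox) for e in review])
```

Keeping the bounding box on each record means a reviewer, or a citation in a RAG answer, can point back to the exact region of the source page.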
syoyo.eth 🌸 レイトラ ® 🐯 8 周年 🎉
I want to run nanonets-ocr-s on a smartphone with vision-language.cpp, so could some talented young VLM folks please come gushing forth soon? 🥺👊
merve@mervenoyann·
a question for y'all: which PDF renderers do you use (other than Docling, SmolDocling and R/OlmOCR)? why do you prefer that over these ones? 👀
Siddharth Dwivedi@Naamhaisidu·
Does anyone have any contacts at Nanonets in Bangalore?
Paul Fadieiev@pavlo_fadieiev·
Just tried the new Nanonets-OCR-s. Works really well!
- Small size: just 3.75B parameters, works on RTX 3060 without quantization.
- Recognizes equations and tables! And tables with equations (!)
- Is multilingual!
- Outputs descriptions of the images
- Outputs in Markdown format
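A rough back-of-the-envelope check of the "no quantization" point, as a sketch only. It assumes 16-bit weights (2 bytes per parameter) and the 12 GB variant of the RTX 3060; the tweet states neither, and activations and the KV cache add overhead that is not counted here.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


PARAMS = 3.75e9        # parameter count quoted in the tweet
GPU_MEMORY_GB = 12.0   # assumption: 12 GB RTX 3060 variant

if __name__ == "__main__":
    needed = weight_memory_gb(PARAMS, bytes_per_param=2)  # assumed 16-bit weights
    print(f"weights: {needed:.1f} GB, headroom: {GPU_MEMORY_GB - needed:.1f} GB")
```

Under these assumptions the weights take about 7.5 GB, which leaves a few gigabytes for activations on a 12 GB card.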
Mi imamo knjigu za vas@kombib·
Nanonets-OCR-s represents a major step forward compared with classic OCR (optical character recognition) tools. While most OCR systems only recognize and transcribe text from images, Nanonets-OCR-s structures documents in a way that is optimized for further processing by large language models (LLMs) such as GPT, Claude, and Gemini.
alexmcaulay@alexmcaulay·
We are testing document parsing engines right now for a major project and going to report back on our findings. We are testing Docling, N8N, MarkITDown, LlamaParse, Mistral, Rossum, Veryfi, Google Document AI, Amazon Textract. Going to give a really good breakdown of everything for you. Anything else we should test?
Sriram@srizzler·
@karthikreddy95 @venky4a @miryalasrikanth Nanonets is the best IDP out there in terms of accuracy and pricing. PS: I've personally evaluated their APIs using Indian regional invoices (the toughest of all)
Srikanth Miryala@miryalasrikanth·
Another advantage of being a doctor: you can rewrite the pills another doctor scribbled down so that your family can understand them.
Antony Barroux@Blogalto·
📄 Nanonets has just released Nanonets-OCR-s, a revolutionary AI model that turns your docs (images, PDFs) into well-structured Markdown. It handles LaTeX equations, complex tables, signatures, and more! Follow this thread to learn more! 👇 #NanonetsOCR #AI #Tech
Nick Levine@status_effects·
@andersonbcdefg @hoanhle_ In my tests I found rolmocr (reducto’s version of olmocr) and nanonets worked best fwiw (much better than marker). Everything whiffs on math expressions despite the latex support
Ben (no treats)@andersonbcdefg·
it's shocking how terrible the current state of OCR is given how many companies are working on doc intelligence AND that we are supposedly "almost at AGI"
Andi Marafioti@andimarafioti·
Nanonets does:
🧮 LaTeX Equation Recognition -> Transforms mathematical equations into perfect LaTeX syntax.
🖼️ Intelligent Image Description -> Automatically describes images with structured tags for smooth LLM processing.
✒️ Signature Detection & Isolation -> Accurately identifies and isolates signatures from other text, streamlining legal and business document handling.
💧 Watermark Extraction -> Extracts watermark text seamlessly, keeping content structured and context clear.
☑️ Smart Checkbox Handling -> Converts checkboxes and radio buttons into standardized Unicode symbols (☐, ☑, ☒) for clarity and consistency.
📊 Complex Table Extraction -> Handles intricate tables, converting them into both markdown and HTML.
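Because checkboxes arrive as plain Unicode symbols, downstream code can read form state with ordinary string handling. The sketch below assumes each checkbox line starts with one of the three symbols followed by its label; that layout, the state names, and the helper `parse_checkboxes` are hypothetical, since the tweet only names the symbols.

```python
# Symbols come from the tweet; the state names are an assumed interpretation.
CHECKBOX_STATES = {"☐": "unchecked", "☑": "checked", "☒": "crossed"}


def parse_checkboxes(markdown: str):
    """Return (label, state) pairs for lines that start with a checkbox symbol."""
    results = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped and stripped[0] in CHECKBOX_STATES:
            label = stripped[1:].strip()
            results.append((label, CHECKBOX_STATES[stripped[0]]))
    return results


if __name__ == "__main__":
    # Hypothetical model output for a short form.
    sample = "Consent form\n☑ I agree to the terms\n☐ Subscribe to updates\n☒ Paper copy"
    for label, state in parse_checkboxes(sample):
        print(f"{state:9} {label}")
```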
Andi Marafioti@andimarafioti·
📢 A new open-source OCR model is breaking the internet: Nanonets-OCR-s! Nanonets understands context and semantic structures, transforming documents into clean, structured markdown. It has an Apache 2.0 license, and the authors compare it to Mistral-OCR 🧵 Let's look closer: