
NoAI
3.4K posts

NoAI retweetledi

Introducing a revival of PapersWithCode!
As @ilyasut said, we're back to the "age of research".
Hence, it's important to share research and build on each other's work.
> find SOTA per domain, not just LLMs
> leaderboards
> methods
> all parsed at scale using AI agents.
English
NoAI retweetledi
NoAI retweetledi

𝗧𝗵𝗲 𝗿𝗲𝗰𝗼𝗿𝗱𝗶𝗻𝗴 𝗼𝗳 𝗟𝘂𝗰𝗮𝘀 𝗕𝗲𝘆𝗲𝗿'𝘀 (@giffmana) 𝗹𝗲𝗰𝘁𝘂𝗿𝗲 𝗮𝘁 @ETH 𝗶𝘀 𝗻𝗼𝘄 𝗹𝗶𝘃𝗲 𝗼𝗻 𝗬𝗼𝘂𝗧𝘂𝗯𝗲 𝗳𝗼𝗿 𝗲𝘃𝗲𝗿𝘆𝗼𝗻𝗲 𝘄𝗵𝗼 𝗰𝗼𝘂𝗹𝗱𝗻'𝘁 𝗷𝗼𝗶𝗻 𝘂𝘀 𝗶𝗻 𝗽𝗲𝗿𝘀𝗼𝗻!
This past Monday, we had the pleasure of hosting Lucas (@Meta @AIatMeta Superintelligence Labs) for our "Robot Learning: From Fundamentals to Foundation Models" course. He joined us to talk about: "𝗩𝗶𝘀𝗶𝗼𝗻 𝗶𝗻 𝘁𝗵𝗲 𝗔𝗴𝗲 𝗼𝗳 𝗟𝗟𝗠𝘀".
Drawing from a remarkable track record in computer vision and multimodal AI (𝗩𝗶𝗧, 𝗦𝗶𝗴𝗟𝗜𝗣, 𝗣𝗮𝗹𝗶𝗚𝗲𝗺𝗺𝗮) 🧠, Lucas delivered a masterclass on the frontier of multimodal foundation model training: from pre-training to post-training, where the field stands today, and what comes next 🚀
📽️ YouTube Recording: youtu.be/0XB7fNS_ONg
📚 Course Website: cvg.ethz.ch/lectures/Robot…

YouTube

English
NoAI retweetledi

"The Truth Lies Somewhere in the Middle (of the Generated Tokens)"
In autoregressive language models, mean pooling hidden states across generation yields better representations than any token alone.
project page: sophielwang.com/tokens
w/ @phillip_isola and @thisismyhat
English
NoAI retweetledi

NoAI retweetledi

DeepSeek has encountered a bug that is both strange and concerning.

Xiuyu Li@sheriyuo
BREAKING NEWS: You Can Steal DeepSeek Training Data By Questioning This <|begin▁of▁sentence|> <|sft▁begin|> <think> Remember to close Search #deepseek
English
NoAI retweetledi
NoAI retweetledi

🚨 someone just dropped a full 10-stage academic research pipeline for Claude Code.
It doesn’t write your paper for you, it hunts references, formats citations, verifies data, and even runs a "devil's advocate" agent to attack your own thesis.
Here's why it's a massive deal for academics:
→ Anti-AI Voice: Learns your specific writing style.
→Integrity Gates: Actively hunts down fabricated citations and statistical errors.
→Simulated Peer Review: Runs your draft through a 7-agent panel (including a Devil’s Advocate).
→Cheap: A full 15k-word paper costs ~$4–$6 in API credits.
Best part?
It's 100% free and open-source.
Install in 30s: `/plugin install academic-research-skills`
repo in 🧵↓

English
NoAI retweetledi

Akademisyen dostlarım, lütfen bana kızmayın. Bunu paylaşmak zorundayım.
Mushtaq Bilal, PhD@MushtaqBilalPhD
Türkçe
NoAI retweetledi

Can vision-language models truly see the fine-grained details in images?
Google DeepMind presents TIPSv2.
They boost dense patch-text alignment using three novel tricks: a distillation method where the student outperforms the teacher, an upgraded masked image objective (iBOT++) that also learns from unmasked tokens, and smarter caption sampling with synthetic captions.
Across 9 tasks and 20 datasets (classification, retrieval, segmentation, depth), TIPSv2 matches or beats leading vision encoders.
TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment
Project: gdm-tipsv2.github.io
Paper: arxiv.org/abs/2604.12012
Our report: mp.weixin.qq.com/s/R_Yn6_DytVEE…
📬 #PapersAccepted by Jiqizhixin

English
NoAI retweetledi
NoAI retweetledi

Parsing complex tables in PDFs is extremely challenging.
Existing metrics for measuring table accuracy, like TEDS (tree edit distance similarity), overweight exact table structure and underweight semantic correctness.
🚫 Overweight: If the rows within a table are out of order - even if the semantic meaning is still consistent - then TEDS heavily penalizes these values, even though the downstream AI agent would have no problem interpreting the values.
🚫 Overweight: If the HTML is semantic equivalent but output with different tags (th vs. td), TEDS will penalize
🚫 Underweight: If the header is dropped or transposed, then TEDS mildly penalizes these values, even though the entire semantic meaning of the table is destroyed.
We recently released ParseBench, a comprehensive enterprise document benchmark with a heavy focus on *semantic correctness* for tables.
We define a new metric: TableRecordMatch - which treats tables as a bag of records, where each record is a dictionary of key-value pairs, with keys being the headers and values being the cell values.
We combine it with the GriTS metric (more robust than TEDS) to come up with the final GTRM score.
It’s worth giving our full paper a read if you haven’t already. Also come check out our website hub!
Website: parsebench.ai
Blog: llamaindex.ai/blog/parsebenc…
Paper: arxiv.org/abs/2604.08538…


LlamaIndex 🦙@llama_index
Let's talk parsing tables. Two days ago we launched ParseBench,the first document OCR benchmark built for AI agents. This deep dive breaks down TableRecordMatch (GTRM), our metric for evaluating complex tables the way your pipeline actually consumes them: as records keyed by column headers. llamaindex.ai/blog/parsebenc…
English
NoAI retweetledi
NoAI retweetledi

Can language models explain features learned by vision encoders? #CVPR2026
- Feed a blank image
- Steer a specific feature in the vision encoder
- Ask the language model to explain the image
The model explains the feature itself.
GIF
English
NoAI retweetledi

NVIDIA just released Nemotron OCR v2 on Hugging Face
A production-ready multilingual OCR system with a hybrid
detector-recognizer architecture for text, layout and reading order.
huggingface.co/nvidia/nemotro…
English
NoAI retweetledi

🚨BREAKING: A dev just open-sourced the #1 ranked OCR model on Earth.
It's called GLM-OCR and it just hit 94.62 on OmniDocBench V1.5, beating every OCR model in existence.
Only 0.9B parameters. One pip install. Handles documents no other model could touch.
100% Open Source.

English
NoAI retweetledi
NoAI retweetledi

Nice finetune of Qwen3.5 4B!
It's only missing a comparison to 4x smaller LightOnOCR-2-1B 😉
Vik Paruchuri@VikParuchuri
I'm excited to open source Chandra OCR 2! - 85.9% (sota) on olmocr bench - 90+ language support w/benchmarks - 4B model (down from 9B) - Full layout information - Extracts + captions images and diagrams - Strong handwriting, math, form, table support
English
NoAI retweetledi

Bunch of new open OCR models recently — all available as uv scripts on @huggingface.
19 models from 0.9B–8B. Some standouts:
- Qianfan-OCR - 192 languages
- dots.mocr — charts/figures → editable SVG
- GLM-OCR — 94.6% accuracy, only 0.9B params

English
NoAI retweetledi

Another OCR model just dropped on @huggingface (so many OCRs lately!)
dots.mocr from @xiaohongshu Hi Lab looks really impressive on the benchmarks.
-Model: huggingface.co/collections/re…
-Paper: huggingface.co/papers/2603.13…
✨ 3B
✨ Multilingual support
✨ Converts charts, diagrams, and UI layouts directly into SVG code

English










