Jina AI

2.3K posts

Jina AI

@JinaAI_

Your Search Foundation, Supercharged!

Mountain View, CA · Joined March 2020
1 Following · 16.8K Followers
snav
snav@qorprate·
@JinaAI_ This is great. Quick question: are free API token allocations per month? Or just one off?
1 reply · 0 reposts · 0 likes · 17 views
Jina AI
Jina AI@JinaAI_·
@ChiragCX Oh man we thought Skills were the dead ones
1 reply · 0 reposts · 3 likes · 452 views
Chirag
Chirag@ChiragCX·
@JinaAI_ Amazing! Can you package it up into a skill?
1 reply · 0 reposts · 0 likes · 499 views
Jina AI
Jina AI@JinaAI_·
v5-text uses decoder-only backbones with last-token pooling instead of mean pooling. Four lightweight LoRA adapters are injected at each transformer layer, handling retrieval, text-matching, classification, and clustering independently. Users select the appropriate adapter at inference time. For retrieval, queries get a "Query:" prefix and documents get "Document:". Context length is 32K tokens, a 4x increase over v3.
Jina AI tweet media
1 reply · 0 reposts · 16 likes · 2.1K views
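The two mechanics in the post above (last-token pooling for decoder-only backbones, and "Query:"/"Document:" prefixes at retrieval time) can be sketched in a few lines. This is a toy illustration with made-up shapes and helper names, not the real v5 API:

```python
def last_token_pool(hidden_states):
    """Pool per-token hidden states by taking the last token's vector.
    Decoder-only models accumulate sequence context at the final position,
    unlike mean pooling, which averages all positions."""
    return hidden_states[-1]

def mean_pool(hidden_states):
    """Mean pooling baseline (what earlier encoder-style models used)."""
    dim = len(hidden_states[0])
    n = len(hidden_states)
    return [sum(tok[i] for tok in hidden_states) / n for i in range(dim)]

def format_for_retrieval(text, is_query):
    """Prefix convention described in the post: queries and documents get
    distinct prefixes so the retrieval adapter can treat them asymmetrically."""
    return ("Query: " if is_query else "Document: ") + text

# Tiny demo with 3 tokens of a 2-d hidden state.
states = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
print(last_token_pool(states))                     # [3.0, 4.0]
print(format_for_retrieval("what is AdaLN?", True))  # Query: what is AdaLN?
```

The adapter selection itself (retrieval vs. text-matching vs. classification vs. clustering) would be a runtime flag choosing which LoRA weights are active; it is omitted here.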
Jina AI
Jina AI@JinaAI_·
jina-embeddings-v5-text is here! Our fifth generation of jina embeddings, pushing the quality-efficiency frontier for sub-1B multilingual embeddings. Two versions: small & nano, available today on Elastic Inference Service, vLLM, GGUF and MLX.
GIF
5 replies · 16 reposts · 113 likes · 14K views
Jina AI
Jina AI@JinaAI_·
@tmztmobile It will be a lossy compression, like impressionist lossy
0 replies · 0 reposts · 4 likes · 313 views
Timothy Meade
Timothy Meade@tmztmobile·
@JinaAI_ So can they be used for compression maybe with a few extra bits in another field?
1 reply · 0 reposts · 2 likes · 322 views
Jina AI
Jina AI@JinaAI_·
Most don't know (1) how easy it is to invert embedding vectors back into sentences, and (2) that this is a perfect task for text diffusion models. Here's a 78M-parameter model and live demo that recovers 80% of tokens from Qwen3-Embedding and EmbeddingGemma vectors. It works even on multilingual input.
7 replies · 22 reposts · 170 likes · 12.1K views
Jina AI
Jina AI@JinaAI_·
Check out the live demo at embedding-inversion-demo.jina.ai and see it in action. Or read our repo and paper for more technical details on training and decoding.
1 reply · 1 repost · 14 likes · 1.5K views
Jina AI
Jina AI@JinaAI_·
Text embeddings are widely assumed to be safe, irreversible representations. We show we can reconstruct the original text using conditional masked diffusion. Existing inversions (Vec2Text, ALGEN, Zero2Text) generate tokens autoregressively and require iterative re-embedding through the target encoder. We take a different approach: embedding inversion as conditional masked diffusion. Starting from a fully masked sequence, a denoising model reveals tokens at all positions in parallel, conditioned on the target embedding via adaptive layer normalization (AdaLN-Zero). Each denoising step refines all positions simultaneously using global context, without ever re-embedding the current hypothesis.
Jina AI tweet media
1 reply · 1 repost · 15 likes · 1.7K views
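The denoising loop described above (start fully masked, reveal tokens at all positions in parallel, never re-embed the current hypothesis) can be sketched as a toy. The `scorer` below is a stand-in for the trained denoiser conditioned on the target embedding; the commit-half schedule and all names are illustrative assumptions, not the paper's implementation:

```python
import math

MASK = "<mask>"

def denoise_step(tokens, embedding, scorer):
    """One parallel denoising step: propose a token and a confidence for
    every masked position at once (conditioned on the target embedding),
    then commit the most confident half of the proposals."""
    proposals = {
        i: scorer(i, tokens, embedding)  # -> (token, confidence)
        for i, t in enumerate(tokens) if t == MASK
    }
    keep = max(1, math.ceil(len(proposals) / 2))
    best = sorted(proposals.items(), key=lambda kv: -kv[1][1])[:keep]
    out = list(tokens)
    for i, (tok, _) in best:
        out[i] = tok
    return out

def invert(embedding, length, scorer, max_steps=20):
    """Start from a fully masked sequence and iteratively reveal tokens,
    mirroring the conditional-masked-diffusion recipe in the post.
    Note: the embedding is never recomputed from the partial hypothesis."""
    tokens = [MASK] * length
    for _ in range(max_steps):
        if MASK not in tokens:
            break
        tokens = denoise_step(tokens, embedding, scorer)
    return tokens

# Dummy scorer: pretend the embedding literally stores the target tokens.
target = ["the", "cat", "sat", "down"]
scorer = lambda i, toks, emb: (emb[i], 1.0 / (i + 1))
print(invert(target, len(target), scorer))  # reveals all 4 tokens
```

The contrast with Vec2Text-style methods is in `invert`: there is no autoregressive left-to-right generation and no re-embedding of intermediate hypotheses, only repeated parallel refinement.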
Prince Canuma
Prince Canuma@Prince_Canuma·
🚀 mlx-vlm v0.3.10 is here and it's the biggest ever!
New Models:
• LFM2.5-VL by @liquidai
• DeepSeek OCR 2 by @deepseek_ai
• Qwen3-Omni by @Alibaba_Qwen
• Molmo2 by @allen_ai
• Jina VLM by @JinaAI_
• HunyuanOCR by @TencentHunyuan
• PaddleOCR-VL by @PaddlePaddle
• Ernie-4.5-VL by @Baidu_Inc
• GLM-4.6V MoE by @Zai_org
New Features:
⚡ Batch Generation - process multiple prompts in parallel
🗜️ MXFP4, MXFP8 & NVFP4 quantization
📝 Text Prefill + Input Embeddings API
🎯 Structured Outputs
💬 Enhanced Chat UI
🔧 @huggingface Transformers v5 support
Huge thanks to all 12 new contributors! 🙏
> uv pip install -U mlx-vlm
Give us a star: github.com/Blaizzy/mlx-vlm
Prince Canuma tweet media
10 replies · 25 reposts · 211 likes · 15.1K views
Jina AI
Jina AI@JinaAI_·
0.6B params. Top-3 on the MTEB reranking task. 10× smaller than generative listwise rerankers. Read more about this Best Paper at AAAI Frontier IR here: arxiv.org/abs/2509.25085
Jina AI tweet media
0 replies · 1 repost · 33 likes · 1.7K views
Jina AI
Jina AI@JinaAI_·
jina-reranker-v3 was the first listwise reranker to throw all documents into one context window (where traditional rerankers loop over ⟨q,d⟩ pairs) and let them fight it out via self-attention—what we call "last but not late" interaction. Bold or stupid? But not mediocre. Today it won Best Paper at AAAI Frontier IR Workshop.
Jina AI tweet media
1 reply · 14 reposts · 89 likes · 6.6K views
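The difference between listwise and traditional pairwise reranking above is mostly about how the input is packed. This sketch only builds the model inputs; the tag layout and function names are made up for illustration and are not jina-reranker-v3's actual prompt format:

```python
def listwise_input(query, docs):
    """Pack the query and *all* candidate documents into one sequence, so the
    model can compare documents against each other via self-attention in a
    single forward pass (the "last but not late" interaction)."""
    parts = [f"<query> {query}"]
    parts += [f"<doc id={i}> {d}" for i, d in enumerate(docs)]
    return "\n".join(parts)

def pairwise_inputs(query, docs):
    """Traditional cross-encoder style: one <q, d> input per document, scored
    in a loop; documents never see each other."""
    return [f"<query> {query}\n<doc> {d}" for d in docs]

docs = ["jina-reranker-v3 is 0.6B params", "bananas are yellow"]
print(listwise_input("how big is the reranker?", docs))
print(len(pairwise_inputs("how big is the reranker?", docs)))  # 2 forward passes
```

The trade-off: listwise packing needs one long context (all documents must fit), in exchange for a single forward pass and cross-document attention.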
Jina AI
Jina AI@JinaAI_·
Here's a visualization of jina-embeddings-v4 vectors. This concentration collapses exponents to a single dominant value of 127, reducing exponent entropy from 2.6 to 0.03 bits/byte for jina-v4. But exponent compression alone would yield only 1.1x compression. The additional gain comes from the high-order mantissa byte: when angles cluster around π/2 ≈ 1.5708, the IEEE 754 mantissa bits encoding the fractional part also become predictable. Empirically, the high-order mantissa byte entropy drops from 8.0 to 4.5 bits. Together, exponent and mantissa concentration yield the observed 1.5x compression.
Jina AI tweet media
1 reply · 0 reposts · 25 likes · 3.7K views
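The exponent-concentration claim above is easy to verify directly: every angle near π/2 lies in [1.0, 2.0), so every such float32 has biased exponent 127 and an identical leading byte. A minimal check using only the standard library:

```python
import math
import struct

def float32_bytes(v):
    """Big-endian IEEE 754 single-precision (binary32) bytes of v."""
    return struct.pack(">f", v)

# Any value in [1.0, 2.0) has biased exponent 127 (binary 01111111).
# With sign bit 0, the leading byte (sign + top 7 exponent bits) is
# always 0x3f, which is why exponent entropy collapses toward zero
# when angles cluster around pi/2 ~ 1.5708.
angles = [1.45, 1.5708, math.pi / 2, 1.69]
leading = {float32_bytes(a)[0] for a in angles}
print(leading)  # {63}, i.e. 0x3f for every angle
```

The mantissa side of the claim is statistical rather than exact: the fractional bits only become *more predictable* (lower entropy) as angles cluster, so it is not shown as an assertion here.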
Jina AI
Jina AI@JinaAI_·
Convert your embeddings to spherical coordinates before compression. This trick cuts embedding storage from 240 GB to 160 GB, 25% better than the best lossless baseline. Reconstruction is near-lossless (the error stays below float32 machine epsilon), so retrieval quality is preserved perfectly. Works across text, image, and multi-vector embeddings. No training, no codebooks.
Jina AI tweet media
10 replies · 65 reposts · 546 likes · 35K views
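The coordinate change itself is textbook hyperspherical conversion; the compression pipeline around it (byte-stream entropy coding) is the paper's contribution and is not reproduced here. A minimal round-trip sketch of just the conversion, in pure Python:

```python
import math

def to_spherical(x):
    """Convert an n-d vector to (radius, n-1 angles). All angles except the
    last lie in [0, pi]; the last lies in (-pi, pi]."""
    n = len(x)
    r = math.sqrt(sum(v * v for v in x))
    angles = []
    for k in range(n - 2):
        rem = math.sqrt(sum(v * v for v in x[k + 1:]))
        angles.append(math.atan2(rem, x[k]))
    angles.append(math.atan2(x[-1], x[-2]))
    return r, angles

def to_cartesian(r, angles):
    """Invert to_spherical: x_k = r * sin(a_1)...sin(a_k) * cos(a_{k+1})."""
    n = len(angles) + 1
    x = []
    sin_prod = 1.0
    for k in range(n - 1):
        x.append(r * sin_prod * math.cos(angles[k]))
        sin_prod *= math.sin(angles[k])
    x.append(r * sin_prod)
    return x

x = [0.3, -0.5, 0.2, 0.7, -0.1]
r, angles = to_spherical(x)
x_back = to_cartesian(r, angles)
print(max(abs(a - b) for a, b in zip(x, x_back)))  # ~0: round trip is near-lossless
```

For high-dimensional near-uniform vectors the angles concentrate around π/2, which is exactly the distributional skew the compression step then exploits.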
Jina AI
Jina AI@JinaAI_·
@yusuke_m_MU Thanks for the reminder, we have just added server instructions to our MCP server
Jina AI tweet media
1 reply · 0 reposts · 2 likes · 1.3K views
村田悠典 | GenAI-driven development | medical papers | indie developer
Another thing I learned this time: when two MCP servers with similar functionality are available, the agent almost always picks the one whose "Server Instructions" are properly written. In fact, with OpenAI Agent Builder, when given both the official web-search tool and the Jina AI search MCP, no amount of prompting would make it use the Jina MCP. This is probably basic stuff, but surprisingly, even Jina AI's official MCP server's index.ts had no "Server Instructions" written, so if you're building your own MCP server, don't get complacent and double-check yours.
村田悠典 | GenAI-driven development | medical papers | indie developer@yusuke_m_MU

[Good news] MCP on its own might be able to take a position close to Skills. A fix for one of MCP's biggest problems, context-window consumption: the Tool Search Tool has landed in Claude Code. When Claude Code detects that MCP tool descriptions are using more than 10% of the context, it now automatically keeps everything except the needed servers out of context. Incredible.

Once the Tool Search Tool kicks in, tools are no longer preloaded; only the tools judged necessary get loaded via search. And here's the important part: the MCP server's "server instructions". The instructions tell Claude when to perform a tool search, which puts them in a role close to Skills.

People like me who haven't used MCP much tend to write solid tool descriptions but skip the instructions. Honestly, I think that's common. I just went and looked at the index.ts of various remote MCP servers on GitHub, and only about 1 in 5 had an "instructions" field. Since server-wide instructions aren't a required field, everything runs without them, but the agent ends up with no idea what the server is for, so from now on you should absolutely write them.

If you don't know where to write what, the official docs below explain it. Incidentally, how to write instructions in TypeScript isn't in the official docs, but ServerOptions has an instructions property. McpServer takes ServerOptions as its second argument, so pass instructions there and you're set. If you're unsure, hand this text to Claude Code and it will do it for you.

[Docs on writing server instructions] modelcontextprotocol.io/docs/tutorials…
[What Server Instructions are: official blog] blog.modelcontextprotocol.io/posts/2025-11-…

1 reply · 1 repost · 6 likes · 977 views
Jina AI
Jina AI@JinaAI_·
@Alibaba_Qwen Huge congratulations, and thanks for mentioning our m0 and jinaVDR work! 🫡
English
0
0
8
1.7K
Qwen
Qwen@Alibaba_Qwen·
🚀 Introducing Qwen3-VL-Embedding and Qwen3-VL-Reranker – advancing the state of the art in multimodal retrieval and cross-modal understanding!
✨ Highlights:
✅ Built upon the robust Qwen3-VL foundation model
✅ Processes text, images, screenshots, videos, and mixed-modality inputs
✅ Supports 30+ languages
✅ Achieves state-of-the-art performance on multimodal retrieval benchmarks
✅ Open source and available on Hugging Face, GitHub, and ModelScope
✅ API deployment on Alibaba Cloud coming soon!
🎯 Two-stage retrieval architecture:
📊 Embedding Model – generates semantically rich vector representations in a unified embedding space
🎯 Reranker Model – computes fine-grained relevance scores for enhanced retrieval accuracy
🔍 Key application scenarios: image-text retrieval, video search, multimodal RAG, visual question answering, multimodal content clustering, multilingual visual search, and more!
🌟 Developer-friendly capabilities:
• Configurable embedding dimensions
• Task-specific instruction customization
• Embedding quantization support for efficient and cost-effective downstream deployment
Hugging Face: huggingface.co/collections/Qw…
ModelScope: modelscope.cn/collections/Qw…
GitHub: github.com/QwenLM/Qwen3-V…
Blog: qwen.ai/blog?id=qwen3-…
Tech Report: github.com/QwenLM/Qwen3-V…
Qwen tweet media
48 replies · 293 reposts · 1.9K likes · 210K views