Alex C-G

7K posts

@alexcg

Open Source Evangelist and tech content writer at @jinaai_. he/him.

Berlin · Joined May 2007
1.8K Following · 1.8K Followers
Alex C-G reposted
Jina AI @JinaAI_
Heard you like GGUFs and MLX. Our newly released listwise reranker, jina-reranker-v3, is now available in dynamic quantized GGUFs and MLX. Check out our 🤗 collection for the weights and arxiv report: huggingface.co/collections/ji…
Alex C-G reposted
Jina AI @JinaAI_
Last but not late: jina-reranker-v3 is here! A new 0.6B-parameter listwise reranker that puts the query and all candidate documents in one context window, and achieves SOTA on BEIR. We call this new query-document interaction "last but not late." It's "last" because <|doc_emb|> is placed as the final token of each document for embedding extraction. It's "not late" because, unlike late-interaction models such as ColBERT that encode documents separately before multi-vector matching, we enable query-document and document-document interactions early in the forward pass.
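The "last but not late" scheme described above can be sketched in a few lines. This is an illustration only, not the model's actual code: the special-token string, token IDs, and hidden states below are all toy stand-ins.

```python
import numpy as np

DOC_EMB = "<|doc_emb|>"  # illustrative special-token string

def build_listwise_input(query: str, docs: list[str]) -> str:
    """Pack the query and ALL candidate documents into one context,
    terminating each document with the embedding marker ("last")."""
    return query + "".join(f" {d} {DOC_EMB}" for d in docs)

def extract_doc_embeddings(hidden: np.ndarray, token_ids: list[int],
                           doc_emb_id: int) -> np.ndarray:
    """Gather the hidden state at every <|doc_emb|> position.
    Because `hidden` comes from ONE forward pass over the packed input,
    each document's vector has already attended to the query and to the
    other documents ("not late")."""
    positions = [i for i, t in enumerate(token_ids) if t == doc_emb_id]
    return hidden[positions]

# Toy demo: a 6-token sequence where token id 9 marks three documents.
hidden = np.arange(24, dtype=float).reshape(6, 4)
token_ids = [5, 9, 7, 9, 3, 9]
embs = extract_doc_embeddings(hidden, token_ids, doc_emb_id=9)
print(embs.shape)  # (3, 4): one vector per candidate document
```

Scoring would then compare the query vector against these per-document vectors; the key point is that the extraction happens at the last token of each document inside a shared context.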
Alex C-G reposted
Jina AI @JinaAI_
Today we're releasing jina-code-embeddings, a new suite of code embedding models in two sizes—0.5B and 1.5B parameters—along with 1- to 4-bit GGUF quantizations for both. Built on the latest code-generation LLMs, these models achieve SOTA retrieval performance despite their compact size. They support over 15 programming languages and 5 tasks: nl2code, code2code, code2nl, code2completions, and qa.
Alex C-G reposted
Jina AI @JinaAI_
Got a Mac with an M-chip? You can now train Gemma3 270m locally as a multilingual embedding or reranker model using our mlx-retrieval project. Training runs at 4,000 tokens/s on an M3 Ultra - that's actually usable speed. We've implemented the standard practices for training an effective decoder-only embedding or reranker model with MLX: full/partial LoRA, InfoNCE, gradient accumulation, and a streaming data loader. Plus MTEB integration for train-evaluation loops.
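The InfoNCE objective mentioned above is the standard contrastive loss for training embedding models: each query's positive is its paired document, and the other documents in the batch serve as negatives. A NumPy sketch (a stand-in for the actual MLX training code; the temperature value is illustrative):

```python
import numpy as np

def info_nce(query_emb: np.ndarray, doc_emb: np.ndarray,
             temperature: float = 0.05) -> float:
    """In-batch InfoNCE. Row i of `query_emb` is paired with row i of
    `doc_emb`; all other rows act as in-batch negatives."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    logits = q @ d.T / temperature                 # [batch, batch] cosine sims
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))     # NLL of the positives

# Perfectly aligned pairs -> near-zero loss; shuffled pairs -> large loss.
print(info_nce(np.eye(4), np.eye(4)) < info_nce(np.eye(4), np.roll(np.eye(4), 1, axis=0)))
```

Gradient accumulation matters here because InfoNCE gets stronger with larger batches (more in-batch negatives), and LoRA keeps the number of trainable parameters small enough for local training.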
Alex C-G reposted
Jina AI @JinaAI_
Two weeks ago, we released jina-embeddings-v4-GGUF with dynamic quantizations. During our experiments, we found interesting things while converting and running GGUF embeddings. Since most of the llama.cpp community focuses on LLMs, we thought it'd be valuable to share this from an embedding provider's perspective. What's particularly relevant is that today's embedding models are almost identical to LLMs - for example, jina-embeddings-v4 is based on Qwen2.5-VL-3B-instruct and jina-reranker-m0 is based on Qwen2-VL-2B. The only real difference is the output: LLMs are generative, while embeddings and rerankers are discriminative.
Alex C-G reposted
Jina AI @JinaAI_
Our official MCP server, with read, search, embed, and rerank tools, is live at mcp[at]jina[at]ai. We've optimized embedding and reranker usage particularly for context engineering for LLMs.
Alex C-G reposted
Michael Günther @michael_g_u
Resolution is important for image embeddings - especially for visual document retrieval. jina-embeddings-v4 supports inputs up to 16+ MP (the default is much lower). We wrote a blog post about how resolution affects performance across benchmarks jina.ai/news/how-image…
Alex C-G reposted
Jina AI @JinaAI_
New benchmark drops: JinaVDR (Visual Document Retrieval) evaluates how well retrieval models handle real-world visual documents on 95 tasks in 20 languages—think layouts packed with graphs, charts, tables, text, images. We're talking scanned docs, screenshots, the works. JinaVDR pairs them with targeted text queries, enabling comprehensive evaluation of retrieval performance across real-world document complexity and broader domain coverage.
Alex C-G reposted
Jina AI @JinaAI_
jina-embeddings-v4-GGUF is here with different quantizations github.com/jina-ai/jina-e… Unsloth-like dynamic quants are on the way.
Alex C-G reposted
Jina AI @JinaAI_
Context engineering is curating the most relevant information to pack the context window just right. Text selection and passage reranking are integral components of it. In part 2 of our Submodularity Series, we show that both text selection and passage reranking are amenable to submodular optimization, which provides rigorous solutions. If you're unfamiliar with submodular functions, think "diminishing returns": we start with an empty set and incrementally add selected text or passages. Each addition provides value, but the marginal benefit decreases—capturing the intuition that diverse, non-redundant selections are most valuable.
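The "start empty, add the item with the largest marginal gain" procedure above is the classic greedy algorithm for submodular maximization. A toy sketch using the facility-location objective (one common submodular choice for selection problems; this is illustrative, not Jina's implementation):

```python
import numpy as np

def facility_location(selected: list[int], sim: np.ndarray) -> float:
    """Coverage of the corpus by the selected set: each item counts its
    similarity to its closest selected passage. Submodular: adding a
    passage always helps, but helps less as the set grows."""
    if not selected:
        return 0.0
    return float(sim[:, selected].max(axis=1).sum())

def greedy_select(sim: np.ndarray, k: int) -> list[int]:
    """Plain greedy maximization; for monotone submodular objectives this
    carries the classic (1 - 1/e) approximation guarantee."""
    selected: list[int] = []
    for _ in range(k):
        candidates = [j for j in range(sim.shape[0]) if j not in selected]
        gains = [facility_location(selected + [j], sim)
                 - facility_location(selected, sim) for j in candidates]
        selected.append(candidates[int(np.argmax(gains))])
    return selected

# Toy corpus: two near-duplicate pairs. Greedy picks one from each pair
# rather than two redundant near-duplicates - diminishing returns at work.
E = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]])
E /= np.linalg.norm(E, axis=1, keepdims=True)
picked = greedy_select(E @ E.T, k=2)
print(picked)  # one index from {0, 1} and one from {2, 3}
```

Real implementations use lazy evaluation to avoid recomputing all gains each round, but the selection logic is the same.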
Alex C-G reposted
Michael Günther @michael_g_u
We just arrived @SIGIRConf! If you're here, or are interested in an internship at @JinaAI_ working on training the following search foundation models, feel free to reach out to me: - Embedding / Dense Retrieval Models - Rerankers - Small LMs (<2B) for document cleaning, extraction, etc.
Alex C-G reposted
Jina AI @JinaAI_
Submodular optimization for token/sentence selection from long contexts. Here's an interesting experiment: we first used jina-embeddings-v4's multi-vector feature to extract token-level embeddings from a passage, then applied submodular optimization to cherry-pick the tokens that provide the best coverage, and finally used the tokenizer to convert the selections back to strings at their original positions. Think of it as a form of "compression"—you can adjust the top-k slider to dial in different compression rates. Can you still make sense of the compressed text?
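A minimal sketch of the pipeline described above. Everything here is illustrative: real token embeddings would come from the model's multi-vector output, and the "detokenization" step is reduced to joining strings in their original order.

```python
import numpy as np

def compress(tokens: list[str], token_embs: np.ndarray, k: int) -> str:
    """Greedily pick k tokens whose embeddings best cover all token
    embeddings (facility-location gain), then restore the survivors to
    their original positions - a crude token-level 'compression'.
    Assumes rows of `token_embs` are L2-normalized."""
    sim = token_embs @ token_embs.T
    selected: list[int] = []
    covered = np.zeros(len(tokens))          # best similarity seen per token
    for _ in range(k):
        gains = [(np.maximum(covered, sim[:, j]) - covered).sum()
                 if j not in selected else -1.0
                 for j in range(len(tokens))]
        best = int(np.argmax(gains))
        selected.append(best)
        covered = np.maximum(covered, sim[:, best])
    # sort() restores the original token order ("strings at their
    # original positions" in the post).
    return " ".join(tokens[i] for i in sorted(selected))

toks = "the quick brown fox jumps".split()
embs = np.eye(5)  # stand-in for real token-level embeddings
print(compress(toks, embs, 3))  # → 'the quick brown' (identity embeddings: greedy just takes the first k)
```

With real embeddings, near-duplicate tokens get skipped in favor of ones that add coverage, which is what makes the compressed text remain readable at surprisingly aggressive top-k settings.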
Alex C-G reposted
Jina AI @JinaAI_
Many know the importance of diverse query generation in DeepResearch, but few take its implementation seriously. Most DeepResearch implementations simply hardcode "diversity" into the prompt. We show a more rigorous approach to diverse query generation using sentence embeddings and submodular optimization, which can significantly improve the overall quality of DeepResearch systems.
Alex C-G reposted
Saba Sturua @jupyterjazz
I just integrated jina-embeddings-v4 with vLLM, and throughput doubled compared to inference via transformers (tested on Flickr data, 2k text/images). Instructions on the model page: huggingface.co/jinaai/jina-em…
Alex C-G reposted
Jina AI @JinaAI_
The cliché is that quantization hurts performance—that you must trade quality for space. Reality? Skill issue. Learn how we trained quantized versions of jina-embeddings-v4 using Quantization-Aware Training (QAT), where models learn to work with rounding rather than fight against it. QAT integrates quantization constraints directly into training, allowing models to adapt to and compensate for quantization effects. This includes Low-Rank QAT (LR-QAT), which reduces weight and activation redundancy, and efficient frameworks like EfQAT that optimize training to achieve near-full-precision accuracy with lower computational overhead.
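The mechanism at the heart of QAT is the quantize-dequantize ("fake quant") step that runs inside the training forward pass. A minimal sketch (symmetric per-tensor quantization; the actual QAT/LR-QAT setups are more involved):

```python
import numpy as np

def fake_quant(w: np.ndarray, bits: int = 8) -> np.ndarray:
    """Simulate integer quantization: scale, round, clip, de-scale.
    During QAT this runs on every forward pass; the non-differentiable
    round() is treated as identity in the backward pass (the
    straight-through estimator), so the weights gradually move to values
    where rounding costs the least - 'working with rounding rather than
    fighting it'."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(float(np.abs(w).max()) / qmax, 1e-12)
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

# Reconstruction error grows as the bit width shrinks.
w = np.linspace(-1.0, 1.0, 11)
for bits in (8, 4, 2):
    print(bits, float(np.abs(fake_quant(w, bits) - w).max()))
```

In post-training quantization the model never sees this rounding until deployment; in QAT it is part of the loss landscape, which is why the quantized checkpoints can stay close to full-precision quality.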
Alex C-G reposted
Jina AI @JinaAI_
Today we're releasing jina-embeddings-v4, our new 3.8B universal embedding model for retrieving text, images, visual documents, and code. V4 achieves state-of-the-art retrieval performance on multimodal and multilingual tasks across MTEB, MMTEB, CoIR, LongEmbed, STS, Jina-VDR, CLIP, and ViDoRe benchmarks, with particular strength in processing visually rich content such as tables, charts, diagrams, and mixtures of them. The model supports both single-vector and multi-vector embeddings.
Alex C-G reposted
Jina AI @JinaAI_
One interesting question people ask us is: "How do you guys vibe-check your embeddings?" Sure, there's MTEB for more serious quantitative evaluation on public benchmarks, but what do you do for open-domain or new problems? Today we want to share a small internal tool we use for debugging and visualization. You can call it vibe-testing; we call it "Correlations" - and it's now open source on GitHub.
Alex C-G reposted
Jina AI @JinaAI_
Yesterday's bagging & boosting; today's model soups. It involves training multiple models with different hyperparameters or data partitions — the same as you usually would — but then combining them. The result is a better and more robust model than the single best performer. At Jina, we used this technique to train our jina-embeddings-v3 and ReaderLM-v2. For example, multilingual embedding models often suffer from biases and performance failures caused by imbalanced training data. It would be a boon to be able to train the best model we can on each task or dataset individually and then combine them equally.
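The combining step a model soup performs is just a parameter-wise average of checkpoints of the same architecture. A sketch with NumPy arrays standing in for real weight tensors (not Jina's actual training code):

```python
import numpy as np

def uniform_soup(checkpoints: list[dict]) -> dict:
    """Uniform 'model soup': average each named parameter across several
    fine-tuned checkpoints of the same architecture. A greedy soup
    variant instead adds checkpoints one at a time, keeping each only if
    held-out accuracy improves."""
    return {
        name: np.mean([ckpt[name] for ckpt in checkpoints], axis=0)
        for name in checkpoints[0]
    }

# Two toy 'checkpoints' with a single weight matrix each - e.g. the same
# model fine-tuned on two different language partitions.
a = {"linear.weight": np.array([[0.0, 2.0]])}
b = {"linear.weight": np.array([[2.0, 4.0]])}
soup = uniform_soup([a, b])
print(soup["linear.weight"])  # [[1. 3.]]
```

Unlike bagging, which ensembles predictions at inference time, a soup produces a single model with no extra inference cost, which is what makes it attractive for shipping embedding models.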