Alex C-G

7K posts

@alexcg

Open Source Evangelist and tech content writer at @jinaai_. he/him.

Berlin · Joined May 2007
1.8K Following · 1.8K Followers
Alex C-G reposted
Jina AI @JinaAI_
Heard you like GGUFs and MLX. Our newly released listwise reranker, jina-reranker-v3, is now available in dynamic quantized GGUFs and MLX. Check out our 🤗 collection for the weights and arxiv report: huggingface.co/collections/ji…
Alex C-G reposted
Jina AI @JinaAI_
Last but not late: jina-reranker-v3 is here! A new 0.6B-parameter listwise reranker that puts the query and all candidate documents in one context window, and achieves SOTA on BEIR. We call this new query-document interaction "last but not late." It's "last" because <|doc_emb|> is placed as the final token of each document for embedding extraction. It's "not late" because, unlike late-interaction models such as ColBERT that encode documents separately before multi-vector matching, we enable query-document and document-document interactions early in the forward pass.
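The "last but not late" scheme described above can be sketched in a few lines. This is an illustration only, not the model's actual code: the special-token string, token IDs, and hidden states below are all toy stand-ins.

```python
import numpy as np

DOC_EMB = "<|doc_emb|>"  # illustrative special-token string

def build_listwise_input(query: str, docs: list[str]) -> str:
    """Pack the query and ALL candidate documents into one context,
    terminating each document with the embedding marker ("last")."""
    return query + "".join(f" {d} {DOC_EMB}" for d in docs)

def extract_doc_embeddings(hidden: np.ndarray, token_ids: list[int],
                           doc_emb_id: int) -> np.ndarray:
    """Gather the hidden state at every <|doc_emb|> position.
    Because `hidden` comes from ONE forward pass over the packed input,
    each document's vector has already attended to the query and to the
    other documents ("not late")."""
    positions = [i for i, t in enumerate(token_ids) if t == doc_emb_id]
    return hidden[positions]

# Toy demo: a 6-token sequence where token id 9 marks three documents.
hidden = np.arange(24, dtype=float).reshape(6, 4)
token_ids = [5, 9, 7, 9, 3, 9]
embs = extract_doc_embeddings(hidden, token_ids, doc_emb_id=9)
print(embs.shape)  # (3, 4): one vector per candidate document
```

Scoring would then compare the query vector against these per-document vectors; the key point is that the extraction happens at the last token of each document inside a shared context.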
Alex C-G reposted
Jina AI @JinaAI_
Today we're releasing jina-code-embeddings, a new suite of code embedding models in two sizes—0.5B and 1.5B parameters—along with 1- to 4-bit GGUF quantizations for both. Built on the latest code-generation LLMs, these models achieve SOTA retrieval performance despite their compact size. They support over 15 programming languages and 5 tasks: nl2code, code2code, code2nl, code2completions, and qa.
Alex C-G reposted
Jina AI @JinaAI_
Got a Mac with an M-chip? You can now train Gemma3 270m locally as a multilingual embedding or reranker model using our mlx-retrieval project. Training runs at 4,000 tokens/s on an M3 Ultra - that's actually usable speed. We've implemented the standard practices for training an effective decoder-only embedding or reranker model with MLX: full/partial LoRA, InfoNCE, gradient accumulation, and a streaming data loader. Plus MTEB integration for train-evaluation loops.
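The InfoNCE objective mentioned above is the standard contrastive loss for training embedding models: each query's positive is its paired document, and the other documents in the batch serve as negatives. A NumPy sketch (a stand-in for the actual MLX training code; the temperature value is illustrative):

```python
import numpy as np

def info_nce(query_emb: np.ndarray, doc_emb: np.ndarray,
             temperature: float = 0.05) -> float:
    """In-batch InfoNCE. Row i of `query_emb` is paired with row i of
    `doc_emb`; all other rows act as in-batch negatives."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    logits = q @ d.T / temperature                 # [batch, batch] cosine sims
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))     # NLL of the positives

# Perfectly aligned pairs -> near-zero loss; shuffled pairs -> large loss.
print(info_nce(np.eye(4), np.eye(4)) < info_nce(np.eye(4), np.roll(np.eye(4), 1, axis=0)))
```

Gradient accumulation matters here because InfoNCE gets stronger with larger batches (more in-batch negatives), and LoRA keeps the number of trainable parameters small enough for local training.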
Alex C-G reposted
Jina AI @JinaAI_
Two weeks ago, we released jina-embeddings-v4-GGUF with dynamic quantizations. During our experiments, we found interesting things while converting and running GGUF embeddings. Since most of the llama.cpp community focuses on LLMs, we thought it'd be valuable to share this from an embedding provider's perspective. What's particularly relevant is that today's embedding models are almost identical to LLMs - for example, jina-embeddings-v4 is based on Qwen2.5-VL-3B-instruct and jina-reranker-m0 is based on Qwen2-VL-2B. The only real difference is the output: LLMs are generative, while embeddings and rerankers are discriminative.
Alex C-G reposted
Jina AI @JinaAI_
Our official MCP server, with read, search, embed, and rerank tools, is live at mcp[at]jina[at]ai. We've optimized embedding and reranker usage particularly for context engineering for LLMs.
Alex C-G reposted
Michael Günther @michael_g_u
Resolution is important for image embeddings - especially for visual document retrieval. jina-embeddings-v4 supports inputs up to 16+ MP (the default is much lower). We wrote a blog post about how resolution affects performance across benchmarks jina.ai/news/how-image…
Alex C-G reposted
Jina AI @JinaAI_
New benchmark drops: JinaVDR (Visual Document Retrieval) evaluates how well retrieval models handle real-world visual documents on 95 tasks in 20 languages—think layouts packed with graphs, charts, tables, text, images. We're talking scanned docs, screenshots, the works. JinaVDR pairs them with targeted text queries, enabling comprehensive evaluation of retrieval performance across real-world document complexity and broader domain coverage.
Alex C-G reposted
Jina AI @JinaAI_
jina-embeddings-v4-GGUF is here with different quantizations github.com/jina-ai/jina-e… Unsloth-like dynamic quants are on the way.
Alex C-G reposted
Jina AI @JinaAI_
Context engineering is curating the most relevant information to pack the context window just right. Text selection and passage reranking are integral components of it. In part 2 of our Submodularity Series, we show that both text selection and passage reranking are amenable to submodular optimization, which provides rigorous solutions. If you're unfamiliar with submodular functions, think "diminishing returns": we start with an empty set and incrementally add selected text or passages. Each addition provides value, but the marginal benefit decreases—capturing the intuition that diverse, non-redundant selections are most valuable.
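The "start empty, add the item with the largest marginal gain" procedure above is the classic greedy algorithm for submodular maximization. A toy sketch using the facility-location objective (one common submodular choice for selection problems; this is illustrative, not Jina's implementation):

```python
import numpy as np

def facility_location(selected: list[int], sim: np.ndarray) -> float:
    """Coverage of the corpus by the selected set: each item counts its
    similarity to its closest selected passage. Submodular: adding a
    passage always helps, but helps less as the set grows."""
    if not selected:
        return 0.0
    return float(sim[:, selected].max(axis=1).sum())

def greedy_select(sim: np.ndarray, k: int) -> list[int]:
    """Plain greedy maximization; for monotone submodular objectives this
    carries the classic (1 - 1/e) approximation guarantee."""
    selected: list[int] = []
    for _ in range(k):
        candidates = [j for j in range(sim.shape[0]) if j not in selected]
        gains = [facility_location(selected + [j], sim)
                 - facility_location(selected, sim) for j in candidates]
        selected.append(candidates[int(np.argmax(gains))])
    return selected

# Toy corpus: two near-duplicate pairs. Greedy picks one from each pair
# rather than two redundant near-duplicates - diminishing returns at work.
E = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]])
E /= np.linalg.norm(E, axis=1, keepdims=True)
picked = greedy_select(E @ E.T, k=2)
print(picked)  # one index from {0, 1} and one from {2, 3}
```

Real implementations use lazy evaluation to avoid recomputing all gains each round, but the selection logic is the same.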
Alex C-G reposted
Michael Günther @michael_g_u
We just arrived @SIGIRConf! If you're here, or are interested in an internship at @JinaAI_ working on training the following search foundation models, feel free to reach out to me: - Embedding / Dense Retrieval Models - Rerankers - Small LMs (<2B) for document cleaning, extraction, etc.
Alex C-G reposted
Jina AI @JinaAI_
Submodular optimization for token/sentence selection from long contexts. Here's an interesting experiment: we first used jina-embeddings-v4's multi-vector feature to extract token-level embeddings from a passage, then applied submodular optimization to cherry-pick the tokens that provide the best coverage, and finally used the tokenizer to convert the selections back to strings at their original positions. Think of it as a form of "compression"—you can adjust the top-k slider to dial in different compression rates. Can you still make sense of the compressed text?
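A minimal sketch of the pipeline described above. Everything here is illustrative: real token embeddings would come from the model's multi-vector output, and the "detokenization" step is reduced to joining strings in their original order.

```python
import numpy as np

def compress(tokens: list[str], token_embs: np.ndarray, k: int) -> str:
    """Greedily pick k tokens whose embeddings best cover all token
    embeddings (facility-location gain), then restore the survivors to
    their original positions - a crude token-level 'compression'.
    Assumes rows of `token_embs` are L2-normalized."""
    sim = token_embs @ token_embs.T
    selected: list[int] = []
    covered = np.zeros(len(tokens))          # best similarity seen per token
    for _ in range(k):
        gains = [(np.maximum(covered, sim[:, j]) - covered).sum()
                 if j not in selected else -1.0
                 for j in range(len(tokens))]
        best = int(np.argmax(gains))
        selected.append(best)
        covered = np.maximum(covered, sim[:, best])
    # sort() restores the original token order ("strings at their
    # original positions" in the post).
    return " ".join(tokens[i] for i in sorted(selected))

toks = "the quick brown fox jumps".split()
embs = np.eye(5)  # stand-in for real token-level embeddings
print(compress(toks, embs, 3))  # → 'the quick brown' (identity embeddings: greedy just takes the first k)
```

With real embeddings, near-duplicate tokens get skipped in favor of ones that add coverage, which is what makes the compressed text remain readable at surprisingly aggressive top-k settings.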
Alex C-G reposted
Jina AI @JinaAI_
Many know the importance of diverse query generation in DeepResearch, but few take its implementation seriously. Most DeepResearch implementations simply hardcode "diversity" into the prompt. We show a more rigorous approach to diverse query generation using sentence embeddings and submodular optimization, which can significantly improve the overall quality of DeepResearch systems.
Alex C-G reposted
Saba Sturua @jupyterjazz
I just integrated jina-embeddings-v4 with vLLM, and throughput doubled compared to inference via transformers (tested on Flickr data, 2k text/images). Instructions on the model page: huggingface.co/jinaai/jina-em…
Alex C-G reposted
Jina AI @JinaAI_
The cliché is that quantization hurts performance—that you must trade quality for space. Reality? Skill issue. Learn how we trained quantized versions of jina-embeddings-v4 using Quantization-Aware Training (QAT), where models learn to work with rounding rather than fight against it. QAT integrates quantization constraints directly into training, allowing models to adapt to and compensate for quantization effects. This includes Low-Rank QAT (LR-QAT), which reduces weight and activation redundancy, and efficient frameworks like EfQAT that optimize training to achieve near-full-precision accuracy with lower computational overhead.
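The mechanism at the heart of QAT is the quantize-dequantize ("fake quant") step that runs inside the training forward pass. A minimal sketch (symmetric per-tensor quantization; the actual QAT/LR-QAT setups are more involved):

```python
import numpy as np

def fake_quant(w: np.ndarray, bits: int = 8) -> np.ndarray:
    """Simulate integer quantization: scale, round, clip, de-scale.
    During QAT this runs on every forward pass; the non-differentiable
    round() is treated as identity in the backward pass (the
    straight-through estimator), so the weights gradually move to values
    where rounding costs the least - 'working with rounding rather than
    fighting it'."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(float(np.abs(w).max()) / qmax, 1e-12)
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

# Reconstruction error grows as the bit width shrinks.
w = np.linspace(-1.0, 1.0, 11)
for bits in (8, 4, 2):
    print(bits, float(np.abs(fake_quant(w, bits) - w).max()))
```

In post-training quantization the model never sees this rounding until deployment; in QAT it is part of the loss landscape, which is why the quantized checkpoints can stay close to full-precision quality.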
Alex C-G reposted
Jina AI @JinaAI_
Today we're releasing jina-embeddings-v4, our new 3.8B universal embedding model for retrieving text, images, visual documents, and code. V4 achieves state-of-the-art retrieval performance on multimodal and multilingual tasks across MTEB, MMTEB, CoIR, LongEmbed, STS, Jina-VDR, CLIP, and ViDoRe benchmarks, with particular strength in processing visually rich content such as tables, charts, diagrams, and mixtures of them. The model supports both single-vector and multi-vector embeddings.
Alex C-G reposted
Jina AI @JinaAI_
One interesting question people ask us is: "How do you guys vibe-check your embeddings?" Sure, there's MTEB for more serious quantitative evaluation on public benchmarks, but what do you do for open-domain or new problems? Today we want to share a small internal tool we use for debugging and visualization. You can call it vibe-testing; we call it "Correlations" - and it's now open source on GitHub.
Alex C-G reposted
Jina AI @JinaAI_
Yesterday's bagging & boosting; today's model soups. It involves training multiple models with different hyperparameters or data partitions — the same as you usually would — but then combining them. The result is a better and more robust model than the single best performer. At Jina, we used this technique to train our jina-embeddings-v3 and ReaderLM-v2. For example, multilingual embedding models often suffer from biases and performance failures caused by imbalanced training data. It would be a boon to be able to train the best model we can on each task or dataset individually and then combine them equally.
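The combining step a model soup performs is just a parameter-wise average of checkpoints of the same architecture. A sketch with NumPy arrays standing in for real weight tensors (not Jina's actual training code):

```python
import numpy as np

def uniform_soup(checkpoints: list[dict]) -> dict:
    """Uniform 'model soup': average each named parameter across several
    fine-tuned checkpoints of the same architecture. A greedy soup
    variant instead adds checkpoints one at a time, keeping each only if
    held-out accuracy improves."""
    return {
        name: np.mean([ckpt[name] for ckpt in checkpoints], axis=0)
        for name in checkpoints[0]
    }

# Two toy 'checkpoints' with a single weight matrix each - e.g. the same
# model fine-tuned on two different language partitions.
a = {"linear.weight": np.array([[0.0, 2.0]])}
b = {"linear.weight": np.array([[2.0, 4.0]])}
soup = uniform_soup([a, b])
print(soup["linear.weight"])  # [[1. 3.]]
```

Unlike bagging, which ensembles predictions at inference time, a soup produces a single model with no extra inference cost, which is what makes it attractive for shipping embedding models.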