George Halal (@halal_george) - Twitter Profile

Pinned Tweet

George Halal @halal_george · Aug 27

Excited to share that we trained rerankers at the cost/performance frontier and are open sourcing them! Contextual AI Reranker v2 🚀 Best performing, most efficient reranker 🤗 Open weights (1B, 2B, 6B) 🫡 Instruction-following (including recency-awareness) 🌐 Multilingual 1/4

George Halal @halal_george · Dec 1

Stop building heuristic-based graphs. Start building adaptable, agentic tools. We break down the full system design in this blog post: contextual.ai/blog/an-agenti… 🧵 4/4

George Halal @halal_george · Dec 1

Our solution: shift the traversal logic to the agent. Extract “metadata”/“aliases” for each chunk at indexing time. During retrieval, the agent dynamically chooses: Search raw text content OR search metadata to hop to a reference? 🧵 3/4

George Halal @halal_george · Dec 1

An agentic alternative to GraphRAG. We built a Metadata Search Tool to solve reference traversal without the rigid complexity of static graphs. The result? Agents resolve complex queries in fewer steps with higher accuracy. 🧵 1/4

George Halal @halal_george · Oct 30

This flexibility is the superpower. You control what to extract: section hierarchies, list of claims, questions the doc answers—whatever fits your use case. GraphRAG locks you into a static workflow. Metadata search adapts to yours. Thanks Jackie Zhang and @sheshanshag for their help on this project! This tool will be available on our platform soon, but contact @ContextualAI for early access.

George Halal @halal_george · Oct 30

Like GraphRAG, we extract structured info from docs at ingestion—each entry becoming a searchable node in the embedding space. Unlike GraphRAG, we skip the heuristic-based graph building and navigation methods, which are often specialized to various domains and show diminishing returns in our ablations. This keeps things fast and adaptable. Adding new docs or changing your metadata schema is trivial.

George Halal @halal_george · Oct 30

Your metadata IS your graph. Giving our agents access to a metadata search tool boosted our evals by 11%, providing the flexibility of GraphRAG while avoiding all the complexity. It unlocks new capabilities, including reference traversal. Example: 1. Agent finds a doc with references. 2. Agent decides which references to traverse and searches over metadata to fetch them.
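
The two-step example above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the Contextual AI tool: the `Chunk` fields, the sample chunks, and the function names are hypothetical, and plain keyword and exact-alias matching stand in for the embedding search the thread describes. The `should_follow` callback is a stand-in for the agent deciding which references to traverse.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: str
    text: str
    aliases: list = field(default_factory=list)     # hypothetical metadata extracted at indexing time
    references: list = field(default_factory=list)  # references mentioned in the chunk text


# Hypothetical corpus, invented for illustration only.
CHUNKS = [
    Chunk("c1", "Terms are defined in Section 4.2 and Appendix B.",
          aliases=["Section 1", "Overview"],
          references=["Section 4.2", "Appendix B"]),
    Chunk("c2", "Either party may terminate with 30 days notice.",
          aliases=["Section 4.2", "Termination"]),
    Chunk("c3", "Fee schedule for early termination.",
          aliases=["Appendix B", "Fee schedule"]),
]


def search_text(query):
    """Stand-in for raw-content search: return chunks sharing a word with the query."""
    words = set(query.lower().split())
    return [c for c in CHUNKS
            if words & set(c.text.lower().replace(".", "").split())]


def search_metadata(alias):
    """Stand-in for metadata search: return chunks whose aliases match exactly."""
    key = alias.strip().lower()
    return [c for c in CHUNKS if key in (a.lower() for a in c.aliases)]


def resolve(query, should_follow=lambda ref: True):
    """Step 1: find chunks by content. Step 2: hop to chosen references via metadata."""
    hits = search_text(query)
    visited = [c.chunk_id for c in hits]
    for chunk in hits:
        for ref in chunk.references:
            if not should_follow(ref):
                continue
            for target in search_metadata(ref):
                if target.chunk_id not in visited:
                    visited.append(target.chunk_id)
    return visited


all_refs = resolve("where are the terms defined")
only_appendix = resolve("where are the terms defined",
                        should_follow=lambda ref: ref.startswith("Appendix"))
print(all_refs, only_appendix)
```

In this sketch no graph is ever built: the only "edges" are reference strings resolved on demand against the alias metadata, which is the sense in which the metadata acts as the graph.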

George Halal @halal_george · Sep 7

Woohoo! We made sure not to overfit to benchmarks and focused on its generalization capabilities, so glad to hear that worked :)

search founder @n0riskn0r3ward

.@ContextualAI's new re-ranker ($0.05 per M tokens) is a bit better than voyage re-rank 2.5 (also $0.05 per M tokens) which is a pretty high bar IMO. ~2% better recall @ 10 in my eval. I'm also not exactly doing standard QA RAG either, so likely a bit out of domain for both.
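
For readers unfamiliar with the metric, recall @ 10 is the fraction of the relevant documents that appear in the top 10 results. A minimal sketch of the standard definition follows; the function name and the sample IDs are made up for illustration and say nothing about how the eval quoted above was actually run.

```python
def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the relevant documents that appear in the top-k ranked results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant documents")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical ranking and relevance labels.
ranked = ["d3", "d7", "d1", "d9"]
relevant = {"d1", "d2"}
print(recall_at_k(ranked, relevant, k=2), recall_at_k(ranked, relevant, k=3))
```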

George Halal @halal_george · Aug 29

@ethan_kim00 and that's why all other rerankers perform poorly on the recency benchmark. Ours was specifically trained to rank retrieved documents as: more_relevant_more_recent_doc > more_relevant_less_recent_doc > less_relevant_more_recent_doc > less_relevant_less_recent_doc
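
The target ordering above amounts to a lexicographic sort: relevance first, recency as the tie-breaker. The sketch below only illustrates that ordering. It is not how the reranker works (that is a trained model), and the `Doc` fields, the integer relevance tiers, and the dates are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Doc:
    doc_id: str
    relevance: int   # hypothetical coarse relevance tier (higher = more relevant)
    published: date  # hypothetical publication date


def rank_relevance_then_recency(docs):
    """Sort by relevance tier first, then by newer publication date within a tier."""
    return sorted(docs, key=lambda d: (d.relevance, d.published), reverse=True)


docs = [
    Doc("less_relevant_more_recent_doc", 1, date(2025, 8, 1)),
    Doc("more_relevant_less_recent_doc", 2, date(2023, 1, 1)),
    Doc("less_relevant_less_recent_doc", 1, date(2022, 1, 1)),
    Doc("more_relevant_more_recent_doc", 2, date(2025, 6, 1)),
]
ranked_ids = [d.doc_id for d in rank_relevance_then_recency(docs)]
print(ranked_ids)
```

Note that the less relevant document stays below both more relevant ones even though it is the newest of the four: recency only breaks ties within a relevance tier.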

Ethan Kim @ethan_kim00 · Aug 29

@halal_george Seems like recency ranking would be poorly calibrated and prone to drift for a pointwise reranker.

George Halal retweeted

Michael @michael_chomsky · Aug 27

Instruction following rerankers are so underrated. You can set arbitrary instructions like ‘sort by candidates that are a good fit for this role’ or ‘article mentions an early stage company’. This is the kind of thing I was hypothesizing years ago, and it’s cool to see the space catch up to theory. The next step will be small models that do binary classification based on a set of arbitrary criteria.

George Halal @halal_george

Excited to share that we trained rerankers at the cost/performance frontier and are open sourcing them! Contextual AI Reranker v2 🚀 Best performing, most efficient reranker 🤗 Open weights (1B, 2B, 6B) 🫡 Instruction-following (including recency-awareness) 🌐 Multilingual 1/4

George Halal @halal_george · Aug 28

@lgandecki “compared to the 2nd-best rerankers which are up to ~10x more expensive!” I’m now realizing that the line break makes it seem like it’s not part of the sentence above it

Łukasz Gandecki @lgandecki · Aug 27

@halal_george are you saying 10x more expensive is a good thing?

George Halal retweeted

Douwe Kiela @douwekiela · Aug 27

We just released the latest version of our reranker: best performing, most efficient, open weights, instruction following, and multilingual. Try it out in your agentic RAG pipelines!

George Halal @halal_george

Excited to share that we trained rerankers at the cost/performance frontier and are open sourcing them! Contextual AI Reranker v2 🚀 Best performing, most efficient reranker 🤗 Open weights (1B, 2B, 6B) 🫡 Instruction-following (including recency-awareness) 🌐 Multilingual 1/4

George Halal retweeted

Sheshansh Agrawal @sheshanshag · Aug 27

Performance on standard retrieval benchmarks like BEIR/MMTEB hasn't correlated with performance on real world retrieval evaluation datasets for a while now. The causes are twofold: - Relevance is ill-defined and subjective. - Popular retrieval benchmarks are gameable. Here I describe how we tackled these challenges while building our second generation of rerankers. 1/N

George Halal @halal_george

Excited to share that we trained rerankers at the cost/performance frontier and are open sourcing them! Contextual AI Reranker v2 🚀 Best performing, most efficient reranker 🤗 Open weights (1B, 2B, 6B) 🫡 Instruction-following (including recency-awareness) 🌐 Multilingual 1/4

George Halal @halal_george · Aug 27

Check out our blog post for more details: contextual.ai/blog/rerank-v2 Tagging folks who might find this interesting: @bclavie, @jobergum, @mariaKhalusova, @virattt, @pavelsvitek_, @jerryjliu0, @dzhng, @dani_avila7, @daniel_mac8, @helloiamleonie, @ShengyaoZhuang, @nlpnyc, @swyx, @wowitsmrinal, @spacemanidol, @NateSesti, @n0riskn0r3ward, @michael_chomsky, @mrdbourke, @pelaseyed, @IntuitMachine, @rohanpaul_ai, @tom_doerr, @OptimiseOrDie, @johnjnay, @_akhaliq, @hwchase17, @LuizaJarovsky, @daansan_ml, @omarsar0, @_reachsumit, @Aurimas_Gr, @RichardSocher 4/4

George Halal @halal_george · Aug 27

You can access our rerankers now on 🤗 HuggingFace: huggingface.co/collections/Co… 🌳 Google Model Garden: Available soon. 🟠 API endpoint: The first $50 (1 billion tokens) are free with a business email. Documentation: docs.contextual.ai/api-reference/… 👩‍💻 Python SDK (code snippet attached) 3/4

George Halal retweeted

Contextual AI @ContextualAI · Aug 20

🏆 It's official - Contextual AI is now at the top of the FACTS leaderboard for groundedness, beating out strong competition from Gemini 2.5 Pro and GPT-5! Congrats to our research team @w33lliam @rajan__vivek @nandita__naik @Thienhn97 @sheshanshag @shikibmehri on this awesome achievement!

William Berrios @w33lliam

Tired of seeing O3 hallucinate? 😵‍💫 Today, I am excited to share how we built the least hallucinatory LLM in the 🌍 Our GLMv2, developed at @ContextualAI, just claimed 1st place 🥇 on the FACTS Grounded leaderboard by Google DeepMind — outperforming Gemini-2.5-pro, Claude 4, and O3 by 18%. 🤯 More details about our SFT and post-training recipe below 👇 1/N

George Halal @halal_george · Aug 5

Another great use case for the instruction-following reranker we trained

Nina Lopatina @NinaLopatina

We had an interesting meta-learning at @aiDotEngineer World’s Fair from some of the organizers of the MCP track: there has been such an explosion in MCP Server creation, that one of the emerging challenges in this space is selecting the right one for your task.

George Halal retweeted

William Berrios @w33lliam · Jul 22

📢 As promised ✨, we're open-sourcing LMUnit! Our SoTA generative model for fine-grained criteria evaluation of your LLM responses 🎯 ✅ SoTA on Flask & BigGbench ✅ SoTA generative reward model on RewardBench2 🤗 Models available on @huggingface: tiny.cc/qjzp001 💻 Github repo: github.com/ContextualAI/L… 📄 Paper: arxiv.org/abs/2412.13091 ✍️ Blog: contextual.ai/lmunit/ See more details in the quoted tweet👇

William Berrios @w33lliam

Excited to share 🤯 that our LMUnit models with @ContextualAI just claimed the top spots on RewardBench2 🥇 How did we manage to rank +5% higher than models like Gemini, Claude 4, and GPT4.1? More in the details below: 🧵 1/11
