Olivier Balais (overnetcity.bsky.social)

3.3K posts


@overnetcity

CTO @semji_fr, proud #casc co-founder, passionate fullstack web developer, I build cool things with #php #js #docker and an amazing team @SemjiTech !

Lyon, France · Joined January 2011
805 Following · 664 Followers
Olivier Balais (overnetcity.bsky.social)
A few weeks ago, I was blown away by @windsurf_ai IDE! Next step: a multimodal approach (video+text+audio) so we can "show" code while explaining logic—like a screen record in a tech issue.
Olivier Balais (overnetcity.bsky.social)
@dunglas Amazing! Thanks for the pointer, @dunglas! I checked the page you shared but couldn’t find any mention of the BoltDB you referenced. Is there any other documentation or guidance you could point me to?
Kévin Dunglas (@dunglas)
@overnetcity Hey, it’s built-in! mercure.rocks/spec#reconciliation The reference (FOSS) implementation uses BoltDB to achieve that. The commercial version also supports Redis, Kafka, Pulsar and Postgres as storage engines.
Olivier Balais (overnetcity.bsky.social)
Hey @dunglas, hope all’s well! I’m exploring mercure.rocks for handling SSE auto-retries on network failure. Does Mercure have a built-in feature to store/buffer events for client auto-retry, or is it something we typically manage on the backend?
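For the "manage it on the backend" option mentioned above, the usual shape is a bounded in-memory buffer of recent events keyed by ID, replayed when a client reconnects with its last seen ID. A hypothetical sketch — the class name, IDs, and buffer size are made up for illustration:

```python
from collections import deque

class EventBuffer:
    """Keep the last `maxlen` SSE events so reconnecting clients can catch up."""

    def __init__(self, maxlen: int = 1000):
        self._events: deque[tuple[str, str]] = deque(maxlen=maxlen)  # (id, data)

    def publish(self, event_id: str, data: str) -> None:
        self._events.append((event_id, data))

    def replay_since(self, last_event_id: str) -> list[tuple[str, str]]:
        """Events published after last_event_id; empty if the ID is
        unknown or already evicted from the bounded buffer."""
        events = list(self._events)
        for i, (eid, _) in enumerate(events):
            if eid == last_event_id:
                return events[i + 1:]
        return []
```

The eviction case is the weak point of any DIY buffer — a client that stays offline longer than the buffer window silently loses events — which is exactly what a hub-side store avoids.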
Olivier Balais (overnetcity.bsky.social)
o3 just dropped and it’s a monumental leap in AI capabilities. From blazing code gen to near-human performance on the ARC AGI test (yes, that test), it changes what we thought AI could do. Exciting? Absolutely. Terrifying? A bit. Hardware is now the real bottleneck. Buckle up!
Olivier Balais (overnetcity.bsky.social) reposted
Pinecone (@pinecone)
Our new models — fully integrated alongside our database — bring best-in-class retrieval to your applications:
✔️ Our new sparse embedding model — pinecone-sparse-english-v0 — boosts performance for keyword-based queries, delivering up to 44% (23% on average) better NDCG@10 than BM25 on TREC.
✔️ Our new reranking model — pinecone-rerank-v0 — improves search accuracy by up to 60% (9% on average) over industry-leading models on the BEIR benchmark.
✔️ @cohere's latest model — cohere-rerank-v3.5 — balances performance and latency for a wide range of enterprise search applications.
Learn more by visiting our Model Gallery: docs.pinecone.io/models/overview
Olivier Balais (overnetcity.bsky.social) reposted
Pinecone (@pinecone)
First-of-its-kind Pinecone Knowledge Platform Powers Best-in-class Retrieval for Customers 💠 Industry-leading vector database capabilities combined with proprietary AI models help developers build up to 48% more accurate AI applications: faster & easier prnewswire.com/news-releases/…
Olivier Balais (overnetcity.bsky.social)
I’ve been working and building products with generative AI for over 4 years now. It’s hard to impress me at this point. But damn, @windsurf_ai, your IDE is absolutely next-level! 🔥
Olivier Balais (overnetcity.bsky.social) reposted
Visual Studio Code (@code)
New in Copilot Chat... enhanced links for any workspace symbols that Copilot mentions 🔗 These links appear in responses as little pills, letting you jump directly to definitions for better understanding.
Olivier Balais (overnetcity.bsky.social) reposted
goosewin (@Goosewin)
build failed
Olivier Balais (overnetcity.bsky.social) reposted
Jerry Liu (@jerryjliu0)
Pretty excited about this new RAG technique I cooked up 🧑‍🍳

A top issue with RAG chunking is that it splits the document into fragmented pieces, causing top-k retrieval to return partial context. Also, most documents have multiple hierarchies of sections: top-level sections, sub-sections, etc. This is also why lots of people are interested in exploring the idea of knowledge graphs - pulling in "links" to related pages to expand retrieved context.

This notebook lets you retrieve contiguous chunks without having to spend a lot of time tuning the chunking algorithm, thanks to GraphRAG-esque metadata tagging + retrieval. Tag chunks with sections, and use the section ID to expand the retrieved set.

Check it out: github.com/run-llama/llam…
LlamaIndex 🦙@llama_index

We’re excited to feature a new RAG technique - dynamic section retrieval 💫 - which ensures that you can retrieve entire contiguous sections instead of naive fragmented chunks from a document.

This is a top pain point we’ve heard from our community on multi-document RAG challenges - naive RAG returns fragmented context without awareness of the surrounding document.

Our approach allows you to start off with a “simple” chunking technique (e.g. per page), but do a post-processing workflow to attach section/sub-section metadata. You can then do GraphRAG-like retrieval (two-pass retrieval): retrieve chunks, look up the attached section metadata, and then do a second call to return all chunks that match the section ID. github.com/run-llama/llam…

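The two-pass retrieval described above can be sketched independently of any framework: chunks carry a section ID, a first pass retrieves the top-k chunks, and a second pass expands to every chunk sharing those sections. This is a toy illustration with precomputed scores standing in for a vector search — not the actual LlamaIndex implementation:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    section_id: str

def retrieve_with_sections(chunks: list[Chunk], scores: list[float], k: int = 2) -> list[Chunk]:
    """Two-pass 'dynamic section retrieval' sketch:
    pass 1 takes the top-k chunks by relevance score;
    pass 2 expands to all chunks in the hit sections, in document order."""
    ranked = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:k]
    hit_sections = {chunks[i].section_id for i in ranked}
    return [c for c in chunks if c.section_id in hit_sections]
```

The payoff is that chunking can stay naive (e.g. per page) because the section metadata, not the chunk boundary, decides how much contiguous context comes back.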
Olivier Balais (overnetcity.bsky.social) reposted
Rohan Paul (@rohanpaul_ai)
Text chunking now matches human reading patterns by detecting natural breaks in information flow. Meta-chunking, proposed in this paper, uses probability patterns to find natural segment boundaries in documents, just like humans do.

Original Problem 🎯: Text chunking in Retrieval-Augmented Generation (RAG) systems often fails to maintain logical coherence between segments, leading to incomplete or fragmented information retrieval. Current methods rely on fixed-length splits or basic semantic similarity, missing crucial logical connections between sentences.

Solution in this Paper ⚡:
• Meta-Chunking: a novel segmentation technique operating between the sentence and paragraph levels
• Two key strategies:
  - Margin Sampling: uses LLMs for binary classification to determine segment boundaries based on probability differences
  - Perplexity (PPL) Chunking: analyzes the perplexity distribution to identify natural text boundaries
• Dynamic combination approach to balance fine- and coarse-grained segmentation
• KV caching mechanism for handling longer texts efficiently

Key Insights 💡:
• Smaller models (1.5B parameters) can effectively perform chunking tasks
• PPL distribution characteristics guide optimal threshold selection
• Dynamic chunk sizing preserves logical integrity better than fixed-length approaches
• Re-ranking performance improves significantly with Meta-Chunking

Results 📊:
• Outperforms similarity chunking by 1.32 on 2WikiMultihopQA while using only 45.8% of the processing time
• PPL Chunking with Qwen2-1.5B achieves a 0.3760 BLEU-1 score on single-hop queries
• Maintains consistent performance across both Chinese and English datasets
• Shows a 3.59% improvement in the Hits@8 metric when combined with PPLRerank
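The PPL Chunking strategy above can be sketched as: score each sentence's perplexity under a language model, then cut wherever the perplexity distribution spikes, since a jump suggests the new sentence is not a natural continuation. This toy version uses precomputed perplexities standing in for a real 1.5B model and a simple ratio threshold rather than the paper's exact boundary detection:

```python
def ppl_chunk(sentences: list[str], perplexities: list[float], threshold: float = 1.5) -> list[list[str]]:
    """Split sentences into chunks wherever perplexity jumps above
    `threshold` times the previous sentence's perplexity — a crude
    stand-in for Meta-Chunking's PPL boundary detection."""
    chunks, current = [], [sentences[0]]
    for i in range(1, len(sentences)):
        if perplexities[i] > threshold * perplexities[i - 1]:
            chunks.append(current)   # perplexity spike → natural boundary
            current = []
        current.append(sentences[i])
    chunks.append(current)
    return chunks
```

Because the cut points follow the perplexity signal rather than a fixed length, chunk sizes adapt to the text — which is the property the paper credits for preserving logical integrity.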
Olivier Balais (overnetcity.bsky.social) reposted
Marko Denic (@denicmarko)
It's true.
Olivier Balais (overnetcity.bsky.social) reposted
Rohan Paul (@rohanpaul_ai)
This project attempts a real-time replication of OpenAI’s groundbreaking O1 model, exploring advanced reasoning capabilities and a specific "journey learning" mechanism for AI. They propose a new approach: “journey learning”. This paradigm goes beyond the traditional focus on specific tasks and emphasizes continuous progress through learning, reflection, and adaptation.
Olivier Balais (overnetcity.bsky.social) reposted
Brad Costanzo (@BradCostanzo)
Wow! @HeyGen_Official just released the ability to have an AI avatar join a Zoom meeting and interact. I invited one of their AI avatars into a Zoom room and recorded this clip. Time to build my own now.