Leo

1.7K posts

@_leander30

AI/ML professional actively seeking opportunities

India · Joined September 2011
275 Following · 162 Followers
Pinned Tweet
Leo @_leander30
After 3 months of non-stop building, I’m back with a new daily posting series. I just shipped three production AI projects (RAG system, job application agent, and GitHub portfolio reviewer). Starting today, I’ll showcase one deep dive every day. First up: my grounded RAG Q&A system that beats OpenAI File Search & Vectara on RAGAS benchmarks. Live: helpmateai.xyz
Leo retweeted
NeilXbt @neil_xbt
ANDREJ KARPATHY COULD HAVE CHARGED $500 FOR THIS WALKTHROUGH. He put it on YouTube. Every way he personally uses LLMs in his own life. Thinking models. Deep research. File uploads. Python interpreter. Claude Artifacts. Not theory. Not benchmarks. The actual daily workflow of the person who built Tesla Autopilot and co-founded OpenAI. 2 hours walking through his personal LLM workflow. The gap between people who watch this week and those who save it for later is not 2 hours. It is everything those 2 hours quietly change about how you work for the rest of your career.
Leo retweeted
CyrilXBT @cyrilXBT
ANTHROPIC JUST PROVED MOST PEOPLE HAVE NO IDEA HOW TO PROMPT CLAUDE.

Their applied AI team dropped a 24 minute free workshop. Not a creator who reverse engineered it. Not a Reddit thread. ANTHROPIC. The people who wrote the weights.

And what they showed is uncomfortable. There are 6 elements to a properly structured Claude prompt. Most people are using 1. Maybe 2. That is not a skill issue. That is an information issue. And it has been quietly costing you every single day. The outputs that felt slightly off. The responses you had to rewrite 4 times. The prompts that worked once and never again. All of it traces back to the same 6 missing elements.

The people who watch this 24 minute workshop tonight will understand something about Claude that most daily users still do not know exists. The people who skip it will keep getting 30% of what the tool is actually capable of and wonder why the results never quite land.

I watched it twice. Then I built a Claude Skill that applies all 6 elements to every prompt automatically. No more thinking about structure. No more guessing what Claude needs. The framework runs in the background every single time.

Full breakdown and skill setup is below. Bookmark this now. Watch the workshop first. Then read the guide. This is the one that compounds.

Follow @cyrilXBT for the exact prompt architecture, Claude skills, and systems I use to get outputs most people do not believe came from one person working alone.
Leo @_leander30
It’s a good reminder that better RAG is not always about retrieving more. Sometimes it’s about retrieving more selectively and being more critical about the evidence before answering. #RAG #AI
Leo @_leander30
Self-RAG pushes on that by making retrieval and critique part of the generation process.
Instead of treating RAG as retrieve -> stuff context -> answer, the model learns to:
- retrieve when needed
- reflect on the evidence
- critique its own response
That’s a much more interesting framing than naive “top-k chunks + prompt”.
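Roughly, the loop looks like this. A minimal sketch of the retrieve → reflect → critique idea, not the paper’s trained reflection tokens; `llm(prompt)` and `retriever.search(query, k)` are hypothetical stand-ins for whatever model and index you use:

```python
# Minimal sketch of a Self-RAG-style loop (illustrative only, not the paper's
# trained reflection tokens). The model decides whether to retrieve, filters
# the evidence, drafts an answer, then critiques its own draft for support.
# `llm(prompt)` and `retriever.search(query, k)` are hypothetical stand-ins.

def self_rag_answer(query: str, retriever, llm, max_rounds: int = 2) -> str:
    # 1. Retrieve only when the model says the question needs external evidence.
    needs = llm(f"Does answering this need external documents? yes/no\n\n{query}")
    if needs.strip().lower().startswith("no"):
        return llm(f"Answer from your own knowledge:\n\n{query}")

    draft = ""
    for _ in range(max_rounds):
        # 2. Retrieve, then keep only passages the model judges relevant.
        passages = retriever.search(query, k=5)
        relevant = [
            p for p in passages
            if llm(f"Question: {query}\nPassage: {p}\nRelevant? yes/no")
            .strip().lower().startswith("yes")
        ]

        # 3. Draft an answer grounded only in the surviving evidence.
        context = "\n\n".join(relevant)
        draft = llm(f"Answer using only this context:\n{context}\n\nQ: {query}")

        # 4. Self-critique: is every claim actually supported by the context?
        verdict = llm(
            f"Context:\n{context}\n\nAnswer:\n{draft}\n\n"
            "Is every claim supported by the context? yes/no"
        )
        if verdict.strip().lower().startswith("yes"):
            return draft

        # Otherwise rewrite the query and retrieve again.
        query = llm(f"Rewrite this question to retrieve better evidence: {query}")
    return draft
```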
Leo @_leander30
Today I’m reading Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection.
Paper: arxiv.org/abs/2310.11511
One idea I really like: RAG shouldn’t always retrieve the same way for every query.
Leo @_leander30
This feels very relevant to what I’m building. My current retrieval already uses section synopses and summary-aware retrieval for broad questions, but RAPTOR’s recursive summary hierarchy is a really interesting extension of that idea. #RAG #LLM #PaperReview
Leo @_leander30
RAPTOR builds a hierarchy of summaries, so retrieval can happen at multiple levels of abstraction instead of only pulling nearby chunks.
That’s especially useful for broad questions like:
• What is this paper about?
• What are the key findings?
• How do these sections connect?
This is where naive RAG usually starts to break.
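As a rough sketch of the core idea (simplified: fixed-size grouping instead of the paper’s GMM-based soft clustering; `llm_summarize(text)` is a hypothetical summarisation call):

```python
# Rough sketch of a RAPTOR-style recursive summary tree (simplified: fixed-size
# grouping instead of the paper's GMM soft clustering). Leaf chunks and every
# level of summaries all get indexed, so retrieval can match at any level of
# abstraction. `llm_summarize(text)` is a hypothetical stand-in.

def build_summary_tree(chunks: list[str], llm_summarize, group_size: int = 4) -> list[str]:
    nodes = list(chunks)          # everything in this list gets embedded + indexed
    level = list(chunks)
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), group_size):
            group = level[i:i + group_size]
            parents.append(llm_summarize("\n\n".join(group)))
        nodes.extend(parents)     # add this level's summaries to the index
        level = parents           # recurse: summarise the summaries
    # Broad queries tend to hit high-level summaries, narrow ones the leaf chunks.
    return nodes
```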
Leo @_leander30
Today I’m reading RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval.
Paper: arxiv.org/abs/2401.18059
One idea that really clicked for me: flat chunk retrieval is often not enough for long-document QA.
Leo @_leander30
All benchmark reports are saved in the repo if anyone wants to take a closer look @RivraDev: github.com/LEANDERANTONY/… Still working on it; I’ll keep sharing the architecture, failures, and fixes as I go. Would especially welcome thoughts from people running RAG systems in production @Arjunjain #RAG #AI
Leo @_leander30
The idea for the selector layer was this: the reranker selects good chunks, but ordering based purely on cross-encoder scores doesn’t account for what the generator actually needs to answer the specific question. The selector layer adds an LLM pass that looks at query intent and promotes the most directly answerable chunk to position 1.
Result:
- Context precision: 0.9036 → 0.9608
- Faithfulness: 0.9310 → 0.9657
The generator now sees the most query-relevant evidence first, which keeps answers tighter and better grounded.
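A minimal sketch of what that selector pass could look like (not the exact HelpmateAI code; `llm(prompt)` is a hypothetical judge call):

```python
# Minimal sketch of an LLM selector layer on top of a cross-encoder reranker
# (illustrative, not the exact HelpmateAI implementation). The reranker has
# already ordered chunks by relevance score; an LLM judge then picks the one
# chunk that most directly answers the question and promotes it to position 1.
# `llm(prompt)` is a hypothetical stand-in.

def select_and_promote(query: str, reranked_chunks: list[str], llm) -> list[str]:
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(reranked_chunks))
    reply = llm(
        "Which numbered chunk most directly answers the question? "
        "Reply with the number only.\n\n"
        f"Question: {query}\n\nChunks:\n{numbered}"
    )
    try:
        best = int(reply.strip().strip("[]"))
    except ValueError:
        return reranked_chunks            # judge failed: keep reranker order
    if not 0 <= best < len(reranked_chunks):
        return reranked_chunks
    # Promote the most directly answerable chunk; keep the rest in reranker order.
    return [reranked_chunks[best]] + [
        c for i, c in enumerate(reranked_chunks) if i != best
    ]
```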
Leo @_leander30
Yesterday I said HelpmateAI beats OpenAI File Search and Vectara. I was asked about the eval setup, so here’s exactly how I measure it.
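The short version: run the same question set through each system, collect (question, answer, retrieved contexts, reference answer) rows, and score them with RAGAS. A minimal run looks roughly like this (ragas 0.1-style API, so metric and column names may differ in newer versions; the single row below is a made-up placeholder, not real eval data):

```python
# Minimal sketch of a RAGAS evaluation run (ragas 0.1-style API; not the exact
# HelpmateAI harness). Each row holds the question, the system's answer, the
# retrieved contexts, and a reference answer. evaluate() calls an LLM judge
# under the hood, so an API key must be configured in the environment.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy, context_precision

rows = {
    # Placeholder example row, invented purely for illustration.
    "question": ["What does the policy cover for hospitalisation?"],
    "answer": ["Hospitalisation expenses are covered up to the sum insured."],
    "contexts": [["Section 3: hospitalisation expenses are covered up to the sum insured."]],
    "ground_truth": ["Hospitalisation expenses up to the sum insured."],
}

result = evaluate(
    Dataset.from_dict(rows),
    metrics=[faithfulness, answer_relevancy, context_precision],
)
print(result)  # per-metric scores, e.g. faithfulness / context_precision
```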
Leo @_leander30
@unbankedgroup it was pretty intense, still some ways to go
Mal @unbankedgroup
@_leander30 3 months shipping and you already learned what most people take a year to figure out: the RAG system is the foundation. everything else is a feature on top
Leo @_leander30
@mem0ai I faced the same issues when I built my RAG system. Switching to hybrid retrieval + cross-encoder reranking gave me much cleaner citations and answers backed by full evidence panels. A dedicated context layer feels like the right direction. Will explore how mem0 fits in.
Leo @_leander30
Hybrid retrieval + query-aware routing + cross-encoder reranking →
- Supported answer rate: 0.8026 → 0.8816
- Citation page-hit: 0.6974 → 0.8684
Outperforms on faithfulness, answer relevancy & context precision (tested on health-policy, thesis & research papers).
Full citation trails + raw evidence panels.
Built with Next.js + FastAPI + ChromaDB • Deployed on a VPS with Docker.
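A stripped-down sketch of the retrieve-then-rerank part (illustrative only, not the production code; assumes an existing ChromaDB collection, and uses rank_bm25 for the sparse side plus a sentence-transformers cross-encoder for reranking):

```python
# Stripped-down sketch of hybrid retrieval + cross-encoder reranking
# (illustrative, not the HelpmateAI implementation). Dense candidates come
# from an existing ChromaDB collection, sparse candidates from BM25 over the
# same chunk corpus, and a cross-encoder reorders the merged pool.
from rank_bm25 import BM25Okapi
from sentence_transformers import CrossEncoder

def hybrid_retrieve(query: str, collection, corpus: list[str],
                    k: int = 20, top_n: int = 5) -> list[str]:
    # Dense retrieval: ChromaDB embeds the query and returns the nearest chunks.
    dense = collection.query(query_texts=[query], n_results=k)["documents"][0]

    # Sparse retrieval: BM25 catches exact-term matches dense vectors can miss.
    # (Built per call here for brevity; in practice build it once at startup.)
    bm25 = BM25Okapi([doc.split() for doc in corpus])
    sparse = bm25.get_top_n(query.split(), corpus, n=k)

    # Merge and dedupe the candidate pool, preserving order.
    candidates = list(dict.fromkeys(dense + sparse))

    # Cross-encoder rerank: score each (query, chunk) pair jointly.
    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    scores = reranker.predict([(query, c) for c in candidates])
    ranked = [c for _, c in sorted(zip(scores, candidates),
                                   key=lambda t: t[0], reverse=True)]
    return ranked[:top_n]
```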