Harshal Dharpure

131 posts

@mharshald

🚀 IIT Patna | NLP Research | Multimodality 🤖 Exploring AI, DevOps & emerging tech 🛠️ Sharing tools, trends & tutorials

India · Joined June 2022
1.1K Following · 54 Followers
Harshal Dharpure retweeted
Data Science Dojo@DataScienceDojo·
📢 Kimi AI just released a paper showing you can match the performance of a model trained with 1.25x more compute by changing one thing: how residual connections work.

The core problem has been sitting inside every transformer since residual connections were introduced with ResNet in 2015: when layer outputs accumulate through a network, every layer gets the same fixed weight of 1. By layer 50, earlier layers are contributing so little to the final result that research has shown you can remove a significant fraction of them entirely with barely any performance drop. The model had already learned to ignore them.

Attention residuals replace that fixed accumulation with a learned weighted sum over all previous layer outputs. Each layer computes a small search query, scores every earlier layer's output for relevance, and builds its input from the most useful ones. The weights adapt per input rather than staying fixed, which is what makes the difference.

Tested on a 48B-parameter model trained on 1.4T tokens, the gains hold across every benchmark: GPQA-Diamond up 7.5 points, Math up 3.6, HumanEval up 3.1. The largest improvements are on multi-step reasoning tasks, which makes sense: those are exactly the tasks where later layers need to selectively build on what earlier layers figured out.

Full breakdown in the blog. Link in the replies!

#AttentionResiduals #KimiAI #LLM #DeepLearning #AIResearch #GenerativeAI #DataScience
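The mechanism the tweet describes can be sketched in a few lines. This is a toy illustration of a learned weighted sum over earlier layer outputs, not Kimi's actual implementation: the projection matrices, shapes, and scaled-softmax scoring are all assumptions made for the sketch.

```python
import numpy as np

def attention_residual_input(layer_outputs, w_query, w_key):
    """Build the next layer's input as a learned, input-dependent weighted
    sum over all previous layer outputs, instead of a plain sum where every
    layer has the same fixed weight of 1."""
    h = layer_outputs[-1]                                 # current representation
    q = h @ w_query                                       # small per-layer search query
    keys = np.stack([o @ w_key for o in layer_outputs])   # one key per earlier output
    scores = keys @ q / np.sqrt(q.shape[-1])              # relevance of each layer
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                              # softmax: adapts per input
    return sum(w * o for w, o in zip(weights, layer_outputs))
```

Because the weights come from a softmax over per-input scores, a useful early layer can keep a large weight at depth 50 instead of being drowned out by fixed accumulation.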
1 reply · 7 reposts · 47 likes · 1.9K views
Harshal Dharpure retweeted
Alex Ratner@ajratner·
We are hiring for a ton of roles on our #Research team @SnorkelAI - if interested please reply/reach out!

As one of the first academic teams to focus on AI data development back at @StanfordAILab / @UW, we have long believed this is one of *the* most exciting areas to be as a researcher :)

Today, as a frontier data lab & partner to the world's leading AI labs and companies, we have more research vectors than we can possibly handle!

Come help us tackle problems in complex environment generation; long-horizon and non-stationary benchmarking; complex rubric and process reward design; data valuation and curriculum learning; core data quality control; human-in-the-loop system design; large-scale RL systems; and more!!
29 replies · 36 reposts · 471 likes · 31K views
Harshal Dharpure@mharshald·
Most AI chatbots hallucinate because they DON'T use RAG.

1. User asks a question.
2. Instead of guessing, the system searches a knowledge base.
3. Relevant documents are retrieved.
4. Those documents are added to the prompt.
5. The LLM generates an answer using real data.
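The five steps above fit in a few lines. This is a minimal sketch with a toy keyword-overlap retriever standing in for a real vector store, and a caller-supplied `call_llm` standing in for an actual model API; both names are assumptions, not any specific product.

```python
def retrieve(question, knowledge_base, k=2):
    """Steps 2-3: score documents by word overlap, return the top k."""
    q_words = set(question.lower().split())
    scored = sorted(knowledge_base,
                    key=lambda doc: len(q_words & set(doc.lower().split())),
                    reverse=True)
    return scored[:k]

def answer_with_rag(question, knowledge_base, call_llm):
    docs = retrieve(question, knowledge_base)            # steps 2-3
    prompt = "Answer using ONLY this context:\n"         # step 4
    prompt += "\n".join(f"- {d}" for d in docs)
    prompt += f"\n\nQuestion: {question}"
    return call_llm(prompt)                              # step 5
```

The key design point is step 4: the model answers from retrieved text placed in the prompt, not from whatever it memorized in training.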
0 replies · 1 repost · 1 like · 8 views
Harshal Dharpure retweeted
Chidanand Tripathi@thetripathi58·
Insane. A complete 7-week Agentic RAG bootcamp was just open-sourced.

AI academies charge $10,000 for this curriculum. You can get it for free. It covers everything from basic keyword search to building production-grade Agentic RAG systems with LangGraph. This is not a toy project tutorial. It is a full production pipeline.

Here is what is inside:
- 7 weeks of building an AI research assistant from scratch
- Complete infrastructure setup with Docker, FastAPI, and PostgreSQL
- Production keyword and hybrid search using OpenSearch
- Local LLM deployment with streaming responses
- Production monitoring with Langfuse tracing and Redis caching
- Agentic workflows using LangGraph and Telegram bots

Here is the core value: it forces you to build the way successful companies do. You do not just jump to vector search. You build solid search foundations first, then enhance with AI. Theory and practice in one place. Thousands of developers are using this to master production AI.

Summary of the Production Agentic RAG Course:
- It gives you a senior AI engineer curriculum for free
- It bridges the gap between basic RAG and production systems
- It forces you to build an actual end-to-end portfolio project

You still have to write the code. It just removes the guesswork.
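The "keyword and hybrid search" idea the curriculum builds on can be sketched without OpenSearch: blend a keyword-overlap score with a vector similarity. Here the 26-dim letter-frequency `embed` is a deliberately crude stand-in for a real embedding model, and `alpha` is an assumed blending weight.

```python
import math

def embed(text):
    """Toy letter-frequency 'embedding' (an assumption for this sketch)."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def hybrid_score(query, doc, alpha=0.5):
    """Blend exact keyword overlap with cosine-style vector similarity."""
    keyword = len(set(query.lower().split()) & set(doc.lower().split()))
    vector = sum(a * b for a, b in zip(embed(query), embed(doc)))
    return alpha * keyword + (1 - alpha) * vector

def hybrid_search(query, docs):
    return max(docs, key=lambda d: hybrid_score(query, d))
```

This is why the course sequences keyword search first: the keyword term catches exact matches that pure embedding search can miss, and the vector term rescues paraphrases.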
15 replies · 54 reposts · 315 likes · 45.5K views
Harshal Dharpure retweeted
Ronin@DeRonin_·
How to become an AI engineer in the next 6 months.

By the end, you want to be able to:
- build LLM apps end-to-end
- use APIs from OpenAI / Anthropic / open-source stacks
- design prompts and context properly
- add tool calling and structured outputs
- deploy real projects

So, let's discuss your roadmap month by month.

Month 1: Get solid enough in coding and fundamentals
What to learn:
- Python really well
- Git + GitHub
- CLI / terminal basics
- JSON, APIs, HTTP, async basics
- basic SQL
- basic data handling with pandas
- virtual environments, package management, error handling
- FastAPI or Flask

Month 2: Master LLM app development
What to learn:
- prompting fundamentals
- system vs user instructions
- structured outputs / JSON schemas
- function/tool calling
- streaming responses
- conversation state
- cost / latency / token basics
- failure handling
- prompt injection awareness

Month 3: Learn RAG properly
What to learn:
- embeddings
- chunking
- vector databases
- metadata filtering
- reranking
- retrieval quality issues
- hallucination reduction
- citations and grounding

Month 4: Agents, tools, workflows, evals
What to learn:
- agent loops
- tool selection
- state management
- retries
- when NOT to use agents
- multi-step workflows
- evaluation harnesses
- task success metrics

Month 5: Deployment, product thinking, and reliability
What to learn:
- FastAPI production patterns
- Docker
- background jobs
- queues
- auth + API key security
- logging
- observability
- prompt/version management
- eval dashboards
- cost monitoring
- rate limits
- caching

Month 6: Specialize and become hireable
The knowledge and skills you gained can be applied in three directions. Choose one of them and focus on practice (everything mentioned above is also best learned purely through practice).

Direction 1: AI product engineer
Best if you want startup jobs fast. Focus on:
- LLM apps
- RAG
- agents
- deployment
- product UX

Direction 2: Applied ML / LLM engineer
Focus on:
- fine-tuning
- when to fine-tune vs prompt
- evaluation
- inference optimization
- open-source models
- training pipelines

Direction 3: AI automation engineer
Focus on:
- workflow orchestration
- business process automation
- multi-tool systems
- CRM, docs, email, support, ops use cases

This roadmap gives you a practical path: study each of these points, then test them in real work. By month six you will already have several built products or examples of completed tasks, and it will be much easier to get a job as an AI engineer.

Save it so you don't lose it and can return to study later.
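Month 2's "structured outputs / JSON schemas" point deserves a concrete shape: ask the model for JSON, then validate it before trusting it. The schema, field names, and example replies below are illustrative assumptions, not any provider's API.

```python
import json

# Hypothetical expected shape of a model reply (an assumption for the sketch).
SCHEMA = {"city": str, "temperature_c": (int, float)}

def parse_structured(reply):
    """Parse an LLM reply and enforce the expected fields and types,
    so malformed output fails loudly instead of flowing downstream."""
    data = json.loads(reply)
    for field, typ in SCHEMA.items():
        if field not in data or not isinstance(data[field], typ):
            raise ValueError(f"bad or missing field: {field}")
    return data
```

In a real app you would retry the model call (or fall back) when `ValueError` is raised; that retry loop is the "failure handling" bullet from the same month.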
132 replies · 607 reposts · 4.4K likes · 781.5K views
Harshal Dharpure retweeted
FAR.AI@farairesearch·
LLM safety filters can look strong until prompts are rephrased. Sravanti Addepalli’s ReG-QA generates semantically related, natural variants to stress-test safeguards, showing models can block direct requests yet answer closely reworded ones.
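The failure mode is easy to demonstrate in miniature: a filter that matches only the literal phrasing passes a natural paraphrase. The blocklist, filter, and paraphrases below are toy assumptions for illustration, not ReG-QA's actual generation method (which uses a model to produce the variants).

```python
# Hypothetical exact-phrase safety filter (the weakness being stress-tested).
BLOCKLIST = ["how to pick a lock"]

def naive_filter(prompt):
    """Blocks a prompt only if it contains a blocklisted phrasing verbatim."""
    return any(bad in prompt.lower() for bad in BLOCKLIST)

def stress_test(paraphrases):
    """Return the semantically related variants the filter fails to block."""
    return [p for p in paraphrases if not naive_filter(p)]
```

Anything `stress_test` returns is a reworded request the safeguard would let through, which is exactly the gap ReG-QA measures at scale with model-generated variants.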
28 replies · 138 reposts · 1.5K likes · 3.7M views
Harshal Dharpure retweeted
Daily Dose of Data Science@DailyDoseOfDS_·
The ultimate Full-stack AI Engineering roadmap to go from 0 to 100.
2 replies · 24 reposts · 134 likes · 5.1K views
Harshal Dharpure retweeted
Aurimas Griciūnas@Aurimas_Gr·
AI Agent's Memory is the most important piece of Context Engineering. This is how we define it 👇

In general, an agent's memory is something we provide via context in the prompt passed to the LLM, helping the agent plan and act better given past interactions or data not immediately available.

It is useful to group the memory into four types:

1. Episodic - Past interactions and actions performed by the agent. After an action is taken, the application controlling the agent stores it in some kind of persistent storage so it can be retrieved later if needed. A good example is using a vector database to store the semantic meaning of the interactions.

2. Semantic - Any external information available to the agent, plus any knowledge the agent should have about itself. Think of it as context similar to that used in RAG applications: internal knowledge only available to the agent, or a grounding context that isolates part of internet-scale data for more accurate answers.

3. Procedural - Systemic information like the structure of the system prompt, available tools, guardrails, etc. It is usually stored in Git and in prompt and tool registries.

4. Short-term (working) - Occasionally, the agent application pulls information from long-term memory and stores it locally when it is needed for the task at hand. All of the information pulled from long-term stores or held in local memory is the short-term, or working, memory. Compiling it produces the prompt passed to the LLM, which returns further actions to be taken by the system.

We usually label 1.-3. as long-term memory and 4. as short-term memory.

And that is it! The rest is all about how you architect the topology of your agentic systems.

Learn all of this hands-on in my End-to-end AI Engineering Bootcamp (we are kicking off in 2 weeks!).
🎁 Get a 15% discount via this link: maven.com/swirl-ai/end-t…

Any war stories from managing an agent's memory? Let me know in the comments 👇
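The compilation step the post describes, where long-term stores are pulled into a short-term working memory and flattened into a prompt, can be sketched directly. The store contents and the assembly format are illustrative assumptions.

```python
def build_working_memory(procedural, semantic, episodic, task):
    """Short-term (working) memory: everything pulled from the long-term
    stores, compiled into the single prompt passed to the LLM."""
    return "\n\n".join([
        procedural["system_prompt"],                   # procedural: system prompt
        "Facts:\n" + "\n".join(semantic),              # semantic: external knowledge
        "Past interactions:\n" + "\n".join(episodic),  # episodic: prior actions
        f"Current task: {task}",
    ])
```

In a real system each argument would come from its own backing store (a prompt registry, a RAG index, a vector database of past interactions); here they are passed in directly to keep the sketch self-contained.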
30 replies · 53 reposts · 323 likes · 16.6K views
Harshal Dharpure retweeted
Matt Dancho (Business Science)
🚨BREAKING: Uber launches QueryGPT, natural language to SQL using generative AI.
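The core natural-language-to-SQL pattern is simple to sketch: put the table schema in the prompt and ask the model for a query. The schema, prompt wording, and the caller-supplied `call_llm` stub below are assumptions for illustration, not Uber's QueryGPT implementation.

```python
# Hypothetical table schema the model is grounded on.
SCHEMA = "trips(trip_id INT, city TEXT, fare REAL, trip_date DATE)"

def nl_to_sql(question, call_llm):
    """Ground the model on the schema, then ask it to translate the
    question into a single SQL query."""
    prompt = (f"Schema: {SCHEMA}\n"
              f"Write one SQL query answering: {question}\n"
              "Return only SQL.")
    return call_llm(prompt)
```

Production systems add more around this core: picking which schemas to include, validating the generated SQL before execution, and restricting the model to read-only access.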
16 replies · 55 reposts · 571 likes · 55K views
Harshal Dharpure retweeted
Daily Dose of Data Science@DailyDoseOfDS_·
Google dropped another banger! PaperBanana is an agentic framework that generates publication-ready academic illustrations from methodology descriptions. No manual design, no Figma: just your method section and a caption.
18 replies · 188 reposts · 1.1K likes · 57.6K views
Harshal Dharpure retweeted
Brendan (can/do)@BrendanFoody·
I'm hiring some of the world's top AI researchers to join a new team I'm creating at Mercor, focused on building frontier benchmarks. If you're an exceptional fit (or know someone else who is), dm me!
53 replies · 37 reposts · 816 likes · 72K views
Harshal Dharpure retweeted
Sebastian Raschka@rasbt·
A small Qwen3.5 from-scratch reimplementation for edu purposes: github.com/rasbt/LLMs-fro… (probably the best "small" LLM today for on-device tinkering)
47 replies · 509 reposts · 3.1K likes · 150.9K views