Dheeraj Mishra

133 posts

@mishra945

MTech (CSP & ML), EECS @ IIT Bombay | Founder @ EECS Academy & POSTGATE - EdTech startup for GATE test prep online platform

India · Joined August 2012
473 Following · 20 Followers
Dheeraj Mishra@mishra945·
Thank you @cwolferesearch for sharing this! I am currently working on it for my master's thesis!
Cameron R. Wolfe, Ph.D.@cwolferesearch

Reinforcement Learning (RL) is quickly becoming the most important skill for AI researchers. Here are the best resources for learning RL for LLMs…
TL;DR: RL is more important now than it has ever been, but (probably due to its complexity) there aren’t a ton of great resources for learning it online. I’ve been doing a lot of reading / learning on RL recently, so I wanted to share the best resources I’ve found. Links to all resources are provided in the image below.
(1) RLHF book. Nathan is a long-time RL researcher and an expert on LLM alignment / post-training. He decided to write an entire book on (LLM-focused) RL techniques and has been slowly expanding / iterating on the book over the last year. This is the most comprehensive RL resource that is currently available, and it’s an especially great resource for those who are unfamiliar with RL and still need to learn the basics.
(2) The Spinning up with Deep RL Course from OpenAI, despite being created in ~2018, has stood the test of time and is one of the best tutorials for learning RL. This course builds up to understanding PPO, which is one of the most widely used algorithms for RL with LLMs. Plus, understanding related algorithms (policy gradients, TRPO, etc.) will help a lot with gaining an understanding of new RL algorithms like GRPO.
(3) PPO / GRPO blog. Jimmy Shi (DeepMind) recently wrote a great blog explaining both PPO (RL algo traditionally used for RLHF) and GRPO (RL algo used for reasoning models). This blog is great and it’s written in a way that is understandable for non-RL people.
(4) HuggingFace RL. HuggingFace has also published numerous useful blogs on the topic of RL. Most recently, they published a blog that explains GRPO and PPO from the ground up (i.e., not assuming any background knowledge on RL). These blogs are inspired by the recent initiative from HuggingFace to create a fully open replication of DeepSeek-R1.

English
0
0
0
7
Dheeraj Mishra retweeted
EECS Academy@EECSAcademy·
EECS Chit-Chat: GATE 2026 Post-GATE Guidance. We are gearing up for GATE DA, CS, ECE, EE, IN & AE 2027. Stay tuned. Stay in touch with Dheeraj Mishra Sir @mishra945
English
0
1
1
22
Dheeraj Mishra retweeted
Khairallah AL-Awady@eng_khairallah1·
🚨 Anthropic's own team just showed how to actually use Claude Code properly. 30 minutes. free. the person who created Claude Code. watch the workshop. bookmark it. worth more than every $500 course you almost bought. you've been using Claude without knowing 40 of its commands. Then read the guide below.
Khairallah AL-Awady@eng_khairallah1

x.com/i/article/2047…

English
263
5K
30K
7.3M
Dheeraj Mishra@mishra945·
@raghav_chadha made a big effort to raise his voice before the Govt of India to bring down the prices of food and water at airports across India... But it has not happened anywhere. Even now the same prices apply everywhere. A water bottle is still 100.
English
0
0
0
6
Dheeraj Mishra@mishra945·
Will be flying to Delhi today from Mumbai!! Gonna be there for a couple of days. Looking forward to a great experience in terms of the weather in Delhi 😁 #DelhiTrip #AkasaTravel
English
0
0
0
7
Dheeraj Mishra@mishra945·
Many congratulations, Team India, on the T20 World Cup 2026! It was highly emotional listening to @bhogleharsha and the way you narrate the story of world cricket! #ProudMoments
English
0
0
0
27
Dheeraj Mishra retweeted
Amit Sethi@amit_sethi·
Hey, Indian industry folks, someone please open a cool lab like this at IIT Bombay too. Don't you want CUDA-optimized kernels or some other innovation? My students are starved for GPU access and for the chance to work on industry-relevant problems. cuda-agent.github.io
हिन्दी
33
120
1.2K
49.2K
Dheeraj Mishra@mishra945·
Covering the complete spectrum of AI as part of my MTech at IIT Bombay! 1. Perception AI 2. Generative AI 3. Agentic AI 4. Physical AI. Courses at IIT Bombay are helping in realising the real potential of AI and ML. #IITBombay #ML #AI
English
0
0
0
12
Dheeraj Mishra retweeted
Narendra Modi@narendramodi·
It was a delight to meet Mr. Sundar Pichai on the sidelines of the AI Impact Summit in Delhi. Talked about the work India is doing in AI and how Google can work with our talented students and professionals in this field. @sundarpichai
English
1.8K
8.9K
109.8K
7.7M
Dheeraj Mishra retweeted
Andrej Karpathy@karpathy·
The hottest new programming language is English
English
2.2K
8.8K
71.9K
12.5M
Dheeraj Mishra retweeted
Amit Sethi@amit_sethi·
My ongoing course on Advanced Machine Learning (Deep Learning) is available on CDEEP @IITBombay at: cdeep.iitb.ac.in/vod/vodCloud/c… Topics covered so far: 1. CNNs 2. Intro to NLP 3. RNN, LSTM, GRU 4. Transformers Coming soon: GNNs, State space models, Regularization, training methods
English
0
1
18
768
Dheeraj Mishra@mishra945·
Village: Bara 396 Paipkhar, Post: Mankahari, Rewa! Vidhan Sabha: Semaria. No electricity: continuous disruption of electricity! Why are things not getting fixed? Why are they not able to provide proper 24x7 electricity? How will students study? How tough @CMMadhyaPradesh
English
0
0
0
39
Dheeraj Mishra@mishra945·
Hello everyone, I am no longer part of Future Funk_IITian aka Dazzling Career, where I worked for more than a year. Since many students have been asking me, I will release a short video on my official YouTube channel to avoid confusion. #GATE #GATEDA #GATECS Rgds
English
0
0
1
80
TeamYouTube@TeamYouTube·
@mishra945 For security reasons, it's best to ask the channel owner to reach out to us (@TeamYouTube) directly so that we can help them with their account
English
1
0
1
58