Charu
@charu_mandhare
seeker of…. Humble me..




DeepSeek is now #1 on the App Store, surpassing ChatGPT, with no NVIDIA supercomputers or $100M needed. The real treasure of AI isn't the UI or the model; both have become commodities. The true value lies in data and metadata, the oxygen fueling AI's potential. The future's fortune? It's in our data. Deepgold. 😇

DeepSeek-R1 is very hot, but the idea of an LLM+RL framework is not entirely novel. In 2022, my Salesforce AI team was among the pioneers to propose an LLM+RL framework for joint training and inference, a couple of years before the OpenAI o1 and DeepSeek-R1 work. We published a NeurIPS paper called CodeRL (arxiv.org/abs/2207.01780) that achieved open-source SOTA in code generation with a code LLM of fewer than 1B parameters (beating models 10x larger). Our idea was to apply principles and methodologies similar to those of AlphaGo/AlphaZero to train and improve an LLM in a self-play/self-taught manner, but the base LLM's size and performance were not strong enough at the time (ChatGPT had not yet been released).
















