Charchit Sharma

514 posts

Charchit Sharma

@charchits7

✨ Exploring the crossroads of art & AI. Goal is to build, learn, and share along the way!

Mumbai, Maharashtra · Joined December 2016
896 Following · 101 Followers
Ranky
Ranky@itsrealranky·
You don't need a $2M pre-seed to start building deep tech.

When I started building @laminalabs (@ycombinator P26), I had no funding, no team of 10 engineers, and a vision that required serious GPU compute and AI infrastructure. So I did what any desperate founder would do. I cold emailed.

I wrote to @agupta, who was building the YC student credits program. I told him I was going all in on building a deep tech project, why it needed serious compute, and the commitment I was putting behind it. I sent that email at 10:50 AM on November 14th. He replied at 10:52 AM. Two minutes. That reply changed everything. Thank you Ankit, that early access was the unlock.

From there, it was months of grinding through architecture after architecture. Rewriting core pipelines more times than I can count. Shipping, breaking, rebuilding. Just me, Claude Code, and Codex running in parallel, the closest thing an early founder has to a 10-person engineering team, except they never call in sick.

AI coding agents are the single greatest force multiplier available to founders right now. I'm not exaggerating. The leverage is unreal.

Here's the thing most people get wrong: you don't need a massive round to get something real off the ground. You need compute credits, the right AI tools, and the willingness to grind through hundreds of iterations until the architecture clicks.

If you're a student or early founder sitting on an idea that feels too ambitious, just start. Email the people building the programs. Apply for every credit you can find. Reach out to people you think won't respond. They will. The infrastructure to build serious things as a solo or two-person team has never been more accessible. The funding comes after you've already started building something real.

Because someone gave me that first unlock, I want to do the same: I'm giving away 5 x $50 Claude Code credits. And whoever ships the best project with that gets $200 in Claude credits from me personally. I know firsthand how much potential $200 in credits has for a builder who's willing to grind.

Just comment below with the link to the coolest thing you've built. I will DM you myself.
143 replies · 28 reposts · 595 likes · 45.6K views
Aritra 🤗
Aritra 🤗@ariG23498·
When you run a @PyTorch model on a GPU, the actual work is executed through kernels. These are low-level, hardware-specific functions designed for GPUs (or other accelerators). If you profile a model, you'll see a sequence of kernel launches. Between these launches, the GPU can sit idle, waiting for the next operation. A key optimization goal is therefore to minimize gaps between kernel launches and keep the GPU fully utilized.

One common approach is `torch.compile`, which fuses multiple operations into fewer kernels, reducing overhead and improving utilization. Another approach is to write custom kernels tailored to specific workloads (e.g., optimized attention or fused ops). However, this comes with significant challenges:
> requires deep expertise in kernel writing
> installation hell
> integration with the model is non-trivial

To address this, @huggingface introduces the `kernels` library. With it, one can:
> build custom kernels (with the help of a template)
> upload them to the Hub (like models or datasets)
> integrate them into models with ease

Let's take a look at how the transformers team uses the kernels library to integrate custom kernels into existing models. (more in the thread)
19 replies · 89 reposts · 1.2K likes · 83K views
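As a rough illustration of the two approaches described above (op fusion via `torch.compile` and Hub-hosted custom kernels via the `kernels` library), here is a minimal sketch, assuming a CUDA GPU and the `kernels` package; the `kernels-community/activation` repo id mirrors the library's own examples and is an assumption here, not something stated in the thread.

```python
import torch
import torch.nn.functional as F
from kernels import get_kernel  # the `kernels` package is assumed to be installed

x = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)

# 1) torch.compile: trace the eager ops and fuse them into fewer kernel launches.
def act(t):
    return F.gelu(t) * torch.sigmoid(t)

compiled_act = torch.compile(act)
y_compiled = compiled_act(x)

# 2) The `kernels` library: fetch a pre-built custom kernel from the Hub
#    instead of compiling and installing it locally.
activation = get_kernel("kernels-community/activation")  # assumed example repo id
y_kernel = torch.empty_like(x)
activation.gelu_fast(y_kernel, x)  # hand-written fused GELU kernel pulled from the Hub
```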
Sayak Paul
Sayak Paul@RisingSayak·
That's the tweet. Find the PR, test the code, and enjoy 🧨
3 replies · 0 reposts · 35 likes · 2.1K views
Charchit Sharma
Charchit Sharma@charchits7·
@ariG23498 Beautifully written. I'm sure the collection you build will carry forward your father's legacy.
1 reply · 0 reposts · 1 like · 43 views
Charchit Sharma
Charchit Sharma@charchits7·
Thanks for this!
Aritra 🤗@ariG23498

The Mixture of Experts (MoE) inside 🤗 Transformers is out now! This is going to be a long tweet, so if you just want to jump to the blog, the link is in the thread.

We already had a great blog post on MoEs (which has more than 1k upvotes 😯 at the time of writing). The reason we wanted to build another blog post altogether was noticing how far the field has come. This blog post is not meant to be another "what are MoEs and how to implement them". Rather, it talks about how the transformers team at @huggingface made MoEs a "first-class citizen" of the library and the Hub.

The transformers library and the entire ecosystem were built around dense architectures, but with the rapid growth of MoEs it was inevitable to build around MoEs too and not treat them as "just another model addition". In the post we talk about better model loading, the expert backend, expert parallelism, and also our collaboration with @UnslothAI on training MoEs faster!

In the process of building the blog post, I also came to appreciate how beautiful the ideas are, and ended up making my first YouTube video on the routing algorithm alone.

I am very proud of this project and I think it shows in some paragraphs of the blog post. I am also very thankful to all the people who helped me with the project; I am really happy to be on a team that helps me flourish! Glad to be alive.

PS: I owe you all an apology for delaying the release. I hope I (and the team) have made it worth the wait.

1 reply · 0 reposts · 1 like · 184 views
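Since the routing algorithm is the part called out above, here is a generic top-k routing sketch in PyTorch. It illustrates one common variant (top-k selection followed by a softmax over the selected experts), not the exact transformers implementation.

```python
import torch
import torch.nn.functional as F

def route(hidden_states: torch.Tensor, router_weight: torch.Tensor, top_k: int = 2):
    """Score each token against every expert, keep the top-k, and normalize."""
    logits = hidden_states @ router_weight.T                # (tokens, experts)
    scores, expert_ids = torch.topk(logits, top_k, dim=-1)  # k experts per token
    weights = F.softmax(scores, dim=-1)                     # mixture weights per token
    return expert_ids, weights

tokens = torch.randn(4, 64)   # 4 token hidden states of width 64
router = torch.randn(8, 64)   # router matrix for 8 experts
expert_ids, weights = route(tokens, router)
print(expert_ids.shape, weights.shape)  # torch.Size([4, 2]) torch.Size([4, 2])
```

Each token's hidden state is then dispatched only to its selected experts, and their outputs are combined using these weights.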
Charchit Sharma retweeted
Alvaro Bartolome
Alvaro Bartolome@alvarobartt·
👾 `hf-mem` is all you need to estimate the required VRAM for inference of any model on @huggingface based on Safetensors metadata.
- Written in Python
- Lightweight, only depends on `httpx`
- Runs w/ @astral_sh `uvx` as `uvx hf-mem --model-id ...`
- Works with any Safetensors repository
- Output inspired by @usgraphics TR-100 Machine Report
25 replies · 104 reposts · 845 likes · 66.2K views
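For intuition about how such an estimate can work, here is a rough Python sketch (not hf-mem's actual code) that downloads only the Safetensors JSON header and sums tensor sizes. It assumes a single, unsharded `model.safetensors` at the repo root, which is a simplification of what the real tool handles; the `openai-community/gpt2` repo id is just an example.

```python
import json
import struct

import httpx

BYTES_PER_ELEMENT = {"F64": 8, "F32": 4, "F16": 2, "BF16": 2, "I8": 1, "U8": 1}

def estimate_weight_vram_gib(model_id: str, revision: str = "main") -> float:
    url = f"https://huggingface.co/{model_id}/resolve/{revision}/model.safetensors"
    # First 8 bytes of a safetensors file: little-endian u64 length of the JSON header.
    head = httpx.get(url, headers={"Range": "bytes=0-7"}, follow_redirects=True)
    (header_len,) = struct.unpack("<Q", head.content)
    # Fetch only the JSON header, never the weights themselves.
    resp = httpx.get(url, headers={"Range": f"bytes=8-{7 + header_len}"}, follow_redirects=True)
    total_bytes = 0
    for name, info in json.loads(resp.content).items():
        if name == "__metadata__":
            continue
        numel = 1
        for dim in info["shape"]:
            numel *= dim
        total_bytes += numel * BYTES_PER_ELEMENT.get(info["dtype"], 2)
    return total_bytes / 1024**3  # weights only; KV cache and activations add more

print(f"{estimate_weight_vram_gib('openai-community/gpt2'):.2f} GiB")
```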
Charchit Sharma retweeted
Internet Freedom Foundation (IFF)
Internet Freedom Foundation (IFF)@internetfreedom·
Statement

The PIB has just issued a statement at 3:00 PM on December 3, 2025 that the government will not make pre-installation of the Sanchar Saathi app mandatory for mobile manufacturers.

This is a welcome development, but we are still awaiting the full text of the legal order that should accompany this announcement, including any revised directions under the Cyber Security Rules, 2024.

Everyone who raised their voice, reported on the issue, or pushed back against this mandate deserves credit for bringing us to this point. For now, we should treat this as cautious optimism, not closure, until the formal legal direction is published and independently confirmed.

pib.gov.in/PressReleasePa…
72 replies · 602 reposts · 2.2K likes · 109.1K views
Charchit Sharma retweeted
Rahul
Rahul@selfawareatom·
The scale of this achievement is hard to comprehend without understanding what the Dandakrama patha is. What is being celebrated is not the boy memorizing 2000 mantras of the Shukla Yajurveda; it is the command he has over them. The Dandakrama is one of the most difficult patterns to follow during recitation (it was created as a highly redundant encoding of the original mantras, to avoid errors creeping into oral transmission, but that story is for another day).

Assume a simple mantra containing 6 words: 1 2 3 4 5 6. The pattern of recitation for this mantra will be:
12
12 21
12 23 321
12 23 34 4321
12 23 34 45 54321
12 23 34 45 56 654321

On average, a mantra will have between 6 and 15 words. Imagine reciting 2000 mantras in the above pattern, without a mistake, from memory!
Narendra Modi@narendramodi

Learning of the achievement of 19-year-old Devvrat Mahesh Rekhe ji has filled my heart with joy. His success will become an inspiration for our coming generations. Every person with faith in Indian culture will be glad to know that Shri Devvrat completed the 'Dandakrama Parayanam' of the 2000 mantras of the Madhyandina branch of the Shukla Yajurveda over 50 days without any interruption. It contains many Vedic richas and the holiest of words, which he recited with complete purity. This achievement is the finest form of our guru tradition. As the Member of Parliament from Kashi, I am proud that this remarkable sadhana was accomplished on this sacred land. My salutations to his family, the saints, sages, scholars, and all the institutions across the country who supported him in this tapasya.

45 replies · 352 reposts · 2.3K likes · 89.2K views
Charchit Sharma retweeted
Mila - Institut québécois d'IA
Congratulations to @Yoshua_Bengio, founder and scientific advisor of Mila, who has become the first researcher in the world to surpass one million citations on Google Scholar, the leading platform for academic and scientific research. A remarkable milestone that highlights the profound and global impact of his work in artificial intelligence and deep learning. mila.quebec/en/news/ai-res…
LawZero - LoiZéro@LawZero_

Our Founder and Scientific Director @Yoshua_Bengio has become the first living researcher to surpass 1 million citations on Google Scholar, a testament to the foundational and global impact of his work. Congratulations Yoshua!

10 replies · 54 reposts · 401 likes · 73.5K views
Anshita Saini
Anshita Saini@anshitasaini_·
discovered a secret talent recently... no one on my team can beat me 😇
191 replies · 12 reposts · 2.3K likes · 232.8K views
Aryan V S
Aryan V S@aryanvs_·
Starting my last week @huggingface today 🥺 It's been such a fun journey to work with my awesome colleagues here and collaborate on soooo many model releases. Deeply grateful for all the interactions, learnings, visits to France, the opportunity to contribute to a wide range of subjects, my first full-time role fresh out of uni, and many many other things 🤗 My next journey is with a great bunch of people in SF. I'll still be working on diffusion models but in stealth for the next couple of months - y'all are not ready for what we will build :)
19 replies · 1 repost · 145 likes · 12.2K views
Charchit Sharma retweeted
Slotoholic
Slotoholic@Slotoholic·
My word.
22 replies · 450 reposts · 7.2K likes · 231.8K views
Charchit Sharma retweeted
Sayak Paul
Sayak Paul@RisingSayak·
The wait is over 🤯 An Apache 2.0, DiT-based image generation model from @Alibaba_Qwen -- Qwen-Image 🔥 Supported in Diffusers. Training script PR is up and should be merged soon. Go, fire!
8 replies · 25 reposts · 270 likes · 33.1K views
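For anyone who wants to try it, a minimal Diffusers sketch might look like the following; the `Qwen/Qwen-Image` repo id and the bf16-on-CUDA setup are assumptions here, so check the model card and the PR for the exact usage.

```python
import torch
from diffusers import DiffusionPipeline

# Assumed repo id; see the official model card for the exact one.
pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image", torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = pipe(
    prompt="a watercolor fox reading a book in a library",
    num_inference_steps=50,
).images[0]
image.save("qwen_image_sample.png")
```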
Charchit Sharma retweeted
clem 🤗
clem 🤗@ClementDelangue·
It’s time for the American AI community to wake up, drop the "open is not safe" bullshit, and return to its roots: open science and open-source AI, powered by an unmatched community of frontier labs, big tech, startups, universities, and non‑profits. If we don’t, we’ll be forced to build on foreign foundations and risk losing the race in AI altogether for lack of local innovation and competition. Let’s go!
clem 🤗@ClementDelangue

Love to see this from @WhiteHouse!

46 replies · 75 reposts · 558 likes · 65K views