Ani

647 posts

@curiousZeedX

Curious lurker

San Francisco, CA · Joined June 2023
454 Following · 1.8K Followers
Pinned Tweet
Ani
Ani@curiousZeedX·
Our paper got accepted at NeSy 2025!! 3 months of hard work, debugging, and writing—all worth it 🙌 Super proud of this one.
0
1
20
2.7K
Ani
Ani@curiousZeedX·
@Mayank_022 Great job. The amount of grind it took to make this is evident. Thanks for sharing!
1
0
1
203
Ani
Ani@curiousZeedX·
@o_v_shake Added to my reading list. You are doing God's work!
0
0
1
75
himanshu
himanshu@himanshustwts·
Career update: Excited to share that I have joined the incredible team at @smallest_AI to work on Research x Devrel! The team is cooking incredible small + efficient multi-modal models and it feels like an exciting time to push the frontier on scale!
209
31
1.9K
63.8K
Ani
Ani@curiousZeedX·
@himanshustwts I love the format you use for these posts, where you share screenshots and highlight the most impactful parts. Many a time I have read the highlights and then gone back to check the original post.
0
0
1
1.1K
himanshu
himanshu@himanshustwts·
a significant % of ml researchers might be hooked by what happened in ONE day. ai seems to be doing a research loop fascinatingly well (understand the problem + propose a change + train/test it + measure results + keep the better version + repeat) and genuinely reducing research friction. we are early to automated experimentation, frontier scale could be an interesting watch.
12
17
464
47.8K
Ani
Ani@curiousZeedX·
@rubylawn Of course. Happy to help if I can.
0
0
1
48
Rubylawn 🪬
Rubylawn 🪬@rubylawn·
@curiousZeedX Thank you for sharing this champ! First time seeing the course was on tiktok, i’m jumping on this after my exams in few days. Is it possible to reach out to you if i need some help?
1
0
1
29
Ani
Ani@curiousZeedX·
I love this course from Stanford. 17 lectures. ~20 hours. Completely free.
Honestly? This is one of the cleanest end-to-end walkthroughs of how LLMs are actually built, not just used.
They start from scratch:
– data collection & cleaning
– transformers from first principles
– training at scale
– evaluation before deployment
Love the fact that there is no hand-wavy “magic happens here”.
What you’ll really learn:
• Tokenization (beyond BPE buzzwords)
• PyTorch + compute planning
• Architecture choices & tuning
• Mixture of Experts (MoE)
• GPUs, memory & training bottlenecks
• Kernels, Triton, and why they matter
• Parallelism strategies
• Scaling laws (what actually scales, what doesn’t)
• Eval beyond “looks good to me”
• SFT & RLHF, with tradeoffs
If you want to understand LLMs, not just prompt them, this is a gold mine. Bookmark-worthy.
2
0
1
87
Ani
Ani@curiousZeedX·
Just realised that the people behind Mercury 2 are all professors from Stanford, UCLA, and Cornell, and they have a course on deep generative models. Learning from people who have actually shipped a cutting-edge foundation model is something else. The projects are especially worth checking out.
Inception@_inception_ai

Mercury 2 is live. The world's first reasoning diffusion LLM – 5x faster than leading speed-optimized autoregressive models. Built for production: multi-step agents without delays, voice AI with tight latency budgets, instant coding feedback. Diffusion-based generation enables parallel refinement, not sequential tokens. Faster. More controllable. Dramatically lower inference cost. Available today on the Inception API. @dinabass has the story in @business.

2
0
4
277
Ani
Ani@curiousZeedX·
The scope and need for an AI infra guy is at its peak. Speak to any reputed recruiter and they will tell you the same thing: demand for someone good at infra is insane. Looking at the contents and the team that came up with the book, this can be the ultimate inference guide. They don't even gatekeep it by asking for a work email. You give your personal email and you get the book. Just get this, load the epub and read it. Future you will thank you.
Philip Kiely@philipkiely

Inference Engineering launches today. baseten.com/inference-engi…

0
0
11
748
Ani
Ani@curiousZeedX·
@archiexzzz This is amazing. One suggestion: you could try pencil.dev for frontend. I used to struggle with frontend and design, but this extension helped a lot with guiding the AI on how to build good-looking frontends.
3
1
60
4.3K
Archie Sengupta
Archie Sengupta@archiexzzz·
it has been just 3 days since i started my minimax-m2.5 coding plan plus subscription, and it has already crossed 922,796,434 (922 million) total tokens across 50+ sessions. i have completed many parallel workstreams in these three days with minimax-m2.5 as the default model via claude code. i finally feel like a "100x engineer" - i was just monitoring and deciding what to do next. i have created ~28 agents with skills .md inside claude code that will start running 24/7 in the coming days building software, frontend architecture, backend systems, model training, scientific research, qa testing, running evals, and more. i'm not great at frontend, so i talked to some frontend friends and they advised me to write the frontend skills .md file by providing it every possible detail: frameworks, component libraries, design systems, brand guidelines, typography, using react-hook-form and tanstack query, creating an axios instance instead of calling apis directly with axios, pre-setting up all the asset files for it to access and not generate icons on the fly, and so on. otherwise you always end up with sloppy ai-generated code that every other person gets too - the same purple/blue gradient websites.
33
31
728
48.4K
Ani
Ani@curiousZeedX·
@himanshustwts Best of luck for your next stint bro.
0
0
1
427
himanshu
himanshu@himanshustwts·
Some update: I have decided to step down from my roles at Upsurge Labs. It was quite a ride building products and creating a mindshare. Nothing but love to my former team and incredible people i worked with!
96
4
596
48.8K
Ani
Ani@curiousZeedX·
@neural_avb Absolutely love your content. Looking forward to the next banger.
0
0
1
100
AVB
AVB@neural_avb·
Had one of the most grueling recording sessions of my life last night. 3.5 hours of yap.💀 Got everything I wanted. I even went off-script and busted out an excalidraw diagram on the fly. I'll try to edit and ship the RLM video this weekend.
5
3
122
4K
Ani
Ani@curiousZeedX·
@o_v_shake I love what you have built here. Such in-depth and practical stuff.
1
0
1
138
Abhishek Maiti
Abhishek Maiti@o_v_shake·
job board is live on workatafrontierlab.com/jobs (sign in required).
for candidates: access all the frontier lab jobs from a single place. no need to search across different platforms. get your interview chops from the lessons and get interviewed by the top labs. added jobs from a few labs right now. will add more.
for companies: some of the highest quality candidates, primed to keep your gpus buzzing, can be found here. if you are an ai lab wanting to grab eyeballs of such students, DM me.
some stats: we are now ~600 members strong with 3k visits/day and growing at an exp rate, all under 72 hours from launch.
6
1
113
12.7K
Ani
Ani@curiousZeedX·
Randomly came across this video. Spent 3 hours watching it. Worth every bit of it. We need to see more videos like this on scaling, monitoring, networking, and debugging high-stakes systems at 1 million requests/sec.
0
0
2
135
Ani
Ani@curiousZeedX·
Has anyone seen this interview? It's very interesting how both Amodei and Demis seem to agree that they would want AGI to have a 5-10 year timeline instead of 1-2 years, so that there is time to digest and think about the potential risks and mitigation strategies of a post-AGI world. They mentioned that this is not possible because geopolitical adversaries (China in this case) are building at a similar speed, and that it might be difficult to adhere to a global agreement to take things slow. Both agreed that we will probably be in uncharted territory and that not enough thought is being put into how it will affect major issues such as labour displacement.
0
0
1
75
Ani
Ani@curiousZeedX·
Just found a 🔥 resource for anyone curious about how LLMs really work, not just interview fodder.
BuildML pulled together the 24 most-asked LLM questions at places like DeepMind, OpenAI, Meta & more, with explanations that actually dig into what matters when you build and ship models.
This isn’t a “list of buzzwords”. You’ll see questions on things like:
• positional embeddings (RoPE)
• Chinchilla scaling laws
• causal vs bidirectional attention
• KV-caching
• RAG design & eval strategies
… and more deep nuts & bolts stuff that most casual guides skip.
Whether you’re learning for fun, building your first LLM project, or just trying to level up from “hey chatGPT” to actually understanding transformers, this is worth a read.
1
0
2
190
Ani
Ani@curiousZeedX·
@the2ndfloorguy Absolutely deserved attention. Hope you build a great product out of it.
0
0
0
333
Pankaj
Pankaj@the2ndfloorguy·
last 48 hours were absolutely MAD 🤯
internet went wild over a weekend hack i built out of frustration. didn't expect it to blow up to 5m+
- blr city police reached out, planning to meet sometime this week
- flooded with dms from founders and other state police officials
- some DAMN big names in industry shared it
- unexpected early inbound interest for investment
- got interviewed by 4 top national tv news channels
- multiple newspapers picked it up and gave it solid space
- big digital media pages, radio stations, instagram accounts, and youtube channels shared
- thousands of people dmed saying this inspired them to build something of their own
my mom is gonna see me on tv for the first time, feels so good 😭
i'm actively working on the roadmap now since there’s real interest. the current setup is very hacky and early stage. will keep sharing updates here
genuinely overwhelmed by the love and support. if i’ve missed your message, i’ll get back once things settle a little 🙏
Pankaj@the2ndfloorguy

OMG. office of the commissioner of police, blr reached out 🤯

521
777
12.8K
516.2K
Ani
Ani@curiousZeedX·
@NeelNanda5 Big fan of your work @NeelNanda5. I was already familiar with your work but seeing it visualised made me appreciate the depth of it even more. Loved the overview.
0
0
1
180
Neel Nanda
Neel Nanda@NeelNanda5·
Thanks to Welch Labs for making a gorgeous animated explainer video about grokking, including a deep dive into my work reverse engineering a modular addition transformer.
I particularly love the animations of 2D Fourier Transforms, it makes it all so much cleaner!
22
61
746
43.1K