Praveenkumar

291 posts


@Im_PK_R

Computer Vision / AI Research Engineer, Neubility 📍 Seoul 🇰🇷 From 🇮🇳

Joined February 2012
821 Following · 178 Followers
Praveenkumar retweeted
Sundar Pichai @sundarpichai
Introducing Willow, our new state-of-the-art quantum computing chip with a breakthrough that can reduce errors exponentially as we scale up using more qubits, cracking a 30-year challenge in the field. In benchmark tests, Willow solved a standard computation in <5 mins that would take a leading supercomputer over 10^25 years, far beyond the age of the universe(!).
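The claimed scaling behavior, where logical errors drop exponentially as more qubits are added, can be sketched with a toy model (the function, `eps0`, and `lam` values below are illustrative assumptions, not Google's published figures):

```python
def logical_error_rate(eps0, lam, d):
    """Toy model of below-threshold error correction: each increase of
    the code distance d by 2 divides the logical error rate by the
    suppression factor lam (assumed constant, with lam > 1).
    eps0 is the logical error rate at the smallest distance, d = 3."""
    assert d >= 3 and d % 2 == 1, "surface-code distances are odd"
    return eps0 / lam ** ((d - 3) // 2)

# With lam = 2, each step from d=3 to d=5 to d=7 halves the error rate,
# so growing the code (more qubits) suppresses errors exponentially.
```

Under this toy model, spending more qubits on a larger code buys exponentially fewer errors, which is the "reduce errors as we scale up" behavior the tweet describes.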
2.8K replies · 12.1K retweets · 75.6K likes · 19.2M views
Praveenkumar retweeted
Sundar Pichai @sundarpichai
We see Willow as an important step in our journey to build a useful quantum computer with practical applications in areas like drug discovery, fusion energy, battery design + more. Details here: blog.google/technology/res…
345 replies · 1.6K retweets · 13.3K likes · 2.4M views
Praveenkumar retweeted
Daniel Nguyen @daniel_nguyenx
New wallpaper. Motivation.
Daniel Nguyen tweet media
59 replies · 44 retweets · 873 likes · 109.7K views
Praveenkumar retweeted
Del @TheCartelDel
The most "Main Character Energy" I've ever seen in my life.
1.6K replies · 61.2K retweets · 442.4K likes · 41.9M views
Praveenkumar retweeted
Andrej Karpathy @karpathy
⚡️ Excited to share that I am starting an AI+Education company called Eureka Labs. The announcement:

We are Eureka Labs and we are building a new kind of school that is AI native.

How can we approach an ideal experience for learning something new? For example, in the case of physics one could imagine working through very high quality course materials together with Feynman, who is there to guide you every step of the way. Unfortunately, subject matter experts who are deeply passionate, great at teaching, infinitely patient and fluent in all of the world's languages are also very scarce and cannot personally tutor all 8 billion of us on demand.

However, with recent progress in generative AI, this learning experience feels tractable. The teacher still designs the course materials, but they are supported, leveraged and scaled by an AI Teaching Assistant who is optimized to help guide the students through them. This Teacher + AI symbiosis could run an entire curriculum of courses on a common platform. If we are successful, it will be easy for anyone to learn anything, expanding education in both reach (a large number of people learning something) and extent (any one person learning a large number of subjects, beyond what may be possible today unassisted).

Our first product will be the world's obviously best AI course, LLM101n. This is an undergraduate-level class that guides the student through training their own AI, very similar to a smaller version of the AI Teaching Assistant itself. The course materials will be available online, but we also plan to run both digital and physical cohorts of people going through it together.

Today, we are heads down building LLM101n, but we look forward to a future where AI is a key technology for increasing human potential. What would you like to learn?

@EurekaLabsAI is the culmination of my passion for both AI and education over ~2 decades. My interest in education took me from YouTube tutorials on Rubik's cubes to starting CS231n at Stanford to my more recent Zero-to-Hero AI series, while my work in AI took me from academic research at Stanford to real-world products at Tesla and AGI research at OpenAI. All of my work combining the two so far has only been part-time, as side quests to my "real job", so I am quite excited to dive in and build something great, professionally and full-time. It's still early days, but I wanted to announce the company so that I can build publicly instead of keeping a secret that isn't. Outbound links with a bit more info in the reply!
Andrej Karpathy tweet media
1.5K replies · 3.6K retweets · 27.7K likes · 2.5M views
Praveenkumar retweeted
Andrej Karpathy @karpathy
Nice read on the rarely-discussed-in-the-open difficulties of training LLMs. Mature companies have dedicated teams maintaining the clusters. At scale, clusters leave the realm of engineering and become a lot more biological, hence e.g. teams dedicated to "hardware health".

It can be a frustrating daily life experience of training large models to "babysit" the training run. You're there carefully monitoring the vital signs of your run: loss spikes, numerical issues, throughput, gradient norms, policy entropy, etc. Every time the run degrades or flatlines (which can happen often), you quickly look for the stack trace to see what's up. You have to do this fast or 10,000 GPUs could be idling. Often it is a new, exotic, scary-looking error you've never seen before, so you summon help to see if anyone can tell what's up. The worst ones like to occur at 4am. Often no one can, so you just ban some nodes that look a bit sketchy and try to restart the run. Sometimes the run goes down just because you have not earned the favor of your gods that day, so you put a while True: loop around your launch command.

The underlying issues can be highly diverse, from some GPUs just getting a bit too hot and suddenly doing incorrect multiplication once in a while, to some router going down and degrading networked file system I/O, to someone in the datacenter physically disconnecting a wire as part of un-communicated maintenance. Sometimes you'll never know.

Another necessary related citation here is the famous OPT-175B logbook, and I'd hope more like it can see the light of day in the future (see chronicles/OPT175B_Logbook.pdf in the git repo). x.com/aiatmeta/statu…

TLDR: LLM training runs are significant stress-tests of the overall fault tolerance of a large computing system acting as a biological entity. And when you're shopping around for your compute, think about a lot more than just FLOPs and $. Think about the whole service from hardware to software across storage, networking, and compute. And think about whether the team maintaining it looks like The Avengers and whether you could become best friends.
Yi Tay@YiTayML

Long overdue, but here's a new blogpost on training LLMs in the wilderness from the ground up 😄🧐 In this blog post, I discuss:
1. Experiences in procuring compute & variance in different compute providers. Our biggest finding/surprise is that variance is super high, and it's almost a lottery what hardware one could get! (!?!) 😱
2. "Wild life" infrastructure/code and the transition from what I was used to at Google 🤣
3. A new mindset when training models. 😶‍🌫️
Writing can be quite therapeutic. I should write more, but for now, link below: 👇
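The "put a while True: loop around your launch command" trick mentioned above could be sketched like this (a minimal sketch; `babysit` and its parameters are hypothetical names, not anyone's actual tooling):

```python
import subprocess
import time

def babysit(cmd, max_restarts=5, cooldown_s=1.0):
    """Relaunch a training command whenever it dies, so a flaky node or
    a transient 4am error does not leave 10,000 GPUs idling.
    Returns how many restarts a clean exit required."""
    restarts = 0
    while True:
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return restarts                  # run finished cleanly
        restarts += 1
        if restarts > max_restarts:
            # repeated failures suggest a real bug, not unlucky hardware
            raise RuntimeError(f"run failed {restarts} times; giving up")
        time.sleep(cooldown_s)               # brief pause before relaunching
```

A real setup would also ban suspect nodes and resume from the latest checkpoint rather than restarting from scratch.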

101 replies · 481 retweets · 4.1K likes · 655.3K views
Praveenkumar retweeted
Andrej Karpathy @karpathy
Sleep is beautiful because it makes your training jobs advance
109 replies · 206 retweets · 4K likes · 801.8K views
Praveenkumar retweeted
AI at Meta @AIatMeta
Today we’re releasing Code Llama, a large language model built on top of Llama 2, fine-tuned for coding & state-of-the-art among publicly available coding tools. In keeping with our open approach, Code Llama is publicly available now for both research & commercial use. More ⬇️
166 replies · 901 retweets · 3.4K likes · 1M views
Praveenkumar retweeted
Jon Barron @jon_barron
Jon Barron tweet media
4 replies · 21 retweets · 392 likes · 96.8K views
Praveenkumar retweeted
Andrej Karpathy @karpathy
!! Awesome !! 🚙 🤖 . It’s been great to watch driverless cars roaming the streets of SF in great numbers and making it look… boring. Cheering for my friends at Tesla, and for the space as a whole!
Pirate Wires@PirateWires

CA Votes YES on Full Self-Driving Taxi Rollout in San Francisco in win for @Cruise and @Waymo

The California Public Utilities Commission (CPUC) voted 3-1 today to allow autonomous vehicle companies Waymo and Cruise's self-driving taxi operations in San Francisco to operate 24/7 and charge fares. Prior to the vote, Waymo and Cruise were authorized to beta-test commercial service with limits on the number of passengers, fees, and areas of operation. Now, the companies will be able to charge for rides and operate throughout the city 24/7.

Both companies have argued that self-driving cars will reduce rideshare costs and save lives; in their first million driverless miles, Waymo vehicles were in only two reportable collisions, and Cruise reported 73 percent fewer collisions with "meaningful risk of injury" than their human performance benchmark.

Self-driving cars have faced strenuous opposition from a coalition of labor unions, local activists, and leftist politicians in San Francisco. As we detailed in Labor's Shadow War With Self-Driving Cars (link below), many city officials (like Aaron Peskin, Jeffrey Tumlin, and Fire Chief Jeanine Nicholson) opposed to autonomous vehicles in SF say they're concerned about the cars' safety, but it's more likely that naked partisanship, conflicts of interest, and loyalty to local labor interests animate them much more than a desire to protect San Franciscans.

-Sanjana Friedman (@metaversehell)

Labor's Shadow War With Self-Driving Cars x.com/piratewires/st…
Waymo first million driverless miles data waymo.com/blog/2023/02/f…
Cruise first million driverless miles data getcruise.com/news/blog/2023…

45 replies · 136 retweets · 1.7K likes · 343.3K views
Praveenkumar retweeted
Simon Eskildsen @Sirupsen
1. Find big scary equation that's hard to parse 2. Latex OCR it with Mathpix 3. Ask ChatGPT to break it down into heavily commented Python
Simon Eskildsen tweet media ×3
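Step 3's output might look something like this, using the softmax equation as a stand-in "scary equation" (the breakdown below is a hand-written sketch of the workflow's result, not actual Mathpix or ChatGPT output):

```python
import math

# softmax(z)_i = exp(z_i) / sum_j exp(z_j)
def softmax(z):
    m = max(z)                            # subtract the max: exp() overflows otherwise
    exps = [math.exp(v - m) for v in z]   # numerator terms e^(z_i - m)
    total = sum(exps)                     # denominator: sum over all j
    return [e / total for e in exps]      # probabilities in (0, 1) that sum to 1
```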
104 replies · 554 retweets · 4.3K likes · 798K views
Praveenkumar retweeted
Andrej Karpathy @karpathy
I think this is mostly right.
- LLMs created a whole new layer of abstraction and profession.
- I've so far called this role "Prompt Engineer" but agree it is misleading. It's not just prompting alone; there's a lot of glue code/infra around it. Maybe "AI Engineer" is ~usable, though it takes something a bit too specific and makes it a bit too broad.
- ML people train algorithms/networks, usually from scratch, usually at lower capability.
- LLM training is becoming sufficiently different from ML because of its systems-heavy workloads, and is also splitting off into a new kind of role, focused on very large-scale training of transformers on supercomputers.
- In numbers, there are probably going to be significantly more AI Engineers than there are ML engineers / LLM engineers.
- One can be quite successful in this role without ever training anything.
- I don't fully follow the Software 1.0/2.0 framing. Software 3.0 (imo ~prompting LLMs) is amusing because prompts are human-designed "code", but in English, and interpreted by an LLM (itself now a Software 2.0 artifact). AI Engineers simultaneously program in all 3 paradigms. It's a bit 😵‍💫
swyx 🇸🇬@swyx

🆕 Essay: The Rise of the AI Engineer latent.space/p/ai-engineer Keeping up on AI is becoming a full time job. Let's get together and define it.
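The three paradigms named above could be illustrated side by side (a toy sketch; every function, weight, and prompt below is made up for illustration):

```python
# Software 1.0: a human writes the logic explicitly in code.
def sentiment_v1(text):
    return "positive" if "great" in text.lower() else "negative"

# Software 2.0: the logic lives in learned parameters, not source code
# (this dict is a toy stand-in for weights produced by training).
WEIGHTS = {"great": 1.0, "love": 0.5, "terrible": -1.0}

def sentiment_v2(text):
    score = sum(w for word, w in WEIGHTS.items() if word in text.lower())
    return "positive" if score > 0 else "negative"

# Software 3.0: the "program" is an English prompt, interpreted at runtime
# by an LLM (which is itself a Software 2.0 artifact).
PROMPT_V3 = (
    "Classify the sentiment of the text as 'positive' or 'negative'. "
    "Reply with a single word.\n\nText: {text}"
)
```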

141 replies · 707 retweets · 4.1K likes · 2M views
Praveenkumar retweeted
Matthias Niessner @MattNiessner
I'm kind of frustrated to see pseudo AI influencer accounts consistently posting factually wrong information about research. This work has nothing to do with Stable Diffusion or any other diffusion method. It fits a 3D GAN to an image as described in the PanoHead CVPR paper.
Drake Facts@NewsIn6ix

Stable Diffusion is now capable of creating photo realistic full 3D models from single images. The amount of ways it could be used in video games and the metaverse blows my mind. AI is getting closer and closer to futuristic sci-fi movies!

25 replies · 80 retweets · 783 likes · 133K views
Praveenkumar retweeted
Yannic Kilcher 🇸🇨 @ykilcher
It's like ask me anything, but fully automatic 😁👍
Aleksa Gordić (水平问题)@gordic_aleksa

Woohoo! You can now chat with @ykilcher's YouTube channel! 🥳 youtube.com/@YannicKilcher 400 videos of quality ML research, news, and memes :D (bc Yannic) Yannic had a massive impact on a whole generation of up-and-coming ML engs & researchers. Ortus: chrome.google.com/webstore/detai… 1/👇

4 replies · 4 retweets · 50 likes · 15K views
Praveenkumar retweeted
Yao Fu @Francis_YAO_
Is Falcon really better than LLaMA? Short take: probably not.

Longer take: we reproduced the LLaMA 65B eval on MMLU and got 61.4, close to the official number (63.4), much higher than its Open LLM Leaderboard number (48.8), and clearly higher than Falcon (52.7). Code and prompt open-sourced at github.com/FranxYao/chain… No fancy prompt engineering, no fancy decoding, everything by default.

Full story: on the Open LLM Leaderboard (huggingface.co/spaces/Hugging…), Falcon is the top 1, surpassing LLaMA, and promoted by @Thom_Wolf (x.com/thom_wolf/stat…). Yet later @karpathy expressed concern about why the LLaMA 65B score on the Open LLM Leaderboard is significantly lower than the official one (48.8 vs. 63.4), see x.com/karpathy/statu… We figured that a simple, quick, open-sourced evaluation script for LLaMA 65B would clarify things, so we just did it: github.com/FranxYao/chain… Again, everything is default: official MMLU prompt, no fancy prompt engineering, no fancy decoding. LLaMA 65B simply can do it. We encourage everyone to try the eval script out.

This result makes us continue to hold the belief that the open-source community's best bet to get close to GPT-3.5 is to do RLHF on LLaMA 65B, per our previous discovery in the Chain-of-thought Hub arxiv.org/abs/2305.17306 Yet we do not intend to start a war between LLaMA and Falcon -- both are great open-sourced models and have made significant contributions to the field! Falcon also has the advantage of an easier license, which gives it great potential to be awesome! 🍻🍻
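For context, a default MMLU-style prompt, question, lettered choices, then "Answer:", can be sketched as follows (`mmlu_prompt` is a hypothetical helper written for illustration, not code from the linked repo):

```python
def mmlu_prompt(question, choices, few_shots=()):
    """Build a plain MMLU-style prompt with no fancy engineering:
    the model's next token after 'Answer:' is compared against A/B/C/D."""
    parts = list(few_shots)                 # optional solved examples (few-shot eval)
    body = question + "\n"
    for letter, choice in zip("ABCD", choices):
        body += f"{letter}. {choice}\n"     # one lettered line per answer option
    body += "Answer:"                       # the model completes this line
    parts.append(body)
    return "\n\n".join(parts)
```

Scoring the single token the model emits after "Answer:" against the gold letter, with everything else left at defaults, is the kind of no-frills setup the tweet describes.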
32 replies · 121 retweets · 693 likes · 335.5K views
Praveenkumar retweeted
Lior Alexander @LiorOnAI
Adobe just added their first Generative AI tool to Photoshop! Big milestone. Generative Fill allows you to extend images as well as add and remove objects using simple text prompts.
19 replies · 238 retweets · 1.2K likes · 221.3K views