burhan rashid

4.6K posts


@burhr2

The most useful person is the one who allows a person to implant goodness inside him or to do him a favour; in fact, both giver and receiver benefit. CV & ML

Inria, Rennes, France · Joined February 2011
646 Following · 213 Followers
burhan rashid retweeted
Andrej Karpathy @karpathy
New 3h31m video on YouTube: "Deep Dive into LLMs like ChatGPT"

This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental models of how to think about their "psychology", and how to get the best use of them in practical applications.

We cover all the major stages:
1. pretraining: data, tokenization, Transformer neural network I/O and internals, inference, GPT-2 training example, Llama 3.1 base inference examples
2. supervised finetuning: conversations data, "LLM Psychology": hallucinations, tool use, knowledge/working memory, knowledge of self, models need tokens to think, spelling, jagged intelligence
3. reinforcement learning: practice makes perfect, DeepSeek-R1, AlphaGo, RLHF.

I designed this video for the "general audience" track of my videos, which I believe are accessible to most people, even without technical background. It should give you an intuitive understanding of the full training pipeline of LLMs like ChatGPT, with many examples along the way, and maybe some ways of thinking around current capabilities, where we are, and what's coming.

(Also, I have one "Intro to LLMs" video already from about a year ago, but that is just a re-recording of a random talk, so I wanted to loop around and do a much more comprehensive version of this topic. They can still be combined, as the talk goes a lot deeper into other topics, e.g. LLM OS and LLM Security)

Hope it's fun & useful! youtube.com/watch?v=7xTGNN…
770
2.9K
20.3K
2.4M
burhan rashid retweeted
Santiago @svpino
DeepSeek R1 is *the* best model available right now. It's at the level of o1, but you can use it for free, and it's much faster. A huge leap forward that nobody saw coming. No wonder so many people are throwing tantrums online trying to discredit the Chinese students who built this.

You can use DeepSeek in Visual Studio Code right now:
1. Install the Qodo Gen AI extension
2. Select DeepSeek R1 from their list of models

The Qodo team is hosting DeepSeek on their servers, so none of your data will go to China. I've been building a Tetris game using DeepSeek, and this is the most impressive model I've seen so far.
297
967
6.2K
1.2M
burhan rashid retweeted
elvis @omarsar0
Top Trending AI Papers for 2024

If you are looking for the top trending AI papers of 2024, I've got you covered. I've documented and summarized the top papers every week for all of 2024. (link below)
9
91
433
38.6K
burhan rashid retweeted
Santiago @svpino
The best code is the one you didn't write. The second best code is the one that solves the problem. The best tool is the one you already have. The best solution is the simplest one. Always make it work first. Make it better later.
17
40
364
27.3K
burhan rashid retweeted
Pau Labarta Bajo @paulabartabajo_
Wanna learn to deploy an ML REST API? This will help ⬇️
2
14
112
5.2K
burhan rashid retweeted
Santiago @svpino
This MIT class is still the best way to learn Linear Algebra. It's free. Gilbert Strang is one of those generational professors. The type of person that will leave a positive mark on you. ocw.mit.edu/courses/18-06-…
22
340
2.4K
158.5K
burhan rashid retweeted
Santiago @svpino
Data Engineers are making fortunes building data pipelines. I can't think of anything more important than learning how to deal with data for the next 20 years.

Here are a few steps I'd recommend to people who want to learn one of the most in-demand skills in the market right now:
1. Build a strong foundation in programming (especially Python and Java).
2. Practice SQL and learn how to work with relational databases.
3. Become familiar with NoSQL databases and handling unstructured data.
4. Learn the basics of data structures (arrays, lists, trees, graphs) and algorithms (sorting, searching, etc.).
5. Become familiar with tools like Hadoop, Spark, Kafka, Airflow, Luigi, Prefect, and Kinesis.
6. Learn how to implement data pipelines and ETL.
7. Gain experience with cloud computing (AWS, GCP, Azure).
8. Learn containerization and orchestration (Docker and Kubernetes).
9. Become proficient with data warehousing (Snowflake, Redshift, BigQuery).
10. Understand CI/CD pipelines to automate testing, deployment, and integration.

The Data Engineering Nanodegree program in @Udacity is pure fire. It's a 2-month course that covers every one of the topics above. You'll come out on the other side as a new person.

Everywhere I go, companies need to process data as fast as they produce it. It's hard to find people who know how to do this well, but companies are willing to pay exceptionally well for this.

If you want to start 2025 with a bang, this course is for you! Link in the next post so I don't get throttled.
38
247
2.1K
240.8K
burhan rashid retweeted
Visual Studio Code @code
Announcing GitHub Copilot Free! A new free tier for GitHub Copilot, available for everyone today in @code. No trial. No subscription. No credit card required. Learn more in our blog: aka.ms/copilot-free
260
2.8K
14.3K
1.2M
burhan rashid retweeted
Santiago @svpino
Copilot is now free! Probably the biggest news for developers in 2024.
155
582
6.2K
555.5K
burhan rashid retweeted
elvis @omarsar0
Understanding Deep Learning

Impressive new book on understanding deep learning concepts. Topics include fundamental building blocks, Transformers, GNNs, RL, diffusion models, and more. Probably one of the most comprehensive and up-to-date overviews of deep learning that exists today.
11
207
1K
68.9K
burhan rashid retweeted
Sebastian Raschka @rasbt
Training LLMs for spam classification, take 2: I added 14 experiments comparing different approaches: github.com/rasbt/LLMs-fro…
- which token to train
- which layers to train
- different model sizes
- LoRA
- unmasking
- and more!

Any additional experiments you'd like to see?
16
89
531
40.2K
burhan rashid retweeted
Sebastian Raschka @rasbt
A suggestion for an effective 11-step LLM summer study plan:
1) Read* Chapters 1 and 2 on implementing the data loading pipeline (manning.com/books/build-a-… & github.com/rasbt/LLMs-fro…).
2) Watch Karpathy's video on training a BPE tokenizer from scratch (youtube.com/watch?v=zduSFx…).
3) Read Chapters 3 and 4 on implementing the model architecture.
4) Watch Karpathy's video on pretraining the LLM.
5) Read Chapter 5 on pretraining the LLM and then loading pretrained weights.
6) Read Appendix D on adding additional bells and whistles to the training loop.
7) Read Chapters 6 and 7 on finetuning the LLM.
8) Read Appendix E on parameter-efficient finetuning with LoRA.
9) Check out Karpathy's repo on coding the LLM in C code (github.com/karpathy/llm.c).
10) Check out LitGPT to see how multi-GPU training is implemented and how different LLM architectures compare (github.com/Lightning-AI/l…).
11) Build something cool and share it with the world.

(*Read = read, run the code, and attempt the exercises 😊)
31
322
1.8K
186.1K
burhan rashid @burhr2
@paulabartabajo_ Amazing short videos, keep up the good work. As I am trying to improve my ML engineering skills, I always find your videos helpful.
0
0
1
365
burhan rashid retweeted
Pau Labarta Bajo @paulabartabajo_
Wanna learn how to organize your ML code like a PRO? Here is the way ↓↓↓
1
51
307
24.4K
burhan rashid retweeted
Pau Labarta Bajo @paulabartabajo_
Let's build an AI Coding assistant with Llama3 ↓🧵🦙
7
71
419
123.4K
burhan rashid retweeted
Sebastian Raschka @rasbt
When doing machine learning and AI research (or writing books), making the code reproducible is usually desirable. Often, that's easier said than done! So, I recorded a video illustrating and dealing with 6 sources of randomness that occur when training deep neural networks and LLMs:
1. Model weight initialization
2. Dataset sampling and shuffling
3. Nondeterministic algorithms
4. Different runtime algorithms
5. Hardware and drivers
6. Randomness in generative AI models
4
54
346
81.6K
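
A minimal sketch of how the first two sources of randomness in the post above (weight initialization and dataset shuffling) are commonly pinned down with explicit seeds. This is not taken from the video; it uses only Python's standard library, and the helper names `init_weights` and `shuffled_indices` are hypothetical. In PyTorch the analogous controls would be calls such as `torch.manual_seed` and `torch.use_deterministic_algorithms(True)`; sources 3 to 5 depend on framework, hardware, and driver settings that a seed alone does not fix.

```python
import random


def init_weights(n, seed):
    """Draw n small Gaussian 'weights' from a dedicated, seeded generator.

    Hypothetical helper: stands in for a model's weight initialization.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.gauss(0.0, 0.02) for _ in range(n)]


def shuffled_indices(n, seed, epoch):
    """Return a shuffled list of dataset indices for a given epoch.

    Hypothetical helper: the order changes from epoch to epoch, but the
    same (seed, epoch) pair always yields the same order across runs.
    """
    rng = random.Random(seed + epoch)
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


if __name__ == "__main__":
    # Two "runs" with the same seed give identical weights and batch order.
    run_a = (init_weights(4, seed=123), shuffled_indices(10, seed=123, epoch=0))
    run_b = (init_weights(4, seed=123), shuffled_indices(10, seed=123, epoch=0))
    print("reproducible:", run_a == run_b)
```

Using a dedicated `random.Random` instance per component, rather than the module-level generator, keeps one component's draws from shifting another's when code is added or reordered.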