bnomial

1.2K posts


@0xbnomial

I publish one machine learning question every day, and I try to make it fun.

Joined February 2022
0 Following · 18.2K Followers
bnomial reposted
Miss Distance @softminus
If anyone knows someone at Twitter/X who can help with this: my alt account (@aptlyamphoteric) is unusable. It keeps putting me in this "You must re-enroll your YubiKey" flow, and no matter how many times I follow the steps, I cannot regain access to the account.
[image]
50 · 16 · 103 · 17.3K
Jazmin WiIIiams @jazminwiIIiams
Hey @Support — seems like loads of us are suddenly getting stuck on a YubiKey re-enrolment pop-up even though we've never had one 😭 Can't log in or bypass it at all.
[images]
28 · 18 · 136 · 16.3K
bnomial reposted
Santiago @svpino
I recorded a new YouTube video to teach you how to evaluate a RAG application. We'll do it step by step, starting from scratch.

8 out of 10 people I talk to are evaluating their LLM-powered systems manually. This is wild! They try a few samples and deploy the system if the answers look good. It reminds me of people testing the UI of an application by just "looking at it" from time to time. Please, don't do this.

In this video, I'll show you how you can build automated tests for a simple RAG system that answers questions from a website. It's a 50-minute video. My goal is not to show you the code but to help you understand everything that's happening.

Here is the link to the video: youtu.be/ZPX3W77h_1E

I'm using @langchain and @giskard_ai to implement the evaluation process. Giskard is an open-source library that will help you with the following:

1. Generate test cases automatically. Each test case consists of a question, a ground-truth answer, and a reference context.
2. Run every test case and point out problematic topics and RAG components that need improvement.
3. Show recommendations to improve the system.

It's a great library! Star their repository here: github.com/Giskard-AI/gis…

Hope you enjoy the video!
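The automated-testing idea can be sketched in plain Python. This is not Giskard's API — the function names, the toy knowledge base, and the token-overlap metric are all hypothetical stand-ins, just to illustrate running generated test cases against a RAG answer function instead of eyeballing a few samples:

```python
# Hypothetical sketch of automated RAG evaluation (not Giskard's real API).
# Each test case pairs a question with a ground-truth answer; the system's
# answer is scored by crude token overlap rather than manual inspection.

def token_overlap(answer: str, reference: str) -> float:
    """Fraction of reference tokens that appear in the answer (toy metric)."""
    ref = set(reference.lower().split())
    ans = set(answer.lower().split())
    return len(ref & ans) / len(ref) if ref else 0.0

def evaluate_rag(answer_fn, test_cases, threshold=0.5):
    """Run every test case and collect the ones scoring below the threshold."""
    failures = []
    for case in test_cases:
        score = token_overlap(answer_fn(case["question"]), case["ground_truth"])
        if score < threshold:
            failures.append((case["question"], score))
    return failures

# Toy RAG stand-in: answers come from a tiny hard-coded "knowledge base".
kb = {"What does the library do?": "it serves large language models efficiently"}
answer_fn = lambda q: kb.get(q, "I don't know")

cases = [{"question": "What does the library do?",
          "ground_truth": "it serves large language models"}]
print(evaluate_rag(answer_fn, cases))  # → [] (no failures)
```

A real pipeline would swap `answer_fn` for the RAG chain and use a semantic metric, but the shape of the test loop stays the same.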
[YouTube video]
14 · 76 · 522 · 65.3K
bnomial reposted
Santiago @svpino
vLLM has blown every other LLM inference method out of the water.

vLLM is an open-source library for serving large language models. It uses a new attention algorithm to deliver up to 24x higher throughput than HuggingFace Transformers without requiring model changes. This is a huge deal.

The library was developed at UC Berkeley. It makes serving any LLM affordable for anyone with limited computing resources.

vLLM will give you higher throughput, but serving an LLM in production requires much more. You'll need cost-optimized computing resources, a scalable orchestration mechanism, and a streamlined pipeline to host your LLM.

The @monsterapis team built one of the first large-scale implementations of vLLM behind their text-generation LLM APIs. Thanks to vLLM's speed, you can now access their models with a 98% cost reduction.

You can use any of the following models:

• Mistral 7B Instruct
• Microsoft Phi-2
• HuggingFace Zephyr-7b-beta
• TinyLlama-1.1B-Chat-v1.0

The team gave me 10,000 free credits for anyone who uses the code "SANTIAGO" in their dashboard: monsterapi.ai/signup

If you want to get their latest updates, free credits, and special offers, join their Discord server: discord.com/invite/mVXfag4…
[image]
11 · 90 · 623 · 88.5K
bnomial reposted
Santiago @svpino
I recorded a step-by-step tutorial on building a RAG application from scratch.

It's a 1-hour YouTube video where I show you how to use Langchain, Pinecone, and OpenAI. You'll learn how to build a simple application that answers questions about YouTube videos using an LLM.

But the video is not just about the code (the code is usually the least important thing!). It's about helping you understand the reason for every component and how to build a chain to solve the problem.

Here is the video: youtu.be/BrsocJb-fAo
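The retrieval step at the heart of such a chain can be sketched without any framework. This toy bag-of-words retriever is an assumption-laden stand-in for the LangChain/Pinecone pipeline in the video — the chunks and similarity metric are illustrative only:

```python
# Toy retrieval step for a RAG chain: rank transcript chunks by cosine
# similarity of bag-of-words vectors (a stand-in for a real vector store).
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = Counter(question.lower().split())
    ranked = sorted(chunks,
                    key=lambda c: cosine(q, Counter(c.lower().split())),
                    reverse=True)
    return ranked[:k]

chunks = [
    "the speaker explains how transformers use attention",
    "sponsorship message about a mattress company",
]
print(retrieve("how does attention work in transformers", chunks))
```

A production chain would replace the bag-of-words vectors with embeddings and feed the retrieved chunks into the LLM prompt, but the retrieve-then-answer structure is the same.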
[YouTube video]
22 · 171 · 1.2K · 173.8K
bnomial reposted
Santiago @svpino
Better data beats better models.

Most AI projects never graduate from the demo stage. This is now painfully obvious with Large Language Model applications. One of the main culprits is low-quality data full of incorrect edge cases.

Models work by approximating their datasets. Give them garbage, and garbage you'll get. You can't out-train bad data. You have to deal with it.

Check out this open-source project founded by a team from MIT: github.com/cleanlab/clean…

I've been a massive fan of Cleanlab for a while. I took their Data-Centric AI MIT class (publicly available online). Their tool is even better. It will help you curate any image, text, or tabular dataset, and do it automatically.

If you are building an LLM application, you can use Cleanlab as follows:

• To improve the model inputs and outputs
• To improve the dataset you are using to fine-tune the model

Cleanlab will automatically detect the following:

• Low-quality instructions and responses
• Unsafe or poorly written text
• Model outputs that aren't trustworthy
• Model outputs that look unrealistic

The team invested in a lot of research to make this possible, and much of it is published! Here is a blog post you must read. It shows how their tool can find and fix problems in any LLM instruction-tuning dataset: cleanlab.ai/blog/filter-ll…

If you are seriously working on LLM applications, Cleanlab is a tool you must check out.
[image]
9 · 83 · 497 · 50.4K
bnomial reposted
Santiago @svpino
People are sick and tired of $9.99 online, beginner-level video courses.

One year ago, I started teaching a hard-core Machine Learning class focused on my experience in the field. And it's the best advanced-level machine learning class on the Internet.

Here is what makes my program different:

1. You'll learn from real-life experience building real-world projects.
2. It's live. You'll interact with me and your classmates.
3. It's a hands-on, practice-first program. You'll write a ton of code.
4. It focuses on the 95% of the work that books and courses ignore.
5. It lasts 3 weeks: 18 hours of live content and 10+ hours of recordings.
6. You get access to production-level code templates you can reuse.
7. It gives you lifetime access to every class. Forever.

The program is about building end-to-end Machine Learning applications. We cover the entire process: training, tuning, evaluating, registering, deploying, and monitoring models.

We go deep into many advanced, real-world practices like active learning, distributed training, human-in-the-loop deployments, model compression, test-time augmentation, and testing in production, among many others.

This is not a beginner-friendly class. This is not for people who aren't willing to put in the work. The class is tough, goes deep, and will be the best money you'll spend in 2024.

The next iteration of the program starts on April 8th. You can join from ml.school.
[image]
17 · 37 · 346 · 56.2K
bnomial reposted
Santiago @svpino
Here's one of the biggest breakthroughs in LLM fine-tuning:

Most haven't realized this yet, but anyone can now fine-tune a large model to personalize results for individual customers. Fine-tuning huge models is no longer exclusive to multi-billion-dollar companies, thanks to LoRA.

Here is how you can use LoRA adapters. Imagine you want to deploy an LLM. You have two options:

• Build a general model that works for all your customers
• Build a personalized model for each of your customers

Everyone knows that quality comes from specialization. Unfortunately, we cannot fine-tune and serve a custom LLM for each customer. This would be too costly even to consider. And that's where LoRA comes in.

LoRA works by creating and training a separate, small adapter, so you can fine-tune large models at a fraction of the time and cost. But that's just the beginning:

LoRA doesn't modify the original model, only the small adapters. You can swap these adapters in a production application to accomplish different tasks.

You can't fine-tune 100 large models to personalize the experience for 100 customers, but you can fine-tune 100 LoRA adapters. These adapters are small, and training them is affordable and fast.

If you aren't using LoRA, you should look into it immediately. Unfortunately, fine-tuning your model with LoRA is not straightforward. That's where @monsterapis comes in with their efficient no-code LoRA/QLoRA-powered LLM fine-tuner. To my knowledge, this platform offers the most advanced and efficient way to fine-tune and deploy open-source Large Language Models.

After fine-tuning your model with LoRA, you can serve the large model and the appropriate adapter to provide personalized results to each customer. You can switch between these adapters while keeping all of them on the same GPU.

Serving 100 fine-tuned or traditional models would require 100 GPUs. Serving 1 model with 100 adapters requires a single GPU.

The @monsterapis team partnered with me and gave me 10,000 free credits for anyone who uses the code "SANTIAGO" in their dashboard: monsterapi.ai/signup

If you want to read their latest updates, get free credits, and special offers, join their Discord server: discord.com/invite/mVXfag4…
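The cost argument is easy to check with back-of-the-envelope arithmetic. LoRA parameterizes the weight update of a matrix W (d_out × d_in) as the product of two low-rank factors B (d_out × r) and A (r × d_in). The sizes below are illustrative, not taken from any specific model:

```python
# Trainable-parameter count for one weight matrix: full fine-tune vs. a
# LoRA adapter. Illustrative transformer-layer sizes and rank (assumptions).
d_out, d_in, rank = 4096, 4096, 8

full_params = d_out * d_in                 # every entry of W is trainable
lora_params = d_out * rank + rank * d_in   # only the two low-rank factors

print(full_params)                 # 16777216
print(lora_params)                 # 65536
print(full_params // lora_params)  # 256 -> 256x fewer trainable parameters
```

At rank 8 the adapter is 256x smaller than the matrix it adapts, which is why training and storing 100 adapters is plausible where 100 full fine-tunes are not.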
[image]
13 · 103 · 461 · 68.7K
bnomial reposted
Santiago @svpino
Tesla uses 8-bit integers to run their models in real time. We call this process quantization. It's how QLoRA lets us fine-tune billions of parameters using consumer hardware.

Here is everything you need to know about QLoRA in plain English:

Only a few companies can afford to fine-tune large models. The process is expensive and time-consuming, and very few people have the skills to do it correctly. But fine-tuning is critical. It's what turns a mediocre model into a highly specialized one, and every company in the world wants better models.

Then we invented LoRA. It's a mind-blowing trick that approximates billions of parameters using the product of two smaller matrices. Instead of fine-tuning every model parameter, we can train the small approximation matrices and get the same results. LoRA made fine-tuning cheaper and faster.

But we went one step further with QLoRA.

Quantization reduces the numerical precision used to represent model parameters. For example, instead of using 16 bits to store 42.7, we could quantize it into 8 bits as the number 42. Quantization reduces the size of models, which is crucial when model speed matters. But there's a trade-off: quantization makes models less accurate.

QLoRA is the answer to that. We first quantize the model and then use LoRA to fine-tune the low-rank matrices on a specific task. This fine-tuning helps those matrices learn adjustments that counteract the quantization errors.

QLoRA enables us to fit large models into less GPU memory using quantization without trading off precision! While a full fine-tuning of a 70B-parameter model requires 780GB of GPU memory, with QLoRA you can do it with only 48GB.

For those who want to use LoRA and QLoRA to fine-tune a model, check out @monsterapis and their efficient no-code LoRA/QLoRA-powered LLM fine-tuner: MonsterTuner. Their platform automatically configures a cost-optimized GPU environment and fine-tuning pipeline for your specific model.

To my knowledge, this platform offers the most advanced and efficient way to fine-tune and deploy open-source Large Language Models. They partnered with me and gave me 10,000 free credits for anyone who uses the code "SANTIAGO" in their dashboard: monsterapi.ai/signup

If you want to get their latest updates, free credits, and special offers, join their Discord server: discord.com/invite/mVXfag4…
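The "42.7 becomes 42" example generalizes to uniform quantization with a scale factor. A minimal sketch of symmetric int8 quantization follows — a simplification of what real quantizers do (no zero-points, no per-channel scales), with made-up example weights:

```python
def quantize(values, bits=8):
    """Symmetric uniform quantization: map floats to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1                    # 127 for int8
    scale = max(abs(v) for v in values) / qmax    # one step in float units
    return [round(v / scale) for v in values], scale

def dequantize(quantized, scale):
    """Map the integer codes back to approximate floats."""
    return [q * scale for q in quantized]

weights = [42.7, -13.1, 0.5, 88.2]                # made-up example weights
codes, scale = quantize(weights)
restored = dequantize(codes, scale)

print(codes)                                      # e.g. 42.7 maps to code 61 here
# Rounding bounds the error by half a quantization step:
print(max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored)) <= scale / 2)  # True
```

The bounded rounding error is the "less accurate" trade-off in the post; QLoRA's adapters are then trained to compensate for exactly this kind of error.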
[image]
12 · 175 · 923 · 105.8K
bnomial reposted
Santiago @svpino
Imagine if airlines interviewed pilots by asking them to recite the flight manual in Latin instead of flying a plane. That's what tech interviews are.
31 · 95 · 820 · 120.4K
bnomial reposted
Santiago @svpino
LoRA adapters are revolutionary. They have changed how we fine-tune and serve models. Unfortunately, most companies haven't realized it yet.

TL;DR: Companies can now start shipping fast and affordable models that provide a personalized experience to customers.

A company has two options when deploying an LLM:

• Build a general model that works for all your customers
• Build a personalized model for each of your customers

Personalized versus general. Everyone knows that quality comes from specialization. Unfortunately, companies cannot fine-tune and serve a custom LLM for each customer. This would be too costly to even consider. And that's where most conversations end. But they shouldn't.

I've talked about parameter-efficient fine-tuning before. Specifically, I've written about LoRA and how it helps fine-tune a model for a fraction of the cost. But the beauty of LoRA goes beyond efficiency:

LoRA works by creating and training a separate, small adapter. LoRA doesn't modify the original model. You can use multiple adapters with the same model to accomplish different tasks.

A company with 100 customers can't fine-tune and serve 100 large models, but they can train a LoRA adapter for each. These adapters are small, and training them is affordable and fast.

When serving the models, the company will use the large model and the appropriate adapter to provide personalized results to the customer. They can switch between these adapters while keeping all of them on the same GPU. This is amazing!

Serving 100 fine-tuned or traditional models would require 100 GPUs. Serving 1 model with 100 adapters requires a single GPU.

If you aren't using LoRA, you should look into it immediately. Unfortunately, fine-tuning your model with LoRA is not straightforward. That's where @monsterapis comes in with their efficient no-code LoRA/QLoRA-powered LLM fine-tuner.

To my knowledge, this platform offers the most advanced and efficient way to fine-tune and deploy open-source Large Language Models. They partnered with me and gave me 10,000 free credits for anyone who uses the code "SANTIAGO" in their dashboard: monsterapi.ai/signup

If you want to read their latest updates, get free credits, and special offers, join their Discord server: discord.com/invite/mVXfag4…
[image]
5 · 39 · 168 · 23.2K
bnomial reposted
Santiago @svpino
Companies aren't our families. We are a line item on a spreadsheet, a convenient tool that helps pay for our boss' vacation.

I understand not everybody can be an entrepreneur, but what about you? Do you want to build somebody else's future forever, or would you rather try to build something that's yours?

Writing code is one of the best-paid skills in the world. If you know how to do it, you have what it takes to break free from the rat race. Don't quit your job. Start something on the side.

I made $600,000 working as a freelancer during nights and weekends. I used Upwork to find clients. I didn't have to quit my job. If I did it, I'm sure you can too. It's an excellent way to break free from the paycheck addiction.

The fantasy of many people I know is writing an email telling their boss they quit. Feeling free from the bullshit. If that sounds good, here is how I can help:

I recorded a course with everything I know about making side money using Upwork. You can get it for the next 24 hours at half price. That's only $20! Here is the link: svpino.gumroad.com/l/upwork/ksoxj…

Watch the course. If you don't find it valuable, I'll refund you, no questions asked.

This course is how I did it. I hope it helps you do the same.
26 · 21 · 312 · 57.3K
bnomial reposted
Santiago @svpino
I only found this a few weeks ago:

• Open the ChatGPT app on your phone
• Tap the little headphones icon
• Start talking to the model
• Keep the white button pressed while you talk
• Release the button when you finish talking
• Keep the conversation going

I have been using this while walking. It works very well. Transcriptions are amazing. I haven't had any connectivity glitches, even when walking in and out of cellular coverage.

This has been transformative.
53 · 57 · 622 · 198.8K
bnomial reposted
Santiago @svpino
Take your data and split it into 10 different subsets: the first one with 10% of the data, the second with 20%, the third with 30%, and so on.

Train a model using each subset. Evaluate it and plot the results.

The attached image shows two examples, and the conclusions are very different.

The green model can use more data. The model will improve if you add 10-30% more data.

The red model will not improve with more data. You will waste your time if you spend it collecting and labeling more samples.

This is a straightforward experiment. More data doesn't always help.
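The experiment is easy to script. Here is a minimal sketch with a toy synthetic dataset and a stand-in threshold "model" — both are assumptions; swap in your own data, train(), and evaluate() functions:

```python
# Learning-curve experiment: train on 10%, 20%, ..., 100% of the data and
# watch whether the evaluation score keeps improving. Dataset and model are toys.
import random

random.seed(0)

# Synthetic data: label is 1 when the feature exceeds 0.5, with 10% label noise.
features = [random.random() for _ in range(1000)]
data = [(x, int(x > 0.5) if random.random() > 0.1 else 1 - int(x > 0.5))
        for x in features]
train_data, test_data = data[:800], data[800:]

def train(subset):
    """Toy 'model': a decision threshold halfway between the class means."""
    pos = [x for x, y in subset if y == 1]
    neg = [x for x, y in subset if y == 0]
    if not pos or not neg:
        return 0.5
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def evaluate(threshold, test):
    """Accuracy of the threshold rule on held-out data."""
    return sum(int(x > threshold) == y for x, y in test) / len(test)

for frac in range(1, 11):
    subset = train_data[: len(train_data) * frac // 10]
    acc = evaluate(train(subset), test_data)
    print(f"{frac * 10:3d}% of data -> accuracy {acc:.2f}")
```

Plot the ten accuracies and you get exactly the green-vs-red picture from the post: if the curve is still climbing at 100%, more data will help; if it flattened long ago, it won't.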
[image]
20 · 78 · 513 · 73.6K
bnomial reposted
Santiago @svpino
Here is an open-source library that makes integrating AI into an application extremely easy: CopilotKit. Star their repository: github.com/CopilotKit/Cop…

The library has two components.

The first is CopilotTextarea:
• A drop-in replacement for any textarea field in your app
• It offers text generation and autocomplete functionality
• You can use it to tag external content

The second is CopilotPortal:
• A plug-and-play AI chatbot
• It uses your current application state
• It can take actions inside your application and backend
• It supports plugins

The library is open-source. You can self-host it. You can use it with any LLM, including GPT-4.

This project was #2 on HackerNews and Product Hunt, and it was trending on GitHub. The library works on any React app, but the team is working to expand it.

Thanks to the team for showing me their tool and collaborating with me on this post!
[image]
15 · 177 · 976 · 129.1K
bnomial reposted
Santiago @svpino
2 billion people use spreadsheets every month. Their lives are about to change.

Here is ChatGPT integrated natively into spreadsheets. It's one of the best applications of AI that will spread in 2024. You can now:

• Generate content based on other cells
• Summarize, translate, and rewrite
• Analyze reviews at scale
• Reformat and extract information
• Enrich existing data

And it's as simple as using ChatGPT as a spreadsheet formula.

This is @NumerousAI. They collaborated with me on this post. Their tool came out in March last year. I think it will explode in the coming months.

Check it out here: numerous.ai.
50 · 287 · 1.7K · 431.2K
bnomial reposted
Santiago @svpino
In 2024, I'll write about the 50 most important lessons I've learned as a Machine Learning Engineer. One at a time. And I'll share them for free.

Subscribe here if you are interested: underfitted.svpino.com
4 · 17 · 152 · 30.2K
bnomial reposted
Santiago @svpino
Salary will not make you rich. You need to build your own thing. I made a ton of money freelancing, and I'll show you how to do the same.

I came to the US from a socialist country. I was poor. I didn't own a computer until I was 23, and I first used the Internet in 2001. On top of it all, I didn't speak English.

I started freelancing for $8 an hour, using a dictionary to translate my emails. I had to figure it out. My last freelance client paid me $300 per hour. I made $600,000 on Upwork alone, working on the side of my regular job.

I'm now done. I quit. I don't need to work for money anymore. I lost the geographical lottery, but things turned out okay. If I made it, people like you can make it as well.

This brings me back to Upwork. That's where everything started for me. I talk to folks who complain they can't make money on Upwork. They say you can't get a job because too many people charge too little.

It turns out they are right. Competing on price is stupid. You are dead if your strategy is to charge less than your competition.

I know how to get jobs. I know how to attract attention. I learned my tricks after 20 years of freelancing. I became a Top Rated Plus freelancer on Upwork with 100% Job Success. Before leaving the platform, I sent 79 proposals and closed 19. A 24% closing rate is the highest I've seen!

A few months ago, I recorded a 1-hour video with everything I know about Upwork:

• How to find the projects that everyone else misses.
• How to get hired, regardless of how many people apply.
• How to structure your profile and proposals.

To finish 2023, and for the next 24 hours, you can get this course at half price. That's only $20! I don't know what to tell you if you can't turn those $20 into a stable side income in a few weeks of work.

Here is the link with the discount: svpino.gumroad.com/l/upwork/ksoxj…

I'll go one step further: I'll refund you if you don't find the course valuable. No questions asked.

This is the course I didn't have when I started. These are the tips I wish somebody had told me. I hope they help you as much as they helped me.
[image]
27 · 98 · 896 · 187.1K
bnomial reposted
Santiago @svpino
AI will be one of the most crucial skills for the next 20 years. If I were starting today, I'd learn these:

• Python
• LLMs
• Retrieval Augmented Generation (RAG)

Here are 40+ free lessons and practical projects on building advanced RAG applications for production: 1/4
37 · 338 · 2K · 342.9K
bnomial reposted
Santiago @svpino
Deep TDA is a new algorithm that uses self-supervised learning to overcome the limitations of traditional dimensionality-reduction algorithms. t-SNE and UMAP have long been the favorites, but Deep TDA has many advantages over them.

Here is a use case about Intel and semiconductors: manufacturing semiconductors is a complex process that generates a lot of data. One of the biggest challenges is finding the root cause of any failure. This data is imbalanced and sparse, and classical AI/ML approaches are ineffective. That's where Deep TDA comes in.

Here is an excellent article showing how Intel is using Deep TDA to solve this problem: datarefiner.com/feed/semicondu…

The key advantages of Deep TDA:

1. It is more robust to noise and outliers in the data
2. It can scale to complex and high-dimensional datasets
3. It doesn't require as much tuning or knowledge about the data
4. It can capture and represent the bigger picture of the dataset

Thanks to @datarefiner for partnering with me on this post! They published an article about Deep TDA: "Why you should use Topological Data Analysis over t-SNE or UMAP?" datarefiner.com/feed/why-tda
[image]
6 · 75 · 386 · 71.5K