Gopi Kumar

525 posts

@zenlytix

AI Infra @ Microsoft Research. https://t.co/yXMFohcHB1. Amateur musician @ https://t.co/jbymO74ZpX. Opinions my own.

Redmond, WA · Joined March 2016
74 Following · 234 Followers
Dylan Patel @dylan522p
$1,000 for whoever comes up with the best name replacement for InferenceMAX. InferenceMAX 2.0 is dropping soon, but we have to rename it because HBO MAX sent us a cease and desist. We have all NVIDIA GPUs from H100 to GB300 tested on large MoEs with SOTA optimizations like Disagg PD.
Gopi Kumar @zenlytix
In part two of the AI coding practices series, we look at AI-assisted coding in the enterprise production-code scenario. linkedin.com/pulse/coding-e… Let me know what you think and whether it resonates with you.
Gopi Kumar @zenlytix
Putting together a series of short posts with some thoughts and opinions on the practice of vibe coding. This is the introductory post to set the stage for the conversation. Appreciate your feedback and suggestions. linkedin.com/pulse/practice…
[attached image]
Gopi Kumar @zenlytix
@miguelgfierro Agree with learning by practice. One thing you benefit from when starting with RAG, or even zero-shot prompting an LLM, is understanding how to evaluate results. You don't need to know at first how to build a model or what gradient descent is, but the basics of how to measure are key to grok from the start.
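The point about measuring from day one can be made concrete with even a tiny evaluation harness. A minimal sketch of a normalized exact-match score (the function names and normalization choices here are illustrative, not a standard metric implementation):

```python
import re
import string

def normalize(text):
    """Lowercase, strip punctuation and collapse whitespace before comparing."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()

def exact_match_rate(predictions, references):
    """Fraction of predictions that match their reference after normalization."""
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)

preds = ["Paris.", "  blue whale ", "42"]
refs = ["paris", "Blue Whale", "forty-two"]
print(exact_match_rate(preds, refs))  # 2 of 3 match after normalization
```

Exact match is crude, but it forces you to define what "correct" means up front; a semantic-similarity or LLM-as-judge comparison can replace `normalize(p) == normalize(r)` later without changing the harness.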
Miguel Fierro @miguelgfierro
Unpopular opinion: if you want to get into AI, start from a RAG system like this instead of linear regression. I call this reverse learning.

𝐓𝐫𝐚𝐝𝐢𝐭𝐢𝐨𝐧𝐚𝐥 𝐰𝐚𝐲: Linear regression -> Logistic Regression -> SVMs -> Decision Trees -> Random Forests -> Gradient Boosted Trees -> CNNs -> RNNs -> LSTMs -> LLMs -> RAG.
𝐑𝐞𝐯𝐞𝐫𝐬𝐞 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠: RAG -> LLMs -> Gradient Boosted Trees -> Random Forests -> Logistic Regression.

𝐓𝐫𝐚𝐝𝐢𝐭𝐢𝐨𝐧𝐚𝐥 𝐰𝐚𝐲: First study the theory, then practice with projects.
𝐑𝐞𝐯𝐞𝐫𝐬𝐞 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠: First practice with projects, then learn the theory underlying the models.

𝐓𝐫𝐚𝐝𝐢𝐭𝐢𝐨𝐧𝐚𝐥 𝐰𝐚𝐲: Six months to learn the AI that companies require to get an AI position.
𝐑𝐞𝐯𝐞𝐫𝐬𝐞 𝐋𝐞𝐚𝐫𝐧𝐢𝐧𝐠: One month to learn the AI that companies require to get an AI position.

The key idea is to start from the AI that is useful to get an AI position instead of the AI that was used 30 years ago.

𝐐𝐮𝐞𝐬𝐭𝐢𝐨𝐧: But it's impossible to do RAG+LLMs if you don't know linear and logistic regression!
𝐀𝐧𝐬𝐰𝐞𝐫: Are you sure? In the picture below you have a RAG solution with a DeepSeek LLM in less than 30 lines of code.

Challenge the status quo! Vamos!!! 🦾🦾🦾
[attached image]
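The image in Miguel's post shows a ~30-line RAG solution with a DeepSeek LLM. As a dependency-free sketch of just the retrieval half (the bag-of-words cosine similarity is a stand-in for a real embedding model, and the documents are made up for illustration):

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Crude bag-of-words vector; a real RAG system would use embeddings."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, documents, k=1):
    """Rank documents by similarity to the question and return the top k."""
    q = vectorize(question)
    return sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)[:k]

docs = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Gradient descent minimizes a loss function iteratively.",
    "RAG augments an LLM prompt with retrieved context.",
]
question = "What does RAG add to an LLM prompt?"
context = retrieve(question, docs)[0]
# The retrieved passage is stuffed into the prompt sent to the LLM.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(context)
```

The LLM call is the only missing piece; swapping in an API client and a real vector index turns this sketch into the full retrieve-then-generate pattern.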
Vaibhav (VB) Srivastav @reach_vb
On-device AI framework ecosystem is blooming these days:
1. llama.cpp - All things Whisper, LLMs & VLMs - runs across Metal, CUDA and other backends (AMD/NPU etc.) github.com/ggerganov/llam…
2. MLC - Deploy LLMs across platforms, especially WebGPU (fastest WebGPU LLM implementation out there) github.com/mlc-ai/web-llm
3. MLX - Arguably the fastest general-purpose framework (Mac only) - supports all major Image Generation (Flux, SDXL, etc.), Transcription (Whisper), LLMs github.com/ml-explore/mlx…
4. Candle - Cross-platform general-purpose framework written in Rust - wide coverage across model categories github.com/huggingface/ca…
Honorable mentions:
1. Transformers.js - JavaScript (WebGPU) implementation built on top of ONNX Runtime Web github.com/xenova/transfo…
2. Mistral.rs - Rust implementation for LLMs & VLMs, built on top of Candle github.com/EricLBuehler/m…
3. Ratchet - Cross-platform, Rust-based WebGPU framework built for battle-tested deployments github.com/huggingface/ra…
4. Zml - Cross-platform, Zig-based ML framework github.com/zml/zml
Looking forward to how the ecosystem will look 1 year from now - quite bullish on the top 4 atm - but the open source ecosystem changes quite a bit! 🤗 Also, which frameworks did I miss?
Gopi Kumar @zenlytix
Here is a short walkthrough of running the Phi-3 Mini model on a Windows 365 Cloud GPU desktop with an #nvidia A10, showing the e2e steps: setup, downloading the model, and running model inference in two different ways (all in under 5 mins) youtu.be/xNT8aRJeC3k
Gopi Kumar retweeted
Jeff Boudier 🤗 @jeffboudier
Today at @Microsoft Build, @satyanadella announced a deepened partnership with @huggingface, with new experiences across cloud, hardware, open source and developers. Here's a round-up of all the joint work we announced this week! huggingface.co/blog/microsoft…
- Azure AI Studio Model Catalog
- Azure AMD GPU MI300X VMs
- Phi-3 open models
- WebGPU with transformers.js and optimum
- Spaces Dev Mode
Gopi Kumar @zenlytix
3. Download the Phi-3 ONNX DirectML model:
huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --include directml/* --local-dir . --local-dir-use-symlinks False
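Once the ONNX weights from the download step are on disk, the onnxruntime-genai runtime drives a token-by-token decode loop. The real API is package- and version-specific, so here is a stub-based sketch of only the loop's shape (`StubModel` and every name in it are hypothetical stand-ins, not the onnxruntime-genai API):

```python
EOS = -1  # sentinel end-of-sequence token id for the stub

class StubModel:
    """Stand-in for the real DirectML model: replays a scripted reply."""
    def __init__(self, reply_tokens):
        self._reply = iter(reply_tokens + [EOS])

    def next_token(self, context):
        # A real model computes logits over the vocabulary from `context`,
        # and a decoding strategy (greedy, sampling) picks the next token.
        return next(self._reply)

def generate(model, prompt_tokens, max_length=32):
    """Greedy decode loop: extend the sequence one token at a time until EOS."""
    tokens = list(prompt_tokens)
    while len(tokens) < max_length:
        tok = model.next_token(tokens)
        if tok == EOS:
            break
        tokens.append(tok)
    return tokens

model = StubModel(reply_tokens=[10, 11, 12])
print(generate(model, prompt_tokens=[1, 2, 3]))  # [1, 2, 3, 10, 11, 12]
```

In the actual runtime the same loop is wrapped behind a generator object; the point is only that inference is an incremental append-and-predict cycle over the downloaded model.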
Gopi Kumar @zenlytix
Four easy steps:
1. Create a Conda environment and activate it:
conda create --name phiamdv620 python==3.10 -y
conda activate phiamdv620
mkdir phiamdv620
cd phiamdv620
2. Install the following packages:
pip install numpy huggingface-hub
pip install --pre onnxruntime-genai-directml
Gopi Kumar @zenlytix
Environment used: Azure AMD GPU V620 VM, Standard NG32ads V620 v1 (32 vCPUs, 64 GiB memory, 1x V620 GPU), running a Windows 11 Pro VM image
Gopi Kumar retweeted
Sebastien Bubeck @SebastienBubeck
phi-3 is here, and it's ... good :-). I made a quick short demo to give you a feel of what phi-3-mini (3.8B) can do. Stay tuned for the open weights release and more announcements tomorrow morning! (And ofc this wouldn't be complete without the usual table of benchmarks!)
[attached image]
Gopi Kumar @zenlytix
@bindureddy I guess if Alice is playing online chess, it's hard to say what the other sister is doing? :)
Bindu Reddy @bindureddy
Bard is Now Gemini! My initial thoughts:
- Still continues to be somewhat nerfed and refuses to answer questions
- Refused to generate a simple illustration of George Clooney; ChatGPT is better
- Missing PDF upload
- Answers do seem better than the previous version - seems to have a "reasoning vibe"
- However, it does NOT answer some hard questions that GPT-4 does. For example, it didn't get "In a room I have only 3 sisters. Anna is reading a book. Alice is playing a match of chess. What the third sister, Amanda, is, doing ?" The answer is the 3rd sister is playing chess. GPT-4 nails it.
Overall, we plan to do a lot more analysis, but first impressions are good but not great. TLDR; I don't think it will make a material difference to how Bard was doing before, especially if their plan is to charge for this. However, it's always good to have more players in the market. 🤷‍♀️