Bibin Wilson

2.4K posts

@bibinwillson

Founder @DevOpsCube & CrunchOps Consulting

Join 19000+ Readers →

Joined February 2010

22 Following · 473 Followers

Pinned Tweet

Bibin Wilson@bibinwillson·9 Mar

Turning a model into a running app on Kubernetes 🚀

In the MLOps series, we completed Phase 1 by successfully deploying a machine learning model on KServe.

The latest edition covers:
- Dockerizing the inference service
- Dockerizing the frontend application
- Why do we need KServe? Why not a k8s Deployment?
- Serving the model using KServe
- Deploying a frontend that interacts with the KServe inference endpoint
- How large models are served in KServe

The goal of Phase 1 was to help DevOps engineers understand the basic ML concepts required to get started with CNCF-based AI/ML tools.

𝗥𝗲𝗮𝗱 𝗟𝗮𝘁𝗲𝘀𝘁 𝗘𝗱𝗶𝘁𝗶𝗼𝗻: newsletter.devopscube.com/p/deploying-mo…

In the upcoming editions, we will dive deeper into key AI/ML tools and workflows. All the concepts we learned in Phase 1 will make those workflows much easier to understand.

#mlops

Bibin Wilson@bibinwillson·27 Apr

If you want to learn MLOps, don't jump straight to GPU scheduling, device plugins, etc. First understand how model serving actually works on a single GPU.

We published a hands-on guide on running and optimizing an LLM on a single GPU using Docker and vLLM.

Here is what you will learn 👇
- How to set up a GPU environment with Docker and the NVIDIA runtime
- How to deploy Llama 3 with vLLM
- How to optimize GPU memory
- How quantization improves throughput and reduces VRAM usage
- How continuous batching improves performance under load

𝗗𝗲𝘁𝗮𝗶𝗹𝗲𝗱 𝗚𝘂𝗶𝗱𝗲: devopscube.com/deploying-llam…

Use it to understand how LLM serving actually works before deploying on Kubernetes. Once you understand vLLM with Docker, the next step is to run it on Kubernetes. You can run the same stack using:
- vLLM for model serving
- Kubernetes Gateway API Inference Extension for external traffic routing

#devops #mlops
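To make the quantization point concrete, here is a rough back-of-the-envelope sketch of how weight precision affects VRAM. It assumes an 8-billion-parameter model (the post only says "Llama 3", so the size is an assumption) and counts weights only. KV cache, activations, and runtime overhead come on top.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1024**3


# Assumption for illustration: an 8-billion-parameter model.
PARAMS = 8e9

fp16_gib = weight_memory_gib(PARAMS, 16)  # 16-bit weights
int4_gib = weight_memory_gib(PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f"4-bit weights:  {int4_gib:.1f} GiB")
```

Under these assumptions the weights shrink from roughly 15 GiB to under 4 GiB, which is the headroom that can then go to KV cache and larger batches.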

Bibin Wilson retweeted

DevopsCube@devopscube·10 Mar

From DNS to Pod: How the k8s Gateway API actually works.

- You create a DNS record pointing to your cloud Load Balancer IP.
- The Load Balancer forwards traffic to a Kubernetes Service, specifically the Gateway Service endpoint.
- This Service points to the gateway proxy pods. These could be NGINX, Envoy, or any compatible proxy.
- The Gateway Controller (e.g., NGINX Gateway Fabric) watches for HTTPRoute, GRPCRoute, and similar resources.
- When you apply these routes, the controller automatically configures the gateway proxy to match.
- The HTTPRoute resource is what decides where your traffic actually goes. For example, /payment to payment-service and /auth to auth-service.

So the full traffic flow looks like this 👇

DNS → Cloud LB → Gateway Service → Gateway Proxy → backend Service → Pod

If you understand the Ingress flow well, relating it to the Gateway API is very easy.

A key difference is that in the classic Ingress model, the controller itself acts as the proxy. In the Gateway API, the controller configures and manages dedicated proxy instances (Gateways), creating a clear separation of concerns.

We share such DevOps/MLOps concepts and deep dives in our newsletter.

𝗥𝗲𝗮𝗱 𝗶𝘁 𝗵𝗲𝗿𝗲 (𝟭𝟬𝟬% 𝗳𝗿𝗲𝗲): newsletter.devopscube.com

Over to you… Are you using Gateway API in production? If yes, we would love to hear about your experience with it.

♻️ If this helped, repost it so others can learn too.

#kubernetes
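The routing decision in the last bullet can be sketched in a few lines. This is a toy model of path-prefix matching, not the Gateway API implementation. The route table and the `pick_backend` helper are hypothetical names that mirror the /payment and /auth example from the post.

```python
from typing import Optional

# Hypothetical route table mirroring the example in the post.
ROUTES = {
    "/payment": "payment-service",
    "/auth": "auth-service",
}


def pick_backend(path: str, routes: dict[str, str]) -> Optional[str]:
    """Return the backend for the longest matching path prefix, or None."""
    matches = [
        prefix
        for prefix in routes
        # Match whole path segments only: /payment matches /payment/charge
        # but not /paymentx.
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    ]
    if not matches:
        return None
    return routes[max(matches, key=len)]


print(pick_backend("/payment/charge", ROUTES))
print(pick_backend("/auth", ROUTES))
print(pick_backend("/unknown", ROUTES))
```

Matching on whole path segments and preferring the longest prefix is a simplification of how prefix rules are usually resolved. A real gateway also considers hostnames, headers, and methods.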

Bibin Wilson retweeted

DevopsCube@devopscube·9 Mar

Master Istio Architecture in 12 Minutes (Illustrated Guide)

If you want to understand Istio better, you need to understand its complete architecture and how the Istio components interact with each other.

In our latest newsletter edition, we break this down with clear diagrams and simple explanations.

Here is what you will learn 👇
- Why Istio needed a new architecture
- Overview of Istio architecture
- Deep dive into key components: Istiod, Ztunnel, Waypoint Proxy, and CNI
- Is Ztunnel a single point of failure?
- Business use cases of Ambient Mesh, including cost benefits
- Hands-on guide to setting up Istio Ambient Mesh

𝗥𝗲𝗮𝗱 𝗶𝘁 𝗵𝗲𝗿𝗲: newsletter.devopscube.com/p/istio-ambien…

𝗡𝗼𝘁𝗲: For better understanding, first set up the Ambient Mesh and then go through the architecture. You will be able to relate to all the concepts much better.

#devops #istio

Bibin Wilson@bibinwillson·6 Mar

If you are a DevOps engineer trying to understand MLOps, this series is for you.

Three editions of my MLOps for DevOps series are now live.

In Phase 1, the goal is simple: get familiar with basic ML terminology and understand the core workflow through hands-on examples.

Here is what I have covered so far.

𝗦𝘁𝗲𝗽 𝟭: Building a Dataset Pipeline
newsletter.devopscube.com/p/building-a-d…

𝗦𝘁𝗲𝗽 𝟮: Data Preparation (Hands On)
newsletter.devopscube.com/p/mlops-data-p…

𝗦𝘁𝗲𝗽 𝟯: Training & Testing the Model (Hands On)
newsletter.devopscube.com/p/mlops-traini…

This Saturday I will wrap up Phase 1 with containerizing the model and serving it on Kubernetes using KServe.

Note: I am not an AI/ML developer. The main goal of this series is to help DevOps engineers understand the infrastructure side of MLOps.

#mlops

Bibin Wilson@bibinwillson·5 Mar

This is useful if you want to run LLMs locally.

LLM Checker scans your hardware and tells you exactly which models you can run locally, with full Ollama integration.

𝗚𝗶𝘁𝗵𝘂𝗯 𝗥𝗲𝗽𝗼: github.com/Pavelevich/llm…

Note: When testing new tools like this, avoid running them on your work laptop. Always test on personal machines or isolated environments first.

Bibin Wilson@bibinwillson·4 Mar

GPU scheduling in Kubernetes is becoming an important skill. Here is why 👇

Most organizations are either experimenting with or already running ML workloads in Kubernetes. As a DevOps engineer, you should know how to set up and manage GPU resources.

We have created a simple guide on setting up a GPU node using the NVIDIA GPU Operator.

By the end of this blog, you will have a clear understanding of:
- Why you need the GPU Operator on Kubernetes
- How to set up the NVIDIA GPU Operator on a Kubernetes cluster
- How to verify that Kubernetes detects the GPUs
- How to deploy a real GPU-based workload to validate the full stack

To be honest, it is not as complicated as it sounds once you understand the basics. Even if you don't have access to GPUs, go through the content to understand how it works.

𝗥𝗲𝗮𝗱 𝗶𝘁 𝗵𝗲𝗿𝗲: devopscube.com/setup-gpu-oper…

Getting GPUs working is step one. There are different strategies for managing GPUs in Kubernetes so that GPU resources are used efficiently. I will be sending a deep dive on this in my newsletter.

Are you running any GPU workloads on Kubernetes, or planning to learn about it? Either way, I would love to know your thoughts and experiences. Drop them in the comments.

#devops #mlops
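One way to picture the "verify that Kubernetes detects the GPUs" step: once the device plugin is running, GPU nodes advertise the `nvidia.com/gpu` extended resource under `status.allocatable`. The sketch below filters node objects for it. The `gpu_nodes` helper and the sample data are hypothetical, shaped like the items returned by `kubectl get nodes -o json`.

```python
def gpu_nodes(nodes: list[dict]) -> dict[str, int]:
    """Map node name -> allocatable NVIDIA GPUs, skipping nodes that have none."""
    found = {}
    for node in nodes:
        allocatable = node.get("status", {}).get("allocatable", {})
        # Kubernetes reports resource quantities as strings, e.g. "1".
        count = int(allocatable.get("nvidia.com/gpu", "0"))
        if count > 0:
            found[node["metadata"]["name"]] = count
    return found


# Hypothetical sample data for illustration only.
sample_nodes = [
    {"metadata": {"name": "cpu-node-1"},
     "status": {"allocatable": {"cpu": "4"}}},
    {"metadata": {"name": "gpu-node-1"},
     "status": {"allocatable": {"cpu": "8", "nvidia.com/gpu": "1"}}},
]

print(gpu_nodes(sample_nodes))
```

If a node with a physical GPU is missing from the result, the operator or device plugin on that node is usually the first place to look.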

Bibin Wilson@bibinwillson·2 Mar

When you start exploring Istio architecture, this SPOF question often comes up 👇

Is ztunnel a single point of failure? What happens if ztunnel goes down on a node?

Here is the answer.

Since ztunnel runs as one pod per node, it may sound like a single point of failure. If ztunnel goes down, traffic to the pods on that specific node will be affected. However, pods on other nodes are not impacted, because each node has its own healthy ztunnel instance.

Because ztunnel runs as a DaemonSet, Kubernetes will automatically restart it, just like it does for any other DaemonSet pod.

So, is ztunnel a SPOF? No. A failure is contained to a single node, and the design assumes that nodes can fail, which is normal in distributed systems. Recovery is handled automatically by Kubernetes.

Understanding architectural patterns like this helps you design better systems and answer questions with confidence during implementations.

---

We cover topics like this in our weekly newsletter, one DevOps deep dive and one MLOps hands-on edition.

𝗦𝘂𝗯𝘀𝗰𝗿𝗶𝗯𝗲 𝗵𝗲𝗿𝗲 (𝗶𝘁’𝘀 𝗳𝗿𝗲𝗲): newsletter.devopscube.com

#devops #istio
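The blast-radius argument can be illustrated with a tiny model. This is only a sketch of the reasoning, not a simulation of Istio. The cluster layout, pod names, and the `disrupted_pods` helper are all made up for the example.

```python
def disrupted_pods(pods_by_node: dict[str, list[str]],
                   failed_ztunnel_nodes: set[str]) -> list[str]:
    """Pods whose mesh traffic is affected: only those on nodes with a failed ztunnel."""
    return sorted(
        pod
        for node, pods in pods_by_node.items()
        if node in failed_ztunnel_nodes
        for pod in pods
    )


# Hypothetical three-node cluster.
cluster = {
    "node-a": ["cart", "checkout"],
    "node-b": ["catalog"],
    "node-c": ["search", "reviews"],
}

print(disrupted_pods(cluster, {"node-b"}))
```

With one ztunnel down, only the pods scheduled on that node show up in the result. Everything on the other nodes keeps its own ztunnel.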

Bibin Wilson@bibinwillson·27 Feb

MLOps job descriptions have changed. And most DevOps engineers are not ready for it.

Look at this real MLOps job description (image): vLLM, GPU-aware scaling, model monitoring, ML pipelines, FastAPI inference APIs, etc.

This is not a DevOps job with a new title. This is a role that expects you to understand the full model lifecycle.

But here is the thing. A DevOps engineer does not write the Java app (well, some do). Yet they need to understand how it is built, configured, and deployed to do their job well.

MLOps works the same way.

That is exactly why our MLOps series starts with model building. Not to turn you into an ML engineer, but to help you understand the core basics so you can collaborate well with data scientists, data engineers, and others on the team.

So if you are looking to transition into MLOps or stay relevant, learn the fundamentals, understand the ML lifecycle, learn how foundation models are handled, and so on.

Every Saturday I send a hands-on MLOps deep dive built specifically for DevOps engineers.

𝗦𝘂𝗯𝘀𝗰𝗿𝗶𝗯𝗲 𝗵𝗲𝗿𝗲 (𝗶𝘁’𝘀 𝗳𝗿𝗲𝗲): newsletter.devopscube.com

#mlops

Bibin Wilson@bibinwillson·26 Feb

Kubernetes CNI vs Istio CNI

Here is the key thing you need to understand 👇

When you deploy Istio in ambient mode, you need to run Istio CNI. This does not mean you replace your Kubernetes cluster CNI. It works together with your existing CNI, like Calico or Cilium, as a chaining plugin.

Here is the key difference.

When a pod starts, Kubernetes calls a CNI plugin to set up networking: assign an IP, create network interfaces, set up routes, etc. So the cluster CNI does the real networking work.

Here is what Istio CNI does:
- It acts only on pods that are part of the mesh (namespaces labeled istio.io/dataplane-mode=ambient).
- When it detects new pods that are part of the mesh, it notifies the Istio CNI node agent.
- The CNI node agent then adds iptables rules in the pod's network namespace to redirect pod traffic to the ztunnel proxy.

Overall, Istio CNI is a 𝗰𝗵𝗮𝗶𝗻𝗶𝗻𝗴 𝗽𝗹𝘂𝗴𝗶𝗻. This means multiple CNI plugins run in sequence on the same pod, each adding its own piece of networking logic.

For example:
- Pod starts
- Calico (assigns IP, sets up routes)
- Istio CNI (sets up iptables redirect rules)

---

I share such DevOps/MLOps concepts and deep dives in my newsletter.

𝗥𝗲𝗮𝗱 𝗶𝘁 𝗵𝗲𝗿𝗲 (𝟭𝟬𝟬% 𝗳𝗿𝗲𝗲): newsletter.devopscube.com

#istio #devops
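The chaining idea can be sketched as a list of functions applied in order to the same pod. This is a conceptual toy, not the CNI specification or Istio's code. The function names, the IP address, and the pod fields are invented for illustration, while `istio.io/dataplane-mode` is the full key of the ambient namespace label.

```python
def primary_cni(pod: dict) -> dict:
    """Stand-in for the cluster CNI (e.g. Calico): does the real networking setup."""
    pod["ip"] = "10.244.1.12"  # hypothetical address
    pod["routes"] = ["default via 10.244.1.1"]
    return pod


def istio_cni(pod: dict) -> dict:
    """Stand-in for Istio CNI: only touches pods in ambient-labeled namespaces."""
    labels = pod.get("namespace_labels", {})
    if labels.get("istio.io/dataplane-mode") == "ambient":
        pod["redirect_to"] = "ztunnel"  # represents the iptables redirect rules
    return pod


def run_chain(pod: dict, plugins: list) -> dict:
    """Run each plugin in sequence on the same pod, as a CNI chain does."""
    for plugin in plugins:
        pod = plugin(pod)
    return pod


CHAIN = [primary_cni, istio_cni]

mesh_pod = run_chain(
    {"name": "web", "namespace_labels": {"istio.io/dataplane-mode": "ambient"}},
    CHAIN,
)
plain_pod = run_chain({"name": "batch", "namespace_labels": {}}, CHAIN)

print(mesh_pod)
print(plain_pod)
```

Both pods get an IP and routes from the first plugin. Only the pod in the ambient-labeled namespace picks up the redirect from the second.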

Bibin Wilson@bibinwillson·25 Feb

Master Istio Architecture in 12 Minutes (Illustrated Guide)

If you want to understand Istio better, you need to understand its complete architecture and how the Istio components interact with each other.

In my latest newsletter edition, I break this down with clear diagrams and simple explanations.

Here is what you will learn 👇
- Why Istio needed a new architecture
- Overview of Istio architecture
- Deep dive into key components: Istiod, Ztunnel, Waypoint Proxy, and CNI
- Is Ztunnel a single point of failure?
- Business use cases of Ambient Mesh, including cost benefits
- Hands-on guide to setting up Istio Ambient Mesh

👉 𝗥𝗲𝗮𝗱 𝗶𝘁 𝗵𝗲𝗿𝗲: newsletter.devopscube.com/p/istio-ambien…

𝗡𝗼𝘁𝗲: For better understanding, first set up the Ambient Mesh and then go through the architecture. You will be able to relate to all the concepts much better.

#devops #istio

Bibin Wilson@bibinwillson·24 Feb

I created a GitHub repository to help DevOps engineers learn MLOps.

Here is the thing. Most MLOps resources assume you already know machine learning. But what if you are coming from a DevOps background? That is exactly who I built this for.

The repository focuses on ML operations using cloud-native tools on top of Kubernetes. So you are learning ML concepts while working with infrastructure you already understand.

This approach lets you gain MLOps knowledge using your existing DevOps skills, or develop new ones along the way.

The repo follows my MLOps newsletter series. I have already published two editions that are part of the repository. The latest edition is hands-on, where you perform data preparation using Python scripts.

𝗚𝗶𝘁𝗛𝘂𝗯 𝗥𝗲𝗽𝗼𝘀𝗶𝘁𝗼𝗿𝘆: github.com/techiescamp/ml…

If you have any feedback on how the content is organized, please raise an issue in the repository. We can discuss it there.

♻️ PS: Repost and share this with DevOps engineers who want to expand into MLOps.

#mlopsfordevops

Bibin Wilson@bibinwillson·23 Feb

From DNS to Pod: How the k8s Gateway API actually works.

- You create a DNS record pointing to your cloud Load Balancer IP.
- The Load Balancer forwards traffic to a Kubernetes Service, specifically the Gateway Service endpoint.
- This Service points to the gateway proxy pods. These could be NGINX, Envoy, or any compatible proxy.
- The Gateway Controller (e.g., NGINX Gateway Fabric) watches for HTTPRoute, GRPCRoute, and similar resources.
- When you apply these routes, the controller automatically configures the gateway proxy to match.
- The HTTPRoute resource is what decides where your traffic actually goes. For example, /payment to payment-service and /auth to auth-service.

So the full traffic flow looks like this 👇

DNS → Cloud LB → Gateway Service → Gateway Proxy → backend Service → Pod

If you understand the Ingress flow well, relating it to the Gateway API is very easy.

A key difference is that in the classic Ingress model, the controller itself acts as the proxy. In the Gateway API, the controller configures and manages dedicated proxy instances (Gateways), creating a clear separation of concerns.

I share such DevOps/MLOps concepts and deep dives in my newsletter.

𝗥𝗲𝗮𝗱 𝗶𝘁 𝗵𝗲𝗿𝗲 (𝟭𝟬𝟬% 𝗳𝗿𝗲𝗲): newsletter.devopscube.com

Over to you… Are you using Gateway API in production? If yes, I would love to hear about your experience with it.

♻️ If this helped, repost it so others can learn too.

#kubernetes

Bibin Wilson@bibinwillson·19 Feb

Here is something interesting I noticed about Istio Ambient mode.

Usually in cloud native projects, you will see most components built with Go. Take Kubernetes for example.
- Control plane (API server, scheduler, controllers) is Go.
- Worker node agents (kubelet, kube-proxy) are Go.
- Datastore (etcd) is also Go.

Istio Ambient mode takes a different approach.
- Istiod (control plane) is Go.
- Ztunnel is Rust (designed to be small and secure).
- Waypoint proxy is Envoy, which is written in C++.

Each component uses the language that fits its purpose best (a polyglot architecture):
- Go for orchestration
- Rust for secure and efficient overlay proxies
- C++ for performance and rich L7 functionality

𝗡𝗼𝘁𝗲: Initially, ztunnel was implemented using an Envoy proxy.

--

Get DevOps and MLOps tips and deep dives delivered to your inbox.

👉 𝗝𝗼𝗶𝗻 𝗛𝗲𝗿𝗲 (𝗜𝘁𝘀 𝗳𝗿𝗲𝗲): newsletter.devopscube.com

Bibin Wilson@bibinwillson·18 Feb

Terraform Actions - A must-know feature!

Terraform is great at CRUD operations, meaning creating, reading, updating, and deleting infrastructure. But what about everything that happens after the infrastructure is up? For example, invoking a Lambda function when needed.

Before v1.14, you had to go with workarounds like provisioners (local-exec or remote-exec) or external scripts.

With the introduction of Actions in v1.14, you can perform these operations using Terraform-native action blocks built directly into providers (e.g., the AWS provider).

In the latest newsletter edition, I covered:
- What Terraform Actions are and why they were introduced
- Common use cases and a typical workflow
- A hands-on example to see how they work in practice

𝗥𝗲𝗮𝗱 𝗶𝘁 𝗛𝗲𝗿𝗲: newsletter.devopscube.com/p/terraform-ac…

This is a good improvement in how we manage post-deployment operations with Terraform.

Have you tried Terraform Actions yet? What are your thoughts about it?

𝗡𝗼𝘁𝗲: It is a new feature and provider support is still rolling out. For example, AWS currently has only three actions available.

Bibin Wilson retweeted

DevopsCube@devopscube·17 Feb

Compliance as Code is a must-know skill for DevOps engineers. Here's why 👇

When you are designing infrastructure or creating IaC, you need to understand what compliance requirements the application falls under and what security measures to apply.

So what is compliance?

Every organization has rules designed to keep systems safe, protect user data, and meet legal requirements. In IT terms, these rules translate into technical implementation.

For example, if compliance says "Protect data from unauthorized access", in IT this becomes:
- Encrypt data at rest and in transit
- Network segmentation
- Firewall rules

One such example is PCI DSS (Payment Card Industry Data Security Standard).

In Kubernetes, this means a PCI-compliant application should not accept traffic from internal services that have no business need to communicate with it, even if they're in the same cluster.

This translates to NetworkPolicies that allow only the specific pods/namespaces that need access, with everything else blocked by default.

In our recent newsletter, we break down why compliance is important in DevOps, key compliance standards, Compliance as Code, and more.

𝗥𝗲𝗮𝗱 𝗶𝘁 𝗛𝗲𝗿𝗲: newsletter.devopscube.com/p/compliance-a…

How are you enforcing compliance requirements in your projects? What tools and processes are you using?
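As a rough sketch of what that can look like as code, the snippet below builds a NetworkPolicy manifest as a plain Python dict and adds a small check that it never allows all sources. The policy, namespace, and app names are hypothetical. The manifest fields follow the `networking.k8s.io/v1` NetworkPolicy API, and `kubernetes.io/metadata.name` is the label Kubernetes sets on every namespace.

```python
import json


def restricted_ingress_policy(name: str, namespace: str, app: str,
                              allowed_namespaces: list[str]) -> dict:
    """Build a NetworkPolicy that admits ingress only from the listed namespaces."""
    peers = [
        {"namespaceSelector": {"matchLabels": {"kubernetes.io/metadata.name": ns}}}
        for ns in allowed_namespaces
    ]
    # An ingress rule with an empty "from" matches ALL sources, so with no
    # allowed peers we emit no rules at all, which denies all ingress.
    ingress = [{"from": peers}] if peers else []
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": app}},
            "policyTypes": ["Ingress"],
            "ingress": ingress,
        },
    }


def allows_any_source(policy: dict) -> bool:
    """Compliance check: True if any ingress rule is open to every source."""
    return any(not rule.get("from") for rule in policy["spec"].get("ingress", []))


# Hypothetical names for illustration.
policy = restricted_ingress_policy("payments-ingress", "payments", "card-api", ["checkout"])
print(json.dumps(policy, indent=2))
print("open to all sources:", allows_any_source(policy))
```

A check like `allows_any_source` is the kind of rule that could run in CI against rendered manifests. Dedicated policy engines cover the same ground with far more depth.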

Bibin Wilson@bibinwillson·17 Feb

Waypoint proxy is a key component in Istio Ambient mode. Here is how it handles L7 traffic 👇

In Ambient mode, you get L4 security with ztunnel by default. But if you need L7 features like retries, traffic splitting, or advanced routing, you need to deploy an optional waypoint proxy.

Here is how it works.
- First, enable L7 policies by adding the istio.io/use-waypoint label to your Service or Namespace.
- When a source ztunnel gets traffic destined for a labelled service, it knows from its xDS config to route to the waypoint address instead of directly to the destination ztunnel.
- The ztunnel then builds an HBONE tunnel to the waypoint proxy.
- The waypoint proxy (Envoy) performs L7 processing (retries, traffic splitting, etc.).
- After processing, the waypoint forwards traffic via another HBONE tunnel to the destination ztunnel, which delivers it to the pod.

This is why Ambient mode is considered a layered approach. You get L4 mTLS and security with ztunnel, and you layer on L7 capabilities only where you actually need them, service by service.

𝗡𝗼𝘁𝗲: There are different patterns for using waypoints, like per namespace, per service, or multi-namespace. I will cover that in a newsletter edition.

--

Get DevOps and MLOps tips and deep dives delivered to your inbox.

👉 𝗝𝗼𝗶𝗻 𝗛𝗲𝗿𝗲 (𝗜𝘁𝘀 𝗳𝗿𝗲𝗲): newsletter.devopscube.com

#devops #practicaldevops
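The hop sequence described above can be written down as a small decision function. It is a conceptual sketch only. The `traffic_path` helper, the service names, and the waypoint mapping are hypothetical, and the real decision is driven by the ztunnel's xDS configuration.

```python
def traffic_path(destination: str, waypoints: dict[str, str]) -> list[str]:
    """List the hops a request takes, adding a waypoint only if one is configured."""
    path = ["source ztunnel"]
    waypoint = waypoints.get(destination)
    if waypoint is not None:
        # L7 processing (retries, traffic splitting) happens at this hop.
        path.append(waypoint)
    path += ["destination ztunnel", destination]
    return path


# Hypothetical mapping: only "reviews" has opted in to a waypoint.
WAYPOINTS = {"reviews": "reviews-waypoint"}

print(traffic_path("reviews", WAYPOINTS))
print(traffic_path("ratings", WAYPOINTS))
```

Services without a waypoint keep the shorter L4-only path, which is the "layer L7 on only where needed" idea in code form.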

Bibin Wilson retweeted

DevopsCube@devopscube·16 Feb

One Istio concept worth understanding: HBONE.

HBONE stands for HTTP-Based Overlay Network Environment. It is basically how Istio safely moves traffic inside a Kubernetes cluster.

Here is what it does 👇

It creates a secure tunnel that carries traffic between service proxies. Before HBONE, every connection from one workload to another created separate connections between sidecars. With HBONE, many connections share one secure tunnel instead.

How does it work? It combines three web standards.
- HTTP/2 - allows many streams to run over one connection
- HTTP CONNECT - builds a tunnel through that connection
- mTLS (mutual TLS) - encrypts and secures the tunnel so each side verifies the other

Even if you are not directly managing the service mesh, as a DevOps engineer, you should know how these pieces fit together.

If you are working with a service mesh, learn the fundamentals and understand what's happening behind the scenes.

𝗡𝗼𝘁𝗲: HBONE does not exist outside Istio’s ecosystem. It is not a standard networking protocol you will find in general TCP/IP or HTTP specifications.

----

We share concepts like these and practical deep dives in our DevOpsCube newsletter that help you understand how things work under the hood. Read by 18,500+ engineers worldwide.

👉 𝗝𝗼𝗶𝗻 𝗵𝗲𝗿𝗲 (𝗶𝘁𝘀 𝗳𝗿𝗲𝗲): lnkd.in/gDb9zQcv

#devops #kubernetes #istio

Bibin Wilson@bibinwillson·16 Şub

Ztunnel (The Istio component that changes everything)

In Istio Ambient mode, ztunnel (zero trust tunnel) is the component that eliminates sidecars entirely. It is a Rust-based proxy running as a DaemonSet, one proxy per node.

Here is how it works.
- When traffic enters a node, ztunnel intercepts it using iptables by default (you can also enable eBPF-based redirection).
- Once intercepted, it handles Layer 3 and 4 traffic.
- It then uses the HBONE protocol to create secure tunnels between services, ensuring zero trust communication.
- Throughout this process, it collects Layer 4 telemetry, including TCP metrics and access logs.
- It also enforces Layer 3 and 4 authorization policies covering identity, IP addresses, and ports.
- Behind the scenes, ztunnel communicates with Istiod using xDS APIs to receive configuration updates dynamically.

--

Get weekly DevOps and MLOps deep dives delivered to your inbox.

👉 𝗝𝗼𝗶𝗻 𝗛𝗲𝗿𝗲 (𝗜𝘁𝘀 𝗳𝗿𝗲𝗲): newsletter.devopscube.com

#devops #practicaldevops
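To give a feel for what an L4 authorization decision involves, here is a deliberately simplified sketch that checks only caller identity and destination port. It does not reproduce Istio's AuthorizationPolicy semantics. The `L4Rule` type, the `is_allowed` helper, and the identities are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class L4Rule:
    """A toy allow rule: which identities may reach which ports."""
    principals: frozenset[str]
    ports: frozenset[int]


def is_allowed(rules: list[L4Rule], principal: str, port: int) -> bool:
    """Allow the connection only if some rule matches both identity and port."""
    return any(principal in rule.principals and port in rule.ports for rule in rules)


# Hypothetical identity, written in the style of a service account principal.
FRONTEND = "cluster.local/ns/shop/sa/frontend"

RULES = [L4Rule(principals=frozenset({FRONTEND}), ports=frozenset({8080}))]

print(is_allowed(RULES, FRONTEND, 8080))
print(is_allowed(RULES, FRONTEND, 5432))
print(is_allowed(RULES, "cluster.local/ns/other/sa/batch", 8080))
```

Nothing here inspects HTTP paths or headers. Decisions at that level are the kind of L7 work that a waypoint proxy handles.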

Bibin Wilson@bibinwillson·11 Feb

Compliance as Code is a must-know skill for DevOps engineers. Here's why 👇

When you are designing infrastructure or creating IaC, you need to understand what compliance requirements the application falls under and what security measures to apply.

So what is compliance?

Every organization has rules designed to keep systems safe, protect user data, and meet legal requirements. In IT terms, these rules translate into technical implementation.

For example, if compliance says "Protect data from unauthorized access", in IT this becomes:
- Encrypt data at rest and in transit
- Network segmentation
- Firewall rules

One such example is PCI DSS (Payment Card Industry Data Security Standard).

In Kubernetes, this means a PCI-compliant application should not accept traffic from internal services that have no business need to communicate with it, even if they're in the same cluster.

This translates to NetworkPolicies that allow only the specific pods/namespaces that need access, with everything else blocked by default.

In my recent newsletter, I break down why compliance is important in DevOps, key compliance standards, Compliance as Code, and more.

𝗥𝗲𝗮𝗱 𝗶𝘁 𝗛𝗲𝗿𝗲: newsletter.devopscube.com/p/compliance-a…

I learned this early in my career when I wrote an entire Chef cookbook to run HIPAA compliance checks on Linux servers.

How are you enforcing compliance requirements in your projects? What tools and processes are you using?
