
Peter Phaal
397 posts



The Containerlab sFlow-RT Development Environment provides example JavaScript and Python scripts. Leaf / spine switches run FRRouting and Host sFlow, as used in SONiC, NVIDIA, VyOS, etc., for realistic telemetry. github.com/sflow-rt/conta…

Explore publicly accessible dashboards showing live data from operational networks, including: an AI/ML RoCEv2 fabric, a world-wide Kubernetes cluster, and an Internet Exchange Provider (IXP). Learn how to monitor your own networks. blog.sflow.com/2026/04/four-p…

Learn what's coming in the open source SONiC network operating system to address the challenges of operating AI/ML backend networks. blog.sflow.com/2026/04/sonic-…

Learn how standard measurements from data center switches provide visibility into RDMA traffic from AI / ML workloads. Troubleshoot latency and drops. Includes live dashboards showing production traffic. blog.sflow.com/2026/03/nanog9…

See how to visualize AI/ML RoCEv2 traffic flows. Includes live dashboard showing production traffic. blog.sflow.com/2026/02/real-t…

Peter Phaal retweeted

N96 Talks are Streaming! youtube.com/playlist?list=…
The N96 talks are on YouTube! 🎥✨ Whether you couldn’t make it in person or just want to relive the best moments, the full lineup is available!
Subscribe to our YouTube + keep the conversation going long after the conference.

Just published on YouTube from @nanog and #NANOG96. Lightning Talk: Seeing Through the RDMA Fog: Monitoring RoCEv2 with sFlow youtu.be/vM-M0vtvxLI?si…

The SDSC Expanse cluster live AI/ML metrics dashboard is a joint InMon / San Diego Supercomputer Center (SDSC) demonstration at the SC25 conference, being held this week in St. Louis. Click on the dashboard link during the show to see live traffic. blog.sflow.com/2025/11/sc25-s…

Real-time visibility into production Ultra Ethernet Transport (UET) traffic using industry standard data center switch telemetry. blog.sflow.com/2025/11/ultra-…

Vector Packet Processor (VPP) release 25.10 extends the sFlow implementation to include support for dropped packet notifications. blog.sflow.com/2025/10/vector…

Industry standard packet sampling in data center switch hardware from all leading vendors (Arista, Cisco, Dell, Juniper, NVIDIA, etc.) provides a cost effective solution for even the largest AI / ML fabrics. blog.sflow.com/2025/10/ai-ml-…

Trimming packets that would otherwise be dropped in AI/ML networks is part of Ultra Ethernet congestion control and is currently supported in NVIDIA switches and adapters. Monitoring trimmed packets provides a useful metric for network visibility. blog.sflow.com/2025/09/packet…

pwru (packet, where are you?) is an open source tool from Cilium that uses eBPF instrumentation in recent Linux kernels to trace network packets through the kernel. Try it out using Multipass on your laptop. blog.sflow.com/2025/07/tracin…

Grafana dashboard showing performance metrics for AI/ML RoCEv2 network traffic. Step by step instructions using a free Grafana Cloud account. blog.sflow.com/2025/06/ai-met…

Grafana dashboard showing performance metrics for AI/ML RoCEv2 network traffic used for inter-GPU communications. Includes step by step instructions to give it a try! blog.sflow.com/2025/04/ai-met…

The availability of Cisco IOS XR Release 25.1.1 brings sFlow dropped packet notification support to Cisco 8000 series routers, making it easy to capture and analyze packets dropped at router ingress. blog.sflow.com/2025/04/droppe…

Interesting differences between network traffic patterns seen in two live AI / ML clusters. blog.sflow.com/2025/04/compar…

Video now available of a must-see lightning talk by Pim van Pelt describing the recently released sFlow VPP implementation. fosdem.org/2025/schedule/…

The application provides performance metrics for AI/ML RoCEv2 network traffic, for example, large scale CUDA compute tasks using NVIDIA Collective Communication Library (NCCL) operations for inter-GPU communications. Includes step by step instructions. blog.sflow.com/2025/02/ai-met…
