ingero

23 posts

@ingero_io

Free, open-source GPU observability for AI teams. Traces training and inference failures and GPU stalls via eBPF-for-CUDA. Apache 2.0 license.

Tel Aviv · Joined March 2026
2 Following · 0 Followers
ingero @ingero_io
nvidia-smi: 60% memory used. PyTorch: CUDA out of memory. Both are right. Memory fragmentation leaves free holes too small for the requested allocation. An eBPF trace of cudaMalloc/cudaFree shows exactly where the other 40% went. ingero.io/gpu-problem-1-… #GPU #CUDA #PyTorch #eBPF
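A toy illustration of why both readings can be true at once. This is a minimal sketch, not how the CUDA driver or the PyTorch caching allocator actually lays out memory: the 100-unit heap, the 6-used/4-free pattern, and the 8-unit request are all made-up numbers chosen to match the 60%/40% split above.

```python
# Toy heap: True = used unit, False = free unit (hypothetical layout).
def free_holes(heap):
    """Return the sizes of all contiguous free runs in the heap."""
    holes, run = [], 0
    for used in heap:
        if used:
            if run:
                holes.append(run)
            run = 0
        else:
            run += 1
    if run:
        holes.append(run)
    return holes

# Ten repetitions of 6 used units followed by a 4-unit hole.
heap = ([True] * 6 + [False] * 4) * 10

holes = free_holes(heap)
used_pct = 100 * sum(heap) // len(heap)   # what a usage counter would report
total_free = sum(holes)                   # free units overall
largest_hole = max(holes)                 # biggest contiguous free block
request = 8                               # hypothetical allocation size
can_allocate = largest_hole >= request

print(f"used: {used_pct}%  free: {total_free} units  largest hole: {largest_hole}")
print(f"can allocate {request} contiguous units: {can_allocate}")
```

The usage counter and the failed allocation measure different things: one sums free units, the other needs a single contiguous run.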
ingero @ingero_io
10,869 GPU kernel events. 4 MCP tool calls. 47 seconds. Claude diagnosed a vLLM bottleneck nvidia-smi kept hiding: the engine coroutine was being preempted 5,347 times on a shared CPU. ingero.io/ebpf-trace-cud… #eBPF #GPU #vLLM #MCP
ingero @ingero_io
MCP is becoming the interface between AI agents and infra data. Datadog wraps its dashboards. Qualys flags the security risk. We think MCP should BE the observability layer -- talking directly to kernel tracepoints, no metric pipeline in between. ingero.io/mcp-observabil…
ingero @ingero_io
4-node DDP job stalling. nvidia-smi: nothing. One SQL query fanned out to all nodes via eBPF found the straggler in <1s -- checkpoint I/O preempting training on one box. No central collector needed. ingero.io/distributed-gp…
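A rough sketch of the fan-out idea: run the same SQL on every node's local store and compare the answers, with no central collector. Everything here is an assumption for illustration, not Ingero's schema or API: the `sched_events` table, the `comm` and `off_cpu_ms` columns, the node names, and the sample values are hypothetical, and in-memory SQLite stands in for each node's local trace store.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical query: total time the training process spent off-CPU.
QUERY = "SELECT COALESCE(SUM(off_cpu_ms), 0) FROM sched_events WHERE comm = 'trainer'"

def make_node_db(off_cpu_samples):
    """Build a stand-in per-node store holding made-up scheduler events."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE sched_events (comm TEXT, off_cpu_ms REAL)")
    conn.executemany(
        "INSERT INTO sched_events VALUES ('trainer', ?)",
        [(ms,) for ms in off_cpu_samples],
    )
    conn.commit()
    return conn

nodes = {
    "node-0": make_node_db([1.0, 2.0, 1.5]),
    "node-1": make_node_db([0.5, 1.0, 2.5]),
    "node-2": make_node_db([40.0, 55.0, 62.5]),  # the box with competing I/O
    "node-3": make_node_db([1.5, 1.0, 1.0]),
}

def query_node(item):
    """Run the shared query against one node and return (name, total)."""
    name, conn = item
    (total,) = conn.execute(QUERY).fetchone()
    return name, total

# Fan the same query out to all nodes in parallel, then compare results.
with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
    totals = dict(pool.map(query_node, nodes.items()))

straggler = max(totals, key=totals.get)
print(totals)
print("straggler:", straggler)
```

Each node answers from its own data, so the only thing crossing the network in a real deployment would be the query and one small result row per node.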
ingero @ingero_io
Ingero now traces CUDA Graph lifecycle events: cudaStreamBeginCapture, cudaStreamEndCapture, cudaGraphInstantiate, cudaGraphLaunch. Detect re-capture spikes in vLLM and torch.compile workloads via eBPF. No code changes needed. github.com/ingero-io/inge… #CUDA #PyTorch #eBPF #GPU
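One way a re-capture spike could be spotted from such a trace, sketched under assumptions: events arrive as `(timestamp_seconds, api_name)` tuples, and a "spike" means more than `threshold` calls to `cudaStreamBeginCapture` in a single time window. The tuple format, the window size, the threshold, and the sample data are all hypothetical; only the four CUDA API names come from the post above.

```python
from collections import Counter

def recapture_spikes(events, window_s=1.0, threshold=3):
    """Return the window indices with more than `threshold` graph captures."""
    per_window = Counter(
        int(ts // window_s)
        for ts, api in events
        if api == "cudaStreamBeginCapture"
    )
    return sorted(w for w, n in per_window.items() if n > threshold)

# Made-up trace: two warm-up captures, steady graph launches, then a burst
# of re-captures in the sixth second (window index 5).
events = [
    (0.1, "cudaStreamBeginCapture"), (0.15, "cudaStreamEndCapture"),
    (0.2, "cudaStreamBeginCapture"), (0.25, "cudaStreamEndCapture"),
    (0.3, "cudaGraphInstantiate"),
]
events += [(1.0 + 0.5 * i, "cudaGraphLaunch") for i in range(8)]
events += [(5.0 + 0.2 * i, "cudaStreamBeginCapture") for i in range(5)]

spikes = recapture_spikes(events)
print("windows with re-capture spikes:", spikes)
```

In steady state a workload mostly replays graphs with `cudaGraphLaunch`; a window dominated by fresh captures suggests something, such as changing input shapes, is forcing graphs to be rebuilt.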
ingero @ingero_io
PyTorch DataLoader was 124x slower than direct tensor indexing. We traced 200,000 context switches and 300,000 page allocations in 40s. The GPU wasn't slow, it was starving. ingero.io/124x-slower-py… #PyTorch #eBPF #GPU
ingero @ingero_io
PyTorch training 13x slower than expected. torch.profiler showed nothing. eBPF tracing found .cpu().numpy() forcing full GPU sync every batch. Fix: 2 lines of pure PyTorch. ingero.io/tracing-13x-py… #PyTorch #CUDA #eBPF