GPU telemetry with workload attribution. One OTLP agent per node ties hardware metrics (NVIDIA, AMD, Intel Gaudi) to the K8s pod or Slurm job burning the GPU.
Updated Jun 2, 2026 - Python
Gaudi device plugin for Kubernetes is a DaemonSet that automatically exposes the number of Gaudi devices on each node of your cluster, tracks the health of those devices, and lets you run Gaudi-enabled containers in your Kubernetes cluster.
Intel Gaudi Base Operator for Kubernetes automates the management of all necessary Intel Gaudi software components on a Kubernetes cluster.
Gaudi-aware container runtime, compatible with the Open Container Initiative (OCI) specification used by Docker, CRI-O, and other popular container technologies. It simplifies building and deploying containerized Gaudi-accelerated applications.
Gaudi Feature Discovery for Kubernetes is a software component that allows you to automatically generate labels for the set of Gaudi accelerators available on a node.
Exporter that exposes Gaudi metrics for Prometheus.