intel/xpumanager

Intel(R) XPU Manager and XPU System Management Interface

Intel(R) XPU Manager is a free and open-source tool for monitoring and managing Intel Data Center GPUs. It is designed to simplify administration, improve device utilization, and maximize reliability and uptime. Intel(R) XPU System Management Interface (XPU-SMI) is a command-line interface (CLI) for managing GPUs locally.

Intel(R) XPU Manager features

  • Administration:
    • GPU discovery and information - name, model, serial, stepping, location, frequency, memory capacity, firmware version
    • GPU topology
    • GPU firmware updates, including GPU GFX firmware and AMC (Add-in card Management Controller) firmware
  • Monitoring:
    • GPU telemetry – utilization, power, frequency, temperature, fabric speed, memory throughput, errors
    • GPU health – memory, power, temperature, fabric port
    • GPU metric exporter daemon
  • Configuration:
    • GPU Settings - GPU power limits, frequency range, standby mode, scheduler mode, ECC On/Off, fabric port status

SMI output examples

Some fields in the xpu-smi output examples below may vary depending on the platform used.

Default xpu-smi output

$ xpu-smi
+----------------------------------------------+----------------------------+------------------------+
|            Intel XPU-SMI v2.0    Driver: 9FF4708BCC22C0EE6FD801A    Level Zero: 1.27.0             |
+----------------------------------------------+----------------------------+------------------------+
| GPU  Name                    Persistence-M   | Bus-Id            Disp.A   |   Volatile Uncorr. ECC |
| Fan  Temp  Pwr:Usage/Cap                     | Memory-Usage               |   GPU-Util  Compute M. |
+----------------------------------------------+----------------------------+------------------------+
|   0  Intel(R) Arc(TM) B580   Off             | 0000:03:00.0      Off      |               Disabled |
| N/A    25C  66W / 380W                       | 239MiB / 12216MiB          |      0%        Default |
+----------------------------------------------+----------------------------+------------------------+

+-------+-----------+--------+----------------+------------------------------------------------------+
| Processes:                                                                                         |
|   GPU         PID    Type    Process Name                                         GPU Memory Usage |
+-------+-----------+--------+----------------+------------------------------------------------------+
|     0     2874292     C      xpu-smi                                                       206 MiB |
+-------+-----------+--------+----------------+------------------------------------------------------+
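
The summary rows above can also be read by a script. The following is a minimal Python sketch, not part of XPU Manager: it assumes the text layout of this one sample (which, as noted, may vary by platform), and the names `SAMPLE_ROWS` and `parse_usage` are hypothetical. The table layout is not a documented interface, so treat the regular expressions as illustrative only.

```python
import re

# The two data rows from the sample output above, copied verbatim.
SAMPLE_ROWS = """\
|   0  Intel(R) Arc(TM) B580   Off             | 0000:03:00.0      Off      |               Disabled |
| N/A    25C  66W / 380W                       | 239MiB / 12216MiB          |      0%        Default |
"""


def parse_usage(text):
    """Extract temperature, power, memory and utilization from the summary rows.

    Assumes the 'NNC', 'NNW / NNW', 'NNMiB / NNMiB' and 'NN%' patterns
    seen in the sample; raises ValueError if any of them is missing.
    """
    patterns = {
        "temp": r"(\d+)C\b",
        "power": r"(\d+)W\s*/\s*(\d+)W",
        "memory": r"(\d+)MiB\s*/\s*(\d+)MiB",
        "util": r"(\d+)%",
    }
    found = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"could not find {name} field")
        found[name] = [int(g) for g in match.groups()]
    return {
        "temp_c": found["temp"][0],
        "power_w": found["power"][0],
        "power_cap_w": found["power"][1],
        "mem_used_mib": found["memory"][0],
        "mem_total_mib": found["memory"][1],
        "gpu_util_pct": found["util"][0],
    }


stats = parse_usage(SAMPLE_ROWS)
mem_fraction = stats["mem_used_mib"] / stats["mem_total_mib"]
print(stats)
print(f"memory in use: {mem_fraction:.1%}")
```

On a live system the same function could be fed the captured output of `xpu-smi`, for example via `subprocess.run(["xpu-smi"], capture_output=True, text=True).stdout`, provided the output has the shape shown here.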

Running xpu-smi discovery

$ xpu-smi discovery -d 0
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Type: GPU                                                                     |
|           | Device Name: Intel(R) Arc(TM) B580 Graphics                                          |
|           | Device State: normal                                                                 |
|           | PCI Device ID: 0xe20b                                                                |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-0003-0000-0000e20b8086                                       |
|           | Serial Number: unknown                                                               |
|           | Core Clock Rate: 2850 MHz                                                            |
|           | Stepping: A0                                                                         |
|           | SKU Type: Production ES                                                              |
|           |                                                                                      |
|           | Driver Version: 9FF4708BCC22C0EE6FD801A                                              |
|           | Kernel Version: 6.19.0-rc6                                                           |
|           | GFX Firmware Name: GFX                                                               |
|           | GFX Firmware Version:                                                                |
|           | GFX Firmware Status: normal                                                          |
|           |                                                                                      |
|           | PCI BDF Address: 0000:03:00.0                                                        |
|           | PCI Slot: PCIEx16(G5)                                                                |
|           | PCIe Generation: 4                                                                   |
|           | PCIe Max Link Width: 8                                                               |
|           | PCIe Max Bandwidth: 15.75 GB/s                                                       |
|           |                                                                                      |
|           | Memory Physical Size: 12216.00 MiB                                                   |
|           | Max Mem Alloc Size: 11605.20 MiB                                                     |
|           | ECC State: disabled                                                                  |
|           | Number of Memory Channels: 12                                                        |
|           | Memory Bus Width: 384                                                                |
|           | Max Hardware Contexts: 65536                                                         |
|           | Max Command Queue Priority: 0                                                        |
|           |                                                                                      |
|           | Number of EUs: 160                                                                   |
|           | Number of Tiles: 1                                                                   |
|           | Number of Slices: 5                                                                  |
|           | Number of Sub Slices per Slice: 4                                                    |
|           | Number of Threads per EU: 8                                                          |
|           | Physical EU SIMD Width: 16                                                           |
|           | Number of Media Engines: 2                                                           |
|           | Number of Media Enhancement Engines: 2                                               |
+-----------+--------------------------------------------------------------------------------------+
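
Because the discovery table is a list of `Key: Value` rows, it is straightforward to turn into a dictionary. The sketch below is an illustration under assumptions rather than a supported interface: `parse_discovery` and `pcie_bandwidth_gb_per_s` are hypothetical helper names, the sample rows are a subset of the output above with padding trimmed, and the parser assumes a single device (as with `-d 0`). As a sanity check, it recomputes the reported PCIe bandwidth from the generation and link width, using the standard per-lane rates and 128b/130b encoding of PCIe Gen 3 to Gen 5.

```python
# A subset of the rows shown above (column padding trimmed).
SAMPLE = """\
+-----------+----------------------------------------------+
| Device ID | Device Information                           |
+-----------+----------------------------------------------+
| 0         | Device Type: GPU                             |
|           | Device Name: Intel(R) Arc(TM) B580 Graphics  |
|           | GFX Firmware Version:                        |
|           |                                              |
|           | PCI BDF Address: 0000:03:00.0                |
|           | PCIe Generation: 4                           |
|           | PCIe Max Link Width: 8                       |
|           | PCIe Max Bandwidth: 15.75 GB/s               |
+-----------+----------------------------------------------+
"""


def parse_discovery(text):
    """Collect the 'Key: Value' rows of a single-device discovery table."""
    info = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 2:
            continue  # border lines have no '|' separators
        # Split on the first colon only, so values such as
        # '0000:03:00.0' keep their own colons.
        key, sep, value = cells[1].partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Per-lane transfer rate in GT/s for generations that use 128b/130b encoding.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}


def pcie_bandwidth_gb_per_s(generation, width):
    """Theoretical one-direction bandwidth in GB/s for a Gen 3-5 link."""
    return GT_PER_S[generation] * width * (128 / 130) / 8


info = parse_discovery(SAMPLE)
computed = pcie_bandwidth_gb_per_s(
    int(info["PCIe Generation"]), int(info["PCIe Max Link Width"])
)
reported = float(info["PCIe Max Bandwidth"].split()[0])
print(f"reported {reported} GB/s, computed {computed:.2f} GB/s")
```

Rows with an empty value, such as `GFX Firmware Version:` in the sample, are kept with an empty string so that a caller can tell a missing value apart from a missing key.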

How to get XPU Manager

Linux CLI tools (xpu-smi)

The xpu-smi package is available from the Intel package repositories. You can also manually download and install the latest xpu-smi binary package from the XPUM release page.

Windows CLI tools

The latest installers and binaries can be downloaded from the XPUM release page.

GPU info exporter (xpumd)

The XPUM daemon (xpumd) is available as a container image; see the xpumd README.

Supported Devices

Supported OSes

  • XPU-SMI
    • Ubuntu 24.04.3
    • Windows Server 2022 (limited feature set: GPU device info, GPU telemetry, GPU firmware update, and GPU configuration)

Runtime Dependencies

For Ubuntu 22.04 and later, the required Intel drivers can be installed from the Intel package repositories.

Dependency releases:

System packages:

For system package dependencies, refer to Prerequisites.

Release cadence

XPU Manager uses semantic versioning in the form MAJOR.MINOR.PATCH.

  • Minor versions are released periodically as new features and platform support are stabilized.
  • Patch versions are released as needed for critical fixes and security updates.
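
To make the MAJOR.MINOR.PATCH scheme concrete, here is a small Python sketch that classifies the step between two version strings. It is an illustration only: `Version`, `parse_version`, and `release_kind` are hypothetical names, and the version numbers in the example are made up rather than actual XPU Manager releases.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text):
    """Parse a strict 'MAJOR.MINOR.PATCH' string, with an optional leading 'v'."""
    if text.startswith("v"):
        text = text[1:]
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
    return Version(*(int(p) for p in parts))


def release_kind(old, new):
    """Name the most significant component that differs between two versions."""
    if new.major != old.major:
        return "major"
    if new.minor != old.minor:
        return "minor"
    if new.patch != old.patch:
        return "patch"
    return "same"


# Hypothetical version numbers, for illustration only.
installed = parse_version("1.4.2")
for candidate in ("1.4.3", "1.5.0", "2.0.0"):
    kind = release_kind(installed, parse_version(candidate))
    print(f"{installed} -> {candidate}: {kind} release")
```

Because `Version` is a tuple, ordinary comparison (`parse_version("1.5.0") > installed`) orders versions by major, then minor, then patch.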

Documentation
