|
1 | 1 | # stackhpc.linux.vgpu |
2 | 2 |
|
| 3 | +This role can configure vGPUs or Multi-Instance GPU (MIG) on NVIDIA cards. |
| 4 | + |
3 | 5 | ## Prerequisites |
4 | 6 |
|
5 | | -- [Download Nvidia GRID driver](https://docs.nvidia.com/grid/latest/grid-software-quick-start-guide/index.html#redeeming-pak-and-downloading-grid-software) (This requires a login). |
6 | | - - The location of this file can be customised with the `vgpu_driver_url` variable: |
7 | | - * e.g to use an artifact uploaded to a http server: |
8 | | - `vgpu_driver_url: http://seed/pulp/content/nvidia/NVIDIA-GRID-Linux-KVM-525.85.07-525.85.05-528.24.zip` |
9 | | - * e.g. to use a file on the control host: |
10 | | - `vgpu_driver_url: "{{ lookup('env', 'HOME') }}/NVIDIA-GRID-Linux-KVM-525.85.07-525.85.05-528.24.zip"` |
| 7 | +### Multi-Instance GPU (MIG) |
| 8 | + |
| 9 | +When creating MIG devices with no vGPU instances layered on top, there are no special requirements. |
| 10 | + |
| 11 | +### vGPUs |
11 | 12 |
|
12 | 13 | - Enable IOMMU |
13 | 14 | - Make sure the related options are enabled in the BIOS |
14 | 15 | - Intel CPUs require the `intel_iommu=on` kernel command line argument |
15 | 16 |
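The IOMMU prerequisite can be checked from Ansible facts before running the role. The following is a minimal sketch and not part of the role itself: the `gpu_hypervisors` host group is a hypothetical name, and the check assumes Intel hosts where `intel_iommu=on` is expected on the kernel command line.

```yaml
---
# Sketch only: verifies the IOMMU kernel argument on Intel hosts.
# `gpu_hypervisors` is a hypothetical inventory group.
- name: Check IOMMU kernel command line argument
  hosts: gpu_hypervisors
  gather_facts: true
  tasks:
    - name: Assert that intel_iommu=on is present on the kernel command line
      ansible.builtin.assert:
        that:
          # ansible_facts.cmdline holds the parsed kernel command line
          - ansible_facts.cmdline.intel_iommu | default('') == 'on'
        fail_msg: "intel_iommu=on was not found on the kernel command line"
```

A failed assertion here only indicates that the kernel argument is missing; the BIOS options still need to be verified separately.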
|
16 | | -## Enabling SR-IOV on dell hardware |
| 17 | + |
| 18 | +#### Enabling SR-IOV on Dell hardware |
17 | 19 |
|
18 | 20 | ``` |
19 | 21 | /opt/dell/srvadmin/bin/idracadm7 set BIOS.IntegratedDevices.SriovGlobalEnable Enabled |
20 | 22 | /opt/dell/srvadmin/bin/idracadm7 jobqueue create BIOS.Setup.1-1 |
21 | 23 | ``` |
| 24 | + |
| 25 | +## Drivers |
| 26 | + |
| 27 | +The role will attempt to install a driver from ``vgpu_driver_url``. Currently this only works with |
| 28 | +the data centre drivers, such as the |
| 29 | +[NVIDIA GRID drivers](https://docs.nvidia.com/grid/latest/grid-software-quick-start-guide/index.html#redeeming-pak-and-downloading-grid-software) |
| 30 | +or the [AI Enterprise drivers](https://www.nvidia.com/en-gb/data-center/products/ai-enterprise/), |
| 31 | +both of which can be obtained from the NVIDIA licensing portal. The data centre drivers are not required |
| 32 | +if you only want to use MIG without vGPUs. |
| 33 | + |
| 34 | +The location of this file can be customised with the `vgpu_driver_url` variable, e.g. to use an artifact uploaded to an HTTP server: |
| 35 | + |
| 36 | +``` |
| 37 | +vgpu_driver_url: http://seed/pulp/content/nvidia/NVIDIA-GRID-Linux-KVM-525.85.07-525.85.05-528.24.zip |
| 38 | +``` |
| 39 | + |
| 40 | +Or, to use a file on the control host: |
| 41 | + |
| 42 | +``` |
| 43 | +vgpu_driver_url: "{{ lookup('env', 'HOME') }}/NVIDIA-GRID-Linux-KVM-525.85.07-525.85.05-528.24.zip" |
| 44 | +``` |
| 45 | + |
| 46 | +Currently, the role only supports zip archives. Future work may add support for other packaging formats such as .deb, .rpm, and .run. |
| 47 | + |
| 48 | +To install the driver by some other means, set the ``vgpu_nvidia_driver_install_enabled`` configuration option to `false`, e.g.: |
| 49 | +``` |
| 50 | +--- |
| 51 | +vgpu_nvidia_driver_install_enabled: false |
| 52 | +``` |
| 53 | + |
| 54 | +This will cause the role to assume that the driver is already installed. |
22 | 55 |
|
23 | 56 | ## Running the role |
24 | 57 |
|
@@ -72,6 +105,13 @@ vgpu_definitions: |
72 | 105 | index: 0 |
73 | 106 | - mdev_type: nvidia-697 |
74 | 107 | index: 1 |
| 108 | + # Configuring MIG without creating vGPUs. You may also want to set |
| 109 | + # vgpu_nvidia_driver_install_enabled: false if you have installed the NVIDIA |
| 110 | + # driver by some other means. |
| 111 | + - pci_address: "0000:17:00.0" |
| 112 | + mig_devices: |
| 113 | + "1g.10gb": 1 |
| 114 | + "2g.20gb": 3 |
75 | 115 | ``` |
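A MIG-only configuration might be applied with a playbook along the following lines. This is a sketch under assumptions: the `gpu_hypervisors` group name and the use of `become: true` are illustrative and not taken from the role's documentation, and the PCI address and MIG profiles are simply those from the example above.

```yaml
---
# Sketch only: applies a MIG-only layout, assuming the NVIDIA driver
# was already installed by some other means.
# `gpu_hypervisors` is a hypothetical inventory group.
- name: Configure MIG devices without vGPUs
  hosts: gpu_hypervisors
  become: true  # assumption: device configuration needs root privileges
  vars:
    # Skip the role's driver installation step
    vgpu_nvidia_driver_install_enabled: false
    vgpu_definitions:
      - pci_address: "0000:17:00.0"
        mig_devices:
          "1g.10gb": 1
          "2g.20gb": 3
  roles:
    - role: stackhpc.linux.vgpu
```

Keeping `vgpu_nvidia_driver_install_enabled: false` next to the MIG definition makes it explicit that no `vgpu_driver_url` is needed for this host group.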
76 | 116 |
|
77 | 117 |
|
|