source/adminguide/hosts.rst: 32 additions & 0 deletions
@@ -223,6 +223,38 @@ Following hypervisor-specific documentations can be referred for different maxim
 Guest Instance limit check is not done while deploying an Instance on a KVM hypervisor host.


+.. _discovering-gpu-devices-on-kvm-hosts:
+
+Discovering GPU Devices on KVM Hosts
+------------------------------------
+
+For KVM, the user needs to ensure that IOMMU is enabled and the necessary
+drivers are installed. If vGPU is to be used, the user needs to ensure that
+the vGPU type is supported by the host and has been created on the host. The
+CloudStack agent uses the ``gpudiscovery.sh`` script to discover the GPU devices
+on the host. For more information on how to prepare the host for GPU
+passthrough, see `Managing GPU devices in virtual machines <https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/9/html/configuring_and_managing_virtualization/assembly_managing-gpu-devices-in-virtual-machines_configuring-and-managing-virtualization>`_.
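
As a rough pre-flight check for the prerequisites above, the sketch below inspects two standard Linux sysfs locations: ``/sys/kernel/iommu_groups``, which is populated when IOMMU is active, and ``/sys/class/mdev_bus``, which lists devices whose driver has registered mediated-device support. The function names and the ``sysfs_root`` parameter are illustrative and are not part of CloudStack. Whether ``gpudiscovery.sh`` performs the same checks is not stated in this change.

```python
from pathlib import Path


def iommu_groups(sysfs_root="/sys"):
    """List IOMMU group names; an empty list suggests IOMMU is not active."""
    groups_dir = Path(sysfs_root) / "kernel" / "iommu_groups"
    if not groups_dir.is_dir():
        return []
    return sorted(entry.name for entry in groups_dir.iterdir() if entry.is_dir())


def mdev_supported_types(sysfs_root="/sys"):
    """Map each mediated-device-capable parent device to its supported type names."""
    bus_dir = Path(sysfs_root) / "class" / "mdev_bus"
    found = {}
    if bus_dir.is_dir():
        for device in bus_dir.iterdir():
            types_dir = device / "mdev_supported_types"
            if types_dir.is_dir():
                found[device.name] = sorted(t.name for t in types_dir.iterdir())
    return found


if __name__ == "__main__":
    groups = iommu_groups()
    print(f"IOMMU groups found: {len(groups)}")
    for device, types in mdev_supported_types().items():
        print(f"{device}: {', '.join(types)}")
```

An empty result from ``mdev_supported_types()`` only means that no mediated-device types are exposed; it does not rule out plain PCI passthrough or SR-IOV Virtual Functions.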
+
+Once the host is configured with the GPU devices, the operator can trigger the
+discovery of the GPU devices on the host by running the ``discoverGPUdevices``
+command in cmk or by clicking the ``Discover GPU devices`` button on the host
+details page in the UI. This triggers a request to the CloudStack agent to
+discover the GPU devices on the host.
+
+The ``gpudiscovery.sh`` script that the CloudStack agent runs is located in the
+``/usr/share/cloudstack-common/scripts/vm/`` directory on the KVM host.
+
+.. note::
+   The script can be run manually to debug the discovery of the GPU devices on a host.
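
One way to run the script manually while debugging is sketched below. It assumes that the script sits at the path given above, that it can be invoked through ``bash`` without arguments, and that it writes its findings to standard output. None of these details, nor the output format, is specified in this change, so the wrapper only captures and prints whatever the script emits. The function name ``run_gpu_discovery`` is hypothetical.

```python
import subprocess
from pathlib import Path

# Directory taken from the documentation above; the file name is the script it mentions.
SCRIPT = Path("/usr/share/cloudstack-common/scripts/vm/gpudiscovery.sh")


def run_gpu_discovery(script=SCRIPT, timeout=60):
    """Run the discovery script and return (exit_code, stdout, stderr).

    Returns (None, "", message) when the script is not present on this host.
    """
    if not script.is_file():
        return (None, "", f"{script} not found")
    # Assumption: the script needs no arguments and can be run through bash.
    proc = subprocess.run(
        ["bash", str(script)], capture_output=True, text=True, timeout=timeout
    )
    return (proc.returncode, proc.stdout, proc.stderr)


if __name__ == "__main__":
    code, out, err = run_gpu_discovery()
    print("exit code:", code)
    print(out if out else err)
```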
source/adminguide/virtual_machines.rst: 81 additions & 42 deletions
@@ -1593,39 +1593,54 @@ CloudStack meet the intensive graphical processing requirement by means of the
 high computation power of GPU/vGPU, and CloudStack users can run multimedia
 rich applications, such as Auto-CAD, that they otherwise enjoy at their desk on
 a virtualized environment.
-CloudStack leverages the XenServer support for NVIDIA GRID Kepler 1 and 2 series
-to run GPU/vGPU enabled Instances. NVIDIA GRID cards allows sharing a single GPU cards
-among multiple Instances by creating vGPUs for each Instance. With vGPU technology, the
-graphics commands from each Instance are passed directly to the underlying dedicated
-GPU, without the intervention of the hypervisor. This allows the GPU hardware
-to be time-sliced and shared across multiple Instances. XenServer hosts use the GPU
-cards in following ways:
-
-**GPU passthrough**: GPU passthrough represents a physical GPU which can be
+
+For KVM, CloudStack leverages libvirt's PCI passthrough feature to assign a
+physical GPU to a guest Instance. For vGPU profiles, depending on the vGPU type,
+CloudStack uses mediated devices or Virtual Functions (VFs) to assign a virtual
+GPU to a guest Instance. It is the operator's responsibility to ensure that
+GPU devices are in the correct state and are available for use on the host. If the
+operator wants to use vGPU profiles, they need to ensure that the vGPU type is
+supported by the host and has been created on the host.
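
To make the two KVM assignment paths concrete, the sketch below builds the libvirt ``<hostdev>`` elements used for PCI passthrough and for a mediated device. The element and attribute names follow libvirt's domain XML format. The helper names, the example PCI address, and the UUID are made up for illustration, and the exact XML that CloudStack generates may differ.

```python
import xml.etree.ElementTree as ET


def pci_hostdev_xml(domain, bus, slot, function):
    """Build a libvirt <hostdev> element that passes through one PCI device."""
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain:04x}",
        bus=f"0x{bus:02x}",
        slot=f"0x{slot:02x}",
        function=f"0x{function:x}",
    )
    return ET.tostring(hostdev, encoding="unicode")


def mdev_hostdev_xml(mdev_uuid):
    """Build a libvirt <hostdev> element that attaches an existing mediated device."""
    hostdev = ET.Element("hostdev", mode="subsystem", type="mdev", model="vfio-pci")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(source, "address", uuid=mdev_uuid)
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical device at PCI address 0000:01:00.0.
    print(pci_hostdev_xml(0x0000, 0x01, 0x00, 0x0))
    # Hypothetical UUID of a mediated device already created on the host.
    print(mdev_hostdev_xml("4b20d080-1b54-4048-85b3-a6a62d165c01"))
```

An SR-IOV Virtual Function has its own PCI address, so it can be described with the same PCI form as a whole physical GPU.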
+
+For XenServer, CloudStack leverages the XenServer support for NVIDIA GRID
+Kepler 1 and 2 series to run GPU/vGPU enabled Instances.
+
+Some NVIDIA cards allow sharing a single GPU card among multiple Instances by
+creating vGPUs for each Instance. With vGPU technology, the graphics commands
+from each Instance are passed directly to the underlying dedicated GPU, without
+the intervention of the hypervisor. This allows the GPU hardware to be
+time-sliced and shared across multiple Instances. The GPU cards are used in the
+following ways:
+
+**passthrough**: GPU passthrough represents a physical GPU which can be
 directly assigned to an Instance. GPU passthrough can be used on a hypervisor alongside
 GRID vGPU, with some restrictions: A GRID physical GPU can either host GRID
 vGPUs or be used as passthrough, but not both at the same time.

-**GRID vGPU**: GRID vGPU enables multiple Instances to share a single physical GPU.
+**vGPU**: vGPU enables multiple Instances to share a single physical GPU.
 The Instances run an NVIDIA driver stack and get direct access to the GPU. GRID
 physical GPUs are capable of supporting multiple virtual GPU devices (vGPUs)
-that can be assigned directly to guest Instances. Guest Instances use GRID virtual GPUs in
+that can be assigned directly to guest Instances. Guest Instances use vGPUs in
 the same manner as a physical GPU that has been passed through by the
 hypervisor: an NVIDIA driver loaded in the guest Instance provides direct access to
 the GPU for performance-critical fast paths, and a paravirtualized interface to
-the GRID Virtual GPU Manager, which is used for nonperformant management
-operations. NVIDIA GRID Virtual GPU Manager for XenServer runs in dom0.
+the NVIDIA vGPU Manager, which is used for nonperformant management
+operations. NVIDIA vGPU Manager for XenServer runs in dom0.
+
 CloudStack provides you with the following capabilities:

-- Adding XenServer hosts with GPU/vGPU capability provisioned by the administrator.
+- Adding hosts with GPU/vGPU capability provisioned by the administrator.
+  (Supported only on XenServer and KVM.)

-- Creating a Compute Offering with GPU/vGPU capability.
+- Creating a Compute Offering with GPU/vGPU capability. For KVM, it is possible to
+  specify the GPU count and whether to use the GPU for display. For XenServer,
+  the GPU count is ignored and only one device is assigned to the guest Instance.

 - Deploying an Instance with GPU/vGPU capability.

 - Destroying an Instance with GPU/vGPU capability.

-- Allowing an user to add GPU/vGPU support to an Instance without GPU/vGPU support by
+- Allowing a user to add GPU/vGPU support to an Instance without GPU/vGPU support by
   changing the Service Offering and vice-versa.

 - Migrating Instances (cold migration) with GPU/vGPU capability.
@@ -1635,57 +1650,78 @@ CloudStack provides you with the following capabilities:
 - Querying hosts to obtain information about the GPU cards, supported vGPU types
   in case of GRID cards, and capacity of the cards.

+- Limiting the number of GPUs that an account, domain, or project can use.
+
 Prerequisites and System Requirements
 -------------------------------------

 Before proceeding, ensure that you have these prerequisites:

-- The vGPU-enabled XenServer 6.2 and later versions.
-  For more information, see `Citrix 3D Graphics Pack <https://www.citrix.com/go/private/vgpu.html>`_.
+- CloudStack does not restrict the deployment of GPU-enabled Instances with
+  guest OS types that are not supported for GPU/vGPU functionality. The deployment
+  succeeds and a GPU/vGPU is allocated to the Instance; however,
+  due to missing guest OS drivers, the Instance cannot use the GPU resources.
+  Therefore, it is recommended to use a GPU-enabled service offering only with a supported guest OS.
+
+- NVIDIA GRID K1 (16 GiB of video RAM) and K2 (8 GiB of video RAM) cards support
+  homogeneous virtual GPUs, which means that at any given time, the vGPUs resident on
+  a single physical GPU must all be of the same type. However, this restriction
+  doesn't extend across physical GPUs on the same card. Different physical GPUs on a
+  K1 or K2 may host different types of virtual GPU at the same time. For example,
+  a GRID K2 card has two physical GPUs and supports four types of virtual GPU:
+  GRID K200, GRID K220Q, GRID K240Q, and GRID K260Q.
+
+- The NVIDIA driver must be installed to enable vGPU operation, as for a physical NVIDIA GPU.
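
The homogeneity rule for GRID K1 and K2 cards can be expressed as a small check, sketched below. The data layout (a mapping from a physical GPU label to the vGPU types placed on it) and the function name are assumptions made for this example. CloudStack's own allocation logic is not shown in this change.

```python
def mixed_type_gpus(placements):
    """Return the physical GPUs that would host more than one vGPU type.

    placements maps a physical GPU label to the list of vGPU types placed on it.
    An empty result means the homogeneity rule is satisfied.
    """
    return {
        gpu: sorted(set(types))
        for gpu, types in placements.items()
        if len(set(types)) > 1
    }


if __name__ == "__main__":
    # A GRID K2 card has two physical GPUs; each may use a different type.
    allowed = {"gpu0": ["GRID K220Q", "GRID K220Q"], "gpu1": ["GRID K260Q"]}
    # Two different types on the same physical GPU break the rule.
    rejected = {"gpu0": ["GRID K220Q", "GRID K240Q"], "gpu1": ["GRID K260Q"]}
    print(mixed_type_gpus(allowed))
    print(mixed_type_gpus(rejected))
```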

-- GPU/vGPU functionality is supported for following HVM guest operating systems:
-  For more information, see `Citrix 3D Graphics Pack <https://www.citrix.com/go/private/vgpu.html>`_.

-  - Windows 7 (x86 and x64)
+For XenServer:

-  - Windows Server 2008 R2
+- vGPU-enabled XenServer 6.2 or later.
+  For more information, see `Citrix 3D Graphics Pack <https://www.citrix.com/go/private/vgpu.html>`_.

-  - Windows Server 2012
+- GPU/vGPU functionality is supported for the following HVM guest operating systems:
+  For more information, see `Citrix 3D Graphics Pack <https://www.citrix.com/go/private/vgpu.html>`_.

-  - Windows 8 (x86 and x64)
+  - Windows 7 (x86 and x64)

-  - Windows 8.1 ("Blue") (x86 and x64)
+  - Windows Server 2008 R2

-  - Windows Server 2012 R2 (server equivalent of "Blue")
+  - Windows Server 2012

-- CloudStack does not restrict the deployment of GPU-enabled Instances with guest OS types that are not supported by XenServer for GPU/vGPU functionality. The deployment would be successful and a GPU/vGPU will also get allocated for Instances; however, due to missing guest OS drivers, Instance would not be able to leverage GPU resources. Therefore, it is recommended to use GPU-enabled service offering only with supported guest OS.
+  - Windows 8 (x86 and x64)

-- NVIDIA GRID K1 (16 GiB video RAM) AND K2 (8 GiB of video RAM) cards supports homogeneous virtual GPUs, implies that at any given time, the vGPUs resident on a single physical GPU must be all of the same type. However, this restriction doesn't extend across physical GPUs on the same card. Each physical GPU on a K1 or K2 may host different types of virtual GPU at the same time. For example, a GRID K2 card has two physical GPUs, and supports four types of virtual GPU; GRID K200, GRID K220Q, GRID K240Q, AND GRID K260Q.
+  - Windows 8.1 ("Blue") (x86 and x64)

-- NVIDIA driver must be installed to enable vGPU operation as for a physical NVIDIA GPU.
+  - Windows Server 2012 R2 (server equivalent of "Blue")

-- XenServer tools are installed in the Instance to get maximum performance on XenServer, regardless of type of vGPU you are using. Without the optimized networking and storage drivers that the XenServer tools provide, remote graphics applications running on GRID vGPU will not deliver maximum performance.
+- XenServer tools must be installed in the Instance to get maximum performance on
+  XenServer, regardless of the type of vGPU you are using. Without the optimized
+  networking and storage drivers that the XenServer tools provide, remote
+  graphics applications running on GRID vGPU will not deliver maximum performance.

-- To deliver high frames from multiple heads on vGPU, install XenDesktop with HDX 3D Pro remote graphics.
+- To deliver high frame rates from multiple heads on vGPU, install XenDesktop with
+  HDX 3D Pro remote graphics.

 Before continuing with configuration, consider the following:

-- Deploying Instances GPU/vGPU capability is not supported if hosts are not available with enough GPU capacity.
-
-- A Service Offering cannot be created with the GPU values that are not supported by CloudStack UI. However, you can make an API call to achieve this.
+- Deploying Instances with GPU/vGPU capability is not supported if hosts are
+  not available with enough GPU capacity.

-- Dynamic scaling is not supported. However, you can choose to deploy an Instance without GPU support, and at a later point, you can change the system offering to upgrade to the one with vGPU. You can achieve this by offline upgrade: stop the Instance, upgrade the Service Offering to the one with vGPU, then start the Instance.
+- Dynamic scaling is not supported. However, you can deploy an
+  Instance without GPU support and later change its Service
+  Offering to one with vGPU. This is done as an offline
+  upgrade: stop the Instance, upgrade the Service Offering to the one with
+  vGPU, then start the Instance.

 - Live migration of GPU/vGPU enabled Instance is not supported.

-- Limiting GPU resources per Account/Domain is not supported.
-
 - Disabling GPU at Cluster level is not supported.

 - Notification thresholds for GPU resource is not supported.
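
The offline upgrade path mentioned in the list above (stop, change offering, start) can be written down as an ordered sequence of API requests. The sketch below only builds the request parameters. It assumes the standard CloudStack commands ``stopVirtualMachine``, ``changeServiceForVirtualMachine`` and ``startVirtualMachine`` with their usual ``id`` and ``serviceofferingid`` parameters. The function name and the placeholder IDs are hypothetical, and sending and signing the requests is left to whichever client is in use.

```python
def offline_upgrade_calls(instance_id, gpu_offering_id):
    """Return the API calls for an offline upgrade, in the order they must run."""
    return [
        {"command": "stopVirtualMachine", "id": instance_id},
        {
            "command": "changeServiceForVirtualMachine",
            "id": instance_id,
            "serviceofferingid": gpu_offering_id,
        },
        {"command": "startVirtualMachine", "id": instance_id},
    ]


if __name__ == "__main__":
    # Placeholder IDs; substitute the UUIDs of a real Instance and offering.
    for call in offline_upgrade_calls("instance-uuid", "gpu-offering-uuid"):
        print(call)
```

Stopping and starting an Instance are asynchronous jobs, so a real client would wait for each job to finish before issuing the next call.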