Add defaults for GPU PCI passthrough #1586
base: stackhpc/2024.1
This should be using the stackhpc.linux collection if possible: stackhpc/ansible-collection-linux#28
This file may fit better under configuration/ than operations/. Opinions welcome.
Agreed, though I'd rather get this merged so we can start using it, then update once the collection supports it.
I tried out the changes on a client deployment with three GPU types, and it worked very well.
Just some docs changes now.
I've used this at a different customer site, worked a treat
Once host configuration is complete, deploy the OpenStack services:

.. code-block:: console

   kayobe overcloud service deploy -kt nova --kolla-limit compute_a100,compute_v100,compute_multi_gpu
Suggested change:

Once host configuration is complete, deploy Nova:

.. code-block:: console

   kayobe overcloud service deploy -kt nova
Needs to target the controllers for Nova scheduler too.
This can be also defined in the openstack-config repository

add extra_specs to flavor in etc/openstack-config/openstack-config.yml:
Suggested change:

This can be also defined in the openstack-config repository.

Add extra_specs to flavor in etc/openstack-config/openstack-config.yml:
This change is designed to drastically simplify PCI passthrough GPU configuration.
The idea is that we have a data dictionary containing common GPU types, and templating to write that data to Nova config files based on a simple group-to-GPU map.
Configuring passthrough is as simple as creating a group-to-GPU dictionary and passing the group names through to Kolla-Ansible.
The `pci-passthrough.yml` playbook manages host configuration (and is pre-hooked to overcloud host configure).
Templates for `nova-compute.conf`, `nova-api.conf`, and `nova-scheduler.conf` have been added.
Changes have now been tested in a prod environment.
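
To illustrate the idea described above (a data dictionary of GPU types, rendered into Nova configuration through a group-to-GPU map), here is a minimal Python sketch. It is not the PR's implementation, which uses Ansible templates. The names `GPU_TYPES`, `GROUP_TO_GPUS`, and `render_pci_section` are hypothetical, and the PCI IDs and `device_type` value are assumptions that should be checked against the actual hardware (for example with `lspci -nn`). Only the group names come from the review thread, and the `[pci]` options `device_spec` and `alias` are standard Nova settings.

```python
import json

# Hypothetical data dictionary of common GPU types.
# The PCI IDs are assumptions for illustration; verify them with `lspci -nn`.
GPU_TYPES = {
    "a100": {"vendor_id": "10de", "product_id": "20b5"},
    "v100": {"vendor_id": "10de", "product_id": "1db6"},
}

# Hypothetical group-to-GPU map, reusing the group names from the thread.
GROUP_TO_GPUS = {
    "compute_a100": ["a100"],
    "compute_v100": ["v100"],
    "compute_multi_gpu": ["a100", "v100"],
}


def render_pci_section(group):
    """Render a Nova [pci] section for every GPU type mapped to a host group."""
    lines = ["[pci]"]
    for gpu in GROUP_TO_GPUS[group]:
        ids = GPU_TYPES[gpu]
        spec = {"vendor_id": ids["vendor_id"], "product_id": ids["product_id"]}
        # device_type is an assumption; some GPUs need "type-PF" instead.
        alias = {**spec, "device_type": "type-PCI", "name": gpu}
        lines.append("device_spec = " + json.dumps(spec))
        lines.append("alias = " + json.dumps(alias))
    return "\n".join(lines)


print(render_pci_section("compute_multi_gpu"))
```

A flavor would then request a device through the alias name, for instance with the extra spec `pci_passthrough:alias` set to `a100:1`.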