
Nvidia GPU support #1

Closed · aksiksi opened this issue Nov 19, 2023 · 10 comments

Labels: compose (Docker Compose), docker (Docker-specific), feature (New feature or request), podman (Podman-specific)

@aksiksi (Owner) commented Nov 19, 2023

Some random links:

@aksiksi added and then removed the feature (New feature or request) label Nov 19, 2023
@aksiksi (Owner, Author) commented Feb 24, 2024

With CDI support now added to NixOS (NixOS/nixpkgs#284507), GPU access should work in a container. See this thread for details: https://discourse.nixos.org/t/nvidia-gpu-support-in-podman-and-cdi-nvidia-ctk/36286.

Podman

Add the CDI device(s) to devices in your Compose file, like so:

jellyfin:
    image: lscr.io/linuxserver/jellyfin
    container_name: jellyfin
    security_opt:
      - label=disable
    devices:
      - nvidia.com/gpu=all
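
Note that this assumes the CDI spec for your GPU has already been generated on the host (usually under /var/run/cdi or /etc/cdi). On NixOS, the module added in the PR above can do this for you; a minimal sketch, assuming a recent nixpkgs where the option carries this name (older releases used virtualisation.containers.cdi.dynamic.nvidia.enable):

{
  # Generates the Nvidia CDI spec (nvidia.com/gpu=...) on the host.
  hardware.nvidia-container-toolkit.enable = true;
}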

Docker

  1. Enable experimental CDI support in the daemon:

{
  virtualisation.docker.daemon.settings = {
    features = { cdi = true; };
  };
}

  2. Pass in CDI devices to your service(s):

jellyfin:
    image: lscr.io/linuxserver/jellyfin
    container_name: jellyfin
    security_opt:
      - label=disable
    devices:
      - nvidia.com/gpu=all
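
To sanity-check the setup outside of Compose, you can list the CDI devices the host knows about and run a throwaway container against one; a rough sketch (nvidia-ctk ships with the NVIDIA Container Toolkit, and the CDI spec normally injects nvidia-smi from the host into the container):

# List the CDI device names generated on the host
nvidia-ctk cdi list

# Smoke test: Docker 25+ with the cdi feature flag accepts CDI names via --device
docker run --rm --device nvidia.com/gpu=all ubuntu nvidia-smi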

aksiksi closed this as completed Feb 24, 2024
@SpidFightFR commented
Any news on the Docker side? I added the CDI devices to my Docker Compose config: when I run docker-compose up directly, it uses my GPU, but when I use compose2nix, it falls back to the CPU.

@aksiksi (Owner, Author) commented Aug 31, 2024

I personally don’t use Docker. But I did some digging into Docker PRs and found that CDI support is still experimental: moby/moby#47087

To enable it, you’ll first need to set the CDI feature flag: https://docs.docker.com/reference/cli/dockerd/#enable-cdi-devices

In your NixOS config, try this:

virtualisation.docker.daemon.settings = {
  features = { cdi = true; };
};

And, of course, you will need to pass in devices in CDI format as part of your Compose config.
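
For reference, that Nix attribute set is serialized into the Docker daemon config, so the resulting /etc/docker/daemon.json should look roughly like this:

{
  "features": {
    "cdi": true
  }
}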

@SpidFightFR commented
> I personally don’t use Docker. But I did some digging into Docker PRs and found that CDI support is still experimental: moby/moby#47087
>
> To enable it, you’ll first need to set the CDI feature flag: https://docs.docker.com/reference/cli/dockerd/#enable-cdi-devices
>
> In your NixOS config, try this:
>
>     virtualisation.docker.daemon.settings = {
>       features = { cdi = true; };
>     };
>
> And, of course, you will need to pass in devices in CDI format as part of your Compose config.

I'm trying to test this using InvokeAI (that's the only GPU-based Docker tool I have right now).

The Compose part looks like this:

# Copyright (c) 2023 Eugene Brodsky https://github.com/ebr

x-invokeai: &invokeai
    image: "local/invokeai:latest"
    build:
      context: ..
      dockerfile: docker/Dockerfile

    # Create a .env file in the same directory as this docker-compose.yml file
    # and populate it with environment variables. See .env.sample
    env_file:
      - .env

    # variables without a default will automatically inherit from the host environment
    environment:
      # if set, CONTAINER_INVOKEAI_ROOT will override the Invoke runtime directory location *inside* the container
      - INVOKEAI_ROOT=${CONTAINER_INVOKEAI_ROOT:-/invokeai}
      - HF_HOME
    ports:
      - "${INVOKEAI_PORT:-9090}:${INVOKEAI_PORT:-9090}"
    volumes:
      - type: bind
        source: ${HOST_INVOKEAI_ROOT:-${INVOKEAI_ROOT:-~/invokeai}}
        target: ${CONTAINER_INVOKEAI_ROOT:-/invokeai}
        bind:
          create_host_path: true
      - ${HF_HOME:-~/.cache/huggingface}:${HF_HOME:-/invokeai/.cache/huggingface}
    tty: true
    stdin_open: true


services:
  invokeai-cuda:
    <<: *invokeai
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: cdi
              device_ids:
                - nvidia.com/gpu=all

However, the generated Nix config doesn't contain the devices part. I don't know if it was ignored because of an error on my side?

@aksiksi (Owner, Author) commented Aug 31, 2024

No error on your side: compose2nix does not yet support deploy.resources.reservations.devices. I can add support for that today, but only for CDI; it shouldn't be tricky. I'll work on a PR now.

Do you have the CDI feature enabled in your Docker config? If not, it's strange how the GPU is detected when running with Docker Compose directly.

Two other minor notes:

  • FYI, the Build spec (build) is not supported by compose2nix, so you'll need to build the container first. See: Support for Compose Build spec #4
  • tty and stdin_open are not supported - not sure how critical these are, though.

@SpidFightFR commented
> I can add support for that today, but only for CDI

Sounds great to me! Thank you for your answers.

> Do you have the CDI feature enabled in your Docker config? If not, it's strange how the GPU is detected when running with Docker Compose directly.

Nope, I only have this enabled: hardware.nvidia-container-toolkit.enable = true; on Docker 25. It's something that was recommended to me on the NixOS Discourse.

@aksiksi (Owner, Author) commented Aug 31, 2024

Ah gotcha, thanks for clarifying! It looks like the feature flag is set by the NixOS module here: https://github.com/NixOS/nixpkgs/blob/nixos-24.05/nixos/modules/services/hardware/nvidia-container-toolkit/default.nix#L72

I'll also update the README with these steps for others who want to get CDI GPU support running in Docker.
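
For the curious, the linked module does roughly the equivalent of the following; this is a paraphrase of its behavior, not the exact nixpkgs source:

{ config, lib, ... }: {
  # Enable Docker's experimental CDI feature whenever Docker itself is enabled.
  virtualisation.docker.daemon.settings.features.cdi =
    lib.mkIf config.virtualisation.docker.enable true;
}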

@aksiksi (Owner, Author) commented Aug 31, 2024

In the meantime, can you please try passing your devices via the devices block, like this:

services:
  invokeai-cuda:
    <<: *invokeai
    restart: unless-stopped
    devices:
      - nvidia.com/gpu=all

The change I'm making will do exactly this, so you'll just have another way to write it in Compose.
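
For context, compose2nix turns each service into a virtualisation.oci-containers container, and a plain devices entry should come through as a --device flag on the generated container. A rough sketch of the relevant generated Nix (abridged; exact output may differ):

virtualisation.oci-containers.containers."invokeai-cuda" = {
  image = "local/invokeai:latest";
  # The Compose `devices` entry is passed straight through to the runtime:
  extraOptions = [
    "--device=nvidia.com/gpu=all"
  ];
};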

@SpidFightFR commented Aug 31, 2024

> In the meantime, can you please try passing your devices via the devices block, like this:
>
>     services:
>       invokeai-cuda:
>         <<: *invokeai
>         restart: unless-stopped
>         devices:
>           - nvidia.com/gpu=all
>
> The change I'm making will do exactly this, so you'll just have another way to write it in Compose.

Hey, I made the changes you suggested (thanks) and updated my flake to match your latest commit.
Docker Compose no longer works standalone; however, it works with compose2nix as a service, so I consider my problem solved. Thanks a lot! 👍 😄

Edit: Error log for docker compose in standalone:

$ docker compose up -d
[+] Running 0/1
 ⠙ Container docker-invokeai-cuda-1  Starting                                                                                                                                                                   0.2s 
Error response from daemon: error gathering device information while adding custom device "nvidia.com/gpu=all": no such file or directory

Edit 2: I guess I could make another service from the working standalone Docker Compose file, but honestly I don't have a use for that anymore...

@aksiksi (Owner, Author) commented Aug 31, 2024

Awesome! If you update to the latest compose2nix (I just merged the PR), you can go back to your original config and it should work with compose2nix as well :)

See step (2) in the readme section I just added: https://github.com/aksiksi/compose2nix?tab=readme-ov-file#nvidia-gpu-support

@aksiksi added the docker (Docker-specific), podman (Podman-specific), and compose (Docker Compose) labels Aug 31, 2024