Nvidia RTD3 Power Management Unawareness Bug in Hybrid Graphics Laptops #207

Open
1 task done
funkemunky opened this issue Apr 18, 2024 · 13 comments
Labels
bug Something isn't working

Comments

@funkemunky

Is there an existing issue for this?

  • I searched the existing issues and did not find anything similar.

Current Behavior

When I open the Resources app, the NVIDIA GPU wakes from suspension even though no application is using it. Its status then stays active for as long as Resources is running.

Resources itself is not rendered on the NVIDIA GPU, but its monitoring of the dedicated GPU keeps that GPU awake indefinitely.

Expected Behavior

The NVIDIA GPU should ideally not be woken when it is not in use. I know that a limitation in the NVIDIA driver might prevent the GPU from being monitored without waking it. However, a workaround could be implemented by hooking into the NVIDIA driver and checking:

  1. Whether the GPU is already awake.
  2. Why it is awake.
  3. If Resources is the reason it is awake, stop monitoring.

Steps To Reproduce

  1. Check the runtime power status of the NVIDIA card on your hybrid graphics system. I use "cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status". If it returns "suspended", the NVIDIA GPU is asleep.
  2. Open the Resources application and run the same command again. It now returns "active" and keeps doing so for as long as Resources is open.
  3. Check nvidia-smi to see whether any process other than gnome-shell is using the GPU (on KDE, whether any process at all is using it). If none is, then it is Resources that prevents RTD3 from suspending the GPU.
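
The check in step 1 hard-codes the PCI address 0000:01:00.0, which can differ between machines. Below is a minimal sketch that finds NVIDIA display devices by their PCI vendor ID (0x10de) and prints each one's runtime_status. The function name `nvidia_runtime_status` and its optional sysfs-root argument are made up for illustration; they are not part of Resources or the NVIDIA driver.

```shell
# Print "<pci-address> <runtime_status>" for every NVIDIA display device.
# $1 (optional) overrides the sysfs directory, which is handy for testing.
nvidia_runtime_status() {
  local root="${1:-/sys/bus/pci/devices}" dev vendor class
  for dev in "$root"/*; do
    [ -r "$dev/vendor" ] && [ -r "$dev/class" ] || continue
    vendor=$(cat "$dev/vendor")
    class=$(cat "$dev/class")
    # 0x10de is NVIDIA's PCI vendor ID.
    [ "$vendor" = "0x10de" ] || continue
    # Class 0x03xxxx is a display controller; this skips the GPU's audio function.
    case "$class" in 0x03*) ;; *) continue ;; esac
    if [ -r "$dev/power/runtime_status" ]; then
      printf '%s %s\n' "${dev##*/}" "$(cat "$dev/power/runtime_status")"
    fi
  done
}
```

Calling `nvidia_runtime_status` with no argument before and after opening Resources should show the same transition as the manual `cat` in steps 1 and 2.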

Environment

Program Version: 1.4.0
Package: Flatpak
System Version: Fedora 40 (all drivers and packages are up to date running their latest major versions).

*Hardware Information*
CPU: AMD Ryzen 7840H with Radeon 780M Graphics
GPU: NVIDIA RTX 4060
MEMORY: 6400 MT/s 32 GB on-board memory
Storage: 1 TB SK Hynix NVMe SSD

System Model: Lenovo Legion Slim 5 14" AMD

Anything else?

I would be happy to contribute any potential fixes and provide a test environment if you don't have an equivalent available.

@funkemunky funkemunky added the bug Something isn't working label Apr 18, 2024
@nokyan
Owner

nokyan commented Apr 19, 2024

Hi, thanks for reporting the issue.
I'd prefer having a switch somewhere on the GPU page to toggle automatic halting of monitoring. That switch could also say that enabling that could save battery life, otherwise it might be confusing for users as to why Resources seemingly randomly stops monitoring the GPU. But I'll definitely take this into consideration. :)

@saltyming

Hi, thanks for reporting the issue. I'd prefer having a switch somewhere on the GPU page to toggle automatic halting of monitoring. That switch could also say that enabling that could save battery life, otherwise it might be confusing for users as to why Resources seemingly randomly stops monitoring the GPU. But I'll definitely take this into consideration. :)

You can also make it check "cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status" periodically and show zero usage if the GPU is in D3.
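
As a rough sketch of that idea (not how Resources implements it), the guard could look like the shell function below: it reads runtime_status first and only runs the actual utilization query when the GPU is not suspended. The function name `gpu_usage_or_zero` is hypothetical, and the query command is passed in as arguments so that nothing touches the driver while the GPU sleeps.

```shell
# Print GPU utilization, or 0 when the device is runtime-suspended.
# $1:   path to the device's power/runtime_status file
# $2..: command that queries utilization (only run when the GPU is awake)
gpu_usage_or_zero() {
  local status_file="$1" status
  shift
  status=$(cat "$status_file" 2>/dev/null) || status="unknown"
  if [ "$status" = "suspended" ]; then
    echo 0   # skip the query so the GPU is left asleep
  else
    "$@"
  fi
}

# Example (assumes the PCI address from this issue and an installed nvidia-smi):
# gpu_usage_or_zero /sys/bus/pci/devices/0000:01:00.0/power/runtime_status \
#   nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits
```

There is a small race between reading the status and running the query, so this reduces wake-ups and may not rule them out completely.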

@marcinx64

Hi, thanks for reporting the issue. I'd prefer having a switch somewhere on the GPU page to toggle automatic halting of monitoring. That switch could also say that enabling that could save battery life, otherwise it might be confusing for users as to why Resources seemingly randomly stops monitoring the GPU. But I'll definitely take this into consideration. :)

Hi, maybe something like “GPU Suspended” in the center of the sidebar's graph, plus some info on the tab, would be enough to avoid confusion? Besides, users with dual GPUs mostly know what their hardware is capable of. For example, the last time I used ASUS Armoury Crate, it did not display any info while the dGPU was suspended: the GPU vanished from the GPU list on entering power state D3 and was not visible at all while suspended.

@nokyan
Owner

nokyan commented Nov 30, 2024

Hi,
I'm sorry I haven't yet handled this issue. I'll try to make it happen for Resources 1.8.
I think showing that the GPU is suspended in the sidebar is a good idea, I've been thinking kinda like this (imagine the graph being a flat 0%):
[image]
I've also had the idea to monitor GPU usage (and thus wake up the GPU) only if its page is actually selected and otherwise halt monitoring. Though I'll make this togglable (or, if possible, detect if Resources is running on a laptop with a dGPU) in the settings.

@funkemunky
Author

I would prefer waking it not be an option. The only point of Resources is to monitor currently-used resources on a system, so why make it use them? That is my personal opinion, however. Let me know what you think.

@nokyan
Owner

nokyan commented Jan 4, 2025

Do you mind checking out the prevent-gpu-wakeup branch?

@Kimiblock

From what I've tested, the specific branch doesn't work on my hybrid GPU system. The PKGBUILD is listed as follows:

pkgname=resources
pkgver=1.7.1
pkgrel=1
pkgdesc='Monitor for system resources and processes'
arch=(x86_64)
url='https://apps.gnome.org/Resources/'
license=(GPL-3.0-or-later)
depends=(
  cairo
  dconf
  gcc-libs
  glib2
  glibc
  graphene
  gtk4
  hicolor-icon-theme
  libadwaita
  polkit
)
makedepends=(
  appstream
  git
  meson
  rust
)
source=("git+https://github.com/nokyan/resources.git#tag=v$pkgver")
b2sums=(a342311b26bd55f56b148f64f8044817cd395cb12e259bd9e5c479b5e7ef4d6161214a128732f735088ccdee9cea92eef6e294cebaf13dde3c3eee07b4a1caf3)

prepare() {
  cd $pkgname
  git checkout origin/prevent-gpu-wakeup

  CARGO_HOME="$srcdir/build/cargo" \
    cargo fetch --locked --target "$(rustc -vV | sed -n 's/host: //p')"
}

build() {
  arch-meson $pkgname build \
    -D profile=default

  CARGO_PROFILE_RELEASE_LTO=true \
    CARGO_PROFILE_RELEASE_CODEGEN_UNITS=1 \
    CARGO_PROFILE_RELEASE_DEBUG=2 \
    CARGO_PROFILE_RELEASE_STRIP=false \
    meson compile -C build
}

check() {
  meson test -C build --print-errorlogs --no-rebuild
}

package() {
  meson install -C build --destdir "$pkgdir" --no-rebuild
}

Is there anything I'm missing?

@nokyan
Owner

nokyan commented Jan 11, 2025

From what I've tested, the specific branch doesn't work on my hybrid GPU system. The PKGBUILD is listed as follows:

Is there anything I'm missing?

You can try the Flatpak build; build instructions are in the README. I don't know too much about PKGBUILDs, I'm afraid.

@funkemunky
Author

I just compiled it natively (flatpak-builder wasn't working because a dmidecode package couldn't be downloaded; I got an HTTP 429 error every time).

I found that it still wakes up the dedicated NVIDIA RTX 4060. I am using Ubuntu 24.10 with the NVIDIA driver that Ubuntu provides.

@nokyan
Owner

nokyan commented Jan 24, 2025

Do you mind pulling the latest commit, running Resources with the environment variable RUST_LOG=resources=trace and looking for lines that look like this?

TRACE resources::utils::gpu > Reading /sys/class/drm/card1/device/power/runtime_status…
TRACE resources::utils::gpu > /sys/class/drm/card1/device/power/runtime_status → active (String)
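
To pick those lines out of a long trace, the output can be piped through a small filter. This is only a sketch: the helper name `runtime_status_lines` is made up, and the commented invocations assume that the native binary is called `resources`, that the logger writes to stderr (hence `2>&1`), and that the Flatpak app ID is net.nokyan.Resources (a development build may use a different ID).

```shell
# Keep only the log lines that mention runtime_status.
runtime_status_lines() {
  grep -F 'runtime_status'
}

# Native build:
# RUST_LOG=resources=trace resources 2>&1 | runtime_status_lines
#
# Flatpak build:
# flatpak run --env=RUST_LOG=resources=trace net.nokyan.Resources 2>&1 | runtime_status_lines
```

If the filter prints nothing at all, either trace logging is not enabled or the build does not contain the runtime_status check.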

@funkemunky
Author

funkemunky commented Jan 31, 2025

For me, the nvidia GPU is card0, and those messages do not appear. Sorry for the delay, I switched back to Fedora 41. Same issue is still occurring.

Here is the log: https://pastebin.com/0bx7r8ju
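
Since the card number differs between machines (card0 here, card1 in the example above), a quick way to see which DRM card belongs to which GPU is to list each card's PCI vendor ID next to its runtime_status. The sketch below uses a hypothetical helper name, `list_drm_cards`; 0x10de is NVIDIA's vendor ID and 0x1002 is AMD's.

```shell
# Print "<card> <pci-vendor-id> <runtime_status>" for every DRM card.
# $1 (optional) overrides the sysfs directory, which is handy for testing.
list_drm_cards() {
  local root="${1:-/sys/class/drm}" card name
  for card in "$root"/card*; do
    name=${card##*/}
    # Skip connector entries such as card0-eDP-1.
    case "$name" in *-*) continue ;; esac
    [ -r "$card/device/vendor" ] || continue
    printf '%s %s %s\n' "$name" \
      "$(cat "$card/device/vendor")" \
      "$(cat "$card/device/power/runtime_status" 2>/dev/null || echo unknown)"
  done
}
```

On the system described in this issue, the line starting with card0 would be expected to show 0x10de.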

@nokyan
Owner

nokyan commented Jan 31, 2025

For me, the nvidia GPU is card0, and those messages do not appear. Sorry for the delay, I switched back to Fedora 41. Same issue is still occurring.

Here is the log: https://pastebin.com/0bx7r8ju

Thank you. I believe trace debug logs were turned off during your test. Did you make sure you were running Resources with RUST_LOG=resources=trace set? You can change line 15 in build-aux/net.nokyan.Resources.Devel.json for that, for example.

@funkemunky
Author

funkemunky commented Jan 31, 2025 via email
