Releases: intel/intel-xpu-smartune
Intel XPU SmarTune v1.0.0
Intel XPU SmarTune 1.0 Feature Description
- Resource Management
When system resources are under strain, cgroups v2 dynamically limits CPU, memory, and disk I/O usage for the most resource-intensive applications. At the same time, the power mode is switched according to the severity of the resource pressure, and resource quotas are gradually restored once the pressure subsides.
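As a rough illustration of throttling and gradual restoration, the sketch below formats a value for the cgroups v2 `cpu.max` file (`$MAX $PERIOD`) and computes evenly spaced restore steps. The function names, the step count, and the linear restore schedule are assumptions made for illustration; the release notes do not describe SmarTune's actual policy.

```python
from typing import List, Optional


def cpu_max_value(cpus: Optional[float], period_us: int = 100_000) -> str:
    """Format a cgroups v2 cpu.max value; None means no limit ("max")."""
    if cpus is None:
        return f"max {period_us}"
    # Quota is the CPU time (in microseconds) allowed per period.
    return f"{round(cpus * period_us)} {period_us}"


def restore_steps(limited: float, full: float, steps: int) -> List[float]:
    """Hypothetical linear schedule from the throttled quota back to the full one."""
    return [limited + (full - limited) * i / steps for i in range(1, steps + 1)]


# Throttle an application to half a CPU, then restore it to two CPUs in three steps.
print(cpu_max_value(0.5))
for quota in restore_steps(0.5, 2.0, 3):
    print(cpu_max_value(quota))
```

Applying such a value means writing the string to the `cpu.max` file of the application's cgroup, which requires suitable privileges.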
- Pressure Monitoring
Based on Linux PSI (Pressure Stall Information), CPU, memory, and I/O pressure data are collected in real time to calculate a composite score and categorize system load into four levels: Low, Medium, High, and Critical. Additionally, eBPF is used to intercept execve system calls, enabling real-time detection of the launch and termination of managed applications, while disk I/O utilization and system iowait are monitored independently.
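The scoring idea can be sketched as follows: parse the PSI text format exposed under `/proc/pressure/`, combine the `some avg10` values into one score, and map the score to a level. The weights and thresholds here are hypothetical placeholders, since the release notes do not publish the real ones.

```python
from typing import Dict

# Hypothetical weights and thresholds, chosen only for illustration.
WEIGHTS = {"cpu": 0.4, "memory": 0.4, "io": 0.2}
LEVELS = [(60.0, "Critical"), (30.0, "High"), (10.0, "Medium")]


def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse the contents of a /proc/pressure/<resource> file."""
    result: Dict[str, Dict[str, float]] = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()  # kind is "some" or "full"
        result[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return result


def pressure_score(samples: Dict[str, str]) -> float:
    """Weighted sum of the 'some avg10' stall percentages per resource."""
    return sum(
        WEIGHTS[name] * parse_psi(text)["some"]["avg10"]
        for name, text in samples.items()
    )


def classify(score: float) -> str:
    """Map a score to one of the four load levels."""
    for threshold, level in LEVELS:
        if score >= threshold:
            return level
    return "Low"
```

On a Linux system with PSI enabled, the sample text for each resource could be read from `/proc/pressure/cpu`, `/proc/pressure/memory`, and `/proc/pressure/io`.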
- Priority Queuing
When the system reaches a Critical pressure level or disk I/O becomes heavily congested, launch requests for new applications are paused and placed into a prioritized queue. Once resources recover, the queued applications are launched automatically in priority order. Pending launch requests in the queue can also be canceled manually.
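A deferred-launch queue of this kind can be sketched with a binary heap and lazy cancellation. The class name, method names, and priority ranks below are hypothetical and are not taken from the SmarTune code base.

```python
import heapq
import itertools
from typing import Dict, List, Optional, Tuple

# Hypothetical ranks: a lower number is launched first.
PRIORITY_RANK = {"Critical": 0, "High": 1, "Low": 2}


class LaunchQueue:
    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._seq = itertools.count()      # tie-breaker: FIFO within a priority
        self._pending: Dict[str, int] = {}  # app -> sequence number of live entry

    def defer(self, app: str, priority: str) -> None:
        """Queue a launch request while the system is under pressure."""
        seq = next(self._seq)
        self._pending[app] = seq
        heapq.heappush(self._heap, (PRIORITY_RANK[priority], seq, app))

    def cancel(self, app: str) -> bool:
        """Cancel a pending request; the stale heap entry is skipped later."""
        return self._pending.pop(app, None) is not None

    def pop_next(self) -> Optional[str]:
        """Return the next app to launch once resources recover, or None."""
        while self._heap:
            _, seq, app = heapq.heappop(self._heap)
            if self._pending.get(app) == seq:
                del self._pending[app]
                return app
        return None


queue = LaunchQueue()
queue.defer("browser", "Low")
queue.defer("camera", "Critical")
queue.defer("editor", "High")
queue.cancel("editor")
print(queue.pop_next())  # camera
print(queue.pop_next())  # browser
```

Lazy cancellation avoids re-heapifying on every cancel: canceled entries stay in the heap and are discarded when they reach the top.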
- Application Keep-Alive
For managed applications designated with "Critical" priority, the probability of being terminated by the system's OOM Killer is significantly reduced. Furthermore, the running processes of these key applications are continuously monitored to ensure their stable and uninterrupted operation.
- Disk I/O Control
Through cgroups v2, applications consuming excessive disk I/O resources are throttled. Read/write bandwidth and IOPS quotas are allocated based on priority levels, and these limits are gradually lifted once the disk I/O pressure has dissipated.
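For reference, cgroups v2 expresses per-device I/O limits as a line in the `io.max` file of the form `MAJ:MIN rbps=... wbps=... riops=... wiops=...`, where `max` removes a limit. The helper below only builds such a line; the per-priority quota table is a made-up example and does not reflect SmarTune's actual values.

```python
from typing import Dict, Optional

# Hypothetical quotas per priority: bytes per second and IOPS.
QUOTAS: Dict[str, Dict[str, Optional[int]]] = {
    "High": {"rbps": 200 * 1024 * 1024, "wbps": 100 * 1024 * 1024,
             "riops": None, "wiops": None},
    "Low": {"rbps": 20 * 1024 * 1024, "wbps": 10 * 1024 * 1024,
            "riops": 500, "wiops": 200},
}


def io_max_line(major: int, minor: int,
                rbps: Optional[int] = None, wbps: Optional[int] = None,
                riops: Optional[int] = None, wiops: Optional[int] = None) -> str:
    """Build one io.max line for a block device; None means unlimited."""
    def fmt(value: Optional[int]) -> str:
        return "max" if value is None else str(int(value))

    return (f"{major}:{minor} rbps={fmt(rbps)} wbps={fmt(wbps)} "
            f"riops={fmt(riops)} wiops={fmt(wiops)}")


# Throttle a Low-priority application on device 8:0, then lift all limits.
print(io_max_line(8, 0, **QUOTAS["Low"]))
print(io_max_line(8, 0))
```

Writing the resulting line to the `io.max` file of the application's cgroup applies the limit; this requires the `io` controller to be enabled for that cgroup and suitable privileges.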
- Network I/O Control
Inbound and outbound traffic for managed applications is controlled using a combination of cgroups, iptables, and tc/HTB. Bandwidth is allocated across four priority tiers: Critical, High, Low, and System. Network pressure levels are calculated in real time from a sliding average window; when the system reaches a Critical pressure level, bandwidth caps are imposed in sequence, first on Low-priority classes and then on High-priority classes, and are gradually restored once the pressure subsides.
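The sliding-average and ordered-capping logic might look roughly like the sketch below. The window size and helper names are assumptions, and leaving the Critical and System tiers uncapped is inferred from the description rather than confirmed by it.

```python
from collections import deque
from typing import Deque, List, Optional

# Order in which tiers are capped under Critical pressure.
# Assumption: the Critical and System tiers are never capped.
THROTTLE_ORDER = ["Low", "High"]


class SlidingAverage:
    """Mean of the most recent `window` utilization samples."""

    def __init__(self, window: int) -> None:
        self._samples: Deque[float] = deque(maxlen=window)

    def add(self, utilization: float) -> float:
        self._samples.append(utilization)  # the oldest sample drops out when full
        return sum(self._samples) / len(self._samples)


def next_class_to_cap(already_capped: List[str]) -> Optional[str]:
    """Pick the next tier to cap, or None if every eligible tier is capped."""
    for name in THROTTLE_ORDER:
        if name not in already_capped:
            return name
    return None


avg = SlidingAverage(window=3)
for sample in (10, 20, 30, 40):
    smoothed = avg.add(sample)
print(smoothed)               # 30.0, the mean of the last three samples
print(next_class_to_cap([]))  # Low
```

Averaging over a window keeps a single traffic burst from triggering caps on its own; the actual caps would be applied through tc/HTB class rates, which this sketch does not attempt to model.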
- Manual Control
Users can manually manage controlled applications through operations that include adjusting priority levels, canceling queued launch requests, setting specific resource limits (for CPU, memory, and I/O), restoring default resource quotas, enabling or disabling the "Keep-Alive" function, and removing applications from management.
- Simplified XPU Monitoring
Provides real-time statistics and visual displays of the utilization and operating frequency of the CPU, memory, iGPU, and NPU.