@Avenger-285714 (Member) commented Dec 12, 2025

This PR also fixes the wrong debugfs node name in the hisi_spi debugfs initialization.

Summary by Sourcery

Add support for new CPPC-based cpufreq features and a SEEP governor, improve handling of CPPC registers and perf counters, and expose additional tuning controls via sysfs for Kunpeng SoCs.

New Features:

  • Introduce a SEEP cpufreq governor that leverages CPPC autonomous selection and related features when the cppc_cpufreq driver is in use.
  • Expose CPPC autonomous activity window and energy performance preference controls through optional cpufreq sysfs attributes when supported.
  • Make scaling_cur_freq a standard cpufreq policy attribute instead of creating it conditionally at device interface setup.

Bug Fixes:

  • Handle unsupported or unreadable CPPC performance counters and registers more gracefully, avoiding spurious errors and clearing frequency invariance state when needed.
  • Ensure cpufreq policies correctly mirror and apply boost state via the driver set_boost hook during online operations, including updating QoS constraints on existing policies.
  • Prevent failures when cpufreq drivers do not implement a get callback by guarding frequency verification and queries accordingly.

Enhancements:

  • Adjust cppc_cpufreq policy min/max frequency setup to respect boost mode by using highest_perf when boost is enabled and align cpuinfo.max_freq with the effective max.
  • Refine CPPC EM registration and efficiency class population to be no-op when classes are uniform and simplify the related control flow.
  • Remove unused global tracking of per-CPU CPPC data structures in cppc_cpufreq and rely on per-policy lifecycle management instead.
  • Improve cpufreq governor start/stop and offline flows by centralizing current-frequency verification and simplifying governor state preservation.
  • Add generic helpers in the CPPC ACPI library to read and write individual CPPC registers and to get/set EPP and autonomous activity window/selection values more consistently.

Documentation:

  • Extend cpufreq documentation and ABI testing descriptions to cover the new SEEP governor and CPPC sysfs controls for autonomous activity window and energy performance preference.

Lifeng Zheng and others added 20 commits December 12, 2025 20:10
[Upstream commit 1608f02]
It turns out that CPUX will stay on the base frequency after performing
these operations:

 1. boost all CPUs: echo 1 > /sys/devices/system/cpu/cpufreq/boost

 2. offline one CPU: echo 0 > /sys/devices/system/cpu/cpuX/online

 3. deboost all CPUs: echo 0 > /sys/devices/system/cpu/cpufreq/boost

 4. online CPUX: echo 1 > /sys/devices/system/cpu/cpuX/online

 5. boost all CPUs again: echo 1 > /sys/devices/system/cpu/cpufreq/boost

This is because max_freq_req of the policy is not updated during the
online process, and the value of max_freq_req before the last offline is
retained.

When the CPU is boosted again, freq_qos_update_request() does nothing
because the old value is the same as the new one, so the CPU stays at
the base frequency. Updating max_freq_req in cpufreq_online() solves
this problem.
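
The five steps above can be replayed in a small user-space model. This is a sketch under stated assumptions, not the kernel code: `struct qos_request`, `struct policy_model`, `qos_update()`, `set_boost()` and `replay_sequence()` are hypothetical names, the frequencies are made up, and the model assumes the policy's effective maximum is only refreshed when the QoS request value actually changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BASE_KHZ  2000000 /* made-up base frequency */
#define BOOST_KHZ 2600000 /* made-up boost frequency */

/* Hypothetical stand-in for the policy's max_freq_req. */
struct qos_request {
    int value;
};

/* Hypothetical stand-in for a policy: QoS request plus effective max. */
struct policy_model {
    struct qos_request max_freq_req;
    int max;
};

/* Mimics the no-op behaviour: returns 0 when the value is unchanged. */
static int qos_update(struct qos_request *req, int new_value)
{
    assert(req != NULL);
    if (req->value == new_value)
        return 0;
    req->value = new_value;
    return 1;
}

/* Assumption: the effective max is refreshed only if the request changed. */
static void set_boost(struct policy_model *p, bool enable)
{
    int target = enable ? BOOST_KHZ : BASE_KHZ;

    if (qos_update(&p->max_freq_req, target))
        p->max = target;
}

/* Replays steps 1-5 for CPUX and returns its final effective max in kHz. */
int replay_sequence(bool update_req_on_online)
{
    struct policy_model cpux = { { BASE_KHZ }, BASE_KHZ };

    set_boost(&cpux, true);  /* 1. boost all CPUs */
    /* 2. CPUX goes offline: its request keeps BOOST_KHZ */
    /* 3. deboost: offline CPUX is skipped, request is still BOOST_KHZ */
    cpux.max = BASE_KHZ;     /* 4. online: driver init picks the base max */
    if (update_req_on_online)
        qos_update(&cpux.max_freq_req, cpux.max); /* the fix */
    set_boost(&cpux, true);  /* 5. boost all CPUs again */

    return cpux.max;
}
```

In this model, `replay_sequence(false)` leaves CPUX at the base frequency because step 5 sees an unchanged request, while `replay_sequence(true)` reaches the boost frequency.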

Signed-off-by: Lifeng Zheng <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
… flag

[Upstream commit dd016f3]
In cpufreq_online() of cpufreq.c, the per-policy boost flag is already
set to mirror the cpufreq_driver boost during init, but freq_table is
used to judge whether the policy has a boost frequency. There are two
drawbacks to this approach:

 1. It doesn't work for cpufreq drivers that do not use a frequency
    table. For now, acpi-cpufreq and amd-pstate have to enable boost in
    policy initialization, and cppc_cpufreq never sets the policy to
    boost when going online, no matter what the cpufreq_driver boost
    flag is.

 2. If the CPU goes offline when cpufreq_driver boost is enabled and
    then goes online when cpufreq_driver boost is disabled, the
    per-policy boost flag will incorrectly remain true.

Running set_boost at the end of the online process is a more generic
approach that works for all cpufreq drivers.
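
A minimal user-space sketch of the mirroring idea, assuming the per-policy flag is compared with the global driver flag at the end of the online path and rolled back if the driver callback fails. `struct policy_model`, `model_set_boost()` and `mirror_boost_on_online()` are hypothetical names for illustration, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified policy: only the per-policy boost flag. */
struct policy_model {
    bool boost_enabled;
    int set_boost_calls; /* counts driver callback invocations */
};

/* Stand-in for a driver's set_boost hook; always succeeds here. */
static int model_set_boost(struct policy_model *p, bool state)
{
    (void)state;
    p->set_boost_calls++;
    return 0;
}

/*
 * End of the online path: make the per-policy flag mirror the global
 * driver flag, and restore the old flag if the callback fails.
 */
int mirror_boost_on_online(struct policy_model *p, bool driver_boost)
{
    int ret;

    assert(p != NULL);
    if (p->boost_enabled == driver_boost)
        return 0; /* already in sync, nothing to do */

    p->boost_enabled = driver_boost;
    ret = model_set_boost(p, driver_boost);
    if (ret)
        p->boost_enabled = !driver_boost; /* keep the previous state */
    return ret;
}
```

This covers the second drawback: a policy that went offline with boost enabled and comes back while the driver boost is disabled gets its stale flag cleared, without depending on a frequency table.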

Signed-off-by: Lifeng Zheng <[email protected]>
Link: https://patch.msgid.link/[email protected]
Acked-by: Viresh Kumar <[email protected]>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 03d8b4e]
In policy initialization, policy->max and policy->cpuinfo.max_freq are
always set to the value calculated from caps->nominal_perf.

This causes the frequency to stay at the base frequency even if the
policy is already boosted when a CPU goes online.

Fix this by using policy->boost_enabled to determine which value should
be set.
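
The selection can be sketched as follows. This is an illustrative model only: `struct perf_caps_model` and `policy_max_perf()` are hypothetical names, and the conversion from abstract performance levels to kHz that a driver would also perform is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Subset of the CPPC capabilities used here, as abstract perf levels. */
struct perf_caps_model {
    unsigned int nominal_perf; /* sustained (base) performance */
    unsigned int highest_perf; /* maximum (boost) performance */
};

/* Pick the performance level that the policy maximum is derived from. */
unsigned int policy_max_perf(const struct perf_caps_model *caps,
                             bool boost_enabled)
{
    assert(caps != NULL);
    return boost_enabled ? caps->highest_perf : caps->nominal_perf;
}
```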

Signed-off-by: Lifeng Zheng <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 2b16c63]
At the end of cpufreq_online() in cpufreq.c, set_boost is executed and
the per-policy boost flag is set to mirror the cpufreq_driver boost, so
it is not necessary to run set_boost in acpi_cpufreq_cpu_init().

Signed-off-by: Lifeng Zheng <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 0813fd2]
Ensure cpufreq_driver->set_boost is non-NULL before using it in
cpufreq_online() to prevent a potential NULL pointer dereference.
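
A small sketch of the guard, assuming a driver structure whose `set_boost` hook may be left NULL when the driver has no boost support. All names here (`struct driver_model`, `online_apply_boost()`, `accept_boost()`) are hypothetical and only model the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct policy_model {
    bool boost_enabled;
};

struct driver_model {
    /* Optional hook: NULL when the driver does not support boost. */
    int (*set_boost)(struct policy_model *p, int state);
};

/* Example hook that accepts any boost state. */
int accept_boost(struct policy_model *p, int state)
{
    (void)p;
    (void)state;
    return 0;
}

/* Apply the global boost state to a policy that is coming online. */
int online_apply_boost(const struct driver_model *drv,
                       struct policy_model *p, bool global_boost)
{
    assert(drv != NULL && p != NULL);
    if (!drv->set_boost)
        return 0; /* skip instead of calling through a NULL pointer */
    if (p->boost_enabled == global_boost)
        return 0;

    p->boost_enabled = global_boost;
    return drv->set_boost(p, global_boost);
}
```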

Reported-by: Gautam Menghani <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Fixes: da59223d340c ("cpufreq: Introduce a more generic way to set default per-policy boost flag")
Suggested-by: Viresh Kumar <[email protected]>
Signed-off-by: Aboorva Devarajan <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Minor edits in the subject and changelog ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
… cpus

commit d08e86c77069fbbdd7fdbdaa408c198223bc0900 openEuler

Reading perf counters on offline CPUs is expected to fail, e.g. it
returns -EFAULT because the counters read as 0.  Remove the unnecessary
warning print on this failure path.

Fixes: 1eb5dde ("cpufreq: CPPC: Add support for frequency invariance")
Signed-off-by: Jie Zhan <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
commit 2eab37aa8a55cdb8b85a912a4b1156c39689c89d openEuler

Perf counters could be 0 if the cpu is in a low-power idle state.  Just try
it again next time and update the frequency scale when the cpu is active
and perf counters successfully return.

Also, remove the FIE source on an actual failure.
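
The retry behaviour can be sketched in user space as follows. This is a model under stated assumptions, not the driver code: `struct fb_ctrs_model` and `delivered_perf()` are hypothetical, and the model treats a zero counter or a zero delta as "no usable sample yet" rather than as an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the two feedback counters. */
struct fb_ctrs_model {
    uint64_t reference;
    uint64_t delivered;
};

/*
 * Returns true and stores the delivered performance when both samples
 * are usable. Returns false, leaving *perf untouched, when the sample
 * should simply be retried on the next tick.
 */
bool delivered_perf(const struct fb_ctrs_model *t0,
                    const struct fb_ctrs_model *t1,
                    uint64_t reference_perf, uint64_t *perf)
{
    uint64_t delta_ref, delta_del;

    assert(t0 != NULL && t1 != NULL && perf != NULL);
    if (!t1->reference || !t1->delivered)
        return false; /* CPU was probably idle: try again later */

    delta_ref = t1->reference - t0->reference;
    delta_del = t1->delivered - t0->delivered;
    if (!delta_ref || !delta_del)
        return false;

    *perf = reference_perf * delta_del / delta_ref;
    return true;
}
```

A caller following this pattern would keep the previous frequency scale when `delivered_perf()` returns false and only drop the invariance source on a genuine read failure.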

Fixes: 1eb5dde ("cpufreq: CPPC: Add support for frequency invariance")
Signed-off-by: Jie Zhan <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 2b8e6b5]
Returning a negative error code in a function with an unsigned
return type is a pretty bad idea. It is probably worse when the
justification for the change is "our static analysis tool found it".

Fixes: cf7de25 ("cppc_cpufreq: Fix possible null pointer dereference")
Signed-off-by: Marc Zyngier <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Viresh Kumar <[email protected]>
Reviewed-by: Lifeng Zheng <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
commit d7c560f56e528fbb009f5f2b70cc813aad66661d openEuler

Returning a negative error code in a function with an unsigned
return type is a pretty bad idea. Returning 0 is enough when something
goes wrong.
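
A short illustration of why this matters, using hypothetical functions `get_rate_buggy()` and `get_rate_fixed()` and a made-up frequency. It assumes a platform where `unsigned int` is 32 bits wide.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define RATE_KHZ 2600000u /* made-up frequency */

/* Buggy variant: the negative errno is converted to unsigned. */
unsigned int get_rate_buggy(bool have_policy)
{
    if (!have_policy)
        return -ENODEV; /* becomes a huge positive "frequency" */
    return RATE_KHZ;
}

/* Fixed variant: 0 signals "no valid rate" to the caller. */
unsigned int get_rate_fixed(bool have_policy)
{
    if (!have_policy)
        return 0;
    return RATE_KHZ;
}
```

With the buggy variant a caller cannot tell the error apart from a very high frequency, whereas 0 is never a valid rate.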

Fixes: f84b9b2 ("cppc_cpufreq: Fix possible null pointer dereference")
Signed-off-by: Lifeng Zheng <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
…te()

commit 5c5c2aac07c1202962821d54c8bebd5c8418d22e openEuler

After commit ae2df91 ("ACPI: CPPC: Disable FIE if registers in PCC
regions"), the only place that uses hisi_cppc_cpufreq_get_rate() is
cppc_check_hisi_workaround(), which comes after the implementation of
hisi_cppc_cpufreq_get_rate(). A forward declaration of
hisi_cppc_cpufreq_get_rate() is unnecessary.

Fixes: ae2df91 ("ACPI: CPPC: Disable FIE if registers in PCC regions")
Signed-off-by: Lifeng Zheng <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit d80a756]
After commit a28b2bf ("cppc_cpufreq: replace per-cpu data array with a
list"), cpu_data can be obtained from policy->driver_data, so
cpu_data_list is not actually needed and can be removed.

Signed-off-by: Lifeng Zheng <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 3d5978e]
The return value of populate_efficiency_class() is never needed and the
result of it doesn't affect the initialization of cppc_cpufreq.

It makes more sense to change it into a void function.

Signed-off-by: Lifeng Zheng <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit c83a92d]
cppc_cpufreq_register_em() is only used in populate_efficiency_class(). A
forward declaration of it is not necessary.

Move cppc_cpufreq_register_em() in front of populate_efficiency_class()
and remove the forward declaration of cppc_cpufreq_register_em().

No functional change.

Signed-off-by: Lifeng Zheng <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 2e554cf]
After commit c034b02 ("cpufreq: expose scaling_cur_freq sysfs file
for set_policy() drivers"), the file scaling_cur_freq is exposed to all
drivers.

There is no need to create this file separately. It is better to
include it in cpufreq_attrs.

Signed-off-by: Lifeng Zheng <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 5d6ecaa]
The has_target() checks in __cpufreq_offline() are duplicated.

Remove one of them and group the governor exit operations together
with storing the last governor's name.

Signed-off-by: Lifeng Zheng <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
…boost

commit efc1ef3222b0c34a14395f84330fa890cfd4ec3f openEuler

Hold the lock to avoid concurrency problems in
cpufreq_enable_boost_support() when assigning cpufreq_driver->set_boost.

Fixes: 7a6c79f ("cpufreq: Simplify core code related to boost support")
Signed-off-by: Lifeng Zheng <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
…rrent_freq()

[Upstream commit 908981d]
Move the check of cpufreq_driver->get into cpufreq_verify_current_freq()
so that it cannot be called without the check.

Signed-off-by: Lifeng Zheng <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
…ters

commit b718dd523687c682c11f3aa590780c277a8a90d9 openEuler

cppc_set_epp - write energy performance preference register

cppc_get_auto_act_window - read autonomous activity window register

cppc_set_auto_act_window - write autonomous activity window register

cppc_get_auto_sel - read autonomous selection enable register

Signed-off-by: hepeng <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
…ufreq

commit c4ba198c5c002c06a6bd49c5a5520c01fef890b5 openEuler

Add sysfs interfaces for CPPC auto act window and energy perf in the
cppc_cpufreq driver.

Signed-off-by: hepeng <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
commit 4e05c2f4ecf5a9116751b941c885f6d516860529 openEuler

Add a new CPUFreq governor 'seep' designed for platforms with
hardware-managed P-states through CPPC (Collaborative Processor
Performance Control). This governor enables the hardware's autonomous
frequency selection capability, allowing the processor to manage its
own frequency based on workload characteristics.

The SEEP governor requires:
- cppc_cpufreq driver
- Platform support for CPPC features:
  * Autonomous selection (auto_sel)
  * Autonomous activity window (auto_act_window)
  * Energy Performance Preference (epp)

Two per-policy sysfs interfaces are provided:
- auto_act_window: Control the hardware's frequency scaling window
- energy_perf: Bias between performance and energy efficiency
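
The governor's described behaviour (check the required features, then toggle autonomous selection on start and stop) can be modelled in user space. This is only a sketch under stated assumptions: the feature bits, the simulated register array and every function name below are hypothetical, and the real governor talks to firmware through the CPPC library rather than to an array.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical feature bits a platform may report. */
#define FEAT_AUTO_SEL        (1u << 0)
#define FEAT_AUTO_ACT_WINDOW (1u << 1)
#define FEAT_EPP             (1u << 2)
#define SEEP_REQUIRED (FEAT_AUTO_SEL | FEAT_AUTO_ACT_WINDOW | FEAT_EPP)

/* Simulated autonomous-selection enable register, one per CPU. */
static bool auto_sel_reg[MODEL_NR_CPUS];

/* Stand-in for the register write; returns -1 for an invalid CPU. */
static int model_set_auto_sel(int cpu, bool enable)
{
    if (cpu < 0 || cpu >= MODEL_NR_CPUS)
        return -1;
    auto_sel_reg[cpu] = enable;
    return 0;
}

/* The governor is only usable when all three features are present. */
bool seep_supported(unsigned int features)
{
    return (features & SEEP_REQUIRED) == SEEP_REQUIRED;
}

/* Start: hand frequency selection over to the hardware. */
int seep_start(int cpu)
{
    return model_set_auto_sel(cpu, true);
}

/* Stop: take frequency selection back from the hardware. */
void seep_stop(int cpu)
{
    (void)model_set_auto_sel(cpu, false);
}

bool seep_auto_sel_enabled(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    return auto_sel_reg[cpu];
}
```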

Signed-off-by: hepeng <[email protected]>
Signed-off-by: zhaolichang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
@sourcery-ai bot commented Dec 12, 2025

Reviewer's Guide

Backports and extends CPPC-based cpufreq support to add a new SEEP governor, richer CPPC sysfs controls, and more robust policy/boost/invariance handling for Kunpeng SoCs, while simplifying cppc_cpudata and EM registration and centralizing scaling_cur_freq sysfs creation.

Sequence diagram for the SEEP governor start and CPPC interactions

sequenceDiagram
    actor UserSpace
    participant CpufreqCore
    participant CpufreqGovernorSeep
    participant ACPI_CPPC
    participant FirmwarePCC

    UserSpace->>CpufreqCore: select governor seep for policy
    CpufreqCore->>CpufreqGovernorSeep: start(policy)
    Note right of CpufreqGovernorSeep: cpufreq_gov_seep_start
    CpufreqGovernorSeep->>ACPI_CPPC: cppc_set_auto_sel(cpu, 1)
    Note right of ACPI_CPPC: Enable autonomous P-state selection

    alt AUTO_SEL register in PCC
        ACPI_CPPC->>FirmwarePCC: send_pcc_cmd(CMD_WRITE)
        FirmwarePCC-->>ACPI_CPPC: status
        ACPI_CPPC-->>CpufreqGovernorSeep: ret
    else AUTO_SEL in memory-mapped space
        ACPI_CPPC->>ACPI_CPPC: cpc_write(cpu, AUTO_SEL_ENABLE, 1)
        ACPI_CPPC-->>CpufreqGovernorSeep: ret
    end

    CpufreqGovernorSeep-->>CpufreqCore: start result
    CpufreqCore-->>UserSpace: governor seep active

    rect rgb(230,230,250)
    UserSpace->>CpufreqCore: read/write auto_act_window, energy_perf via sysfs
    CpufreqCore->>ACPI_CPPC: cppc_get_auto_act_window / cppc_set_auto_act_window
    CpufreqCore->>ACPI_CPPC: cppc_get_epp_perf / cppc_set_epp
    ACPI_CPPC-->>CpufreqCore: values / status
    CpufreqCore-->>UserSpace: sysfs read/write complete
    end

Class diagram for SEEP governor, CPPC helpers, and cpufreq core changes

classDiagram
    class cpufreq_governor {
        +char* name
        +int (*start)(struct cpufreq_policy *policy)
        +void (*stop)(struct cpufreq_policy *policy)
        +struct module *owner
    }

    class cpufreq_gov_seep {
        +name = seep
        +start(policy)
        +stop(policy)
    }

    class cpufreq_policy {
        +unsigned int cpu
        +unsigned int min
        +unsigned int max
        +unsigned int cpuinfo_min_freq
        +unsigned int cpuinfo_max_freq
        +unsigned int transition_delay_us
        +unsigned int shared_type
        +bool boost_enabled
        +unsigned int policy
        +char last_governor[CPUFREQ_NAME_LEN]
        +unsigned int last_policy
        +struct freq_qos_request *max_freq_req
        +struct cpufreq_governor *governor
        +void *driver_data
    }

    class cppc_cpudata {
        +struct cppc_perf_caps perf_caps
        +struct cppc_perf_ctrls perf_ctrls
        +struct cppc_perf_fb_ctrs perf_fb_ctrs
        +struct cpumask *shared_cpu_map
        +int shared_type
        +unsigned int max_freq
        +unsigned int min_freq
    }

    class cppc_acpi_helpers {
        +int cppc_get_perf(int cpunum, enum cppc_regs reg_idx, u64 *perf)
        +int cppc_get_reg(int cpunum, enum cppc_regs reg_idx, u64 *val)
        +int cppc_set_reg(int cpu, enum cppc_regs reg_idx, u64 val)
        +int cppc_set_epp(int cpu, u64 epp_val)
        +int cppc_get_auto_act_window(int cpunum, u64 *auto_act_window)
        +int cppc_set_auto_act_window(int cpu, u64 auto_act_window)
        +int cppc_get_auto_sel(int cpunum, u64 *auto_sel)
        +int cppc_set_auto_sel(int cpu, bool enable)
        +int cppc_get_epp_perf(int cpunum, u64 *epp_perf)
        +int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls, bool enable)
    }

    class cppc_cpufreq_driver_helpers {
        +struct cppc_cpudata* cppc_cpufreq_get_cpu_data(unsigned int cpu)
        +void cppc_cpufreq_put_cpu_data(struct cpufreq_policy *policy)
        +void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
        +unsigned int cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
        +void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
        +void populate_efficiency_class(void)
        +unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu)
    }

    class cpufreq_core {
        +int cpufreq_online(unsigned int cpu)
        +void __cpufreq_offline(unsigned int cpu, struct cpufreq_policy *policy)
        +unsigned int cpufreq_verify_current_freq(struct cpufreq_policy *policy, bool update)
        +unsigned int __cpufreq_get(struct cpufreq_policy *policy)
        +unsigned int cpufreq_get(unsigned int cpu)
        +int cpufreq_start_governor(struct cpufreq_policy *policy)
        +int cpufreq_enable_boost_support(void)
    }

    class cpufreq_sysfs_attributes {
        +struct attribute scaling_cur_freq
        +struct attribute auto_act_window
        +struct attribute energy_perf
        +struct attribute freqdomain_cpus
    }

    %% Relationships
    cpufreq_gov_seep --|> cpufreq_governor : instance_of
    cpufreq_policy o--> cppc_cpudata : driver_data
    cpufreq_core --> cpufreq_policy : manages
    cpufreq_core --> cpufreq_governor : selects_and_starts
    cpufreq_gov_seep --> cppc_acpi_helpers : uses
    cppc_cpufreq_driver_helpers --> cppc_cpudata : allocates_and_frees
    cppc_cpufreq_driver_helpers --> cpufreq_policy : initializes
    cppc_cpufreq_driver_helpers --> cppc_acpi_helpers : uses
    cpufreq_core --> cpufreq_sysfs_attributes : exposes
    cpufreq_sysfs_attributes --> cppc_acpi_helpers : auto_act_window_energy_perf
    cppc_acpi_helpers --> FirmwarePCC : via_PCC_commands

    class FirmwarePCC {
        +int send_pcc_cmd(int pcc_ss_id, int cmd)
        +struct cppc_pcc_data *pcc_data[]
    }

File-Level Changes

Change: Make CPPC perf counter and register handling more robust and simplify cppc_cpudata/EM registration
Details:
  • Treat missing or unsupported CPPC perf counters and registers as non-fatal, returning -EOPNOTSUPP/-ENODEV and disabling frequency invariance when needed instead of spamming warnings
  • Change cppc_cpufreq_freq_invariance worker and init paths to handle cppc_get_perf_ctrs errors gracefully and clear the SCALE_FREQ_SOURCE_CPPC source on failure
  • Refactor populate_efficiency_class into a void helper that only enables EM registration when multiple efficiency classes exist, and move cppc_cpufreq_register_em into the register_em callback
  • Drop the global cppc_cpudata list and its teardown helper, managing cppc_cpudata lifetime purely via cpufreq_policy driver_data
Files:
  • drivers/cpufreq/cppc_cpufreq.c
  • drivers/acpi/cppc_acpi.c
  • include/acpi/cppc_acpi.h

Change: Expose new CPPC sysfs controls for autonomous activity window and energy-performance preference via cppc_cpufreq
Details:
  • Add CONFIG_CPPC_CPUFREQ_SYSFS_INTERFACE-gated show/store handlers and cpufreq_freq_attr_rw attributes for auto_act_window and energy_perf, including validation and formatting of the CPPC-encoded values
  • Wire the new attributes into cppc_cpufreq_attr so they appear under the cpufreq policy kobject when enabled
Files:
  • drivers/cpufreq/cppc_cpufreq.c

Change: Introduce generic CPPC get/set helpers and new exported APIs for EPP, autonomous activity window, and autonomous selection
Details:
  • Add cppc_get_reg and cppc_set_reg helpers that handle CPC_SUPPORTED checks, PCC vs MMIO/FFH access, locking, and error propagation
  • Implement and export cppc_set_epp, cppc_get_auto_act_window, cppc_set_auto_act_window, and cppc_get_auto_sel based on the new helpers, and provide no-ACPI stubs in the !CONFIG_ACPI_CPPC_LIB path
  • Relax cppc_set_auto_sel to handle non-PCC registers, returning -EOPNOTSUPP when the CPC entry is unsupported instead of -ENOTSUPP
Files:
  • drivers/acpi/cppc_acpi.c
  • include/acpi/cppc_acpi.h

Change: Adjust cpufreq core behavior around scaling_cur_freq, policy get paths, QoS, and boost
Details:
  • Always create the scaling_cur_freq sysfs attribute from the core cpufreq_attrs list instead of per-policy creation
  • Change cpufreq_get and governor start paths to rely on __cpufreq_get (which internally handles a missing get op) and add a guard in cpufreq_verify_current_freq to early-exit if the driver has no ->get callback
  • When bringing an existing policy back online, update the max freq QoS request to match policy->max
  • Synchronize policy->boost_enabled with the global boost state at online time via cpufreq_driver->set_boost when available, and remove per-driver boost mirroring from acpi-cpufreq
  • Protect cpufreq_enable_boost_support set_boost assignment with cpufreq_driver_lock to make boost registration lock-safe
Files:
  • drivers/cpufreq/cpufreq.c
  • drivers/cpufreq/acpi-cpufreq.c

Change: Improve cpufreq policy init for cppc_cpufreq and avoid negative error codes from hisi_cppc_cpufreq_get_rate
Details:
  • Make cppc_cpufreq policy->max and cpuinfo.max_freq reflect highest_perf when boost is enabled, otherwise nominal_perf, instead of always using nominal_perf
  • Change hisi_cppc_cpufreq_get_rate to return 0 on failure (no policy or cppc_get_desired_perf error) instead of negative errno values
Files:
  • drivers/cpufreq/cppc_cpufreq.c

Change: Add the new SEEP cpufreq governor and integrate it with Kconfig and the build
Details:
  • Introduce cpufreq_seep governor implementation that checks for cppc_cpufreq and required CPPC features (auto_sel, auto_act_window, epp) at init, and toggles cppc_set_auto_sel on start/stop
  • Wire the governor into the build system via CONFIG_CPU_FREQ_GOV_SEEP and make it selectable as a default governor via CONFIG_CPU_FREQ_DEFAULT_GOV_SEEP
  • Expose cpufreq_default_governor() override when SEEP is the default
Files:
  • drivers/cpufreq/cpufreq_seep.c
  • drivers/cpufreq/Makefile
  • drivers/cpufreq/Kconfig
  • drivers/cpufreq/Kconfig.arm
  • Documentation/ABI/testing/sysfs-devices-system-cpu
  • Documentation/admin-guide/pm/cpufreq.rst

Change: Tidy cpufreq offline/governor teardown behavior
Details:
  • On policy offline, only save last_governor and exit the governor for target drivers; for non-target drivers store last_policy instead, avoiding unnecessary governor teardown paths
Files:
  • drivers/cpufreq/cpufreq.c

@deepin-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from avenger-285714. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Avenger-285714 (Member, Author) commented

Tested and passed on an ARM64 target using LLVM-20. I don’t have Deepin’s development environment (build toolchain), so I can’t update the defconfig for all architectures myself. Could the Deepin team please handle this? Many thanks! @opsiff

Jay Fang and others added 2 commits December 12, 2025 20:59
…r-free

commit 951c5e26f86a2a342cbd533f4ae4a1545d8ff57e openEuler

From commit 456d8aa ("PCI/ASPM: Disable ASPM on MFD function removal
to avoid use-after-free") we know that PCIe spec r6.0, sec 7.5.3.7,
recommends that software program the same ASPM Control (pcie_link_state)
value in all functions of multi-function devices, and free the
pcie_link_state when any child function is removed.

However, the ASPM Control sysfs is still visible to the other children
even after it has been removed by any child function, and using it
carelessly will trigger a use-after-free error, e.g.:

  # lspci -tv
    -[0000:16]---00.0-[17]--+-00.0  Device 19e5:0222
                            \-00.1  Device 19e5:0222
  # echo 1 > /sys/bus/pci/devices/0000:17:00.0/remove
  //pcie_link_state will be released
  # echo 1 > /sys/bus/pci/devices/0000:17:00.1/link/l1_aspm
  //will trigger error

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
Call trace:
aspm_attr_store_common.constprop.0+0x10c/0x154
l1_aspm_store+0x24/0x30
dev_attr_store+0x20/0x34
sysfs_kf_write+0x4c/0x5c

We can solve this problem by updating the ASPM Control sysfs of all
children immediately after the ASPM Control has been freed.

Extra note: The Linux community plans to optimize the pcie_link_state
management code. For details, see the following link:
https://patchwork.kernel.org/project/linux-pci/patch/
[email protected]/

Fixes: 456d8aa ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free")
Signed-off-by: Jay Fang <[email protected]>
Signed-off-by: lujunhua <[email protected]>
Signed-off-by: Slim6882 <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
…ation

commit 049045f499e1cc07a132c3ef010f15abce995e03 openEuler

The filter sysfs attribute is initialized when it is registered to
sysfs. This is unnecessary; it can be done at allocation time without
distinguishing the filter type.

After this change, we don't need wrappers for initializing and
registering the filter's sysfs attributes. Remove them and call the
sysfs creation/removal functions in place.

Fixes: 6373c46 ("hwtracing: hisi_ptt: Export available filters through sysfs")
Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: lujunhua <[email protected]>
Signed-off-by: Slim6882 <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
Yicong Yang and others added 6 commits December 15, 2025 09:19
commit 20c2368e0195aae332176096336105c470d48f15 openEuler

1509d06c9c41 ("init: only move down lockup_detector_init() when sdei_watchdog is enabled")

In the above commit, sdei_watchdog needs lockup_detector_init() to be
moved down, while nmi_watchdog does not. So when sdei_watchdog fails to
initialize, nmi_watchdog should not be initialized.

[    0.706631][    T1] SDEI NMI watchdog: Disable SDEI NMI Watchdog in VM
[    0.707405][    T1] ------------[ cut here ]------------
[    0.708020][    T1] WARNING: CPU: 0 PID: 1 at kernel/watchdog_perf.c:117 hardlockup_detector_event_create+0x24/0x108
[    0.709230][    T1] Modules linked in:
[    0.709665][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.0 deepin-community#1
[    0.710700][    T1] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[    0.711625][    T1] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.712547][    T1] pc : hardlockup_detector_event_create+0x24/0x108
[    0.713316][    T1] lr : watchdog_hardlockup_probe+0x28/0xa8
[    0.714010][    T1] sp : ffff8000831cbdc0
[    0.714501][    T1] pmr_save: 000000e0
[    0.714957][    T1] x29: ffff8000831cbdc0 x28: 0000000000000000 x27: 0000000000000000
[    0.715899][    T1] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[    0.716839][    T1] x23: 0000000000000000 x22: 0000000000000000 x21: ffff80008218fab0
[    0.717775][    T1] x20: ffff8000821af000 x19: ffff0000c0261900 x18: 0000000000000020
[    0.718713][    T1] x17: 00000000cb551c45 x16: ffff800082625e48 x15: ffffffffffffffff
[    0.719663][    T1] x14: 0000000000000000 x13: 205d315420202020 x12: 5b5d313336363037
[    0.720607][    T1] x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffff800081b5f630
[    0.721590][    T1] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 000000000005fff4
[    0.722528][    T1] x5 : 00000000002bffa8 x4 : 0000000000000000 x3 : 0000000000000000
[    0.723482][    T1] x2 : 0000000000000000 x1 : 0000000000000140 x0 : ffff0000c02c0000
[    0.724426][    T1] Call trace:
[    0.724808][    T1]  hardlockup_detector_event_create+0x24/0x108
[    0.725535][    T1]  watchdog_hardlockup_probe+0x28/0xa8
[    0.726174][    T1]  lockup_detector_init+0x110/0x158
[    0.726776][    T1]  kernel_init_freeable+0x208/0x288
[    0.727387][    T1]  kernel_init+0x2c/0x200
[    0.727902][    T1]  ret_from_fork+0x10/0x20
[    0.728420][    T1] ---[ end trace 0000000000000000 ]---

Fixes: 7cc5cc30a7d1 ("watchdog: Support watchdog_sdei coexist with existing watchdogs")
Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: Jie Liu <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 60bc47b]
Architectures that use perf events for hard lockup detection need to
convert watchdog_thresh to the event's period. Some architectures,
for example arm64, perform this conversion using the CPU's maximum
frequency, which is obtained from cpufreq. However, by the time
the lockup detector is initialized, the cpufreq driver may not be
initialized yet, so the watchdog is launched with an inaccurate
period. Provide a function hardlockup_detector_perf_adjust_period()
to allow adjusting the event period. The architecture can then update
it with a more accurate period once cpufreq is initialized.
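
The conversion described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: `sample_period()` and `SAFE_MAX_CPU_FREQ_HZ` are hypothetical names, and the 5 GHz fallback is an assumed conservative value standing in for "cpufreq has not reported a maximum frequency yet".

```c
#include <stdint.h>

/* Assumed fallback frequency used before cpufreq is available (hypothetical). */
#define SAFE_MAX_CPU_FREQ_HZ 5000000000ULL

/*
 * Convert a watchdog threshold in seconds to a period in CPU cycles.
 * max_freq_khz == 0 models "cpufreq driver not initialized yet".
 */
uint64_t sample_period(uint64_t max_freq_khz, unsigned int watchdog_thresh)
{
    uint64_t max_freq_hz = max_freq_khz * 1000ULL;

    if (max_freq_hz == 0)
        max_freq_hz = SAFE_MAX_CPU_FREQ_HZ;

    return max_freq_hz * watchdog_thresh;
}
```

With a 10 second threshold, the fallback gives a period of 5e10 cycles, while a CPU whose real maximum is 2.6 GHz needs 2.6e10. Until the period is adjusted, the detector on such a CPU would fire almost twice as late as intended.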

Fixes: 94946f9 ("arm64: add hw_nmi_get_sample_period for preparation of lockup detector")
Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 7a88444]
arm64 depends on the cpufreq driver to obtain the maximum CPU frequency
used to convert watchdog_thresh to the perf event period. cpufreq drivers
such as cppc_cpufreq are initialized after the hard lockup detector, so
the detector initially falls back to a safe frequency, which is
inaccurate. Use a cpufreq notifier to adjust the event's period to
a more accurate one.

Fixes: 94946f9 ("arm64: add hw_nmi_get_sample_period for preparation of lockup detector")
Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
commit c8f96adca7fadbacc9b1af75528964da9d7bb166 openEuler

During testing, it was found that in low-power scenarios, the CPU
core frequently powers on and off, which causes the SDEI watchdog
to fail. This is because the secure timer in the BIOS does not save
and restore the state during the CPU core power-on and power-off
process. The OS needs to add enable/disable operations in the
suspend/resume process.

Fixes: e29d570c5413 ("watchdog: add nmi_watchdog support for arm64 based on SDEI")
Signed-off-by: Bowen You <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
commit f022c4cac9c1013fa8820b0af01aaa98e38053f1 openEuler

The SDEI watchdog needs to be initialized after sdei_init,
so commit 81e349d81958 ("init: only move down lockup_detector_init()
 when sdei_watchdog is enabled")
moved lockup_detector_init() down.

Commit 930d8f8 ("watchdog/perf: adapt the watchdog_perf
 interface for async model")
now provides an API, lockup_detector_retry_init(), for anyone
who needs to delay initialization of the lockup detector, so use
this API to delay initialization of the SDEI watchdog.

Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: zhangguangzhi <[email protected]>
Signed-off-by: WangYuli <[email protected]>
commit f0f9be237c2266d30fa01c294577438a2e2ee749 openEuler

The current interrupt policy of the Hip12 preferentially
schedules LPI and SGI interrupts with the same priority.
As a result, the arch_timer may not be executed when frequent
LPI/SGI interrupts are generated, and the watchdog is starved.

Signed-off-by: Qinxin Xia <[email protected]>
Signed-off-by: Hongye Lin <[email protected]>
Signed-off-by: WangYuli <[email protected]>
@Avenger-285714
Member Author

Tested and passed on ARM64 target using LLVM-20. I don’t have Deepin’s development environment (build toolchain), so I can’t update the defconfig for all architectures myself — could the Deepin team please handle this? Many thanks! @opsiff

Exception: although it compiles, there is no need to enable SDEI_WATCHDOG in production.

@Avenger-285714 Avenger-285714 changed the title [Deepin-Kernel-SIG] [linux 6.6-y] [HISI] Backport cpufreq/pcie/ext-gpu/soc_cache/power_meter/perf_iostat module for Kunpeng new SOC [Deepin-Kernel-SIG] [linux 6.6-y] [HISI] Backport cpufreq/pcie/ext-gpu/soc_cache/power_meter/perf_iostat/sdei_watchdog module for Kunpeng new SOC Dec 15, 2025
Yicong Yang added 4 commits December 15, 2025 09:37
commit 6dd0f06a404f481ea58fb04b3006287dc3c5aea3 openEuler

The core CPU control framework supports runtime SMT control which
is not yet supported by arch_topology driver and thus arch_topology
based architectures. This patch implements it in the following aspects:

- implement topology_is_primary_thread() to indicate the primary thread,
  required by the framework
- architecture code can get/set the SMT thread number by
  topology_smt_{get, set}_num_threads()
- update the SMT thread number for the framework after the topology
  enumerated on arm64, which is also required by the framework

For disabling SMT we'll offline all the secondary threads and
only leave the primary thread. Since we don't have restriction
for primary thread selection, the first thread is chosen as the
primary thread in this implementation.

This patch only implements the basic support for SMT control, which
needs to collaborate with ACPI/OF based topology building to fully
enable the feature. SMT control will not be enabled unless the
correct SMT thread number is set and the HOTPLUG_SMT kconfig option
is selected.
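
The primary-thread rule described above (the first thread of each core is primary, and disabling SMT leaves only primary threads online) can be sketched as follows. This is a userspace illustration, not the arch_topology code: sibling masks are modelled as 64-bit bitmaps, and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return the index of the lowest set bit; the mask must be non-zero. */
unsigned int first_cpu(uint64_t mask)
{
    unsigned int cpu = 0;

    while (!(mask & 1ULL)) {
        mask >>= 1;
        cpu++;
    }
    return cpu;
}

/* A CPU is the primary thread if it is the first CPU in its sibling mask. */
bool is_primary_thread(unsigned int cpu, uint64_t sibling_mask)
{
    return cpu == first_cpu(sibling_mask);
}

/* With SMT disabled, only the primary thread of each core stays online. */
uint64_t online_mask_smt_off(const uint64_t *sibling_masks, unsigned int ncpus)
{
    uint64_t online = 0;
    unsigned int cpu;

    /* ncpus is assumed to be at most 64 in this sketch. */
    for (cpu = 0; cpu < ncpus; cpu++)
        if (is_primary_thread(cpu, sibling_masks[cpu]))
            online |= 1ULL << cpu;
    return online;
}
```

For two cores with two threads each (CPUs 0-1 and 2-3), the resulting online mask contains CPUs 0 and 2.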

Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: Jie Liu <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 5deb9c7]
On building the topology from the devicetree, we've already
gotten the SMT thread number of each core. Update the largest
SMT thread number to enable the SMT control.

Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: Jie Liu <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit e6b18eb]
For ACPI we'll build the topology from PPTT and we cannot directly
get the SMT number of each core. Instead, use a temporary xarray
to record the SMT number of each core when building the topology,
so that we know the largest SMT number in the system. Then we can
notify arch_topology to support SMT control.
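
The counting step can be sketched as below. This is a userspace illustration that swaps the xarray for a fixed-size array indexed by a core identifier; `max_smt_threads()` and `MAX_CORES` are hypothetical names that do not appear in the patch.

```c
#include <stddef.h>

#define MAX_CORES 64 /* illustrative bound, not taken from the patch */

/*
 * core_of[i] is the core identifier of CPU i and must be < MAX_CORES.
 * Returns the largest number of threads found on any single core.
 */
unsigned int max_smt_threads(const unsigned int *core_of, size_t ncpus)
{
    unsigned int count[MAX_CORES] = { 0 };
    unsigned int max = 0;
    size_t i;

    for (i = 0; i < ncpus; i++) {
        unsigned int n = ++count[core_of[i]];

        if (n > max)
            max = n;
    }
    return max;
}
```

The real code uses an xarray because the core identifiers derived from PPTT are sparse, which a dense array handles poorly.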

Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: Jie Liu <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit eed4583]
Enable HOTPLUG_SMT for SMT control.

Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: Jie Liu <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
@Avenger-285714 Avenger-285714 changed the title [Deepin-Kernel-SIG] [linux 6.6-y] [HISI] Backport cpufreq/pcie/ext-gpu/soc_cache/power_meter/perf_iostat/sdei_watchdog module for Kunpeng new SOC [Deepin-Kernel-SIG] [linux 6.6-y] [HISI] Backport cpufreq/pcie/ext-gpu/soc_cache/power_meter/perf_iostat/sdei_watchdog/HOTPLUG_SMT module for Kunpeng new SOC Dec 15, 2025
Anshuman Khandual and others added 15 commits December 15, 2025 10:16
[Upstream commit 58f8fc5]
Currently the PMUv3 driver only reads PMMIR_EL1 if the PMU implements
FEAT_PMUv3p4 and the STALL_SLOT event, but the check for STALL_SLOT event
isn't necessary and can be removed.

The check for STALL_SLOT event was introduced with the read of PMMIR_EL1 in
commit f5be3a6 ("arm64: perf: Add support caps under sysfs")

When this logic was written, the ARM ARM said:

| If STALL_SLOT is not implemented, it is IMPLEMENTATION DEFINED whether
| the PMMIR System registers are implemented.

... and thus the driver had to check for STALL_SLOT event to verify that
PMMIR_EL1 was implemented and accesses to PMMIR_EL1 would not be UNDEFINED.

Subsequently, the architecture was retrospectively tightened to require
that any FEAT_PMUv3p4 implementation implements PMMIR_EL1. Since the G.b
release of the ARM ARM, the wording regarding STALL_SLOT event has been
removed, and the description of PMMIR_EL1 says:

| This register is present only when FEAT_PMUv3p4 is implemented.

Drop the unnecessary check for STALL_SLOT event when reading PMMIR_EL1.

Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: [email protected]
Cc: [email protected]
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
…nit()

[Upstream commit 3b9a22d]
All the PMU init functions want the default sysfs attribute groups, and so
these all call armv8_pmu_init_nogroups() helper, with none of them calling
armv8_pmu_init() directly. When we introduced armv8_pmu_init_nogroups() in
the commit e424b17 ("arm64: perf: Refactor PMU init callbacks")

 ... we thought that we might need custom attribute groups in future, but
as we evidently haven't, we can remove the option.

This patch folds armv8_pmu_init_nogroups() into armv8_pmu_init(), removing
the ability to use custom attribute groups and simplifying the code.

CC: James Clark <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: [email protected]
Cc: [email protected]
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit bc512d6]
The NSH bit, which filters event counting at EL2, is required by the
architecture if an implementation has EL2. Even though KVM doesn't
support nested virt yet, it makes no effort to hide the existence of EL2
from the ID registers. Userspace can, however, change the value of PFR0
to hide EL2. Align KVM's sysreg emulation with the architecture and make
NSH RES0 if EL2 isn't advertised. Keep in mind the bit is ignored when
constructing the backing perf event.

While at it, build the event type mask using explicit field definitions
instead of relying on ARMV8_PMU_EVTYPE_MASK. KVM probably should've been
doing this in the first place, as it avoids changes to the
aforementioned mask affecting sysreg emulation.
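
A rough sketch of building the writable mask from explicit fields, so that NSH is treated as RES0 when EL2 is not advertised. The bit positions below are this sketch's reading of the PMEVTYPER layout and should be checked against the Arm ARM; the helper names are hypothetical and are not the KVM functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PMEVTYPER<n>_EL0 field positions (verify against the Arm ARM). */
#define EVTYPE_EVENT 0xffffULL      /* event number, bits [15:0] */
#define EVTYPE_P     (1ULL << 31)   /* exclude EL1 */
#define EVTYPE_U     (1ULL << 30)   /* exclude EL0 */
#define EVTYPE_NSH   (1ULL << 27)   /* include EL2 */

/* Build the mask of bits the guest is allowed to set. */
uint64_t evtype_mask(bool guest_has_el2)
{
    uint64_t mask = EVTYPE_EVENT | EVTYPE_P | EVTYPE_U;

    if (guest_has_el2)
        mask |= EVTYPE_NSH;
    return mask;
}

/* Bits outside the mask behave as RES0: a guest write cannot set them. */
uint64_t sanitize_evtype(uint64_t val, bool guest_has_el2)
{
    return val & evtype_mask(guest_has_el2);
}
```

Because the mask is assembled from named fields, a later change to a shared driver-wide mask would not silently change what the guest can write.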

Reviewed-by: Suzuki K Poulose <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit ae8d352]
Suzuki noticed that KVM's PMU emulation is oblivious to the NSU and NSK
event filter bits. On systems that have EL3 these bits modify the
filter behavior in non-secure EL0 and EL1, respectively. Even though the
kernel doesn't use these bits, it is entirely possible some other guest
OS does. Additionally, it would appear that these and the M bit are
required by the architecture if EL3 is implemented.

Allow the EL3 event filter bits to be set if EL3 is advertised in the
guest's ID register. Implement the behavior of NSU and NSK according to
the pseudocode, and entirely ignore the M bit for perf event creation.
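
One way to read the NSU and NSK behavior: in the non-secure state, counting at EL0 is enabled when U equals NSU, and counting at EL1 is enabled when P equals NSK. The sketch below encodes that reading. The bit positions are assumptions to be checked against the Arm ARM, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PMEVTYPER<n>_EL0 filter bits (verify against the Arm ARM). */
#define EVTYPE_P   (1ULL << 31)
#define EVTYPE_U   (1ULL << 30)
#define EVTYPE_NSK (1ULL << 29)
#define EVTYPE_NSU (1ULL << 28)

/* Non-secure EL0 is filtered out when U and NSU differ. */
bool exclude_user(uint64_t evtype)
{
    bool u = evtype & EVTYPE_U;
    bool nsu = evtype & EVTYPE_NSU;

    return u != nsu;
}

/* Non-secure EL1 is filtered out when P and NSK differ. */
bool exclude_kernel(uint64_t evtype)
{
    bool p = evtype & EVTYPE_P;
    bool nsk = evtype & EVTYPE_NSK;

    return p != nsk;
}
```

The M bit is deliberately absent here, matching the commit's statement that it is ignored when creating the perf event.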

Reported-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 877806b]
This further compacts all remaining PMU init procedures requiring specific
map_event functions via a new macro PMUV3_INIT_MAP_EVENT(). While here, it
also changes generated init function names to match to those generated via
the other macro PMUV3_INIT_SIMPLE(). This does not cause functional change.

Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: [email protected]
Cc: [email protected]
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 9343c79]
These are all static and in one compilation unit so the inline has no
effect on the binary. Except if FTRACE is enabled, then 3 functions
which were already not inlined now get the nops added which allows them
to be traced.

Signed-off-by: James Clark <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 2f6a00f]
This is so that FIELD_GET and FIELD_PREP can be used and that the fields
are in a consistent format to arm64/tools/sysreg

Signed-off-by: James Clark <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit d30f09b]
Convert the remaining fields to use either GENMASK or be built from
other fields. These all already started at bit 0 so don't need a code
change for the lack of _SHIFT.

Signed-off-by: James Clark <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 3115ee0]
FEAT_PMUv3_TH (Armv8.8) adds two new fields to PMEVTYPER, so include
them in the mask. These aren't writable on 32 bit kernels as they are in
the high part of the register, so only include them for arm64.

It would be difficult to do this statically in the asm header files for
each platform without resulting in circular includes or #ifdefs inline
in the code. For that reason the ARMV8_PMU_EVTYPE_MASK definition has
been removed and the mask is constructed programmatically.

Reviewed-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Signed-off-by: James Clark <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit f6da869]
This mechanism makes it much easier to define and read new attributes
so move it to the arm_pmu.h header so that it can be shared. At the same
time update the existing format attributes to use it.

GENMASK has to be changed to GENMASK_ULL because the config fields are
64 bits even on arm32 where this will also be used now.
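
To illustrate why the 64-bit variant matters, here is a userspace sketch of a 64-bit mask helper with field get/prep helpers. These are simplified stand-ins written for this example, not the kernel's GENMASK_ULL, FIELD_GET or FIELD_PREP macros.

```c
#include <stdint.h>

/* Contiguous mask covering bits h..l inclusive, for 0 <= l <= h <= 63. */
uint64_t genmask_ull(unsigned int h, unsigned int l)
{
    return (~0ULL << l) & (~0ULL >> (63 - h));
}

/* Lowest set bit of a non-zero mask, used as the field's scale factor. */
static uint64_t mask_lsb(uint64_t mask)
{
    return mask & (~mask + 1);
}

/* Extract the field selected by mask from reg. */
uint64_t field_get(uint64_t mask, uint64_t reg)
{
    return (reg & mask) / mask_lsb(mask);
}

/* Place val into the field selected by mask. */
uint64_t field_prep(uint64_t mask, uint64_t val)
{
    return (val * mask_lsb(mask)) & mask;
}
```

A mask such as bits [43:32] cannot be represented in a 32-bit `unsigned long`, so on arm32 only a 64-bit type can describe a field in the upper half of a 64-bit config value.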

Signed-off-by: James Clark <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 186c91a]
-EPERM or -EINVAL always get converted to -EOPNOTSUPP, so replace them.
This will allow __hw_perf_event_init() to return a different code or not
print that particular message for a different error in the next commit.

Signed-off-by: James Clark <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 816c267]
FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on
events whose count meets a specified threshold condition. For example if
PMEVTYPERn.TC (Threshold Control) is set to 0b101 (Greater than or
equal, count), and the threshold is set to 2, then the PMU counter will
now only increment by 1 when an event would have previously incremented
the PMU counter by 2 or more on a single processor cycle.

Three new Perf event config fields, 'threshold', 'threshold_compare' and
'threshold_count' have been added to control the feature.
threshold_compare maps to the upper two bits of PMEVTYPERn.TC and
threshold_count maps to the first bit of TC. These separate attributes
have been picked rather than enumerating all the possible combinations
of the TC field as in the Arm ARM. The attributes would be used on a
Perf command line like this:

  $ perf stat -e stall_slot/threshold=2,threshold_compare=2/

A new capability for reading out the maximum supported threshold value
has also been added:

  $ cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max

  0x000000ff

If a threshold higher than threshold_max is provided, then an error is
generated. If FEAT_PMUv3_TH isn't implemented or a 32 bit kernel is
running, then threshold_max reads zero, and attempting to set a
threshold value will also result in an error.

The threshold is per PMU counter, and there are potentially different
threshold_max values per PMU type on heterogeneous systems.

Bits higher than 32 now need to be written into PMEVTYPER, so
armv8pmu_write_evtype() has to be updated to take an unsigned long value
rather than u32 which gives the correct behavior on both aarch32 and 64.
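
The mapping from the perf config fields to the TC field can be sketched as below: threshold_compare supplies the upper two bits of TC and threshold_count the lowest bit. The placement of TH at bits [43:32] and TC at bits [63:61] is an assumption of this sketch to be checked against the Arm ARM, and the helper names are hypothetical.

```c
#include <stdint.h>

#define EVTYPE_TH_SHIFT 32 /* assumed: TH occupies bits [43:32] */
#define EVTYPE_TC_SHIFT 61 /* assumed: TC occupies bits [63:61] */

/* Combine the two perf attributes into the 3-bit TC value. */
unsigned int threshold_control(unsigned int compare, unsigned int count)
{
    return ((compare & 0x3) << 1) | (count & 0x1);
}

/* Build a 64-bit event type value carrying the threshold fields. */
uint64_t evtype_with_threshold(uint64_t event, unsigned int threshold,
                               unsigned int compare, unsigned int count)
{
    uint64_t tc = threshold_control(compare, count);

    return event |
           ((uint64_t)(threshold & 0xfff) << EVTYPE_TH_SHIFT) |
           (tc << EVTYPE_TC_SHIFT);
}
```

With threshold_compare=2 and threshold_count=1 the result is TC = 0b101, the "greater than or equal, count" mode from the example above. The value no longer fits in a u32, which is why the write helper has to take a wider type.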

Signed-off-by: James Clark <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit bd69063]
Add documentation for the new Perf event open parameters and
the threshold_max capability file.

Reviewed-by: Anshuman Khandual <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Signed-off-by: James Clark <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit bb339db]
LLVM ignores everything inside the if statement and doesn't generate
errors, but GCC doesn't ignore it, resulting in the following error:

  drivers/perf/arm_pmuv3.c: In function ‘armv8pmu_write_evtype’:
  include/linux/bits.h:34:29: error: left shift count >= width of type [-Werror=shift-count-overflow]
  34 |         (((~UL(0)) - (UL(1) << (l)) + 1) & \

Fix it by using GENMASK_ULL which doesn't overflow on arm32 (even though
the value is never used there).

Fixes: 3115ee0 ("arm64: perf: Include threshold control fields in PMEVTYPER mask")
Reported-by: Uwe Kleine-König <[email protected]>
Closes: https://lore.kernel.org/linux-arm-kernel/[email protected]/
Signed-off-by: James Clark <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Reviewed-by: Uwe Kleine-König <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 81e15ca]
If the user has requested a counting threshold for the CPU cycles event,
then the fixed cycle counter can't be assigned as it lacks threshold
support. Currently, the thresholds will work or not randomly depending
on which counter the event is assigned.

While using thresholds for CPU cycles doesn't make much sense, it can be
useful for testing purposes.
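
The allocation rule implied by this fix can be sketched as a single predicate: the fixed cycle counter is only eligible for the CPU cycles event when no threshold was requested. This is an illustrative sketch with a hypothetical function name, not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define EVENT_CPU_CYCLES 0x11 /* PMUv3 CPU_CYCLES event number */

/*
 * The fixed cycle counter has no threshold support, so an event that
 * requests a threshold must be placed on a general-purpose counter.
 */
bool can_use_cycle_counter(uint16_t event, unsigned int threshold)
{
    return event == EVENT_CPU_CYCLES && threshold == 0;
}
```

Making the decision depend on the threshold removes the randomness described above, where the result depended on which counter happened to be free.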

Fixes: 816c267 ("arm64: perf: Add support for event counting threshold")
Signed-off-by: Rob Herring (Arm) <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
@Avenger-285714 Avenger-285714 changed the title [Deepin-Kernel-SIG] [linux 6.6-y] [HISI] Backport cpufreq/pcie/ext-gpu/soc_cache/power_meter/perf_iostat/sdei_watchdog/HOTPLUG_SMT module for Kunpeng new SOC [Deepin-Kernel-SIG] [linux 6.6-y] [HISI] Backport cpufreq/pcie/ext-gpu/soc_cache/power_meter/perf_iostat/sdei_watchdog/HOTPLUG_SMT/FEAT_PMUv3_TH module for Kunpeng new SOC Dec 15, 2025
@deepin-ci-robot

deepin pr auto review

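I will review this Git diff, focusing on syntax and logic, code quality, performance, and security.

  1. Syntax and logic:
  • The code is syntactically correct overall and conforms to the Linux kernel coding style
  • Function and variable names follow kernel conventions
  • Error handling paths are complete
  2. Code quality:
  • Appropriate error handling and boundary checks have been added
  • Standard kernel data structures and functions are used
  • The code is well commented, especially where complex logic is explained
  3. Performance:
  • Efficient data structures and algorithms are used on critical paths
  • Unnecessary lock contention is avoided
  • Memory access patterns are optimized
  4. Security:
  • Appropriate input validation has been added
  • Safe memory operation functions are used
  • Appropriate permission checks are implemented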

Specific suggestions for improvement:

  1. In drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c:
// Suggestion: add more error checking in hibmc_msi_init
if (ret < HIBMC_MIN_VECTORS) {
    drm_err(dev, "Insufficient IRQ vectors: got %d, need at least %d\n",
            ret, HIBMC_MIN_VECTORS);
    return ret;
}
  2. In drivers/perf/arm_pmuv3.c:
// Suggestion: add a stricter threshold check
if (th > threshold_max(cpu_pmu)) {
    pr_debug("PMU event threshold %u exceeds max value %u\n",
             th, threshold_max(cpu_pmu));
    return -EINVAL;
}
  3. In drivers/soc/hisilicon/hisi_soc_cache_framework.c:
// Suggestion: add a stricter permission check
if (!capable(CAP_SYS_ADMIN)) {
    pr_debug("Permission denied for cache maintenance operation\n");
    return -EPERM;
}
  4. In kernel/watchdog.c:
// Suggestion: add a watchdog timeout check
if (watchdog_thresh < 1 || watchdog_thresh > 3600) {
    pr_warn("Invalid watchdog threshold %d, using default %d\n",
            watchdog_thresh, DEFAULT_WATCHDOG_THRESH);
    watchdog_thresh = DEFAULT_WATCHDOG_THRESH;
}
  5. In drivers/hwmon/acpi_power_meter.c:
// Suggestion: add a stricter range check
if (val > resource->caps.max_cap || val < resource->caps.min_cap) {
    pr_debug("Power cap %lu out of range [%llu,%llu]\n",
             val, resource->caps.min_cap, resource->caps.max_cap);
    return -EINVAL;
}
  6. In drivers/irqchip/irq-gic-v3.c:
// Suggestion: add an interrupt priority range check
if (prio > GICD_INT_DEF_PRI) {
    pr_warn("Interrupt priority %u exceeds maximum %u\n",
            prio, GICD_INT_DEF_PRI);
    return -EINVAL;
}

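Overall, this diff implements several important features:

  1. Adds support for the HiSilicon PCIe PMU
  2. Improves the performance monitoring capabilities of the ARM PMU
  3. Enhances the hardware watchdog functionality
  4. Adds support for HiSilicon cache maintenance
  5. Improves the DRM display driver
  6. Optimizes power management

The overall code quality is good, but further strengthening the security and robustness checks in the areas listed above is recommended.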
