[Deepin-Kernel-SIG] [linux 6.6-y] [HISI] Backport cpufreq/pcie/ext-gpu/soc_cache/power_meter/perf_iostat/sdei_watchdog/HOTPLUG_SMT/FEAT_PMUv3_TH module for Kunpeng new SOC #1373
base: linux-6.6.y
[Upstream commit 1608f02]
It turns out that CPUX will stay on the base frequency after performing these operations:
1. boost all CPUs: echo 1 > /sys/devices/system/cpu/cpufreq/boost
2. offline one CPU: echo 0 > /sys/devices/system/cpu/cpuX/online
3. deboost all CPUs: echo 0 > /sys/devices/system/cpu/cpufreq/boost
4. online CPUX: echo 1 > /sys/devices/system/cpu/cpuX/online
5. boost all CPUs again: echo 1 > /sys/devices/system/cpu/cpufreq/boost
This is because max_freq_req of the policy is not updated during the online process, and the value of max_freq_req before the last offline is retained. When the CPU is boosted again, freq_qos_update_request() will do nothing because the old value is the same as the new one. This causes the CPU to stay at the base frequency. Updating max_freq_req in cpufreq_online() will solve this problem.
Signed-off-by: Lifeng Zheng <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
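The five-step reproduction above can be replayed in a small userspace model. This is only a sketch under stated assumptions: `toy_policy`, `toy_qos_update()`, `toy_set_boost()`, `toy_online()` and the two frequency constants are hypothetical names invented for illustration, and the model reduces the real QoS machinery to a single stored value. The one behaviour taken from the commit text is that the update is a no-op when the old and new values match.

```c
#include <stdbool.h>

#define BASE_KHZ  2000000u  /* illustrative base frequency (assumption) */
#define BOOST_KHZ 2600000u  /* illustrative boost frequency (assumption) */

/* Toy stand-in for a cpufreq policy; not the kernel's struct layout. */
struct toy_policy {
	bool online;
	unsigned int cpuinfo_max;  /* limit reported by the driver */
	unsigned int max_freq_req; /* value held by the max-frequency QoS request */
	unsigned int max;          /* effective maximum of the policy */
};

/* Mimics freq_qos_update_request(): does nothing if the value is unchanged. */
static int toy_qos_update(struct toy_policy *p, unsigned int new_khz)
{
	if (p->max_freq_req == new_khz)
		return 0;
	p->max_freq_req = new_khz;
	p->max = new_khz < p->cpuinfo_max ? new_khz : p->cpuinfo_max;
	return 1;
}

/* Writing the global boost file only reaches policies that are online. */
static void toy_set_boost(struct toy_policy *p, bool enable)
{
	if (!p->online)
		return;
	p->cpuinfo_max = enable ? BOOST_KHZ : BASE_KHZ;
	toy_qos_update(p, p->cpuinfo_max);
}

/* Bringing the CPU online; apply_fix refreshes the stale request. */
static void toy_online(struct toy_policy *p, bool boost_on, bool apply_fix)
{
	p->online = true;
	p->cpuinfo_max = boost_on ? BOOST_KHZ : BASE_KHZ;
	p->max = p->cpuinfo_max;
	if (apply_fix)
		toy_qos_update(p, p->cpuinfo_max);
}

/* Replays steps 1-5 and returns the final effective maximum. */
unsigned int toy_run_scenario(bool apply_fix)
{
	struct toy_policy p = { true, BASE_KHZ, BASE_KHZ, BASE_KHZ };

	toy_set_boost(&p, true);          /* 1. boost all CPUs */
	p.online = false;                 /* 2. offline CPUX */
	toy_set_boost(&p, false);         /* 3. deboost: CPUX is skipped */
	toy_online(&p, false, apply_fix); /* 4. online CPUX */
	toy_set_boost(&p, true);          /* 5. boost again */
	return p.max;
}
```

In this model, `toy_run_scenario(false)` ends at the base frequency because the request still holds the boost value from step 1, while `toy_run_scenario(true)` ends at the boost frequency.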
… flag [Upstream commit dd016f3] In cpufreq_online() of cpufreq.c, the per-policy boost flag is already set to mirror the cpufreq_driver boost during init but using freq_table to judge if the policy has boost frequency. There are two drawbacks to this approach: 1. It doesn't work for the cpufreq drivers that do not use a frequency table. For now, acpi-cpufreq and amd-pstate have to enable boost in policy initialization. And cppc_cpufreq never set policy to boost when going online no matter what the cpufreq_driver boost flag is. 2. If the CPU goes offline when cpufreq_driver boost is enabled and then goes online when cpufreq_driver boost is disabled, the per-policy boost flag will incorrectly remain true. Running set_boost at the end of the online process is a more generic way for all cpufreq drivers. Signed-off-by: Lifeng Zheng <[email protected]> Link: https://patch.msgid.link/[email protected] Acked-by: Viresh Kumar <[email protected]> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 03d8b4e] In policy initialization, policy->max and policy->cpuinfo.max_freq are always set to the value calculated from caps->nominal_perf. This will cause the frequency to stay at the base frequency even if the policy is already boosted when a CPU is going online. Fix this by using policy->boost_enabled to determine which value should be set. Signed-off-by: Lifeng Zheng <[email protected]> Acked-by: Viresh Kumar <[email protected]> Link: https://patch.msgid.link/[email protected] [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
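A minimal sketch of the selection described above, written as standalone C rather than driver code. The struct, the `highest_perf` field used for the boosted case, and the linear `toy_perf_to_khz()` conversion are assumptions made for illustration; the commit text itself only names `caps->nominal_perf` and `policy->boost_enabled`.

```c
#include <stdbool.h>

/* Subset of performance capabilities, reduced to what the example needs. */
struct toy_perf_caps {
	unsigned int highest_perf; /* assumed to describe the boosted level */
	unsigned int nominal_perf; /* base (non-boost) level */
};

/* Hypothetical conversion: a plain linear scale, for illustration only. */
static unsigned int toy_perf_to_khz(unsigned int perf)
{
	return perf * 10000u;
}

/* Choose the initial maximum from the per-policy boost flag. */
unsigned int toy_initial_max_khz(const struct toy_perf_caps *caps,
				 bool boost_enabled)
{
	unsigned int perf = boost_enabled ? caps->highest_perf
					  : caps->nominal_perf;

	return toy_perf_to_khz(perf);
}
```

Before the fix, the equivalent of this function would ignore `boost_enabled` and always use the nominal value, which is why a CPU coming online into a boosted policy stayed at the base frequency.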
[Upstream commit 2b16c63] At the end of cpufreq_online() in cpufreq.c, set_boost is executed and the per-policy boost flag is set to mirror the cpufreq_driver boost, so it is not necessary to run set_boost in acpi_cpufreq_cpu_init(). Signed-off-by: Lifeng Zheng <[email protected]> Acked-by: Viresh Kumar <[email protected]> Link: https://patch.msgid.link/[email protected] [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 0813fd2] Ensure cpufreq_driver->set_boost is non-NULL before using it in cpufreq_online() to prevent a potential NULL pointer dereference. Reported-by: Gautam Menghani <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Fixes: da59223d340c ("cpufreq: Introduce a more generic way to set default per-policy boost flag") Suggested-by: Viresh Kumar <[email protected]> Signed-off-by: Aboorva Devarajan <[email protected]> Link: https://patch.msgid.link/[email protected] [ rjw: Minor edits in the subject and changelog ] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
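The combination of the generic boost mirroring (commit dd016f3) and the NULL check added here can be modelled in userspace as follows. This is a sketch, not the kernel code: the `toy_*` types and functions are hypothetical, and the real cpufreq_online() does considerably more around this step.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_boost_policy {
	bool boost_enabled; /* per-policy flag */
};

struct toy_boost_driver {
	bool boost_enabled; /* driver-wide flag */
	/* Optional callback; drivers without boost support leave it NULL. */
	int (*set_boost)(struct toy_boost_policy *policy, int state);
};

/* Make the per-policy flag mirror the driver flag at the end of onlining. */
int toy_mirror_boost(const struct toy_boost_driver *drv,
		     struct toy_boost_policy *policy)
{
	int ret;

	if (!drv->set_boost)
		return 0; /* nothing to call: avoids a NULL dereference */
	if (policy->boost_enabled == drv->boost_enabled)
		return 0; /* already in sync */

	ret = drv->set_boost(policy, drv->boost_enabled);
	if (ret)
		return ret; /* leave the flag untouched on failure */

	policy->boost_enabled = drv->boost_enabled;
	return 0;
}

/* Sample callback that always succeeds. */
int toy_set_boost_ok(struct toy_boost_policy *policy, int state)
{
	(void)policy;
	(void)state;
	return 0;
}
```

With a policy whose flag was left at true from before the CPU went offline and a driver flag that is now false, the call clears the stale flag, which corresponds to the second drawback listed in commit dd016f3.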
… cpus commit d08e86c77069fbbdd7fdbdaa408c198223bc0900 openEuler Reading perf counters on offline cpus should be expected to fail, e.g. it returns -EFAULT as counters are shown to be 0. Remove the unnecessary warning print on this failure path. Fixes: 1eb5dde ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Jie Zhan <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: huwentao <[email protected]> Signed-off-by: WangYuli <[email protected]>
commit 2eab37aa8a55cdb8b85a912a4b1156c39689c89d openEuler Perf counters could be 0 if the cpu is in a low-power idle state. Just try it again next time and update the frequency scale when the cpu is active and perf counters successfully return. Also, remove the FIE source on an actual failure. Fixes: 1eb5dde ("cpufreq: CPPC: Add support for frequency invariance") Signed-off-by: Jie Zhan <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: huwentao <[email protected]> Signed-off-by: WangYuli <[email protected]>
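The retry behaviour can be illustrated with a small standalone function. Everything here is an assumption for illustration: the `toy_fb_ctrs` struct, the 1024-based fixed-point scale, and the function name are not taken from the patch, which only states that zero counters should be skipped and retried on the next update.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy snapshot of the two feedback counters. */
struct toy_fb_ctrs {
	uint64_t reference;
	uint64_t delivered;
};

/*
 * Returns true and writes *scale when the sample is usable.
 * Returns false (leaving *scale untouched) when the counters read 0,
 * so the caller simply tries again on the next tick.
 */
bool toy_update_scale(const struct toy_fb_ctrs *prev,
		      const struct toy_fb_ctrs *cur, uint64_t *scale)
{
	uint64_t d_ref, d_del;

	if (cur->reference == 0 || cur->delivered == 0)
		return false; /* CPU was probably idle: skip, do not fail */

	d_ref = cur->reference - prev->reference;
	d_del = cur->delivered - prev->delivered;
	if (d_ref == 0)
		return false; /* no progress: avoid dividing by zero */

	*scale = (d_del * 1024u) / d_ref; /* hypothetical fixed-point ratio */
	return true;
}
```

The design point is that a zero reading is treated as "no data yet" rather than as an error, so the previously computed scale stays in effect until the CPU is active again.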
[Upstream commit 2b8e6b5] Returning a negative error code in a function with an unsigned return type is a pretty bad idea. It is probably worse when the justification for the change is "our static analysis tool found it". Fixes: cf7de25 ("cppc_cpufreq: Fix possible null pointer dereference") Signed-off-by: Marc Zyngier <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Viresh Kumar <[email protected]> Reviewed-by: Lifeng Zheng <[email protected]> Signed-off-by: Viresh Kumar <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
commit d7c560f56e528fbb009f5f2b70cc813aad66661d openEuler Returning a negative error code in a function with an unsigned return type is a pretty bad idea. Returning 0 is enough when something goes wrong. Fixes: f84b9b2 ("cppc_cpufreq: Fix possible null pointer dereference") Signed-off-by: Lifeng Zheng <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
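Why a negative error code is harmful in a function returning `unsigned int` can be seen in a few lines of standalone C. The function names, the sample rate, and the choice of `ENODEV` are illustrative assumptions; the commits above only say "a negative error code".

```c
#include <errno.h>

#define TOY_RATE_KHZ 2400000u /* illustrative rate (assumption) */

/* Buggy pattern: the negative errno is converted to a huge unsigned value. */
unsigned int toy_get_rate_buggy(int fail)
{
	if (fail)
		return -ENODEV; /* wraps around to a value close to UINT_MAX */
	return TOY_RATE_KHZ;
}

/* Fixed pattern: 0 signals "no valid rate" to the caller. */
unsigned int toy_get_rate_fixed(int fail)
{
	if (fail)
		return 0;
	return TOY_RATE_KHZ;
}
```

A caller of the buggy variant cannot distinguish the wrapped error from a very high frequency, whereas 0 is never a valid rate and is easy to test for.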
…te() commit 5c5c2aac07c1202962821d54c8bebd5c8418d22e openEuler After commit ae2df91 ("ACPI: CPPC: Disable FIE if registers in PCC regions"), the only place that uses hisi_cppc_cpufreq_get_rate() is cppc_check_hisi_workaround(), which comes after the implementation of hisi_cppc_cpufreq_get_rate(). A forward declaration of hisi_cppc_cpufreq_get_rate() is unnecessary. Fixes: ae2df91 ("ACPI: CPPC: Disable FIE if registers in PCC regions") Signed-off-by: Lifeng Zheng <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit d80a756] After commit a28b2bf ("cppc_cpufreq: replace per-cpu data array with a list"), cpu_data can be got from policy->driver_data, so cpu_data_list is not actually needed and can be removed. Signed-off-by: Lifeng Zheng <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 3d5978e] The return value of populate_efficiency_class() is never needed and the result of it doesn't affect the initialization of cppc_cpufreq. It makes more sense to change it into a void function. Signed-off-by: Lifeng Zheng <[email protected]> Link: https://patch.msgid.link/[email protected] [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit c83a92d] cppc_cpufreq_register_em() is only used in populate_efficiency_class(). A forward declaration of it is not necessary. Move cppc_cpufreq_register_em() in front of populate_efficiency_class() and remove the forward declaration of cppc_cpufreq_register_em(). No functional change. Signed-off-by: Lifeng Zheng <[email protected]> Link: https://patch.msgid.link/[email protected] [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 2e554cf] After commit c034b02 ("cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers"), the file scaling_cur_freq is exposed to all drivers. No need to create this file separately. It's better to be contained in cpufreq_attrs. Signed-off-by: Lifeng Zheng <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 5d6ecaa] The has_target() checks in __cpufreq_offline() are duplicate. Remove one of them and put the operations of exiting governor together with storing last governor's name. Signed-off-by: Lifeng Zheng <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
…boost commit efc1ef3222b0c34a14395f84330fa890cfd4ec3f openEuler Hold the lock to avoid concurrency problems in cpufreq_enable_boost_support() when assigning cpufreq_driver->set_boost. Fixes: 7a6c79f ("cpufreq: Simplify core code related to boost support") Signed-off-by: Lifeng Zheng <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
…rrent_freq() [Upstream commit 908981d] Move the check of cpufreq_driver->get into cpufreq_verify_current_freq() in case it is called without the check. Signed-off-by: Lifeng Zheng <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
…ters commit b718dd523687c682c11f3aa590780c277a8a90d9 openEuler cppc_set_epp - write energy performance preference register cppc_get_auto_act_window - read autonomous activity window register cppc_set_auto_act_window - write autonomous activity window register cppc_get_auto_sel - read autonomous selection enable register Signed-off-by: hepeng <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
…ufreq commit c4ba198c5c002c06a6bd49c5a5520c01fef890b5 openEuler Add sysfs interfaces for CPPC auto act window and energy perf in the cppc_cpufreq driver. Signed-off-by: hepeng <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
commit 4e05c2f4ecf5a9116751b941c885f6d516860529 openEuler Add a new CPUFreq governor 'seep' designed for platforms with hardware-managed P-states through CPPC (Collaborative Processor Performance Control). This governor enables the hardware's autonomous frequency selection capability, allowing the processor to manage its own frequency based on workload characteristics. The SEEP governor requires: - cppc_cpufreq driver - Platform support for CPPC features: * Autonomous selection (auto_sel) * Autonomous activity window (auto_act_window) * Energy Performance Preference (epp) Two per-policy sysfs interfaces are provided: - auto_act_window: Control the hardware's frequency scaling window - energy_perf: Bias between performance and energy efficiency Signed-off-by: hepeng <[email protected]> Signed-off-by: zhaolichang <[email protected]> Signed-off-by: WangYuli <[email protected]>
Reviewer's Guide
Backports and extends CPPC-based cpufreq support to add a new SEEP governor, richer CPPC sysfs controls, and more robust policy/boost/invariance handling for Kunpeng SoCs, while simplifying cppc_cpudata and EM registration and centralizing scaling_cur_freq sysfs creation.
Sequence diagram for the SEEP governor start and CPPC interactions
sequenceDiagram
actor UserSpace
participant CpufreqCore
participant CpufreqGovernorSeep
participant ACPI_CPPC
participant FirmwarePCC
UserSpace->>CpufreqCore: select governor seep for policy
CpufreqCore->>CpufreqGovernorSeep: start(policy)
Note right of CpufreqGovernorSeep: cpufreq_gov_seep_start
CpufreqGovernorSeep->>ACPI_CPPC: cppc_set_auto_sel(cpu, 1)
Note right of ACPI_CPPC: Enable autonomous P-state selection
alt AUTO_SEL register in PCC
ACPI_CPPC->>FirmwarePCC: send_pcc_cmd(CMD_WRITE)
FirmwarePCC-->>ACPI_CPPC: status
ACPI_CPPC-->>CpufreqGovernorSeep: ret
else AUTO_SEL in memory-mapped space
ACPI_CPPC->>ACPI_CPPC: cpc_write(cpu, AUTO_SEL_ENABLE, 1)
ACPI_CPPC-->>CpufreqGovernorSeep: ret
end
CpufreqGovernorSeep-->>CpufreqCore: start result
CpufreqCore-->>UserSpace: governor seep active
rect rgb(230,230,250)
UserSpace->>CpufreqCore: read/write auto_act_window, energy_perf via sysfs
CpufreqCore->>ACPI_CPPC: cppc_get_auto_act_window / cppc_set_auto_act_window
CpufreqCore->>ACPI_CPPC: cppc_get_epp_perf / cppc_set_epp
ACPI_CPPC-->>CpufreqCore: values / status
CpufreqCore-->>UserSpace: sysfs read/write complete
end
Class diagram for SEEP governor, CPPC helpers, and cpufreq core changes
classDiagram
class cpufreq_governor {
+char* name
+int (*start)(struct cpufreq_policy *policy)
+void (*stop)(struct cpufreq_policy *policy)
+struct module *owner
}
class cpufreq_gov_seep {
+name = seep
+start(policy)
+stop(policy)
}
class cpufreq_policy {
+unsigned int cpu
+unsigned int min
+unsigned int max
+unsigned int cpuinfo_min_freq
+unsigned int cpuinfo_max_freq
+unsigned int transition_delay_us
+unsigned int shared_type
+bool boost_enabled
+unsigned int policy
+char last_governor[CPUFREQ_NAME_LEN]
+unsigned int last_policy
+struct freq_qos_request *max_freq_req
+struct cpufreq_governor *governor
+void *driver_data
}
class cppc_cpudata {
+struct cppc_perf_caps perf_caps
+struct cppc_perf_ctrls perf_ctrls
+struct cppc_perf_fb_ctrs perf_fb_ctrs
+struct cpumask *shared_cpu_map
+int shared_type
+unsigned int max_freq
+unsigned int min_freq
}
class cppc_acpi_helpers {
+int cppc_get_perf(int cpunum, enum cppc_regs reg_idx, u64 *perf)
+int cppc_get_reg(int cpunum, enum cppc_regs reg_idx, u64 *val)
+int cppc_set_reg(int cpu, enum cppc_regs reg_idx, u64 val)
+int cppc_set_epp(int cpu, u64 epp_val)
+int cppc_get_auto_act_window(int cpunum, u64 *auto_act_window)
+int cppc_set_auto_act_window(int cpu, u64 auto_act_window)
+int cppc_get_auto_sel(int cpunum, u64 *auto_sel)
+int cppc_set_auto_sel(int cpu, bool enable)
+int cppc_get_epp_perf(int cpunum, u64 *epp_perf)
+int cppc_set_epp_perf(int cpu, struct cppc_perf_ctrls *perf_ctrls, bool enable)
}
class cppc_cpufreq_driver_helpers {
+struct cppc_cpudata* cppc_cpufreq_get_cpu_data(unsigned int cpu)
+void cppc_cpufreq_put_cpu_data(struct cpufreq_policy *policy)
+void cppc_cpufreq_cpu_fie_init(struct cpufreq_policy *policy)
+unsigned int cppc_cpufreq_get_transition_delay_us(unsigned int cpu)
+void cppc_cpufreq_register_em(struct cpufreq_policy *policy)
+void populate_efficiency_class(void)
+unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu)
}
class cpufreq_core {
+int cpufreq_online(unsigned int cpu)
+void __cpufreq_offline(unsigned int cpu, struct cpufreq_policy *policy)
+unsigned int cpufreq_verify_current_freq(struct cpufreq_policy *policy, bool update)
+unsigned int __cpufreq_get(struct cpufreq_policy *policy)
+unsigned int cpufreq_get(unsigned int cpu)
+int cpufreq_start_governor(struct cpufreq_policy *policy)
+int cpufreq_enable_boost_support(void)
}
class cpufreq_sysfs_attributes {
+struct attribute scaling_cur_freq
+struct attribute auto_act_window
+struct attribute energy_perf
+struct attribute freqdomain_cpus
}
%% Relationships
cpufreq_gov_seep --|> cpufreq_governor : instance_of
cpufreq_policy o--> cppc_cpudata : driver_data
cpufreq_core --> cpufreq_policy : manages
cpufreq_core --> cpufreq_governor : selects_and_starts
cpufreq_gov_seep --> cppc_acpi_helpers : uses
cppc_cpufreq_driver_helpers --> cppc_cpudata : allocates_and_frees
cppc_cpufreq_driver_helpers --> cpufreq_policy : initializes
cppc_cpufreq_driver_helpers --> cppc_acpi_helpers : uses
cpufreq_core --> cpufreq_sysfs_attributes : exposes
cpufreq_sysfs_attributes --> cppc_acpi_helpers : auto_act_window_energy_perf
cppc_acpi_helpers --> FirmwarePCC : via_PCC_commands
class FirmwarePCC {
+int send_pcc_cmd(int pcc_ss_id, int cmd)
+struct cppc_pcc_data *pcc_data[]
}
Tested and passed on ARM64 target using LLVM-20. I don’t have Deepin’s development environment (build toolchain), so I can’t update the defconfig for all architectures myself. Could the Deepin team please handle this? Many thanks! @opsiff
…r-free commit 951c5e26f86a2a342cbd533f4ae4a1545d8ff57e openEuler
From 'commit 456d8aa ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free")' we know that PCIe spec r6.0, sec 7.5.3.7, recommends that software program the same ASPM Control (pcie_link_state) value in all functions of multi-function devices, and free the pcie_link_state when any child function is removed.
However, the ASPM Control sysfs is still visible to the other children even after it has been removed by any child function, and careless use of it will trigger a use-after-free error, e.g.:
  # lspci -tv
  -[0000:16]---00.0-[17]--+-00.0  Device 19e5:0222
                          \-00.1  Device 19e5:0222
  # echo 1 > /sys/bus/pci/devices/0000:17:00.0/remove // pcie_link_state will be released
  # echo 1 > /sys/bus/pci/devices/0000:17:00.1/link/l1_aspm // will trigger error
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030
  Call trace:
   aspm_attr_store_common.constprop.0+0x10c/0x154
   l1_aspm_store+0x24/0x30
   dev_attr_store+0x20/0x34
   sysfs_kf_write+0x4c/0x5c
We can solve this problem by updating the ASPM Control sysfs of all children immediately after ASPM Control has been freed.
Extra note: The Linux community plans to optimize the pcie_link_state management code. For details, see the following link: https://patchwork.kernel.org/project/linux-pci/patch/ [email protected]/
Fixes: 456d8aa ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free")
Signed-off-by: Jay Fang <[email protected]>
Signed-off-by: lujunhua <[email protected]>
Signed-off-by: Slim6882 <[email protected]>
Signed-off-by: yeyiyang <[email protected]>
Signed-off-by: WangYuli <[email protected]>
…ation commit 049045f499e1cc07a132c3ef010f15abce995e03 openEuler The filter sysfs attribute is initialized when it is registered to sysfs. This is unnecessary; it can be done at allocation time without distinguishing the filter type. After the changes above, we don't need a wrapper for initializing and registering the filter's sysfs attributes. Remove them and call the sysfs creating/removing functions in place. Fixes: 6373c46 ("hwtracing: hisi_ptt: Export available filters through sysfs") Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: lujunhua <[email protected]> Signed-off-by: Slim6882 <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
commit 20c2368e0195aae332176096336105c470d48f15 openEuler
1509d06c9c41 ("init: only move down lockup_detector_init() when sdei_watchdog is enabled")
In the above commit, sdei_watchdog needs to move down
lockup_detector_init(), while nmi_watchdog does not. So when
sdei_watchdog fails to be initialized, nmi_watchdog should not be
initialized.
[ 0.706631][ T1] SDEI NMI watchdog: Disable SDEI NMI Watchdog in VM
[ 0.707405][ T1] ------------[ cut here ]------------
[ 0.708020][ T1] WARNING: CPU: 0 PID: 1 at kernel/watchdog_perf.c:117 hardlockup_detector_event_create+0x24/0x108
[ 0.709230][ T1] Modules linked in:
[ 0.709665][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.0 deepin-community#1
[ 0.710700][ T1] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 0.711625][ T1] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.712547][ T1] pc : hardlockup_detector_event_create+0x24/0x108
[ 0.713316][ T1] lr : watchdog_hardlockup_probe+0x28/0xa8
[ 0.714010][ T1] sp : ffff8000831cbdc0
[ 0.714501][ T1] pmr_save: 000000e0
[ 0.714957][ T1] x29: ffff8000831cbdc0 x28: 0000000000000000 x27: 0000000000000000
[ 0.715899][ T1] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 0.716839][ T1] x23: 0000000000000000 x22: 0000000000000000 x21: ffff80008218fab0
[ 0.717775][ T1] x20: ffff8000821af000 x19: ffff0000c0261900 x18: 0000000000000020
[ 0.718713][ T1] x17: 00000000cb551c45 x16: ffff800082625e48 x15: ffffffffffffffff
[ 0.719663][ T1] x14: 0000000000000000 x13: 205d315420202020 x12: 5b5d313336363037
[ 0.720607][ T1] x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffff800081b5f630
[ 0.721590][ T1] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 000000000005fff4
[ 0.722528][ T1] x5 : 00000000002bffa8 x4 : 0000000000000000 x3 : 0000000000000000
[ 0.723482][ T1] x2 : 0000000000000000 x1 : 0000000000000140 x0 : ffff0000c02c0000
[ 0.724426][ T1] Call trace:
[ 0.724808][ T1] hardlockup_detector_event_create+0x24/0x108
[ 0.725535][ T1] watchdog_hardlockup_probe+0x28/0xa8
[ 0.726174][ T1] lockup_detector_init+0x110/0x158
[ 0.726776][ T1] kernel_init_freeable+0x208/0x288
[ 0.727387][ T1] kernel_init+0x2c/0x200
[ 0.727902][ T1] ret_from_fork+0x10/0x20
[ 0.728420][ T1] ---[ end trace 0000000000000000 ]---
Fixes: 7cc5cc30a7d1 ("watchdog: Support watchdog_sdei coexist with existing watchdogs")
Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: Jie Liu <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
[Upstream commit 60bc47b] Architectures using perf events for hard lockup detection need to convert the watchdog_thresh to the event's period. Some architectures, for example arm64, perform this conversion using the CPU's maximum frequency, which is acquired from cpufreq. However, by the time the lockup detector is initialized the cpufreq driver may not be initialized yet, so the watchdog is launched with an inaccurate period. Provide a function hardlockup_detector_perf_adjust_period() to allow adjusting the event period. Architectures can then update to a more accurate period once cpufreq is initialized. Fixes: 94946f9 ("arm64: add hw_nmi_get_sample_period for preparation of lockup detector") Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: huwentao <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 7a88444] arm64 depends on the cpufreq driver to obtain the maximum CPU frequency used to convert the watchdog_thresh to a perf event period. cpufreq drivers like cppc_cpufreq are initialized late, after the hard lockup detector, so a safe CPU frequency is used at first, which is inaccurate. Use a cpufreq notifier to adjust the event's period to a more accurate one. Fixes: 94946f9 ("arm64: add hw_nmi_get_sample_period for preparation of lockup detector") Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: huwentao <[email protected]> Signed-off-by: WangYuli <[email protected]>
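The conversion from watchdog_thresh to an event period, and why a later adjustment matters, can be sketched in standalone C. The function name and the 5 GHz fallback constant are assumptions for illustration; the commit text only says that a "safe" frequency is used until cpufreq can report the real maximum.

```c
#include <stdint.h>

/* Assumed conservative fallback used before cpufreq is ready (5 GHz). */
#define TOY_SAFE_MAX_CPU_HZ 5000000000ULL

/*
 * Period in CPU cycles for a threshold given in seconds.
 * max_khz == 0 models "cpufreq has not reported a maximum yet".
 */
uint64_t toy_sample_period(unsigned int max_khz, unsigned int watchdog_thresh)
{
	uint64_t max_hz = max_khz ? (uint64_t)max_khz * 1000ULL
				  : TOY_SAFE_MAX_CPU_HZ;

	return max_hz * watchdog_thresh;
}
```

With a 10-second threshold, the fallback yields a period of 50,000,000,000 cycles, while a CPU whose true maximum is 2.6 GHz needs only 26,000,000,000. Until the period is adjusted, the watchdog on such a CPU would therefore fire roughly twice as late as intended.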
commit c8f96adca7fadbacc9b1af75528964da9d7bb166 openEuler
During testing, it was found that in low-power scenarios, the CPU
core frequently powers on and off, which causes the SDEI watchdog
to fail. This is because the secure timer in the BIOS does not save
and restore the state during the CPU core power-on and power-off
process. The OS needs to add enable/disable operations in the
suspend/resume process.
Fixes: e29d570c5413 ("watchdog: add nmi_watchdog support for arm64 based on SDEI")
Signed-off-by: Bowen You <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
commit f022c4cac9c1013fa8820b0af01aaa98e38053f1 openEuler
The sdei watchdog needs to be initialized after sdei_init,
so commit 81e349d81958 ("init: only move down lockup_detector_init()
when sdei_watchdog is enabled")
moved down lockup_detector_init().
Now commit 930d8f8 ("watchdog/perf: adapt the watchdog_perf
interface for async model")
provides an API, lockup_detector_retry_init(), for anyone
who needs to delay initialization of the lockup detector, so use this
API to delay initialization of the sdei watchdog.
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: zhangguangzhi <[email protected]>
Signed-off-by: WangYuli <[email protected]>
commit f0f9be237c2266d30fa01c294577438a2e2ee749 openEuler The current interrupt policy of the Hip12 preferentially schedules LPI and SGI interrupts with the same priority. As a result, the arch_timer may not be executed when frequent LPI/SGI interrupts are generated, and the watchdog is starved. Signed-off-by: Qinxin Xia <[email protected]> Signed-off-by: Hongye Lin <[email protected]> Signed-off-by: WangYuli <[email protected]>
Exception: although it compiles, there is no need to enable SDEI_WATCHDOG in production.
commit 6dd0f06a404f481ea58fb04b3006287dc3c5aea3 openEuler
The core CPU control framework supports runtime SMT control which
is not yet supported by arch_topology driver and thus arch_topology
based architectures. This patch implements it in the following aspects:
- implement topology_is_primary_thread() to indicate the primary thread,
required by the framework
- architecture code can get/set the SMT thread number by
topology_smt_{get, set}_num_threads()
- update the SMT thread number for the framework after the topology
enumerated on arm64, which is also required by the framework
For disabling SMT we'll offline all the secondary threads and
only leave the primary thread. Since we don't have restriction
for primary thread selection, the first thread is chosen as the
primary thread in this implementation.
This patch only implements the basic support for SMT control, which
needs to collaborate with ACPI/OF based topology building to fully
enable the feature. The SMT control will not be enabled unless the
correct SMT thread number is set and HOTPLUG_SMT kconfig is selected.
Signed-off-by: Yicong Yang <[email protected]>
Signed-off-by: Jie Liu <[email protected]>
Signed-off-by: huwentao <[email protected]>
Signed-off-by: WangYuli <[email protected]>
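The primary-thread rule described in this commit (the first thread of each core is primary, and disabling SMT offlines the rest) can be modelled in userspace as below. The fixed four-CPU topology table and all `toy_*` names are hypothetical; the kernel derives the same information from sibling masks rather than from a static array.

```c
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical topology: CPU n belongs to core toy_core_of[n]. */
static const int toy_core_of[TOY_NR_CPUS] = { 0, 0, 1, 1 };

/* A CPU is primary when no lower-numbered CPU shares its core. */
bool toy_is_primary_thread(int cpu)
{
	for (int i = 0; i < cpu; i++)
		if (toy_core_of[i] == toy_core_of[cpu])
			return false;
	return true;
}

/* Disable SMT: offline every secondary thread; returns CPUs left online. */
int toy_disable_smt(bool online[TOY_NR_CPUS])
{
	int left = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!toy_is_primary_thread(cpu))
			online[cpu] = false;
		if (online[cpu])
			left++;
	}
	return left;
}
```

On the sample topology, CPUs 0 and 2 are primary, so disabling SMT leaves exactly two CPUs online.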
[Upstream commit 5deb9c7] On building the topology from the devicetree, we've already gotten the SMT thread number of each core. Update the largest SMT thread number to enable the SMT control. Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: Jie Liu <[email protected]> Signed-off-by: huwentao <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit e6b18eb] For ACPI we'll build the topology from PPTT and we cannot directly get the SMT number of each core. Instead, use a temporary xarray to record the SMT number of each core when building the topology, so that we can know the largest SMT number in the system. Then we can notify the arch_topology for supporting SMT control. Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: Jie Liu <[email protected]> Signed-off-by: huwentao <[email protected]> Signed-off-by: WangYuli <[email protected]>
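The counting idea can be sketched with a plain array in place of the kernel xarray. This is a simplified stand-in, not the patch itself: the function name, the fixed `TOY_MAX_CORES` bound, and the flat `core_ids` input are assumptions chosen to keep the example self-contained.

```c
#define TOY_MAX_CORES 8 /* illustrative bound on core ids (assumption) */

/*
 * core_ids[i] is the core that CPU i belongs to.
 * Counts threads per core while walking the CPUs and returns the largest
 * count seen, i.e. the SMT thread number to report.
 */
unsigned int toy_max_smt_threads(const unsigned int *core_ids,
				 unsigned int nr_cpus)
{
	unsigned int count[TOY_MAX_CORES] = { 0 };
	unsigned int max = 0;

	for (unsigned int i = 0; i < nr_cpus; i++) {
		unsigned int core = core_ids[i];

		if (core >= TOY_MAX_CORES)
			continue; /* outside the toy table: ignore */
		count[core]++;
		if (count[core] > max)
			max = count[core];
	}
	return max;
}
```

A temporary per-core counter is needed because the thread count of a core is only known after every CPU has been visited; the maximum over all cores is what enables or disables SMT control.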
[Upstream commit eed4583] Enable HOTPLUG_SMT for SMT control. Signed-off-by: Yicong Yang <[email protected]> Signed-off-by: Jie Liu <[email protected]> Signed-off-by: huwentao <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 58f8fc5] Currently the PMUv3 driver only reads PMMIR_EL1 if the PMU implements FEAT_PMUv3p4 and the STALL_SLOT event, but the check for STALL_SLOT event isn't necessary and can be removed. The check for STALL_SLOT event was introduced with the read of PMMIR_EL1 in commit f5be3a6 ("arm64: perf: Add support caps under sysfs") When this logic was written, the ARM ARM said: | If STALL_SLOT is not implemented, it is IMPLEMENTATION DEFINED whether | the PMMIR System registers are implemented. ... and thus the driver had to check for STALL_SLOT event to verify that PMMIR_EL1 was implemented and accesses to PMMIR_EL1 would not be UNDEFINED. Subsequently, the architecture was retrospectively tightened to require that any FEAT_PMUv3p4 implementation implements PMMIR_EL1. Since the G.b release of the ARM ARM, the wording regarding STALL_SLOT event has been removed, and the description of PMMIR_EL1 says: | This register is present only when FEAT_PMUv3p4 is implemented. Drop the unnecessary check for STALL_SLOT event when reading PMMIR_EL1. Cc: Will Deacon <[email protected]> Cc: Mark Rutland <[email protected]> Cc: [email protected] Cc: [email protected] Reviewed-by: James Clark <[email protected]> Signed-off-by: Anshuman Khandual <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
…nit() [Upstream commit 3b9a22d] All the PMU init functions want the default sysfs attribute groups, and so these all call armv8_pmu_init_nogroups() helper, with none of them calling armv8_pmu_init() directly. When we introduced armv8_pmu_init_nogroups() in the commit e424b17 ("arm64: perf: Refactor PMU init callbacks") ... we thought that we might need custom attribute groups in future, but as we evidently haven't, we can remove the option. This patch folds armv8_pmu_init_nogroups() into armv8_pmu_init(), removing the ability to use custom attribute groups and simplifying the code. CC: James Clark <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mark Rutland <[email protected]> Cc: [email protected] Cc: [email protected] Acked-by: Mark Rutland <[email protected]> Signed-off-by: Anshuman Khandual <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit bc512d6] The NSH bit, which filters event counting at EL2, is required by the architecture if an implementation has EL2. Even though KVM doesn't support nested virt yet, it makes no effort to hide the existence of EL2 from the ID registers. Userspace can, however, change the value of PFR0 to hide EL2. Align KVM's sysreg emulation with the architecture and make NSH RES0 if EL2 isn't advertised. Keep in mind the bit is ignored when constructing the backing perf event. While at it, build the event type mask using explicit field definitions instead of relying on ARMV8_PMU_EVTYPE_MASK. KVM probably should've been doing this in the first place, as it avoids changes to the aforementioned mask affecting sysreg emulation. Reviewed-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit ae8d352] Suzuki noticed that KVM's PMU emulation is oblivious to the NSU and NSK event filter bits. On systems that have EL3 these bits modify the filter behavior in non-secure EL0 and EL1, respectively. Even though the kernel doesn't use these bits, it is entirely possible some other guest OS does. Additionally, it would appear that these and the M bit are required by the architecture if EL3 is implemented. Allow the EL3 event filter bits to be set if EL3 is advertised in the guest's ID register. Implement the behavior of NSU and NSK according to the pseudocode, and entirely ignore the M bit for perf event creation. Reported-by: Suzuki K Poulose <[email protected]> Reviewed-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
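The NSU/NSK behavior described above can be sketched in plain C. This is a userspace illustration, not the KVM code: the bit positions are my reading of the PMEVTYPER layout, and `struct el_filter` and `decode_el_filter()` are hypothetical names standing in for the perf event attribute that KVM fills in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed PMEVTYPER filter bit positions (not taken from the patch). */
#define EVTYPER_P   (UINT64_C(1) << 31) /* EL1 filter */
#define EVTYPER_U   (UINT64_C(1) << 30) /* EL0 filter */
#define EVTYPER_NSK (UINT64_C(1) << 29) /* non-secure EL1 modifier */
#define EVTYPER_NSU (UINT64_C(1) << 28) /* non-secure EL0 modifier */
#define EVTYPER_M   (UINT64_C(1) << 26) /* EL3 filter, ignored below */

/* Hypothetical stand-in for the exclude_* bits of a perf event attribute. */
struct el_filter {
	bool exclude_user;
	bool exclude_kernel;
};

/*
 * In the non-secure state, EL0 is counted only when U == NSU and EL1 only
 * when P == NSK, so a mismatch excludes that exception level. The M bit is
 * accepted but has no effect on the resulting filter.
 */
void decode_el_filter(uint64_t evtyper, struct el_filter *out)
{
	assert(out != NULL);

	bool p   = (evtyper & EVTYPER_P) != 0;
	bool u   = (evtyper & EVTYPER_U) != 0;
	bool nsk = (evtyper & EVTYPER_NSK) != 0;
	bool nsu = (evtyper & EVTYPER_NSU) != 0;

	out->exclude_user = (u != nsu);
	out->exclude_kernel = (p != nsk);
}
```

Setting U alone excludes EL0, while setting both U and NSU counts EL0 again, which is why a guest that programs these bits would see different results if they were silently dropped.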
[Upstream commit 877806b] This further compacts all remaining PMU init procedures requiring specific map_event functions via a new macro PMUV3_INIT_MAP_EVENT(). While here, it also changes generated init function names to match to those generated via the other macro PMUV3_INIT_SIMPLE(). This does not cause functional change. Cc: Will Deacon <[email protected]> Cc: Mark Rutland <[email protected]> Cc: [email protected] Cc: [email protected] Reviewed-by: James Clark <[email protected]> Signed-off-by: Anshuman Khandual <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 9343c79] These are all static and in one compilation unit so the inline has no effect on the binary. Except if FTRACE is enabled, then 3 functions which were already not inlined now get the nops added which allows them to be traced. Signed-off-by: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 2f6a00f] This is so that FIELD_GET and FIELD_PREP can be used and that the fields are in a consistent format to arm64/tools/sysreg Signed-off-by: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit d30f09b] Convert the remaining fields to use either GENMASK or be built from other fields. These all already started at bit 0 so don't need a code change for the lack of _SHIFT. Signed-off-by: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 3115ee0] FEAT_PMUv3_TH (Armv8.8) adds two new fields to PMEVTYPER, so include them in the mask. These aren't writable on 32 bit kernels as they are in the high part of the register, so only include them for arm64. It would be difficult to do this statically in the asm header files for each platform without resulting in circular includes or #ifdefs inline in the code. For that reason the ARMV8_PMU_EVTYPE_MASK definition has been removed and the mask is constructed programmatically. Reviewed-by: Suzuki K Poulose <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Signed-off-by: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
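A minimal sketch of what "constructed programmatically" could look like, written as standalone C rather than kernel code. The macro names, the choice of base fields, and the bit positions are assumptions for illustration; `is_arm64` stands in for a compile-time check of the target architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout; names are illustrative, not the kernel's. */
#define EVTYPE_EVENT       UINT64_C(0x000000000000FFFF) /* bits [15:0]  */
#define EVTYPE_INCLUDE_EL2 (UINT64_C(1) << 27)
#define EVTYPE_EXCLUDE_EL0 (UINT64_C(1) << 30)
#define EVTYPE_EXCLUDE_EL1 (UINT64_C(1) << 31)
#define EVTYPE_TH          UINT64_C(0x00000FFF00000000) /* bits [43:32] */
#define EVTYPE_TC          UINT64_C(0xE000000000000000) /* bits [63:61] */

/* Build the writable-bits mask at run time instead of in a header. */
uint64_t evtype_mask(bool is_arm64)
{
	uint64_t mask = EVTYPE_EVENT | EVTYPE_INCLUDE_EL2 |
			EVTYPE_EXCLUDE_EL0 | EVTYPE_EXCLUDE_EL1;

	/* The always-present fields fit in the low word, so arm32 loses nothing. */
	assert((mask >> 32) == 0);

	/* The threshold fields live in the high word and only exist on arm64. */
	if (is_arm64)
		mask |= EVTYPE_TC | EVTYPE_TH;

	return mask;
}
```

A caller would then apply `val &= evtype_mask(is_arm64)` before writing the register, so no per-platform header has to carry a different constant.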
[Upstream commit f6da869] This mechanism makes it much easier to define and read new attributes so move it to the arm_pmu.h header so that it can be shared. At the same time update the existing format attributes to use it. GENMASK has to be changed to GENMASK_ULL because the config fields are 64 bits even on arm32 where this will also be used now. Signed-off-by: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 186c91a] -EPERM or -EINVAL always get converted to -EOPNOTSUPP, so replace them. This will allow __hw_perf_event_init() to return a different code or not print that particular message for a different error in the next commit. Signed-off-by: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit 816c267] FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on events whose count meets a specified threshold condition. For example if PMEVTYPERn.TC (Threshold Control) is set to 0b101 (Greater than or equal, count), and the threshold is set to 2, then the PMU counter will now only increment by 1 when an event would have previously incremented the PMU counter by 2 or more on a single processor cycle. Three new Perf event config fields, 'threshold', 'threshold_compare' and 'threshold_count' have been added to control the feature. threshold_compare maps to the upper two bits of PMEVTYPERn.TC and threshold_count maps to the first bit of TC. These separate attributes have been picked rather than enumerating all the possible combinations of the TC field as in the Arm ARM. The attributes would be used on a Perf command line like this: $ perf stat -e stall_slot/threshold=2,threshold_compare=2/ A new capability for reading out the maximum supported threshold value has also been added: $ cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max 0x000000ff If a threshold higher than threshold_max is provided, then an error is generated. If FEAT_PMUv3_TH isn't implemented or a 32 bit kernel is running, then threshold_max reads zero, and attempting to set a threshold value will also result in an error. The threshold is per PMU counter, and there are potentially different threshold_max values per PMU type on heterogeneous systems. Bits higher than 32 now need to be written into PMEVTYPER, so armv8pmu_write_evtype() has to be updated to take an unsigned long value rather than u32 which gives the correct behavior on both aarch32 and 64. Signed-off-by: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
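The mapping from the three config fields to the register can be sketched as follows. This is a standalone illustration under stated assumptions: the shifts for TH and TC, the `-EINVAL` return value, and the treatment of a zero threshold as "feature unused" are not quoted from the patch, and `threshold_control()` and `encode_threshold()` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define EVTYPE_TH_SHIFT 32 /* assumed: TH occupies bits [43:32] */
#define EVTYPE_TC_SHIFT 61 /* assumed: TC occupies bits [63:61] */

/* threshold_compare fills the upper two TC bits, threshold_count the lowest. */
static inline uint8_t threshold_control(uint8_t compare, uint8_t count)
{
	assert(compare <= 3 && count <= 1);
	return (uint8_t)((compare << 1) | count);
}

/*
 * Produce the high PMEVTYPER bits for a requested threshold, or fail when
 * the value exceeds th_max (which is 0 when the feature is unavailable).
 */
int encode_threshold(uint32_t th, uint8_t compare, uint8_t count,
		     uint32_t th_max, uint64_t *bits)
{
	if (th > th_max)
		return -EINVAL;

	/* Assumption: a zero threshold leaves the threshold fields clear. */
	if (th == 0) {
		*bits = 0;
		return 0;
	}

	*bits = ((uint64_t)th << EVTYPE_TH_SHIFT) |
		((uint64_t)threshold_control(compare, count) << EVTYPE_TC_SHIFT);
	return 0;
}
```

With the values from the commit message, `threshold=2,threshold_compare=2` yields TC = 0b100, and adding `threshold_count=1` yields the 0b101 encoding that the message calls "Greater than or equal, count".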
[Upstream commit bd69063] Add documentation for the new Perf event open parameters and the threshold_max capability file. Reviewed-by: Anshuman Khandual <[email protected]> Reviewed-by: Suzuki K Poulose <[email protected]> Acked-by: Namhyung Kim <[email protected]> Signed-off-by: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
[Upstream commit bb339db] LLVM ignores everything inside the if statement and doesn't generate errors, but GCC doesn't ignore it, resulting in the following error: drivers/perf/arm_pmuv3.c: In function ‘armv8pmu_write_evtype’: include/linux/bits.h:34:29: error: left shift count >= width of type [-Werror=shift-count-overflow] 34 | (((~UL(0)) - (UL(1) << (l)) + 1) & \ Fix it by using GENMASK_ULL which doesn't overflow on arm32 (even though the value is never used there). Fixes: 3115ee0 ("arm64: perf: Include threshold control fields in PMEVTYPER mask") Reported-by: Uwe Kleine-König <[email protected]> Closes: https://lore.kernel.org/linux-arm-kernel/[email protected]/ Signed-off-by: James Clark <[email protected]> Acked-by: Mark Rutland <[email protected]> Reviewed-by: Uwe Kleine-König <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
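The error arises because the mask arithmetic is done in `unsigned long`, which is 32 bits wide on arm32, so a shift by 61 exceeds the type width. A rough userspace equivalent of a 64-bit-safe mask builder is sketched below; `genmask64()` is a hypothetical helper that mirrors the shape of the expression in the error message, not the kernel macro itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a mask with bits h..l (inclusive) set. All arithmetic is done in
 * uint64_t, so the shift counts stay below the type width even when the
 * target's unsigned long is only 32 bits.
 */
uint64_t genmask64(unsigned int h, unsigned int l)
{
	assert(l <= h && h < 64);
	return (~UINT64_C(0) - (UINT64_C(1) << l) + 1) &
	       (~UINT64_C(0) >> (63 - h));
}
```

For example, `genmask64(63, 61)` covers the three topmost bits, which cannot be expressed at all in a 32-bit type.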
[Upstream commit 81e15ca] If the user has requested a counting threshold for the CPU cycles event, then the fixed cycle counter can't be assigned as it lacks threshold support. Currently, the thresholds will work or not randomly depending on which counter the event is assigned. While using thresholds for CPU cycles doesn't make much sense, it can be useful for testing purposes. Fixes: 816c267 ("arm64: perf: Add support for event counting threshold") Signed-off-by: Rob Herring (Arm) <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]> Signed-off-by: yeyiyang <[email protected]> Signed-off-by: WangYuli <[email protected]>
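The rule this fix introduces can be sketched as a small predicate. The event number for CPU cycles, `struct event_req`, and `can_use_fixed_cycle_counter()` are assumptions made for illustration and are not copied from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PERFCTR_CPU_CYCLES 0x0011 /* assumed event number for CPU cycles */

/* Hypothetical summary of what the user requested for one event. */
struct event_req {
	uint16_t evtype;    /* event number */
	uint32_t threshold; /* 0 means no counting threshold requested */
};

/*
 * The fixed cycle counter has no threshold support, so it is only a valid
 * choice for a cycles event that did not ask for a threshold. Anything else
 * must be placed on a general-purpose counter.
 */
bool can_use_fixed_cycle_counter(const struct event_req *ev)
{
	assert(ev != NULL);
	return ev->evtype == PERFCTR_CPU_CYCLES && ev->threshold == 0;
}
```

Checking this before counter allocation makes the outcome deterministic: a thresholded cycles event is always placed on a counter that honors the threshold.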
deepin pr auto review
I will review this Git diff, focusing on syntax and logic, code quality, performance, and security.
Specific improvement suggestions:
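// Suggestion: add more error checking in hibmc_msi_init
if (ret < HIBMC_MIN_VECTORS) {
	drm_err(dev, "Insufficient IRQ vectors: got %d, need at least %d\n",
		ret, HIBMC_MIN_VECTORS);
	// A positive but insufficient count must not be returned as if it were success
	return ret < 0 ? ret : -ENOSPC;
}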
// Suggestion: add a stricter threshold check
if (th > threshold_max(cpu_pmu)) {
	pr_debug("PMU event threshold %u exceeds max value %u\n",
		 th, threshold_max(cpu_pmu));
	return -EINVAL;
}
// Suggestion: add a stricter permission check
if (!capable(CAP_SYS_ADMIN)) {
	pr_debug("Permission denied for cache maintenance operation\n");
	return -EPERM;
}
// Suggestion: add a watchdog timeout check
if (watchdog_thresh < 1 || watchdog_thresh > 3600) {
	pr_warn("Invalid watchdog threshold %d, using default %d\n",
		watchdog_thresh, DEFAULT_WATCHDOG_THRESH);
	watchdog_thresh = DEFAULT_WATCHDOG_THRESH;
}
// Suggestion: add a stricter range check
if (val > resource->caps.max_cap || val < resource->caps.min_cap) {
	pr_debug("Power cap %lu out of range [%llu,%llu]\n",
		 val, resource->caps.min_cap, resource->caps.max_cap);
	return -EINVAL;
}
// Suggestion: add an interrupt priority range check
if (prio > GICD_INT_DEF_PRI) {
	pr_warn("Interrupt priority %u exceeds maximum %u\n",
		prio, GICD_INT_DEF_PRI);
	return -EINVAL;
}
Overall, this diff implements several important features:
Code quality is good overall, but the security and robustness checks should be strengthened further in the areas listed above.
Also fixed the wrong debugfs node name in the hisi_spi debugfs initialization.
Summary by Sourcery
Add support for new CPPC-based cpufreq features and a SEEP governor, improve handling of CPPC registers and perf counters, and expose additional tuning controls via sysfs for Kunpeng SoCs.
New Features:
Bug Fixes:
Enhancements:
Documentation: