| Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 75141a770f4f8225d316f6c7e146723a32e9720e upstream.
When concurrently bringing up and down two SMT threads of a physical
core, many warning call traces occur as below:
The issue timeline is as follows:
1. When the system starts,
cpufreq: CPU: 220, policy->related_cpus: 220-221, policy->cpus: 220-221
2. Offline CPU 220 and CPU 221.
3. Online CPU 220
- CPU 221 is now offline, as acpi_get_psd_map() use
for_each_online_cpu(), so the cpu_data->shared_cpu_map,
policy->cpus, and related_cpus has only CPU 220.
cpufreq: CPU: 220, policy->related_cpus: 220, policy->cpus: 220
4. Offline CPU 220
5. Online CPU 221, the below call trace occurs:
- Since CPU 220 and CPU 221 share one policy, and
policy->related_cpus = 220 after step 3, so CPU 221
is not in policy->related_cpus but
per_cpu(cpufreq_cpu_data, cpu221) is not NULL.
After reverting commit 56eb0c0ed345 ("ACPI: CPPC: Fix remaining
for_each_possible_cpu() to use online CPUs"), the issue disappeared.
The _PSD (P-State Dependency) defines the hardware-level dependency of
frequency control across CPU cores. Since this relationship is a physical
attribute of the hardware topology, it remains constant regardless of the
online or offline status of the CPUs.
Using for_each_online_cpu() in acpi_get_psd_map() is problematic. If a
CPU is offline, it will be excluded from the shared_cpu_map.
Consequently, if that CPU is brought online later, the kernel will fail
to recognize it as part of any shared frequency domain.
Switch back to for_each_possible_cpu() to ensure that all cores defined
in the ACPI tables are correctly mapped into their respective performance
domains from the start. This aligns with the logic of policy->related_cpus,
which must encompass all potentially available cores in the domain to
prevent logic gaps during CPU hotplug operations.
To resolve the original issue regarding the "nosmt" or "nosmt=force"
boot parameter, as send_pcc_cmd() function already does if (!desc)
continue, so reverting that loop back to for_each_possible_cpu() is ok,
only need to change the match_cpc_ptr NULL case in acpi_get_psd_map() to
continue as Sean suggested.
How to reproduce, on arm64 machine with SMT support which use acpi cppc
cpufreq driver:
bash test.sh 220 & bash test.sh 221 &
The test.sh is as below:
while true
do
echo 0 > /sys/devices/system/cpu/cpu${1}/online
sleep 0.5
cat /sys/devices/system/cpu/cpu${1}/cpufreq/related_cpus
echo 1 > /sys/devices/system/cpu/cpu${1}/online
cat /sys/devices/system/cpu/cpu${1}/cpufreq/related_cpus
done
CPU: 221 PID: 1119 Comm: cpuhp/221 Kdump: loaded Not tainted 6.6.0debug+ #5
Hardware name: To be filled by O.E.M. S920X20/BC83AMDA01-7270Z, BIOS 20.39 09/04/2024
pstate: a1400009 (NzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : cpufreq_online+0x8ac/0xa90
lr : cpuhp_cpufreq_online+0x18/0x30
sp : ffff80008739bce0
x29: ffff80008739bce0 x28: 0000000000000000 x27: ffff28400ca32200
x26: 0000000000000000 x25: 0000000000000003 x24: ffffd483503ff000
x23: ffffd483504051a0 x22: ffffd48350024a00 x21: 00000000000000dd
x20: 000000000000001d x19: ffff28400ca32000 x18: 0000000000000000
x17: 0000000000000020 x16: ffffd4834e6a3fc8 x15: 0000000000000020
x14: 0000000000000008 x13: 0000000000000001 x12: 00000000ffffffff
x11: 0000000000000040 x10: ffffd48350430728 x9 : ffffd4834f087c78
x8 : 0000000000000001 x7 : ffff2840092bdf00 x6 : ffffd483504264f0
x5 : ffffd48350405000 x4 : ffff283f7f95cc60 x3 : 0000000000000000
x2 : ffff53bc2f94b000 x1 : 00000000000000dd x0 : 0000000000000000
Call trace:
cpufreq_online+0x8ac/0xa90
cpuhp_cpufreq_online+0x18/0x30
cpuhp_invoke_callback+0x128/0x580
cpuhp_thread_fun+0x110/0x1b0
smpboot_thread_fn+0x140/0x190
kthread+0xec/0x100
ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---
Cc: All applicable <stable@vger.kernel.org>
Fixes: 56eb0c0ed345 ("ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs")
Co-developed-by: Sean Kelley <skelley@nvidia.com>
Signed-off-by: Sean Kelley <skelley@nvidia.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260417040112.3727756-1-ruanjinjie@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This converts some of the visually simpler cases that have been split
over multiple lines. I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.
Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script. I probably had made it a bit _too_ trivial.
So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.
The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI support updates from Rafael J. Wysocki:
"These are mostly fixes and cleanups on top of the ACPI support updates
merged recently, including two new quirks, an ACPI CPPC library fix,
and fixes and cleanups of a few core ACPI device drivers:
- Add an unused power resource handling quirk for THUNDEROBOT ZERO
(Zhai Can)
- Fix remaining for_each_possible_cpu() in the ACPI CPPC library to
use online CPUs (Sean V Kelley)
- Drop redundant checks from the ACPI notify handler and the driver
remove callback in the ACPI battery driver (Rafael Wysocki)
- Move the creation of the wakeup source during the ACPI button
driver probe to an earlier point to avoid missing a wakeup event
due to a race and clean up system wakeup handling and remove
callback in that driver (Rafael Wysocki)
- Drop unnecessary driver_data pointer clearing from the ACPI EC and
SMBUS HC drivers and make the ACPI backlight (video) driver clear
the device's driver_data pointer on remove (Rafael Wysocki)
- Force enabling of PWM2 on the Yogabook YB1-X90 tablets (Yauhen
Kharuzhy)"
* tag 'acpi-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: Add unused power resource quirk for THUNDEROBOT ZERO
ACPI: driver: Drop driver_data pointer clearing from two drivers
ACPI: video: Clear driver_data pointer on remove
ACPI: button: Tweak acpi_button_remove()
ACPI: button: Tweak system wakeup handling
ACPI: battery: Drop redundant checks from acpi_battery_remove()
ACPI: CPPC: Fix remaining for_each_possible_cpu() to use online CPUs
ACPI: x86: Force enabling of PWM2 on the Yogabook YB1-X90
ACPI: button: Call device_init_wakeup() earlier during probe
ACPI: battery: Drop redundant check from acpi_battery_notify()
|
|
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However, send_pcc_cmd() and acpi_get_psd_map() still iterate over all
possible CPUs. In acpi_get_psd_map(), encountering an offline CPU
returns -EFAULT, causing cppc_cpufreq initialization to fail.
This breaks systems booted with "nosmt" or "nosmt=force".
Fix by using for_each_online_cpu() in both functions.
Fixes: 80b8286aeec0 ("ACPI / CPPC: support for batching CPPC requests")
Signed-off-by: Sean V Kelley <skelley@nvidia.com>
Link: https://patch.msgid.link/20260211212254.30190-1-skelley@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"By the number of commits, cpufreq is the leading party (again) and the
most visible change there is the removal of the omap-cpufreq driver
that has not been used for a long time (good riddance). There are also
quite a few changes in the cppc_cpufreq driver, mostly related to
fixing its frequency invariance engine in the case when the CPPC
registers used by it are not in PCC. In addition to that, support for
AM62L3 is added to the ti-cpufreq driver and the cpufreq-dt-platdev
list is updated for some platforms. The remaining cpufreq changes are
assorted fixes and cleanups.
Next up is cpuidle and the changes there are dominated by intel_idle
driver updates, mostly related to the new command line facility
allowing users to adjust the list of C-states used by the driver.
There are also a few updates of cpuidle governors, including two menu
governor fixes and some refinements of the teo governor, and a
MAINTAINERS update adding Christian Loehle as a cpuidle reviewer.
[Thanks for stepping up Christian!]
The most significant update related to system suspend and hibernation
is the one to stop freezing the PM runtime workqueue during system PM
transitions which allows some deadlocks to be avoided. There is also a
fix for possible concurrent bit field updates in the core device
suspend code and a few other minor fixes.
Apart from the above, several drivers are updated to discard the
return value of pm_runtime_put() which is going to be converted to a
void function as soon as everybody stops using its return value, PL4
support for Ice Lake is added to the Intel RAPL power capping driver,
and there are assorted cleanups, documentation fixes, and some
cpupower utility improvements.
Specifics:
- Remove the unused omap-cpufreq driver (Andreas Kemnade)
- Optimize error handling code in cpufreq_boost_trigger_state() and
make cpufreq_boost_trigger_state() return -EOPNOTSUPP if no policy
supports boost (Lifeng Zheng)
- Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
Dhruva Gole, and Konrad Dybcio)
- Minor improvements to the cpufreq and cpumask rust implementation
(Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen)
- Add support for AM62L3 SoC to the ti-cpufreq driver (Dhruva Gole)
- Update arch_freq_scale in the CPPC cpufreq driver's frequency
invariance engine (FIE) in scheduler ticks if the related CPPC
registers are not in PCC (Jie Zhan)
- Assorted minor cleanups and improvements in ARM cpufreq drivers
(Juan Martinez, Felix Gu, Luca Weiss, and Sergey Shtylyov)
- Add generic helpers for sysfs show/store to cppc_cpufreq (Sumit
Gupta)
- Make the scaling_setspeed cpufreq sysfs attribute return the actual
requested frequency to avoid confusion (Pengjie Zhang)
- Simplify the idle CPU time granularity test in the ondemand cpufreq
governor (Frederic Weisbecker)
- Enable asym capacity in intel_pstate only when CPU SMT is not
possible (Yaxiong Tian)
- Update the description of rate_limit_us default value in cpufreq
documentation (Yaxiong Tian)
- Add a command line option to adjust the C-states table in the
intel_idle driver, remove the 'preferred_cstates' module parameter
from it, add C-states validation to it and clean it up (Artem
Bityutskiy)
- Make the menu cpuidle governor always check the time till the
closest timer event when the scheduler tick has been stopped to
prevent it from mistakenly selecting the deepest available idle
state (Rafael Wysocki)
- Update the teo cpuidle governor to avoid making suboptimal
decisions in certain corner cases and generally improve idle state
selection accuracy (Rafael Wysocki)
- Remove an unlikely() annotation on the early-return condition in
menu_select() that leads to branch misprediction 100% of the time
on systems with only 1 idle state enabled, like ARM64 servers
(Breno Leitao)
- Add Christian Loehle to MAINTAINERS as a cpuidle reviewer
(Christian Loehle)
- Stop flagging the PM runtime workqueue as freezable to avoid system
suspend and resume deadlocks in subsystems that assume asynchronous
runtime PM to work during system-wide PM transitions (Rafael
Wysocki)
- Drop redundant NULL pointer checks before acomp_request_free() from
the hibernation code handling image saving (Rafael Wysocki)
- Update wakeup_sources_walk_start() to handle empty lists of wakeup
sources as appropriate (Samuel Wu)
- Make dev_pm_clear_wake_irq() check the power.wakeirq value under
power.lock to avoid race conditions (Gui-Dong Han)
- Avoid bit field races related to power.work_in_progress in the core
device suspend code (Xuewen Yan)
- Make several drivers discard pm_runtime_put() return value in
preparation for converting that function to a void one (Rafael
Wysocki)
- Add PL4 support for Ice Lake to the Intel RAPL power capping driver
(Daniel Tang)
- Replace sprintf() with sysfs_emit() in power capping sysfs show
functions (Sumeet Pawnikar)
- Make dev_pm_opp_get_level() return value match the documentation
after a previous update of the latter (Aleks Todorov)
- Use scoped for each OF child loop in the OPP code (Krzysztof
Kozlowski)
- Fix a bug in an example code snippet and correct typos in the
energy model management documentation (Patrick Little)
- Fix miscellaneous problems in cpupower (Kaushlendra Kumar):
* idle_monitor: Fix incorrect value logged after stop
* Fix inverted APERF capability check
* Use strcspn() to strip trailing newline
* Reset errno before strtoull()
* Show C0 in idle-info dump
- Improve cpupower installation procedure by making the systemd step
optional and allowing users to disable the installation of
systemd's unit file (João Marcos Costa)"
* tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits)
PM: sleep: core: Avoid bit field races related to work_in_progress
PM: sleep: wakeirq: harden dev_pm_clear_wake_irq() against races
cpufreq: Documentation: Update description of rate_limit_us default value
cpufreq: intel_pstate: Enable asym capacity only when CPU SMT is not possible
PM: wakeup: Handle empty list in wakeup_sources_walk_start()
PM: EM: Documentation: Fix bug in example code snippet
Documentation: Fix typos in energy model documentation
cpuidle: governors: teo: Refine intercepts-based idle state lookup
cpuidle: governors: teo: Adjust the classification of wakeup events
cpufreq: ondemand: Simplify idle cputime granularity test
cpufreq: userspace: make scaling_setspeed return the actual requested frequency
PM: hibernate: Drop NULL pointer checks before acomp_request_free()
cpufreq: CPPC: Add generic helpers for sysfs show/store
cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id()
cpufreq: ti-cpufreq: add support for AM62L3 SoC
cpufreq: dt-platdev: Add ti,am62l3 to blocklist
cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy
cpufreq: scmi: correct SCMI explanation
cpufreq: dt-platdev: Block the driver from probing on more QC platforms
rust: cpumask: rename methods of Cpumask for clarity and consistency
...
|
|
Update EPP (Energy Performance Preference) constants for more clarity:
- Add CPPC_EPP_PERFORMANCE_PREF (0x00) for performance preference.
- Rename CPPC_ENERGY_PERF_MAX to CPPC_EPP_ENERGY_EFFICIENCY_PREF (0xFF)
for energy efficiency.
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20260120145623.2959636-4-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull CPUFreq Arm updates for 7.0 from Viresh Kumar:
"- Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling,
Dhruva Gole, and Konrad Dybcio).
- Minor improvements to the cpufreq / cpumask rust implementation
(Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen).
- Add support for AM62L3 SoC to ti-cpufreq driver (Dhruva Gole).
- Update FIE arch_freq_scale in ticks for non-PCC regs (Jie Zhan).
- Other minor cleanups / improvements (Felix Gu, Juan Martinez, Luca
Weiss, and Sergey Shtylyov)."
* tag 'cpufreq-arm-updates-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id()
cpufreq: ti-cpufreq: add support for AM62L3 SoC
cpufreq: dt-platdev: Add ti,am62l3 to blocklist
cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy
cpufreq: scmi: correct SCMI explanation
cpufreq: dt-platdev: Block the driver from probing on more QC platforms
rust: cpumask: rename methods of Cpumask for clarity and consistency
cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs
cpufreq: CPPC: Factor out cppc_fie_kworker_init()
ACPI: CPPC: Factor out and export per-cpu cppc_perf_ctrs_in_pcc_cpu()
rust: cpufreq: replace `kernel::c_str!` with C-Strings
cpufreq: Add Tegra186 and Tegra194 to cpufreq-dt-platdev blocklist
dt-bindings: cpufreq: qcom-hw: document Milos CPUFREQ Hardware
rust: cpufreq: add __rust_helper to helpers
rust: cpufreq: always inline functions using build_assert with arguments
|
|
Factor out cppc_perf_ctrs_in_pcc_cpu() for checking whether per-cpu CPC
regs are defined in PCC channels, and export it out for further use.
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com>
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
The current implementation overlooks the 'guaranteed_perf'
register in this check.
If the Guaranteed Performance register is located in the PCC
subspace, the function currently attempts to read it without
acquiring the lock and without sending the CMD_READ doorbell
to the firmware. This can result in reading stale data.
Fixes: 29523f095397 ("ACPI / CPPC: Add support for guaranteed performance")
Signed-off-by: Pengjie Zhang <zhangpengjie2@huawei.com>
Cc: 4.20+ <stable@vger.kernel.org> # 4.20+
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20251210132227.1988380-1-zhangpengjie2@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPU via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function cppc_perf_ctrs_in_pcc() checks if the CPPC
perf-ctrs are in a PCC region for all the present CPUs, which breaks
when the kernel is booted with "nosmt=force".
Hence, limit the check only to the online CPUs.
Fixes: ae2df912d1a5 ("ACPI: CPPC: Disable FIE if registers in PCC regions")
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-5-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function cppc_allow_fast_switch() checks for the validity
of the _CPC object for all the present CPUs. This breaks when the
kernel is booted with "nosmt=force".
Check fast_switch capability only on online CPUs
Fixes: 15eece6c5b05 ("ACPI: CPPC: Fix NULL pointer dereference when nosmp is used")
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-4-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function acpi_cpc_valid() checks for the validity of the
_CPC object for all the present CPUs. This breaks when the kernel is
booted with "nosmt=force".
Hence check the validity of the _CPC objects of only the online CPUs.
Fixes: 2aeca6bd0277 ("ACPI: CPPC: Check present CPUs for determining _CPC is valid")
Reported-by: Christopher Harris <chris.harris79@gmail.com>
Closes: https://lore.kernel.org/lkml/CAM+eXpdDT7KjLV0AxEwOLkSJ2QtrsvGvjA2cCHvt1d0k2_C4Cw@mail.gmail.com/
Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Tested-by: Chrisopher Harris <chris.harris79@gmail.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-3-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Fix spelling from "pachage" to "package".
Signed-off-by: Chu Guangqing <chuguangqing@inspur.com>
[ rjw: Changelog and subject edits ]
Link: https://patch.msgid.link/20251031055240.2791-1-chuguangqing@inspur.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Instead of using CPUFREQ_ETERNAL for signaling an error condition
in cppc_get_transition_latency(), change the return value type of
that function to int and make it return a proper negative error
code on failures.
No intentional functional impact.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Qais Yousef <qyousef@layalina.io>
|
|
With nosmp in cmdline, other CPUs are not brought up, leaving
their cpc_desc_ptr NULL. CPU0's iteration via for_each_possible_cpu()
dereferences these NULL pointers, causing panic.
Panic backtrace:
[ 0.401123] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b8
...
[ 0.403255] [<ffffffff809a5818>] cppc_allow_fast_switch+0x6a/0xd4
...
Kernel panic - not syncing: Attempted to kill init!
Fixes: 3cc30dd00a58 ("cpufreq: CPPC: Enable fast_switch")
Reported-by: Xu Lu <luxu.kernel@bytedance.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://patch.msgid.link/20250604023036.99553-1-cuiyunhui@bytedance.com
[ rjw: New subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
cppc_set_epp() - write energy performance preference register value,
based on ACPI 6.5, s8.4.6.1.7
cppc_get_auto_act_window() - read autonomous activity window register
value, based on ACPI 6.5, s8.4.6.1.6
cppc_set_auto_act_window() - write autonomous activity window register
value, based on ACPI 6.5, s8.4.6.1.6
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-9-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Modify cppc_get_auto_sel_caps() to cppc_get_auto_sel(). Using a
cppc_perf_caps to carry the value is unnecessary.
Add a check to ensure the pointer 'enable' is not null.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-8-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Refactor register value get and set ABIs by using cppc_get_reg_val(),
cppc_set_reg_val() and CPPC_REG_VAL_READ().
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-7-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add cppc_set_reg_val() as a generic function for setting CPPC register
values, with this features:
1. Check register. If a register is writeable, it must be a buffer and
can not be null.
2. Extract the operations if register is in PCC out as
cppc_set_reg_val_in_pcc().
This function can be used to reduce some existing code duplication.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-6-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Extract the operations if register is in pcc out from cppc_get_reg_val()
as cppc_get_reg_val_in_pcc().
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-5-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Rename cppc_get_perf() to cppc_get_reg_val() as a generic function to
read CPPC registers.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-4-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Optimize cppc_get_perf() with three changes:
1. Change the error kind to "no such device" when pcc_ss_id < 0, as
other register value getting functions.
2. Add a check to ensure the pointer 'perf' is no null.
3. Add a check to verify if the register is supported to be read before
using it. The logic is:
(1) If the register is of the integer type, check whether the
register is optional and its value is 0. If yes, the register
is not supported.
(2) If the register is of other types, a null one is not supported.
4. Return the result of cpc_read() instead of 0.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-3-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In ACPI 6.5, s8.4.6.1 _CPC (Continuous Performance Control), whether
each of the per-cpu cpc_regs[] is mandatory or optional is defined.
Since the CPC_SUPPORTED() check is only for optional _CPC fields,
another macro to check if the field is optional is needed.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20250411093855.982491-2-zhenglifeng1@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The PCC driver now handles mapping and unmapping of shared memory
areas as part of pcc_mbox_{request,free}_channel(). Without these before,
this ACPI CPPC driver did handling of those mappings like several
other PCC mailbox client drivers.
There were redundant operations, leading to unnecessary code. Maintaining
the consistency across these driver was harder due to scattered handling
of shmem.
Just use the mapped shmem and remove all redundant operations from this
driver.
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://patch.msgid.link/20250411113144.1151094-2-sudeep.holla@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Merge updates of the ACPI battery and EC drivers, an ACPI Platform
Firmware Runtime (PFR) telemetry driver update and an ACPI OS support
layer change for 6.13-rc1:
- Use DEFINE_SIMPLE_DEV_PM_OPS in the ACPI battery driver, make it use
devm_ for initializing mutexes and allocating driver data, and make
it check the register_pm_notifier() return value (Thomas Weißschuh,
Andy Shevchenko).
- Make the ACPI EC driver support compile-time conditional and allow
ACPI to be built without CONFIG_HAS_IOPORT (Arnd Bergmann).
- Remove a redundant error check from the pfr_telemetry driver (Colin
Ian King).
* acpi-battery:
ACPI: battery: Check for error code from devm_mutex_init() call
ACPI: battery: use DEFINE_SIMPLE_DEV_PM_OPS
ACPI: battery: initialize mutexes through devm_ APIs
ACPI: battery: allocate driver data through devm_ APIs
ACPI: battery: check result of register_pm_notifier()
* acpi-ec:
ACPI: EC: make EC support compile-time conditional
* acpi-pfr:
ACPI: pfr_telemetry: remove redundant error check on ret
* acpi-osl:
ACPI: allow building without CONFIG_HAS_IOPORT
|
|
Since commit 60949b7b8054 ("ACPI: CPPC: Fix MASK_VAL() usage"), _CPC
registers cannot be changed from 1 to 0.
It turns out that there is an extra OR after MASK_VAL_WRITE(), which
has already ORed prev_val with the register mask.
Remove the extra OR to fix the problem.
Fixes: 60949b7b8054 ("ACPI: CPPC: Fix MASK_VAL() usage")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20241113103309.761031-1-zhenglifeng1@huawei.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
arch_init_invariance_cppc() is called at the end of
acpi_cppc_processor_probe() in order to configure frequency invariance
based upon the values from _CPC.
This however doesn't work on AMD CPPC shared memory designs that have
AMD preferred cores enabled because _CPC needs to be analyzed from all
cores to judge if preferred cores are enabled.
This issue manifests to users as a warning since commit 21fb59ab4b97
("ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn"):
```
Could not retrieve highest performance (-19)
```
However the warning isn't the cause of this, it was actually
commit 279f838a61f9 ("x86/amd: Detect preferred cores in
amd_get_boost_ratio_numerator()") which exposed the issue.
To fix this problem, change arch_init_invariance_cppc() into a new weak
symbol that is called at the end of acpi_processor_driver_init().
Each architecture that supports it can declare the symbol to override
the weak one.
Define it for x86, in arch/x86/kernel/acpi/cppc.c, and for all of the
architectures using the generic arch_topology.c code.
Fixes: 279f838a61f9 ("x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()")
Reported-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219431
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20241104222855.3959267-1-superm1@kernel.org
[ rjw: Changelog edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
CONFIG_HAS_IOPORT will soon become optional and cause a build time
failure when it is disabled but a driver calls inb()/outb(). At the
moment, all architectures that can support ACPI have port I/O, but this
is not necessarily the case in the future on non-x86 architectures.
The result is a set of errors like:
drivers/acpi/osl.c: In function 'acpi_os_read_port':
include/asm-generic/io.h:542:14: error: call to '_inb' declared with attribute error: inb()) requires CONFIG_HAS_IOPORT
Nothing should actually call these functions in this configuration,
and if it does, the result would be undefined behavior today, possibly
a NULL pointer dereference.
Change the low-level functions to return a proper error code when
HAS_IOPORT is disabled.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20241030123701.1538919-2-arnd@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The following BUG was triggered:
=============================
[ BUG: Invalid wait context ]
6.12.0-rc2-XXX #406 Not tainted
-----------------------------
kworker/1:1/62 is trying to lock:
ffffff8801593030 (&cpc_ptr->rmw_lock){+.+.}-{3:3}, at: cpc_write+0xcc/0x370
other info that might help us debug this:
context-{5:5}
2 locks held by kworker/1:1/62:
#0: ffffff897ef5ec98 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2c/0x50
#1: ffffff880154e238 (&sg_policy->update_lock){....}-{2:2}, at: sugov_update_shared+0x3c/0x280
stack backtrace:
CPU: 1 UID: 0 PID: 62 Comm: kworker/1:1 Not tainted 6.12.0-rc2-g9654bd3e8806 #406
Workqueue: 0x0 (events)
Call trace:
dump_backtrace+0xa4/0x130
show_stack+0x20/0x38
dump_stack_lvl+0x90/0xd0
dump_stack+0x18/0x28
__lock_acquire+0x480/0x1ad8
lock_acquire+0x114/0x310
_raw_spin_lock+0x50/0x70
cpc_write+0xcc/0x370
cppc_set_perf+0xa0/0x3a8
cppc_cpufreq_fast_switch+0x40/0xc0
cpufreq_driver_fast_switch+0x4c/0x218
sugov_update_shared+0x234/0x280
update_load_avg+0x6ec/0x7b8
dequeue_entities+0x108/0x830
dequeue_task_fair+0x58/0x408
__schedule+0x4f0/0x1070
schedule+0x54/0x130
worker_thread+0xc0/0x2e8
kthread+0x130/0x148
ret_from_fork+0x10/0x20
sugov_update_shared() locks a raw_spinlock while cpc_write() locks a
spinlock.
To have a correct wait-type order, update rmw_lock to a raw spinlock and
ensure that interrupts will be disabled on the CPU holding it.
Fixes: 60949b7b8054 ("ACPI: CPPC: Fix MASK_VAL() usage")
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://patch.msgid.link/20241028125657.1271512-1-pierre.gondois@arm.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When the nominal_freq recorded by the kernel is equal to the lowest_freq,
and the frequency adjustment operation is triggered externally, there is
a logic error in cppc_perf_to_khz()/cppc_khz_to_perf(), resulting in perf
and khz conversion errors.
Fix this by adding a branch processing logic when nominal_freq is equal
to lowest_freq.
Fixes: ec1c7ad47664 ("cpufreq: CPPC: Fix performance/frequency conversion")
Signed-off-by: liwei <liwei728@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20241024022952.2627694-1-liwei728@huawei.com
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.
auto-generated by the following:
for i in `git grep -l -w asm/unaligned.h`; do
sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
|
|
Some Asus AMD systems are reported to not be able to change EPP values
because the BIOS doesn't advertise support for the CPPC MSR and the PCC
region is not configured.
However the ACPI 6.2 specification allows CPC registers to be declared
in FFH:
```
Starting with ACPI Specification 6.2, all _CPC registers can be in
PCC, System Memory, System IO, or Functional Fixed Hardware address
spaces. OSPM support for this more flexible register space scheme
is indicated by the “Flexible Address Space for CPPC Registers” _OSC
bit.
```
If this _OSC has been set allow using FFH to configure EPP.
Reported-by: al0uette@outlook.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218686
Suggested-by: al0uette@outlook.com
Tested-by: vderp@icloud.com
Tested-by: al0uette@outlook.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patch.msgid.link/20240910031524.106387-1-superm1@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
MASK_VAL() was added as a way to handle bit_offset and bit_width for
registers located in system memory address space. However, while suited
for reading, it does not work for writing and result in corrupted
registers when writing values with bit_offset > 0. Moreover, when a
register is collocated with another one at the same address but with a
different mask, the current code results in the other registers being
overwritten with 0s. The write procedure for SYSTEM_MEMORY registers
should actually read the value, mask it, update it and write it with the
updated value. Moreover, since registers can be located in the same
word, we must take care of locking the access before doing it. We should
potentially use a global lock since we don't know in if register
addresses aren't shared with another _CPC package but better not
encourage vendors to do so. Assume that registers can use the same word
inside a _CPC package and thus, use a per _CPC package lock.
Fixes: 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses")
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Link: https://patch.msgid.link/20240826101648.95654-1-cleger@rivosinc.com
[ rjw: Dropped redundant semicolon ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Replace ternary operator with umax() in cppc_find_dmi_mhz().
Signed-off-by: Prabhakar Pujeri <prabhakar.pujeri@gmail.com>
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20240626130941.1527127-2-prabhakar.pujeri@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Expose the CPPC guaranteed performance as reported by the platform through
GuaranteedPerformanceRegister.
The current value is already read in cppc_get_perf_caps() and stored in
struct cppc_perf_caps (to be used by the intel_pstate driver), so only the
attribute itself needs to be defined.
Signed-off-by: Petr Tesařík <ptesarik@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Merge cpufreq updates for 6.10:
- Rework the handling of disabled turbo in the intel_pstate driver and
make it update the maximum CPU frequency consistently regardless of
the reason on top of a number of cleanups (Rafael Wysocki).
- Add missing checks for NULL .exit() cpufreq driver callback to the
cpufreq core (Viresh Kumar).
- Prevent pulicy->max from going above the frequency QoS maximum value
when cpufreq_frequency_table_verify() is used (Xuewen Yan).
- Prevent a negative CPU number or frequency value from being printed
if they are really large (Joshua Yeong).
- Update MAINTAINERS entry for amd-pstate to add two new submaintainers
and a designated reviewer (Huang Rui).
- Clean up the amd-pstate driver and update its documentation (Gautham
Shenoy).
- Fix the highest frequency issue in the amd-pstate driver which limits
performance (Perry Yuan).
- Enable CPPC v2 for certain processors in the family 17H, as requested
by TR40 processor users who expect improved performance and lower
system temperature (Perry Yuan).
- Change latency and delay values to be read from platform firmware
firstly for more accurate timing (Perry Yuan).
- A new quirk is introduced for supporting amd-pstate on legacy
processors which either lack CPPC capability, or only only have CPPC
v2 capability (Perry Yuan).
- Sun50i: Add support for opp_supported_hw, H616 platform and general
cleanups (Andre Przywara, Martin Botka, Brandon Cheo Fusi, Dan
Carpenter, Viresh Kumar).
- CPPC: Fix possible null pointer dereference (Aleksandr Mishin).
- Eliminate uses of of_node_put() (Javier Carrasco, and Shivani Gupta).
- brcmstb-avs: ISO C90 forbids mixed declarations (Portia Stephens).
- mediatek: Add support for MT7988A (Sam Shih).
- cpufreq-qcom-hw: Add SM4450 compatibles in DT bindings (Tengfei Fan).
- Fix struct cpudata::epp_cached kernel-doc in the intel_pstate cpufreq
driver (Jeff Johnson).
* pm-cpufreq: (46 commits)
cpufreq: amd-pstate: fix the highest frequency issue which limits performance
cpufreq: intel_pstate: fix struct cpudata::epp_cached kernel-doc
cpufreq: Fix up printing large CPU numbers and frequency values
MAINTAINERS: cpufreq: amd-pstate: Add co-maintainers and reviewer
cpufreq: amd-pstate: remove unused variable lowest_nonlinear_freq
cpufreq: amd-pstate: fix c |