linux.git/drivers/base, branch v6.6.120

PM: runtime: Do not clear needs_force_resume with enabled runtime PM

2026-01-11T14:22:01+00:00

commit 359afc8eb02a518fbdd0cbd462c8c2827c6cbec2 upstream.

Commit 89d9cec3b1e9 ("PM: runtime: Clear power.needs_force_resume in
pm_runtime_reinit()") added provisional clearing of power.needs_force_resume
to pm_runtime_reinit(), but it is done unconditionally which is a
mistake because pm_runtime_reinit() may race with driver probing
and removal [1].

To address this, notice that power.needs_force_resume should never
be set when runtime PM is enabled and so it only needs to be cleared
when runtime PM is disabled, and update pm_runtime_init() to only
clear that flag when runtime PM is disabled.

Fixes: 89d9cec3b1e9 ("PM: runtime: Clear power.needs_force_resume in pm_runtime_reinit()")
Reported-by: Ed Tsai 
Closes: https://lore.kernel.org/linux-pm/20251215122154.3180001-1-ed.tsai@mediatek.com/ [1]
Signed-off-by: Rafael J. Wysocki 
Cc: 6.17+  # 6.17+
Reviewed-by: Ulf Hansson 
Link: https://patch.msgid.link/12807571.O9o76ZdvQC@rafael.j.wysocki
Signed-off-by: Greg Kroah-Hartman

base/node / ACPI: Enumerate node access class for 'struct access_coordinate'

2025-11-24T09:30:04+00:00

[ Upstream commit 11270e526276ffad4c4237acb393da82a3287487 ]

Both generic node and HMAT handling code have been using magic numbers to
indicate access classes for 'struct access_coordinate'. Introduce enums to
enumerate the access0 and access1 classes shared by the two subsystems.
Update the function parameters and callers as appropriate to utilize the
new enum.

Access0 is named to ACCESS_COORDINATE_LOCAL in order to indicate that the
access class is for 'struct access_coordinate' between a target node and
the nearest initiator node.

Access1 is named to ACCESS_COORDINATE_CPU in order to indicate that the
access class is for 'struct access_coordinate' between a target node and
the nearest CPU node.

Cc: Greg Kroah-Hartman 
Cc: Rafael J. Wysocki 
Reviewed-by: Jonathan Cameron 
Tested-by: Jonathan Cameron 
Acked-by: Greg Kroah-Hartman 
Signed-off-by: Dave Jiang 
Link: https://lore.kernel.org/r/20240308220055.2172956-3-dave.jiang@intel.com
Signed-off-by: Dan Williams 
Stable-dep-of: 214291cbaace ("acpi/hmat: Fix lockdep warning for hmem_register_resource()")
Signed-off-by: Sasha Levin

base/node / acpi: Change 'node_hmem_attrs' to 'access_coordinates'

2025-11-24T09:30:03+00:00

[ Upstream commit 6a954e94d038f41d79c4e04348c95774d1c9337d ]

Dan Williams suggested changing the struct 'node_hmem_attrs' to
'access_coordinates' [1]. The struct is a container of r/w-latency and
r/w-bandwidth numbers. Moving forward, this container will also be used by
CXL to store the performance characteristics of each link hop in
the PCIE/CXL topology. So, where node_hmem_attrs is just the access
parameters of a memory-node, access_coordinates applies more broadly
to hardware topology characteristics. The observation is that seemed like
an exercise in having the application identify "where" it falls on a
spectrum of bandwidth and latency needs. For the tuple of
read/write-latency and read/write-bandwidth, "coordinates" is not a perfect
fit. Sometimes it is just conveying values in isolation and not a
"location" relative to other performance points, but in the end this data
is used to identify the performance operation point of a given memory-node.
[2]

Link: http://lore.kernel.org/r/64471313421f7_1b66294d5@dwillia2-xfh.jf.intel.com.notmuch/
Link: https://lore.kernel.org/linux-cxl/645e6215ee0de_1e6f2945e@dwillia2-xfh.jf.intel.com.notmuch/
Suggested-by: Dan Williams 
Reviewed-by: Dan Williams 
Reviewed-by: Jonathan Cameron 
Signed-off-by: Dave Jiang 
Acked-by: Greg Kroah-Hartman 
Link: https://lore.kernel.org/r/170319615734.2212653.15319394025985499185.stgit@djiang5-mobl3
Signed-off-by: Dan Williams 
Stable-dep-of: 214291cbaace ("acpi/hmat: Fix lockdep warning for hmem_register_resource()")
Signed-off-by: Sasha Levin

regmap: slimbus: fix bus_context pointer in regmap init calls

2025-11-24T09:29:15+00:00

commit 434f7349a1f00618a620b316f091bd13a12bc8d2 upstream.

Commit 4e65bda8273c ("ASoC: wcd934x: fix error handling in
wcd934x_codec_parse_data()") revealed the problem in the slimbus regmap.
That commit breaks audio playback, for instance, on sdm845 Thundercomm
Dragonboard 845c board:

 Unable to handle kernel paging request at virtual address ffff8000847cbad4
 ...
 CPU: 5 UID: 0 PID: 776 Comm: aplay Not tainted 6.18.0-rc1-00028-g7ea30958b305 #11 PREEMPT
 Hardware name: Thundercomm Dragonboard 845c (DT)
 ...
 Call trace:
  slim_xfer_msg+0x24/0x1ac [slimbus] (P)
  slim_read+0x48/0x74 [slimbus]
  regmap_slimbus_read+0x18/0x24 [regmap_slimbus]
  _regmap_raw_read+0xe8/0x174
  _regmap_bus_read+0x44/0x80
  _regmap_read+0x60/0xd8
  _regmap_update_bits+0xf4/0x140
  _regmap_select_page+0xa8/0x124
  _regmap_raw_write_impl+0x3b8/0x65c
  _regmap_bus_raw_write+0x60/0x80
  _regmap_write+0x58/0xc0
  regmap_write+0x4c/0x80
  wcd934x_hw_params+0x494/0x8b8 [snd_soc_wcd934x]
  snd_soc_dai_hw_params+0x3c/0x7c [snd_soc_core]
  __soc_pcm_hw_params+0x22c/0x634 [snd_soc_core]
  dpcm_be_dai_hw_params+0x1d4/0x38c [snd_soc_core]
  dpcm_fe_dai_hw_params+0x9c/0x17c [snd_soc_core]
  snd_pcm_hw_params+0x124/0x464 [snd_pcm]
  snd_pcm_common_ioctl+0x110c/0x1820 [snd_pcm]
  snd_pcm_ioctl+0x34/0x4c [snd_pcm]
  __arm64_sys_ioctl+0xac/0x104
  invoke_syscall+0x48/0x104
  el0_svc_common.constprop.0+0x40/0xe0
  do_el0_svc+0x1c/0x28
  el0_svc+0x34/0xec
  el0t_64_sync_handler+0xa0/0xf0
  el0t_64_sync+0x198/0x19c

The __devm_regmap_init_slimbus() started to be used instead of
__regmap_init_slimbus() after the commit mentioned above and turns out
the incorrect bus_context pointer (3rd argument) was used in
__devm_regmap_init_slimbus(). It should be just "slimbus" (which is equal
to &slimbus->dev). Correct it. The wcd934x codec seems to be the only or
the first user of devm_regmap_init_slimbus() but we should fix it till
the point where __devm_regmap_init_slimbus() was introduced therefore
two "Fixes" tags.

While at this, also correct the same argument in __regmap_init_slimbus().

Fixes: 4e65bda8273c ("ASoC: wcd934x: fix error handling in wcd934x_codec_parse_data()")
Fixes: 7d6f7fb053ad ("regmap: add SLIMbus support")
Cc: stable@vger.kernel.org
Cc: Dmitry Baryshkov 
Cc: Ma Ke 
Cc: Steev Klimaszewski 
Cc: Srinivas Kandagatla 
Reviewed-by: Abel Vesa 
Signed-off-by: Alexey Klimov 
Reviewed-by: Dmitry Baryshkov 
Link: https://patch.msgid.link/20251022201013.1740211-1-alexey.klimov@linaro.org
Signed-off-by: Mark Brown 
Signed-off-by: Greg Kroah-Hartman

devcoredump: Fix circular locking dependency with devcd->mutex.

2025-10-29T13:07:05+00:00

[ Upstream commit a91c8096590bd7801a26454789f2992094fe36da ]

The original code causes a circular locking dependency found by lockdep.

======================================================
WARNING: possible circular locking dependency detected
6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 Tainted: G S   U
------------------------------------------------------
xe_fault_inject/5091 is trying to acquire lock:
ffff888156815688 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}, at: __flush_work+0x25d/0x660

but task is already holding lock:

ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&devcd->mutex){+.+.}-{3:3}:
       mutex_lock_nested+0x4e/0xc0
       devcd_data_write+0x27/0x90
       sysfs_kf_bin_write+0x80/0xf0
       kernfs_fop_write_iter+0x169/0x220
       vfs_write+0x293/0x560
       ksys_write+0x72/0xf0
       __x64_sys_write+0x19/0x30
       x64_sys_call+0x2bf/0x2660
       do_syscall_64+0x93/0xb60
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #1 (kn->active#236){++++}-{0:0}:
       kernfs_drain+0x1e2/0x200
       __kernfs_remove+0xae/0x400
       kernfs_remove_by_name_ns+0x5d/0xc0
       remove_files+0x54/0x70
       sysfs_remove_group+0x3d/0xa0
       sysfs_remove_groups+0x2e/0x60
       device_remove_attrs+0xc7/0x100
       device_del+0x15d/0x3b0
       devcd_del+0x19/0x30
       process_one_work+0x22b/0x6f0
       worker_thread+0x1e8/0x3d0
       kthread+0x11c/0x250
       ret_from_fork+0x26c/0x2e0
       ret_from_fork_asm+0x1a/0x30
-> #0 ((work_completion)(&(&devcd->del_wk)->work)){+.+.}-{0:0}:
       __lock_acquire+0x1661/0x2860
       lock_acquire+0xc4/0x2f0
       __flush_work+0x27a/0x660
       flush_delayed_work+0x5d/0xa0
       dev_coredump_put+0x63/0xa0
       xe_driver_devcoredump_fini+0x12/0x20 [xe]
       devm_action_release+0x12/0x30
       release_nodes+0x3a/0x120
       devres_release_all+0x8a/0xd0
       device_unbind_cleanup+0x12/0x80
       device_release_driver_internal+0x23a/0x280
       device_driver_detach+0x14/0x20
       unbind_store+0xaf/0xc0
       drv_attr_store+0x21/0x50
       sysfs_kf_write+0x4a/0x80
       kernfs_fop_write_iter+0x169/0x220
       vfs_write+0x293/0x560
       ksys_write+0x72/0xf0
       __x64_sys_write+0x19/0x30
       x64_sys_call+0x2bf/0x2660
       do_syscall_64+0x93/0xb60
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
other info that might help us debug this:
Chain exists of: (work_completion)(&(&devcd->del_wk)->work) --> kn->active#236 --> &devcd->mutex
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&devcd->mutex);
                               lock(kn->active#236);
                               lock(&devcd->mutex);
  lock((work_completion)(&(&devcd->del_wk)->work));
 *** DEADLOCK ***
5 locks held by xe_fault_inject/5091:
 #0: ffff8881129f9488 (sb_writers#5){.+.+}-{0:0}, at: ksys_write+0x72/0xf0
 #1: ffff88810c755078 (&of->mutex#2){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x123/0x220
 #2: ffff8881054811a0 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x55/0x280
 #3: ffff888156815620 (&devcd->mutex){+.+.}-{3:3}, at: dev_coredump_put+0x3f/0xa0
 #4: ffffffff8359e020 (rcu_read_lock){....}-{1:2}, at: __flush_work+0x72/0x660
stack backtrace:
CPU: 14 UID: 0 PID: 5091 Comm: xe_fault_inject Tainted: G S   U              6.16.0-rc6-lgci-xe-xe-pw-151626v3+ #1 PREEMPT_{RT,(lazy)}
Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER
Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A DDR4(MS-7D25), BIOS 1.10 12/13/2021
Call Trace:
 
 dump_stack_lvl+0x91/0xf0
 dump_stack+0x10/0x20
 print_circular_bug+0x285/0x360
 check_noncircular+0x135/0x150
 ? register_lock_class+0x48/0x4a0
 __lock_acquire+0x1661/0x2860
 lock_acquire+0xc4/0x2f0
 ? __flush_work+0x25d/0x660
 ? mark_held_locks+0x46/0x90
 ? __flush_work+0x25d/0x660
 __flush_work+0x27a/0x660
 ? __flush_work+0x25d/0x660
 ? trace_hardirqs_on+0x1e/0xd0
 ? __pfx_wq_barrier_func+0x10/0x10
 flush_delayed_work+0x5d/0xa0
 dev_coredump_put+0x63/0xa0
 xe_driver_devcoredump_fini+0x12/0x20 [xe]
 devm_action_release+0x12/0x30
 release_nodes+0x3a/0x120
 devres_release_all+0x8a/0xd0
 device_unbind_cleanup+0x12/0x80
 device_release_driver_internal+0x23a/0x280
 ? bus_find_device+0xa8/0xe0
 device_driver_detach+0x14/0x20
 unbind_store+0xaf/0xc0
 drv_attr_store+0x21/0x50
 sysfs_kf_write+0x4a/0x80
 kernfs_fop_write_iter+0x169/0x220
 vfs_write+0x293/0x560
 ksys_write+0x72/0xf0
 __x64_sys_write+0x19/0x30
 x64_sys_call+0x2bf/0x2660
 do_syscall_64+0x93/0xb60
 ? __f_unlock_pos+0x15/0x20
 ? __x64_sys_getdents64+0x9b/0x130
 ? __pfx_filldir64+0x10/0x10
 ? do_syscall_64+0x1a2/0xb60
 ? clear_bhb_loop+0x30/0x80
 ? clear_bhb_loop+0x30/0x80
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x76e292edd574
Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d d5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
RSP: 002b:00007fffe247a828 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000076e292edd574
RDX: 000000000000000c RSI: 00006267f6306063 RDI: 000000000000000b
RBP: 000000000000000c R08: 000076e292fc4b20 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 00006267f6306063
R13: 000000000000000b R14: 00006267e6859c00 R15: 000076e29322a000
 
xe 0000:03:00.0: [drm] Xe device coredump has been deleted.

Fixes: 01daccf74832 ("devcoredump : Serialize devcd_del work")
Cc: Mukesh Ojha 
Cc: Greg Kroah-Hartman 
Cc: Johannes Berg 
Cc: Rafael J. Wysocki 
Cc: Danilo Krummrich 
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v6.1+
Signed-off-by: Maarten Lankhorst 
Cc: Matthew Brost 
Acked-by: Mukesh Ojha 
Link: https://lore.kernel.org/r/20250723142416.1020423-1-dev@lankhorst.se
Signed-off-by: Greg Kroah-Hartman 
[ replaced disable_delayed_work_sync() with cancel_delayed_work_sync() ]
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

arch_topology: Fix incorrect error check in topology_parse_cpu_capacity()

2025-10-29T13:07:02+00:00

commit 2eead19334516c8e9927c11b448fbe512b1f18a1 upstream.

Fix incorrect use of PTR_ERR_OR_ZERO() in topology_parse_cpu_capacity()
which causes the code to proceed with NULL clock pointers. The current
logic uses !PTR_ERR_OR_ZERO(cpu_clk) which evaluates to true for both
valid pointers and NULL, leading to potential NULL pointer dereference
in clk_get_rate().

Per include/linux/err.h documentation, PTR_ERR_OR_ZERO(ptr) returns:
"The error code within @ptr if it is an error pointer; 0 otherwise."

This means PTR_ERR_OR_ZERO() returns 0 for both valid pointers AND NULL
pointers. Therefore !PTR_ERR_OR_ZERO(cpu_clk) evaluates to true (proceed)
when cpu_clk is either valid or NULL, causing clk_get_rate(NULL) to be
called when of_clk_get() returns NULL.

Replace with !IS_ERR_OR_NULL(cpu_clk) which only proceeds for valid
pointers, preventing potential NULL pointer dereference in clk_get_rate().

Cc: stable 
Signed-off-by: Kaushlendra Kumar 
Reviewed-by: Sudeep Holla 
Fixes: b8fe128dad8f ("arch_topology: Adjust initial CPU capacities with current freq")
Link: https://patch.msgid.link/20250923174308.1771906-1-kaushlendra.kumar@intel.com
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Greg Kroah-Hartman

PM: runtime: Add new devm functions

2025-10-23T14:16:39+00:00

[ Upstream commit 73db799bf5efc5a04654bb3ff6c9bf63a0dfa473 ]

Add `devm_pm_runtime_set_active_enabled()` and
`devm_pm_runtime_get_noresume()` for simplifying
common cases in drivers.

Signed-off-by: Bence Csókás 
Link: https://patch.msgid.link/20250327195928.680771-3-csokas.bence@prolan.hu
Signed-off-by: Rafael J. Wysocki 
Stable-dep-of: 0792c1984a45 ("iio: imu: inv_icm42600: Simplify pm_runtime setup")
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

drivers/base/node: fix double free in register_one_node()

2025-10-15T09:58:06+00:00

[ Upstream commit 0efdedfa537eb534c251a5b4794caaf72cc55869 ]

When device_register() fails in register_node(), it calls
put_device(&node->dev).  This triggers node_device_release(), which calls
kfree(to_node(dev)), thereby freeing the entire node structure.

As a result, when register_node() returns an error, the node memory has
already been freed.  Calling kfree(node) again in register_one_node()
leads to a double free.

This patch removes the redundant kfree(node) from register_one_node() to
prevent the double free.

Link: https://lkml.kernel.org/r/20250918054144.58980-1-donettom@linux.ibm.com
Fixes: 786eb990cfb7 ("drivers/base/node: handle error properly in register_one_node()")
Signed-off-by: Donet Tom 
Acked-by: David Hildenbrand 
Acked-by: Oscar Salvador 
Cc: Alison Schofield 
Cc: Chris Mason 
Cc: Danilo Krummrich 
Cc: Dave Jiang 
Cc: Greg Kroah-Hartman 
Cc: Hiroyouki Kamezawa 
Cc: Joanthan Cameron 
Cc: "Ritesh Harjani (IBM)" 
Cc: Yury Norov (NVIDIA) 
Cc: Zi Yan 
Signed-off-by: Andrew Morton 
Signed-off-by: Sasha Levin

drivers/base/node: handle error properly in register_one_node()

2025-10-15T09:58:00+00:00

[ Upstream commit 786eb990cfb78aab94eb74fb32a030e14723a620 ]

If register_node() returns an error, it is not handled correctly.
The function will proceed further and try to register CPUs under the
node, which is not correct.

So, in this patch, if register_node() returns an error, we return
immediately from the function.

Link: https://lkml.kernel.org/r/20250822084845.19219-1-donettom@linux.ibm.com
Fixes: 76b67ed9dce6 ("[PATCH] node hotplug: register cpu: remove node struct")
Signed-off-by: Donet Tom 
Acked-by: David Hildenbrand 
Cc: Alison Schofield 
Cc: Danilo Krummrich 
Cc: Dave Jiang 
Cc: Donet Tom 
Cc: Greg Kroah-Hartman 
Cc: Hiroyouki Kamezawa 
Cc: Joanthan Cameron 
Cc: Oscar Salvador 
Cc: "Ritesh Harjani (IBM)" 
Cc: Yury Norov (NVIDIA) 
Cc: Zi Yan 
Signed-off-by: Andrew Morton 
Signed-off-by: Sasha Levin

PM: sleep: core: Clear power.must_resume in noirq suspend error path

2025-10-15T09:57:51+00:00

[ Upstream commit be82483d1b60baf6747884bd74cb7de484deaf76 ]

If system suspend is aborted in the "noirq" phase (for instance, due to
an error returned by one of the device callbacks), power.is_noirq_suspended
will not be set for some devices and device_resume_noirq() will return
early for them.  Consequently, noirq resume callbacks will not run for
them at all because the noirq suspend callbacks have not run for them
yet.

If any of them has power.must_resume set and late suspend has been
skipped for it (due to power.smart_suspend), early resume should be
skipped for it either, or its state may become inconsistent (for
instance, if the early resume assumes that it will always follow
noirq resume).

Make that happen by clearing power.must_resume in device_resume_noirq()
for devices with power.is_noirq_suspended clear that have been left in
suspend by device_suspend_late(), which will subsequently cause
device_resume_early() to leave the device in suspend and avoid
changing its state.

Fixes: 0d4b54c6fee8 ("PM / core: Add LEAVE_SUSPENDED driver flag")
Link: https://lore.kernel.org/linux-pm/5d692b81-6f58-4e86-9cb0-ede69a09d799@rowland.harvard.edu/
Signed-off-by: Rafael J. Wysocki 
Reviewed-by: Ulf Hansson 
Link: https://patch.msgid.link/3381776.aeNJFYEL58@rafael.j.wysocki
Signed-off-by: Sasha Levin