| Age | Commit message (Collapse) | Author | Files | Lines |
|
Link: https://lore.kernel.org/r/20251017145147.138822285@linuxfoundation.org
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Tested-by: Brett Mastbergen <bmastbergen@ciq.com>
Tested-by: Hardik Garg <hargar@linux.microsoft.com>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Brett A C Sheffield <bacs@librecast.net>
Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Tested-by: Dileep Malepu <dileep.debian@gmail.com>
Tested-by: Miguel Ojeda <ojeda@kernel.org>
Tested-by: Pascal Ernster <git@hardfalcon.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e4f574ca9c6dfa66695bb054ff5df43ecea873ec upstream.
A while back I had reported that an NFSv3 client could successfully
mount using '-o xprtsec=none' an export that had been exported with
'xprtsec=tls:mtls'. By "successfully" I mean that the mount command
would succeed and the mount would show up in /proc/mount. Attempting
to do anything futher with the mount would be met with NFS3ERR_ACCES.
This was fixed (albeit accidentally) by commit bb4f07f2409c ("nfsd:
Fix NFSD_MAY_BYPASS_GSS and NFSD_MAY_BYPASS_GSS_ON_ROOT") and was
subsequently re-broken by commit 0813c5f01249 ("nfsd: fix access
checking for NLM under XPRTSEC policies").
Transport Layer Security isn't an RPC security flavor or pseudo-flavor,
so we shouldn't be conflating them when determining whether the access
checks can be bypassed. Split check_nfsd_access() into two helpers, and
have __fh_verify() call the helpers directly since __fh_verify() has
logic that allows one or both of the checks to be skipped. All other
sites will continue to call check_nfsd_access().
Link: https://lore.kernel.org/linux-nfs/ZjO3Qwf_G87yNXb2@aion/
Fixes: 9280c5774314 ("NFSD: Handle new xprtsec= export option")
Cc: stable@vger.kernel.org
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6c7ca6a02f8f9549a438a08a23c6327580ecf3d6 upstream.
When calling in listmount() mnt_ns_release() may be passed a NULL
pointer. Handle that case gracefully.
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bace10b59624e6bd8d68bc9304357f292f1b3dcf upstream.
Assumption that chain DMA module starts the link DMA when 1ms of
data is available from host is not correct. Instead the firmware
chain DMA module fills the link DMA with initial buffer of zeroes
and the host and link DMAs are started at the same time.
This results in a small error in delay calculation. This can become a
more severe problem if host DMA has delays that exceed 1ms. This results
in negative delay to be calculated and bogus values reported to
applications. This can confuse some applications like
alsa_conformance_test.
Fix the issue by correctly calculating the firmware chain DMA
preamble size and initializing the start offset to this value.
Cc: stable@vger.kernel.org
Fixes: a1d203d390e0 ("ASoC: SOF: ipc4-pcm: Enable delay reporting for ChainDMA streams")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20251002074719.2084-3-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0813c5f01249dbc32ccbc68d27a24fde5bf2901c upstream.
When an export policy with xprtsec policy is set with "tls"
and/or "mtls", but an NFS client is doing a v3 xprtsec=tls
mount, then NLM locking calls fail with an error because
there is currently no support for NLM with TLS.
Until such support is added, allow NLM calls under TLS-secured
policy.
Fixes: 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK")
Cc: stable@vger.kernel.org
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Reviewed-by: NeilBrown <neil@brown.name>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d9d6b74e4be989f919498798fa40df37a74b5bb0 upstream.
__fh_verify() added a call to svc_xprt_set_valid() to help do connection
management but during LOCALIO path rqstp argument is NULL, leading to
NULL pointer dereferencing and a crash.
Fixes: eccbbc7c00a5 ("nfsd: don't use sv_nrthreads in connection limiting calculations.")
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f9c506fb69bdcfb9d7138281378129ff037f2aa1 upstream.
The cycles event will fallback to task-clock in the hybrid test when
running virtualized. Change the test to not fail for this.
Fixes: 65d11821910bd910 ("perf test: Add a test for default perf stat command")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241212173354.9860-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Dequeuing a fair task on a throttled hierarchy returns early on
encountering a throttled cfs_rq since the throttle path has already
dequeued the hierarchy above and has adjusted the h_nr_* accounting till
the root cfs_rq.
dequeue_entities() crucially misses calling __block_task() for delayed
tasks being dequeued on the throttled hierarchies, but this was mostly
harmless until commit b7ca5743a260 ("sched/core: Tweak
wait_task_inactive() to force dequeue sched_delayed tasks") since all
existing cases would re-enqueue the task if task_on_rq_queued() returned
true and the task would eventually be blocked at pick after the
hierarchy was unthrottled.
wait_task_inactive() is special as it expects the delayed task on
throttled hierarchy to reach the blocked state on dequeue but since
__block_task() is never called, task_on_rq_queued() continues to return
true. Furthermore, since the task is now off the hierarchy, the pick
never reaches it to fully block the task even after unthrottle leading
to wait_task_inactive() looping endlessly.
Remedy this by calling __block_task() if a delayed task is being
dequeued on a throttled hierarchy.
This fix is only required for stabled kernels implementing delay dequeue
(>= v6.12) before v6.18 since upstream commit e1fad12dcb66 ("sched/fair:
Switch to task based throttle model") indirectly fixes this by removing
the early return conditions in dequeue_entities() as part of the per-task
throttle feature.
Cc: stable@vger.kernel.org
Reported-by: Matt Fleming <matt@readmodwrite.com>
Closes: https://lore.kernel.org/all/20250925133310.1843863-1-matt@readmodwrite.com/
Fixes: b7ca5743a260 ("sched/core: Tweak wait_task_inactive() to force dequeue sched_delayed tasks")
Tested-by: Matt Fleming <mfleming@cloudflare.com>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9a6ebbdbd41235ea3bc0c4f39e2076599b8113cc ]
With lazytime mount option enabled we can be switching many dirty inodes
on cgroup exit to the parent cgroup. The numbers observed in practice
when systemd slice of a large cron job exits can easily reach hundreds
of thousands or millions. The logic in inode_do_switch_wbs() which sorts
the inode into appropriate place in b_dirty list of the target wb
however has linear complexity in the number of dirty inodes thus overall
time complexity of switching all the inodes is quadratic leading to
workers being pegged for hours consuming 100% of the CPU and switching
inodes to the parent wb.
Simple reproducer of the issue:
FILES=10000
# Filesystem mounted with lazytime mount option
MNT=/mnt/
echo "Creating files and switching timestamps"
for (( j = 0; j < 50; j ++ )); do
mkdir $MNT/dir$j
for (( i = 0; i < $FILES; i++ )); do
echo "foo" >$MNT/dir$j/file$i
done
touch -a -t 202501010000 $MNT/dir$j/file*
done
wait
echo "Syncing and flushing"
sync
echo 3 >/proc/sys/vm/drop_caches
echo "Reading all files from a cgroup"
mkdir /sys/fs/cgroup/unified/mycg1 || exit
echo $$ >/sys/fs/cgroup/unified/mycg1/cgroup.procs || exit
for (( j = 0; j < 50; j ++ )); do
cat /mnt/dir$j/file* >/dev/null &
done
wait
echo "Switching wbs"
# Now rmdir the cgroup after the script exits
We need to maintain b_dirty list ordering to keep writeback happy so
instead of sorting inode into appropriate place just append it at the
end of the list and clobber dirtied_time_when. This may result in inode
writeback starting later after cgroup switch however cgroup switches are
rare so it shouldn't matter much. Since the cgroup had write access to
the inode, there are no practical concerns of the possible DoS issues.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 66c14dccd810d42ec5c73bb8a9177489dfd62278 ]
process_inode_switch_wbs_work() can be switching over 100 inodes to a
different cgroup. Since switching an inode requires counting all dirty &
under-writeback pages in the address space of each inode, this can take
a significant amount of time. Add a possibility to reschedule after
processing each inode to avoid softlockups.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7f9d34b0a7cb93d678ee7207f0634dbf79e47fe5 ]
The inode mode loaded from corrupted disk can be invalid. Do like what
commit 0a9e74051313 ("isofs: Verify inode mode when loading from disk")
does.
Reported-by: syzbot <syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: https://lore.kernel.org/429b3ef1-13de-4310-9a8e-c2dc9a36234a@I-love.SAKURA.ne.jp
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 278033a225e13ec21900f0a92b8351658f5377f2 ]
When CONFIG_TMPFS is enabled, the initial root filesystem is a tmpfs.
By default, a tmpfs mount is limited to using 50% of the available RAM
for its content. This can be problematic in memory-constrained
environments, particularly during a kdump capture.
In a kdump scenario, the capture kernel boots with a limited amount of
memory specified by the 'crashkernel' parameter. If the initramfs is
large, it may fail to unpack into the tmpfs rootfs due to insufficient
space. This is because to get X MB of usable space in tmpfs, 2*X MB of
memory must be available for the mount. This leads to an OOM failure
during the early boot process, preventing a successful crash dump.
This patch introduces a new kernel command-line parameter,
initramfs_options, which allows passing specific mount options directly
to the rootfs when it is first mounted. This gives users control over
the rootfs behavior.
For example, a user can now specify initramfs_options=size=75% to allow
the tmpfs to use up to 75% of the available memory. This can
significantly reduce the memory pressure for kdump.
Consider a practical example:
To unpack a 48MB initramfs, the tmpfs needs 48MB of usable space. With
the default 50% limit, this requires a memory pool of 96MB to be
available for the tmpfs mount. The total memory requirement is therefore
approximately: 16MB (vmlinuz) + 48MB (loaded initramfs) + 48MB (unpacked
kernel) + 96MB (for tmpfs) + 12MB (runtime overhead) ≈ 220MB.
By using initramfs_options=size=75%, the memory pool required for the
48MB tmpfs is reduced to 48MB / 0.75 = 64MB. This reduces the total
memory requirement by 32MB (96MB - 64MB), allowing the kdump to succeed
with a smaller crashkernel size, such as 192MB.
An alternative approach of reusing the existing rootflags parameter was
considered. However, a new, dedicated initramfs_options parameter was
chosen to avoid altering the current behavior of rootflags (which
applies to the final root filesystem) and to prevent any potential
regressions.
Also add documentation for the new kernel parameter "initramfs_options"
This approach is inspired by prior discussions and patches on the topic.
Ref: https://www.lightofdawn.org/blog/?viewDetailed=00128
Ref: https://landley.net/notes-2015.html#01-01-2015
Ref: https://lkml.org/lkml/2021/6/29/783
Ref: https://www.kernel.org/doc/html/latest/filesystems/ramfs-rootfs-initramfs.html#what-is-rootfs
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Link: https://lore.kernel.org/20250815121459.3391223-1-lichliu@redhat.com
Tested-by: Rob Landley <rob@landley.net>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 006568ab4c5ca2309ceb36fa553e390b4aa9c0c7 ]
__task_pid_nr_ns
ns = task_active_pid_ns(current);
pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns);
if (pid && ns->level <= pid->level) {
Sometimes null is returned for task_active_pid_ns. Then it will trigger kernel panic in pid_nr_ns.
For example:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058
Mem abort info:
ESR = 0x0000000096000007
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x07: level 3 translation fault
Data abort info:
ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=00000002175aa000
[0000000000000058] pgd=08000002175ab003, p4d=08000002175ab003, pud=08000002175ab003, pmd=08000002175be003, pte=0000000000000000
pstate: 834000c5 (Nzcv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
pc : __task_pid_nr_ns+0x74/0xd0
lr : __task_pid_nr_ns+0x24/0xd0
sp : ffffffc08001bd10
x29: ffffffc08001bd10 x28: ffffffd4422b2000 x27: 0000000000000001
x26: ffffffd442821168 x25: ffffffd442821000 x24: 00000f89492eab31
x23: 00000000000000c0 x22: ffffff806f5693c0 x21: ffffff806f5693c0
x20: 0000000000000001 x19: 0000000000000000 x18: 0000000000000000
x17: 00000000529c6ef0 x16: 00000000529c6ef0 x15: 00000000023a1adc
x14: 0000000000000003 x13: 00000000007ef6d8 x12: 001167c391c78800
x11: 00ffffffffffffff x10: 0000000000000000 x9 : 0000000000000001
x8 : ffffff80816fa3c0 x7 : 0000000000000000 x6 : 49534d702d535449
x5 : ffffffc080c4c2c0 x4 : ffffffd43ee128c8 x3 : ffffffd43ee124dc
x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffffff806f5693c0
Call trace:
__task_pid_nr_ns+0x74/0xd0
...
__handle_irq_event_percpu+0xd4/0x284
handle_irq_event+0x48/0xb0
handle_fasteoi_irq+0x160/0x2d8
generic_handle_domain_irq+0x44/0x60
gic_handle_irq+0x4c/0x114
call_on_irq_stack+0x3c/0x74
do_interrupt_handler+0x4c/0x84
el1_interrupt+0x34/0x58
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x68/0x6c
account_kernel_stack+0x60/0x144
exit_task_stack_account+0x1c/0x80
do_exit+0x7e4/0xaf8
...
get_signal+0x7bc/0x8d8
do_notify_resume+0x128/0x828
el0_svc+0x6c/0x70
el0t_64_sync_handler+0x68/0xbc
el0t_64_sync+0x1a8/0x1ac
Code: 35fffe54 911a02a8 f9400108 b4000128 (b9405a69)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception in interrupt
Signed-off-by: gaoxiang17 <gaoxiang17@xiaomi.com>
Link: https://lore.kernel.org/20250802022123.3536934-1-gxxa03070307@gmail.com
Reviewed-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 73861970938ad1323eb02bbbc87f6fbd1e5bacca ]
The inode mode loaded from corrupted disk can be invalid. Do like what
commit 0a9e74051313 ("isofs: Verify inode mode when loading from disk")
does.
Reported-by: syzbot <syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Link: https://lore.kernel.org/ec982681-84b8-4624-94fa-8af15b77cbd2@I-love.SAKURA.ne.jp
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f8f59a2c05dc16d19432e3154a9ac7bc385f4b92 ]
If the process runs in 32-bit compat mode, copy_file_range results can be
in the in-band error range. In this case limit copy length to MAX_RW_COUNT
to prevent a signed overflow.
Reported-by: Florian Weimer <fweimer@redhat.com>
Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Link: https://lore.kernel.org/20250813151107.99856-1-mszeredi@redhat.com
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f75e07bf5226da640fa99a0594687c780d9bace4 ]
According to the PLIC specification[1], global interrupt sources are
assigned small unsigned integer identifiers beginning at the value 1.
An interrupt ID of 0 is reserved to mean "no interrupt".
The current plic_irq_resume() and plic_irq_suspend() functions incorrectly
start the loop from index 0, which accesses the register space for the
reserved interrupt ID 0.
Change the loop to start from index 1, skipping the reserved
interrupt ID 0 as per the PLIC specification.
This prevents potential undefined behavior when accessing the reserved
register space during suspend/resume cycles.
Fixes: e80f0b6a2cf3 ("irqchip/irq-sifive-plic: Add syscore callbacks for hibernation")
Co-developed-by: Jia Wang <wangjia@ultrarisc.com>
Signed-off-by: Jia Wang <wangjia@ultrarisc.com>
Co-developed-by: Charles Mirabile <cmirabil@redhat.com>
Signed-off-by: Charles Mirabile <cmirabil@redhat.com>
Signed-off-by: Lucas Zampieri <lzampier@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://github.com/riscv/riscv-plic-spec/releases/tag/1.0.0
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 40d7af5375a4e27d8576d9d11954ac213d06f09e ]
Replace the open coded
if (foo)
__set_bit(n, bar);
else
__clear_bit(n, bar);
with __assign_bit(). No functional change intended.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/all/20240902130824.2878644-1-lihongbo22@huawei.com
Stable-dep-of: f75e07bf5226 ("irqchip/sifive-plic: Avoid interrupt ID 0 handling during suspend/resume")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit bc3905a71f02511607d3ccf732360580209cac4c upstream.
The tailcall_bpf2bpf_hierarchy_fentry test hangs on s390. Its call
graph is as follows:
entry()
subprog_tail()
trampoline()
fentry()
the rest of subprog_tail() # via BPF_TRAMP_F_CALL_ORIG
return to entry()
The problem is that the rest of subprog_tail() increments the tail call
counter, but the trampoline discards the incremented value. This
results in an astronomically large number of tail calls.
Fix by making the trampoline write the incremented tail call counter
back.
Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250813121016.163375-4-iii@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c861a6b147137d10b5ff88a2c492ba376cd1b8b0 upstream.
The tailcall_bpf2bpf_hierarchy_1 test hangs on s390. Its call graph is
as follows:
entry()
subprog_tail()
bpf_tail_call_static(0) -> entry + tail_call_start
subprog_tail()
bpf_tail_call_static(0) -> entry + tail_call_start
entry() copies its tail call counter to the subprog_tail()'s frame,
which then increments it. However, the incremented result is discarded,
leading to an astronomically large number of tail calls.
Fix by writing the incremented counter back to the entry()'s frame.
Fixes: dd691e847d28 ("s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250813121016.163375-3-iii@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e26d523edf2a62b142d2dd2dd9b87f61ed92f33a upstream.
Currently the caller-allocated portion of the stack frame is described
using constants, hardcoded values, and an ASCII drawing, making it
harder than necessary to ensure that everything is in sync.
Declare a struct and use offsetof() and offsetofend() macros to refer
to various values stored within the frame.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250624121501.50536-3-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b2268d550d20ff860bddfe3a91b1aec00414689a upstream.
The calculation of the distance from %r15 to the caller-allocated
portion of the stack frame is copy-pasted into multiple places in the
JIT code.
Move it to bpf_jit_prog() and save the result into bpf_jit::frame_off,
so that the other parts of the JIT can use it.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20250624121501.50536-2-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
subpage to shared zeropage
commit 9658d698a8a83540bf6a6c80d13c9a61590ee985 upstream.
When splitting an mTHP and replacing a zero-filled subpage with the shared
zeropage, try_to_map_unused_to_zeropage() currently drops several
important PTE bits.
For userspace tools like CRIU, which rely on the soft-dirty mechanism for
incremental snapshots, losing the soft-dirty bit means modified pages are
missed, leading to inconsistent memory state after restore.
As pointed out by David, the more critical uffd-wp bit is also dropped.
This breaks the userfaultfd write-protection mechanism, causing writes to
be silently missed by monitoring applications, which can lead to data
corruption.
Preserve both the soft-dirty and uffd-wp bits from the old PTE when
creating the new zeropage mapping to ensure they are correctly tracked.
Link: https://lkml.kernel.org/r/20250930081040.80926-1-lance.yang@linux.dev
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Suggested-by: David Hildenbrand <david@redhat.com>
Suggested-by: Dev Jain <dev.jain@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e2c69490dda5d4c9f1bfbb2898989c8f3530e354 upstream
Prior to commit b52da4054ee0 ("ipmi: Rework user message limit handling"),
i_ipmi_request() used to increase the user reference counter if the receive
message is provided by the caller of IPMI API functions. This is no longer
the case. However, ipmi_free_recv_msg() is still called and decreases the
reference counter. This results in the reference counter reaching zero,
the user data pointer is released, and all kinds of interesting crashes are
seen.
Fix the problem by increasing user reference counter if the receive message
has been provided by the caller.
Fixes: b52da4054ee0 ("ipmi: Rework user message limit handling")
Reported-by: Eric Dumazet <edumazet@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20251006201857.3433837-1-linux@roeck-us.net>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b52da4054ee0bf9ecb44996f2c83236ff50b3812 upstream
This patch required quite a bit of work to backport due to a number
of unrelated changes that do not make sense to backport. This has
been run against my test suite and passes all tests.
The limit on the number of user messages had a number of issues,
improper counting in some cases and a use after free.
Restructure how this is all done to handle more in the receive message
allocation routine, so all refcouting and user message limit counts
are done in that routine. It's a lot cleaner and safer.
Reported-by: Gilles BULOZ <gilles.buloz@kontron.com>
Closes: https://lore.kernel.org/lkml/aLsw6G0GyqfpKs2S@mail.minyard.net/
Fixes: 8e76741c3d8b ("ipmi: Add a limit on the number of users that may use IPMI")
Cc: <stable@vger.kernel.org> # 4.19
Signed-off-by: Corey Minyard <corey@minyard.net>
Tested-by: Gilles BULOZ <gilles.buloz@kontron.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 4b1ff850e0c1aacc23e923ed22989b827b9808f9 upstream.
When servers set the C-flag in their MP_CAPABLE to tell clients not to
create subflows to the initial address and port, clients will likely not
use their other endpoints. That's because the in-kernel path-manager
uses the 'subflow' endpoints to create subflows only to the initial
address and port.
If the limits have not been modified to accept ADD_ADDR, the client
doesn't try to establish new subflows. If the limits accept ADD_ADDR,
the routing routes will be used to select the source IP.
The C-flag is typically set when the server is operating behind a legacy
Layer 4 load balancer, or using anycast IP address. Clients having their
different 'subflow' endpoints setup, don't end up creating multiple
subflows as expected, and causing some deployment issues.
A special case is then added here: when servers set the C-flag in the
MPC and directly sends an ADD_ADDR, this single ADD_ADDR is accepted.
The 'subflows' endpoints will then be used with this new remote IP and
port. This exception is only allowed when the ADD_ADDR is sent
immediately after the 3WHS, and makes the client switching to the 'fully
established' mode. After that, 'select_local_address()' will not be able
to find any subflows, because 'id_avail_bitmap' will be filled in
mptcp_pm_create_subflow_or_signal_addr(), when switching to 'fully
established' mode.
Fixes: df377be38725 ("mptcp: add deny_join_id0 in mptcp_options_received")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/536
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-1-ad126cc47c6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ Conflict in pm.c, because commit 498d7d8b75f1 ("mptcp: pm: remove
'_nl' from mptcp_pm_nl_is_init_remote_addr") renamed an helper in the
context, and it is not in this version. The same new code can be
applied at the same place.
Conflict in pm_kernel.c, because the modified code has been moved from
pm_netlink.c to pm_kernel.c in commit 8617e85e04bd ("mptcp: pm: split
in-kernel PM specific code"), which is not in this version. The
resolution is easy: simply by applying the patch where 'pm_kernel.c'
has been replaced 'pm_netlink.c'. 'patch --merge' managed to apply
this modified patch without creating any conflicts. ]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit baf60d5cb8bc6b85511c5df5f0ad7620bb66d23c ]
In certain circumstances, the ACPI handle of a data-only node may be
NULL, in which case it does not make sense to attempt to attach that
node to an ACPI namespace object, so update the code to avoid attempts
to do so.
This prevents confusing and unuseful error messages from being printed.
Also document the fact that the ACPI handle of a data-only node may be
NULL and when that happens in a code comment. In addition, make
acpi_add_nondev_subnodes() print a diagnostic message for each data-only
node with an unknown ACPI namespace scope.
Fixes: 1d52f10917a7 ("ACPI: property: Tie data nodes to acpi handles")
Cc: 6.0+ <stable@vger.kernel.org> # 6.0+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 737c3a09dcf69ba2814f3674947ccaec1861c985 ]
In some places in the ACPI device properties handling code, it is
unclear why the code is what it is. Some assumptions are not documented
and some pieces of code are based on knowledge that is not mentioned
anywhere.
Add code comments explaining these things.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Stable-dep-of: baf60d5cb8bc ("ACPI: property: Do not pass NULL handles to acpi_attach_data()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d06118fe9b03426484980ed4c189a8c7b99fa631 ]
Data-only subnode links following the ACPI data subnode GUID in a _DSD
package are expected to point to named objects returning _DSD-equivalent
packages. If a reference to such an object is used in the target field
of any of those links, that object will be evaluated in place (as a
named object) and its return data will be embedded in the outer _DSD
package.
For this reason, it is not expected to see a subnode link with the
target field containing a local reference (that would mean pointing
to a device or another object that cannot be evaluated in place and
therefore cannot return a _DSD-equivalent package).
Accordingly, simplify the code parsing data-only subnode links to
simply print a message when it encounters a local reference in the
target field of one of those links.
Moreover, since acpi_nondev_subnode_data_ok() would only have one
caller after the change above, fold it into that caller.
Link: https://lore.kernel.org/linux-acpi/CAJZ5v0jVeSrDO6hrZhKgRZrH=FpGD4vNUjFD8hV9WwN9TLHjzQ@mail.gmail.com/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Stable-dep-of: baf60d5cb8bc ("ACPI: property: Do not pass NULL handles to acpi_attach_data()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 399dbcadc01ebf0035f325eaa8c264f8b5cd0a14 ]
There is no synchronization between different code paths in the ACPI
battery driver that update its sysfs interface or its power supply
class device interface. In some cases this results to functional
failures due to race conditions.
One example of this is when two ACPI notifications:
- ACPI_BATTERY_NOTIFY_STATUS (0x80)
- ACPI_BATTERY_NOTIFY_INFO (0x81)
are triggered (by the platform firmware) in a row with a little delay
in between after removing and reinserting a laptop battery. Both
notifications cause acpi_battery_update() to be called and if the delay
between them is sufficiently small, sysfs_add_battery() can be re-entered
before battery->bat is set which leads to a duplicate sysfs entry error:
sysfs: cannot create duplicate filename '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0A:00/power_supply/BAT1'
CPU: 1 UID: 0 PID: 185 Comm: kworker/1:4 Kdump: loaded Not tainted 6.12.38+deb13-amd64 #1 Debian 6.12.38-1
Hardware name: Gateway NV44 /SJV40-MV , BIOS V1.3121 04/08/2009
Workqueue: kacpi_notify acpi_os_execute_deferred
Call Trace:
<TASK>
dump_stack_lvl+0x5d/0x80
sysfs_warn_dup.cold+0x17/0x23
sysfs_create_dir_ns+0xce/0xe0
kobject_add_internal+0xba/0x250
kobject_add+0x96/0xc0
? get_device_parent+0xde/0x1e0
device_add+0xe2/0x870
__power_supply_register.part.0+0x20f/0x3f0
? wake_up_q+0x4e/0x90
sysfs_add_battery+0xa4/0x1d0 [battery]
acpi_battery_update+0x19e/0x290 [battery]
acpi_battery_notify+0x50/0x120 [battery]
acpi_ev_notify_dispatch+0x49/0x70
acpi_os_execute_deferred+0x1a/0x30
process_one_work+0x177/0x330
worker_thread+0x251/0x390
? __pfx_worker_thread+0x10/0x10
kthread+0xd2/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x34/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
kobject: kobject_add_internal failed for BAT1 with -EEXIST, don't try to register things with the same name in the same directory.
There are also other scenarios in which analogous issues may occur.
Address this by using a common lock in all of the code paths leading
to updates of driver interfaces: ACPI Notify () handler, system resume
callback and post-resume notification, device addition and removal.
This new lock replaces sysfs_lock that has been used only in
sysfs_remove_battery() which now is going to be always called under
the new lock, so it doesn't need any internal locking any more.
Fixes: 10666251554c ("ACPI: battery: Install Notify() handler directly")
Closes: https://lore.kernel.org/linux-acpi/20250910142653.313360-1-luogf2025@163.com/
Reported-by: GuangFei Luo <luogf2025@163.com>
Tested-by: GuangFei Luo <luogf2025@163.com>
Cc: 6.6+ <stable@vger.kernel.org> # 6.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 815daedc318b2f9f1b956d0631377619a0d69d96 ]
Even if it's not critical, the avoidance of checking the error code
from devm_mutex_init() call today diminishes the point of using devm
variant of it. Tomorrow it may even leak something. Add the missed
check.
Fixes: 0710c1ce5045 ("ACPI: battery: initialize mutexes through devm_ APIs")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20241030162754.2110946-1-andriy.shevchenko@linux.intel.com
[ rjw: Added 2 empty code lines ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: 399dbcadc01e ("ACPI: battery: Add synchronization between interface updates")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0710c1ce50455ed0db91bffa0eebbaa4f69b1773 ]
Simplify the cleanup logic a bit.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20240904-acpi-battery-cleanups-v1-3-a3bf74f22d40@weissschuh.net
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: 399dbcadc01e ("ACPI: battery: Add synchronization between interface updates")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 909dfc60692331e1599d5e28a8f08a611f353aef ]
Simplify the cleanup logic a bit.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20240904-acpi-battery-cleanups-v1-2-a3bf74f22d40@weissschuh.net
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: 399dbcadc01e ("ACPI: battery: Add synchronization between interface updates")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 898374fdd7f06fa4c4a66e8be3135efeae6128d5 ]
When a listener is added, a part of creation of transport also registers
program/port with rpcbind. However, when the listener is removed,
while transport goes away, rpcbind still has the entry for that
port/type.
When deleting the transport, unregister with rpcbind when appropriate.
---v2 created a new xpt_flag XPT_RPCB_UNREG to mark TCP and UDP
transport and at xprt destroy send rpcbind unregister if flag set.
Suggested-by: Chuck Lever <chuck.lever@oracle.com>
Fixes: d093c9089260 ("nfsd: fix management of listener transports")
Cc: stable@vger.kernel.org
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit eccbbc7c00a5aae5e704d4002adfaf4c3fa4b30d ]
The heuristic for limiting the number of incoming connections to nfsd
currently uses sv_nrthreads - allowing more connections if more threads
were configured.
A future patch will allow number of threads to grow dynamically so that
there will be no need to configure sv_nrthreads. So we need a different
solution for limiting connections.
It isn't clear what problem is solved by limiting connections (as
mentioned in a code comment) but the most likely problem is a connection
storm - many connections that are not doing productive work. These will
be closed after about 6 minutes already but it might help to slow down a
storm.
This patch adds a per-connection flag XPT_PEER_VALID which indicates
that the peer has presented a filehandle for which it has some sort of
access. i.e the peer is known to be trusted in some way. We now only
count connections which have NOT been determined to be valid. There
should be relative few of these at any given time.
If the number of non-validated peer exceed a limit - currently 64 - we
close the oldest non-validated peer to avoid having too many of these
useless connections.
Note that this patch significantly changes the meaning of the various
configuration parameters for "max connections". The next patch will
remove all of these.
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stable-dep-of: 898374fdd7f0 ("nfsd: unregister with rpcbind when deleting a transport")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4cc9b9f2bf4dfe13fe573da978e626e2248df388 ]
NFSD_MAY_LOCK means a few different things.
- it means that GSS is not required.
- it means that with NFSEXP_NOAUTHNLM, authentication is not required
- it means that OWNER_OVERRIDE is allowed.
None of these are specific to locking, they are specific to the NLM
protocol.
So:
- rename to NFSD_MAY_NLM
- set NFSD_MAY_OWNER_OVERRIDE and NFSD_MAY_BYPASS_GSS in nlm_fopen()
so that NFSD_MAY_NLM doesn't need to imply these.
- move the test on NFSEXP_NOAUTHNLM out of nfsd_permission() and
into fh_verify where other special-case tests on the MAY flags
happen. nfsd_permission() can be called from other places than
fh_verify(), but none of these will have NFSD_MAY_NLM.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stable-dep-of: 898374fdd7f0 ("nfsd: unregister with rpcbind when deleting a transport")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6640556b0c80edc66d6f50abe53f00311a873536 ]
NFSv4 LOCK operations should not avoid the set of authorization
checks that apply to all other NFSv4 operations. Also, the
"no_auth_nlm" export option should apply only to NLM LOCK requests.
It's not necessary or sensible to apply it to NFSv4 LOCK operations.
Instead, set no permission bits when calling fh_verify(). Subsequent
stateid processing handles authorization checks.
Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stable-dep-of: 898374fdd7f0 ("nfsd: unregister with rpcbind when deleting a transport")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit bb4f07f2409c26c01e97e6f9b432545f353e3b66 ]
Currently NFSD_MAY_BYPASS_GSS and NFSD_MAY_BYPASS_GSS_ON_ROOT do not bypass
only GSS, but bypass any method. This is a problem specially for NFS3
AUTH_NULL-only exports.
The purpose of NFSD_MAY_BYPASS_GSS_ON_ROOT is described in RFC 2623,
section 2.3.2, to allow mounting NFS2/3 GSS-only export without
authentication. So few procedures which do not expose security risk used
during mount time can be called also with AUTH_NONE or AUTH_SYS, to allow
client mount operation to finish successfully.
The problem with current implementation is that for AUTH_NULL-only exports,
the NFSD_MAY_BYPASS_GSS_ON_ROOT is active also for NFS3 AUTH_UNIX mount
attempts which confuse NFS3 clients, and make them think that AUTH_UNIX is
enabled and is working. Linux NFS3 client never switches from AUTH_UNIX to
AUTH_NONE on active mount, which makes the mount inaccessible.
Fix the NFSD_MAY_BYPASS_GSS and NFSD_MAY_BYPASS_GSS_ON_ROOT implementation
and really allow to bypass only exports which have enabled some real
authentication (GSS, TLS, or any other).
The result would be: For AUTH_NULL-only export if client attempts to do
mount with AUTH_UNIX flavor then it will receive access errors, which
instruct client that AUTH_UNIX flavor is not usable and will either try
other auth flavor (AUTH_NULL if enabled) or fails mount procedure.
Similarly if client attempt to do mount with AUTH_NULL flavor and only
AUTH_UNIX flavor is enabled then the client will receive access error.
This should fix problems with AUTH_NULL-only or AUTH_UNIX-only exports if
client attempts to mount it with other auth flavor (e.g. with AUTH_NULL for
AUTH_UNIX-only export, or with AUTH_UNIX for AUTH_NULL-only export).
Signed-off-by: Pali Rohár <pali@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stable-dep-of: 898374fdd7f0 ("nfsd: unregister with rpcbind when deleting a transport")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0dccbc75e18df85399a71933d60b97494110f559 ]
When running as an SNP or TDX guest under KVM, force the legacy PCI hole,
i.e. memory between Top of Lower Usable DRAM and 4GiB, to be mapped as UC
via a forced variable MTRR range.
In most KVM-based setups, legacy devices such as the HPET and TPM are
enumerated via ACPI. ACPI enumeration includes a Memory32Fixed entry, and
optionally a SystemMemory descriptor for an OperationRegion, e.g. if the
device needs to be accessed via a Control Method.
If a SystemMemory entry is present, then the kernel's ACPI driver will
auto-ioremap the region so that it can be accessed at will. However, the
ACPI spec doesn't provide a way to enumerate the memory type of
SystemMemory regions, i.e. there's no way to tell software that a region
must be mapped as UC vs. WB, etc. As a result, Linux's ACPI driver always
maps SystemMemory regions using ioremap_cache(), i.e. as WB on x86.
The dedicated device drivers however, e.g. the HPET driver and TPM driver,
want to map their associated memory as UC or WC, as accessing PCI devices
using WB is unsupported.
On bare metal and non-CoCO, the conflicting requirements "work" as firmware
configures the PCI hole (and other device memory) to be UC in the MTRRs.
So even though the ACPI mappings request WB, they are forced to UC- in the
kernel's tracking due to the kernel properly handling the MTRR overrides,
and thus are compatible with the drivers' requested WC/UC-.
With force WB MTRRs on SNP and TDX guests, the ACPI mappings get their
requested WB if the ACPI mappings are established before the dedicated
driver code attempts to initialize the device. E.g. if acpi_init()
runs before the corresponding device driver is probed, ACPI's WB mapping
will "win", and result in the driver's ioremap() failing because the
existing WB mapping isn't compatible with the requested WC/UC-.
E.g. when a TPM is emulated by the hypervisor (ignoring the security
implications of relying on what is allegedly an untrusted entity to store
measurements), the TPM driver will request UC and fail:
[ 1.730459] ioremap error for 0xfed40000-0xfed45000, requested 0x2, got 0x0
[ 1.732780] tpm_tis MSFT0101:00: probe with driver tpm_tis failed with error -12
Note, the '0x2' and '0x0' values refer to "enum page_cache_mode", not x86's
memtypes (which frustratingly are an almost pure inversion; 2 == WB, 0 == UC).
E.g. tracing mapping requests for TPM TIS yields:
Mapping TPM TIS with req_type = 0
WARNING: CPU: 22 PID: 1 at arch/x86/mm/pat/memtype.c:530 memtype_reserve+0x2ab/0x460
Modules linked in:
CPU: 22 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.16.0-rc7+ #2 VOLUNTARY
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/29/2025
RIP: 0010:memtype_reserve+0x2ab/0x460
__ioremap_caller+0x16d/0x3d0
ioremap_cache+0x17/0x30
x86_acpi_os_ioremap+0xe/0x20
acpi_os_map_iomem+0x1f3/0x240
acpi_os_map_memory+0xe/0x20
acpi_ex_system_memory_space_handler+0x273/0x440
acpi_ev_address_space_dispatch+0x176/0x4c0
acpi_ex_access_region+0x2ad/0x530
acpi_ex_field_datum_io+0xa2/0x4f0
acpi_ex_extract_from_field+0x296/0x3e0
acpi_ex_read_data_from_field+0xd1/0x460
acpi_ex_resolve_node_to_value+0x2ee/0x530
acpi_ex_resolve_to_value+0x1f2/0x540
acpi_ds_evaluate_name_path+0x11b/0x190
acpi_ds_exec_end_op+0x456/0x960
acpi_ps_parse_loop+0x27a/0xa50
acpi_ps_parse_aml+0x226/0x600
acpi_ps_execute_method+0x172/0x3e0
acpi_ns_evaluate+0x175/0x5f0
acpi_evaluate_object+0x213/0x490
acpi_evaluate_integer+0x6d/0x140
acpi_bus_get_status+0x93/0x150
acpi_add_single_object+0x43a/0x7c0
acpi_bus_check_add+0x149/0x3a0
acpi_bus_check_add_1+0x16/0x30
acpi_ns_walk_namespace+0x22c/0x360
acpi_walk_namespace+0x15c/0x170
acpi_bus_scan+0x1dd/0x200
acpi_scan_init+0xe5/0x2b0
acpi_init+0x264/0x5b0
do_one_initcall+0x5a/0x310
kernel_init_freeable+0x34f/0x4f0
kernel_init+0x1b/0x200
ret_from_fork+0x186/0x1b0
ret_from_fork_asm+0x1a/0x30
</TASK>
The above traces are from a Google-VMM based VM, but the same behavior
happens with a QEMU based VM that is modified to add a SystemMemory range
for the TPM TIS address space.
The only reason this doesn't cause problems for HPET, which appears to
require a SystemMemory region, is because HPET gets special treatment via
x86_init.timers.timer_init(), and so gets a chance to create its UC-
mapping before acpi_init() clobbers things. Disabling the early call to
hpet_time_init() yields the same behavior for HPET:
[ 0.318264] ioremap error for 0xfed00000-0xfed01000, requested 0x2, got 0x0
Hack around the ACPI gap by forcing the legacy PCI hole to UC when
overriding the (virtual) MTRRs for CoCo guest, so that ioremap handling
of MTRRs naturally kicks in and forces the ACPI mappings to be UC.
Note, the requested/mapped memtype doesn't actually matter in terms of
accessing the device. In practically every setup, legacy PCI devices are
emulated by the hypervisor, and accesses are intercepted and handled as
emulated MMIO, i.e. never access physical memory and thus don't have an
effective memtype.
Even in a theoretical setup where such devices are passed through by the
host, i.e. point at real MMIO memory, it is KVM's (as the hypervisor)
responsibility to force the memory to be WC/UC, e.g. via EPT memtype
under TDX or real hardware MTRRs under SNP. Not doing so cannot work,
and the hypervisor is highly motivated to do the right thing as letting
the guest access hardware MMIO with WB would likely result in a variety
of fatal #MCs.
In other words, forcing the range to be UC is all about coercing the
kernel's tracking into thinking that it has established UC mappings, so
that the ioremap code doesn't reject mappings from e.g. the TPM driver and
thus prevent the driver from loading and the device from functioning.
Note #2, relying on guest firmware to handle this scenario, e.g. by setting
virtual MTRRs and then consuming them in Linux, is not a viable option, as
the virtual MTRR state is managed by the untrusted hypervisor, and because
OVMF at least has stopped programming virtual MTRRs when running as a TDX
guest.
Link: https://lore.kernel.org/all/8137d98e-8825-415b-9282-1d2a115bb51a@linux.intel.com
Fixes: 8e690b817e38 ("x86/kvm: Override default caching mode for SEV-SNP and TDX")
Cc: stable@vger.kernel.org
Cc: Peter Gonda <pgonda@google.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Jürgen Groß <jgross@suse.com>
Cc: Korakit Seemakhupt <korakit@google.com>
Cc: Jianxiong Gao <jxgao@google.com>
Cc: Nikolay Borisov <nik.borisov@suse.com>
Suggested-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Korakit Seemakhupt <korakit@google.com>
Link: https://lore.kernel.org/r/20250828005249.39339-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6a5abeea9c72e1d2c538622b4cf66c80cc816fd3 ]
Rename the helper to better reflect its function.
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lore.kernel.org/all/20241202073139.448208-1-kirill.shutemov%40linux.intel.com
Stable-dep-of: 0dccbc75e18d ("x86/kvm: Force legacy PCI hole to UC when overriding MTRRs for TDX/SNP")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f620d66af3165838bfa845dcf9f5f9b4089bf508 ]
Commit 68d54ceeec0e ("arm64: mte: Allow PTRACE_PEEKMTETAGS access to the
zero page") attempted to fix ptrace() reading of tags from the zero page
by marking it as PG_mte_tagged during cpu_enable_mte(). The same commit
also changed the ptrace() tag access permission check to the VM_MTE vma
flag while turning the page flag test into a WARN_ON_ONCE().
Attempting to set the PG_mte_tagged flag early with
CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled may either hang (after commit
d77e59a8fccd "arm64: mte: Lock a page for MTE tag initialisation") or
have the flags cleared later during page_alloc_init_late(). In addition,
pages_identical() -> memcmp_pages() will reject any comparison with the
zero page as it is marked as tagged.
Partially revert the above commit to avoid setting PG_mte_tagged on the
zero page. Update the __access_remote_tags() warning on untagged pages
to ignore the zero page since it is known to have the tags initialised.
Note that all user mapping of the zero page are marked as pte_special().
The arm64 set_pte_at() will not call mte_sync_tags() on such pages, so
PG_mte_tagged will remain cleared.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 68d54ceeec0e ("arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page")
Reported-by: Gergely Kovacs <Gergely.Kovacs2@arm.com>
Cc: stable@vger.kernel.org # 5.10.x
Cc: Will Deacon <will@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lance Yang <lance.yang@linux.dev>
Acked-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: David Hildenbrand <david@redhat.com>
Tested-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Will Deacon <will@kernel.org>
[ Adjust context ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit e8c84e2082e69335f66c8ade4895e80ec270d7c4 ]
Massage statmount() and make sure we don't call path_put() under the
namespace semaphore. If we put the last reference we're fscked.
Fixes: 46eae99ef733 ("add statmount(2) syscall")
Cc: stable@vger.kernel.org # v6.8+
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 716f86b523d8ec3c17015ee0b03135c7aa6f2f08 ]
SRSO_USER_KERNEL_NO denotes whether the CPU is affected by SRSO across
user/kernel boundaries. Advertise it to guest userspace.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://lore.kernel.org/r/20241202120416.6054-3-bp@kernel.org
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f97aef092e199c10a3da96ae79b571edd5362faa ]
Commit a755d0e2d41b ("cpufreq: Honour transition_latency over
transition_delay_us") caused platforms where cpuinfo.transition_latency
is CPUFREQ_ETERNAL to get a very large transition latency whereas
previously it had been capped at 10 ms (and later at 2 ms).
This led to a user-observable regression between 6.6 and 6.12 as
described by Shawn:
"The dbs sampling_rate was 10000 us on 6.6 and suddently becomes
6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these platforms
because the default transition delay was dropped [...].
It slows down dbs governor's reacting to CPU loading change
dramatically. Also, as transition_delay_us is used by schedutil
governor as rate_limit_us, it shows a negative impact on device
idle power consumption, because the device gets slightly less time
in the lowest OPP."
Evidently, the expectation of the drivers using CPUFREQ_ETERNAL as
cpuinfo.transition_latency was that it would be capped by the core,
but they may as well return a default transition latency value instead
of CPUFREQ_ETERNAL and the core need not do anything with it.
Accordingly, introduce CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS and make
all of the drivers in question use it instead of CPUFREQ_ETERNAL. Also
update the related Rust binding.
Fixes: a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us")
Closes: https://lore.kernel.org/linux-pm/20250922125929.453444-1-shawnguo2@yeah.net/
Reported-by: Shawn Guo <shawnguo@kernel.org>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 6.6+ <stable@vger.kernel.org> # 6.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2264949.irdbgypaU6@rafael.j.wysocki
[ rjw: Fix typo in new symbol name, drop redundant type cast from Rust binding ]
Tested-by: Shawn Guo <shawnguo@kernel.org> # with cpufreq-dt driver
Reviewed-by: Qais Yousef <qyousef@layalina.io>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ omitted Rust changes ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7b26da407420e5054e3f06c5d13271697add9423 ]
[BUG]
With my local branch to enable bs > ps support for btrfs, sometimes I
hit the following ASSERT() inside submit_one_sector():
ASSERT(block_start != EXTENT_MAP_HOLE);
Please note that it's not yet possible to hit this ASSERT() in the wild
yet, as it requires btrfs bs > ps support, which is not even in the
development branch.
But on the other hand, there is also a very low chance to hit above
ASSERT() with bs < ps cases, so this is an existing bug affect not only
the incoming bs > ps support but also the existing bs < ps support.
[CAUSE]
Firstly that ASSERT() means we're trying to submit a dirty block but
without a real extent map nor ordered extent map backing it.
Furthermore with extra debugging, the folio triggering such ASSERT() is
always larger than the fs block size in my bs > ps case.
(8K block size, 4K page size)
After some more debugging, the ASSERT() is trigger by the following
sequence:
extent_writepage()
| We got a 32K folio (4 fs blocks) at file offset 0, and the fs block
| size is 8K, page size is 4K.
| And there is another 8K folio at file offset 32K, which is also
| dirty.
| So the filemap layout looks like the following:
|
| "||" is the filio boundary in the filemap.
| "//| is the dirty range.
|
| 0 8K 16K 24K 32K 40K
| |////////| |//////////////////////||////////|
|
|- writepage_delalloc()
| |- find_lock_delalloc_range() for [0, 8K)
| | Now range [0, 8K) is properly locked.
| |
| |- find_lock_delalloc_range() for [16K, 40K)
| | |- btrfs_find_delalloc_range() returned range [16K, 40K)
| | |- lock_delalloc_folios() locked folio 0 successfully
| | |
| | | The filemap range [32K, 40K) got dropped from filemap.
| | |
| | |- lock_delalloc_folios() failed with -EAGAIN on folio 32K
| | | As the folio at 32K is dropped.
| | |
| | |- loops = 1;
| | |- max_bytes = PAGE_SIZE;
| | |- goto again;
| | | This will re-do the lookup for dirty delalloc ranges.
| | |
| | |- btrfs_find_delalloc_range() called with @max_bytes == 4K
| | | This is smaller than block size, so
| | | btrfs_find_delalloc_range() is unable to return any range.
| | \- return false;
| |
| \- Now only range [0, 8K) has an OE for it, but for dirty range
| [16K, 32K) it's dirty without an OE.
| This breaks the assumption that writepage_delalloc() will find
| and lock all dirty ranges inside the folio.
|
|- extent_writepage_io()
|- submit_one_sector() for [0, 8K)
| Succeeded
|
|- submit_one_sector() for [16K, 24K)
Triggering the ASSERT(), as there is no OE, and the original
extent map is a hole.
Please note that, this also exposed the same problem for bs < ps
support. E.g. with 64K page size and 4K block size.
If we failed to lock a folio, and falls back into the "loops = 1;"
branch, we will re-do the search using 64K as max_bytes.
Which may fail again to lock the next folio, and exit early without
handling all dirty blocks inside the folio.
[FIX]
Instead of using the fixed size PAGE_SIZE as @max_bytes, use
@sectorsize, so that we are ensured to find and lock any remaining
blocks inside the folio.
And since we're here, add an extra ASSERT() to
before calling btrfs_find_delalloc_range() to make sure the @max_bytes is
at least no smaller than a block to avoid false negative.
Cc: stable@vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9 ]
Testing has shown that reading multiple registers at once (for 10-bit
ADC values) does not work. Set the use_single_read regmap_config flag
to make regmap split these for us.
This should fix temperature opregion accesses done by
drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for
the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC")
Cc: stable@vger.kernel.org
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Hans de Goede <hansg@kernel.org>
Link: https://lore.kernel.org/r/20250804133240.312383-1-hansg@kernel.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9eb99c08508714906db078b5efbe075329a3fb06 ]
REGCACHE_NONE is the default type of the cache when not provided.
Drop unneeded explicit assignment to it.
Note, it's defined to 0, and if ever be redefined, it will break
literally a lot of the drivers, so it very unlikely to happen.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20250129152823.1802273-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Lee Jones <lee@kernel.org>
Stable-dep-of: 64e0d839c589 ("mfd: intel_soc_pmic_chtdc_ti: Set use_single_read regmap_config flag")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 70e997e0107e5ed85c1a3ef2adfccbe351c29d71 ]
The max_register = 128 setting in the regmap config is not valid.
The Intel Dollar Cove TI PMIC has an eeprom unlock register at address 0x88
and a number of EEPROM registers at 0xF?. Increase max_register to 0xff so
that these registers can be accessed.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Link: https://lore.kernel.org/r/20241208150028.325349-1-hdegoede@redhat.com
Signed-off-by: Lee Jones <lee@kernel.org>
Stable-dep-of: 64e0d839c589 ("mfd: intel_soc_pmic_chtdc_ti: Set use_single_read regmap_config flag")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit bcd1383516bb5a6f72b2d1e7f7ad42c4a14837d1 ]
When the sampling rates going in (host) and out (dai) from the DSP
are different, the IPC4 delay reporting does not work correctly.
Add support for this case by scaling the all raw position values to
a common timebase before calculating real-time delay for the PCM.
Cc: stable@vger.kernel.org
Fixes: 0ea06680dfcb ("ASoC: SOF: ipc4-pcm: Correct the delay calculation")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20251002074719.2084-2-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a1d203d390e04798ccc1c3c06019cd4411885d6d ]
All streams (currently) which is configured to use ChainDMA can only work
on Link/host DMA pairs where the link side position can be access via host
registers (like HDA on CAVS 2.5 platforms).
Since the firmware does not provide time_info for ChainDMA, unlike for HDA
stream, the kernel should calculate the start and end offsets that is
needed for the delay calculation.
With this small change we can report accurate delays when the stream is
configured to use ChainDMA.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://patch.msgid.link/20250619102848.12389-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: bcd1383516bb ("ASoC: SOF: ipc4-pcm: fix delay calculation when DSP resamples")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 85afa9ea122dd9d4a2ead104a951d318975dcd25 ]
The fields dma_chan_tx and dma_chan_rx of the struct pci_epf_test can be
NULL even after EPF initialization. Then it is prudent to check that
they have non-NULL values before releasing the channels. Add the checks
in pci_epf_test_clean_dma_chan().
Without the checks, NULL pointer dereferences happen and they can lead
to a kernel panic in some cases:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
Call trace:
dma_release_channel+0x2c/0x120 (P)
pci_epf_test_epc_deinit+0x94/0xc0 [pci_epf_test]
pci_epc_deinit_notify+0x74/0xc0
tegra_pcie_ep_pex_rst_irq+0x250/0x5d8
irq_thread_fn+0x34/0xb8
irq_thread+0x18c/0x2e8
kthread+0x14c/0x210
ret_from_fork+0x10/0x20
Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Fixes: 5ebf3fc59bd2 ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
[mani: trimmed the stack trace]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250916025756.34807-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|