summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2025-10-19Linux 6.17.4v6.17.4Greg Kroah-Hartman1-1/+1
Link: https://lore.kernel.org/r/20251017145201.780251198@linuxfoundation.org Tested-by: Ronald Warsow <rwarsow@gmx.de> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Hardik Garg <hargar@linux.microsoft.com> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Brett A C Sheffield <bacs@librecast.net> Tested-by: Dileep Malepu <dileep.debian@gmail.com> Tested-by: Markus Reichelt <lkt+2023@mareichelt.com> Tested-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Pascal Ernster <git@hardfalcon.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19mount: handle NULL values in mnt_ns_release()Christian Brauner1-1/+1
[ Upstream commit 6c7ca6a02f8f9549a438a08a23c6327580ecf3d6 ] When calling in listmount() mnt_ns_release() may be passed a NULL pointer. Handle that case gracefully. Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19pidfs: validate extensible ioctlsChristian Brauner2-1/+15
[ Upstream commit 3c17001b21b9f168c957ced9384abe969019b609 ] Validate extensible ioctls stricter than we do now. Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19iomap: error out on file IO when there is no inline_data bufferDarrick J. Wong2-5/+13
[ Upstream commit 6a96fb653b6481ec73e9627ade216b299e4de9ea ] Return IO errors if an ->iomap_begin implementation returns an IOMAP_INLINE buffer but forgets to set the inline_data pointer. Filesystems should never do this, but we could help fs developers (me) fix their bugs by handling this more gracefully than crashing the kernel. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/175803480324.966383.7414345025943296442.stgit@frogsfrogsfrogs Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19writeback: Avoid excessively long inode switching timesJan Kara1-10/+11
[ Upstream commit 9a6ebbdbd41235ea3bc0c4f39e2076599b8113cc ] With lazytime mount option enabled we can be switching many dirty inodes on cgroup exit to the parent cgroup. The numbers observed in practice when systemd slice of a large cron job exits can easily reach hundreds of thousands or millions. The logic in inode_do_switch_wbs() which sorts the inode into appropriate place in b_dirty list of the target wb however has linear complexity in the number of dirty inodes thus overall time complexity of switching all the inodes is quadratic leading to workers being pegged for hours consuming 100% of the CPU and switching inodes to the parent wb. Simple reproducer of the issue: FILES=10000 # Filesystem mounted with lazytime mount option MNT=/mnt/ echo "Creating files and switching timestamps" for (( j = 0; j < 50; j ++ )); do mkdir $MNT/dir$j for (( i = 0; i < $FILES; i++ )); do echo "foo" >$MNT/dir$j/file$i done touch -a -t 202501010000 $MNT/dir$j/file* done wait echo "Syncing and flushing" sync echo 3 >/proc/sys/vm/drop_caches echo "Reading all files from a cgroup" mkdir /sys/fs/cgroup/unified/mycg1 || exit echo $$ >/sys/fs/cgroup/unified/mycg1/cgroup.procs || exit for (( j = 0; j < 50; j ++ )); do cat /mnt/dir$j/file* >/dev/null & done wait echo "Switching wbs" # Now rmdir the cgroup after the script exits We need to maintain b_dirty list ordering to keep writeback happy so instead of sorting inode into appropriate place just append it at the end of the list and clobber dirtied_time_when. This may result in inode writeback starting later after cgroup switch however cgroup switches are rare so it shouldn't matter much. Since the cgroup had write access to the inode, there are no practical concerns of the possible DoS issues. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19writeback: Avoid softlockup when switching many inodesJan Kara1-1/+10
[ Upstream commit 66c14dccd810d42ec5c73bb8a9177489dfd62278 ] process_inode_switch_wbs_work() can be switching over 100 inodes to a different cgroup. Since switching an inode requires counting all dirty & under-writeback pages in the address space of each inode, this can take a significant amount of time. Add a possibility to reschedule after processing each inode to avoid softlockups. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19mnt_ns_tree_remove(): DTRT if mnt_ns had never been added to mnt_ns_listAl Viro1-1/+1
[ Upstream commit 38f4885088fc5ad41b8b0a2a2cfc73d01e709e5c ] Actual removal is done under the lock, but for checking if need to bother the lockless RB_EMPTY_NODE() is safe - either that namespace had never been added to mnt_ns_tree, in which case the the node will stay empty, or whoever had allocated it has called mnt_ns_tree_add() and it has already run to completion. After that point RB_EMPTY_NODE() will become false and will remain false, no matter what we do with other nodes in the tree. Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19nsfs: validate extensible ioctlsChristian Brauner1-1/+3
[ Upstream commit f8527a29f4619f74bc30a9845ea87abb9a6faa1e ] Validate extensible ioctls stricter than we do now. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19cramfs: Verify inode mode when loading from diskTetsuo Handa1-1/+10
[ Upstream commit 7f9d34b0a7cb93d678ee7207f0634dbf79e47fe5 ] The inode mode loaded from corrupted disk can be invalid. Do like what commit 0a9e74051313 ("isofs: Verify inode mode when loading from disk") does. Reported-by: syzbot <syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: https://lore.kernel.org/429b3ef1-13de-4310-9a8e-c2dc9a36234a@I-love.SAKURA.ne.jp Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19fs: Add 'initramfs_options' to set initramfs mount optionsLichen Liu2-1/+13
[ Upstream commit 278033a225e13ec21900f0a92b8351658f5377f2 ] When CONFIG_TMPFS is enabled, the initial root filesystem is a tmpfs. By default, a tmpfs mount is limited to using 50% of the available RAM for its content. This can be problematic in memory-constrained environments, particularly during a kdump capture. In a kdump scenario, the capture kernel boots with a limited amount of memory specified by the 'crashkernel' parameter. If the initramfs is large, it may fail to unpack into the tmpfs rootfs due to insufficient space. This is because to get X MB of usable space in tmpfs, 2*X MB of memory must be available for the mount. This leads to an OOM failure during the early boot process, preventing a successful crash dump. This patch introduces a new kernel command-line parameter, initramfs_options, which allows passing specific mount options directly to the rootfs when it is first mounted. This gives users control over the rootfs behavior. For example, a user can now specify initramfs_options=size=75% to allow the tmpfs to use up to 75% of the available memory. This can significantly reduce the memory pressure for kdump. Consider a practical example: To unpack a 48MB initramfs, the tmpfs needs 48MB of usable space. With the default 50% limit, this requires a memory pool of 96MB to be available for the tmpfs mount. The total memory requirement is therefore approximately: 16MB (vmlinuz) + 48MB (loaded initramfs) + 48MB (unpacked kernel) + 96MB (for tmpfs) + 12MB (runtime overhead) ≈ 220MB. By using initramfs_options=size=75%, the memory pool required for the 48MB tmpfs is reduced to 48MB / 0.75 = 64MB. This reduces the total memory requirement by 32MB (96MB - 64MB), allowing the kdump to succeed with a smaller crashkernel size, such as 192MB. An alternative approach of reusing the existing rootflags parameter was considered. However, a new, dedicated initramfs_options parameter was chosen to avoid altering the current behavior of rootflags (which applies to the final root filesystem) and to prevent any potential regressions. Also add documentation for the new kernel parameter "initramfs_options" This approach is inspired by prior discussions and patches on the topic. Ref: https://www.lightofdawn.org/blog/?viewDetailed=00128 Ref: https://landley.net/notes-2015.html#01-01-2015 Ref: https://lkml.org/lkml/2021/6/29/783 Ref: https://www.kernel.org/doc/html/latest/filesystems/ramfs-rootfs-initramfs.html#what-is-rootfs Signed-off-by: Lichen Liu <lichliu@redhat.com> Link: https://lore.kernel.org/20250815121459.3391223-1-lichliu@redhat.com Tested-by: Rob Landley <rob@landley.net> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19pid: Add a judgment for ns null in pid_nr_nsgaoxiang171-1/+1
[ Upstream commit 006568ab4c5ca2309ceb36fa553e390b4aa9c0c7 ] __task_pid_nr_ns ns = task_active_pid_ns(current); pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns); if (pid && ns->level <= pid->level) { Sometimes null is returned for task_active_pid_ns. Then it will trigger kernel panic in pid_nr_ns. For example: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058 Mem abort info: ESR = 0x0000000096000007 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x07: level 3 translation fault Data abort info: ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 39-bit VAs, pgdp=00000002175aa000 [0000000000000058] pgd=08000002175ab003, p4d=08000002175ab003, pud=08000002175ab003, pmd=08000002175be003, pte=0000000000000000 pstate: 834000c5 (Nzcv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--) pc : __task_pid_nr_ns+0x74/0xd0 lr : __task_pid_nr_ns+0x24/0xd0 sp : ffffffc08001bd10 x29: ffffffc08001bd10 x28: ffffffd4422b2000 x27: 0000000000000001 x26: ffffffd442821168 x25: ffffffd442821000 x24: 00000f89492eab31 x23: 00000000000000c0 x22: ffffff806f5693c0 x21: ffffff806f5693c0 x20: 0000000000000001 x19: 0000000000000000 x18: 0000000000000000 x17: 00000000529c6ef0 x16: 00000000529c6ef0 x15: 00000000023a1adc x14: 0000000000000003 x13: 00000000007ef6d8 x12: 001167c391c78800 x11: 00ffffffffffffff x10: 0000000000000000 x9 : 0000000000000001 x8 : ffffff80816fa3c0 x7 : 0000000000000000 x6 : 49534d702d535449 x5 : ffffffc080c4c2c0 x4 : ffffffd43ee128c8 x3 : ffffffd43ee124dc x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffffff806f5693c0 Call trace: __task_pid_nr_ns+0x74/0xd0 ... __handle_irq_event_percpu+0xd4/0x284 handle_irq_event+0x48/0xb0 handle_fasteoi_irq+0x160/0x2d8 generic_handle_domain_irq+0x44/0x60 gic_handle_irq+0x4c/0x114 call_on_irq_stack+0x3c/0x74 do_interrupt_handler+0x4c/0x84 el1_interrupt+0x34/0x58 el1h_64_irq_handler+0x18/0x24 el1h_64_irq+0x68/0x6c account_kernel_stack+0x60/0x144 exit_task_stack_account+0x1c/0x80 do_exit+0x7e4/0xaf8 ... get_signal+0x7bc/0x8d8 do_notify_resume+0x128/0x828 el0_svc+0x6c/0x70 el0t_64_sync_handler+0x68/0xbc el0t_64_sync+0x1a8/0x1ac Code: 35fffe54 911a02a8 f9400108 b4000128 (b9405a69) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt Signed-off-by: gaoxiang17 <gaoxiang17@xiaomi.com> Link: https://lore.kernel.org/20250802022123.3536934-1-gxxa03070307@gmail.com Reviewed-by: Baoquan He <bhe@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19minixfs: Verify inode mode when loading from diskTetsuo Handa1-1/+7
[ Upstream commit 73861970938ad1323eb02bbbc87f6fbd1e5bacca ] The inode mode loaded from corrupted disk can be invalid. Do like what commit 0a9e74051313 ("isofs: Verify inode mode when loading from disk") does. Reported-by: syzbot <syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: https://lore.kernel.org/ec982681-84b8-4624-94fa-8af15b77cbd2@I-love.SAKURA.ne.jp Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19copy_file_range: limit size if in compat modeMiklos Szeredi1-5/+9
[ Upstream commit f8f59a2c05dc16d19432e3154a9ac7bc385f4b92 ] If the process runs in 32-bit compat mode, copy_file_range results can be in the in-band error range. In this case limit copy length to MAX_RW_COUNT to prevent a signed overflow. Reported-by: Florian Weimer <fweimer@redhat.com> Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/20250813151107.99856-1-mszeredi@redhat.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19irqchip/sifive-plic: Avoid interrupt ID 0 handling during suspend/resumeLucas Zampieri1-2/+4
[ Upstream commit f75e07bf5226da640fa99a0594687c780d9bace4 ] According to the PLIC specification[1], global interrupt sources are assigned small unsigned integer identifiers beginning at the value 1. An interrupt ID of 0 is reserved to mean "no interrupt". The current plic_irq_resume() and plic_irq_suspend() functions incorrectly start the loop from index 0, which accesses the register space for the reserved interrupt ID 0. Change the loop to start from index 1, skipping the reserved interrupt ID 0 as per the PLIC specification. This prevents potential undefined behavior when accessing the reserved register space during suspend/resume cycles. Fixes: e80f0b6a2cf3 ("irqchip/irq-sifive-plic: Add syscore callbacks for hibernation") Co-developed-by: Jia Wang <wangjia@ultrarisc.com> Signed-off-by: Jia Wang <wangjia@ultrarisc.com> Co-developed-by: Charles Mirabile <cmirabil@redhat.com> Signed-off-by: Charles Mirabile <cmirabil@redhat.com> Signed-off-by: Lucas Zampieri <lzampier@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://github.com/riscv/riscv-plic-spec/releases/tag/1.0.0 Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-19ACPI: property: Do not pass NULL handles to acpi_attach_data()Rafael J. Wysocki1-0/+12
[ Upstream commit baf60d5cb8bc6b85511c5df5f0ad7620bb66d23c ] In certain circumstances, the ACPI handle of a data-only node may be NULL, in which case it does not make sense to attempt to attach that node to an ACPI namespace object, so update the code to avoid attempts to do so. This prevents confusing and unuseful error messages from being printed. Also document the fact that the ACPI handle of a data-only node may be NULL and when that happens in a code comment. In addition, make acpi_add_nondev_subnodes() print a diagnostic message for each data-only node with an unknown ACPI namespace scope. Fixes: 1d52f10917a7 ("ACPI: property: Tie data nodes to acpi handles") Cc: 6.0+ <stable@vger.kernel.org> # 6.0+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ACPI: property: Add code comments explaining what is going onRafael J. Wysocki1-2/+44
[ Upstream commit 737c3a09dcf69ba2814f3674947ccaec1861c985 ] In some places in the ACPI device properties handling code, it is unclear why the code is what it is. Some assumptions are not documented and some pieces of code are based on knowledge that is not mentioned anywhere. Add code comments explaining these things. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com> Stable-dep-of: baf60d5cb8bc ("ACPI: property: Do not pass NULL handles to acpi_attach_data()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ACPI: property: Disregard references in data-only subnode listsRafael J. Wysocki1-29/+22
[ Upstream commit d06118fe9b03426484980ed4c189a8c7b99fa631 ] Data-only subnode links following the ACPI data subnode GUID in a _DSD package are expected to point to named objects returning _DSD-equivalent packages. If a reference to such an object is used in the target field of any of those links, that object will be evaluated in place (as a named object) and its return data will be embedded in the outer _DSD package. For this reason, it is not expected to see a subnode link with the target field containing a local reference (that would mean pointing to a device or another object that cannot be evaluated in place and therefore cannot return a _DSD-equivalent package). Accordingly, simplify the code parsing data-only subnode links to simply print a message when it encounters a local reference in the target field of one of those links. Moreover, since acpi_nondev_subnode_data_ok() would only have one caller after the change above, fold it into that caller. Link: https://lore.kernel.org/linux-acpi/CAJZ5v0jVeSrDO6hrZhKgRZrH=FpGD4vNUjFD8hV9WwN9TLHjzQ@mail.gmail.com/ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com> Stable-dep-of: baf60d5cb8bc ("ACPI: property: Do not pass NULL handles to acpi_attach_data()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19arm64: dts: qcom: qcs615: add missing dt property in QUP SEsViken Dadhaniya1-0/+6
[ Upstream commit 6a5e9b9738a32229e2673d4eccfcbfe2ef3a1ab4 ] Add the missing required-opps and operating-points-v2 properties to several I2C, SPI, and UART nodes in the QUP SEs. Fixes: f6746dc9e379 ("arm64: dts: qcom: qcs615: Add QUPv3 configuration") Cc: stable@vger.kernel.org Signed-off-by: Viken Dadhaniya <viken.dadhaniya@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250630064338.2487409-1-viken.dadhaniya@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: mc: Clear minor number before put deviceEdward Adam Davis1-5/+1
[ Upstream commit 8cfc8cec1b4da88a47c243a11f384baefd092a50 ] The device minor should not be cleared after the device is released. Fixes: 9e14868dc952 ("media: mc: Clear minor number reservation at unregistration time") Cc: stable@vger.kernel.org Reported-by: syzbot+031d0cfd7c362817963f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=031d0cfd7c362817963f Tested-by: syzbot+031d0cfd7c362817963f@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> [ moved clear_bit from media_devnode_release callback to media_devnode_unregister before put_device ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19mm/ksm: fix incorrect KSM counter handling in mm_struct during forkDonet Tom1-1/+7
[ Upstream commit 4d6fc29f36341d7795db1d1819b4c15fe9be7b23 ] Patch series "mm/ksm: Fix incorrect accounting of KSM counters during fork", v3. The first patch in this series fixes the incorrect accounting of KSM counters such as ksm_merging_pages, ksm_rmap_items, and the global ksm_zero_pages during fork. The following patch add a selftest to verify the ksm_merging_pages counter was updated correctly during fork. Test Results ============ Without the first patch ----------------------- # [RUN] test_fork_ksm_merging_page_count not ok 10 ksm_merging_page in child: 32 With the first patch -------------------- # [RUN] test_fork_ksm_merging_page_count ok 10 ksm_merging_pages is not inherited after fork This patch (of 2): Currently, the KSM-related counters in `mm_struct`, such as `ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are inherited by the child process during fork. This results in inconsistent accounting. When a process uses KSM, identical pages are merged and an rmap item is created for each merged page. The `ksm_merging_pages` and `ksm_rmap_items` counters are updated accordingly. However, after a fork, these counters are copied to the child while the corresponding rmap items are not. As a result, when the child later triggers an unmerge, there are no rmap items present in the child, so the counters remain stale, leading to incorrect accounting. A similar issue exists with `ksm_zero_pages`, which maintains both a global counter and a per-process counter. During fork, the per-process counter is inherited by the child, but the global counter is not incremented. Since the child also references zero pages, the global counter should be updated as well. Otherwise, during zero-page unmerge, both the global and per-process counters are decremented, causing the global counter to become inconsistent. To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0 during fork, and the global ksm_zero_pages counter is updated with the per-process ksm_zero_pages value inherited by the child. This ensures that KSM statistics remain accurate and reflect the activity of each process correctly. Link: https://lkml.kernel.org/r/cover.1758648700.git.donettom@linux.ibm.com Link: https://lkml.kernel.org/r/7b9870eb67ccc0d79593940d9dbd4a0b39b5d396.1758648700.git.donettom@linux.ibm.com Fixes: 7609385337a4 ("ksm: count ksm merging pages for each process") Fixes: cb4df4cae4f2 ("ksm: count allocated ksm rmap_items for each process") Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM") Signed-off-by: Donet Tom <donettom@linux.ibm.com> Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Cc: Aboorva Devarajan <aboorvad@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Donet Tom <donettom@linux.ibm.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: <stable@vger.kernel.org> [6.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ replaced mm_flags_test() calls with test_bit() ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19Squashfs: reject negative file sizes in squashfs_read_inode()Phillip Lougher1-0/+4
[ Upstream commit 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b ] Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs. This warning is ultimately caused because the underlying Squashfs file system returns a file with a negative file size. This commit checks for a negative file size and returns EINVAL. [phillip@squashfs.org.uk: only need to check 64 bit quantity] Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk Fixes: 6545b246a2c8 ("Squashfs: inode operations") Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Reported-by: syzbot+f754e01116421e9754b9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/ Cc: Amir Goldstein <amir73il@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19Squashfs: add additional inode sanity checkingPhillip Lougher1-2/+18
[ Upstream commit 9ee94bfbe930a1b39df53fa2d7b31141b780eb5a ] Patch series "Squashfs: performance improvement and a sanity check". This patchset adds an additional sanity check when reading regular file inodes, and adds support for SEEK_DATA/SEEK_HOLE lseek() whence values. This patch (of 2): Add an additional sanity check when reading regular file inodes. A regular file if the file size is an exact multiple of the filesystem block size cannot have a fragment. This is because by definition a fragment block stores tailends which are not a whole block in size. Link: https://lkml.kernel.org/r/20250923220652.568416-1-phillip@squashfs.org.uk Link: https://lkml.kernel.org/r/20250923220652.568416-2-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 9f1c14c1de1b ("Squashfs: reject negative file sizes in squashfs_read_inode()") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ipmi: Fix handling of messages with provided receive message pointerGuenter Roeck1-1/+4
commit e2c69490dda5d4c9f1bfbb2898989c8f3530e354 upstream. Prior to commit b52da4054ee0 ("ipmi: Rework user message limit handling"), i_ipmi_request() used to increase the user reference counter if the receive message is provided by the caller of IPMI API functions. This is no longer the case. However, ipmi_free_recv_msg() is still called and decreases the reference counter. This results in the reference counter reaching zero, the user data pointer is released, and all kinds of interesting crashes are seen. Fix the problem by increasing user reference counter if the receive message has been provided by the caller. Fixes: b52da4054ee0 ("ipmi: Rework user message limit handling") Reported-by: Eric Dumazet <edumazet@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Thelen <gthelen@google.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Message-ID: <20251006201857.3433837-1-linux@roeck-us.net> Signed-off-by: Corey Minyard <corey@minyard.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ext4: free orphan info with kvfreeJan Kara1-2/+2
commit 971843c511c3c2f6eda96c6b03442913bfee6148 upstream. Orphan info is now getting allocated with kvmalloc_array(). Free it with kvfree() instead of kfree() to avoid complaints from mm. Reported-by: Chris Mason <clm@meta.com> Fixes: 0a6ce20c1564 ("ext4: verify orphan file size is not too big") Cc: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Message-ID: <20251007134936.7291-2-jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ACPICA: Allow to skip Global Lock initializationHuacai Chen2-0/+10
commit feb8ae81b2378b75a99c81d315602ac8918ed382 upstream. Introduce acpi_gbl_use_global_lock, which allows to skip the Global Lock initialization. This is useful for systems without Global Lock (such as loong_arch), so as to avoid error messages during boot phase: ACPI Error: Could not enable global_lock event (20240827/evxfevnt-182) ACPI Error: No response from Global Lock hardware, disabling lock (20240827/evglock-59) Link: https://github.com/acpica/acpica/commit/463cb0fe Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ext4: validate ea_ino and size in check_xattrsDeepanshu Kartikey1-0/+4
commit 44d2a72f4d64655f906ba47a5e108733f59e6f28 upstream. During xattr block validation, check_xattrs() processes xattr entries without validating that entries claiming to use EA inodes have non-zero sizes. Corrupted filesystems may contain xattr entries where e_value_size is zero but e_value_inum is non-zero, indicating invalid xattr data. Add validation in check_xattrs() to detect this corruption pattern early and return -EFSCORRUPTED, preventing invalid xattr entries from causing issues throughout the ext4 codebase. Cc: stable@kernel.org Suggested-by: Theodore Ts'o <tytso@mit.edu> Reported-by: syzbot+4c9d23743a2409b80293@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=4c9d23743a2409b80293 Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Message-ID: <20250923133245.1091761-1-kartikey406@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ext4: guard against EA inode refcount underflow in xattr updateAhmet Eray Karadag1-7/+8
commit 57295e835408d8d425bef58da5253465db3d6888 upstream. syzkaller found a path where ext4_xattr_inode_update_ref() reads an EA inode refcount that is already <= 0 and then applies ref_change (often -1). That lets the refcount underflow and we proceed with a bogus value, triggering errors like: EXT4-fs error: EA inode <n> ref underflow: ref_count=-1 ref_change=-1 EXT4-fs warning: ea_inode dec ref err=-117 Make the invariant explicit: if the current refcount is non-positive, treat this as on-disk corruption, emit ext4_error_inode(), and fail the operation with -EFSCORRUPTED instead of updating the refcount. Delete the WARN_ONCE() as negative refcounts are now impossible; keep error reporting in ext4_error_inode(). This prevents the underflow and the follow-on orphan/cleanup churn. Reported-by: syzbot+0be4f339a8218d2a5bb1@syzkaller.appspotmail.com Fixes: https://syzbot.org/bug?extid=0be4f339a8218d2a5bb1 Cc: stable@kernel.org Co-developed-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Ahmet Eray Karadag <eraykrdg1@gmail.com> Message-ID: <20250920021342.45575-1-eraykrdg1@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ext4: fix an off-by-one issue during moving extentsZhang Yi1-1/+1
commit 12e803c8827d049ae8f2c743ef66ab87ae898375 upstream. During the movement of a written extent, mext_page_mkuptodate() is called to read data in the range [from, to) into the page cache and to update the corresponding buffers. Therefore, we should not wait on any buffer whose start offset is >= 'to'. Otherwise, it will return -EIO and fail the extents movement. $ for i in `seq 3 -1 0`; \ do xfs_io -fs -c "pwrite -b 1024 $((i * 1024)) 1024" /mnt/foo; \ done $ umount /mnt && mount /dev/pmem1s /mnt # drop cache $ e4defrag /mnt/foo e4defrag 1.47.0 (5-Feb-2023) ext4 defragmentation for /mnt/foo [1/1]/mnt/foo: 0% [ NG ] Success: [0/1] Cc: stable@kernel.org Fixes: a40759fb16ae ("ext4: remove array of buffer_heads from mext_page_mkuptodate()") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Message-ID: <20250912105841.1886799-1-yi.zhang@huaweicloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ext4: avoid potential buffer over-read in parse_apply_sb_mount_options()Theodore Ts'o1-12/+5
commit 8ecb790ea8c3fc69e77bace57f14cf0d7c177bd8 upstream. Unlike other strings in the ext4 superblock, we rely on tune2fs to make sure s_mount_opts is NUL terminated. Harden parse_apply_sb_mount_options() by treating s_mount_opts as a potential __nonstring. Cc: stable@vger.kernel.org Fixes: 8b67f04ab9de ("ext4: Add mount options in superblock") Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Message-ID: <20250916-tune2fs-v2-1-d594dc7486f0@mit.edu> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ext4: correctly handle queries for metadata mappingsOjaswin Mujoo1-5/+9
commit 46c22a8bb4cb03211da1100d7ee4a2005bf77c70 upstream. Currently, our handling of metadata is _ambiguous_ in some scenarios, that is, we end up returning unknown if the range only covers the mapping partially. For example, in the following case: $ xfs_io -c fsmap -d 0: 254:16 [0..7]: static fs metadata 8 1: 254:16 [8..15]: special 102:1 8 2: 254:16 [16..5127]: special 102:2 5112 3: 254:16 [5128..5255]: special 102:3 128 4: 254:16 [5256..5383]: special 102:4 128 5: 254:16 [5384..70919]: inodes 65536 6: 254:16 [70920..70967]: unknown 48 ... $ xfs_io -c fsmap -d 24 33 0: 254:16 [24..39]: unknown 16 <--- incomplete reporting $ xfs_io -c fsmap -d 24 33 (With patch) 0: 254:16 [16..5127]: special 102:2 5112 This is because earlier in ext4_getfsmap_meta_helper, we end up ignoring any extent that starts before our queried range, but overlaps it. While the man page [1] is a bit ambiguous on this, this fix makes the output make more sense since we are anyways returning an "unknown" extent. This is also consistent to how XFS does it: $ xfs_io -c fsmap -d ... 6: 254:16 [104..127]: free space 24 7: 254:16 [128..191]: inodes 64 ... $ xfs_io -c fsmap -d 137 150 0: 254:16 [128..191]: inodes 64 <-- full extent returned [1] https://man7.org/linux/man-pages/man2/ioctl_getfsmap.2.html Reported-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: stable@kernel.org Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Message-ID: <023f37e35ee280cd9baac0296cbadcbe10995cab.1757058211.git.ojaswin@linux.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ext4: increase i_disksize to offset + len in ext4_update_disksize_before_punch()Yongjian Sun1-2/+8
commit 9d80eaa1a1d37539224982b76c9ceeee736510b9 upstream. After running a stress test combined with fault injection, we performed fsck -a followed by fsck -fn on the filesystem image. During the second pass, fsck -fn reported: Inode 131512, end of extent exceeds allowed value (logical block 405, physical block 1180540, len 2) This inode was not in the orphan list. Analysis revealed the following call chain that leads to the inconsistency: ext4_da_write_end() //does not update i_disksize ext4_punch_hole() //truncate folio, keep size ext4_page_mkwrite() ext4_block_page_mkwrite() ext4_block_write_begin() ext4_get_block() //insert written extent without update i_disksize journal commit echo 1 > /sys/block/xxx/device/delete da-write path updates i_size but does not update i_disksize. Then ext4_punch_hole truncates the da-folio yet still leaves i_disksize unchanged(in the ext4_update_disksize_before_punch function, the condition offset + len < size is met). Then ext4_page_mkwrite sees ext4_nonda_switch return 1 and takes the nodioread_nolock path, the folio about to be written has just been punched out, and it’s offset sits beyond the current i_disksize. This may result in a written extent being inserted, but again does not update i_disksize. If the journal gets committed and then the block device is yanked, we might run into this. It should be noted that replacing ext4_punch_hole with ext4_zero_range in the call sequence may also trigger this issue, as neither will update i_disksize under these circumstances. To fix this, we can modify ext4_update_disksize_before_punch to increase i_disksize to min(i_size, offset + len) when both i_size and (offset + len) are greater than i_disksize. Cc: stable@kernel.org Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Baokun Li <libaokun1@huawei.com> Message-ID: <20250911133024.1841027-1-sunyongjian@huaweicloud.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ext4: verify orphan file size is not too bigJan Kara1-1/+12
commit 0a6ce20c156442a4ce2a404747bb0fb05d54eeb3 upstream. In principle orphan file can be arbitrarily large. However orphan replay needs to traverse it all and we also pin all its buffers in memory. Thus filesystems with absurdly large orphan files can lead to big amounts of memory consumed. Limit orphan file size to a sane value and also use kvmalloc() for allocating array of block descriptor structures to avoid large order allocations for sane but large orphan files. Reported-by: syzbot+0b92850d68d9b12934f5@syzkaller.appspotmail.com Fixes: 02f310fcf47f ("ext4: Speedup ext4 orphan inode handling") Cc: stable@kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Message-ID: <20250909112206.10459-2-jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ext4: fail unaligned direct IO write with EINVALJan Kara1-35/+0
commit 963845748fe67125006859229487b45485564db7 upstream. Commit bc264fea0f6f ("iomap: support incremental iomap_iter advances") changed the error handling logic in iomap_iter(). Previously any error from iomap_dio_bio_iter() got propagated to userspace, after this commit if ->iomap_end returns error, it gets propagated to userspace instead of an error from iomap_dio_bio_iter(). This results in unaligned writes to ext4 to silently fallback to buffered IO instead of erroring out. Now returning ENOTBLK for DIO writes from ext4_iomap_end() seems unnecessary these days. It is enough to return ENOTBLK from ext4_iomap_begin() when we don't support DIO write for that particular file offset (due to hole). Fixes: bc264fea0f6f ("iomap: support incremental iomap_iter advances") Cc: stable@kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Message-ID: <20250901112739.32484-2-jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19ext4: add ext4_sb_bread_nofail() helper function for ext4_free_branches()Baokun Li3-1/+12
commit d8b90e6387a74bcb1714c8d1e6a782ff709de9a9 upstream. The implicit __GFP_NOFAIL flag in ext4_sb_bread() was removed in commit 8a83ac54940d ("ext4: call bdev_getblk() from sb_getblk_gfp()"), meaning the function can now fail under memory pressure. Most callers of ext4_sb_bread() propagate the error to userspace and do not remount the filesystem read-only. However, ext4_free_branches() handles ext4_sb_bread() failure by remounting the filesystem read-only. This implies that an ext3 filesystem (mounted via the ext4 driver) could be forcibly remounted read-only due to a transient page allocation failure, which is unacceptable. To mitigate this, introduce a new helper function, ext4_sb_bread_nofail(), which explicitly uses __GFP_NOFAIL, and use it in ext4_free_branches(). Fixes: 8a83ac54940d ("ext4: call bdev_getblk() from sb_getblk_gfp()") Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Allow stop on firmware only if start was issued.Dikshita Agarwal1-1/+1
commit 56a2d85ee8f9b994e5cd17039133218c57c5902b upstream. For HFI Gen1, the instances substate is changed to LOAD_RESOURCES only when a START command is issues to the firmware. If STOP is called without a prior START, the firmware may reject the command and throw some erros. Handle this by adding a substate check before issuing STOP command to the firmware. Fixes: 11712ce70f8e ("media: iris: implement vb2 streaming ops") Cc: stable@vger.kernel.org Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100 Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Fix format check for CAPTURE plane in try_fmtDikshita Agarwal1-1/+1
commit 2dbd2645c07df8de04ee37b24f2395800513391e upstream. Previously, the format validation relied on an array of supported formats, which only listed formats for the OUTPUT plane. This caused failures when validating formats for the CAPTURE plane. Update the check to validate against the only supported format on the CAPTURE plane, which is NV12. Fixes: fde6161d91bb ("media: iris: Add HEVC and VP9 formats for decoder") Cc: stable@vger.kernel.org Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100 Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Fix missing LAST flag handling during drainDikshita Agarwal3-4/+3
commit 8172f57746d68e5c3c743f725435d75c5a4db1ac upstream. Improve drain handling by ensuring the LAST flag is attached to final capture buffer when drain response is received from the firmware. Previously, the driver failed to attach the V4L2_BUF_FLAG_LAST flag when a drain response was received from the firmware, relying on userspace to mark the next queued buffer as LAST. This update fixes the issue by checking the pending drain status, attaching the LAST flag to the capture buffer received from the firmware (with EOS attached), and returning it to the V4L2 layer correctly. Fixes: d09100763bed ("media: iris: add support for drain sequence") Cc: stable@vger.kernel.org Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100 Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Send dummy buffer address for all codecs during drainDikshita Agarwal1-2/+1
commit dec073dd8452e174a69db8444e0932e6b4f31c99 upstream. Firmware can handle a dummy address for buffers with the EOS flag. To ensure consistent behavior across all codecs, update the drain command to always send a dummy buffer address. This makes the drain handling uniform and avoids any codec specific assumptions. Fixes: 478c4478610d ("media: iris: Add codec specific check for VP9 decoder drain handling") Cc: stable@vger.kernel.org Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100 Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Update vbuf flags before v4l2_m2m_buf_doneDikshita Agarwal1-2/+2
commit 8a432174ac263fb9dd93d232b99c84e430e6d6b5 upstream. Update the vbuf flags appropriately in error cases before calling v4l2_m2m_buf_done(). Previously, the flag update was skippied in error scenario, which could result in incorrect state reporting for buffers. Fixes: 17f2a485ca67 ("media: iris: implement vb2 ops for buf_queue and firmware response") Cc: stable@vger.kernel.org Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100 Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Simplify session stop logic by relying on vb2 checksDikshita Agarwal1-23/+19
commit 0fe10666d3b4d0757b7f4671892523855ee68cc8 upstream. Remove earlier complex conditional checks in the non-streaming path that attempted to verify if stop was called on a plane that was previously started. These explicit checks are redundant, as vb2 already ensures that stop is only called on ports that have been started, maintaining correct buffer state management. Fixes: 11712ce70f8e ("media: iris: implement vb2 streaming ops") Cc: stable@vger.kernel.org Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100 Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Always destroy internal buffers on firmware release responseDikshita Agarwal1-4/+1
commit 9cae3619e465dd1cdaa5a5ffbbaf4f41338b0022 upstream. Currently, internal buffers are destroyed only if 'PENDING_RELEASE' flag is set when a release response is received from the firmware, which is incorrect. Internal buffers should always be destroyed when the firmware explicitly releases it, regardless of whether the 'PENDING_RELEASE' flag was set by the driver. This is specially important during force-stop scenarios, where the firmware may release buffers without driver marking them for release. Fix this by removing the incorrect check and ensuring all buffers are properly cleaned up. Fixes: 73702f45db81 ("media: iris: allocate, initialize and queue internal buffers") Cc: stable@vger.kernel.org Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100 Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Allow substate transition to load resources during output streamingDikshita Agarwal1-1/+2
commit 65f72c6a8d97c0cbdc785cb9a35dc358dee67959 upstream. A client (e.g., GST for encoder) can initiate streaming on the capture port before the output port, causing the instance state to transition to OUTPUT_STREAMING. When streaming is subsequently started on the output port, the instance state advances to STREAMING, and the substate should transition to LOAD_RESOURCES. Previously, the code blocked the substate transition to LOAD_RESOURCES if the instance state was OUTPUT_STREAMING. This update modifies the logic to permit the substate transition to LOAD_RESOURCES when the instance state is OUTPUT_STREAMING, thereby supporting this client streaming sequence. Fixes: 547f7b8c5090 ("media: iris: add check to allow sub states transitions") Cc: stable@vger.kernel.org Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100 Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Fix buffer count reporting in internal buffer checkDikshita Agarwal1-0/+1
commit cba6aed4223e83ae0f2ed1c0f68d974fd62847bc upstream. Initialize the count variable to zero before counting unreleased internal buffers in iris_check_num_queued_internal_buffers(). This prevents stale values from previous iterations and ensures accurate error reporting for each buffer type. Without this initialization, the count could accumulate across types, leading to incorrect log messages. Fixes: d2abb1ff5a3c ("media: iris: Verify internal buffer release on close") Cc: stable@vger.kernel.org Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100 Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Fix port streaming handlingDikshita Agarwal3-4/+32
commit 4b67ef9b333ed645879b4b1a11e35e019ff4cfea upstream. The previous check to block capture port streaming before output port was incorrect and caused some valid usecase to fail. While removing that check allows capture port to enter streaming independently, it also introduced firmware errors due to premature queuing of DPB buffers before the firmware session was fully started which happens only when streamon is called on output port. Fix this by deferring DPB buffer queuing to the firmware until both capture and output are streaming and state is 'STREAMING'. Fixes: 11712ce70f8e ("media: iris: implement vb2 streaming ops") Cc: stable@vger.kernel.org Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100 Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: vpu3x: Add MNoC low power handshake during hardware power-offDikshita Agarwal1-2/+30
commit 93fad55aa996eef17a837ed95b1d621ef05d967b upstream. Add the missing write to AON_WRAPPER_MVP_NOC_LPI_CONTROL before reading the LPI status register. Introduce a handshake loop to ensure MNoC enters low power mode reliably during VPU3 hardware power-off with timeout handling. Fixes: 02083a1e00ae ("media: platform: qcom/iris: add support for vpu33") Cc: stable@vger.kernel.org Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: fix module removal if firmware download failedNeil Armstrong1-4/+6
commit fde38008fc4f43db8c17869491870df24b501543 upstream. Fix remove if firmware failed to load: qcom-iris aa00000.video-codec: Direct firmware load for qcom/vpu/vpu33_p4.mbn failed with error -2 qcom-iris aa00000.video-codec: firmware download failed qcom-iris aa00000.video-codec: core init failed then: $ echo aa00000.video-codec > /sys/bus/platform/drivers/qcom-iris/unbind Triggers: genpd genpd:1:aa00000.video-codec: Runtime PM usage count underflow! ------------[ cut here ]------------ video_cc_mvs0_clk already disabled WARNING: drivers/clk/clk.c:1206 at clk_core_disable+0xa4/0xac, CPU#1: sh/542 <snip> pc : clk_core_disable+0xa4/0xac lr : clk_core_disable+0xa4/0xac <snip> Call trace: clk_core_disable+0xa4/0xac (P) clk_disable+0x30/0x4c iris_disable_unprepare_clock+0x20/0x48 [qcom_iris] iris_vpu_power_off_hw+0x48/0x58 [qcom_iris] iris_vpu33_power_off_hardware+0x44/0x230 [qcom_iris] iris_vpu_power_off+0x34/0x84 [qcom_iris] iris_core_deinit+0x44/0xc8 [qcom_iris] iris_remove+0x20/0x48 [qcom_iris] platform_remove+0x20/0x30 device_remove+0x4c/0x80 <snip> ---[ end trace 0000000000000000 ]--- ------------[ cut here ]------------ video_cc_mvs0_clk already unprepared WARNING: drivers/clk/clk.c:1065 at clk_core_unprepare+0xf0/0x110, CPU#2: sh/542 <snip> pc : clk_core_unprepare+0xf0/0x110 lr : clk_core_unprepare+0xf0/0x110 <snip> Call trace: clk_core_unprepare+0xf0/0x110 (P) clk_unprepare+0x2c/0x44 iris_disable_unprepare_clock+0x28/0x48 [qcom_iris] iris_vpu_power_off_hw+0x48/0x58 [qcom_iris] iris_vpu33_power_off_hardware+0x44/0x230 [qcom_iris] iris_vpu_power_off+0x34/0x84 [qcom_iris] iris_core_deinit+0x44/0xc8 [qcom_iris] iris_remove+0x20/0x48 [qcom_iris] platform_remove+0x20/0x30 device_remove+0x4c/0x80 <snip> ---[ end trace 0000000000000000 ]--- genpd genpd:0:aa00000.video-codec: Runtime PM usage count underflow! ------------[ cut here ]------------ gcc_video_axi0_clk already disabled WARNING: drivers/clk/clk.c:1206 at clk_core_disable+0xa4/0xac, CPU#4: sh/542 <snip> pc : clk_core_disable+0xa4/0xac lr : clk_core_disable+0xa4/0xac <snip> Call trace: clk_core_disable+0xa4/0xac (P) clk_disable+0x30/0x4c iris_disable_unprepare_clock+0x20/0x48 [qcom_iris] iris_vpu33_power_off_controller+0x17c/0x428 [qcom_iris] iris_vpu_power_off+0x48/0x84 [qcom_iris] iris_core_deinit+0x44/0xc8 [qcom_iris] iris_remove+0x20/0x48 [qcom_iris] platform_remove+0x20/0x30 device_remove+0x4c/0x80 <snip> ------------[ cut here ]------------ gcc_video_axi0_clk already unprepared WARNING: drivers/clk/clk.c:1065 at clk_core_unprepare+0xf0/0x110, CPU#4: sh/542 <snip> pc : clk_core_unprepare+0xf0/0x110 lr : clk_core_unprepare+0xf0/0x110 <snip> Call trace: clk_core_unprepare+0xf0/0x110 (P) clk_unprepare+0x2c/0x44 iris_disable_unprepare_clock+0x28/0x48 [qcom_iris] iris_vpu33_power_off_controller+0x17c/0x428 [qcom_iris] iris_vpu_power_off+0x48/0x84 [qcom_iris] iris_core_deinit+0x44/0xc8 [qcom_iris] iris_remove+0x20/0x48 [qcom_iris] platform_remove+0x20/0x30 device_remove+0x4c/0x80 <snip> ---[ end trace 0000000000000000 ]--- Skip deinit if initialization never succeeded. Fixes: d7378f84e94e ("media: iris: introduce iris core state management with shared queues") Fixes: d19b163356b8 ("media: iris: implement video firmware load/unload") Fixes: bb8a95aa038e ("media: iris: implement power management") Cc: stable@vger.kernel.org Reviewed-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Fix firmware reference leak and unmap memory after loadStephan Gerhold1-9/+6
commit 57429b0fddfe3cea21a56326576451a4a4c2019b upstream. When we succeed loading the firmware, we don't want to hold on to the firmware pointer anymore, since it won't be freed anywhere else. The same applies for the mapped memory. Unmapping the memory is particularly important since the memory will be protected after the Iris firmware is started, so we need to make sure there will be no accidental access to this region (even if just a speculative one from the CPU). Almost the same firmware loading code also exists in venus/firmware.c, there it is implemented correctly. Fix this by dropping the early "return ret" and move the call of qcom_scm_pas_auth_and_reset() out of iris_load_fw_to_memory(). We should unmap the memory before bringing the firmware out of reset. Cc: stable@vger.kernel.org Fixes: d19b163356b8 ("media: iris: implement video firmware load/unload") Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19media: iris: Call correct power off callback in cleanup pathKrzysztof Kozlowski1-1/+1
commit 2fbb823a0744665fe6015bd03d435bd334ccecf7 upstream. Driver implements different callbacks for the power off controller (.power_off_controller): - iris_vpu_power_off_controller, - iris_vpu33_power_off_controller, The generic wrapper for handling power off - iris_vpu_power_off() - calls them via 'iris_platform_data->vpu_ops', so shall the cleanup code in iris_vpu_power_on(). This makes also sense if looking at caller of iris_vpu_power_on(), which unwinds also with the wrapper calling respective platfortm code (unwinds with iris_vpu_power_off()). Otherwise power off sequence on the newer VPU3.3 in error path is not complete. Fixes: c69df5de4ac3 ("media: platform: qcom/iris: add power_off_controller to vpu_ops") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19nfsd: nfserr_jukebox in nlm_fopen should lead to a retryOlga Kornievskaia1-0/+15
commit a082e4b4d08a4a0e656d90c2c05da85f23e6d0c9 upstream. When v3 NLM request finds a conflicting delegation, it triggers a delegation recall and nfsd_open fails with EAGAIN. nfsd_open then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead of returning nlm_failed for when there is a conflicting delegation, drop this NLM request so that the client retries. Once delegation is recalled and if a local lock is claimed, a retry would lead to nfsd returning a nlm_lck_blocked error or a successful nlm lock. Fixes: d343fce148a4 ("[PATCH] knfsd: Allow lockd to drop replies as appropriate") Cc: stable@vger.kernel.org # v6.6 Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19NFSD: Fix destination buffer size in nfsd4_ssc_setup_dul()Thorsten Blum1-1/+1
commit ab1c282c010c4f327bd7addc3c0035fd8e3c1721 upstream. Commit 5304877936c0 ("NFSD: Fix strncpy() fortify warning") replaced strncpy(,, sizeof(..)) with strlcpy(,, sizeof(..) - 1), but strlcpy() already guaranteed NUL-termination of the destination buffer and subtracting one byte potentially truncated the source string. The incorrect size was then carried over in commit 72f78ae00a8e ("NFSD: move from strlcpy with unused retval to strscpy") when switching from strlcpy() to strscpy(). Fix this off-by-one error by using the full size of the destination buffer again. Cc: stable@vger.kernel.org Fixes: 5304877936c0 ("NFSD: Fix strncpy() fortify warning") Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>