summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2025-10-15Linux 6.17.3v6.17.3Greg Kroah-Hartman1-1/+1
Link: https://lore.kernel.org/r/20251013144411.274874080@linuxfoundation.org Tested-by: Ronald Warsow <rwarsow@gmx.de> Tested-by: Brett A C Sheffield <bacs@librecast.net> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Salvatore Bonaccorso <carnil@debian.org> Tested-by: Peter Schneider <pschneider1968@googlemail.com> Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Pascal Ernster <git@hardfalcon.net> Tested-by: Dileep Malepu <dileep.debian@gmail.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Justin M. Forbes <jforbes@fedoraproject.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15drm/amdgpu/vcn: Fix double-free of vcn dump bufferLijo Lazar4-5/+1
commit 1a0e57eb96c3fca338665ffd7d9b59f351e5fea7 upstream. The buffer is already freed as part of amdgpu_vcn_reg_dump_fini(). The issue is introduced by below patch series. Fixes: de55cbff5ce9 ("drm/amdgpu/vcn: Add regdump helper functions") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Sathishkumar S <sathishkumar.sundararaju@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15scsi: ufs: core: Fix PM QoS mutex initializationMarek Szyprowski1-3/+3
commit 0ba7a254afd037cfc2b656f379c54b43c6e574e8 upstream. hba->pm_qos_mutex is used very early as a part of ufshcd_init(), so it need to be initialized before that call. This fixes the following warning: ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: kernel/locking/mutex.c:577 at __mutex_lock+0x268/0x894, CPU#4: kworker/u32:4/72 Modules linked in: CPU: 4 UID: 0 PID: 72 Comm: kworker/u32:4 Not tainted 6.17.0-rc7-next-20250926+ #11223 PREEMPT Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT) Workqueue: events_unbound deferred_probe_work_func pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __mutex_lock+0x268/0x894 lr : __mutex_lock+0x268/0x894 ... Call trace: __mutex_lock+0x268/0x894 (P) mutex_lock_nested+0x24/0x30 ufshcd_pm_qos_update+0x30/0x78 ufshcd_setup_clocks+0x2d4/0x3c4 ufshcd_init+0x234/0x126c ufshcd_pltfrm_init+0x62c/0x82c ufs_qcom_probe+0x20/0x58 platform_probe+0x5c/0xac really_probe+0xbc/0x298 __driver_probe_device+0x78/0x12c driver_probe_device+0x40/0x164 __device_attach_driver+0xb8/0x138 bus_for_each_drv+0x80/0xdc __device_attach+0xa8/0x1b0 device_initial_probe+0x14/0x20 bus_probe_device+0xb0/0xb4 deferred_probe_work_func+0x8c/0xc8 process_one_work+0x208/0x60c worker_thread+0x244/0x388 kthread+0x150/0x228 ret_from_fork+0x10/0x20 irq event stamp: 57267 hardirqs last enabled at (57267): [<ffffd761485e868c>] _raw_spin_unlock_irqrestore+0x74/0x78 hardirqs last disabled at (57266): [<ffffd76147b13c44>] clk_enable_lock+0x7c/0xf0 softirqs last enabled at (56270): [<ffffd7614734446c>] handle_softirqs+0x4c4/0x4dc softirqs last disabled at (56265): [<ffffd76147290690>] __do_softirq+0x14/0x20 ---[ end trace 0000000000000000 ]--- Fixes: 79dde5f7dc7c ("scsi: ufs: core: Fix data race in CPU latency PM QoS request handling") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Message-Id: <20250929112730.3782765-1-m.szyprowski@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15usb: cdns3: cdnsp-pci: remove redundant pci_disable_device() callMiaoqian Lin1-4/+1
commit e9c206324eeb213957a567a9d066bdeb355c7491 upstream. The cdnsp-pci driver uses pcim_enable_device() to enable a PCI device, which means the device will be automatically disabled on driver detach through the managed device framework. The manual pci_disable_device() call in the error path is therefore redundant. Found via static anlaysis and this is similar to commit 99ca0b57e49f ("thermal: intel: int340x: processor: Fix warning during module unload"). Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Link: https://lore.kernel.org/r/20250903141613.2535472-1-linmq006@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15arm64: dts: qcom: qcm2290: Disable USB SS bus instances in park modeKonrad Dybcio1-0/+1
commit 27f94b71532203b079537180924023a5f636fca1 upstream. 2290 was found in the field to also require this quirk, as long & high-bandwidth workloads (e.g. USB ethernet) are consistently able to crash the controller otherwise. The same change has been made for a number of SoCs in [1], but QCM2290 somehow escaped the list (even though the very closely related SM6115 was there). Upon a controller crash, the log would read: xhci-hcd.12.auto: xHCI host not responding to stop endpoint command xhci-hcd.12.auto: xHCI host controller not responding, assume dead xhci-hcd.12.auto: HC died; cleaning up Add snps,parkmode-disable-ss-quirk to the DWC3 instance in order to prevent the aforementioned breakage. [1] https://lore.kernel.org/all/20240704152848.3380602-1-quic_kriskura@quicinc.com/ Cc: stable@vger.kernel.org Reported-by: Rob Clark <robin.clark@oss.qualcomm.com> Fixes: a64a0192b70c ("arm64: dts: qcom: Add initial QCM2290 device tree") Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250708-topic-2290_usb-v1-1-661e70a63339@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15usb: typec: tipd: Clear interrupts firstSven Peter1-13/+11
commit be5ae730ffa6fd774a00a4705c1e11e078b08ca1 upstream. Right now the interrupt handler first reads all updated status registers and only then clears the interrupts. It's possible that a duplicate interrupt for a changed register or plug state comes in after the interrupts have been processed but before they have been cleared: * plug is inserted, TPS_REG_INT_PLUG_EVENT is set * TPS_REG_INT_EVENT1 is read * tps6598x_handle_plug_event() has run and registered the plug * plug is removed again, TPS_REG_INT_PLUG_EVENT is set (again) * TPS_REG_INT_CLEAR1 is written, TPS_REG_INT_PLUG_EVENT is cleared We then have no plug connected and no pending interrupt but the tipd core still thinks there is a plug. It's possible to trigger this with e.g. a slightly broken Type-C to USB A converter. Fix this by first clearing the interrupts and only then reading the updated registers. Fixes: 45188f27b3d0 ("usb: typec: tipd: Add support for Apple CD321X") Fixes: 0a4c005bd171 ("usb: typec: driver for TI TPS6598x USB Power Delivery controllers") Cc: stable@kernel.org Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Sven Peter <sven@kernel.org> Link: https://lore.kernel.org/r/20250914-apple-usb3-tipd-v1-1-4e99c8649024@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15net: usb: asix: hold PM usage ref to avoid PM/MDIO + RTNL deadlockOleksij Rempel1-0/+29
commit 3d3c4cd5c62f24bb3cb4511b7a95df707635e00a upstream. Prevent USB runtime PM (autosuspend) for AX88772* in bind. usbnet enables runtime PM (autosuspend) by default, so disabling it via the usb_driver flag is ineffective. On AX88772B, autosuspend shows no measurable power saving with current driver (no link partner, admin up/down). The ~0.453 W -> ~0.248 W drop on v6.1 comes from phylib powering the PHY off on admin-down, not from USB autosuspend. The real hazard is that with runtime PM enabled, ndo_open() (under RTNL) may synchronously trigger autoresume (usb_autopm_get_interface()) into asix_resume() while the USB PM lock is held. Resume paths then invoke phylink/phylib and MDIO, which also expect RTNL, leading to possible deadlocks or PM lock vs MDIO wake issues. To avoid this, keep the device runtime-PM active by taking a usage reference in ax88772_bind() and dropping it in unbind(). A non-zero PM usage count blocks runtime suspend regardless of userspace policy (.../power/control - pm_runtime_allow/forbid), making this approach robust against sysfs overrides. Holding a runtime-PM usage ref does not affect system-wide suspend; system sleep/resume callbacks continue to run as before. Fixes: 4a2c7217cd5a ("net: usb: asix: ax88772: manage PHY PM from MAC") Reported-by: Hubert Wiśniewski <hubert.wisniewski.25632@gmail.com> Closes: https://lore.kernel.org/all/DCGHG5UJT9G3.2K1GHFZ3H87T0@gmail.com Tested-by: Hubert Wiśniewski <hubert.wisniewski.25632@gmail.com> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Closes: https://lore.kernel.org/all/b5ea8296-f981-445d-a09a-2f389d7f6fdd@samsung.com Cc: stable@vger.kernel.org Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20251005081203.3067982-1-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15net/9p: Fix buffer overflow in USB transport layerDominique Martinet1-3/+13
commit c04db81cd0288dfc68b7a0f7d09bd49b40bba451 upstream. A buffer overflow vulnerability exists in the USB 9pfs transport layer where inconsistent size validation between packet header parsing and actual data copying allows a malicious USB host to overflow heap buffers. The issue occurs because: - usb9pfs_rx_header() validates only the declared size in packet header - usb9pfs_rx_complete() uses req->actual (actual received bytes) for memcpy This allows an attacker to craft packets with small declared size (bypassing validation) but large actual payload (triggering overflow in memcpy). Add validation in usb9pfs_rx_complete() to ensure req->actual does not exceed the buffer capacity before copying data. Reported-by: Yuhao Jiang <danisjiang@gmail.com> Closes: https://lkml.kernel.org/r/20250616132539.63434-1-danisjiang@gmail.com Fixes: a3be076dc174 ("net/9p/usbg: Add new usb gadget function transport") Cc: stable@vger.kernel.org Message-ID: <20250622-9p-usb_overflow-v3-1-ab172691b946@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15bus: fsl-mc: Check return value of platform_get_resource()Salah Triki1-0/+3
commit 25f526507b8ccc6ac3a43bc094d09b1f9b0b90ae upstream. platform_get_resource() returns NULL in case of failure, so check its return value and propagate the error in order to prevent NULL pointer dereference. Fixes: 6305166c8771 ("bus: fsl-mc: Add ACPI support for fsl-mc") Cc: stable@vger.kernel.org Signed-off-by: Salah Triki <salah.triki@gmail.com> Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/aKwuK6TRr5XNYQ8u@pc Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15pinctrl: check the return value of pinmux_ops::get_function_name()Bartosz Golaszewski1-1/+1
commit 4002ee98c022d671ecc1e4a84029e9ae7d8a5603 upstream. While the API contract in docs doesn't specify it explicitly, the generic implementation of the get_function_name() callback from struct pinmux_ops - pinmux_generic_get_function_name() - can fail and return NULL. This is already checked in pinmux_check_ops() so add a similar check in pinmux_func_name_to_selector() instead of passing the returned pointer right down to strcmp() where the NULL can get dereferenced. This is normal operation when adding new pinfunctions. Cc: stable@vger.kernel.org Tested-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15tee: fix register_shm_helper()Jens Wiklander1-0/+8
commit d5cf5b37064b1699d946e8b7ab4ac7d7d101814c upstream. In register_shm_helper(), fix incorrect error handling for a call to iov_iter_extract_pages(). A case is missing for when iov_iter_extract_pages() only got some pages and return a number larger than 0, but not the requested amount. This fixes a possible NULL pointer dereference following a bad input from ioctl(TEE_IOC_SHM_REGISTER) where parts of the buffer isn't mapped. Cc: stable@vger.kernel.org Reported-by: Masami Ichikawa <masami256@gmail.com> Closes: https://lore.kernel.org/op-tee/CACOXgS-Bo2W72Nj1_44c7bntyNYOavnTjJAvUbEiQfq=u9W+-g@mail.gmail.com/ Tested-by: Masami Ichikawa <masami256@gmail.com> Fixes: 7bdee4157591 ("tee: Use iov_iter to better support shared buffer registration") Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15thunderbolt: Fix use-after-free in tb_dp_dprx_workDuoming Zhou1-2/+3
commit 67600ccfc4f38ebd331b9332ac94717bfbc87ea7 upstream. The original code relies on cancel_delayed_work() in tb_dp_dprx_stop(), which does not ensure that the delayed work item tunnel->dprx_work has fully completed if it was already running. This leads to use-after-free scenarios where tb_tunnel is deallocated by tb_tunnel_put(), while tunnel->dprx_work remains active and attempts to dereference tb_tunnel in tb_dp_dprx_work(). A typical race condition is illustrated below: CPU 0 | CPU 1 tb_dp_tunnel_active() | tb_deactivate_and_free_tunnel()| tb_dp_dprx_start() tb_tunnel_deactivate() | queue_delayed_work() tb_dp_activate() | tb_dp_dprx_stop() | tb_dp_dprx_work() //delayed worker cancel_delayed_work() | tb_tunnel_put(tunnel); | | tunnel = container_of(...); //UAF | tunnel-> //UAF Replacing cancel_delayed_work() with cancel_delayed_work_sync() is not feasible as it would introduce a deadlock: both tb_dp_dprx_work() and the cleanup path acquire tb->lock, and cancel_delayed_work_sync() would wait indefinitely for the work item that cannot proceed. Instead, implement proper reference counting: - If cancel_delayed_work() returns true (work is pending), we release the reference in the stop function. - If it returns false (work is executing or already completed), the reference is released in delayed work function itself. This ensures the tb_tunnel remains valid during work item execution while preventing memory leaks. This bug was found by static analysis. Fixes: d6d458d42e1e ("thunderbolt: Handle DisplayPort tunnel activation asynchronously") Cc: stable@vger.kernel.org Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15PCI: endpoint: pci-epf-test: Add NULL check for DMA channels before releaseShin'ichiro Kawasaki1-6/+11
commit 85afa9ea122dd9d4a2ead104a951d318975dcd25 upstream. The fields dma_chan_tx and dma_chan_rx of the struct pci_epf_test can be NULL even after EPF initialization. Then it is prudent to check that they have non-NULL values before releasing the channels. Add the checks in pci_epf_test_clean_dma_chan(). Without the checks, NULL pointer dereferences happen and they can lead to a kernel panic in some cases: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050 Call trace: dma_release_channel+0x2c/0x120 (P) pci_epf_test_epc_deinit+0x94/0xc0 [pci_epf_test] pci_epc_deinit_notify+0x74/0xc0 tegra_pcie_ep_pex_rst_irq+0x250/0x5d8 irq_thread_fn+0x34/0xb8 irq_thread+0x18c/0x2e8 kthread+0x14c/0x210 ret_from_fork+0x10/0x20 Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities") Fixes: 5ebf3fc59bd2 ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> [mani: trimmed the stack trace] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250916025756.34807-1-shinichiro.kawasaki@wdc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15remoteproc: pru: Fix potential NULL pointer dereference in ↵Zhen Ni1-1/+2
pru_rproc_set_ctable() commit d41e075b077142bb9ae5df40b9ddf9fd7821a811 upstream. pru_rproc_set_ctable() accessed rproc->priv before the IS_ERR_OR_NULL check, which could lead to a null pointer dereference. Move the pru assignment, ensuring we never dereference a NULL rproc pointer. Fixes: 102853400321 ("remoteproc: pru: Add pru_rproc_set_ctable() function") Cc: stable@vger.kernel.org Signed-off-by: Zhen Ni <zhen.ni@easystack.cn> Link: https://lore.kernel.org/r/20250923112109.1165126-1-zhen.ni@easystack.cn Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15PCI/AER: Avoid NULL pointer dereference in aer_ratelimit()Breno Leitao1-0/+3
commit deb2f228388ff3a9d0623e3b59a053e9235c341d upstream. When platform firmware supplies error information to the OS, e.g., via the ACPI APEI GHES mechanism, it may identify an error source device that doesn't advertise an AER Capability and therefore dev->aer_info, which contains AER stats and ratelimiting data, is NULL. pci_dev_aer_stats_incr() already checks dev->aer_info for NULL, but aer_ratelimit() did not, leading to NULL pointer dereferences like this one from the URL below: {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0 {1}[Hardware Error]: event severity: corrected {1}[Hardware Error]: device_id: 0000:00:00.0 {1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x2020 {1}[Hardware Error]: aer_cor_status: 0x00001000, aer_cor_mask: 0x00002000 BUG: kernel NULL pointer dereference, address: 0000000000000264 RIP: 0010:___ratelimit+0xc/0x1b0 pci_print_aer+0x141/0x360 aer_recover_work_func+0xb5/0x130 [8086:2020] is an Intel "Sky Lake-E DMI3 Registers" device that claims to be a Root Port but does not advertise an AER Capability. Add a NULL check in aer_ratelimit() to avoid the NULL pointer dereference. Note that this also prevents ratelimiting these events from GHES. Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging") Link: https://lore.kernel.org/r/buduna6darbvwfg3aogl5kimyxkggu3n4romnmq6sozut6axeu@clnx7sfsy457/ Signed-off-by: Breno Leitao <leitao@debian.org> [bhelgaas: add crash details to commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250929-aer_crash_2-v1-1-68ec4f81c356@debian.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15sunrpc: fix null pointer dereference on zero-length checksumLei Lu1-1/+1
commit 6df164e29bd4e6505c5a2e0e5f1e1f6957a16a42 upstream. In xdr_stream_decode_opaque_auth(), zero-length checksum.len causes checksum.data to be set to NULL. This triggers a NPD when accessing checksum.data in gss_krb5_verify_mic_v2(). This patch ensures that the value of checksum.len is not less than XDR_UNIT. Fixes: 0653028e8f1c ("SUNRPC: Convert gss_verify_header() to use xdr_stream") Cc: stable@kernel.org Signed-off-by: Lei Lu <llfamsec@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15Input: uinput - zero-initialize uinput_ff_upload_compat to avoid info leakZhen Ni1-0/+1
commit d3366a04770eea807f2826cbdb96934dd8c9bf79 upstream. Struct ff_effect_compat is embedded twice inside uinput_ff_upload_compat, contains internal padding. In particular, there is a hole after struct ff_replay to satisfy alignment requirements for the following union member. Without clearing the structure, copy_to_user() may leak stack data to userspace. Initialize ff_up_compat to zero before filling valid fields. Fixes: 2d56f3a32c0e ("Input: refactor evdev 32bit compat to be shareable with uinput") Cc: stable@vger.kernel.org Signed-off-by: Zhen Ni <zhen.ni@easystack.cn> Link: https://lore.kernel.org/r/20250928063737.74590-1-zhen.ni@easystack.cn Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15Input: atmel_mxt_ts - allow reset GPIO to sleepMarek Vasut1-1/+1
commit c7866ee0a9ddd9789faadf58cdac6abd7aabf045 upstream. The reset GPIO is not toggled in any critical section where it couldn't sleep, allow the reset GPIO to sleep. This allows the driver to operate reset GPIOs connected to I2C GPIO expanders. Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Link: https://lore.kernel.org/r/20251005023335.166483-1-marek.vasut@mailbox.org Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15misc: fastrpc: Skip reference for DMA handlesLing Xu1-18/+27
commit 10df039834f84a297c72ec962c0f9b7c8c5ca31a upstream. If multiple dma handles are passed with same fd over a remote call the kernel driver takes a reference and expects that put for the map will be called as many times to free the map. But DSP only updates the fd one time in the fd list when the DSP refcount goes to zero and hence kernel make put call only once for the fd. This can cause SMMU fault issue as the same fd can be used in future for some other call. Fixes: 35a82b87135d ("misc: fastrpc: Add dma handle implementation") Cc: stable@kernel.org Co-developed-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com> Signed-off-by: Ling Xu <quic_lxu5@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Srinivas Kandagatla <srini@kernel.org> Link: https://lore.kernel.org/r/20250912131236.303102-5-srini@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15misc: fastrpc: fix possible map leak in fastrpc_put_argsLing Xu1-3/+7
commit da1ba64176e0138f2bfa96f9e43e8c3640d01e1e upstream. copy_to_user() failure would cause an early return without cleaning up the fdlist, which has been updated by the DSP. This could lead to map leak. Fix this by redirecting to a cleanup path on failure, ensuring that all mapped buffers are properly released before returning. Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method") Cc: stable@kernel.org Co-developed-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com> Signed-off-by: Ling Xu <quic_lxu5@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Srinivas Kandagatla <srini@kernel.org> Link: https://lore.kernel.org/r/20250912131236.303102-4-srini@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15misc: fastrpc: Fix fastrpc_map_lookup operationLing Xu1-1/+6
commit 9031626ade38b092b72638dfe0c6ffce8d8acd43 upstream. Fastrpc driver creates maps for user allocated fd buffers. Before creating a new map, the map list is checked for any already existing maps using map fd. Checking with just map fd is not sufficient as the user can pass offsetted buffer with less size when the map is created and then a larger size the next time which could result in memory issues. Check for dma_buf object also when looking up for the map. Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method") Cc: stable@kernel.org Co-developed-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com> Signed-off-by: Ling Xu <quic_lxu5@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Srinivas Kandagatla <srini@kernel.org> Link: https://lore.kernel.org/r/20250912131236.303102-3-srini@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15misc: fastrpc: Save actual DMA size in fastrpc_map structureLing Xu1-9/+18
commit 8b5b456222fd604079b5cf2af1f25ad690f54a25 upstream. For user passed fd buffer, map is created using DMA calls. The map related information is stored in fastrpc_map structure. The actual DMA size is not stored in the structure. Store the actual size of buffer and check it against the user passed size. Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method") Cc: stable@kernel.org Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Co-developed-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com> Signed-off-by: Ling Xu <quic_lxu5@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Srinivas Kandagatla <srini@kernel.org> Link: https://lore.kernel.org/r/20250912131236.303102-2-srini@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15nvdimm: ndtest: Return -ENOMEM if devm_kcalloc() fails in ndtest_probe()Guangshuo Li1-1/+12
commit a9e6aa994917ee602798bbb03180a194b37865bb upstream. devm_kcalloc() may fail. ndtest_probe() allocates three DMA address arrays (dcr_dma, label_dma, dimm_dma) and later unconditionally uses them in ndtest_nvdimm_init(), which can lead to a NULL pointer dereference under low-memory conditions. Check all three allocations and return -ENOMEM if any allocation fails, jumping to the common error path. Do not emit an extra error message since the allocator already warns on allocation failure. Fixes: 9399ab61ad82 ("ndtest: Add dimms to the two buses") Cc: stable@vger.kernel.org Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15selftests/mm: skip soft-dirty tests when CONFIG_MEM_SOFT_DIRTY is disabledLance Yang4-20/+24
commit 0389c305ef56cbadca4cbef44affc0ec3213ed30 upstream. The madv_populate and soft-dirty kselftests currently fail on systems where CONFIG_MEM_SOFT_DIRTY is disabled. Introduce a new helper softdirty_supported() into vm_util.c/h to ensure tests are properly skipped when the feature is not enabled. Link: https://lkml.kernel.org/r/20250917133137.62802-1-lance.yang@linux.dev Fixes: 9f3265db6ae8 ("selftests: vm: add test for Soft-Dirty PTE bit") Signed-off-by: Lance Yang <lance.yang@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Suggested-by: David Hildenbrand <david@redhat.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Gabriel Krisman Bertazi <krisman@collabora.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15mm: hugetlb: avoid soft lockup when mprotect to large memory areaYang Shi1-0/+2
commit f52ce0ea90c83a28904c7cc203a70e6434adfecb upstream. When calling mprotect() to a large hugetlb memory area in our customer's workload (~300GB hugetlb memory), soft lockup was observed: watchdog: BUG: soft lockup - CPU#98 stuck for 23s! [t2_new_sysv:126916] CPU: 98 PID: 126916 Comm: t2_new_sysv Kdump: loaded Not tainted 6.17-rc7 Hardware name: GIGACOMPUTING R2A3-T40-AAV1/Jefferson CIO, BIOS 5.4.4.1 07/15/2025 pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : mte_clear_page_tags+0x14/0x24 lr : mte_sync_tags+0x1c0/0x240 sp : ffff80003150bb80 x29: ffff80003150bb80 x28: ffff00739e9705a8 x27: 0000ffd2d6a00000 x26: 0000ff8e4bc00000 x25: 00e80046cde00f45 x24: 0000000000022458 x23: 0000000000000000 x22: 0000000000000004 x21: 000000011b380000 x20: ffff000000000000 x19: 000000011b379f40 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : ffffc875e0aa5e2c x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000 x5 : fffffc01ce7a5c00 x4 : 00000000046cde00 x3 : fffffc0000000000 x2 : 0000000000000004 x1 : 0000000000000040 x0 : ffff0046cde7c000 Call trace:   mte_clear_page_tags+0x14/0x24   set_huge_pte_at+0x25c/0x280   hugetlb_change_protection+0x220/0x430   change_protection+0x5c/0x8c   mprotect_fixup+0x10c/0x294   do_mprotect_pkey.constprop.0+0x2e0/0x3d4   __arm64_sys_mprotect+0x24/0x44   invoke_syscall+0x50/0x160   el0_svc_common+0x48/0x144   do_el0_svc+0x30/0xe0   el0_svc+0x30/0xf0   el0t_64_sync_handler+0xc4/0x148   el0t_64_sync+0x1a4/0x1a8 Soft lockup is not triggered with THP or base page because there is cond_resched() called for each PMD size. Although the soft lockup was triggered by MTE, it should be not MTE specific. The other processing which takes long time in the loop may trigger soft lockup too. So add cond_resched() for hugetlb to avoid soft lockup. Link: https://lkml.kernel.org/r/20250929202402.1663290-1-yang@os.amperecomputing.com Fixes: 8f860591ffb2 ("[PATCH] Enable mprotect on huge pages") Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> Reviewed-by: Christoph Lameter (Ampere) <cl@gentwo.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Dev Jain <dev.jain@arm.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15fbdev: simplefb: Fix use after free in simplefb_detach_genpds()Janne Grunau1-8/+23
commit da1bb9135213744e7ec398826c8f2e843de4fb94 upstream. The pm_domain cleanup can not be devres managed as it uses struct simplefb_par which is allocated within struct fb_info by framebuffer_alloc(). This allocation is explicitly freed by unregister_framebuffer() in simplefb_remove(). Devres managed cleanup runs after the device remove call and thus can no longer access struct simplefb_par. Call simplefb_detach_genpds() explicitly from simplefb_destroy() like the cleanup functions for clocks and regulators. Fixes an use after free on M2 Mac mini during aperture_remove_conflicting_devices() using the downstream asahi kernel with Debian's kernel config. For unknown reasons this started to consistently dereference an invalid pointer in v6.16.3 based kernels. [ 6.736134] BUG: KASAN: slab-use-after-free in simplefb_detach_genpds+0x58/0x220 [ 6.743545] Read of size 4 at addr ffff8000304743f0 by task (udev-worker)/227 [ 6.750697] [ 6.752182] CPU: 6 UID: 0 PID: 227 Comm: (udev-worker) Tainted: G S 6.16.3-asahi+ #16 PREEMPTLAZY [ 6.752186] Tainted: [S]=CPU_OUT_OF_SPEC [ 6.752187] Hardware name: Apple Mac mini (M2, 2023) (DT) [ 6.752189] Call trace: [ 6.752190] show_stack+0x34/0x98 (C) [ 6.752194] dump_stack_lvl+0x60/0x80 [ 6.752197] print_report+0x17c/0x4d8 [ 6.752201] kasan_report+0xb4/0x100 [ 6.752206] __asan_report_load4_noabort+0x20/0x30 [ 6.752209] simplefb_detach_genpds+0x58/0x220 [ 6.752213] devm_action_release+0x50/0x98 [ 6.752216] release_nodes+0xd0/0x2c8 [ 6.752219] devres_release_all+0xfc/0x178 [ 6.752221] device_unbind_cleanup+0x28/0x168 [ 6.752224] device_release_driver_internal+0x34c/0x470 [ 6.752228] device_release_driver+0x20/0x38 [ 6.752231] bus_remove_device+0x1b0/0x380 [ 6.752234] device_del+0x314/0x820 [ 6.752238] platform_device_del+0x3c/0x1e8 [ 6.752242] platform_device_unregister+0x20/0x50 [ 6.752246] aperture_detach_platform_device+0x1c/0x30 [ 6.752250] aperture_detach_devices+0x16c/0x290 [ 6.752253] aperture_remove_conflicting_devices+0x34/0x50 ... [ 6.752343] [ 6.967409] Allocated by task 62: [ 6.970724] kasan_save_stack+0x3c/0x70 [ 6.974560] kasan_save_track+0x20/0x40 [ 6.978397] kasan_save_alloc_info+0x40/0x58 [ 6.982670] __kasan_kmalloc+0xd4/0xd8 [ 6.986420] __kmalloc_noprof+0x194/0x540 [ 6.990432] framebuffer_alloc+0xc8/0x130 [ 6.994444] simplefb_probe+0x258/0x2378 ... [ 7.054356] [ 7.055838] Freed by task 227: [ 7.058891] kasan_save_stack+0x3c/0x70 [ 7.062727] kasan_save_track+0x20/0x40 [ 7.066565] kasan_save_free_info+0x4c/0x80 [ 7.070751] __kasan_slab_free+0x6c/0xa0 [ 7.074675] kfree+0x10c/0x380 [ 7.077727] framebuffer_release+0x5c/0x90 [ 7.081826] simplefb_destroy+0x1b4/0x2c0 [ 7.085837] put_fb_info+0x98/0x100 [ 7.089326] unregister_framebuffer+0x178/0x320 [ 7.093861] simplefb_remove+0x3c/0x60 [ 7.097611] platform_remove+0x60/0x98 [ 7.101361] device_remove+0xb8/0x160 [ 7.105024] device_release_driver_internal+0x2fc/0x470 [ 7.110256] device_release_driver+0x20/0x38 [ 7.114529] bus_remove_device+0x1b0/0x380 [ 7.118628] device_del+0x314/0x820 [ 7.122116] platform_device_del+0x3c/0x1e8 [ 7.126302] platform_device_unregister+0x20/0x50 [ 7.131012] aperture_detach_platform_device+0x1c/0x30 [ 7.136157] aperture_detach_devices+0x16c/0x290 [ 7.140779] aperture_remove_conflicting_devices+0x34/0x50 ... Reported-by: Daniel Huhardeaux <tech@tootai.net> Cc: stable@vger.kernel.org Fixes: 92a511a568e44 ("fbdev/simplefb: Add support for generic power-domains") Signed-off-by: Janne Grunau <j@jannau.net> Reviewed-by: Hans de Goede <hansg@kernel.org> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15KVM: SVM: Skip fastpath emulation on VM-Exit if next RIP isn't validSean Christopherson1-2/+10
commit 0910dd7c9ad45a2605c45fd2bf3d1bcac087687c upstream. Skip the WRMSR and HLT fastpaths in SVM's VM-Exit handler if the next RIP isn't valid, e.g. because KVM is running with nrips=false. SVM must decode and emulate to skip the instruction if the CPU doesn't provide the next RIP, and getting the instruction bytes to decode requires reading guest memory. Reading guest memory through the emulator can fault, i.e. can sleep, which is disallowed since the fastpath handlers run with IRQs disabled. BUG: sleeping function called from invalid context at ./include/linux/uaccess.h:106 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 32611, name: qemu preempt_count: 1, expected: 0 INFO: lockdep is turned off. irq event stamp: 30580 hardirqs last enabled at (30579): [<ffffffffc08b2527>] vcpu_run+0x1787/0x1db0 [kvm] hardirqs last disabled at (30580): [<ffffffffb4f62e32>] __schedule+0x1e2/0xed0 softirqs last enabled at (30570): [<ffffffffb4247a64>] fpu_swap_kvm_fpstate+0x44/0x210 softirqs last disabled at (30568): [<ffffffffb4247a64>] fpu_swap_kvm_fpstate+0x44/0x210 CPU: 298 UID: 0 PID: 32611 Comm: qemu Tainted: G U 6.16.0-smp--e6c618b51cfe-sleep #782 NONE Tainted: [U]=USER Hardware name: Google Astoria-Turin/astoria, BIOS 0.20241223.2-0 01/17/2025 Call Trace: <TASK> dump_stack_lvl+0x7d/0xb0 __might_resched+0x271/0x290 __might_fault+0x28/0x80 kvm_vcpu_read_guest_page+0x8d/0xc0 [kvm] kvm_fetch_guest_virt+0x92/0xc0 [kvm] __do_insn_fetch_bytes+0xf3/0x1e0 [kvm] x86_decode_insn+0xd1/0x1010 [kvm] x86_emulate_instruction+0x105/0x810 [kvm] __svm_skip_emulated_instruction+0xc4/0x140 [kvm_amd] handle_fastpath_invd+0xc4/0x1a0 [kvm] vcpu_run+0x11a1/0x1db0 [kvm] kvm_arch_vcpu_ioctl_run+0x5cc/0x730 [kvm] kvm_vcpu_ioctl+0x578/0x6a0 [kvm] __se_sys_ioctl+0x6d/0xb0 do_syscall_64+0x8a/0x2c0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7f479d57a94b </TASK> Note, this is essentially a reapply of commit 5c30e8101e8d ("KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"), but with different justification (KVM now grabs SRCU when skipping the instruction for other reasons). Fixes: b439eb8ab578 ("Revert "KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250805190526.1453366-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15ext4: fix checks for orphan inodesJan Kara5-9/+15
commit acf943e9768ec9d9be80982ca0ebc4bfd6b7631e upstream. When orphan file feature is enabled, inode can be tracked as orphan either in the standard orphan list or in the orphan file. The first can be tested by checking ei->i_orphan list head, the second is recorded by EXT4_STATE_ORPHAN_FILE inode state flag. There are several places where we want to check whether inode is tracked as orphan and only some of them properly check for both possibilities. Luckily the consequences are mostly minor, the worst that can happen is that we track an inode as orphan although we don't need to and e2fsck then complains (resulting in occasional ext4/307 xfstest failures). Fix the problem by introducing a helper for checking whether an inode is tracked as orphan and use it in appropriate places. Fixes: 4a79a98c7b19 ("ext4: Improve scalability of ext4 orphan file handling") Cc: stable@kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Message-ID: <20250925123038.20264-2-jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15ext4: fix potential null deref in ext4_mb_init()Baokun Li1-0/+10
commit 3c3fac6bc0a9c00dbe65d8dc0d3a282afe4d3188 upstream. In ext4_mb_init(), ext4_mb_avg_fragment_size_destroy() may be called when sbi->s_mb_avg_fragment_size remains uninitialized (e.g., if groupinfo slab cache allocation fails). Since ext4_mb_avg_fragment_size_destroy() lacks null pointer checking, this leads to a null pointer dereference. ================================================================== EXT4-fs: no memory for groupinfo slab cache BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: Oops: 0002 [#1] SMP PTI CPU:2 UID: 0 PID: 87 Comm:mount Not tainted 6.17.0-rc2 #1134 PREEMPT(none) RIP: 0010:_raw_spin_lock_irqsave+0x1b/0x40 Call Trace: <TASK> xa_destroy+0x61/0x130 ext4_mb_init+0x483/0x540 __ext4_fill_super+0x116d/0x17b0 ext4_fill_super+0xd3/0x280 get_tree_bdev_flags+0x132/0x1d0 vfs_get_tree+0x29/0xd0 do_new_mount+0x197/0x300 __x64_sys_mount+0x116/0x150 do_syscall_64+0x50/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e ================================================================== Therefore, add necessary null check to ext4_mb_avg_fragment_size_destroy() to prevent this issue. The same fix is also applied to ext4_mb_largest_free_orders_destroy(). Reported-by: syzbot+1713b1aa266195b916c2@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=1713b1aa266195b916c2 Cc: stable@kernel.org Fixes: f7eaacbb4e54 ("ext4: convert free groups order lists to xarrays") Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15ksmbd: add max ip connections parameterNamjae Jeon4-13/+23
commit d8b6dc9256762293048bf122fc11c4e612d0ef5d upstream. This parameter set the maximum number of connections per ip address. The default is 8. Cc: stable@vger.kernel.org Fixes: c0d41112f1a5 ("ksmbd: extend the connection limiting mechanism to support IPv6") Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15ksmbd: fix error code overwriting in smb2_get_info_filesystem()Matvey Kovalev1-1/+2
commit 88daf2f448aad05a2e6df738d66fe8b0cf85cee0 upstream. If client doesn't negotiate with SMB3.1.1 POSIX Extensions, then proper error code won't be returned due to overwriting. Return error immediately. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: e2f34481b24db ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Signed-off-by: Matvey Kovalev <matvey.kovalev@ispras.ru> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15ksmbd: Fix race condition in RPC handle list accessYunseong Kim1-9/+17
commit 305853cce379407090a73b38c5de5ba748893aee upstream. The 'sess->rpc_handle_list' XArray manages RPC handles within a ksmbd session. Access to this list is intended to be protected by 'sess->rpc_lock' (an rw_semaphore). However, the locking implementation was flawed, leading to potential race conditions. In ksmbd_session_rpc_open(), the code incorrectly acquired only a read lock before calling xa_store() and xa_erase(). Since these operations modify the XArray structure, a write lock is required to ensure exclusive access and prevent data corruption from concurrent modifications. Furthermore, ksmbd_session_rpc_method() accessed the list using xa_load() without holding any lock at all. This could lead to reading inconsistent data or a potential use-after-free if an entry is concurrently removed and the pointer is dereferenced. Fix these issues by: 1. Using down_write() and up_write() in ksmbd_session_rpc_open() to ensure exclusive access during XArray modification, and ensuring the lock is correctly released on error paths. 2. Adding down_read() and up_read() in ksmbd_session_rpc_method() to safely protect the lookup. Fixes: a1f46c99d9ea ("ksmbd: fix use-after-free in ksmbd_session_rpc_open") Fixes: b685757c7b08 ("ksmbd: Implements sess->rpc_handle_list as xarray") Cc: stable@vger.kernel.org Signed-off-by: Yunseong Kim <ysk@kzalloc.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15mm/ksm: fix flag-dropping behavior in ksm_madviseJakub Acs2-1/+2
commit f04aad36a07cc17b7a5d5b9a2d386ce6fae63e93 upstream. syzkaller discovered the following crash: (kernel BUG) [ 44.607039] ------------[ cut here ]------------ [ 44.607422] kernel BUG at mm/userfaultfd.c:2067! [ 44.608148] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI [ 44.608814] CPU: 1 UID: 0 PID: 2475 Comm: reproducer Not tainted 6.16.0-rc6 #1 PREEMPT(none) [ 44.609635] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 [ 44.610695] RIP: 0010:userfaultfd_release_all+0x3a8/0x460 <snip other registers, drop unreliable trace> [ 44.617726] Call Trace: [ 44.617926] <TASK> [ 44.619284] userfaultfd_release+0xef/0x1b0 [ 44.620976] __fput+0x3f9/0xb60 [ 44.621240] fput_close_sync+0x110/0x210 [ 44.622222] __x64_sys_close+0x8f/0x120 [ 44.622530] do_syscall_64+0x5b/0x2f0 [ 44.622840] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 44.623244] RIP: 0033:0x7f365bb3f227 Kernel panics because it detects UFFD inconsistency during userfaultfd_release_all(). Specifically, a VMA which has a valid pointer to vma->vm_userfaultfd_ctx, but no UFFD flags in vma->vm_flags. The inconsistency is caused in ksm_madvise(): when user calls madvise() with MADV_UNMEARGEABLE on a VMA that is registered for UFFD in MINOR mode, it accidentally clears all flags stored in the upper 32 bits of vma->vm_flags. Assuming x86_64 kernel build, unsigned long is 64-bit and unsigned int and int are 32-bit wide. This setup causes the following mishap during the &= ~VM_MERGEABLE assignment. VM_MERGEABLE is a 32-bit constant of type unsigned int, 0x8000'0000. After ~ is applied, it becomes 0x7fff'ffff unsigned int, which is then promoted to unsigned long before the & operation. This promotion fills upper 32 bits with leading 0s, as we're doing unsigned conversion (and even for a signed conversion, this wouldn't help as the leading bit is 0). & operation thus ends up AND-ing vm_flags with 0x0000'0000'7fff'ffff instead of intended 0xffff'ffff'7fff'ffff and hence accidentally clears the upper 32-bits of its value. Fix it by changing `VM_MERGEABLE` constant to unsigned long, using the BIT() macro. Note: other VM_* flags are not affected: This only happens to the VM_MERGEABLE flag, as the other VM_* flags are all constants of type int and after ~ operation, they end up with leading 1 and are thus converted to unsigned long with leading 1s. Note 2: After commit 31defc3b01d9 ("userfaultfd: remove (VM_)BUG_ON()s"), this is no longer a kernel BUG, but a WARNING at the same place: [ 45.595973] WARNING: CPU: 1 PID: 2474 at mm/userfaultfd.c:2067 but the root-cause (flag-drop) remains the same. [akpm@linux-foundation.org: rust bindgen wasn't able to handle BIT(), from Miguel] Link: https://lore.kernel.org/oe-kbuild-all/202510030449.VfSaAjvd-lkp@intel.com/ Link: https://lkml.kernel.org/r/20251001090353.57523-2-acsjakub@amazon.de Fixes: 7677f7fd8be7 ("userfaultfd: add minor fault registration mode") Signed-off-by: Jakub Acs <acsjakub@amazon.de> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: SeongJae Park <sj@kernel.org> Tested-by: Alice Ryhl <aliceryhl@google.com> Tested-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: Xu Xin <xu.xin16@zte.com.cn> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15LoongArch: BPF: Fix uninitialized symbol 'retval_off'Huacai Chen1-5/+4
commit 7b6c2d172d023d344527d3cb4516d0d6b29f4919 upstream. In __arch_prepare_bpf_trampoline(), retval_off is meaningful only when save_ret is not 0, so the current logic is correct. But it may cause a build warning: arch/loongarch/net/bpf_jit.c:1547 __arch_prepare_bpf_trampoline() error: uninitialized symbol 'retval_off'. So initialize retval_off unconditionally to fix it. Cc: stable@vger.kernel.org Fixes: f9b6b41f0cf3 ("LoongArch: BPF: Add basic bpf trampoline support") Closes: https://lore.kernel.org/r/202508191020.PBBh07cK-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15LoongArch: BPF: Remove duplicated flags checkHengqi Chen1-7/+2
commit 909d3e3f51b1bc00f33a484ce0d41b42fed01965 upstream. The check for (BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_SHARE_IPMODIFY) is duplicated in __arch_prepare_bpf_trampoline(). Remove it. While at it, make sure stack_size and nargs are initialized once. Cc: stable@vger.kernel.org Tested-by: Vincent Li <vincent.mc.li@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15LoongArch: BPF: No text_poke() for kernel textHengqi Chen1-2/+4
commit 3d770bd11b943066db11dba7be0b6f0d81cb5d50 upstream. The current implementation of bpf_arch_text_poke() requires 5 nops at patch site which is not applicable for kernel/module functions. Because LoongArch reserves ONLY 2 nops at the function entry. With CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y, this can be done by ftrace instead. See the following commit for details: * commit b91e014f078e ("bpf: Make BPF trampoline use register_ftrace_direct() API") * commit 9cdc3b6a299c ("LoongArch: ftrace: Add direct call support") Cc: stable@vger.kernel.org Tested-by: Vincent Li <vincent.mc.li@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15LoongArch: BPF: Remove duplicated bpf_flush_icache()Hengqi Chen1-1/+0
commit b0f50dc09bf008b2e581d5e6ad570d325725881c upstream. The bpf_flush_icache() is called by bpf_arch_text_copy() already. So remove it. This has been done in arm64 and riscv. Cc: stable@vger.kernel.org Tested-by: Vincent Li <vincent.mc.li@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15LoongArch: BPF: Make error handling robust in arch_prepare_bpf_trampoline()Hengqi Chen1-1/+4
commit de2c0b7788648850b68b75f7cc8698b2749dd31e upstream. Bail out instead of trying to perform a bpf_arch_text_copy() if __arch_prepare_bpf_trampoline() failed. Cc: stable@vger.kernel.org Tested-by: Vincent Li <vincent.mc.li@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15LoongArch: BPF: Make trampoline size stableHengqi Chen1-2/+2
commit ea645cfd3d5f74a2bd40a60003f113b3c467975d upstream. When attach fentry/fexit BPF programs, __arch_prepare_bpf_trampoline() is called twice with different `struct bpf_tramp_image *im`: bpf_trampoline_update() -> arch_bpf_trampoline_size() -> __arch_prepare_bpf_trampoline() -> arch_prepare_bpf_trampoline() -> __arch_prepare_bpf_trampoline() Use move_imm() will emit unstable instruction sequences, so let's use move_addr() instead to prevent subtle bugs. (I observed this while debugging other issues with printk.) Cc: stable@vger.kernel.org Tested-by: Vincent Li <vincent.mc.li@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15LoongArch: BPF: Don't align trampoline sizeHengqi Chen1-2/+1
commit a04731cbee6e981afa4263289a0c0059c8b2e7b9 upstream. Currently, arch_alloc_bpf_trampoline() use bpf_prog_pack_alloc() which will pack multiple trampolines into a huge page. So, no need to assume the trampoline size is page aligned. Cc: stable@vger.kernel.org Tested-by: Vincent Li <vincent.mc.li@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15LoongArch: BPF: No support of struct argument in trampoline programsHengqi Chen1-0/+6
commit e82406c7cbdd368c5459b8a45e118811d2ba0794 upstream. The current implementation does not support struct argument. This causes a oops when running bpf selftest: $ ./test_progs -a tracing_struct Oops[#1]: CPU -1 Unable to handle kernel paging request at virtual address 0000000000000018, era == 9000000085bef268, ra == 90000000844f3938 rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: rcu: 1-...0: (19 ticks this GP) idle=1094/1/0x4000000000000000 softirq=1380/1382 fqs=801 rcu: (detected by 0, t=5252 jiffies, g=1197, q=52 ncpus=4) Sending NMI from CPU 0 to CPUs 1: rcu: rcu_preempt kthread starved for 2495 jiffies! g1197 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=2 rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. rcu: RCU grace-period kthread stack dump: task:rcu_preempt state:I stack:0 pid:15 tgid:15 ppid:2 task_flags:0x208040 flags:0x00000800 Stack : 9000000100423e80 0000000000000402 0000000000000010 90000001003b0680 9000000085d88000 0000000000000000 0000000000000040 9000000087159350 9000000085c2b9b0 0000000000000001 900000008704a000 0000000000000005 00000000ffff355b 00000000ffff355b 0000000000000000 0000000000000004 9000000085d90510 0000000000000000 0000000000000002 7b5d998f8281e86e 00000000ffff355c 7b5d998f8281e86e 000000000000003f 9000000087159350 900000008715bf98 0000000000000005 9000000087036000 900000008704a000 9000000100407c98 90000001003aff80 900000008715c4c0 9000000085c2b9b0 00000000ffff355b 9000000085c33d3c 00000000000000b4 0000000000000000 9000000007002150 00000000ffff355b 9000000084615480 0000000007000002 ... Call Trace: [<9000000085c2a868>] __schedule+0x410/0x1520 [<9000000085c2b9ac>] schedule+0x34/0x190 [<9000000085c33d38>] schedule_timeout+0x98/0x140 [<90000000845e9120>] rcu_gp_fqs_loop+0x5f8/0x868 [<90000000845ed538>] rcu_gp_kthread+0x260/0x2e0 [<900000008454e8a4>] kthread+0x144/0x238 [<9000000085c26b60>] ret_from_kernel_thread+0x28/0xc8 [<90000000844f20e4>] ret_from_kernel_thread_asm+0xc/0x88 rcu: Stack dump where RCU GP kthread last ran: Sending NMI from CPU 0 to CPUs 2: NMI backtrace for cpu 2 skipped: idling at idle_exit+0x0/0x4 Reject it for now. Cc: stable@vger.kernel.org Fixes: f9b6b41f0cf3 ("LoongArch: BPF: Add basic bpf trampoline support") Tested-by: Vincent Li <vincent.mc.li@gmail.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15LoongArch: BPF: Sign-extend struct ops return values properlyHengqi Chen1-1/+36
commit 8b51b11b3d81c1ed48a52f87da9256d737b723a0 upstream. The ns_bpf_qdisc selftest triggers a kernel panic: Oops[#1]: CPU 0 Unable to handle kernel paging request at virtual address 0000000000741d58, era == 90000000851b5ac0, ra == 90000000851b5aa4 CPU: 0 UID: 0 PID: 449 Comm: test_progs Tainted: G OE 6.16.0+ #3 PREEMPT(full) Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022 pc 90000000851b5ac0 ra 90000000851b5aa4 tp 90000001076b8000 sp 90000001076bb600 a0 0000000000741ce8 a1 0000000000000001 a2 90000001076bb5c0 a3 0000000000000008 a4 90000001004c4620 a5 9000000100741ce8 a6 0000000000000000 a7 0100000000000000 t0 0000000000000010 t1 0000000000000000 t2 9000000104d24d30 t3 0000000000000001 t4 4f2317da8a7e08c4 t5 fffffefffc002f00 t6 90000001004c4620 t7 ffffffffc61c5b3d t8 0000000000000000 u0 0000000000000001 s9 0000000000000050 s0 90000001075bc800 s1 0000000000000040 s2 900000010597c400 s3 0000000000000008 s4 90000001075bc880 s5 90000001075bc8f0 s6 0000000000000000 s7 0000000000741ce8 s8 0000000000000000 ra: 90000000851b5aa4 __qdisc_run+0xac/0x8d8 ERA: 90000000851b5ac0 __qdisc_run+0xc8/0x8d8 CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) PRMD: 00000004 (PPLV0 +PIE -PWE) EUEN: 00000007 (+FPE +SXE +ASXE -BTE) ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7) ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0) BADV: 0000000000741d58 PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) Modules linked in: bpf_testmod(OE) [last unloaded: bpf_testmod(OE)] Process test_progs (pid: 449, threadinfo=000000009af02b3a, task=00000000e9ba4956) Stack : 0000000000000000 90000001075bc8ac 90000000869524a8 9000000100741ce8 90000001075bc800 9000000100415300 90000001075bc8ac 0000000000000000 900000010597c400 900000008694a000 0000000000000000 9000000105b59000 90000001075bc800 9000000100741ce8 0000000000000050 900000008513000c 9000000086936000 0000000100094d4c fffffff400676208 0000000000000000 9000000105b59000 900000008694a000 9000000086bf0dc0 9000000105b59000 9000000086bf0d68 9000000085147010 90000001075be788 0000000000000000 9000000086bf0f98 0000000000000001 0000000000000010 9000000006015840 0000000000000000 9000000086be6c40 0000000000000000 0000000000000000 0000000000000000 4f2317da8a7e08c4 0000000000000101 4f2317da8a7e08c4 ... Call Trace: [<90000000851b5ac0>] __qdisc_run+0xc8/0x8d8 [<9000000085130008>] __dev_queue_xmit+0x578/0x10f0 [<90000000853701c0>] ip6_finish_output2+0x2f0/0x950 [<9000000085374bc8>] ip6_finish_output+0x2b8/0x448 [<9000000085370b24>] ip6_xmit+0x304/0x858 [<90000000853c4438>] inet6_csk_xmit+0x100/0x170 [<90000000852b32f0>] __tcp_transmit_skb+0x490/0xdd0 [<90000000852b47fc>] tcp_connect+0xbcc/0x1168 [<90000000853b9088>] tcp_v6_connect+0x580/0x8a0 [<90000000852e7738>] __inet_stream_connect+0x170/0x480 [<90000000852e7a98>] inet_stream_connect+0x50/0x88 [<90000000850f2814>] __sys_connect+0xe4/0x110 [<90000000850f2858>] sys_connect+0x18/0x28 [<9000000085520c94>] do_syscall+0x94/0x1a0 [<9000000083df1fb8>] handle_syscall+0xb8/0x158 Code: 4001ad80 2400873f 2400832d <240073cc> 001137ff 001133ff 6407b41f 001503cc 0280041d ---[ end trace 0000000000000000 ]--- The bpf_fifo_dequeue prog returns a skb which is a pointer. The pointer is treated as a 32bit value and sign extend to 64bit in epilogue. This behavior is right for most bpf prog types but wrong for struct ops which requires LoongArch ABI. So let's sign extend struct ops return values according to the LoongArch ABI ([1]) and return value spec in function model. [1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html Cc: stable@vger.kernel.org Fixes: 6abf17d690d8 ("LoongArch: BPF: Add struct ops support for trampoline") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15pwm: loongson: Fix LOONGSON_PWM_FREQ_DEFAULTXi Ruoyao1-1/+1
commit 75604e9a5b60707722028947d6dc6bdacb42282e upstream. Per the 7A1000 and 7A2000 user manual, the clock frequency of their PWM controllers is 50 MHz, not 50 kHz. Fixes: 2b62c89448dd ("pwm: Add Loongson PWM controller support") Signed-off-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Binbin Zhou <zhoubinbin@loongson.cn> Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/r/20250816104904.4779-2-xry111@xry111.site Cc: stable@vger.kernel.org Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15LoongArch: Automatically disable kaslr if boot from kexec_fileYouling Tang1-0/+4
commit c8168b4faf1d62cbb320a3e518ad31cdd567cb05 upstream. Automatically disable kaslr when the kernel loads from kexec_file. kexec_file loads the secondary kernel image to a non-linked address, inherently providing KASLR-like randomization. However, on LoongArch where System RAM may be non-contiguous, enabling KASLR for the second kernel may relocate it to an invalid memory region and cause a boot failure. Thus, we disable KASLR when "kexec_file" is detected in the command line. To ensure compatibility with older kernels loaded via kexec_file, this patch should be backported to stable branches. Cc: stable@vger.kernel.org Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15dm: fix NULL pointer dereference in __dm_suspend()Zheng Qixing1-3/+4
commit 8d33a030c566e1f105cd5bf27f37940b6367f3be upstream. There is a race condition between dm device suspend and table load that can lead to null pointer dereference. The issue occurs when suspend is invoked before table load completes: BUG: kernel NULL pointer dereference, address: 0000000000000054 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 Call Trace: <TASK> blk_mq_quiesce_queue+0x2c/0x50 dm_stop_queue+0xd/0x20 __dm_suspend+0x130/0x330 dm_suspend+0x11a/0x180 dev_suspend+0x27e/0x560 ctl_ioctl+0x4cf/0x850 dm_ctl_ioctl+0xd/0x20 vfs_ioctl+0x1d/0x50 __se_sys_ioctl+0x9b/0xc0 __x64_sys_ioctl+0x19/0x30 x64_sys_call+0x2c4a/0x4620 do_syscall_64+0x9e/0x1b0 The issue can be triggered as below: T1 T2 dm_suspend table_load __dm_suspend dm_setup_md_queue dm_mq_init_request_queue blk_mq_init_allocated_queue => q->mq_ops = set->ops; (1) dm_stop_queue / dm_wait_for_completion => q->tag_set NULL pointer! (2) => q->tag_set = set; (3) Fix this by checking if a valid table (map) exists before performing request-based suspend and waiting for target I/O. When map is NULL, skip these table-dependent suspend steps. Even when map is NULL, no I/O can reach any target because there is no table loaded; I/O submitted in this state will fail early in the DM layer. Skipping the table-dependent suspend logic in this case is safe and avoids NULL pointer dereferences. Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") Cc: stable@vger.kernel.org Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15dm: fix queue start/stop imbalance under suspend/load/resume racesZheng Qixing2-3/+6
commit 7f597c2cdb9d3263a6fce07c4fc0a9eaa8e8fc43 upstream. When suspend and load run concurrently, before q->mq_ops is set in blk_mq_init_allocated_queue(), __dm_suspend() skip dm_stop_queue(). As a result, the queue's quiesce depth is not incremented. Later, once table load has finished and __dm_resume() runs, which triggers q->quiesce_depth ==0 warning in blk_mq_unquiesce_queue(): Call Trace: <TASK> dm_start_queue+0x16/0x20 [dm_mod] __dm_resume+0xac/0xb0 [dm_mod] dm_resume+0x12d/0x150 [dm_mod] do_resume+0x2c2/0x420 [dm_mod] dev_suspend+0x30/0x130 [dm_mod] ctl_ioctl+0x402/0x570 [dm_mod] dm_ctl_ioctl+0x23/0x30 [dm_mod] Fix this by explicitly tracking whether the request queue was stopped in __dm_suspend() via a new DMF_QUEUE_STOPPED flag. Only call dm_start_queue() in __dm_resume() if the queue was actually stopped. Fixes: e70feb8b3e68 ("blk-mq: support concurrent queue quiesce/unquiesce") Cc: stable@vger.kernel.org Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15tracing: Stop fortify-string from warning in tracing_mark_raw_write()Steven Rostedt1-2/+6
commit 54b91e54b113d4f15ab023a44f508251db6e22e7 upstream. The way tracing_mark_raw_write() records its data is that it has the following structure: struct { struct trace_entry; int id; char buf[]; }; But memcpy(&entry->id, buf, size) triggers the following warning when the size is greater than the id: ------------[ cut here ]------------ memcpy: detected field-spanning write (size 6) of single field "&entry->id" at kernel/trace/trace.c:7458 (size 4) WARNING: CPU: 7 PID: 995 at kernel/trace/trace.c:7458 write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0 Modules linked in: CPU: 7 UID: 0 PID: 995 Comm: bash Not tainted 6.17.0-test-00007-g60b82183e78a-dirty #211 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014 RIP: 0010:write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0 Code: 04 00 75 a7 b9 04 00 00 00 48 89 de 48 89 04 24 48 c7 c2 e0 b1 d1 b2 48 c7 c7 40 b2 d1 b2 c6 05 2d 88 6a 04 01 e8 f7 e8 bd ff <0f> 0b 48 8b 04 24 e9 76 ff ff ff 49 8d 7c 24 04 49 8d 5c 24 08 48 RSP: 0018:ffff888104c3fc78 EFLAGS: 00010292 RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 1ffffffff6b363b4 RDI: 0000000000000001 RBP: ffff888100058a00 R08: ffffffffb041d459 R09: ffffed1020987f40 R10: 0000000000000007 R11: 0000000000000001 R12: ffff888100bb9010 R13: 0000000000000000 R14: 00000000000003e3 R15: ffff888134800000 FS: 00007fa61d286740(0000) GS:ffff888286cad000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000560d28d509f1 CR3: 00000001047a4006 CR4: 0000000000172ef0 Call Trace: <TASK> tracing_mark_raw_write+0x1fe/0x290 ? __pfx_tracing_mark_raw_write+0x10/0x10 ? security_file_permission+0x50/0xf0 ? rw_verify_area+0x6f/0x4b0 vfs_write+0x1d8/0xdd0 ? __pfx_vfs_write+0x10/0x10 ? __pfx_css_rstat_updated+0x10/0x10 ? count_memcg_events+0xd9/0x410 ? fdget_pos+0x53/0x5e0 ksys_write+0x182/0x200 ? __pfx_ksys_write+0x10/0x10 ? do_user_addr_fault+0x4af/0xa30 do_syscall_64+0x63/0x350 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fa61d318687 Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff RSP: 002b:00007ffd87fe0120 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fa61d286740 RCX: 00007fa61d318687 RDX: 0000000000000006 RSI: 0000560d28d509f0 RDI: 0000000000000001 RBP: 0000560d28d509f0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000006 R13: 00007fa61d4715c0 R14: 00007fa61d46ee80 R15: 0000000000000000 </TASK> ---[ end trace 0000000000000000 ]--- This is because fortify string sees that the size of entry->id is only 4 bytes, but it is writing more than that. But this is OK as the dynamic_array is allocated to handle that copy. The size allocated on the ring buffer was actually a bit too big: size = sizeof(*entry) + cnt; But cnt includes the 'id' and the buffer data, so adding cnt to the size of *entry actually allocates too much on the ring buffer. Change the allocation to: size = struct_size(entry, buf, cnt - sizeof(entry->id)); and the memcpy() to unsafe_memcpy() with an added justification. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20251011112032.77be18e4@gandalf.local.home Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space") Reported-by: syzbot+9a2ede1643175f350105@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15tracing: Fix tracing_mark_raw_write() to use buf and not ubufSteven Rostedt1-2/+2
commit bda745ee8fbb63330d8f2f2ea4157229a5df959e upstream. The fix to use a per CPU buffer to read user space tested only the writes to trace_marker. But it appears that the selftests are missing tests to the trace_maker_raw file. The trace_maker_raw file is used by applications that writes data structures and not strings into the file, and the tools read the raw ring buffer to process the structures it writes. The fix that reads the per CPU buffers passes the new per CPU buffer to the trace_marker file writes, but the update to the trace_marker_raw write read the data from user space into the per CPU buffer, but then still used then passed the user space address to the function that records the data. Pass in the per CPU buffer and not the user space address. TODO: Add a test to better test trace_marker_raw. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/20251011035243.386098147@kernel.org Fixes: 64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space") Reported-by: syzbot+9a2ede1643175f350105@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68e973f5.050a0220.1186a4.0010.GAE@google.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15tracing: Have trace_marker use per-cpu data to read user spaceSteven Rostedt1-48/+220
commit 64cf7d058a005c5c31eb8a0b741f35dc12915d18 upstream. It was reported that using __copy_from_user_inatomic() can actually schedule. Which is bad when preemption is disabled. Even though there's logic to check in_atomic() is set, but this is a nop when the kernel is configured with PREEMPT_NONE. This is due to page faulting and the code could schedule with preemption disabled. Link: https://lore.kernel.org/all/20250819105152.2766363-1-luogengkun@huaweicloud.com/ The solution was to change the __copy_from_user_inatomic() to copy_from_user_nofault(). But then it was reported that this caused a regression in Android. There's several applications writing into trace_marker() in Android, but now instead of showing the expected data, it is showing: tracing_mark_write: <faulted> After reverting the conversion to copy_from_user_nofault(), Android was able to get the data again. Writes to the trace_marker is a way to efficiently and quickly enter data into the Linux tracing buffer. It takes no locks and was designed to be as non-intrusive as possible. This means it cannot allocate memory, and must use pre-allocated data. A method that is actively being worked on to have faultable system call tracepoints read user space data is to allocate per CPU buffers, and use them in the callback. The method uses a technique similar to seqcount. That is something like this: preempt_disable(); cpu = smp_processor_id(); buffer = this_cpu_ptr(&pre_allocated_cpu_buffers, cpu); do { cnt = nr_context_switches_cpu(cpu); migrate_disable(); preempt_enable(); ret = copy_from_user(buffer, ptr, size); preempt_disable(); migrate_enable(); } while (!ret && cnt != nr_context_switches_cpu(cpu)); if (!ret) ring_buffer_write(buffer); preempt_enable(); It's a little more involved than that, but the above is the basic logic. The idea is to acquire the current CPU buffer, disable migration, and then enable preemption. At this moment, it can safely use copy_from_user(). After reading the data from user space, it disables preemption again. It then checks to see if there was any new scheduling on this CPU. If there was, it must assume that the buffer was corrupted by another task. If there wasn't, then the buffer is still valid as only tasks in preemptable context can write to this buffer and only those that are running on the CPU. By using this method, where trace_marker open allocates the per CPU buffers, trace_marker writes can access user space and even fault it in, without having to allocate or take any locks of its own. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Luo Gengkun <luogengkun@huaweicloud.com> Cc: Wattson CI <wattson-external@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20251008124510.6dba541a@gandalf.local.home Fixes: 3d62ab32df065 ("tracing: Fix tracing_marker may trigger page fault during preempt_disable") Reported-by: Runping Lai <runpinglai@google.com> Tested-by: Runping Lai <runpinglai@google.com> Closes: https://lore.kernel.org/linux-trace-kernel/20251007003417.3470979-2-runpinglai@google.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-15tracing: Fix irqoff tracers on failure of acquiring calltimeSteven Rostedt1-13/+10
commit c834a97962c708ff5bb8582ca76b0e1225feb675 upstream. The functions irqsoff_graph_entry() and irqsoff_graph_return() both call func_prolog_dec() that will test if the data->disable is already set and if not, increment it and return. If it was set, it returns false and the caller exits. The caller of this function must decrement the disable counter, but misses doing so if the calltime fails to be acquired. Instead of exiting out when calltime is NULL, change the logic to do the work if it is not NULL and still do the clean up at the end of the function if it is NULL. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20251008114943.6f60f30f@gandalf.local.home Fixes: a485ea9e3ef3 ("tracing: Fix irqsoff and wakeup latency tracers when using function graph") Reported-by: Sasha Levin <sashal@kernel.org> Closes: https://lore.kernel.org/linux-trace-kernel/20251006175848.1906912-2-sashal@kernel.org/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>