linux.git - Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git

Age	Commit message (Collapse)	Author	Files	Lines
2025-12-07	ksmbd: fix use-after-free in session logoff	Sean Heelan	1	-4/+0
	commit 2fc9feff45d92a92cd5f96487655d5be23fb7e2b upstream. The sess->user object can currently be in use by another thread, for example if another connection has sent a session setup request to bind to the session being free'd. The handler for that connection could be in the smb2_sess_setup function which makes use of sess->user. Signed-off-by: Sean Heelan <seanheelan@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Nazar Kalashnikov <sivartiwe@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	nfsd: Replace clamp_t in nfsd4_get_drc_mem()	NeilBrown	1	-2/+4
	A recent change to clamp_t() in 6.1.y caused fs/nfsd/nfs4state.c to fail to compile with gcc-9. The code in nfsd4_get_drc_mem() was written with the assumption that when "max < min", clamp(val, min, max) would return max. This assumption is not documented as an API promise and the change caused a compile failure if it could be statically determined that "max < min". The relevant code was no longer present upstream when commit 1519fbc8832b ("minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()") landed there, so there is no upstream change to nfsd4_get_drc_mem() to backport. There is no clear case that the existing code in nfsd4_get_drc_mem() is functioning incorrectly. The goal of this patch is to permit the clean application of commit 1519fbc8832b ("minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()"), and any commits that depend on it, to LTS kernels without affecting the ability to compile those kernels. This is done by open-coding the __clamp() macro sans the built-in type checking. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220745#c0 Signed-off-by: NeilBrown <neil@brown.name> Stable-dep-of: 1519fbc8832b ("minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed_by: David Laight <david.laight.linux@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	smb: client: fix memory leak in cifs_construct_tcon()	Paulo Alcantara	1	-0/+1
	commit 3184b6a5a24ec9ee74087b2a550476f386df7dc2 upstream. When having a multiuser mount with domain= specified and using cifscreds, cifs_set_cifscreds() will end up setting @ctx->domainname, so it needs to be freed before leaving cifs_construct_tcon(). This fixes the following memory leak reported by kmemleak: mount.cifs //srv/share /mnt -o domain=ZELDA,multiuser,... su - testuser cifscreds add -d ZELDA -u testuser ... ls /mnt/1 ... umount /mnt echo scan > /sys/kernel/debug/kmemleak cat /sys/kernel/debug/kmemleak unreferenced object 0xffff8881203c3f08 (size 8): comm "ls", pid 5060, jiffies 4307222943 hex dump (first 8 bytes): 5a 45 4c 44 41 00 cc cc ZELDA... backtrace (crc d109a8cf): __kmalloc_node_track_caller_noprof+0x572/0x710 kstrdup+0x3a/0x70 cifs_sb_tlink+0x1209/0x1770 [cifs] cifs_get_fattr+0xe1/0xf50 [cifs] cifs_get_inode_info+0xb5/0x240 [cifs] cifs_revalidate_dentry_attr+0x2d1/0x470 [cifs] cifs_getattr+0x28e/0x450 [cifs] vfs_getattr_nosec+0x126/0x180 vfs_statx+0xf6/0x220 do_statx+0xab/0x110 __x64_sys_statx+0xd5/0x130 do_syscall_64+0xbb/0x380 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: f2aee329a68f ("cifs: set domainName when a domain-key is used in multiuser") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Cc: Jay Shin <jaeshin@redhat.com> Cc: stable@vger.kernel.org Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	cifs: fix typo in enable_gcm_256 module parameter	Steve French	1	-1/+1
	[ Upstream commit f765fdfcd8b5bce92c6aa1a517ff549529ddf590 ] Fix typo in description of enable_gcm_256 module parameter Suggested-by: Thomas Spear <speeddymon@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	exfat: check return value of sb_min_blocksize in exfat_read_boot_sector	Yongpeng Yang	1	-1/+4
	commit f2c1f631630e01821fe4c3fdf6077bc7a8284f82 upstream. sb_min_blocksize() may return 0. Check its return value to avoid accessing the filesystem super block when sb->s_blocksize is 0. Cc: stable@vger.kernel.org # v6.15 Fixes: 719c1e1829166d ("exfat: add super block operations") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Link: https://patch.msgid.link/20251104125009.2111925-3-yangyongpeng.storage@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	eventpoll: Replace rwlock with spinlock	Nam Cao	1	-113/+26
	[ Upstream commit 0c43094f8cc9d3d99d835c0ac9c4fe1ccc62babd ] The ready event list of an epoll object is protected by read-write semaphore: - The consumer (waiter) acquires the write lock and takes items. - the producer (waker) takes the read lock and adds items. The point of this design is enabling epoll to scale well with large number of producers, as multiple producers can hold the read lock at the same time. Unfortunately, this implementation may cause scheduling priority inversion problem. Suppose the consumer has higher scheduling priority than the producer. The consumer needs to acquire the write lock, but may be blocked by the producer holding the read lock. Since read-write semaphore does not support priority-boosting for the readers (even with CONFIG_PREEMPT_RT=y), we have a case of priority inversion: a higher priority consumer is blocked by a lower priority producer. This problem was reported in [1]. Furthermore, this could also cause stall problem, as described in [2]. Fix this problem by replacing rwlock with spinlock. This reduces the event bandwidth, as the producers now have to contend with each other for the spinlock. According to the benchmark from https://github.com/rouming/test-tools/blob/master/stress-epoll.c: On 12 x86 CPUs: Before After Diff threads events/ms events/ms 8 7162 4956 -31% 16 8733 5383 -38% 32 7968 5572 -30% 64 10652 5739 -46% 128 11236 5931 -47% On 4 riscv CPUs: Before After Diff threads events/ms events/ms 8 2958 2833 -4% 16 3323 3097 -7% 32 3451 3240 -6% 64 3554 3178 -11% 128 3601 3235 -10% Although the numbers look bad, it should be noted that this benchmark creates multiple threads who do nothing except constantly generating new epoll events, thus contention on the spinlock is high. For real workload, the event rate is likely much lower, and the performance drop is not as bad. Using another benchmark (perf bench epoll wait) where spinlock contention is lower, improvement is even observed on x86: On 12 x86 CPUs: Before: Averaged 110279 operations/sec (+- 1.09%), total secs = 8 After: Averaged 114577 operations/sec (+- 2.25%), total secs = 8 On 4 riscv CPUs: Before: Averaged 175767 operations/sec (+- 0.62%), total secs = 8 After: Averaged 167396 operations/sec (+- 0.23%), total secs = 8 In conclusion, no one is likely to be upset over this change. After all, spinlock was used originally for years, and the commit which converted to rwlock didn't mention a real workload, just that the benchmark numbers are nice. This patch is not exactly the revert of commit a218cc491420 ("epoll: use rwlock in order to reduce ep_poll_callback() contention"), because git revert conflicts in some places which are not obvious on the resolution. This patch is intended to be backported, therefore go with the obvious approach: - Replace rwlock_t with spinlock_t one to one - Delete list_add_tail_lockless() and chain_epi_lockless(). These were introduced to allow producers to concurrently add items to the list. But now that spinlock no longer allows producers to touch the event list concurrently, these two functions are not necessary anymore. Fixes: a218cc491420 ("epoll: use rwlock in order to reduce ep_poll_callback() contention") Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/ec92458ea357ec503c737ead0f10b2c6e4c37d47.1752581388.git.namcao@linutronix.de Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Cc: stable@vger.kernel.org Reported-by: Frederic Weisbecker <frederic@kernel.org> Closes: https://lore.kernel.org/linux-rt-users/20210825132754.GA895675@lothringen/ [1] Reported-by: Valentin Schneider <vschneid@redhat.com> Closes: https://lore.kernel.org/linux-rt-users/xhsmhttqvnall.mognet@vschneid.remote.csb/ [2] Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Florian Bezdeka <florian.bezdeka@siemens.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	btrfs: do not update last_log_commit when logging inode due to a new name	Filipe Manana	1	-1/+1
	commit bfe3d755ef7cec71aac6ecda34a107624735aac7 upstream. When logging that a new name exists, we skip updating the inode's last_log_commit field to prevent a later explicit fsync against the inode from doing nothing (as updating last_log_commit makes btrfs_inode_in_log() return true). We are detecting, at btrfs_log_inode(), that logging a new name is happening by checking the logging mode is not LOG_INODE_EXISTS, but that is not enough because we may log parent directories when logging a new name of a file in LOG_INODE_ALL mode - we need to check that the logging_new_name field of the log context too. An example scenario where this results in an explicit fsync against a directory not persisting changes to the directory is the following: $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt $ touch /mnt/foo $ sync $ mkdir /mnt/dir # Write some data to our file and fsync it. $ xfs_io -c "pwrite -S 0xab 0 64K" -c "fsync" /mnt/foo # Add a new link to our file. Since the file was logged before, we # update it in the log tree by calling btrfs_log_new_name(). $ ln /mnt/foo /mnt/dir/bar # fsync the root directory - we expect it to persist the dentry for # the new directory "dir". $ xfs_io -c "fsync" /mnt <power fail> After mounting the fs the entry for directory "dir" does not exists, despite the explicit fsync on the root directory. Here's why this happens: 1) When we fsync the file we log the inode, so that it's present in the log tree; 2) When adding the new link we enter btrfs_log_new_name(), and since the inode is in the log tree we proceed to updating the inode in the log tree; 3) We first set the inode's last_unlink_trans to the current transaction (early in btrfs_log_new_name()); 4) We then eventually enter btrfs_log_inode_parent(), and after logging the file's inode, we call btrfs_log_all_parents() because the inode's last_unlink_trans matches the current transaction's ID (updated in the previous step); 5) So btrfs_log_all_parents() logs the root directory by calling btrfs_log_inode() for the root's inode with a log mode of LOG_INODE_ALL so that new dentries are logged; 6) At btrfs_log_inode(), because the log mode is LOG_INODE_ALL, we update root inode's last_log_commit to the last transaction that changed the inode (->last_sub_trans field of the inode), which corresponds to the current transaction's ID; 7) Then later when user space explicitly calls fsync against the root directory, we enter btrfs_sync_file(), which calls skip_inode_logging() and that returns true, since its call to btrfs_inode_in_log() returns true and there are no ordered extents (it's a directory, never has ordered extents). This results in btrfs_sync_file() returning without syncing the log or committing the current transaction, so all the updates we did when logging the new name, including logging the root directory, are not persisted. So fix this by but updating the inode's last_log_commit if we are sure we are not logging a new name (if ctx->logging_new_name is false). A test case for fstests will follow soon. Reported-by: Vyacheslav Kovalevsky <slava.kovalevskiy.2014@gmail.com> Link: https://lore.kernel.org/linux-btrfs/03c5d7ec-5b3d-49d1-95bc-8970a7f82d87@gmail.com/ Fixes: 130341be7ffa ("btrfs: always update the logged transaction when logging new names") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	fs/proc: fix uaf in proc_readdir_de()	Wei Yang	1	-3/+9
	commit 895b4c0c79b092d732544011c3cecaf7322c36a1 upstream. Pde is erased from subdir rbtree through rb_erase(), but not set the node to EMPTY, which may result in uaf access. We should use RB_CLEAR_NODE() set the erased node to EMPTY, then pde_subdir_next() will return NULL to avoid uaf access. We found an uaf issue while using stress-ng testing, need to run testcase getdent and tun in the same time. The steps of the issue is as follows: 1) use getdent to traverse dir /proc/pid/net/dev_snmp6/, and current pde is tun3; 2) in the [time windows] unregister netdevice tun3 and tun2, and erase them from rbtree. erase tun3 first, and then erase tun2. the pde(tun2) will be released to slab; 3) continue to getdent process, then pde_subdir_next() will return pde(tun2) which is released, it will case uaf access. CPU 0 \| CPU 1 ------------------------------------------------------------------------- traverse dir /proc/pid/net/dev_snmp6/ \| unregister_netdevice(tun->dev) //tun3 tun2 sys_getdents64() \| iterate_dir() \| proc_readdir() \| proc_readdir_de() \| snmp6_unregister_dev() pde_get(de); \| proc_remove() read_unlock(&proc_subdir_lock); \| remove_proc_subtree() \| write_lock(&proc_subdir_lock); [time window] \| rb_erase(&root->subdir_node, &parent->subdir); \| write_unlock(&proc_subdir_lock); read_lock(&proc_subdir_lock); \| next = pde_subdir_next(de); \| pde_put(de); \| de = next; //UAF \| rbtree of dev_snmp6 \| pde(tun3) / \ NULL pde(tun2) Link: https://lkml.kernel.org/r/20251025024233.158363-1-albin_yang@163.com Signed-off-by: Wei Yang <albinwyang@tencent.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: wangzijie <wangzijie1@honor.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	ksmbd: close accepted socket when per-IP limit rejects connection	Joshua Rogers	1	-1/+4
	commit 98a5fd31cbf72d46bf18e50b3ab0ce86d5f319a9 upstream. When the per-IP connection limit is exceeded in ksmbd_kthread_fn(), the code sets ret = -EAGAIN and continues the accept loop without closing the just-accepted socket. That leaks one socket per rejected attempt from a single IP and enables a trivial remote DoS. Release client_sk before continuing. This bug was found with ZeroPath. Cc: stable@vger.kernel.org Signed-off-by: Joshua Rogers <linux@joshua.hu> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	NFSD: free copynotify stateid in nfs4_free_ol_stateid()	Olga Kornievskaia	1	-1/+2
	commit 4aa17144d5abc3c756883e3a010246f0dba8b468 upstream. Typically copynotify stateid is freed either when parent's stateid is being close/freed or in nfsd4_laundromat if the stateid hasn't been used in a lease period. However, in case when the server got an OPEN (which created a parent stateid), followed by a COPY_NOTIFY using that stateid, followed by a client reboot. New client instance while doing CREATE_SESSION would force expire previous state of this client. It leads to the open state being freed thru release_openowner-> nfs4_free_ol_stateid() and it finds that it still has copynotify stateid associated with it. We currently print a warning and is triggerred WARNING: CPU: 1 PID: 8858 at fs/nfsd/nfs4state.c:1550 nfs4_free_ol_stateid+0xb0/0x100 [nfsd] This patch, instead, frees the associated copynotify stateid here. If the parent stateid is freed (without freeing the copynotify stateids associated with it), it leads to the list corruption when laundromat ends up freeing the copynotify state later. [ 1626.839430] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP [ 1626.842828] Modules linked in: nfnetlink_queue nfnetlink_log bluetooth cfg80211 rpcrdma rdma_cm iw_cm ib_cm ib_core nfsd nfs_acl lockd grace nfs_localio ext4 crc16 mbcache jbd2 overlay uinput snd_seq_dummy snd_hrtimer qrtr rfkill vfat fat uvcvideo snd_hda_codec_generic videobuf2_vmalloc videobuf2_memops snd_hda_intel uvc snd_intel_dspcfg videobuf2_v4l2 videobuf2_common snd_hda_codec snd_hda_core videodev snd_hwdep snd_seq mc snd_seq_device snd_pcm snd_timer snd soundcore sg loop auth_rpcgss vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vmw_vmci vsock xfs 8021q garp stp llc mrp nvme ghash_ce e1000e nvme_core sr_mod nvme_keyring nvme_auth cdrom vmwgfx drm_ttm_helper ttm sunrpc dm_mirror dm_region_hash dm_log iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse dm_multipath dm_mod nfnetlink [ 1626.855594] CPU: 2 UID: 0 PID: 199 Comm: kworker/u24:33 Kdump: loaded Tainted: G B W 6.17.0-rc7+ #22 PREEMPT(voluntary) [ 1626.857075] Tainted: [B]=BAD_PAGE, [W]=WARN [ 1626.857573] Hardware name: VMware, Inc. VMware20,1/VBSA, BIOS VMW201.00V.24006586.BA64.2406042154 06/04/2024 [ 1626.858724] Workqueue: nfsd4 laundromat_main [nfsd] [ 1626.859304] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 1626.860010] pc : __list_del_entry_valid_or_report+0x148/0x200 [ 1626.860601] lr : __list_del_entry_valid_or_report+0x148/0x200 [ 1626.861182] sp : ffff8000881d7a40 [ 1626.861521] x29: ffff8000881d7a40 x28: 0000000000000018 x27: ffff0000c2a98200 [ 1626.862260] x26: 0000000000000600 x25: 0000000000000000 x24: ffff8000881d7b20 [ 1626.862986] x23: ffff0000c2a981e8 x22: 1fffe00012410e7d x21: ffff0000920873e8 [ 1626.863701] x20: ffff0000920873e8 x19: ffff000086f22998 x18: 0000000000000000 [ 1626.864421] x17: 20747562202c3839 x16: 3932326636383030 x15: 3030666666662065 [ 1626.865092] x14: 6220646c756f6873 x13: 0000000000000001 x12: ffff60004fd9e4a3 [ 1626.865713] x11: 1fffe0004fd9e4a2 x10: ffff60004fd9e4a2 x9 : dfff800000000000 [ 1626.866320] x8 : 00009fffb0261b5e x7 : ffff00027ecf2513 x6 : 0000000000000001 [ 1626.866938] x5 : ffff00027ecf2510 x4 : ffff60004fd9e4a3 x3 : 0000000000000000 [ 1626.867553] x2 : 0000000000000000 x1 : ffff000096069640 x0 : 000000000000006d [ 1626.868167] Call trace: [ 1626.868382] __list_del_entry_valid_or_report+0x148/0x200 (P) [ 1626.868876] _free_cpntf_state_locked+0xd0/0x268 [nfsd] [ 1626.869368] nfs4_laundromat+0x6f8/0x1058 [nfsd] [ 1626.869813] laundromat_main+0x24/0x60 [nfsd] [ 1626.870231] process_one_work+0x584/0x1050 [ 1626.870595] worker_thread+0x4c4/0xc60 [ 1626.870893] kthread+0x2f8/0x398 [ 1626.871146] ret_from_fork+0x10/0x20 [ 1626.871422] Code: aa1303e1 aa1403e3 910e8000 97bc55d7 (d4210000) [ 1626.871892] SMP: stopping secondary CPUs Reported-by: rtm@csail.mit.edu Closes: https://lore.kernel.org/linux-nfs/d8f064c1-a26f-4eed-b4f0-1f7f608f415f@oracle.com/T/#t Fixes: 624322f1adc5 ("NFSD add COPY_NOTIFY operation") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	NFSv4: Fix an incorrect parameter when calling nfs4_call_sync()	Trond Myklebust	1	-3/+6
	[ Upstream commit 1f214e9c3aef2d0936be971072e991d78a174d71 ] The Smatch static checker noted that in _nfs4_proc_lookupp(), the flag RPC_TASK_TIMEOUT is being passed as an argument to nfs4_init_sequence(), which is clearly incorrect. Since LOOKUPP is an idempotent operation, nfs4_init_sequence() should not ask the server to cache the result. The RPC_TASK_TIMEOUT flag needs to be passed down to the RPC layer. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Fixes: 76998ebb9158 ("NFSv4: Observe the NFS_MOUNT_SOFTREVAL flag in _nfs4_proc_lookupp") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	smb/server: fix possible refcount leak in smb2_sess_setup()	ZhangGuoDong	1	-0/+1
	[ Upstream commit 379510a815cb2e64eb0a379cb62295d6ade65df0 ] Reference count of ksmbd_session will leak when session need reconnect. Fix this by adding the missing ksmbd_user_session_put(). Co-developed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	smb/server: fix possible memory leak in smb2_read()	ZhangGuoDong	1	-0/+1
	[ Upstream commit 6fced056d2cc8d01b326e6fcfabaacb9850b71a4 ] Memory leak occurs when ksmbd_vfs_read() fails. Fix this by adding the missing kvfree(). Co-developed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	NFS: check if suid/sgid was cleared after a write as needed	Scott Mayhew	1	-1/+2
	[ Upstream commit 9ff022f3820a31507cb93be6661bf5f3ca0609a4 ] I noticed xfstests generic/193 and generic/355 started failing against knfsd after commit e7a8ebc305f2 ("NFSD: Offer write delegation for OPEN with OPEN4_SHARE_ACCESS_WRITE"). I ran those same tests against ONTAP (which has had write delegation support for a lot longer than knfsd) and they fail there too... so while it's a new failure against knfsd, it isn't an entirely new failure. Add the NFS_INO_REVAL_FORCED flag so that the presence of a delegation doesn't keep the inode from being revalidated to fetch the updated mode. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	NFS4: Fix state renewals missing after boot	Joshua Watt	1	-0/+1
	[ Upstream commit 9bb3baa9d1604cd20f49ae7dac9306b4037a0e7a ] Since the last renewal time was initialized to 0 and jiffies start counting at -5 minutes, any clients connected in the first 5 minutes after a reboot would have their renewal timer set to a very long interval. If the connection was idle, this would result in the client state timing out on the server and the next call to the server would return NFS4ERR_BADSESSION. Fix this by initializing the last renewal time to the current jiffies instead of 0. Signed-off-by: Joshua Watt <jpewhacker@gmail.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	smb: client: fix refcount leak in smb2_set_path_attr	Shuhao Fu	1	-0/+2
	[ Upstream commit b540de9e3b4fab3b9e10f30714a6f5c1b2a50ec3 ] Fix refcount leak in `smb2_set_path_attr` when path conversion fails. Function `cifs_get_writable_path` returns `cfile` with its reference counter `cfile->count` increased on success. Function `smb2_compound_op` would decrease the reference counter for `cfile`, as stated in its comment. By calling `smb2_rename_path`, the reference counter of `cfile` would leak if `cifs_convert_path_to_utf16` fails in `smb2_set_path_attr`. Fixes: 8de9e86c67ba ("cifs: create a helper to find a writeable handle by path name") Acked-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	smb: client: validate change notify buffer before copy	Joshua Rogers	1	-2/+5
	commit 4012abe8a78fbb8869634130024266eaef7081fe upstream. SMB2_change_notify called smb2_validate_iov() but ignored the return code, then kmemdup()ed using server provided OutputBufferOffset/Length. Check the return of smb2_validate_iov() and bail out on error. Discovered with help from the ZeroPath security tooling. Signed-off-by: Joshua Rogers <linux@joshua.hu> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Cc: stable@vger.kernel.org Fixes: e3e9463414f61 ("smb3: improve SMB3 change notification support") Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	ceph: add checking of wait_for_completion_killable() return value	Viacheslav Dubeyko	1	-1/+4
	[ Upstream commit b7ed1e29cfe773d648ca09895b92856bd3a2092d ] The Coverity Scan service has detected the calling of wait_for_completion_killable() without checking the return value in ceph_lock_wait_for_completion() [1]. The CID 1636232 defect contains explanation: "If the function returns an error value, the error value may be mistaken for a normal value. In ceph_lock_wait_for_completion(): Value returned from a function is not checked for errors before being used. (CWE-252)". The patch adds the checking of wait_for_completion_killable() return value and return the error code from ceph_lock_wait_for_completion(). [1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1636232 Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	9p: sysfs_init: don't hardcode error to ENOMEM	Randall P. Embry	1	-2/+5
	[ Upstream commit 528f218b31aac4bbfc58914d43766a22ab545d48 ] v9fs_sysfs_init() always returned -ENOMEM on failure; return the actual sysfs_create_group() error instead. Signed-off-by: Randall P. Embry <rpembry@gmail.com> Message-ID: <20250926-v9fs_misc-v1-3-a8b3907fc04d@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	9p: fix /sys/fs/9p/caches overwriting itself	Randall P. Embry	1	-1/+1
	[ Upstream commit 86db0c32f16c5538ddb740f54669ace8f3a1f3d7 ] caches_show() overwrote its buffer on each iteration, so only the last cache tag was visible in sysfs output. Properly append with snprintf(buf + count, …). Signed-off-by: Randall P. Embry <rpembry@gmail.com> Message-ID: <20250926-v9fs_misc-v1-2-a8b3907fc04d@codewreck.org> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	fs/hpfs: Fix error code for new_inode() failure in mkdir/create/mknod/symlink	Yikang Yue	1	-6/+12
	[ Upstream commit 32058c38d3b79a28963a59ac0353644dc24775cd ] The function call new_inode() is a primitive for allocating an inode in memory, rather than planning disk space for it. Therefore, -ENOMEM should be returned as the error code rather than -ENOSPC. To be specific, new_inode()'s call path looks like this: new_inode new_inode_pseudo alloc_inode ops->alloc_inode (hpfs_alloc_inode) alloc_inode_sb kmem_cache_alloc_lru Therefore, the failure of new_inode() indicates a memory presure issue (-ENOMEM), not a lack of disk space. However, the current implementation of hpfs_mkdir/create/mknod/symlink incorrectly returns -ENOSPC when new_inode() fails. This patch fix this by set err to -ENOMEM before the goto statement. BTW, we also noticed that other nested calls within these four functions, like hpfs_alloc_f/dnode and hpfs_add_dirent, might also fail due to memory presure. But similarly, only -ENOSPC is returned. Addressing these will involve code modifications in other functions, and we plan to submit dedicated patches for these issues in the future. For this patch, we focus on new_inode(). Signed-off-by: Yikang Yue <yikangy2@illinois.edu> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	btrfs: mark dirty extent range for out of bound prealloc extents	austinchang	1	-0/+10
	[ Upstream commit 3b1a4a59a2086badab391687a6a0b86e03048393 ] In btrfs_fallocate(), when the allocated range overlaps with a prealloc extent and the extent starts after i_size, the range doesn't get marked dirty in file_extent_tree. This results in persisting an incorrect disk_i_size for the inode when not using the no-holes feature. This is reproducible since commit 41a2ee75aab0 ("btrfs: introduce per-inode file extent tree"), then became hidden since commit 3d7db6e8bd22 ("btrfs: don't allocate file extent tree for non regular files") and then visible again after commit 8679d2687c35 ("btrfs: initialize inode::file_extent_tree after i_mode has been set"), which fixes the previous commit. The following reproducer triggers the problem: $ cat test.sh MNT=/mnt/test DEV=/dev/vdb mkdir -p $MNT mkfs.btrfs -f -O ^no-holes $DEV mount $DEV $MNT touch $MNT/file1 fallocate -n -o 1M -l 2M $MNT/file1 umount $MNT mount $DEV $MNT len=$((1 * 1024 * 1024)) fallocate -o 1M -l $len $MNT/file1 du --bytes $MNT/file1 umount $MNT mount $DEV $MNT du --bytes $MNT/file1 umount $MNT Running the reproducer gives the following result: $ ./test.sh (...) 2097152 /mnt/test/file1 1048576 /mnt/test/file1 The difference is exactly 1048576 as we assigned. Fix by adding a call to btrfs_inode_set_file_extent_range() in btrfs_fallocate_update_isize(). Fixes: 41a2ee75aab0 ("btrfs: introduce per-inode file extent tree") Signed-off-by: austinchang <austinchang@synology.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	smb: client: transport: avoid reconnects triggered by pending task work	Fiona Ebner	1	-1/+9
	[ Upstream commit 00be6f26a2a7c671f1402d74c4d3c30a5844660a ] When io_uring is used in the same task as CIFS, there might be unnecessary reconnects, causing issues in user-space applications like QEMU with a log like: > CIFS: VFS: \\10.10.100.81 Error -512 sending data on socket to server Certain io_uring completions might be added to task_work with notify_method being TWA_SIGNAL and thus TIF_NOTIFY_SIGNAL is set for the task. In __smb_send_rqst(), signals are masked before calling smb_send_kvec(), but the masking does not apply to TIF_NOTIFY_SIGNAL. If sk_stream_wait_memory() is reached via sock_sendmsg() while TIF_NOTIFY_SIGNAL is set, signal_pending(current) will evaluate to true there, and -EINTR will be propagated all the way from sk_stream_wait_memory() to sock_sendmsg() in smb_send_kvec(). Afterwards, __smb_send_rqst() will see that not everything was written and reconnect. Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	ksmbd: use sock_create_kern interface to create kernel socket	Namjae Jeon	1	-3/+4
	[ Upstream commit 3677ca67b9791481af16d86e47c3c7d1f2442f95 ] we should use sock_create_kern() if the socket resides in kernel space. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	orangefs: fix xattr related buffer overflow...	Mike Marshall	1	-5/+7
	[ Upstream commit 025e880759c279ec64d0f754fe65bf45961da864 ] Willy Tarreau <w@1wt.eu> forwarded me a message from Disclosure <disclosure@aisle.com> with the following warning: > The helper `xattr_key()` uses the pointer variable in the loop condition > rather than dereferencing it. As `key` is incremented, it remains non-NULL > (until it runs into unmapped memory), so the loop does not terminate on > valid C strings and will walk memory indefinitely, consuming CPU or hanging > the thread. I easily reproduced this with setfattr and getfattr, causing a kernel oops, hung user processes and corrupted orangefs files. Disclosure sent along a diff (not a patch) with a suggested fix, which I based this patch on. After xattr_key started working right, xfstest generic/069 exposed an xattr related memory leak that lead to OOM. xattr_key returns a hashed key. When adding xattrs to the orangefs xattr cache, orangefs used hash_add, a kernel hashing macro. hash_add also hashes the key using hash_log which resulted in additions to the xattr cache going to the wrong hash bucket. generic/069 tortures a single file and orangefs does a getattr for the xattr "security.capability" every time. Orangefs negative caches on xattrs which includes a kmalloc. Since adds to the xattr cache were going to the wrong bucket, every getattr for "security.capability" resulted in another kmalloc, none of which were ever freed. I changed the two uses of hash_add to hlist_add_head instead and the memory leak ceased and generic/069 quit throwing furniture. Signed-off-by: Mike Marshall <hubcap@omnibond.com> Reported-by: Stanislav Fort of Aisle Research <stanislav.fort@aisle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	exfat: limit log print for IO error	Chi Zhiling	1	-5/+6
	[ Upstream commit 6dfba108387bf4e71411b3da90b2d5cce48ba054 ] For exFAT filesystems with 4MB read_ahead_size, removing the storage device when the read operation is in progress, which cause the last read syscall spent 150s [1]. The main reason is that exFAT generates excessive log messages [2]. After applying this patch, approximately 300,000 lines of log messages were suppressed, and the delay of the last read() syscall was reduced to about 4 seconds. [1]: write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072 <0.000120> read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072 <0.000032> write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072 <0.000119> read(4, 0x7fccf28ae000, 131072) = -1 EIO (Input/output error) <150.186215> [2]: [ 333.696603] exFAT-fs (vdb): error, failed to access to FAT (entry 0x0000d780, err:-5) [ 333.697378] exFAT-fs (vdb): error, failed to access to FAT (entry 0x0000d780, err:-5) [ 333.698156] exFAT-fs (vdb): error, failed to access to FAT (entry 0x0000d780, err:-5) Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	ext4: increase IO priority of fastcommit	Julian Sun	1	-1/+1
	[ Upstream commit 46e75c56dfeafb6756773b71cabe187a6886859a ] The following code paths may result in high latency or even task hangs: 1. fastcommit io is throttled by wbt. 2. jbd2_fc_wait_bufs() might wait for a long time while JBD2_FAST_COMMIT_ONGOING is set in journal->flags, and then jbd2_journal_commit_transaction() waits for the JBD2_FAST_COMMIT_ONGOING bit for a long time while holding the write lock of j_state_lock. 3. start_this_handle() waits for read lock of j_state_lock which results in high latency or task hang. Given the fact that ext4_fc_commit() already modifies the current process' IO priority to match that of the jbd2 thread, it should be reasonable to match jbd2's IO submission flags as well. Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Julian Sun <sunjunchao@bytedance.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Message-ID: <20250827121812.1477634-1-sunjunchao@bytedance.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	fs: ext4: change GFP_KERNEL to GFP_NOFS to avoid deadlock	chuguangqing	1	-1/+1
	[ Upstream commit 1534f72dc2a11ded38b0e0268fbcc0ca24e9fd4a ] The parent function ext4_xattr_inode_lookup_create already uses GFP_NOFS for memory alloction, so the function ext4_xattr_inode_cache_find should use same gfp_flag. Signed-off-by: chuguangqing <chuguangqing@inspur.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	nfs4_setup_readdir(): insufficient locking for ->d_parent->d_inode dereferencing	Al Viro	1	-0/+2
	[ Upstream commit a890a2e339b929dbd843328f9a92a1625404fe63 ] Theoretically it's an oopsable race, but I don't believe one can manage to hit it on real hardware; might become doable on a KVM, but it still won't be easy to attack. Anyway, it's easy to deal with - since xdr_encode_hyper() is just a call of put_unaligned_be64(), we can put that under ->d_lock and be done with that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	NFSv4.1: fix mount hang after CREATE_SESSION failure	Anthony Iliopoulos	1	-0/+3
	[ Upstream commit bf75ad096820fee5da40e671ebb32de725a1c417 ] When client initialization goes through server trunking discovery, it schedules the state manager and then sleeps waiting for nfs_client initialization completion. The state manager can fail during state recovery, and specifically in lease establishment as nfs41_init_clientid() will bail out in case of errors returned from nfs4_proc_create_session(), without ever marking the client ready. The session creation can fail for a variety of reasons e.g. during backchannel parameter negotiation, with status -EINVAL. The error status will propagate all the way to the nfs4_state_manager but the client status will not be marked, and thus the mount process will remain blocked waiting. Fix it by adding -EINVAL error handling to nfs4_state_manager(). Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	NFSv4: handle ERR_GRACE on delegation recalls	Olga Kornievskaia	1	-2/+2
	[ Upstream commit be390f95242785adbf37d7b8a5101dd2f2ba891b ] RFC7530 states that clients should be prepared for the return of NFS4ERR_GRACE errors for non-reclaim lock and I/O requests. Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	jfs: fix uninitialized waitqueue in transaction manager	Shaurya Rane	1	-4/+5
	[ Upstream commit 300b072df72694ea330c4c673c035253e07827b8 ] The transaction manager initialization in txInit() was not properly initializing TxBlock[0].waitor waitqueue, causing a crash when txEnd(0) is called on read-only filesystems. When a filesystem is mounted read-only, txBegin() returns tid=0 to indicate no transaction. However, txEnd(0) still gets called and tries to access TxBlock[0].waitor via tid_to_tblock(0), but this waitqueue was never initialized because the initialization loop started at index 1 instead of 0. This causes a 'non-static key' lockdep warning and system crash: INFO: trying to register non-static key in txEnd Fix by ensuring all transaction blocks including TxBlock[0] have their waitqueues properly initialized during txInit(). Reported-by: syzbot+c4f3462d8b2ad7977bea@syzkaller.appspotmail.com Signed-off-by: Shaurya Rane <ssrane_b23@ee.vjti.ac.in> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	jfs: Verify inode mode when loading from disk	Tetsuo Handa	1	-1/+7
	[ Upstream commit 7a5aa54fba2bd591b22b9b624e6baa9037276986 ] The inode mode loaded from corrupted disk can be invalid. Do like what commit 0a9e74051313 ("isofs: Verify inode mode when loading from disk") does. Reported-by: syzbot <syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	allow finish_no_open(file, ERR_PTR(-E...))	Al Viro	1	-4/+6
	[ Upstream commit fe91e078b60d1beabf5cef4a37c848457a6d2dfb ] ... allowing any ->lookup() return value to be passed to it. Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	ntfs3: pretend $Extend records as regular files	Tetsuo Handa	1	-0/+1
	[ Upstream commit 4e8011ffec79717e5fdac43a7e79faf811a384b7 ] Since commit af153bb63a33 ("vfs: catch invalid modes in may_open()") requires any inode be one of S_IFDIR/S_IFLNK/S_IFREG/S_IFCHR/S_IFBLK/ S_IFIFO/S_IFSOCK type, use S_IFREG for $Extend records. Reported-by: syzbot <syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	nilfs2: fix deadlock warnings caused by lock dependency in init_nilfs()	Ryusuke Konishi	1	-3/+0
	commit fb881cd7604536b17a1927fb0533f9a6982ffcc5 upstream. After commit c0e473a0d226 ("block: fix race between set_blocksize and read paths") was merged, set_blocksize() called by sb_set_blocksize() now locks the inode of the backing device file. As a result of this change, syzbot started reporting deadlock warnings due to a circular dependency involving the semaphore "ns_sem" of the nilfs object, the inode lock of the backing device file, and the locks that this inode lock is transitively dependent on. This is caused by a new lock dependency added by the above change, since init_nilfs() calls sb_set_blocksize() in the lock section of "ns_sem". However, these warnings are false positives because init_nilfs() is called in the early stage of the mount operation and the filesystem has not yet started. The reason why "ns_sem" is locked in init_nilfs() was to avoid a race condition in nilfs_fill_super() caused by sharing a nilfs object among multiple filesystem instances (super block structures) in the early implementation. However, nilfs objects and super block structures have long ago become one-to-one, and there is no longer any need to use the semaphore there. So, fix this issue by removing the use of the semaphore "ns_sem" in init_nilfs(). Link: https://lkml.kernel.org/r/20250503053327.12294-1-konishi.ryusuke@gmail.com Fixes: c0e473a0d226 ("block: fix race between set_blocksize and read paths") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+00f7f5b884b117ee6773@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=00f7f5b884b117ee6773 Tested-by: syzbot+00f7f5b884b117ee6773@syzkaller.appspotmail.com Reported-by: syzbot+f30591e72bfc24d4715b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f30591e72bfc24d4715b Tested-by: syzbot+f30591e72bfc24d4715b@syzkaller.appspotmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mahmoud Adam <mngyadam@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	direct_write_fallback(): on error revert the ->ki_pos update from buffered write	Al Viro	1	-0/+1
	commit 8287474aa5ffb41df52552c4ae4748e791d2faf2 upstream. If we fail filemap_write_and_wait_range() on the range the buffered write went into, we only report the "number of bytes which we direct-written", to quote the comment in there. Which is fine, but buffered write has already advanced iocb->ki_pos, so we need to roll that back. Otherwise we end up with e.g. write(2) advancing position by more than the amount it reports having written. Fixes: 182c25e9c157 "filemap: update ki_pos in generic_perform_write" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Message-Id: <20230827214518.GU3390869@ZenIV> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Mahmoud Adam <mngyadam@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	fs: factor out a direct_write_fallback helper	Christoph Hellwig	1	-0/+41
	commit 44fff0fa08ec5a6d9d5fb05443a36d854d0ece4d upstream. Add a helper dealing with handling the syncing of a buffered write fallback for direct I/O. Link: https://lkml.kernel.org/r/20230601145904.1385409-10-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Chao Yu <chao@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Hannes Reinecke <hare@suse.de> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Xiubo Li <xiubli@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [backing_dev_info still being used here. do small changes to the patch to keep the out label. Which means replacing all returns to goto out.] Signed-off-by: Mahmoud Adam <mngyadam@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	filemap: update ki_pos in generic_perform_write	Christoph Hellwig	4	-10/+3
	commit 182c25e9c157f37bd0ab5a82fe2417e2223df459 upstream. All callers of generic_perform_write need to updated ki_pos, move it into common code. Link: https://lkml.kernel.org/r/20230601145904.1385409-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Theodore Ts'o <tytso@mit.edu> Acked-by: Darrick J. Wong <djwong@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Chao Yu <chao@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mahmoud Adam <mngyadam@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	filemap: add a kiocb_invalidate_post_direct_write helper	Christoph Hellwig	2	-18/+4
	commit c402a9a9430b670926decbb284b756ee6f47c1ec upstream. Add a helper to invalidate page cache after a dio write. Link: https://lkml.kernel.org/r/20230601145904.1385409-7-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Darrick J. Wong <djwong@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Chao Yu <chao@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Xiubo Li <xiubli@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mahmoud Adam <mngyadam@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	NFSD: Fix crash in nfsd4_read_release()	Chuck Lever	1	-3/+4
	commit abb1f08a2121dd270193746e43b2a9373db9ad84 upstream. When tracing is enabled, the trace_nfsd_read_done trace point crashes during the pynfs read.testNoFh test. Fixes: 15a8b55dbb1b ("nfsd: call op_release, even when op_func returns an error") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-07	btrfs: use smp_mb__after_atomic() when forcing COW in create_pending_snapshot()	Filipe Manana	1	-1/+1
	[ Upstream commit 45c222468d33202c07c41c113301a4b9c8451b8f ] After setting the BTRFS_ROOT_FORCE_COW flag on the root we are doing a full write barrier, smp_wmb(), but we don't need to, all we need is a smp_mb__after_atomic(). The use of the smp_wmb() is from the old days when we didn't use a bit and used instead an int field in the root to signal if cow is forced. After the int field was changed to a bit in the root's state (flags field), we forgot to update the memory barrier in create_pending_snapshot() to smp_mb__after_atomic(), but we did the change in commit_fs_roots() after clearing BTRFS_ROOT_FORCE_COW. That happened in commit 27cdeb7096b8 ("Btrfs: use bitfield instead of integer data type for the some variants in btrfs_root"). On the reader side, in should_cow_block(), we also use the counterpart smp_mb__before_atomic() which generates further confusion. So change the smp_wmb() to smp_mb__after_atomic(). In fact we don't even need any barrier at all since create_pending_snapshot() is called in the critical section of a transaction commit and therefore no one can concurrently join/attach the transaction, or start a new one, until the transaction is unblocked. By the time someone starts a new transaction and enters should_cow_block(), a lot of implicit memory barriers already took place by having acquired several locks such as fs_info->trans_lock and extent buffer locks on the root node at least. Nevertlheless, for consistency use smp_mb__after_atomic() after setting the force cow bit in create_pending_snapshot(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	btrfs: always drop log root tree reference in btrfs_replay_log()	Filipe Manana	2	-2/+1
	[ Upstream commit 2f5b8095ea47b142c56c09755a8b1e14145a2d30 ] Currently we have this odd behaviour: 1) At btrfs_replay_log() we drop the reference of the log root tree if the call to btrfs_recover_log_trees() failed; 2) But if the call to btrfs_recover_log_trees() did not fail, we don't drop the reference in btrfs_replay_log() - we expect that btrfs_recover_log_trees() does it in case it returns success. Let's simplify this and make btrfs_replay_log() always drop the reference on the log root tree, not only this simplifies code as it's what makes sense since it's btrfs_replay_log() who grabbed the reference in the first place. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	btrfs: scrub: replace max_t()/min_t() with clamp() in scrub_throttle_dev_io()	Thorsten Blum	1	-2/+1
	[ Upstream commit a7f3dfb8293c4cee99743132d69863a92e8f4875 ] Replace max_t() followed by min_t() with a single clamp(). As was pointed by David Laight in https://lore.kernel.org/linux-btrfs/20250906122458.75dfc8f0@pumpkin/ the calculation may overflow u32 when the input value is too large, so clamp_t() is not used. In practice the expected values are in range of megabytes to gigabytes (throughput limit) so the bug would not happen. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: David Sterba <dsterba@suse.com> [ Use clamp() and add explanation. ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	btrfs: zoned: refine extent allocator hint selection	Naohiro Aota	1	-2/+4
	[ Upstream commit 0d703963d297964451783e1a0688ebdf74cd6151 ] The hint block group selection in the extent allocator is wrong in the first place, as it can select the dedicated data relocation block group for the normal data allocation. Since we separated the normal data space_info and the data relocation space_info, we can easily identify a block group is for data relocation or not. Do not choose it for the normal data allocation. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-29	ksmbd: transport_ipc: validate payload size before reading handle	Qianchang Zhao	1	-1/+7
	commit 6f40e50ceb99fc8ef37e5c56e2ec1d162733fef0 upstream. handle_response() dereferences the payload as a 4-byte handle without verifying that the declared payload size is at least 4 bytes. A malformed or truncated message from ksmbd.mountd can lead to a 4-byte read past the declared payload size. Validate the size before dereferencing. This is a minimal fix to guard the initial handle read. Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Cc: stable@vger.kernel.org Reported-by: Qianchang Zhao <pioooooooooip@gmail.com> Signed-off-by: Qianchang Zhao <pioooooooooip@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	ksmbd: browse interfaces list on FSCTL_QUERY_INTERFACE_INFO IOCTL	Namjae Jeon	6	-38/+41
	[ Upstream commit b2d99376c5d61eb60ffdb6c503e4b6c8f9712ddd ] ksmbd.mount will give each interfaces list and bind_interfaces_only flags to ksmbd server. Previously, the interfaces list was sent only when bind_interfaces_only was enabled. ksmbd server browse only interfaces list given from ksmbd.conf on FSCTL_QUERY_INTERFACE_INFO IOCTL. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	xfs: always warn about deprecated mount options	Darrick J. Wong	1	-12/+21
	[ Upstream commit 630785bfbe12c3ee3ebccd8b530a98d632b7e39d ] The deprecation of the 'attr2' mount option in 6.18 wasn't entirely successful because nobody noticed that the kernel never printed a warning about attr2 being set in fstab if the only xfs filesystem is the root fs; the initramfs mounts the root fs with no mount options; and the init scripts only conveyed the fstab options by remounting the root fs. Fix this by making it complain all the time. Cc: stable@vger.kernel.org # v5.13 Fixes: 92cf7d36384b99 ("xfs: Skip repetitive warnings about mount options") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org> [ Update existing xfs_fs_warn_deprecated() callers ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	f2fs: fix wrong block mapping for multi-devices	Jaegeuk Kim	1	-1/+1
	[ Upstream commit 9d5c4f5c7a2c7677e1b3942772122b032c265aae ] Assuming the disk layout as below, disk0: 0 --- 0x00035abfff disk1: 0x00035ac000 --- 0x00037abfff disk2: 0x00037ac000 --- 0x00037ebfff and we want to read data from offset=13568 having len=128 across the block devices, we can illustrate the block addresses like below. 0 .. 0x00037ac000 ------------------- 0x00037ebfff, 0x00037ec000 ------- \| ^ ^ ^ \| fofs 0 13568 13568+128 \| ------------------------------------------------------ \| LBA 0x37e8aa9 0x37ebfa9 0x37ec029 --- map 0x3caa9 0x3ffa9 In this example, we should give the relative map of the target block device ranging from 0x3caa9 to 0x3ffa9 where the length should be calculated by 0x37ebfff + 1 - 0x37ebfa9. In the below equation, however, map->m_pblk was supposed to be the original address instead of the one from the target block address. - map->m_len = min(map->m_len, dev->end_blk + 1 - map->m_pblk); Cc: stable@vger.kernel.org Fixes: 71f2c8206202 ("f2fs: multidevice: support direct IO") Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	f2fs: factor a f2fs_map_blocks_cached helper	Christoph Hellwig	1	-27/+38
	[ Upstream commit 0094e98bd1477a6b7d97c25b47b19a7317c35279 ] Add a helper to deal with everything needed to return a f2fs_map_blocks structure based on a lookup in the extent cache. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Stable-dep-of: 9d5c4f5c7a2c ("f2fs: fix wrong block mapping for multi-devices") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>