path: root/fs
Age | Commit message | Author | Files | Lines
2025-04-01 | cifs: Add a new xattr system.smb3_ntsd_sacl for getting or setting SACLs | Pali Rohár | 1 | -0/+18
Access to the SACL part of an SMB security descriptor is granted by the SACL privilege, which by default is available only to the local administrator, but it can be granted to any other user by local GPO or AD. SACL access is not granted by DACL permissions, so it is possible for a user to have no access to the DACLs of some file while having access to the SACLs of all files. Accessing SACLs (either getting or setting) therefore sometimes requires not touching or asking for DACLs.

Currently the Linux SMB client does not allow getting or setting SACLs without touching DACLs, which means that a user without DACL access cannot get or set SACLs even if that user has access to SACLs.

Fix this problem by introducing a new xattr "system.smb3_ntsd_sacl" for accessing only the SACL part of the security descriptor (therefore without DACLs and OWNER/GROUP).

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01 | bcachefs: fix ref leak in btree_node_read_all_replicas | Kent Overstreet | 1 | -0/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-01 | ksmbd: validate zero num_subauth before sub_auth is accessed | Norbert Szetei | 1 | -0/+5
Accessing psid->sub_auth[psid->num_subauth - 1] without checking that num_subauth is non-zero leads to an out-of-bounds read. This patch adds a validation step to ensure num_subauth != 0 before sub_auth is accessed.

Cc: stable@vger.kernel.org
Signed-off-by: Norbert Szetei <norbert@doyensec.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
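The check described above can be sketched in userspace C. This is only an illustration: `struct sid_sketch`, `SKETCH_MAX_SUBAUTH`, and `sid_last_subauth()` are hypothetical names invented here, not the ksmbd definitions, and the upper bound of 15 is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_SUBAUTH 15 /* assumed cap, for this sketch only */

/* Hypothetical, simplified stand-in for an SMB SID structure. */
struct sid_sketch {
    uint8_t  num_subauth;
    uint32_t sub_auth[SKETCH_MAX_SUBAUTH];
};

/*
 * Store the last sub-authority in *out.
 * Returns 0 on success, -1 if num_subauth is zero or too large,
 * so that sub_auth[num_subauth - 1] is never read out of bounds.
 */
static int sid_last_subauth(const struct sid_sketch *sid, uint32_t *out)
{
    if (sid->num_subauth == 0 || sid->num_subauth > SKETCH_MAX_SUBAUTH)
        return -1;

    *out = sid->sub_auth[sid->num_subauth - 1];
    return 0;
}
```

Without the zero check, `num_subauth - 1` would evaluate to -1 and the read would land before the start of the array.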
2025-04-01 | ksmbd: fix overflow in dacloffset bounds check | Norbert Szetei | 1 | -4/+12
The dacloffset field was originally typed as int and used in an unchecked addition, which could overflow and bypass the existing bounds check in both smb_check_perm_dacl() and smb_inherit_dacl(). This could result in out-of-bounds memory access and a kernel crash when dereferencing the DACL pointer.

This patch converts dacloffset to unsigned int and uses check_add_overflow() to validate access to the DACL.

Cc: stable@vger.kernel.org
Signed-off-by: Norbert Szetei <norbert@doyensec.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
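A rough userspace analogue of an overflow-checked bounds test is shown below. It uses the GCC/Clang builtin `__builtin_add_overflow()` in place of the kernel's check_add_overflow(); the function name `dacl_header_in_bounds()` and its parameters are assumptions made for this sketch, not the ksmbd code.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Return true only if hdr_size bytes starting at dacl_offset lie
 * inside a buffer of buf_len bytes. The addition is checked, so a
 * huge offset cannot wrap around and slip past the comparison.
 */
static bool dacl_header_in_bounds(unsigned int dacl_offset,
                                  unsigned int hdr_size,
                                  unsigned int buf_len)
{
    unsigned int end;

    /* Builtin returns true when the sum does not fit in `end`. */
    if (__builtin_add_overflow(dacl_offset, hdr_size, &end))
        return false;

    return end <= buf_len;
}
```

With a plain `dacl_offset + hdr_size > buf_len` test, an offset near `UINT_MAX` would wrap to a small sum and pass the check.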
2025-04-01 | ksmbd: fix session use-after-free in multichannel connection | Namjae Jeon | 3 | -11/+14
There is a race condition between session setup and ksmbd_sessions_deregister. The session can be freed before the connection is added to the channel list of the session. This patch checks the reference count of the session before freeing it.

Cc: stable@vger.kernel.org
Reported-by: Sean Heelan <seanheelan@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-03-31 | smb: client: Update IO sizes after reconnection | Wang Zhaolong | 1 | -2/+22
When an SMB connection is reset and reconnected, the negotiated IO parameters (rsize/wsize) can become out of sync with the server's current capabilities. This can lead to suboptimal performance or even IO failures if the server's limits have changed.

This patch implements automatic IO size renegotiation:
1. Adds cifs_renegotiate_iosize() function to update all superblocks associated with a tree connection
2. Updates each mount's rsize/wsize based on current server capabilities
3. Calls this function after successful tree connection reconnection

With this change, all mount points will automatically maintain optimal and reliable IO parameters after network disruptions, using the bidirectional mapping added in previous patches.

This completes the series improving connection resilience by keeping mount parameters synchronized with server capabilities.

Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-03-31 | smb: client: Store original IO parameters and prevent zero IO sizes | Wang Zhaolong | 5 | -11/+30
During mount option processing and negotiation with the server, the original user-specified rsize/wsize values were being modified directly. This makes it impossible to recover these values after a connection reset, leading to potentially degraded performance after reconnection.

The other problem is that when negotiating read and write sizes, there are cases where the negotiated values might calculate to zero, especially during reconnection when server->max_read or server->max_write might be reset. In general, these values come from the negotiation response. According to the MS-SMB2 specification, these values should be at least 65536 bytes.

This patch improves IO parameter handling:
1. Adds vol_rsize and vol_wsize fields to store the original user-specified values separately from the negotiated values
2. Uses got_rsize/got_wsize flags to determine if values were user-specified rather than checking for non-zero values, which is more reliable
3. Adds a prevent_zero_iosize() helper function to ensure IO sizes are never negotiated down to zero, which could happen in edge cases like when server->max_read/write is zero

The changes make the CIFS client more resilient to unusual server responses and reconnection scenarios, preventing potential failures when IO sizes are calculated to be zero.

Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
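The zero-size guard in item 3 might look roughly like the following. This is a minimal sketch under stated assumptions: `negotiate_iosize_sketch()` and `SKETCH_MIN_IOSIZE` are invented here, and the 65536-byte fallback is taken from the MS-SMB2 minimum quoted in the commit message rather than from the actual client code.

```c
/* Assumed fallback: the 65536-byte minimum cited in the message above. */
#define SKETCH_MIN_IOSIZE 65536u

/*
 * Pick the IO size for a mount: the smaller of what the user asked
 * for and what the server advertises, but never zero. A server_max
 * of 0 models the "max_read/max_write was reset" edge case.
 */
static unsigned int negotiate_iosize_sketch(unsigned int requested,
                                            unsigned int server_max)
{
    unsigned int size = requested < server_max ? requested : server_max;

    /* Guard: fall back to the minimum instead of returning 0. */
    return size ? size : SKETCH_MIN_IOSIZE;
}
```

Keeping `requested` separate from the returned value mirrors the idea behind vol_rsize/vol_wsize: the user's original request survives, so the negotiation can be rerun after a reconnect.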
2025-03-31 | smb:client: smb: client: Add reverse mapping from tcon to superblocks | Wang Zhaolong | 4 | -1/+20
Currently, when an SMB connection is reset and renegotiated with the server, there's no way to update all related mount points with new negotiated sizes. This is because while superblocks (cifs_sb_info) maintain references to tree connections (tcon) through tcon_link structures, there is no reverse mapping from a tcon back to all the superblocks using it.

This patch adds a bidirectional relationship between tcon and cifs_sb_info structures by:
1. Adding a cifs_sb_list to tcon structure with appropriate locking
2. Adding tcon_sb_link to cifs_sb_info to join the list
3. Managing the list entries during mount and umount operations

The bidirectional relationship enables future functionality to locate and update all superblocks connected to a specific tree connection, such as:
- Updating negotiated parameters after reconnection
- Efficiently notifying all affected mounts of capability changes

This is the first part of a series to improve connection resilience by keeping all mount parameters in sync with server capabilities after reconnection.

Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-03-31 | cifs: remove unreachable code in cifs_get_tcp_session() | Roman Smirnov | 1 | -5/+1
echo_interval is now checked at mount time, so this code has become unreachable.

Signed-off-by: Roman Smirnov <r.smirnov@omp.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-03-31 | cifs: fix integer overflow in match_server() | Roman Smirnov | 1 | -0/+5
The echo_interval is not limited in any way during mounting, which makes it possible to write a large number to it. This can cause an overflow when multiplying ctx->echo_interval by HZ in match_server().

Add constraints for echo_interval to smb3_fs_context_parse_param().

Found by Linux Verification Center (linuxtesting.org) with Svace.

Fixes: adfeb3e00e8e1 ("cifs: Make echo interval tunable")
Cc: stable@vger.kernel.org
Signed-off-by: Roman Smirnov <r.smirnov@omp.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
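Validating the value at parse time, before it is ever multiplied, could be sketched as follows. The bounds (1 to 600 seconds), the tick rate, and all `SKETCH_`/`parse_echo_interval_sketch` names are assumptions for illustration, not values confirmed by the commit message.

```c
#include <limits.h>

#define SKETCH_HZ                1000u /* assumed tick rate */
#define SKETCH_ECHO_INTERVAL_MIN 1u    /* assumed lower bound, seconds */
#define SKETCH_ECHO_INTERVAL_MAX 600u  /* assumed upper bound, seconds */

/*
 * Reject out-of-range intervals when the option is parsed, so the
 * later multiplication by the tick rate cannot overflow.
 * Returns 0 and stores the tick count on success, -1 otherwise.
 */
static int parse_echo_interval_sketch(unsigned int secs, unsigned long *ticks)
{
    if (secs < SKETCH_ECHO_INTERVAL_MIN || secs > SKETCH_ECHO_INTERVAL_MAX)
        return -1;

    /* At most 600 * 1000, which fits comfortably in unsigned long. */
    *ticks = (unsigned long)secs * SKETCH_HZ;
    return 0;
}
```

Once the range is enforced in one place, every later consumer of the value can multiply without its own overflow check, which is consistent with the follow-up cleanup in cifs_get_tcp_session() listed above.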
2025-03-31 | Merge tag 'bcachefs-2025-03-31' of git://evilpiepirate.org/bcachefs | Linus Torvalds | 57 | -506/+856
Pull more bcachefs updates from Kent Overstreet:
 "All bugfixes and logging improvements"

* tag 'bcachefs-2025-03-31' of git://evilpiepirate.org/bcachefs: (35 commits)
  bcachefs: fix bch2_write_point_to_text() units
  bcachefs: Log original key being moved in data updates
  bcachefs: BCH_JSET_ENTRY_log_bkey
  bcachefs: Reorder error messages that include journal debug
  bcachefs: Don't use designated initializers for disk_accounting_pos
  bcachefs: Silence errors after emergency shutdown
  bcachefs: fix units in rebalance_status
  bcachefs: bch2_ioctl_subvolume_destroy() fixes
  bcachefs: Clear fs_path_parent on subvolume unlink
  bcachefs: Change btree_insert_node() assertion to error
  bcachefs: Better printing of inconsistency errors
  bcachefs: bch2_count_fsck_err()
  bcachefs: Better helpers for inconsistency errors
  bcachefs: Consistent indentation of multiline fsck errors
  bcachefs: Add an "ignore unknown" option to bch2_parse_mount_opts()
  bcachefs: bch2_time_stats_init_no_pcpu()
  bcachefs: Fix bch2_fs_get_tree() error path
  bcachefs: fix logging in journal_entry_err_msg()
  bcachefs: add missing newline in bch2_trans_updates_to_text()
  bcachefs: print_string_as_lines: fix extra newline
  ...
2025-03-31 | Merge tag 'fs_for_v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs | Linus Torvalds | 4 | -263/+337
Pull ext2, udf, and isofs updates from Jan Kara:
 - conversion of ext2 to the new mount API
 - small folio conversion work for ext2
 - a fix of an unexpected return value in udf in inode_getblk()
 - a fix of handling of corrupted directory in isofs

* tag 'fs_for_v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix inode_getblk() return value
  ext2: Make ext2_params_spec static
  ext2: create ext2_msg_fc for use during parsing
  ext2: convert to the new mount API
  ext2: Remove reference to bh->b_page
  isofs: fix KMSAN uninit-value bug in do_isofs_readdir()
2025-03-31 | Merge tag 'exfat-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat | Linus Torvalds | 6 | -88/+140
Pull exfat updates from Namjae Jeon:
 - Fix random stack corruption and incorrect error returns in exfat_get_block()
 - Optimize exfat_get_block() by improving checking corner cases
 - Fix an endless loop by self-linked chain in exfat_find_last_cluster
 - Remove dead EXFAT_CLUSTERS_UNTRACKED codes
 - Add missing shutdown check
 - Improve the delete performance with discard mount option

* tag 'exfat-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: call bh_read in get_block only when necessary
  exfat: fix potential wrong error return from get_block
  exfat: fix missing shutdown check
  exfat: fix the infinite loop in exfat_find_last_cluster()
  exfat: fix random stack corruption after get_block
  exfat: remove count used cluster from exfat_statfs()
  exfat: support batch discard of clusters when freeing clusters
2025-03-31 | Merge tag 'v6.15rc-part1-ksmbd-server-fixes' of git://git.samba.org/ksmbd | Linus Torvalds | 7 | -43/+68
Pull smb server updates from Steve French:
 - Two fixes for bounds checks of open contexts
 - Two multichannel fixes, including one for important UAF
 - Oplock/lease break fix for potential ksmbd connection refcount leak
 - Security fix to free crypto data more securely
 - Fix to enable allowing Kerberos authentication by default
 - Two RDMA/smbdirect fixes
 - Minor cleanup

* tag 'v6.15rc-part1-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: fix r_count dec/increment mismatch
  ksmbd: fix multichannel connection failure
  ksmbd: fix use-after-free in ksmbd_sessions_deregister()
  ksmbd: use ib_device_get_netdev() instead of calling ops.get_netdev
  ksmbd: use aead_request_free to match aead_request_alloc
  Revert "ksmbd: fix missing RDMA-capable flag for IPoIB device in ksmbd_rdma_capable_netdev()"
  ksmbd: add bounds check for create lease context
  ksmbd: add bounds check for durable handle context
  ksmbd: make SMB_SERVER_KERBEROS5 enable by default
  ksmbd: Use str_read_write() and str_true_false() helpers
2025-03-31 | Merge tag '6.15-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 | Linus Torvalds | 18 | -44/+288
Pull smb client updates from Steve French:
 - Fix for network namespace refcount leak
 - Multichannel fix and minor multichannel debug message cleanup
 - Fix potential null ptr reference in SMB3 close
 - Fix for special file handling when reparse points not supported by server
 - Two ACL fixes one for stricter ACE validation, one for incorrect perms requested
 - Three RFC1001 fixes: one for SMB3 mounts on port 139, one for better default hostname, and one for better session response processing
 - Minor update to email address for MAINTAINERS file
 - Allow disabling Unicode for access to old SMB1 servers
 - Three minor cleanups

* tag '6.15-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Add new mount option -o nounicode to disable SMB1 UNICODE mode
  cifs: Set default Netbios RFC1001 server name to hostname in UNC
  smb: client: Fix netns refcount imbalance causing leaks and use-after-free
  cifs: add validation check for the fields in smb_aces
  CIFS: Propagate min offload along with other parameters from primary to secondary channels.
  cifs: Improve establishing SMB connection with NetBIOS session
  cifs: Fix establishing NetBIOS session for SMB2+ connection
  cifs: Fix getting DACL-only xattr system.cifs_acl and system.smb3_acl
  cifs: Check if server supports reparse points before using them
  MAINTAINERS: reorder preferred email for Steve French
  cifs: avoid NULL pointer dereference in dbg call
  smb: client: Remove redundant check in smb2_is_path_accessible()
  smb: client: Remove redundant check in cifs_oplock_break()
  smb: mark the new channel addition log as informational log with cifs_info
  smb: minor cleanup to remove unused function declaration
2025-03-31 | Merge tag 'nfsd-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux | Linus Torvalds | 20 | -244/+575
Pull nfsd updates from Chuck Lever:
 "Neil Brown contributed more scalability improvements to NFSD's open file cache, and Jeff Layton contributed a menagerie of repairs to NFSD's NFSv4 callback / backchannel implementation.

  Mike Snitzer contributed a change to NFS re-export support that disables support for file locking on a re-exported NFSv4 mount. This is because NFSv4 state recovery is currently difficult if not impossible for re-exported NFS mounts. The change aims to prevent data integrity exposures after the re-export server crashes.

  Work continues on the evolving NFSD netlink administrative API.

  Many thanks to the contributors, reviewers, testers, and bug reporters who participated during the v6.15 development cycle"

* tag 'nfsd-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (45 commits)
  NFSD: Add a Kconfig setting to enable delegated timestamps
  sysctl: Fixes nsm_local_state bounds
  nfsd: use a long for the count in nfsd4_state_shrinker_count()
  nfsd: remove obsolete comment from nfs4_alloc_stid
  nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
  nfsd: reorganize struct nfs4_delegation for better packing
  nfsd: handle errors from rpc_call_async()
  nfsd: move cb_need_restart flag into cb_flags
  nfsd: replace CB_GETATTR_BUSY with NFSD4_CALLBACK_RUNNING
  nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY
  nfsd: prevent callback tasks running concurrently
  nfsd: disallow file locking and delegations for NFSv4 reexport
  nfsd: filecache: drop the list_lru lock during lock gc scans
  nfsd: filecache: don't repeatedly add/remove files on the lru list
  nfsd: filecache: introduce NFSD_FILE_RECENT
  nfsd: filecache: use list_lru_walk_node() in nfsd_file_gc()
  nfsd: filecache: use nfsd_file_dispose_list() in nfsd_file_close_inode_sync()
  NFSD: Re-organize nfsd_file_gc_worker()
  nfsd: filecache: remove race handling.
  fs: nfs: acl: Avoid -Wflex-array-member-not-at-end warning
  ...
2025-03-31 | bcachefs: Fix null ptr deref in bch2_write_endio() | Kent Overstreet | 1 | -6/+13
This was previously hard to hit since it requires racing with device removal, but splitting up io_ref uncovered it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-31 | bcachefs: Fix field spanning write warning | Kent Overstreet | 2 | -2/+3
Struct with embedded VLA...

    memcpy: detected field-spanning write (size 8) of single field "&gc->r.e" at fs/bcachefs/ec.c:465 (size 3)
    WARNING: CPU: 1 PID: 936 at fs/bcachefs/ec.c:465 bch2_trigger_stripe+0x706/0x730
    Modules linked in:
    CPU: 1 UID: 0 PID: 936 Comm: mount.bcachefs Not tainted 6.14.0-rc6-ktest-00236-gefb0b5c62dbc #55
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    RIP: 0010:bch2_trigger_stripe+0x706/0x730
    Code: b4 00 01 b9 03 00 00 00 48 89 fb 48 c7 c7 33 54 da 81 48 89 d6 49 89 d6 48 c7 c2 c3 36 db 81 e8 60 54 c5 ff 48 89 df 4c 89 f2 <0f> 0b e9 5c fd ff ff e8 fe 5e 4e 00 bf 10 00 00 00 48 c7 c6 ff ff
    RSP: 0018:ffff88817081f680 EFLAGS: 00010246
    RAX: f8fe7dd1c56b5600 RBX: ffff888101265368 RCX: 0000000000000027
    RDX: 0000000000000008 RSI: 00000000fffbffff RDI: ffff888101265368
    RBP: 0000000000000000 R08: 000000000003ffff R09: ffff88817f1fe000
    R10: 00000000000bfffd R11: 0000000000000004 R12: ffff8881012652c0
    R13: 0000000000000000 R14: 0000000000000008 R15: ffff88817081f6c9
    FS:  00007fc428bc7c80(0000) GS:ffff888179280000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007ffd3ee4a038 CR3: 000000010a9bc000 CR4: 0000000000750eb0
    PKRU: 55555554
    Call Trace:
     <TASK>
     ? __warn+0xce/0x1b0
     ? bch2_trigger_stripe+0x706/0x730
     ? report_bug+0x11b/0x1a0
     ? bch2_trigger_stripe+0x706/0x730
     ? handle_bug+0x5e/0x90
     ? exc_invalid_op+0x1a/0x50
     ? asm_exc_invalid_op+0x1a/0x20
     ? bch2_trigger_stripe+0x706/0x730
     bch2_gc_mark_key+0x2cf/0x430
     bch2_check_allocations+0x1a64/0x1ed0
     ? vsnprintf+0x1ad/0x420
     ? bch2_check_allocations+0x191f/0x1ed0
     bch2_run_recovery_passes+0x13b/0x2b0
     bch2_fs_recovery+0x9b7/0x1290
     ? __bch2_print+0xb2/0xf0
     ? bch2_printbuf_exit+0x1e/0x30
     ? print_mount_opts+0x153/0x180
     bch2_fs_start+0x274/0x3b0
     bch2_fs_get_tree+0x516/0x6e0
     vfs_get_tree+0x21/0xa0
     do_new_mount+0x153/0x350
     __x64_sys_mount+0x16c/0x1f0
     do_syscall_64+0x6c/0x140
     ? arch_exit_to_user_mode_prepare+0x9/0x40
     entry_SYSCALL_64_after_hwframe+0x4b/0x53

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-31 | bcachefs: Fix striping behaviour | Kent Overstreet | 1 | -12/+48
For striping across devices, we maintain "clocks", and we advance them by the inverse of "how much free space this device has left", so that we round robin biased in favor of devices with more free space.

This code was originally trying to do EWMA-ish stuff when it was written, ~10 years ago, and was never properly cleaned up when it was realized that an EWMA is not the right approach here.

That left a bug, when we rescale to keep all the clocks in the correct range and prevent overflow. It was assumed that we'd always be allocating from the device with the smallest clock hand, but that's actually not correct: with the target options, allocations will be first tried from a subset of devices, and then the entire filesystem if that fails.

Thus, the rescale from the first allocation - allocating from a subset of devices - can pick the wrong rescale value and cause the rest of the clocks to go to 0, losing information. This results in incorrect striping behaviour when the desired number of replicas doesn't fit on the foreground target.

Link: https://www.reddit.com/r/bcachefs/comments/1jn3t26/replica_allocation_not_evenly_distributed_among/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
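The failure mode described above can be illustrated with a toy model. This is not the bcachefs allocator: the saturating rescale, the bitmask of candidate devices, and both function names are assumptions chosen only to show how a rescale value taken from a subset of devices can flatten the clocks of the devices outside it.

```c
#include <stddef.h>
#include <stdint.h>

/* Subtract `amount` from every clock, saturating at 0. */
static void rescale_clocks_sketch(uint64_t *clocks, size_t n, uint64_t amount)
{
    for (size_t i = 0; i < n; i++)
        clocks[i] = clocks[i] > amount ? clocks[i] - amount : 0;
}

/* Smallest clock among the devices whose bit is set in `mask`. */
static uint64_t min_clock_sketch(const uint64_t *clocks, size_t n,
                                 unsigned int mask)
{
    uint64_t min = UINT64_MAX;

    for (size_t i = 0; i < n; i++)
        if ((mask & (1u << i)) && clocks[i] < min)
            min = clocks[i];
    return min;
}
```

With clocks `{10, 50, 60}`, rescaling by the minimum of the subset `{1, 2}` (50) yields `{0, 0, 10}`, so the 40-tick gap between devices 0 and 1 is lost. Rescaling by the minimum over all three devices (10) yields `{0, 40, 50}` and preserves every relative distance.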
2025-03-31 | fuse: remove unneeded atomic set in uring creation | Joanne Koong | 1 | -1/+0
When the ring is allocated, it is kzalloc-ed. ring->queue_refs will already be initialized to 0 by default. It does not need to be atomically set to 0.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-03-31 | fuse: fix uring race condition for null dereference of fc | Joanne Koong | 1 | -2/+2
There is a race condition leading to a kernel crash from a null dereference when attempting to access fc->lock in fuse_uring_create_queue(). fc may be NULL in the case where another thread is creating the uring in fuse_uring_create() and has set fc->ring but has not yet set ring->fc when fuse_uring_create_queue() reads ring->fc. There is another race condition as well where in fuse_uring_register(), ring->nr_queues may still be 0 and not yet set to the new value when we compare qid against it.

This fix sets fc->ring only after ring->fc and ring->nr_queues have been set, which guarantees now that ring->fc is a proper pointer when any queues are created and ring->nr_queues reflects the right number of queues if ring is not NULL. We must use smp_store_release() and smp_load_acquire() semantics to ensure the ordering will remain correct where fc->ring is assigned only after ring->fc and ring->nr_queues have been assigned.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Fixes: 24fe962c86f5 ("fuse: {io-uring} Handle SQEs - register commands")
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
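The publish-after-initialize ordering can be sketched with C11 atomics, which offer release/acquire semantics comparable to smp_store_release() and smp_load_acquire(). The struct and function names below are hypothetical simplifications for this sketch, not the fuse data structures.

```c
#include <stdatomic.h>
#include <stddef.h>

struct conn_sketch;

/* Hypothetical, reduced stand-ins for the ring and the connection. */
struct ring_sketch {
    struct conn_sketch *fc;
    size_t nr_queues;
};

struct conn_sketch {
    _Atomic(struct ring_sketch *) ring;
};

/* Writer: fill in every field first, then publish the pointer. */
static void publish_ring_sketch(struct conn_sketch *fc,
                                struct ring_sketch *ring, size_t nr_queues)
{
    ring->fc = fc;
    ring->nr_queues = nr_queues;
    /* Release: the two stores above become visible before the pointer. */
    atomic_store_explicit(&fc->ring, ring, memory_order_release);
}

/* Reader: a non-NULL result implies fc and nr_queues are initialized. */
static struct ring_sketch *get_ring_sketch(struct conn_sketch *fc)
{
    return atomic_load_explicit(&fc->ring, memory_order_acquire);
}
```

The design point is that the pointer itself serves as the "ready" flag: a reader either sees NULL and backs off, or sees a fully initialized ring, with no intermediate state.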
2025-03-31 | fuse: Increase FUSE_NAME_MAX to PATH_MAX | Bernd Schubert | 4 | -5/+20
Our file system has a translation capability for S3-to-POSIX. The current value of 1 KiB is enough to cover S3 keys, but does not allow encoding of %xx escape characters. The limit is increased to (PATH_MAX - 1), as we need 3 x 1024 and that is close to PATH_MAX (4 KiB) already. -1 is used as the terminating null is not included in the length calculation.

Testing large file names was hard with libfuse/example file systems, so I created a new memfs that does not have the 255-character file name length limitation.
https://github.com/libfuse/libfuse/pull/1077

The connection is initialized with FUSE_NAME_LOW_MAX, which is set to the previous value of FUSE_NAME_MAX of 1024. With FUSE_MIN_READ_BUFFER of 8192 that is enough for two file names + fuse headers. When the FUSE_INIT reply sets max_pages to a value > 1, we know that the fuse daemon supports request buffers of at least 2 pages (+ header) and can therefore hold 2 x PATH_MAX file names - operations like rename or link that need two file names are no issue then.

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-03-31 | fuse: Allocate only namelen buf memory in fuse_notify_ | Bernd Schubert | 1 | -12/+14
fuse_notify_inval_entry and fuse_notify_delete were using fixed allocations of FUSE_NAME_MAX to hold the file name. Buffers that large are often not needed, as file names might be smaller, so this uses the actual file name size to do the allocation.

Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-03-31 | fuse: add default_request_timeout and max_request_timeout sysctls | Joanne Koong | 4 | -5/+65
Introduce two new sysctls, "default_request_timeout" and "max_request_timeout". These control how long (in seconds) a server can take to reply to a request. If the server does not reply by the timeout, then the connection will be aborted. The upper bound on these sysctl values is 65535.

"default_request_timeout" sets the default timeout if no timeout is specified by the fuse server on mount. 0 (default) indicates no default timeout should be enforced. If the server did specify a timeout, then default_request_timeout will be ignored.

"max_request_timeout" sets the max amount of time the server may take to reply to a request. 0 (default) indicates no maximum timeout. If max_request_timeout is set and the fuse server attempts to set a timeout greater than max_request_timeout, the system will use max_request_timeout as the timeout. Similarly, if default_request_timeout is greater than max_request_timeout, the system will use max_request_timeout as the timeout. If the server does not request a timeout and default_request_timeout is set to 0 but max_request_timeout is set, then the timeout will be max_request_timeout.

Please note that these timeouts are not 100% precise. The request may take roughly an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the set max timeout due to how it's internally implemented.

    $ sysctl -a | grep fuse.default_request_timeout
    fs.fuse.default_request_timeout = 0

    $ echo 65536 | sudo tee /proc/sys/fs/fuse/default_request_timeout
    tee: /proc/sys/fs/fuse/default_request_timeout: Invalid argument

    $ echo 65535 | sudo tee /proc/sys/fs/fuse/default_request_timeout
    65535

    $ sysctl -a | grep fuse.default_request_timeout
    fs.fuse.default_request_timeout = 65535

    $ echo 0 | sudo tee /proc/sys/fs/fuse/default_request_timeout
    0

    $ sysctl -a | grep fuse.default_request_timeout
    fs.fuse.default_request_timeout = 0

[Luis Henriques: Limit the timeout to the range [FUSE_TIMEOUT_TIMER_FREQ, fuse_max_req_timeout]]

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Luis Henriques <luis@igalia.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
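The interaction of the three values described in this commit message can be written down as a small pure function. This is a sketch of the rules as stated in the message, not the kernel implementation: `effective_timeout_sketch()` is a made-up name, and the lower clamp to FUSE_TIMEOUT_TIMER_FREQ mentioned in the bracketed note is deliberately left out because its value is not given here.

```c
/*
 * All values are in seconds; 0 means "not set".
 *   requested:   timeout asked for by the fuse server at mount
 *   def_timeout: fs.fuse.default_request_timeout
 *   max_timeout: fs.fuse.max_request_timeout
 * Returns the timeout to enforce, or 0 for no timeout.
 */
static unsigned int effective_timeout_sketch(unsigned int requested,
                                             unsigned int def_timeout,
                                             unsigned int max_timeout)
{
    /* The default only applies when the server asked for nothing. */
    unsigned int t = requested ? requested : def_timeout;

    /* A set maximum caps larger values and replaces "no timeout". */
    if (max_timeout && (t == 0 || t > max_timeout))
        t = max_timeout;

    return t;
}
```

For example, a server requesting 100 seconds under a 60-second maximum ends up with 60, and a server requesting nothing with no default but a 60-second maximum also ends up with 60.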
2025-03-31 | fuse: add kernel-enforced timeout option for requests | Joanne Koong | 6 | -1/+170
There are situations where fuse servers can become unresponsive or stuck, for example if the server is deadlocked. Currently, there's no good way to detect if a server is stuck and needs to be killed manually.

This commit adds an option for enforcing a timeout (in seconds) for requests where if the timeout elapses without the server responding to the request, the connection will be automatically aborted.

Please note that these timeouts are not 100% precise. For example, the request may take roughly an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the requested timeout due to internal implementation, in order to mitigate overhead.

[SzM: Bump the API version number]

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-03-31 | fuse: optmize missing FUSE_LINK support | Miklos Szeredi | 2 | -1/+11
If the filesystem doesn't support FUSE_LINK (i.e. returns -ENOSYS), then remember this and next time return immediately, without incurring the overhead of a round trip to the server.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
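A minimal sketch of this "remember ENOSYS" pattern follows, combined with the EPERM mapping from the "fuse: Return EPERM rather than ENOSYS from link()" entry in this log. Everything here is hypothetical scaffolding: `struct link_conn_sketch`, `do_link_sketch()`, and the fake server callback are invented for illustration and are not the fuse code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-connection state for this sketch. */
struct link_conn_sketch {
    bool no_link;     /* set once the server has answered -ENOSYS */
    int  round_trips; /* counts requests actually sent to the server */
};

/* Fake server that does not implement the link operation. */
static int server_without_link_sketch(void)
{
    return -ENOSYS;
}

/* Send a link request unless the server is known not to support it. */
static int do_link_sketch(struct link_conn_sketch *fc, int (*send)(void))
{
    int err;

    if (fc->no_link)
        return -EPERM; /* cached answer, no round trip */

    fc->round_trips++;
    err = send();
    if (err == -ENOSYS) {
        fc->no_link = true; /* remember for next time */
        err = -EPERM;       /* what link() is documented to return */
    }
    return err;
}
```

Calling `do_link_sketch()` twice against the fake server returns `-EPERM` both times but contacts the server only once.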
2025-03-31 | fuse: Return EPERM rather than ENOSYS from link() | Matt Johnston | 1 | -0/+2
link() is documented to return EPERM when a filesystem doesn't support the operation; return that instead of ENOSYS.

Link: https://github.com/libfuse/libfuse/issues/925
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-03-31 | fuse: removed unused function fuse_uring_create() from header | Luis Henriques | 1 | -6/+0
Function fuse_uring_create() is used only from dev_uring.c and does not need to be exposed in the header file. Furthermore, it has the wrong signature.

While there, also remove the 'struct fuse_ring' forward declaration.

Signed-off-by: Luis Henriques <luis@igalia.com>
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-03-31 | fuse: {io-uring} Fix a possible req cancellation race | Bernd Schubert | 5 | -13/+46
task-A (application) might be in request_wait_answer and try to remove the request when it has FR_PENDING set.

task-B (a fuse-server io-uring task) might handle this request with FUSE_IO_URING_CMD_COMMIT_AND_FETCH, when fetching the next request, and access the req from the pending list in fuse_uring_ent_assign_req(). That code path was not protected by fiq->lock and so might race with task-A.

For scaling reasons we'd better not use fiq->lock, but add a handler to remove canceled requests from the queue.

This also removes usage of fiq->lock from fuse_uring_add_req_to_ring_ent() altogether, as it was there just to protect against this race and was incomplete.

Also added is a comment on why FR_PENDING is not cleared.

Fixes: c090c8abae4b ("fuse: Add io-uring sqe commit and fetch support")
Cc: <stable@vger.kernel.org> # v6.14
Reported-by: Joanne Koong <joannelkoong@gmail.com>
Closes: https://lore.kernel.org/all/CAJnrk1ZgHNb78dz-yfNTpxmW7wtT88A=m-zF0ZoLXKLUHRjNTw@mail.gmail.com/
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2025-03-30 | bcachefs: fix bch2_write_point_to_text() units | Kent Overstreet | 1 | -1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-30 | bcachefs: Log original key being moved in data updates | Kent Overstreet | 3 | -1/+34
There's something going on with the data move path; log the original key being moved for debugging.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-30 | bcachefs: BCH_JSET_ENTRY_log_bkey | Kent Overstreet | 4 | -1/+34
Add a journal entry type for logging - but logging a bkey, not a string; to be used for data move path debugging.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-30 | bcachefs: Reorder error messages that include journal debug | Kent Overstreet | 2 | -7/+7
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-30 | bcachefs: Don't use designated initializers for disk_accounting_pos | Kent Overstreet | 7 | -46/+44
Not all compilers fully initialize these - they're not guaranteed to because of the union shenanigans.

Fixes: https://github.com/koverstreet/bcachefs/issues/844
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-30 | bcachefs: Silence errors after emergency shutdown | Kent Overstreet | 2 | -3/+7
We don't care about errors from asynchronous ops that occurred because we did an emergency shutdown; silence them.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-30 | bcachefs: fix units in rebalance_status | Kent Overstreet | 1 | -1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-30 | bcachefs: bch2_ioctl_subvolume_destroy() fixes | Kent Overstreet | 1 | -2/+4
bch2_evict_subvolume_inodes() was getting stuck - due to incorrectly pruning the dcache.

Also, fix missing permissions checks.

Reported-by: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-30 | Merge tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | Linus Torvalds | 1 | -11/+214
Pull bpf updates from Alexei Starovoitov:
 "For this merge window we're splitting BPF pull request into three for higher visibility: main changes, res_spin_lock, try_alloc_pages.

  These are the main BPF changes:
  - Add DFA-based live registers analysis to improve verification of programs with loops (Eduard Zingerman)
  - Introduce load_acquire and store_release BPF instructions and add x86, arm64 JIT support (Peilin Ye)
  - Fix loop detection logic in the verifier (Eduard Zingerman)
  - Drop unnecessary lock in bpf_map_inc_not_zero() (Eric Dumazet)
  - Add kfunc for populating cpumask bits (Emil Tsalapatis)
  - Convert various shell based tests to selftests/bpf/test_progs format (Bastien Curutchet)
  - Allow passing referenced kptrs into struct_ops callbacks (Amery Hung)
  - Add a flag to LSM bpf hook to facilitate bpf program signing (Blaise Boscaccy)
  - Track arena arguments in kfuncs (Ihor Solodrai)
  - Add copy_remote_vm_str() helper for reading strings from remote VM and bpf_copy_from_user_task_str() kfunc (Jordan Rome)
  - Add support for timed may_goto instruction (Kumar Kartikeya Dwivedi)
  - Allow bpf_get_netns_cookie() in cgroup_skb programs (Mahe Tardy)
  - Reduce bpf_cgrp_storage_busy false positives when accessing cgroup local storage (Martin KaFai Lau)
  - Introduce bpf_dynptr_copy() kfunc (Mykyta Yatsenko)
  - Allow retrieving BTF data with BTF token (Mykyta Yatsenko)
  - Add BPF kfuncs to set and get xattrs with 'security.bpf.' prefix (Song Liu)
  - Reject attaching programs to noreturn functions (Yafang Shao)
  - Introduce pre-order traversal of cgroup bpf programs (Yonghong Song)"

* tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (186 commits)
  selftests/bpf: Add selftests for load-acquire/store-release when register number is invalid
  bpf: Fix out-of-bounds read in check_atomic_load/store()
  libbpf: Add namespace for errstr making it libbpf_errstr
  bpf: Add struct_ops context information to struct bpf_prog_aux
  selftests/bpf: Sanitize pointer prior fclose()
  selftests/bpf: Migrate test_xdp_vlan.sh into test_progs
  selftests/bpf: test_xdp_vlan: Rename BPF sections
  bpf: clarify a misleading verifier error message
  selftests/bpf: Add selftest for attaching fexit to __noreturn functions
  bpf: Reject attaching fexit/fmod_ret to __noreturn functions
  bpf: Only fails the busy counter check in bpf_cgrp_storage_get if it creates storage
  bpf: Make perf_event_read_output accessible in all program types.
  bpftool: Using the right format specifiers
  bpftool: Add -Wformat-signedness flag to detect format errors
  selftests/bpf: Test freplace from user namespace
  libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID
  bpf: Return prog btf_id without capable check
  bpf: BPF token support for BPF_BTF_GET_FD_BY_ID
  bpf, x86: Fix objtool warning for timed may_goto
  bpf: Check map->record at the beginning of check_and_free_fields()
  ...
2025-03-29bcachefs: Clear fs_path_parent on subvolume unlinkKent Overstreet1-0/+1
This fixes recursive subvolume removal. Subvolume deletion is asynchronous; fs_path_parent, and thus the entry in the subvolume_children btree, need to be cleared when the subvolume is unlinked from the fs hierarchy - else we'll spuriously think a subvolume has children and deletion will fail. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-29Merge tag 'efi-next-for-v6.15' of ↵Linus Torvalds1-6/+4
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Decouple mixed mode startup code from the traditional x86 decompressor - Revert zero-length file hack in efivarfs - Prevent EFI zboot from using the CopyMem/SetMem boot services after ExitBootServices() - Update EFI zboot to use the ZLIB/ZSTD library interfaces directly * tag 'efi-next-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi/libstub: Avoid legacy decompressor zlib/zstd wrappers efi/libstub: Avoid CopyMem/SetMem EFI services after ExitBootServices efi: efibc: change kmalloc(size * count, ...) to kmalloc_array() efivarfs: Revert "allow creation of zero length files" x86/efi/mixed: Move mixed mode startup code into libstub x86/efi/mixed: Simplify and document thunking logic x86/efi/mixed: Remove dependency on legacy startup_32 code x86/efi/mixed: Set up 1:1 mapping of lower 4GiB in the stub x86/efi/mixed: Factor out and clean up long mode entry x86/efi/mixed: Check CPU compatibility without relying on verify_cpu() x86/efistub: Merge PE and handover entrypoints
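The kmalloc(size * count, ...) to kmalloc_array() conversion listed above replaces an open-coded multiplication with an allocation that checks for overflow first. A minimal userspace sketch of that idea follows; alloc_array_checked() is a hypothetical name, malloc() stands in for the kernel allocator, and __builtin_mul_overflow() is a GCC/Clang builtin rather than anything taken from the efibc patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of an overflow-checked array allocation.
 * Returns NULL instead of allocating a too-small buffer when
 * count * size does not fit in size_t.
 */
static void *alloc_array_checked(size_t count, size_t size)
{
	size_t bytes;

	/* GCC/Clang builtin: returns true if the product overflowed. */
	if (__builtin_mul_overflow(count, size, &bytes))
		return NULL;

	return malloc(bytes);
}
```

With the open-coded form, a wrapped product would silently yield a short buffer; here the caller sees an ordinary allocation failure instead.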
2025-03-29bcachefs: Change btree_insert_node() assertion to errorKent Overstreet1-1/+16
Debug for https://github.com/koverstreet/bcachefs/issues/843 Print useful debug info and go emergency read-only. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-29bcachefs: Better printing of inconsistency errorsKent Overstreet10-151/+153
Build up and emit the error message for an inconsistency error all at once, instead of spread over multiple printk calls, so they're not jumbled in the dmesg log. Also, add better indenting. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-29bcachefs: bch2_count_fsck_err()Kent Overstreet3-38/+68
Factor out a helper from __bch2_fsck_err(), for counting the error in the superblock and deciding whether to print or ratelimit - will be used to replace some log_fsck_err() calls, where we want to lift out printing the error message. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-29Merge tag 'v6.15-p1' of ↵Linus Torvalds4-109/+210
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Remove legacy compression interface - Improve scatterwalk API - Add request chaining to ahash and acomp - Add virtual address support to ahash and acomp - Add folio support to acomp - Remove NULL dst support from acomp Algorithms: - Library options are fully hidden (selected by kernel users only) - Add Kerberos5 algorithms - Add VAES-based ctr(aes) on x86 - Ensure LZO respects output buffer length on compression - Remove obsolete SIMD fallback code path from arm/ghash-ce Drivers: - Add support for PCI device 0x1134 in ccp - Add support for rk3588's standalone TRNG in rockchip - Add Inside Secure SafeXcel EIP-93 crypto engine support in eip93 - Fix bugs in tegra uncovered by multi-threaded self-test - Fix corner cases in hisilicon/sec2 Others: - Add SG_MITER_LOCAL to sg miter - Convert ubifs, hibernate and xfrm_ipcomp from legacy API to acomp" * tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (187 commits) crypto: testmgr - Add multibuffer acomp testing crypto: acomp - Fix synchronous acomp chaining fallback crypto: testmgr - Add multibuffer hash testing crypto: hash - Fix synchronous ahash chaining fallback crypto: arm/ghash-ce - Remove SIMD fallback code path crypto: essiv - Replace memcpy() + NUL-termination with strscpy() crypto: api - Call crypto_alg_put in crypto_unregister_alg crypto: scompress - Fix incorrect stream freeing crypto: lib/chacha - remove unused arch-specific init support crypto: remove obsolete 'comp' compression API crypto: compress_null - drop obsolete 'comp' implementation crypto: cavium/zip - drop obsolete 'comp' implementation crypto: zstd - drop obsolete 'comp' implementation crypto: lzo - drop obsolete 'comp' implementation crypto: lzo-rle - drop obsolete 'comp' implementation crypto: lz4hc - drop obsolete 'comp' implementation crypto: lz4 - drop obsolete 'comp' implementation crypto: deflate - 
drop obsolete 'comp' implementation crypto: 842 - drop obsolete 'comp' implementation crypto: nx - Migrate to scomp API ...
2025-03-29exfat: call bh_read in get_block only when necessarySungjong Seo1-82/+77
With commit 11a347fb6cef ("exfat: change to get file size from DataLength"), exfat_get_block() can now handle valid_size. However, most partial unwritten blocks that could be mapped with other blocks are being inefficiently processed separately as individual blocks. Except for partial unwritten blocks that require independent processing, let's handle them simply as before. Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2025-03-29exfat: fix potential wrong error return from get_blockSungjong Seo1-0/+2
If there is no error, get_block() should return 0. However, when bh_read() returns 1, get_block() also returns 1 in the same manner. Let's set err to 0 if there is no error from bh_read(). Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength") Cc: stable@vger.kernel.org Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
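The return-value problem described above can be sketched in userspace as follows. This is an illustration, not the exfat code: stub_bh_read() and sketch_get_block() are hypothetical stand-ins, and the stub assumes the convention implied by the commit message, namely a negative value on error and a non-negative value (0 or 1) when the data is available.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical states used to drive the stub below. */
enum stub_state { STUB_IO_ERROR, STUB_NEEDS_READ, STUB_UPTODATE };

/*
 * Hypothetical stand-in for a bh_read()-style helper: negative on error,
 * 0 after a successful read, 1 when the buffer was already up to date.
 */
static int stub_bh_read(enum stub_state state)
{
	switch (state) {
	case STUB_IO_ERROR:
		return -EIO;
	case STUB_NEEDS_READ:
		return 0;
	default:
		return 1;
	}
}

/* Hypothetical get_block-style callback: callers expect 0 on success. */
static int sketch_get_block(enum stub_state state)
{
	int err = stub_bh_read(state);

	if (err < 0)
		return err;	/* propagate real errors unchanged */

	/* Both 0 and 1 mean success here, so normalize before returning. */
	err = 0;
	return err;
}
```

Without the normalization step, the "already up to date" case would leak a positive value to a caller that treats any non-zero return as failure.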
2025-03-28bcachefs: Better helpers for inconsistency errorsKent Overstreet2-38/+110
An inconsistency error often happens as part of an event with multiple error messages, and we want to build up one single error message with proper indenting to produce more readable log messages that don't get garbled. Add new helpers that emit messages to a printbuf instead of printing them directly; the next patch will convert to use them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
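The approach described above, accumulating a multi-line message in a buffer and emitting it in one call, can be sketched in userspace like this. The struct msgbuf type and its helpers are hypothetical and are not the bcachefs printbuf API; the fixed 512-byte buffer and truncation on overflow are simplifying assumptions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size message buffer with a current indent level. */
struct msgbuf {
	char   data[512];
	size_t len;
	int    indent;
};

static void msgbuf_init(struct msgbuf *b)
{
	memset(b, 0, sizeof(*b));
}

/* Append one formatted line, prefixed by the current indentation. */
static void msgbuf_line(struct msgbuf *b, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(b != NULL);

	for (int i = 0; i < b->indent && b->len + 1 < sizeof(b->data); i++)
		b->data[b->len++] = ' ';
	b->data[b->len] = '\0';

	va_start(ap, fmt);
	n = vsnprintf(b->data + b->len, sizeof(b->data) - b->len, fmt, ap);
	va_end(ap);

	if (n > 0) {
		/* vsnprintf reports the untruncated length; clamp to what fit. */
		size_t room = sizeof(b->data) - b->len - 1;

		b->len += (size_t)n < room ? (size_t)n : room;
	}

	if (b->len + 1 < sizeof(b->data)) {
		b->data[b->len++] = '\n';
		b->data[b->len] = '\0';
	}
}

/* Emit the whole message with a single write so lines stay together. */
static void msgbuf_emit(const struct msgbuf *b)
{
	fputs(b->data, stderr);
}
```

A caller would add a summary line, raise indent for the detail lines, and call msgbuf_emit() once at the end, so that concurrent writers cannot interleave their output between the lines of one report.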
2025-03-28bcachefs: Consistent indentation of multiline fsck errorsKent Overstreet20-94/+113
Add the new helper printbuf_indent_add_nextline(), and use it in __bch2_fsck_err() to centralize setting the indentation of multiline fsck errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: Add an "ignore unknown" option to bch2_parse_mount_opts()Kent Overstreet4-27/+26
To be used by the mount helper in userspace, where we still have options to be parsed by other layers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-28bcachefs: bch2_time_stats_init_no_pcpu()Kent Overstreet2-4/+17
Add a mode to disable automatic switching to percpu mode, useful when a time_stats will only be used by one thread and we don't want to have to flush the percpu buffers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>