Age | Commit message (Collapse) | Author | Files | Lines |
|
Access to SACL part of SMB security descriptor is granted by SACL privilege
which by default is accessible only for local administrator. But it can be
granted to any other user by local GPO or AD. SACL access is not granted by
DACL permissions and therefore is it possible that some user would not have
access to DACLs of some file, but would have access to SACLs of all files.
So it means that for accessing SACLs (either getting or setting) in some
cases requires not touching or asking for DACLs.
Currently Linux SMB client does not allow to get or set SACLs without
touching DACLs. Which means that user without DACL access is not able to
get or set SACLs even if it has access to SACLs.
Fix this problem by introducing a new xattr "system.smb3_ntsd_sacl" for
accessing only SACLs part of the security descriptor (therefore without
DACLs and OWNER/GROUP).
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Access psid->sub_auth[psid->num_subauth - 1] without checking
if num_subauth is non-zero leads to an out-of-bounds read.
This patch adds a validation step to ensure num_subauth != 0
before sub_auth is accessed.
Cc: stable@vger.kernel.org
Signed-off-by: Norbert Szetei <norbert@doyensec.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The dacloffset field was originally typed as int and used in an
unchecked addition, which could overflow and bypass the existing
bounds check in both smb_check_perm_dacl() and smb_inherit_dacl().
This could result in out-of-bounds memory access and a kernel crash
when dereferencing the DACL pointer.
This patch converts dacloffset to unsigned int and uses
check_add_overflow() to validate access to the DACL.
Cc: stable@vger.kernel.org
Signed-off-by: Norbert Szetei <norbert@doyensec.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
There is a race condition between session setup and
ksmbd_sessions_deregister. The session can be freed before the connection
is added to channel list of session.
This patch check reference count of session before freeing it.
Cc: stable@vger.kernel.org
Reported-by: Sean Heelan <seanheelan@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When a SMB connection is reset and reconnected, the negotiated IO
parameters (rsize/wsize) can become out of sync with the server's
current capabilities. This can lead to suboptimal performance or
even IO failures if the server's limits have changed.
This patch implements automatic IO size renegotiation:
1. Adds cifs_renegotiate_iosize() function to update all superblocks
associated with a tree connection
2. Updates each mount's rsize/wsize based on current server capabilities
3. Calls this function after successful tree connection reconnection
With this change, all mount points will automatically maintain optimal
and reliable IO parameters after network disruptions, using the
bidirectional mapping added in previous patches.
This completes the series improving connection resilience by keeping
mount parameters synchronized with server capabilities.
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
During mount option processing and negotiation with the server, the
original user-specified rsize/wsize values were being modified directly.
This makes it impossible to recover these values after a connection
reset, leading to potential degraded performance after reconnection.
The other problem is that When negotiating read and write sizes, there are
cases where the negotiated values might calculate to zero, especially
during reconnection when server->max_read or server->max_write might be
reset. In general, these values come from the negotiation response.
According to MS-SMB2 specification, these values should be at least 65536
bytes.
This patch improves IO parameter handling:
1. Adds vol_rsize and vol_wsize fields to store the original user-specified
values separately from the negotiated values
2. Uses got_rsize/got_wsize flags to determine if values were
user-specified rather than checking for non-zero values, which is more
reliable
3. Adds a prevent_zero_iosize() helper function to ensure IO sizes are
never negotiated down to zero, which could happen in edge cases like
when server->max_read/write is zero
The changes make the CIFS client more resilient to unusual server
responses and reconnection scenarios, preventing potential failures
when IO sizes are calculated to be zero.
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Currently, when a SMB connection is reset and renegotiated with the
server, there's no way to update all related mount points with new
negotiated sizes. This is because while superblocks (cifs_sb_info)
maintain references to tree connections (tcon) through tcon_link
structures, there is no reverse mapping from a tcon back to all the
superblocks using it.
This patch adds a bidirectional relationship between tcon and
cifs_sb_info structures by:
1. Adding a cifs_sb_list to tcon structure with appropriate locking
2. Adding tcon_sb_link to cifs_sb_info to join the list
3. Managing the list entries during mount and umount operations
The bidirectional relationship enables future functionality to locate and
update all superblocks connected to a specific tree connection, such as:
- Updating negotiated parameters after reconnection
- Efficiently notifying all affected mounts of capability changes
This is the first part of a series to improve connection resilience
by keeping all mount parameters in sync with server capabilities
after reconnection.
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
echo_interval is checked at mount time, the code has become
unreachable.
Signed-off-by: Roman Smirnov <r.smirnov@omp.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The echo_interval is not limited in any way during mounting,
which makes it possible to write a large number to it. This can
cause an overflow when multiplying ctx->echo_interval by HZ in
match_server().
Add constraints for echo_interval to smb3_fs_context_parse_param().
Found by Linux Verification Center (linuxtesting.org) with Svace.
Fixes: adfeb3e00e8e1 ("cifs: Make echo interval tunable")
Cc: stable@vger.kernel.org
Signed-off-by: Roman Smirnov <r.smirnov@omp.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Pull more bcachefs updates from Kent Overstreet:
"All bugfixes and logging improvements"
* tag 'bcachefs-2025-03-31' of git://evilpiepirate.org/bcachefs: (35 commits)
bcachefs: fix bch2_write_point_to_text() units
bcachefs: Log original key being moved in data updates
bcachefs: BCH_JSET_ENTRY_log_bkey
bcachefs: Reorder error messages that include journal debug
bcachefs: Don't use designated initializers for disk_accounting_pos
bcachefs: Silence errors after emergency shutdown
bcachefs: fix units in rebalance_status
bcachefs: bch2_ioctl_subvolume_destroy() fixes
bcachefs: Clear fs_path_parent on subvolume unlink
bcachefs: Change btree_insert_node() assertion to error
bcachefs: Better printing of inconsistency errors
bcachefs: bch2_count_fsck_err()
bcachefs: Better helpers for inconsistency errors
bcachefs: Consistent indentation of multiline fsck errors
bcachefs: Add an "ignore unknown" option to bch2_parse_mount_opts()
bcachefs: bch2_time_stats_init_no_pcpu()
bcachefs: Fix bch2_fs_get_tree() error path
bcachefs: fix logging in journal_entry_err_msg()
bcachefs: add missing newline in bch2_trans_updates_to_text()
bcachefs: print_string_as_lines: fix extra newline
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext2, udf, and isofs updates from Jan Kara:
- conversion of ext2 to the new mount API
- small folio conversion work for ext2
- a fix of an unexpected return value in udf in inode_getblk()
- a fix of handling of corrupted directory in isofs
* tag 'fs_for_v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Fix inode_getblk() return value
ext2: Make ext2_params_spec static
ext2: create ext2_msg_fc for use during parsing
ext2: convert to the new mount API
ext2: Remove reference to bh->b_page
isofs: fix KMSAN uninit-value bug in do_isofs_readdir()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- Fix random stack corruption and incorrect error returns in
exfat_get_block()
- Optimize exfat_get_block() by improving checking corner cases
- Fix an endless loop by self-linked chain in exfat_find_last_cluster
- Remove dead EXFAT_CLUSTERS_UNTRACKED codes
- Add missing shutdown check
- Improve the delete performance with discard mount option
* tag 'exfat-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: call bh_read in get_block only when necessary
exfat: fix potential wrong error return from get_block
exfat: fix missing shutdown check
exfat: fix the infinite loop in exfat_find_last_cluster()
exfat: fix random stack corruption after get_block
exfat: remove count used cluster from exfat_statfs()
exfat: support batch discard of clusters when freeing clusters
|
|
Pull smb server updates from Steve French:
- Two fixes for bounds checks of open contexts
- Two multichannel fixes, including one for important UAF
- Oplock/lease break fix for potential ksmbd connection refcount leak
- Security fix to free crypto data more securely
- Fix to enable allowing Kerberos authentication by default
- Two RDMA/smbdirect fixes
- Minor cleanup
* tag 'v6.15rc-part1-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix r_count dec/increment mismatch
ksmbd: fix multichannel connection failure
ksmbd: fix use-after-free in ksmbd_sessions_deregister()
ksmbd: use ib_device_get_netdev() instead of calling ops.get_netdev
ksmbd: use aead_request_free to match aead_request_alloc
Revert "ksmbd: fix missing RDMA-capable flag for IPoIB device in ksmbd_rdma_capable_netdev()"
ksmbd: add bounds check for create lease context
ksmbd: add bounds check for durable handle context
ksmbd: make SMB_SERVER_KERBEROS5 enable by default
ksmbd: Use str_read_write() and str_true_false() helpers
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull smb client updates from Steve French:
- Fix for network namespace refcount leak
- Multichannel fix and minor multichannel debug message cleanup
- Fix potential null ptr reference in SMB3 close
- Fix for special file handling when reparse points not supported by
server
- Two ACL fixes one for stricter ACE validation, one for incorrect
perms requested
- Three RFC1001 fixes: one for SMB3 mounts on port 139, one for better
default hostname, and one for better session response processing
- Minor update to email address for MAINTAINERS file
- Allow disabling Unicode for access to old SMB1 servers
- Three minor cleanups
* tag '6.15-rc-part1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Add new mount option -o nounicode to disable SMB1 UNICODE mode
cifs: Set default Netbios RFC1001 server name to hostname in UNC
smb: client: Fix netns refcount imbalance causing leaks and use-after-free
cifs: add validation check for the fields in smb_aces
CIFS: Propagate min offload along with other parameters from primary to secondary channels.
cifs: Improve establishing SMB connection with NetBIOS session
cifs: Fix establishing NetBIOS session for SMB2+ connection
cifs: Fix getting DACL-only xattr system.cifs_acl and system.smb3_acl
cifs: Check if server supports reparse points before using them
MAINTAINERS: reorder preferred email for Steve French
cifs: avoid NULL pointer dereference in dbg call
smb: client: Remove redundant check in smb2_is_path_accessible()
smb: client: Remove redundant check in cifs_oplock_break()
smb: mark the new channel addition log as informational log with cifs_info
smb: minor cleanup to remove unused function declaration
|
|
Pull nfsd updates from Chuck Lever:
"Neil Brown contributed more scalability improvements to NFSD's open
file cache, and Jeff Layton contributed a menagerie of repairs to
NFSD's NFSv4 callback / backchannel implementation.
Mike Snitzer contributed a change to NFS re-export support that
disables support for file locking on a re-exported NFSv4 mount. This
is because NFSv4 state recovery is currently difficult if not
impossible for re-exported NFS mounts. The change aims to prevent data
integrity exposures after the re-export server crashes.
Work continues on the evolving NFSD netlink administrative API.
Many thanks to the contributors, reviewers, testers, and bug reporters
who participated during the v6.15 development cycle"
* tag 'nfsd-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (45 commits)
NFSD: Add a Kconfig setting to enable delegated timestamps
sysctl: Fixes nsm_local_state bounds
nfsd: use a long for the count in nfsd4_state_shrinker_count()
nfsd: remove obsolete comment from nfs4_alloc_stid
nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
nfsd: reorganize struct nfs4_delegation for better packing
nfsd: handle errors from rpc_call_async()
nfsd: move cb_need_restart flag into cb_flags
nfsd: replace CB_GETATTR_BUSY with NFSD4_CALLBACK_RUNNING
nfsd: eliminate cl_ra_cblist and NFSD4_CLIENT_CB_RECALL_ANY
nfsd: prevent callback tasks running concurrently
nfsd: disallow file locking and delegations for NFSv4 reexport
nfsd: filecache: drop the list_lru lock during lock gc scans
nfsd: filecache: don't repeatedly add/remove files on the lru list
nfsd: filecache: introduce NFSD_FILE_RECENT
nfsd: filecache: use list_lru_walk_node() in nfsd_file_gc()
nfsd: filecache: use nfsd_file_dispose_list() in nfsd_file_close_inode_sync()
NFSD: Re-organize nfsd_file_gc_worker()
nfsd: filecache: remove race handling.
fs: nfs: acl: Avoid -Wflex-array-member-not-at-end warning
...
|
|
This was previously hard to hit since it requires racing with device
removal, but splitting up io_ref uncovered it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Struct with embedded VLA...
memcpy: detected field-spanning write (size 8) of single field "&gc->r.e" at fs/bcachefs/ec.c:465 (size 3)
WARNING: CPU: 1 PID: 936 at fs/bcachefs/ec.c:465 bch2_trigger_stripe+0x706/0x730
Modules linked in:
CPU: 1 UID: 0 PID: 936 Comm: mount.bcachefs Not tainted 6.14.0-rc6-ktest-00236-gefb0b5c62dbc #55
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:bch2_trigger_stripe+0x706/0x730
Code: b4 00 01 b9 03 00 00 00 48 89 fb 48 c7 c7 33 54 da 81 48 89 d6 49 89 d6 48 c7 c2 c3 36 db 81 e8 60 54 c5 ff 48 89 df 4c 89 f2 <0f> 0b e9 5c fd ff ff e8 fe 5e 4e 00 bf 10 00 00 00 48 c7 c6 ff ff
RSP: 0018:ffff88817081f680 EFLAGS: 00010246
RAX: f8fe7dd1c56b5600 RBX: ffff888101265368 RCX: 0000000000000027
RDX: 0000000000000008 RSI: 00000000fffbffff RDI: ffff888101265368
RBP: 0000000000000000 R08: 000000000003ffff R09: ffff88817f1fe000
R10: 00000000000bfffd R11: 0000000000000004 R12: ffff8881012652c0
R13: 0000000000000000 R14: 0000000000000008 R15: ffff88817081f6c9
FS: 00007fc428bc7c80(0000) GS:ffff888179280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd3ee4a038 CR3: 000000010a9bc000 CR4: 0000000000750eb0
PKRU: 55555554
Call Trace:
<TASK>
? __warn+0xce/0x1b0
? bch2_trigger_stripe+0x706/0x730
? report_bug+0x11b/0x1a0
? bch2_trigger_stripe+0x706/0x730
? handle_bug+0x5e/0x90
? exc_invalid_op+0x1a/0x50
? asm_exc_invalid_op+0x1a/0x20
? bch2_trigger_stripe+0x706/0x730
bch2_gc_mark_key+0x2cf/0x430
bch2_check_allocations+0x1a64/0x1ed0
? vsnprintf+0x1ad/0x420
? bch2_check_allocations+0x191f/0x1ed0
bch2_run_recovery_passes+0x13b/0x2b0
bch2_fs_recovery+0x9b7/0x1290
? __bch2_print+0xb2/0xf0
? bch2_printbuf_exit+0x1e/0x30
? print_mount_opts+0x153/0x180
bch2_fs_start+0x274/0x3b0
bch2_fs_get_tree+0x516/0x6e0
vfs_get_tree+0x21/0xa0
do_new_mount+0x153/0x350
__x64_sys_mount+0x16c/0x1f0
do_syscall_64+0x6c/0x140
? arch_exit_to_user_mode_prepare+0x9/0x40
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
For striping across devices, we maintain "clocks", and we advance them
by the inverse of "how much free space this device has left", so that we
round robin biased in favor of devices with more free space.
This code was originally trying to do EWMA-ish stuff when originally
written, ~10 years ago, and was never properly cleaned up when it was
realized that an EWMA is not the right approach here.
That left a bug, when we rescale to keep all the clocks in the correct
range and prevent overflow.
It was assumed that we'd always be allocated from the device with the
smallest clock hand, but that's actually not correct: with the target
options, allocations will be first tried from a subset of devices, and
then the entire filesystem if that fails.
Thus, the rescale from the first allocation - allocating from a subset
of devices - can pick the wrong rescale value and cause the rest of the
clocks to go to 0, losing information.
This resuls in incorrect striping behaviour when the desired number of
replicas doesn't fit on the foreground target.
Link: https://www.reddit.com/r/bcachefs/comments/1jn3t26/replica_allocation_not_evenly_distributed_among/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
When the ring is allocated, it is kzalloc-ed. ring->queue_refs will
already be initialized to 0 by default. It does not need to be
atomically set to 0.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
There is a race condition leading to a kernel crash from a null
dereference when attemping to access fc->lock in
fuse_uring_create_queue(). fc may be NULL in the case where another
thread is creating the uring in fuse_uring_create() and has set
fc->ring but has not yet set ring->fc when fuse_uring_create_queue()
reads ring->fc. There is another race condition as well where in
fuse_uring_register(), ring->nr_queues may still be 0 and not yet set
to the new value when we compare qid against it.
This fix sets fc->ring only after ring->fc and ring->nr_queues have been
set, which guarantees now that ring->fc is a proper pointer when any
queues are created and ring->nr_queues reflects the right number of
queues if ring is not NULL. We must use smp_store_release() and
smp_load_acquire() semantics to ensure the ordering will remain correct
where fc->ring is assigned only after ring->fc and ring->nr_queues have
been assigned.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Fixes: 24fe962c86f5 ("fuse: {io-uring} Handle SQEs - register commands")
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Our file system has a translation capability for S3-to-posix.
The current value of 1kiB is enough to cover S3 keys, but
does not allow encoding of %xx escape characters.
The limit is increased to (PATH_MAX - 1), as we need
3 x 1024 and that is close to PATH_MAX (4kB) already.
-1 is used as the terminating null is not included in the
length calculation.
Testing large file names was hard with libfuse/example file systems,
so I created a new memfs that does not have a 255 file name length
limitation.
https://github.com/libfuse/libfuse/pull/1077
The connection is initialized with FUSE_NAME_LOW_MAX, which
is set to the previous value of FUSE_NAME_MAX of 1024. With
FUSE_MIN_READ_BUFFER of 8192 that is enough for two file names
+ fuse headers.
When FUSE_INIT reply sets max_pages to a value > 1 we know
that fuse daemon supports request buffers of at least 2 pages
(+ header) and can therefore hold 2 x PATH_MAX file names - operations
like rename or link that need two file names are no issue then.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
fuse_notify_inval_entry and fuse_notify_delete were using fixed allocations
of FUSE_NAME_MAX to hold the file name. Often that large buffers are not
needed as file names might be smaller, so this uses the actual file name
size to do the allocation.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Introduce two new sysctls, "default_request_timeout" and
"max_request_timeout". These control how long (in seconds) a server can
take to reply to a request. If the server does not reply by the timeout,
then the connection will be aborted. The upper bound on these sysctl
values is 65535.
"default_request_timeout" sets the default timeout if no timeout is
specified by the fuse server on mount. 0 (default) indicates no default
timeout should be enforced. If the server did specify a timeout, then
default_request_timeout will be ignored.
"max_request_timeout" sets the max amount of time the server may take to
reply to a request. 0 (default) indicates no maximum timeout. If
max_request_timeout is set and the fuse server attempts to set a
timeout greater than max_request_timeout, the system will use
max_request_timeout as the timeout. Similarly, if default_request_timeout
is greater than max_request_timeout, the system will use
max_request_timeout as the timeout. If the server does not request a
timeout and default_request_timeout is set to 0 but max_request_timeout
is set, then the timeout will be max_request_timeout.
Please note that these timeouts are not 100% precise. The request may
take roughly an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the set max
timeout due to how it's internally implemented.
$ sysctl -a | grep fuse.default_request_timeout
fs.fuse.default_request_timeout = 0
$ echo 65536 | sudo tee /proc/sys/fs/fuse/default_request_timeout
tee: /proc/sys/fs/fuse/default_request_timeout: Invalid argument
$ echo 65535 | sudo tee /proc/sys/fs/fuse/default_request_timeout
65535
$ sysctl -a | grep fuse.default_request_timeout
fs.fuse.default_request_timeout = 65535
$ echo 0 | sudo tee /proc/sys/fs/fuse/default_request_timeout
0
$ sysctl -a | grep fuse.default_request_timeout
fs.fuse.default_request_timeout = 0
[Luis Henriques: Limit the timeout to the range [FUSE_TIMEOUT_TIMER_FREQ,
fuse_max_req_timeout]]
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Luis Henriques <luis@igalia.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
There are situations where fuse servers can become unresponsive or
stuck, for example if the server is deadlocked. Currently, there's no
good way to detect if a server is stuck and needs to be killed manually.
This commit adds an option for enforcing a timeout (in seconds) for
requests where if the timeout elapses without the server responding to
the request, the connection will be automatically aborted.
Please note that these timeouts are not 100% precise. For example, the
request may take roughly an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond
the requested timeout due to internal implementation, in order to
mitigate overhead.
[SzM: Bump the API version number]
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
If filesystem doesn't support FUSE_LINK (i.e. returns -ENOSYS), then
remember this and next time return immediately, without incurring the
overhead of a round trip to the server.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
link() is documented to return EPERM when a filesystem doesn't support
the operation, return that instead.
Link: https://github.com/libfuse/libfuse/issues/925
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Function fuse_uring_create() is used only from dev_uring.c and does not
need to be exposed in the header file. Furthermore, it has the wrong
signature.
While there, also remove the 'struct fuse_ring' forward declaration.
Signed-off-by: Luis Henriques <luis@igalia.com>
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
task-A (application) might be in request_wait_answer and
try to remove the request when it has FR_PENDING set.
task-B (a fuse-server io-uring task) might handle this
request with FUSE_IO_URING_CMD_COMMIT_AND_FETCH, when
fetching the next request and accessed the req from
the pending list in fuse_uring_ent_assign_req().
That code path was not protected by fiq->lock and so
might race with task-A.
For scaling reasons we better don't use fiq->lock, but
add a handler to remove canceled requests from the queue.
This also removes usage of fiq->lock from
fuse_uring_add_req_to_ring_ent() altogether, as it was
there just to protect against this race and incomplete.
Also added is a comment why FR_PENDING is not cleared.
Fixes: c090c8abae4b ("fuse: Add io-uring sqe commit and fetch support")
Cc: <stable@vger.kernel.org> # v6.14
Reported-by: Joanne Koong <joannelkoong@gmail.com>
Closes: https://lore.kernel.org/all/CAJnrk1ZgHNb78dz-yfNTpxmW7wtT88A=m-zF0ZoLXKLUHRjNTw@mail.gmail.com/
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
There's something going on with the data move path; log the original key
being moved for debugging.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a journal entry type for logging - but logging a bkey, not a string;
to be used for data move path debugging.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Not all compilers fully initialize these - they're not guaranteed to
because of the union shenanigans.
Fixes: https://github.com/koverstreet/bcachefs/issues/844
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We don't care about errors from asynchronous ops that were because we
did an emergency shutdown; silence them.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_evict_subvolume_inodes() was getting stuck - due to incorrectly
pruning the dcache.
Also, fix missing permissions checks.
Reported-by: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
"For this merge window we're splitting BPF pull request into three for
higher visibility: main changes, res_spin_lock, try_alloc_pages.
These are the main BPF changes:
- Add DFA-based live registers analysis to improve verification of
programs with loops (Eduard Zingerman)
- Introduce load_acquire and store_release BPF instructions and add
x86, arm64 JIT support (Peilin Ye)
- Fix loop detection logic in the verifier (Eduard Zingerman)
- Drop unnecesary lock in bpf_map_inc_not_zero() (Eric Dumazet)
- Add kfunc for populating cpumask bits (Emil Tsalapatis)
- Convert various shell based tests to selftests/bpf/test_progs
format (Bastien Curutchet)
- Allow passing referenced kptrs into struct_ops callbacks (Amery
Hung)
- Add a flag to LSM bpf hook to facilitate bpf program signing
(Blaise Boscaccy)
- Track arena arguments in kfuncs (Ihor Solodrai)
- Add copy_remote_vm_str() helper for reading strings from remote VM
and bpf_copy_from_user_task_str() kfunc (Jordan Rome)
- Add support for timed may_goto instruction (Kumar Kartikeya
Dwivedi)
- Allow bpf_get_netns_cookie() int cgroup_skb programs (Mahe Tardy)
- Reduce bpf_cgrp_storage_busy false positives when accessing cgroup
local storage (Martin KaFai Lau)
- Introduce bpf_dynptr_copy() kfunc (Mykyta Yatsenko)
- Allow retrieving BTF data with BTF token (Mykyta Yatsenko)
- Add BPF kfuncs to set and get xattrs with 'security.bpf.' prefix
(Song Liu)
- Reject attaching programs to noreturn functions (Yafang Shao)
- Introduce pre-order traversal of cgroup bpf programs (Yonghong
Song)"
* tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (186 commits)
selftests/bpf: Add selftests for load-acquire/store-release when register number is invalid
bpf: Fix out-of-bounds read in check_atomic_load/store()
libbpf: Add namespace for errstr making it libbpf_errstr
bpf: Add struct_ops context information to struct bpf_prog_aux
selftests/bpf: Sanitize pointer prior fclose()
selftests/bpf: Migrate test_xdp_vlan.sh into test_progs
selftests/bpf: test_xdp_vlan: Rename BPF sections
bpf: clarify a misleading verifier error message
selftests/bpf: Add selftest for attaching fexit to __noreturn functions
bpf: Reject attaching fexit/fmod_ret to __noreturn functions
bpf: Only fails the busy counter check in bpf_cgrp_storage_get if it creates storage
bpf: Make perf_event_read_output accessible in all program types.
bpftool: Using the right format specifiers
bpftool: Add -Wformat-signedness flag to detect format errors
selftests/bpf: Test freplace from user namespace
libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID
bpf: Return prog btf_id without capable check
bpf: BPF token support for BPF_BTF_GET_FD_BY_ID
bpf, x86: Fix objtool warning for timed may_goto
bpf: Check map->record at the beginning of check_and_free_fields()
...
|
|
This fixes recursive subvolume removal.
Subvolume deletion is asynchronous; fs_path_parent, and thus the entry
in the subvolume_children btree, need to be cleared when the subvolume
is unlinked from the fs heirarchy - else we'll spuriously think a
subvolume has children and deletion will fail.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
- Decouple mixed mode startup code from the traditional x86
decompressor
- Revert zero-length file hack in efivarfs
- Prevent EFI zboot from using the CopyMem/SetMem boot services after
ExitBootServices()
- Update EFI zboot to use the ZLIB/ZSTD library interfaces directly
* tag 'efi-next-for-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi/libstub: Avoid legacy decompressor zlib/zstd wrappers
efi/libstub: Avoid CopyMem/SetMem EFI services after ExitBootServices
efi: efibc: change kmalloc(size * count, ...) to kmalloc_array()
efivarfs: Revert "allow creation of zero length files"
x86/efi/mixed: Move mixed mode startup code into libstub
x86/efi/mixed: Simplify and document thunking logic
x86/efi/mixed: Remove dependency on legacy startup_32 code
x86/efi/mixed: Set up 1:1 mapping of lower 4GiB in the stub
x86/efi/mixed: Factor out and clean up long mode entry
x86/efi/mixed: Check CPU compatibility without relying on verify_cpu()
x86/efistub: Merge PE and handover entrypoints
|
|
Debug for https://github.com/koverstreet/bcachefs/issues/843
Print useful debug info and go emergency read-only.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Build up and emit the error message for an inconsistency error all at
once, instead of spread over multiple printk calls, so they're not
jumbled in the dmesg log.
Also, add better indenting.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Factor out a helper from __bch2_fsck_err(), for counting the error in
the superblock and deciding whether to print or ratelimit - will be used
to replace some log_fsck_err() calls, where we want to lift out printing
the error message.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Remove legacy compression interface
- Improve scatterwalk API
- Add request chaining to ahash and acomp
- Add virtual address support to ahash and acomp
- Add folio support to acomp
- Remove NULL dst support from acomp
Algorithms:
- Library options are fuly hidden (selected by kernel users only)
- Add Kerberos5 algorithms
- Add VAES-based ctr(aes) on x86
- Ensure LZO respects output buffer length on compression
- Remove obsolete SIMD fallback code path from arm/ghash-ce
Drivers:
- Add support for PCI device 0x1134 in ccp
- Add support for rk3588's standalone TRNG in rockchip
- Add Inside Secure SafeXcel EIP-93 crypto engine support in eip93
- Fix bugs in tegra uncovered by multi-threaded self-test
- Fix corner cases in hisilicon/sec2
Others:
- Add SG_MITER_LOCAL to sg miter
- Convert ubifs, hibernate and xfrm_ipcomp from legacy API to acomp"
* tag 'v6.15-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (187 commits)
crypto: testmgr - Add multibuffer acomp testing
crypto: acomp - Fix synchronous acomp chaining fallback
crypto: testmgr - Add multibuffer hash testing
crypto: hash - Fix synchronous ahash chaining fallback
crypto: arm/ghash-ce - Remove SIMD fallback code path
crypto: essiv - Replace memcpy() + NUL-termination with strscpy()
crypto: api - Call crypto_alg_put in crypto_unregister_alg
crypto: scompress - Fix incorrect stream freeing
crypto: lib/chacha - remove unused arch-specific init support
crypto: remove obsolete 'comp' compression API
crypto: compress_null - drop obsolete 'comp' implementation
crypto: cavium/zip - drop obsolete 'comp' implementation
crypto: zstd - drop obsolete 'comp' implementation
crypto: lzo - drop obsolete 'comp' implementation
crypto: lzo-rle - drop obsolete 'comp' implementation
crypto: lz4hc - drop obsolete 'comp' implementation
crypto: lz4 - drop obsolete 'comp' implementation
crypto: deflate - drop obsolete 'comp' implementation
crypto: 842 - drop obsolete 'comp' implementation
crypto: nx - Migrate to scomp API
...
|
|
With commit 11a347fb6cef ("exfat: change to get file size from DataLength"),
exfat_get_block() can now handle valid_size. However, most partial
unwritten blocks that could be mapped with other blocks are being
inefficiently processed separately as individual blocks.
Except for partial unwritten blocks that require independent processing,
let's handle them simply as before.
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
If there is no error, get_block() should return 0. However, when bh_read()
returns 1, get_block() also returns 1 in the same manner.
Let's set err to 0, if there is no error from bh_read()
Fixes: 11a347fb6cef ("exfat: change to get file size from DataLength")
Cc: stable@vger.kernel.org
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
An inconsistency error often happens as part of an event with multiple
error messages, and we want to build up one single error message with
proper indenting to produce more readable log messages that don't get
garbled.
Add new helpers that emit messages to a printbuf instead of printing
them directly, next patch will convert to use them.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add the new helper printbuf_indent_add_nextline(), and use it in
__bch2_fsck_err() to centralize setting the indentation of multiline
fsck errors.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
To be used by the mount helper in userspace, where we still have options
to be parsed by other layers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a mode to disable automatic switching to percpu mode, useful when a
time_stats will only be used by one thread and we don't want to have to
flush the percpu buffers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|