Age | Commit message (Collapse) | Author | Files | Lines |
|
Don't check root dir dentry in cifs_dentry_needs_reval() as its inode
was created before the cfid, so the time check will fail and trigger an
unnecessary revalidation.
Also account for dir_cache_timeout in time comparison, because if we
have a cached dir, we have a lease for it, and, thus, may assume its
children are (still) valid. To confirm that, let the ac*max checks
go through for granular results.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
A dentry is passed to cifs_statfs(), so pass down d_is_dir() to
smb2_queryfs() so we can cache/reuse this dir.
Other:
- make smb2_compound_op a static function, as it's not used anywhere
else
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
When we have an inode on upper levels, pass is_dir down to
smb2_query_path_info() so we can lookup for a cached dir there.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
Replace with ->unique_id and ->dtype -- the only fields used from
cifs_fattr.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
The file struct passed down to cifs_readdir() is a stack variable, which
means it makes no sense to keep/track it across a cached dir lifetime.
Instead, use it to track concurrent accesses to the same cached path,
and wait for the previous one to finish filling/emitting.
Without this patch, virtually every 'ls' will issue a Find request,
even when we have both directory and dirents cached and valid.
With the patch, even on highly concurrent scenarios, the chances of
cache hit are increased a lot => reduced network calls.
(*) The retry/wait path adds a cond_resched() call to allow the
concurrent readdir to finish, but this path is only reached ~1/10000
times (xfstests generic/241 (dbench 4 clients) on my machine).
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
Currently, even when we have a valid cached dir, cifs_readdir will not
make use of it for the Find request, and will reopen a separate handle
in query_dir_first.
Fix this by setting cifsFile->fid to cfid->fid and resetting search info
parameters. Also add cifs_search_info->reset_scan to indicate
SMB2_query_directory_init to include the SMB2_RESTART_SCANS flag.
With this, we use query_dir_next directly instead of query_dir_first.
This patch also keeps the cfid reference all through cifs_readdir().
To prevent bogus/invalid usage of it, check if cfid is still valid after
each possible network call (i.e. where possible reconnects may have
happened).
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
In SMB2_open_init(), lookup for a cached parent of oparms->path
(child/target) and set ParentLeaseKey in lease context if found.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
Add support for find_cached_dir() to lookup for a path's parent.
* will be used to set ParentLeaseKey on opens
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
Add cached_fid->state to track a cfid state more accurately and
flexibly.
Remove bools ->has_lease, ->is_open, ->file_all_info_valid.
Add states:
- CFID_IS_OPEN
- CFID_IS_OPENING
- CFID_INVALID
See cached_dir.h for more info on state flags.
Note CFID_IS_OPEN covers both ->has_lease and ->is_open because,
after all, we only have a lease if we opened it to begin with.
Other changes:
- ->file_all_info becomes a pointer and its validity becomes its
allocation (NULL or not NULL)
- move full cfid initialization to the end of open_cached_dir() (instead
of in between calls), except CFID_IS_OPEN
- remove unnecessary list lock when parsing response in
open_cached_dir()
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
s/drop_cached_dir_by_name/drop_cached_dir/
Make it a generic find + invalidate function to replace
drop_cached_dir_by_name() and cached_dir_lease_break().
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
Add find_cfid() to allow looking up the 3 modes (path, dentry,
and lease key) used to find a cached dir.
Caller exposed function (find_cached_dir()) checks if cfid is
mid-creation in open_cached_dir() and retries the lookup, avoiding
opening the same path again.
Changes:
- expose find_cached_dir()
- add CFID_LOOKUP_* modes
- remove @lookup_only arg from open_cached_dir(), replace, in calllers,
with find_cached_dir() where it was true
- remove open_cached_dir_by_dentry(), replace with find_cached_dir()
- use find_cached_dir() in cached_dir_lease_break()
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
Clearer semantics aside, this makes find_cached_dir() open to
enhancements (and easier to do so).
Other:
- move initialization part into init_cached_dir() and add
find_cached_dir() for lookups
- drop_cached_dir_by_name():
* use find_cached_dir()
* drop no longer used args
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
free_cached_dir() is no longer used anywhere else.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
close_all_cached_dirs(), invalidate_all_cached_dirs() and
free_cached_dirs() have become too similar now, merge their
functionality in a single static invalidate_all_cfids() function.
This also allows removing free_cached_dirs() altogether as it only
requires cancelling the work afterwards (done directly in
tconInfoFree()).
Other changes:
- remove struct cached_dir_dentry
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
cfids are always on some list throughout its lifetime, so we can drop
this field.
Decrement cfids->num_entries now based on valid -> invalid transition.
Other changes:
- add helpers cfid_expired() and cfid_is_valid()
- check cfid_is_valid() even on success when leaving open_cached_dir()
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
Since any cleanup is done on laundromat, the dying list can be removed.
- entries stays on the main list until they're schedule for cleanup
(->last_access_time = 1)
- cached_fids->num_entries is decremented only when cfid transitions
from on_list true -> false
cached_fid lifecycle on the list becomes:
- list_add() on find_or_create_cached_dir()
- list_move() to local list on laundromat
- list_del_init() from local list
- list_del() on release callback
Other changes:
- add invalidate_cfid() helper to centralize cfid invalidation criteria
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
Move cfid to dying list directly on cached_dir_lease_break(), and
schedule laundromat for cleanup there too.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
Make put_work an 'async dput' and then move cfid to dying list so
laundromat can clean it up (put last ref).
Other changes:
- move SMB2_close() from release callback to laundromat
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
We can do the same cleanup on laundromat.
On invalidate_all_cached_dirs(), run laundromat worker with 0 timeout
and flush it for immediate + sync cleanup.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
|
|
fs/smb/common/cifs_arc4.c has an implementation of ARC4, but a copy of
this same code is also present in lib/crypto/arc4.c to serve the other
users of this legacy algorithm in the kernel. Remove the duplicate
implementation in fs/smb/, which seems to have been added because of a
misunderstanding, and just use the lib/crypto/ one.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Rename of open files in SMB2+ has been broken for a very long time,
resulting in data loss as the CIFS client would fail the rename(2)
call with -ENOENT and then removing the target file.
Fix this by implementing ->rename_pending_delete() for SMB2+, which
will rename busy files to random filenames (e.g. silly rename) during
unlink(2) or rename(2), and then marking them to delete-on-close.
Besides, introduce a FIND_RD_NO_PENDING_DELETE flag for
cifs_get_readable_path() to be used in smb2_query_path_info() and
smb2_query_reparse_point() so we don't end up reusing open handles of
files that were already removed.
Reported-by: Jean-Baptiste Denis <jbdenis@pasteur.fr>
Closes: https://marc.info/?i=16aeb380-30d4-4551-9134-4e7d1dc833c0@pasteur.fr
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: stable@vger.kernel.org
Cc: Frank Sorenson <sorenson@redhat.com>
Cc: Olga Kornievskaia <okorniev@redhat.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Scott Mayhew <smayhew@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The encryption layer can't handle the padding iovs, so flatten the
compound request into a single buffer with required padding to prevent
the server from dropping the connection when finding unaligned
compound requests.
Fixes: bc925c1216f0 ("smb: client: improve compound padding in encryption")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: linux-cifs@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Pull smb client fixes from Steve French:
- Fix two potential NULL pointer references
- Two debugging improvements (to help debug recent issues) a new
tracepoint, and minor improvement to DebugData
- Trivial comment cleanup
* tag '6.17-RC4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: prevent NULL pointer dereference in UTF16 conversion
smb: client: show negotiated cipher in DebugData
smb: client: add new tracepoint to trace lease break notification
smb: client: fix spellings in comments
smb: client: Fix NULL pointer dereference in cifs_debug_dirs_proc_show()
|
|
There can be a NULL pointer dereference bug here. NULL is passed to
__cifs_sfu_make_node without checks, which passes it unchecked to
cifs_strndup_to_utf16, which in turn passes it to
cifs_local_to_utf16_bytes where '*from' is dereferenced, causing a crash.
This patch adds a check for NULL 'src' in cifs_strndup_to_utf16 and
returns NULL early to prevent dereferencing NULL pointer.
Found by Linux Verification Center (linuxtesting.org) with SVACE
Signed-off-by: Makar Semyonov <m.semenov@tssltd.ru>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Pull smb server fix from Steve French:
- fix handling filenames with ":" (colon) in them
* tag 'v6.17-rc4-ksmbd-fix' of git://git.samba.org/ksmbd:
ksmbd: allow a filename to contain colons on SMB3.1.1 posix extensions
|
|
Print the negotiated encryption cipher type in DebugData
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Add smb3_lease_break_enter to trace lease break notifications,
recording lease state, flags, epoch, and lease key. Align
smb3_lease_not_found to use the same payload and print format.
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
correct spellings in comments
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"17 hotfixes. 13 are cc:stable and the remainder address post-6.16
issues or aren't considered necessary for -stable kernels. 11 of these
fixes are for MM.
This includes a three-patch series from Harry Yoo which fixes an
intermittent boot failure which can occur on x86 systems. And a
two-patch series from Alexander Gordeev which fixes a KASAN crash on
S390 systems"
* tag 'mm-hotfixes-stable-2025-09-01-17-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: fix possible deadlock in kmemleak
x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings()
mm: introduce and use {pgd,p4d}_populate_kernel()
mm: move page table sync declarations to linux/pgtable.h
proc: fix missing pde_set_flags() for net proc files
mm: fix accounting of memmap pages
mm/damon/core: prevent unnecessary overflow in damos_set_effective_quota()
kexec: add KEXEC_FILE_NO_CMA as a legal flag
kasan: fix GCC mem-intrinsic prefix with sw tags
mm/kasan: avoid lazy MMU mode hazards
mm/kasan: fix vmalloc shadow memory (de-)population races
kunit: kasan_test: disable fortify string checker on kasan_strings() test
selftests/mm: fix FORCE_READ to read input value correctly
mm/userfaultfd: fix kmap_local LIFO ordering for CONFIG_HIGHPTE
ocfs2: prevent release journal inode after journal shutdown
rust: mm: mark VmaNew as transparent
of_numa: fix uninitialized memory nodes causing kernel panic
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix a few races related to inode link count
- fix inode leak on failure to add link to inode
- move transaction aborts closer to where they happen
* tag 'for-6.17-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: avoid load/store tearing races when checking if an inode was logged
btrfs: fix race between setting last_dir_index_offset and inode logging
btrfs: fix race between logging inode and checking if it was logged before
btrfs: simplify error handling logic for btrfs_link()
btrfs: fix inode leak on failure to add link to inode
btrfs: abort transaction on failure to add link to inode
|
|
If the client sends SMB2_CREATE_POSIX_CONTEXT to ksmbd, allow the filename
to contain a colon (':'). This requires disabling the support for Alternate
Data Streams (ADS), which are denoted by a colon-separated suffix to the
filename on Windows. This should not be an issue, since this concept is not
known to POSIX anyway and the client has to explicitly request a POSIX
context to get this behavior.
Link: https://lore.kernel.org/all/f9401718e2be2ab22058b45a6817db912784ef61.camel@rx2.rx-server.de/
Signed-off-by: Philipp Kerling <pkerling@casix.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Reading /proc/fs/cifs/open_dirs may hit a NULL dereference when
tcon->cfids is NULL.
Add NULL check before accessing cfids to prevent the crash.
Reproduction:
- Mount CIFS share
- cat /proc/fs/cifs/open_dirs
Fixes: 844e5c0eb176 ("smb3 client: add way to show directory leases for improved debugging")
Signed-off-by: Wang Zhaolong <wangzhaolong@huaweicloud.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
- Assorted fixes for the OP-TEE based pseudo-EFI variable store
- Fix for an OOB access when looking up the same non-existing efivarfs
entry multiple times in parallel
* tag 'efi-fixes-for-v6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efivarfs: Fix slab-out-of-bounds in efivarfs_d_compare
efi: stmm: Drop unneeded null pointer check
efi: stmm: Drop unused EFI error from setup_mm_hdr arguments
efi: stmm: Do not return EFI_OUT_OF_RESOURCES on internal errors
efi: stmm: Fix incorrect buffer allocation method
|
|
Pull smb client fixes from Steve French:
- Fix possible refcount leak in compound operations
- Fix remap_file_range() return code mapping, found by generic/157
* tag 'v6.17-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
fs/smb: Fix inconsistent refcnt update
smb3 client: fix return code mapping of remap_file_range
|
|
Pull xfs fixes from Carlos Maiolino:
"The highlight I'd like to point here is related to the XFS_RT
Kconfig, which has been updated to be enabled by default now if
CONFIG_BLK_DEV_ZONED is enabled.
This also contains a few fixes for zoned devices support in XFS,
specially related to swapon requests in inodes belonging to the zoned
FS.
A null-ptr dereference fix in the xattr data, due to a mishandling of
medium errors generated by block devices is also included"
* tag 'xfs-fixes-6.17-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: do not propagate ENODATA disk errors into xattr code
xfs: reject swapon for inodes on a zoned file system earlier
xfs: kick off inodegc when failing to reserve zoned blocks
xfs: remove xfs_last_used_zone
xfs: Default XFS_RT to Y if CONFIG_BLK_DEV_ZONED is enabled
|
|
Observed on kernel 6.6 (present on master as well):
BUG: KASAN: slab-out-of-bounds in memcmp+0x98/0xd0
Call trace:
kasan_check_range+0xe8/0x190
__asan_loadN+0x1c/0x28
memcmp+0x98/0xd0
efivarfs_d_compare+0x68/0xd8
__d_lookup_rcu_op_compare+0x178/0x218
__d_lookup_rcu+0x1f8/0x228
d_alloc_parallel+0x150/0x648
lookup_open.isra.0+0x5f0/0x8d0
open_last_lookups+0x264/0x828
path_openat+0x130/0x3f8
do_filp_open+0x114/0x248
do_sys_openat2+0x340/0x3c0
__arm64_sys_openat+0x120/0x1a0
If dentry->d_name.len < EFI_VARIABLE_GUID_LEN , 'guid' can become
negative, leadings to oob. The issue can be triggered by parallel
lookups using invalid filename:
T1 T2
lookup_open
->lookup
simple_lookup
d_add
// invalid dentry is added to hash list
lookup_open
d_alloc_parallel
__d_lookup_rcu
__d_lookup_rcu_op_compare
hlist_bl_for_each_entry_rcu
// invalid dentry can be retrieved
->d_compare
efivarfs_d_compare
// oob
Fix it by checking 'guid' before cmp.
Fixes: da27a24383b2 ("efivarfs: guid part of filenames are case-insensitive")
Signed-off-by: Li Nan <linan122@huawei.com>
Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
To avoid potential UAF issues during module removal races, we use
pde_set_flags() to save proc_ops flags in PDE itself before
proc_register(), and then use pde_has_proc_*() helpers instead of directly
dereferencing pde->proc_ops->*.
However, the pde_set_flags() call was missing when creating net related
proc files. This omission caused incorrect behavior which FMODE_LSEEK was
being cleared inappropriately in proc_reg_open() for net proc files. Lars
reported it in this link[1].
Fix this by ensuring pde_set_flags() is called when register proc entry,
and add NULL check for proc_ops in pde_set_flags().
[wangzijie1@honor.com: stash pde->proc_ops in a local const variable, per Christian]
Link: https://lkml.kernel.org/r/20250821105806.1453833-1-wangzijie1@honor.com
Link: https://lkml.kernel.org/r/20250818123102.959595-1-wangzijie1@honor.com
Link: https://lore.kernel.org/all/20250815195616.64497967@chagall.paradoxon.rec/ [1]
Fixes: ff7ec8dc1b64 ("proc: use the same treatment to check proc_lseek as ones for proc_read_iter et.al")
Signed-off-by: wangzijie <wangzijie1@honor.com>
Reported-by: Lars Wendler <polynomial-c@gmx.de>
Tested-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Petr Vaněk <pv@excello.cz>
Tested by: Lars Wendler <polynomial-c@gmx.de>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Kirill A. Shutemov <k.shutemov@gmail.com>
Cc: wangzijie <wangzijie1@honor.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Before calling ocfs2_delete_osb(), ocfs2_journal_shutdown() has already
been executed in ocfs2_dismount_volume(), so osb->journal must be NULL.
Therefore, the following calltrace will inevitably fail when it reaches
jbd2_journal_release_jbd_inode().
ocfs2_dismount_volume()->
ocfs2_delete_osb()->
ocfs2_free_slot_info()->
__ocfs2_free_slot_info()->
evict()->
ocfs2_evict_inode()->
ocfs2_clear_inode()->
jbd2_journal_release_jbd_inode(osb->journal->j_journal,
Adding osb->journal checks will prevent null-ptr-deref during the above
execution path.
Link: https://lkml.kernel.org/r/tencent_357489BEAEE4AED74CBD67D246DBD2C4C606@qq.com
Fixes: da5e7c87827e ("ocfs2: cleanup journal init and shutdown")
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Reported-by: syzbot+47d8cb2f2cc1517e515a@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=47d8cb2f2cc1517e515a
Tested-by: syzbot+47d8cb2f2cc1517e515a@syzkaller.appspotmail.com
Reviewed-by: Mark Tinguely <mark.tinguely@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
A possible inconsistent update of refcount was identified in `smb2_compound_op`.
Such inconsistent update could lead to possible resource leaks.
Why it is a possible bug:
1. In the comment section of the function, it clearly states that the
reference to `cfile` should be dropped after calling this function.
2. Every control flow path would check and drop the reference to
`cfile`, except the patched one.
3. Existing callers would not handle refcount update of `cfile` if
-ENOMEM is returned.
To fix the bug, an extra goto label "out" is added, to make sure that the
cleanup logic would always be respected. As the problem is caused by the
allocation failure of `vars`, the cleanup logic between label "finished"
and "out" can be safely ignored. According to the definition of function
`is_replayable_error`, the error code of "-ENOMEM" is not recoverable.
Therefore, the replay logic also gets ignored.
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ENODATA (aka ENOATTR) has a very specific meaning in the xfs xattr code;
namely, that the requested attribute name could not be found.
However, a medium error from disk may also return ENODATA. At best,
this medium error may escape to userspace as "attribute not found"
when in fact it's an IO (disk) error.
At worst, we may oops in xfs_attr_leaf_get() when we do:
error = xfs_attr_leaf_hasname(args, &bp);
if (error == -ENOATTR) {
xfs_trans_brelse(args->trans, bp);
return error;
}
because an ENODATA/ENOATTR error from disk leaves us with a null bp,
and the xfs_trans_brelse will then null-deref it.
As discussed on the list, we really need to modify the lower level
IO functions to trap all disk errors and ensure that we don't let
unique errors like this leak up into higher xfs functions - many
like this should be remapped to EIO.
However, this patch directly addresses a reported bug in the xattr
code, and should be safe to backport to stable kernels. A larger-scope
patch to handle more unique errors at lower levels can follow later.
(Note, prior to 07120f1abdff we did not oops, but we did return the
wrong error code to userspace.)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
Cc: stable@vger.kernel.org # v5.9+
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
|
We were returning -EOPNOTSUPP for various remap_file_range cases
but for some of these the copy_file_range_syscall() requires -EINVAL
to be returned (e.g. where source and target file ranges overlap when
source and target are the same file). This fixes xfstest generic/157
which was expecting EINVAL for that (and also e.g. for when the src
offset is beyond end of file).
Cc: stable@vger.kernel.org
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core
Pull driver core fixes from Danilo Krummrich:
- Fix swapped handling of lru_gen and lru_gen_full debugfs files in
vmscan
- Fix debugfs mount options (uid, gid, mode) being silently ignored
- Fix leak of devres action in the unwind path of Devres::new()
- Documentation:
- Expand and fix documentation of (outdated) Device, DeviceContext
and generic driver infrastructure
- Fix C header link of faux device abstractions
- Clarify expected interaction with the security team
- Smooth text flow in the security bug reporting process
documentation
* tag 'driver-core-6.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core:
Documentation: smooth the text flow in the security bug reporting process
Documentation: clarify the expected collaboration with security bugs reporters
debugfs: fix mount options not being applied
rust: devres: fix leaking call to devm_add_action()
rust: faux: fix C header link
driver: rust: expand documentation for driver infrastructure
device: rust: expand documentation for Device
device: rust: expand documentation for DeviceContext
mm/vmscan: fix inverted polarity in lru_gen_seq_show()
|
|
Pull smb client fix from Steve French:
"Fix for netfs smb3 oops"
* tag '6.17-rc2-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix oops due to uninitialised variable
|
|
Pull NFS client fix from Trond Myklebust:
- NFS: Fix a data corrupting race when updating an existing write
* tag 'nfs-for-6.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Fix a race when updating an existing write
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"20 hotfixes. 10 are cc:stable and the remainder address post-6.16
issues or aren't considered necessary for -stable kernels. 17 of these
fixes are for MM.
As usual, singletons all over the place, apart from a three-patch
series of KHO followup work from Pasha which is actually also a bunch
of singletons"
* tag 'mm-hotfixes-stable-2025-08-21-18-17' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/mremap: fix WARN with uffd that has remap events disabled
mm/damon/sysfs-schemes: put damos dests dir after removing its files
mm/migrate: fix NULL movable_ops if CONFIG_ZSMALLOC=m
mm/damon/core: fix damos_commit_filter not changing allow
mm/memory-failure: fix infinite UCE for VM_PFNMAP pfn
MAINTAINERS: mark MGLRU as maintained
mm: rust: add page.rs to MEMORY MANAGEMENT - RUST
iov_iter: iterate_folioq: fix handling of offset >= folio size
selftests/damon: fix selftests by installing drgn related script
.mailmap: add entry for Easwar Hariharan
selftests/mm: add test for invalid multi VMA operations
mm/mremap: catch invalid multi VMA moves earlier
mm/mremap: allow multi-VMA move when filesystem uses thp_get_unmapped_area
mm/damon/core: fix commit_ops_filters by using correct nth function
tools/testing: add linux/args.h header and fix radix, VMA tests
mm/debug_vm_pgtable: clear page table entries at destroy_args()
squashfs: fix memory leak in squashfs_fill_super
kho: warn if KHO is disabled due to an error
kho: mm: don't allow deferred struct page with KHO
kho: init new_physxa->phys_bits to fix lockdep
|
|
At inode_logged() we do a couple lockless checks for ->logged_trans, and
these are generally safe except the second one in case we get a load or
store tearing due to a concurrent call updating ->logged_trans (either at
btrfs_log_inode() or later at inode_logged()).
In the first case it's safe to compare to the current transaction ID since
once ->logged_trans is set the current transaction, we never set it to a
lower value.
In the second case, where we check if it's greater than zero, we are prone
to load/store tearing races, since we can have a concurrent task updating
to the current transaction ID with store tearing for example, instead of
updating with a single 64 bits write, to update with two 32 bits writes or
four 16 bits writes. In that case the reading side at inode_logged() could
see a positive value that does not match the current transaction and then
return a false negative.
Fix this by doing the second check while holding the inode's spinlock, add
some comments about it too. Also add the data_race() annotation to the
first check to avoid any reports from KCSAN (or similar tools) and comment
about it.
Fixes: 0f8ce49821de ("btrfs: avoid inode logging during rename and link when possible")
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
At inode_logged() if we find that the inode was not logged before we
update its ->last_dir_index_offset to (u64)-1 with the goal that the
next directory log operation will see the (u64)-1 and then figure out
it must check what was the index of the last logged dir index key and
update ->last_dir_index_offset to that key's offset (this is done in
update_last_dir_index_offset()).
This however has a possibility for a time window where a race can happen
and lead to directory logging skipping dir index keys that should be
logged. The race happens like this:
1) Task A calls inode_logged(), sees ->logged_trans as 0 and then checks
that the inode item was logged before, but before it sets the inode's
->last_dir_index_offset to (u64)-1...
2) Task B is at btrfs_log_inode() which calls inode_logged() early, and
that has set ->last_dir_index_offset to (u64)-1;
3) Task B then enters log_directory_changes() which calls
update_last_dir_index_offset(). There it sees ->last_dir_index_offset
is (u64)-1 and that the inode was logged before (ctx->logged_before is
true), and so it searches for the last logged dir index key in the log
tree and it finds that it has an offset (index) value of N, so it sets
->last_dir_index_offset to N, so that we can skip index keys that are
less than or equal to N (later at process_dir_items_leaf());
4) Task A now sets ->last_dir_index_offset to (u64)-1, undoing the update
that task B just did;
5) Task B will now skip every index key when it enters
process_dir_items_leaf(), since ->last_dir_index_offset is (u64)-1.
Fix this by making inode_logged() not touch ->last_dir_index_offset and
initializing it to 0 when an inode is loaded (at btrfs_alloc_inode()) and
then having update_last_dir_index_offset() treat a value of 0 as meaning
we must check the log tree and update with the index of the last logged
index key. This is fine since the minimum possible value for
->last_dir_index_offset is 1 (BTRFS_DIR_START_INDEX - 1 = 2 - 1 = 1).
This also simplifies the management of ->last_dir_index_offset and now
all accesses to it are done under the inode's log_mutex.
Fixes: 0f8ce49821de ("btrfs: avoid inode logging during rename and link when possible")
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
There's a race between checking if an inode was logged before and logging
an inode that can cause us to mark an inode as not logged just after it
was logged by a concurrent task:
1) We have inode X which was not logged before neither in the current
transaction not in past transaction since the inode was loaded into
memory, so it's ->logged_trans value is 0;
2) We are at transaction N;
3) Task A calls inode_logged() against inode X, sees that ->logged_trans
is 0 and there is a log tree and so it proceeds to search in the log
tree for an inode item for inode X. It doesn't see any, but before
it sets ->logged_trans to N - 1...
3) Task B calls btrfs_log_inode() against inode X, logs the inode and
sets ->logged_trans to N;
4) Task A now sets ->logged_trans to N - 1;
5) At this point anyone calling inode_logged() gets 0 (inode not logged)
since ->logged_trans is greater than 0 and less than N, but our inode
was really logged. As a consequence operations like rename, unlink and
link that happen afterwards in the current transaction end up not
updating the log when they should.
Fix this by ensuring inode_logged() only updates ->logged_trans in case
the inode item is not found in the log tree if after tacking the inode's
lock (spinlock struct btrfs_inode::lock) the ->logged_trans value is still
zero, since the inode lock is what protects setting ->logged_trans at
btrfs_log_inode().
Fixes: 0f8ce49821de ("btrfs: avoid inode logging during rename and link when possible")
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Instead of incrementing the inode's link count and refcount early before
adding the link, updating the inode and deleting orphan item, do it after
all those steps succeeded right before calling d_instantiate(). This makes
the error handling logic simpler by avoiding the need for the 'drop_inode'
variable to signal if we need to undo the link count increment and the
inode refcount increase under the 'fail' label.
This also reduces the level of indentation by one, making the code easier
to read.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
If we fail to update the inode or delete the orphan item we leak the inode
since we update its refcount with the ihold() call to account for the
d_instantiate() call which never happens in case we fail those steps. Fix
this by setting 'drop_inode' to true in case we fail those steps.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|