summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2024-07-15Merge tag 'kcsan.2024.07.12a' of ↵Linus Torvalds1-1/+9
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull KCSAN updates from Paul McKenney: - improve the documentation for the new __data_racy type qualifier to the data_race() macro's kernel-doc header and to the LKMM's access-marking documentation - add missing MODULE_DESCRIPTION * tag 'kcsan.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: kcsan: Add missing MODULE_DESCRIPTION() macro kcsan: Add example to data_race() kerneldoc header
2024-07-15Merge tag 'rcu.2024.07.12a' of ↵Linus Torvalds3-52/+133
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Update Tasks RCU and Tasks Rude RCU description in Requirements.rst and clarify rcu_assign_pointer() and rcu_dereference() ordering properties - Add lockdep assertions for RCU readers, limit inline wakeups for callback-bypass synchronize_rcu(), add an rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter, add Uladzislau Rezki as RCU maintainer, and fix a subtle callback-migration memory-ordering issue - Remove a number of redundant memory barriers - Remove unnecessary bypass-list lock-contention mitigation, use parking API instead of open-coded ad-hoc equivalent, and upgrade obsolete comments - Revert avoidance of a deadlock that can no longer occur and properly synchronize Tasks Trace RCU checking of runqueues - Add tests for handling of double-call_rcu() bug, add missing MODULE_DESCRIPTION, and add a script that histograms the number of calls to RCU updaters - Fill out SRCU polled-grace-period API * tag 'rcu.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits) rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU invocation rcu: Eliminate lockless accesses to rcu_sync->gp_count MAINTAINERS: Add Uladzislau Rezki as RCU maintainer rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter rcu/exp: Remove redundant full memory barrier at the end of GP rcu: Remove full memory barrier on RCU stall printout rcu: Remove full memory barrier on boot time eqs sanity check rcu/exp: Remove superfluous full memory barrier upon first EQS snapshot rcu: Remove superfluous full memory barrier upon first EQS snapshot rcu: Remove full ordering on second EQS snapshot srcu: Fill out polled grace-period APIs srcu: Update cleanup_srcu_struct() comment srcu: Add NUM_ACTIVE_SRCU_POLL_OLDSTATE srcu: Disable interrupts directly in srcu_gp_end() rcu: Disable interrupts directly in rcu_gp_init() rcu/tree: Reduce wake up for synchronize_rcu() common case rcu/tasks: Fix stale task snaphot for Tasks Trace tools/rcu: Add rcu-updaters.sh script rcutorture: Add missing MODULE_DESCRIPTION() macros rcutorture: Fix rcu_torture_fwd_cb_cr() data race ...
2024-07-15Merge tag 'timers-core-2024-07-14' of ↵Linus Torvalds6-1/+39
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Updates for timers, timekeeping and related functionality: Core: - Make the takeover of a hrtimer based broadcast timer reliable during CPU hot-unplug. The current implementation suffers from a race which can lead to broadcast timer starvation in the worst case. - VDSO related cleanups and simplifications - Small cleanups and enhancements all over the place PTP: - Replace the architecture specific base clock to clocksource, e.g. ART to TSC, conversion function with generic functionality to avoid exposing such internals to drivers and convert all existing drivers over. This also allows to provide functionality which converts the other way round in the core code based on the same parameter set. - Provide a function to convert CLOCK_REALTIME to the base clock to support the upcoming PPS output driver on Intel platforms. Drivers: - A set of Device Tree bindings for new hardware - Cleanups and enhancements all over the place" * tag 'timers-core-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) clocksource/drivers/realtek: Add timer driver for rtl-otto platforms dt-bindings: timer: Add schema for realtek,otto-timer dt-bindings: timer: Add SOPHGO SG2002 clint dt-bindings: timer: renesas,tmu: Add R-Car Gen2 support dt-bindings: timer: renesas,tmu: Add RZ/G1 support dt-bindings: timer: renesas,tmu: Add R-Mobile APE6 support clocksource/drivers/mips-gic-timer: Correct sched_clock width clocksource/drivers/mips-gic-timer: Refine rating computation clocksource/drivers/sh_cmt: Address race condition for clock events clocksource/driver/arm_global_timer: Remove unnecessary ‘0’ values from err clocksource/drivers/arm_arch_timer: Remove unnecessary ‘0’ values from irq tick/broadcast: Make takeover of broadcast hrtimer reliable tick/sched: Combine WARN_ON_ONCE and print_once x86/vdso: Remove unused include x86/vgtod: Remove unused typedef gtod_long_t x86/vdso: Fix function reference in comment vdso: Add comment about reason for vdso struct ordering vdso/gettimeofday: Clarify comment about open coded function timekeeping: Add missing kernel-doc function comments tick: Remove unnused tick_nohz_get_idle_calls() ...
2024-07-15Merge tag 'smp-core-2024-07-14' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU hotplug updates from Thomas Gleixner: "A small set of SMP/CPU hotplug updates: - Reverse the order of iteration when freezing secondary CPUs for hibernation. This avoids that drivers like the Intel uncore performance counter have to transfer the assignement of handling the per package uncore events for every CPU in a package, which is a considerable speedup on larger systems. - Add a missing destroy_work_on_stack() invocation in smp_call_on_cpu() to prevent debug objects to emit a false positive warning when the stack is freed. - Small cleanups in comments and a str_plural() conversion" * tag 'smp-core-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp: Add missing destroy_work_on_stack() call in smp_call_on_cpu() cpu/hotplug: Reverse order of iteration in freeze_secondary_cpus() smp: Use str_plural() to fix Coccinelle warnings cpu/hotplug: Fix typo in comment
2024-07-15Merge tag 'for-6.11/block-20240710' of git://git.kernel.dk/linuxLinus Torvalds15-227/+363
Pull block updates from Jens Axboe: - NVMe updates via Keith: - Device initialization memory leak fixes (Keith) - More constants defined (Weiwen) - Target debugfs support (Hannes) - PCIe subsystem reset enhancements (Keith) - Queue-depth multipath policy (Redhat and PureStorage) - Implement get_unique_id (Christoph) - Authentication error fixes (Gaosheng) - MD updates via Song - sync_action fix and refactoring (Yu Kuai) - Various small fixes (Christoph Hellwig, Li Nan, and Ofir Gal, Yu Kuai, Benjamin Marzinski, Christophe JAILLET, Yang Li) - Fix loop detach/open race (Gulam) - Fix lower control limit for blk-throttle (Yu) - Add module descriptions to various drivers (Jeff) - Add support for atomic writes for block devices, and statx reporting for same. Includes SCSI and NVMe (John, Prasad, Alan) - Add IO priority information to block trace points (Dongliang) - Various zone improvements and tweaks (Damien) - mq-deadline tag reservation improvements (Bart) - Ignore direct reclaim swap writes in writeback throttling (Baokun) - Block integrity improvements and fixes (Anuj) - Add basic support for rust based block drivers. Has a dummy null_blk variant for now (Andreas) - Series converting driver settings to queue limits, and cleanups and fixes related to that (Christoph) - Cleanup for poking too deeply into the bvec internals, in preparation for DMA mapping API changes (Christoph) - Various minor tweaks and fixes (Jiapeng, John, Kanchan, Mikulas, Ming, Zhu, Damien, Christophe, Chaitanya) * tag 'for-6.11/block-20240710' of git://git.kernel.dk/linux: (206 commits) floppy: add missing MODULE_DESCRIPTION() macro loop: add missing MODULE_DESCRIPTION() macro ublk_drv: add missing MODULE_DESCRIPTION() macro xen/blkback: add missing MODULE_DESCRIPTION() macro block/rnbd: Constify struct kobj_type block: take offset into account in blk_bvec_map_sg again block: fix get_max_segment_size() warning loop: Don't bother validating blocksize virtio_blk: Don't bother validating blocksize null_blk: Don't bother validating blocksize block: Validate logical block size in blk_validate_limits() virtio_blk: Fix default logical block size fallback nvmet-auth: fix nvmet_auth hash error handling nvme: implement ->get_unique_id block: pass a phys_addr_t to get_max_segment_size block: add a bvec_phys helper blk-lib: check for kill signal in ioctl BLKZEROOUT block: limit the Write Zeroes to manually writing zeroes fallback block: refacto blkdev_issue_zeroout block: move read-only and supported checks into (__)blkdev_issue_zeroout ...
2024-07-15Merge tag 'for-6.11/io_uring-20240714' of git://git.kernel.dk/linuxLinus Torvalds3-10/+9
Pull io_uring updates from Jens Axboe: "Here are the io_uring updates queued up for 6.11. Nothing major this time around, various minor improvements and cleanups/fixes. This contains: - Add bind/listen opcodes. Main motivation is to support direct descriptors, to avoid needing a regular fd just for doing these two operations (Gabriel) - Probe fixes (Gabriel) - Treat io-wq work flags as atomics. Not fixing a real issue, but may as well and it silences a KCSAN warning (me) - Cleanup of rsrc __set_current_state() usage (me) - Add 64-bit for {m,f}advise operations (me) - Improve performance of data ring messages (me) - Fix for ring message overflow posting (Pavel) - Fix for freezer interaction with TWA_NOTIFY_SIGNAL. Not strictly an io_uring thing, but since TWA_NOTIFY_SIGNAL was originally added for faster task_work signaling for io_uring, bundling it with this pull (Pavel) - Add Pavel as a co-maintainer - Various cleanups (me, Thorsten)" * tag 'for-6.11/io_uring-20240714' of git://git.kernel.dk/linux: (28 commits) io_uring/net: check socket is valid in io_bind()/io_listen() kernel: rerun task_work while freezing in get_signal() io_uring/io-wq: limit retrying worker initialisation io_uring/napi: Remove unnecessary s64 cast io_uring/net: cleanup io_recv_finish() bundle handling io_uring/msg_ring: fix overflow posting MAINTAINERS: change Pavel Begunkov from io_uring reviewer to maintainer io_uring/msg_ring: use kmem_cache_free() to free request io_uring/msg_ring: check for dead submitter task io_uring/msg_ring: add an alloc cache for io_kiocb entries io_uring/msg_ring: improve handling of target CQE posting io_uring: add io_add_aux_cqe() helper io_uring: add remote task_work execution helper io_uring/msg_ring: tighten requirement for remote posting io_uring: Allocate only necessary memory in io_probe io_uring: Fix probe of disabled operations io_uring: Introduce IORING_OP_LISTEN io_uring: Introduce IORING_OP_BIND net: Split a __sys_listen helper for io_uring net: Split a __sys_bind helper for io_uring ...
2024-07-15Merge tag 'vfs-6.11.pidfs' of ↵Linus Torvalds4-5/+55
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pidfs updates from Christian Brauner: "This contains work to make it possible to derive namespace file descriptors from pidfd file descriptors. Right now it is already possible to use a pidfd with setns() to atomically change multiple namespaces at the same time. In other words, it is possible to switch to the namespace context of a process using a pidfd. There is no need to first open namespace file descriptors via procfs. The work included here is an extension of these abilities by allowing to open namespace file descriptors using a pidfd. This means it is now possible to interact with namespaces without ever touching procfs. To this end a new set of ioctls() on pidfds is introduced covering all supported namespace types" * tag 'vfs-6.11.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: pidfs: allow retrieval of namespace file descriptors nsfs: add open_namespace() nsproxy: add helper to go from arbitrary namespace to ns_common nsproxy: add a cleanup helper for nsproxy file: add take_fd() cleanup helper
2024-07-15Merge tag 'vfs-6.11.nsfs' of ↵Linus Torvalds1-0/+8
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull namespace-fs updates from Christian Brauner: "This adds ioctls allowing to translate PIDs between PID namespaces. The motivating use-case comes from LXCFS which is a tiny fuse filesystem used to virtualize various aspects of procfs. LXCFS is run on the host. The files and directories it creates can be bind-mounted by e.g. a container at startup and mounted over the various procfs files the container wishes to have virtualized. When e.g. a read request for uptime is received, LXCFS will receive the pid of the reader. In order to virtualize the corresponding read, LXCFS needs to know the pid of the init process of the reader's pid namespace. In order to do this, LXCFS first needs to fork() two helper processes. The first helper process setns() to the readers pid namespace. The second helper process is needed to create a process that is a proper member of the pid namespace. The second helper process then creates a ucred message with ucred.pid set to 1 and sends it back to LXCFS. The kernel will translate the ucred.pid field to the corresponding pid number in LXCFS's pid namespace. This way LXCFS can learn the init pid number of the reader's pid namespace and can go on to virtualize. Since these two forks() are costly LXCFS maintains an init pid cache that caches a given pid for a fixed amount of time. The cache is pruned during new read requests. However, even with the cache the hit of the two forks() is singificant when a very large number of containers are running. So this adds a simple set of ioctls that let's a caller translate PIDs from and into a given PID namespace. This significantly improves performance with a very simple change. To protect against races pidfds can be used to check whether the process is still valid" * tag 'vfs-6.11.nsfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: nsfs: add pid translation ioctls
2024-07-15Merge tag 'vfs-6.11.mount' of ↵Linus Torvalds3-2/+19
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount query updates from Christian Brauner: "This contains work to extend the abilities of listmount() and statmount() and various fixes and cleanups. Features: - Allow iterating through mounts via listmount() from newest to oldest. This makes it possible for mount(8) to keep iterating the mount table in reverse order so it gets newest mounts first. - Relax permissions on listmount() and statmount(). It's not necessary to have capabilities in the initial namespace: it is sufficient to have capabilities in the owning namespace of the mount namespace we're located in to list unreachable mounts in that namespace. - Extend both listmount() and statmount() to list and stat mounts in foreign mount namespaces. Currently the only way to iterate over mount entries in mount namespaces that aren't in the caller's mount namespace is by crawling through /proc in order to find /proc/<pid>/mountinfo for the relevant mount namespace. This is both very clumsy and hugely inefficient. So extend struct mnt_id_req with a new member that allows to specify the mount namespace id of the mount namespace we want to look at. Luckily internally we already have most of the infrastructure for this so we just need to expose it to userspace. Give userspace a way to retrieve the id of a mount namespace via statmount() and through a new nsfs ioctl() on mount namespace file descriptor. This comes with appropriate selftests. - Expose mount options through statmount(). Currently if userspace wants to get mount options for a mount and with statmount(), they still have to open /proc/<pid>/mountinfo to parse mount options. Simply the information through statmount() directly. Afterwards it's possible to only rely on statmount() and listmount() to retrieve all and more information than /proc/<pid>/mountinfo provides. This comes with appropriate selftests. Fixes: - Avoid copying to userspace under the namespace semaphore in listmount. Cleanups: - Simplify the error handling in listmount by relying on our newly added cleanup infrastructure. - Refuse invalid mount ids early for both listmount and statmount" * tag 'vfs-6.11.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: reject invalid last mount id early fs: refuse mnt id requests with invalid ids early fs: find rootfs mount of the mount namespace fs: only copy to userspace on success in listmount() sefltests: extend the statmount test for mount options fs: use guard for namespace_sem in statmount() fs: export mount options via statmount() fs: rename show_mnt_opts -> show_vfsmnt_opts selftests: add a test for the foreign mnt ns extensions fs: add an ioctl to get the mnt ns id from nsfs fs: Allow statmount() in foreign mount namespace fs: Allow listmount() in foreign mount namespace fs: export the mount ns id via statmount fs: keep an index of current mount namespaces fs: relax permissions for statmount() listmount: allow listing in reverse order fs: relax permissions for listmount() fs: simplify error handling fs: don't copy to userspace under namespace semaphore path: add cleanup helper
2024-07-15Merge tag 'vfs-6.11.inode' of ↵Linus Torvalds2-3/+13
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs inode / dentry updates from Christian Brauner: "This contains smaller performance improvements to inodes and dentries: inode: - Add rcu based inode lookup variants. They avoid one inode hash lock acquire in the common case thereby significantly reducing contention. We already support RCU-based operations but didn't take advantage of them during inode insertion. Callers of iget_locked() get the improvement without any code changes. Callers that need a custom callback can switch to iget5_locked_rcu() as e.g., did btrfs. With 20 threads each walking a dedicated 1000 dirs * 1000 files directory tree to stat(2) on a 32 core + 24GB ram vm: before: 3.54s user 892.30s system 1966% cpu 45.549 total after: 3.28s user 738.66s system 1955% cpu 37.932 total (-16.7%) Long-term we should pick up the effort to introduce more fine-grained locking and possibly improve on the currently used hash implementation. - Start zeroing i_state in inode_init_always() instead of doing it in individual filesystems. This allows us to remove an unneeded lock acquire in new_inode() and not burden individual filesystems with this. dcache: - Move d_lockref out of the area used by RCU lookup to avoid cacheline ping poing because the embedded name is sharing a cacheline with d_lockref. - Fix dentry size on 32bit with CONFIG_SMP=y so it does actually end up with 128 bytes in total" * tag 'vfs-6.11.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: fix dentry size vfs: move d_lockref out of the area used by RCU lookup bcachefs: remove now spurious i_state initialization xfs: remove now spurious i_state initialization in xfs_inode_alloc vfs: partially sanitize i_state zeroing on inode creation xfs: preserve i_state around inode_init_always in xfs_reinit_inode btrfs: use iget5_locked_rcu vfs: add rcu-based find_inode variants for iget ops
2024-07-15Merge tag 'vfs-6.11.mount.api' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount API updates from Christian Brauner: - Add a generic helper to parse uid and gid mount options. Currently we open-code the same logic in various filesystems which is error prone, especially since the verification of uid and gid mount options is a sensitive operation in the face of idmappings. Add a generic helper and convert all filesystems over to it. Make sure that filesystems that are mountable in unprivileged containers verify that the specified uid and gid can be represented in the owning namespace of the filesystem. - Convert hostfs to the new mount api. * tag 'vfs-6.11.mount.api' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fuse: Convert to new uid/gid option parsing helpers fuse: verify {g,u}id mount options correctly fat: Convert to new uid/gid option parsing helpers fat: Convert to new mount api fat: move debug into fat_mount_options vboxsf: Convert to new uid/gid option parsing helpers tracefs: Convert to new uid/gid option parsing helpers smb: client: Convert to new uid/gid option parsing helpers tmpfs: Convert to new uid/gid option parsing helpers ntfs3: Convert to new uid/gid option parsing helpers isofs: Convert to new uid/gid option parsing helpers hugetlbfs: Convert to new uid/gid option parsing helpers ext4: Convert to new uid/gid option parsing helpers exfat: Convert to new uid/gid option parsing helpers efivarfs: Convert to new uid/gid option parsing helpers debugfs: Convert to new uid/gid option parsing helpers autofs: Convert to new uid/gid option parsing helpers fs_parse: add uid & gid option option parsing helpers hostfs: Add const qualifier to host_root in hostfs_fill_super() hostfs: convert hostfs to use the new mount API
2024-07-15Merge tag 'vfs-6.11.casefold' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs casefolding updates from Christian Brauner: "This contains some work to simplify the handling of casefolded names: - Simplify the handling of casefolded names in f2fs and ext4 by keeping the names as a qstr to avoiding unnecessary conversions - Introduce a new generic_ci_match() libfs case-insensitive lookup helper and use it in both f2fs and ext4 allowing to remove the filesystem specific implementations - Remove a bunch of ifdefs by making the unicode build checks part of the code flow" * tag 'vfs-6.11.casefold' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: f2fs: Move CONFIG_UNICODE defguards into the code flow ext4: Move CONFIG_UNICODE defguards into the code flow f2fs: Reuse generic_ci_match for ci comparisons ext4: Reuse generic_ci_match for ci comparisons libfs: Introduce case-insensitive string comparison helper f2fs: Simplify the handling of cached casefolded names ext4: Simplify the handling of cached casefolded names
2024-07-15Merge tag 'vfs-6.11.misc' of ↵Linus Torvalds4-39/+58
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Features: - Support passing NULL along AT_EMPTY_PATH for statx(). NULL paths with any flag value other than AT_EMPTY_PATH go the usual route and end up with -EFAULT to retain compatibility (Rust is abusing calls of the sort to detect availability of statx) This avoids path lookup code, lockref management, memory allocation and in case of NULL path userspace memory access (which can be quite expensive with SMAP on x86_64) - Don't block i_writecount during exec. Remove the deny_write_access() mechanism for executables - Relax open_by_handle_at() permissions in specific cases where we can prove that the caller had sufficient privileges to open a file - Switch timespec64 fields in struct inode to discrete integers freeing up 4 bytes Fixes: - Fix false positive circular locking warning in hfsplus - Initialize hfs_inode_info after hfs_alloc_inode() in hfs - Avoid accidental overflows in vfs_fallocate() - Don't interrupt fallocate with EINTR in tmpfs to avoid constantly restarting shmem_fallocate() - Add missing quote in comment in fs/readdir Cleanups: - Don't assign and test in an if statement in mqueue. Move the assignment out of the if statement - Reflow the logic in may_create_in_sticky() - Remove the usage of the deprecated ida_simple_xx() API from procfs - Reject FSCONFIG_CMD_CREATE_EXCL requets that depend on the new mount api early - Rename variables in copy_tree() to make it easier to understand - Replace WARN(down_read_trylock, ...) abuse with proper asserts in various places in the VFS - Get rid of user_path_at_empty() and drop the empty argument from getname_flags() - Check for error while copying and no path in one branch in getname_flags() - Avoid redundant smp_mb() for THP handling in do_dentry_open() - Rename parent_ino to d_parent_ino and make it use RCU - Remove unused header include in fs/readdir - Export in_group_capable() helper and switch f2fs and fuse over to it instead of open-coding the logic in both places" * tag 'vfs-6.11.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (27 commits) ipc: mqueue: remove assignment from IS_ERR argument vfs: rename parent_ino to d_parent_ino and make it use RCU vfs: support statx(..., NULL, AT_EMPTY_PATH, ...) stat: use vfs_empty_path() helper fs: new helper vfs_empty_path() fs: reflow may_create_in_sticky() vfs: remove redundant smp_mb for thp handling in do_dentry_open fuse: Use in_group_or_capable() helper f2fs: Use in_group_or_capable() helper fs: Export in_group_or_capable() vfs: reorder checks in may_create_in_sticky hfs: fix to initialize fields of hfs_inode_info after hfs_alloc_inode() proc: Remove usage of the deprecated ida_simple_xx() API hfsplus: fix to avoid false alarm of circular locking Improve readability of copy_tree vfs: shave a branch in getname_flags vfs: retire user_path_at_empty and drop empty arg from getname_flags vfs: stop using user_path_at_empty in do_readlinkat tmpfs: don't interrupt fallocate with EINTR fs: don't block i_writecount during exec ...
2024-07-15Merge tag 'drm-fixes-2024-07-12' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds1-1/+7
Pull drm fixes from Dave Airlie: "Oh I screwed up last week's fixes pull, and forgot to send.. Back to work, thanks to Sima for last week, not too many fixes as expected getting close to release [ sic - Linus ], amdgpu and xe have a couple each, and then some other misc ones. amdgpu: - PSR-SU fix - Reseved VMID fix xe: - Use write-back caching mode for system memory on DGFX - Do not leak object when finalizing hdcp gsc bridge: - adv7511 EDID irq fix gma500: - NULL mode fixes. meson: - fix resource leak" * tag 'drm-fixes-2024-07-12' of https://gitlab.freedesktop.org/drm/kernel: Revert "drm/amd/display: Reset freesync config before update new state" drm/xe/display/xe_hdcp_gsc: Free arbiter on driver removal drm/xe: Use write-back caching mode for system memory on DGFX drm/amdgpu: reject gang submit on reserved VMIDs drm/gma500: fix null pointer dereference in cdv_intel_lvds_get_modes drm/gma500: fix null pointer dereference in psb_intel_lvds_get_modes drm/meson: fix canvas release in bind function drm/bridge: adv7511: Fix Intermittent EDID failures
2024-07-15Merge branch 'runtime-constants'Linus Torvalds3-0/+24
Merge runtime constants infrastructure with implementations for x86 and arm64. This is one of four branches that came out of me looking at profiles of my kernel build filesystem load on my 128-core Altra arm64 system, where pathname walking and the user copies (particularly strncpy_from_user() for fetching the pathname from user space) is very hot. This is a very specialized "instruction alternatives" model where the dentry hash pointer and hash count will be constants for the lifetime of the kernel, but the allocation are not static but done early during the kernel boot. In order to avoid the pointer load and dynamic shift, we just rewrite the constants in the instructions in place. We can't use the "generic" alternative instructions infrastructure, because different architectures do it very differently, and it's actually simpler to just have very specific helpers, with a fallback to the generic ("old") model of just using variables for architectures that do not implement the runtime constant patching infrastructure. Link: https://lore.kernel.org/all/CAHk-=widPe38fUNjUOmX11ByDckaeEo9tN4Eiyke9u1SAtu9sA@mail.gmail.com/ * runtime-constants: arm64: add 'runtime constant' support runtime constants: add x86 architecture support runtime constants: add default dummy infrastructure vfs: dcache: move hashlen_hash() from callers into d_hash()
2024-07-13Merge tag 'timers-v6.11-rc1' of ↵Thomas Gleixner76-376/+479
https://git.linaro.org/people/daniel.lezcano/linux into timers/core Pull clocksource/event driver updates from Daniel Lezcano: - Remove unnecessary local variables initialization as they will be initialized in the code path anyway right after on the ARM arch timer and the ARM global timer (Li kunyu) - Fix a race condition in the interrupt leading to a deadlock on the SH CMT driver. Note that this fix was not tested on the platform using this timer but the fix seems reasonable enough to be picked confidently (Niklas Söderlund) - Increase the rating of the gic-timer and use the configured width clocksource register on the MIPS architecture (Jiaxun Yang) - Add the DT bindings for the TMU on the Renesas platforms (Geert Uytterhoeven) - Add the DT bindings for the SOPHGO SG2002 clint on RiscV (Thomas Bonnefille) - Add the rtl-otto timer driver along with the DT bindings for the Realtek platform (Chris Packham) Link: https://lore.kernel.org/all/91cd05de-4c5d-4242-a381-3b8a4fe6a2a2@linaro.org
2024-07-12Merge tag 'for-6.10-rc7-tag' of ↵Linus Torvalds1-8/+10
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Fix a regression in extent map shrinker behaviour. In the past weeks we got reports from users that there are huge latency spikes or freezes. This was bisected to newly added shrinker of extent maps (it was added to fix a build up of the structures in memory). I'm assuming that the freezes would happen to many users after release so I'd like to get it merged now so it's in 6.10. Although the diff size is not small the changes are relatively straightforward, the reporters verified the fixes and we did testing on our side. The fixes: - adjust behaviour under memory pressure and check lock or scheduling conditions, bail out if needed - synchronize tracking of the scanning progress so inode ranges are not skipped or work duplicated - do a delayed iput when scanning a root so evicting an inode does not slow things down in case of lots of dirty data, also fix lockdep warning, a deadlock could happen when writing the dirty data would need to start a transaction" * tag 'for-6.10-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: avoid races when tracking progress for extent map shrinking btrfs: stop extent map shrinker if reschedule is needed btrfs: use delayed iput during extent map shrinking
2024-07-12Merge tag 'char-misc-6.10-final' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc driver fixes from Greg KH: "Here are some small remaining driver fixes for 6.10-final that have all been in linux-next for a while and resolve reported issues. Included in here are: - mei driver fixes (and a spelling fix at the end just to be clean) - iio driver fixes for reported problems - fastrpc bugfixes - nvmem small fixes" * tag 'char-misc-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: mei: vsc: Fix spelling error mei: vsc: Enhance SPI transfer of IVSC ROM mei: vsc: Utilize the appropriate byte order swap function mei: vsc: Prevent timeout error with added delay post-firmware download mei: vsc: Enhance IVSC chipset stability during warm reboot nvmem: core: limit cell sysfs permissions to main attribute ones nvmem: core: only change name to fram for current attribute nvmem: meson-efuse: Fix return value of nvmem callbacks nvmem: rmem: Fix return value of rmem_read() misc: microchip: pci1xxxx: Fix return value of nvmem callbacks hpet: Support 32-bit userspace misc: fastrpc: Restrict untrusted app to attach to privileged PD misc: fastrpc: Fix ownership reassignment of remote heap misc: fastrpc: Fix memory leak in audio daemon attach operation misc: fastrpc: Avoid updating PD type for capability request misc: fastrpc: Copy the complete capability structure to user misc: fastrpc: Fix DSP capabilities request iio: light: apds9306: Fix error handing iio: trigger: Fix condition for own trigger
2024-07-12clocksource/drivers/realtek: Add timer driver for rtl-otto platformsChris Packham1-0/+1
The timer/counter block on the Realtek SoCs provides up to 5 timers. It also includes a watchdog timer which is handled by the realtek_otto_wdt.c driver. One timer will be used per CPU as a local clock event generator. An additional timer will be used as an overal stable clocksource. Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de> Signed-off-by: Sander Vanheule <sander@svanheule.net> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Link: https://lore.kernel.org/r/20240710043524.1535151-8-chris.packham@alliedtelesis.co.nz Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-11Merge tag 'spi-fix-v6.10-rc7' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "This fixes two regressions that have been bubbling along for a large part of this release. One is a revert of the multi mode support for the OMAP SPI controller, this introduced regressions on a number of systems and while there has been progress on fixing those we've not got something that works for everyone yet so let's just drop the change for now. The other is a series of fixes from David Lechner for his recent message optimisation work, this interacted badly with spi-mux which is altogether too clever with recursive use of the bus and creates situations that hadn't been considered. There are also a couple of small driver specific fixes, including one more patch from David for sleep duration calculations in the AXI driver" * tag 'spi-fix-v6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: mux: set ctlr->bits_per_word_mask spi: add defer_optimize_message controller flag spi: don't unoptimize message in spi_async() spi: omap2-mcspi: Revert multi mode support spi: davinci: Unset POWERDOWN bit when releasing resources spi: axi-spi-engine: fix sleep calculation spi: imx: Don't expect DMA for i.MX{25,35,50,51,53} cspi devices
2024-07-11Merge tag 'net-6.10-rc8' of ↵Linus Torvalds1-4/+9
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - core: fix rc7's __skb_datagram_iter() regression Current release - new code bugs: - eth: bnxt: fix crashes when reducing ring count with active RSS contexts Previous releases - regressions: - sched: fix UAF when resolving a clash - skmsg: skip zero length skb in sk_msg_recvmsg2 - sunrpc: fix kernel free on connection failure in xs_tcp_setup_socket - tcp: avoid too many retransmit packets - tcp: fix incorrect undo caused by DSACK of TLP retransmit - udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port(). - eth: ks8851: fix deadlock with the SPI chip variant - eth: i40e: fix XDP program unloading while removing the driver Previous releases - always broken: - bpf: - fix too early release of tcx_entry - fail bpf_timer_cancel when callback is being cancelled - bpf: fix order of args in call to bpf_map_kvcalloc - netfilter: nf_tables: prefer nft_chain_validate - ppp: reject claimed-as-LCP but actually malformed packets - wireguard: avoid unaligned 64-bit memory accesses" * tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits) net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket net/sched: Fix UAF when resolving a clash net: ks8851: Fix potential TX stall after interface reopen udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port(). netfilter: nf_tables: prefer nft_chain_validate netfilter: nfnetlink_queue: drop bogus WARN_ON ethtool: netlink: do not return SQI value if link is down ppp: reject claimed-as-LCP but actually malformed packets selftests/bpf: Add timer lockup selftest net: ethernet: mtk-star-emac: set mac_managed_pm when probing e1000e: fix force smbus during suspend flow tcp: avoid too many retransmit packets bpf: Defer work in bpf_timer_cancel_and_free bpf: Fail bpf_timer_cancel when callback is being cancelled bpf: fix order of args in call to bpf_map_kvcalloc net: ethernet: lantiq_etop: fix double free in detach i40e: Fix XDP program unloading while removing the driver net: fix rc7's __skb_datagram_iter() net: ks8851: Fix deadlock with the SPI chip variant octeontx2-af: Fix incorrect value output on error path in rvu_check_rsrc_availability() ...
2024-07-11Merge tag 'vfs-6.10-rc8.fixes' of ↵Linus Torvalds2-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "cachefiles: - Export an existing and add a new cachefile helper to be used in filesystems to fix reference count bugs - Use the newly added fscache_ty_get_volume() helper to get a reference count on an fscache_volume to handle volumes that are about to be removed cleanly - After withdrawing a fscache_cache via FSCACHE_CACHE_IS_WITHDRAWN wait for all ongoing cookie lookups to complete and for the object count to reach zero - Propagate errors from vfs_getxattr() to avoid an infinite loop in cachefiles_check_volume_xattr() because it keeps seeing ESTALE - Don't send new requests when an object is dropped by raising CACHEFILES_ONDEMAND_OJBSTATE_DROPPING - Cancel all requests for an object that is about to be dropped - Wait for the ondemand_boject_worker to finish before dropping a cachefiles object to prevent use-after-free - Use cyclic allocation for message ids to better handle id recycling - Add missing lock protection when iterating through the xarray when polling netfs: - Use standard logging helpers for debug logging VFS: - Fix potential use-after-free in file locks during trace_posix_lock_inode(). The tracepoint could fire while another task raced it and freed the lock that was requested to be traced - Only increment the nr_dentry_negative counter for dentries that are present on the superblock LRU. Currently, DCACHE_LRU_LIST list is used to detect this case. However, the flag is also raised in combination with DCACHE_SHRINK_LIST to indicate that dentry->d_lru is used. So checking only DCACHE_LRU_LIST will lead to wrong nr_dentry_negative count. Fix the check to not count dentries that are on a shrink related list Misc: - hfsplus: fix an uninitialized value issue in copy_name - minix: fix minixfs_rename with HIGHMEM. It still uses kunmap() even though we switched it to kmap_local_page() a while ago" * tag 'vfs-6.10-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: minixfs: Fix minixfs_rename with HIGHMEM hfsplus: fix uninit-value in copy_name vfs: don't mod negative dentry count when on shrinker list filelock: fix potential use-after-free in posix_lock_inode cachefiles: add missing lock protection when polling cachefiles: cyclic allocation of msg_id to avoid reuse cachefiles: wait for ondemand_object_worker to finish when dropping object cachefiles: cancel all requests for the object that is being dropped cachefiles: stop sending new request when dropping object cachefiles: propagate errors from vfs_getxattr() to avoid infinite loop cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie() cachefiles: fix slab-use-after-free in fscache_withdraw_volume() netfs, fscache: export fscache_put_volume() and add fscache_try_get_volume() netfs: Switch debug logging to pr_debug()
2024-07-11drm/xe: Use write-back caching mode for system memory on DGFXThomas Hellström1-1/+7
The caching mode for buffer objects with VRAM as a possible placement was forced to write-combined, regardless of placement. However, write-combined system memory is expensive to allocate and even though it is pooled, the pool is expensive to shrink, since it involves global CPU TLB flushes. Moreover write-combined system memory from TTM is only reliably available on x86 and DGFX doesn't have an x86 restriction. So regardless of the cpu caching mode selected for a bo, internally use write-back caching mode for system memory on DGFX. Coherency is maintained, but user-space clients may perceive a difference in cpu access speeds. v2: - Update RB- and Ack tags. - Rephrase wording in xe_drm.h (Matt Roper) v3: - Really rephrase wording. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode") Cc: Pallavi Mishra <pallavi.mishra@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: dri-devel@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Effie Yu <effie.yu@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Jose Souza <jose.souza@intel.com> Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Acked-by: Matthew Auld <matthew.auld@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode") Acked-by: Michal Mrozek <michal.mrozek@intel.com> Acked-by: Effie Yu <effie.yu@intel.com> #On chat Link: https://patchwork.freedesktop.org/patch/msgid/20240705132828.27714-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 01e0cfc994be484ddcb9e121e353e51d8bb837c0) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-11btrfs: avoid races when tracking progress for extent map shrinkingFilipe Manana1-8/+10
We store the progress (root and inode numbers) of the extent map shrinker in fs_info without any synchronization but we can have multiple tasks calling into the shrinker during memory allocations when there's enough memory pressure for example. This can result in a task A reading fs_info->extent_map_shrinker_last_ino after another task B updates it, and task A reading fs_info->extent_map_shrinker_last_root before task B updates it, making task A see an odd state that isn't necessarily harmful but may make it skip certain inode ranges or do more work than necessary by going over the same inodes again. These unprotected accesses would also trigger warnings from tools like KCSAN. So add a lock to protect access to these progress fields. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-10Merge tag 'mm-hotfixes-stable-2024-07-10-13-19' of ↵Linus Torvalds5-54/+24
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "21 hotfixes, 15 of which are cc:stable. No identifiable theme here - all are singleton patches, 19 are for MM" * tag 'mm-hotfixes-stable-2024-07-10-13-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits) mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio mm/hugetlb: fix potential race in __update_and_free_hugetlb_folio() filemap: replace pte_offset_map() with pte_offset_map_nolock() arch/xtensa: always_inline get_current() and current_thread_info() sched.h: always_inline alloc_tag_{save|restore} to fix modpost warnings MAINTAINERS: mailmap: update Lorenzo Stoakes's email address mm: fix crashes from deferred split racing folio migration lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat mm: gup: stop abusing try_grab_folio nilfs2: fix kernel bug on rename operation of broken directory mm/hugetlb_vmemmap: fix race with speculative PFN walkers cachestat: do not flush stats in recency check mm/shmem: disable PMD-sized page cache if needed mm/filemap: skip to create PMD-sized page cache if needed mm/readahead: limit page cache size in page_cache_ra_order() mm/filemap: make MAX_PAGECACHE_ORDER acceptable to xarray mm/damon/core: merge regions aggressively when max_nr_regions is unmet Fix userfaultfd_api to return EINVAL as expected mm: vmalloc: check if a hash-index is in cpu_possible_mask mm: prevent derefencing NULL ptr in pfn_section_valid() ...
2024-07-10Merge tag 'bcachefs-2024-07-10' of https://evilpiepirate.org/git/bcachefsLinus Torvalds1-0/+7
Pull bcachefs fixes from Kent Overstreet: - Switch some asserts to WARN() - Fix a few "transaction not locked" asserts in the data read retry paths and backpointers gc - Fix a race that would cause the journal to get stuck on a flush commit - Add missing fsck checks for the fragmentation LRU - The usual assorted ssorted syzbot fixes * tag 'bcachefs-2024-07-10' of https://evilpiepirate.org/git/bcachefs: (22 commits) bcachefs: Add missing bch2_trans_begin() bcachefs: Fix missing error check in journal_entry_btree_keys_validate() bcachefs: Warn on attempting a move with no replicas bcachefs: bch2_data_update_to_text() bcachefs: Log mount failure error code bcachefs: Fix undefined behaviour in eytzinger1_first() bcachefs: Mark bch_inode_info as SLAB_ACCOUNT bcachefs: Fix bch2_inode_insert() race path for tmpfiles closures: fix closure_sync + closure debugging bcachefs: Fix journal getting stuck on a flush commit bcachefs: io clock: run timer fns under clock lock bcachefs: Repair fragmentation_lru in alloc_write_key() bcachefs: add check for missing fragmentation in check_alloc_to_lru_ref() bcachefs: bch2_btree_write_buffer_maybe_flush() bcachefs: Add missing printbuf_tabstops_reset() calls bcachefs: Fix loop restart in bch2_btree_transactions_read() bcachefs: Fix bch2_read_retry_nodecode() bcachefs: Don't use the new_fs() bucket alloc path on an initialized fs bcachefs: Fix shift greater than integer size bcachefs: Change bch2_fs_journal_stop() BUG_ON() to warning ...
2024-07-10closures: fix closure_sync + closure debuggingKent Overstreet1-0/+7
originally, stack closures were only used synchronously, and with the original implementation of closure_sync() the ref never hit 0; thus, closure_put_after_sub() assumes that if the ref hits 0 it's on the debug list, in debug mode. that's no longer true with the current implementation of closure_sync, so we need a new magic so closure_debug_destroy() doesn't pop an assert. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-09sched.h: always_inline alloc_tag_{save|restore} to fix modpost warningsSuren Baghdasaryan1-2/+2
Mark alloc_tag_{save|restore} as always_inline to fix the following modpost warnings: WARNING: modpost: vmlinux: section mismatch in reference: alloc_tag_save+0x1c (section: .text.unlikely) -> initcall_level_names (section: .init.data) WARNING: modpost: vmlinux: section mismatch in reference: alloc_tag_restore+0x3c (section: .text.unlikely) -> initcall_level_names (section: .init.data) The warnings happen when these functions are called from an __init function and they don't get inlined (remain in the .text section) while the value returned by get_current() points into .init.data section. Assuming get_current() always returns a valid address, this situation can happen only during init stage and accessing .init.data from .text section during that stage should pose no issues. Link: https://lkml.kernel.org/r/20240704132506.1011978-1-surenb@google.com Fixes: 22d407b164ff ("lib: add allocation tagging support for memory allocation profiling") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202407032306.gi9nZsBi-lkp@intel.com/ Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Chris Zankel <chris@zankel.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-09spi: add defer_optimize_message controller flag</