| author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-12-05 13:52:43 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-12-05 13:52:43 -0800 |
| commit | 7203ca412fc8e8a0588e9adc0f777d3163f8dff3 (patch) | |
| tree | 7cbdcdb0bc0533f0133d472f95629099c123c3f9 /mm | |
| parent | ac20755937e037e586b1ca18a6717d31b1cbce93 (diff) | |
| parent | faf3c923523e5c8fc3baaa413d62e913774ae52f (diff) | |
| download | linux-7203ca412fc8e8a0588e9adc0f777d3163f8dff3.tar.gz linux-7203ca412fc8e8a0588e9adc0f777d3163f8dff3.tar.bz2 linux-7203ca412fc8e8a0588e9adc0f777d3163f8dff3.zip | |
Merge tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki)
Rework the vmalloc() code to support non-blocking allocations
(GFP_ATOMIC, GFP_NOWAIT)
"ksm: fix exec/fork inheritance" (xu xin)
Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not
inherited across fork/exec
"mm/zswap: misc cleanup of code and documentations" (SeongJae Park)
Some light maintenance work on the zswap code
"mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira)
Enhance the /sys/kernel/debug/page_owner debug feature by adding
unique identifiers to differentiate the various stack traces so
that userspace monitoring tools can better match stack traces over
time
"mm/page_alloc: pcp->batch cleanups" (Joshua Hahn)
Minor alterations to the page allocator's per-cpu-pages feature
"Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra)
Address a scalability issue in userfaultfd's UFFDIO_MOVE operation
"kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov)
"drivers/base/node: fold node register and unregister functions" (Donet Tom)
Clean up the NUMA node handling code a little
"mm: some optimizations for prot numa" (Kefeng Wang)
Cleanups and small optimizations to the NUMA allocation hinting
code
"mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn)
Address long lock hold times at boot on large machines. These were
causing (harmless) softlockup warnings
"optimize the logic for handling dirty file folios during reclaim" (Baolin Wang)
Remove some now-unnecessary work from page reclaim
"mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park)
Enhance the DAMOS auto-tuning feature
"mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan)
Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace
configurations
"expand mmap_prepare functionality, port more users" (Lorenzo Stoakes)
Enhance the new(ish) file_operations.mmap_prepare() method and port
additional callsites from the old ->mmap() over to ->mmap_prepare()
"Fix stale IOTLB entries for kernel address space" (Lu Baolu)
Fix a bug (and possible security issue on non-x86) in the IOMMU
code. In some situations the IOMMU could be left hanging onto a
stale kernel pagetable entry
"mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang)
Clean up and optimize the folio splitting code
"mm, swap: misc cleanup and bugfix" (Kairui Song)
Some cleanups and a minor fix in the swap discard code
"mm/damon: misc documentation fixups" (SeongJae Park)
"mm/damon: support pin-point targets removal" (SeongJae Park)
Permit userspace to remove a specific monitoring target in the
middle of the current targets list
"mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo)
A couple of cleanups related to mm header file inclusion
"mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He)
Improve the selection of swap devices for NUMA machines
"mm: Convert memory block states (MEM_*) macros to enums" (Israel Batista)
Change the memory block labels from macros to enums so they will
appear in kernel debug info
"ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes)
Address an inefficiency when KSM unmerges an address range
"mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park)
Fix leaks and unhandled malloc() failures in DAMON userspace unit
tests
"some cleanups for pageout()" (Baolin Wang)
Clean up a couple of minor things in the page scanner's
writeback-for-eviction code
"mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu)
Move hugetlb's sysfs/sysctl handling code into a new file
"introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes)
Make the VMA guard regions available in /proc/pid/smaps and
improve the mergeability of guarded VMAs
"mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes)
Reduce mmap lock contention for callers performing VMA guard region
operations
"vma_start_write_killable" (Matthew Wilcox)
Start work on permitting applications to be killed when they are
waiting on a read_lock on the VMA lock
"mm/damon/tests: add more tests for online parameters commit" (SeongJae Park)
Add additional userspace testing of DAMON's "commit" feature
"mm/damon: misc cleanups" (SeongJae Park)
"make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes)
Address the possible loss of a VMA's VM_SOFTDIRTY flag when that
VMA is merged with another
"mm: support device-private THP" (Balbir Singh)
Introduce support for Transparent Huge Page (THP) migration in zone
device-private memory
"Optimize folio split in memory failure" (Zi Yan)
"mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang)
Some more cleanups in the folio splitting code
"mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes)
Clean up our handling of pagetable leaf entries by introducing the
concept of 'software leaf entries', of type softleaf_t
"reparent the THP split queue" (Muchun Song)
Reparent the THP split queue to its parent memcg. This is in
preparation for addressing the long-standing "dying memcg" problem,
wherein dead memcgs linger for too long, consuming memory
resources
"unify PMD scan results and remove redundant cleanup" (Wei Yang)
A little cleanup in the hugepage collapse code
"zram: introduce writeback bio batching" (Sergey Senozhatsky)
Improve zram writeback efficiency by introducing batched bio
writeback support
"memcg: cleanup the memcg stats interfaces" (Shakeel Butt)
Clean up our handling of the interrupt safety of some memcg stats
"make vmalloc gfp flags usage more apparent" (Vishal Moola)
Clean up vmalloc's handling of incoming GFP flags
"mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang)
Teach soft dirty and userfaultfd write protect tracking to use
RISC-V's Svrsw60t59b extension
"mm: swap: small fixes and comment cleanups" (Youngjun Park)
Fix a small bug and clean up some of the swap code
"initial work on making VMA flags a bitmap" (Lorenzo Stoakes)
Start work on converting the vma struct's flags to a bitmap, so we
stop running out of them, especially on 32-bit
"mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park)
Address a possible bug in the swap discard code and clean things
up a little
[ This merge also reverts commit ebb9aeb980e5 ("vfio/nvgrace-gpu:
register device memory for poison handling") because it looks
broken to me, I've asked for clarification - Linus ]
* tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
mm: fix vma_start_write_killable() signal handling
mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate
mm/swapfile: fix list iteration when next node is removed during discard
fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling
mm/kfence: add reboot notifier to disable KFENCE on shutdown
memcg: remove inc/dec_lruvec_kmem_state helpers
selftests/mm/uffd: initialize char variable to Null
mm: fix DEBUG_RODATA_TEST indentation in Kconfig
mm: introduce VMA flags bitmap type
tools/testing/vma: eliminate dependency on vma->__vm_flags
mm: simplify and rename mm flags function for clarity
mm: declare VMA flags by bit
zram: fix a spelling mistake
mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity
mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
pagemap: update BUDDY flag documentation
mm: swap: remove scan_swap_map_slots() references from comments
mm: swap: change swap_alloc_slow() to void
mm, swap: remove redundant comment for read_swap_cache_async
mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational
...
Diffstat (limited to 'mm')
87 files changed, 5852 insertions, 3230 deletions
diff --git a/mm/Kconfig b/mm/Kconfig
index ca3f146bc705..bd0ea5454af8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -695,15 +695,6 @@ config PCP_BATCH_SCALE_MAX
 config PHYS_ADDR_T_64BIT
 	def_bool 64BIT
 
-config BOUNCE
-	bool "Enable bounce buffers"
-	default y
-	depends on BLOCK && MMU && HIGHMEM
-	help
-	  Enable bounce buffers for devices that cannot access the full range of
-	  memory available to the CPU. Enabled by default when HIGHMEM is
-	  selected, but you may say n to override this.
-
 config MMU_NOTIFIER
 	bool
 	select INTERVAL_TREE
@@ -749,7 +740,7 @@ config MEMORY_FAILURE
 	depends on MMU
 	depends on ARCH_SUPPORTS_MEMORY_FAILURE
 	bool "Enable recovery from hardware memory errors"
-	select RAS
+	select INTERVAL_TREE
 	help
 	  Enables code to recover from some memory failures on systems
 	  with MCA recovery. This allows a system to continue running
@@ -862,6 +853,97 @@ choice
 	  enabled at runtime via sysfs.
 endchoice
 
+choice
+	prompt "Shmem hugepage allocation defaults"
+	depends on TRANSPARENT_HUGEPAGE
+	default TRANSPARENT_HUGEPAGE_SHMEM_HUGE_NEVER
+	help
+	  Selects the hugepage allocation policy defaults for
+	  the internal shmem mount.
+
+	  The selection made here can be overridden by using the kernel
+	  command line 'transparent_hugepage_shmem=' option.
+
+	config TRANSPARENT_HUGEPAGE_SHMEM_HUGE_NEVER
+		bool "never"
+	help
+	  Disable hugepage allocation for shmem mount by default. It can
+	  still be enabled with the kernel command line
+	  'transparent_hugepage_shmem=' option or at runtime via sysfs
+	  knob. Note that madvise(MADV_COLLAPSE) can still cause
+	  transparent huge pages to be obtained even if this mode is
+	  specified.
+
+	config TRANSPARENT_HUGEPAGE_SHMEM_HUGE_ALWAYS
+		bool "always"
+	help
+	  Always attempt to allocate hugepage for shmem mount, can
+	  increase the memory footprint of applications without a
+	  guaranteed benefit but it will work automatically for all
+	  applications.
+
+	config TRANSPARENT_HUGEPAGE_SHMEM_HUGE_WITHIN_SIZE
+		bool "within_size"
+	help
+	  Enable hugepage allocation for shmem mount if the allocation
+	  will be fully within the i_size. This configuration also takes
+	  into account any madvise(MADV_HUGEPAGE) hints that may be
+	  provided by the applications.
+
+	config TRANSPARENT_HUGEPAGE_SHMEM_HUGE_ADVISE
+		bool "advise"
+	help
+	  Enable hugepage allocation for the shmem mount exclusively when
+	  applications supply the madvise(MADV_HUGEPAGE) hint.
+	  This ensures that hugepages are used only in response to explicit
+	  requests from applications.
+endchoice
+
+choice
+	prompt "Tmpfs hugepage allocation defaults"
+	depends on TRANSPARENT_HUGEPAGE
+	default TRANSPARENT_HUGEPAGE_TMPFS_HUGE_NEVER
+	help
+	  Selects the hugepage allocation policy defaults for
+	  the tmpfs mount.
+
+	  The selection made here can be overridden by using the kernel
+	  command line 'transparent_hugepage_tmpfs=' option.
+
+	config TRANSPARENT_HUGEPAGE_TMPFS_HUGE_NEVER
+		bool "never"
+	help
+	  Disable hugepage allocation for tmpfs mount by default. It can
+	  still be enabled with the kernel command line
+	  'transparent_hugepage_tmpfs=' option. Note that
+	  madvise(MADV_COLLAPSE) can still cause transparent huge pages
+	  to be obtained even if this mode is specified.
+
+	config TRANSPARENT_HUGEPAGE_TMPFS_HUGE_ALWAYS
+		bool "always"
+	help
+	  Always attempt to allocate hugepage for tmpfs mount, can
+	  increase the memory footprint of applications without a
+	  guaranteed benefit but it will work automatically for all
+	  applications.
+
+	config TRANSPARENT_HUGEPAGE_TMPFS_HUGE_WITHIN_SIZE
+		bool "within_size"
+	help
+	  Enable hugepage allocation for tmpfs mount if the allocation
+	  will be fully within the i_size. This configuration also takes
+	  into account any madvise(MADV_HUGEPAGE) hints that may be
+	  provided by the applications.
+
+	config TRANSPARENT_HUGEPAGE_TMPFS_HUGE_ADVISE
+		bool "advise"
+	help
+	  Enable hugepage allocation for the tmpfs mount exclusively when
+	  applications supply the madvise(MADV_HUGEPAGE) hint.
+	  This ensures that hugepages are used only in response to explicit
+	  requests from applications.
+endchoice
+
 config THP_SWAP
 	def_bool y
 	depends on TRANSPARENT_HUGEPAGE && ARCH_WANTS_THP_SWAP && SWAP && 64BIT
@@ -915,6 +997,9 @@ config HAVE_GIGANTIC_FOLIOS
 	def_bool (HUGETLB_PAGE && ARCH_HAS_GIGANTIC_PAGE) || \
 		 (ZONE_DEVICE && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
 
+config ASYNC_KERNEL_PGTABLE_FREE
+	def_bool n
+
 # TODO: Allow to be enabled without THP
 config ARCH_SUPPORTS_HUGE_PFNMAP
 	def_bool n
diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index 32b65073d0cc..7638d75b27db 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -175,10 +175,10 @@ config DEBUG_PAGE_REF
 	  nil until the tracepoints are actually enabled.
 
 config DEBUG_RODATA_TEST
-    bool "Testcase for the marking rodata read-only"
-    depends on STRICT_KERNEL_RWX
+	bool "Testcase for the marking rodata read-only"
+	depends on STRICT_KERNEL_RWX
 	help
-      This option enables a testcase for the setting rodata read-only.
+	  This option enables a testcase for the setting rodata read-only.
 
 config ARCH_HAS_DEBUG_WX
 	bool
diff --git a/mm/Makefile b/mm/Makefile
index 21abb3353550..00ceb2418b64 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -78,7 +78,7 @@ endif
 obj-$(CONFIG_SWAP) += page_io.o swap_state.o swapfile.o
 obj-$(CONFIG_ZSWAP) += zswap.o
 obj-$(CONFIG_HAS_DMA) += dmapool.o
-obj-$(CONFIG_HUGETLBFS) += hugetlb.o
+obj-$(CONFIG_HUGETLBFS) += hugetlb.o hugetlb_sysfs.o hugetlb_sysctl.o
 ifdef CONFIG_CMA
 obj-$(CONFIG_HUGETLBFS) += hugetlb_cma.o
 endif
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 109b050c795a..f9fc0375890a 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -10,6 +10,7 @@
 #include <linux/damon.h>
 #include <linux/delay.h>
 #include <linux/kthread.h>
+#include <linux/memcontrol.h>
 #include <linux/mm.h>
 #include <linux/psi.h>
 #include <linux/slab.h>
@@ -19,11 +20,6 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/damon.h>
 
-#ifdef CONFIG_DAMON_KUNIT_TEST
-#undef DAMON_MIN_REGION
-#define DAMON_MIN_REGION 1
-#endif
-
 static DEFINE_MUTEX(damon_lock);
 static int nr_running_ctxs;
 static bool running_exclusive_ctxs;
@@ -305,7 +301,7 @@ void damos_add_filter(struct damos *s, struct damos_filter *f)
 	if (damos_filter_for_ops(f->type))
 		list_add_tail(&f->list, &s->ops_filters);
 	else
-		list_add_tail(&f->list, &s->filters);
+		list_add_tail(&f->list, &s->core_filters);
 }
 
 static void damos_del_filter(struct damos_filter *f)
@@ -396,7 +392,7 @@ struct damos *damon_new_scheme(struct damos_access_pattern *pattern,
 	 */
 	scheme->next_apply_sis = 0;
 	scheme->walk_completed = false;
-	INIT_LIST_HEAD(&scheme->filters);
+	INIT_LIST_HEAD(&scheme->core_filters);
 	INIT_LIST_HEAD(&scheme->ops_filters);
 	scheme->stat = (struct damos_stat){};
 	INIT_LIST_HEAD(&scheme->list);
@@ -449,7 +445,7 @@ void damon_destroy_scheme(struct damos *s)
 	damos_for_each_quota_goal_safe(g, g_next, &s->quota)
 		damos_destroy_quota_goal(g);
 
-	damos_for_each_filter_safe(f, next, s)
+	damos_for_each_core_filter_safe(f, next, s)
 		damos_destroy_filter(f);
 
 	damos_for_each_ops_filter_safe(f, next, s)
@@ -478,6 +474,7 @@ struct damon_target *damon_new_target(void)
 	t->nr_regions = 0;
 	INIT_LIST_HEAD(&t->regions_list);
 	INIT_LIST_HEAD(&t->list);
+	t->obsolete = false;
 
 	return t;
 }
@@ -788,6 +785,11 @@ static void damos_commit_quota_goal_union(
 	case DAMOS_QUOTA_NODE_MEM_FREE_BP:
