From 80367ad01d93ac781b0e1df246edaf006928002f Mon Sep 17 00:00:00 2001 From: Sebastian Andrzej Siewior Date: Wed, 16 Apr 2025 18:29:12 +0200 Subject: futex: Add basic infrastructure for local task local hash The futex hash is system wide and shared by all tasks. Each slot is hashed based on futex address and the VMA of the thread. Due to randomized VMAs (and memory allocations) the same logical lock (pointer) can end up in a different hash bucket on each invocation of the application. This in turn means that different applications may share a hash bucket on the first invocation but not on the second and it is not always clear which applications will be involved. This can result in high latency's to acquire the futex_hash_bucket::lock especially if the lock owner is limited to a CPU and can not be effectively PI boosted. Introduce basic infrastructure for process local hash which is shared by all threads of process. This hash will only be used for a PROCESS_PRIVATE FUTEX operation. The hashmap can be allocated via: prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, num); A `num' of 0 means that the global hash is used instead of a private hash. Other values for `num' specify the number of slots for the hash and the number must be power of two, starting with two. The prctl() returns zero on success. This function can only be used before a thread is created. The current status for the private hash can be queried via: num = prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS); which return the current number of slots. The value 0 means that the global hash is used. Values greater than 0 indicate the number of slots that are used. A negative number indicates an error. For optimisation, for the private hash jhash2() uses only two arguments the address and the offset. This omits the VMA which is always the same. [peterz: Use 0 for global hash. A bit shuffling and renaming. ] Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20250416162921.513656-13-bigeasy@linutronix.de --- init/Kconfig | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index 63f5974b9fa6..4b84da2b2ec4 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1699,6 +1699,11 @@ config FUTEX_PI depends on FUTEX && RT_MUTEXES default y +config FUTEX_PRIVATE_HASH + bool + depends on FUTEX && !BASE_SMALL && MMU + default y + config EPOLL bool "Enable eventpoll support" if EXPERT default y -- cgit v1.2.3 From c042c505210dc3453f378df432c10fff3d471bc5 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Wed, 16 Apr 2025 18:29:17 +0200 Subject: futex: Implement FUTEX2_MPOL Extend the futex2 interface to be aware of mempolicy. When FUTEX2_MPOL is specified and there is a MPOL_PREFERRED or home_node specified covering the futex address, use that hash-map. Notably, in this case the futex will go to the global node hashtable, even if it is a PRIVATE futex. When FUTEX2_NUMA|FUTEX2_MPOL is specified and the user specified node value is FUTEX_NO_NODE, the MPOL lookup (as described above) will be tried first before reverting to setting node to the local node. [bigeasy: add CONFIG_FUTEX_MPOL, add MPOL to FUTEX2_VALID_MASK, write the node only to user if FUTEX_NO_NODE was supplied] Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20250416162921.513656-18-bigeasy@linutronix.de --- init/Kconfig | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index 4b84da2b2ec4..b373267ba2e2 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1704,6 +1704,11 @@ config FUTEX_PRIVATE_HASH depends on FUTEX && !BASE_SMALL && MMU default y +config FUTEX_MPOL + bool + depends on FUTEX && NUMA + default y + config EPOLL bool "Enable eventpoll support" if EXPERT default y -- cgit v1.2.3