linux.git - Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git

Age	Commit message (Collapse)	Author	Files	Lines
2026-03-04	tools: Fix bitfield dependency failure	Leo Yan	1	-0/+1
	[ Upstream commit a537c0da168a08b0b6a7f7bd9e75f4cc8d45ff57 ] A perf build failure was reported by Thomas Voegtle on stable kernel v6.6.120: CC tests/sample-parsing.o CC util/intel-pt-decoder/intel-pt-pkt-decoder.o CC util/perf-regs-arch/perf_regs_csky.o CC util/arm-spe-decoder/arm-spe-pkt-decoder.o CC util/perf-regs-arch/perf_regs_loongarch.o In file included from util/arm-spe-decoder/arm-spe-pkt-decoder.h:10, from util/arm-spe-decoder/arm-spe-pkt-decoder.c:14: /local/git/linux-stable-rc/tools/include/linux/bitfield.h: In function ‘le16_encode_bits’: /local/git/linux-stable-rc/tools/include/linux/bitfield.h:166:31: error: implicit declaration of function ‘cpu_to_le16’; did you mean ‘htole16’? [-Werror=implicit-function-declaration] ____MAKE_OP(le##size,u##size,cpu_to_le##size,le##size##_to_cpu) \ ^~~~~~~~~ /local/git/linux-stable-rc/tools/include/linux/bitfield.h:149:9: note: in definition of macro ‘____MAKE_OP’ return to((v & field_mask(field)) * field_multiplier(field)); \ ^~ /local/git/linux-stable-rc/tools/include/linux/bitfield.h:170:1: note: in expansion of macro ‘__MAKE_OP’ __MAKE_OP(16) Fix this by including linux/kernel.h, which provides the required definitions. The issue was not found on the mainline due to the relevant C files have included kernel.h. It'd be good to merge this change on mainline as well for robustness. Closes: https://lore.kernel.org/stable/3a44500b-d7c8-179f-61f6-e51cb50d3512@lio96.de/ Fixes: 64d86c03e1441742 ("perf arm-spe: Extend branch operations") Reported-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Reported-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> To: Sasha Levin <sashal@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-11	tools/nolibc/stdio: let perror work when NOLIBC_IGNORE_ERRNO is set	Benjamin Berg	1	-0/+4
	[ Upstream commit c485ca3aff2442adea4c08ceb5183e671ebed22a ] There is no errno variable when NOLIBC_IGNORE_ERRNO is defined. As such, simply print the message with "unknown error" rather than the integer value of errno. Fixes: acab7bcdb1bc ("tools/nolibc/stdio: add perror() to report the errno value") Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-24	tools bitmap: Add missing asm-generic/bitsperlong.h include	Ian Rogers	1	-0/+1
	[ Upstream commit f38ce0209ab4553906b44bd1159e35c740a84161 ] small_const_nbits is defined in asm-generic/bitsperlong.h which bitmap.h uses but doesn't include causing build failures in some build systems. Add the missing #include. Note the bitmap.h in tools has diverged from that of the kernel, so no changes are made there. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Yury Norov <yury.norov@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: André Almeida <andrealmeid@igalia.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Darren Hart <dvhart@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jason Xing <kerneljasonxing@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonas Gottlieb <jonas.gottlieb@stackit.cloud> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maurice Lambert <mauricelambert434@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Machata <petrm@nvidia.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yuyang Huang <yuyanghuang@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-15	tools/nolibc: make time_t robust if __kernel_old_time_t is missing in host ↵	Zhouyi Zhou	1	-1/+1
	headers [ Upstream commit 0ff52df6b32a6b04a7c9dfe3d7a387aff215b482 ] Commit d5094bcb5bfd ("tools/nolibc: define time_t in terms of __kernel_old_time_t") made nolibc use the kernel's time type so that `time_t` matches `timespec::tv_sec` on all ABIs (notably x32). But since __kernel_old_time_t is fairly new, notably from 2020 in commit 94c467ddb273 ("y2038: add __kernel_old_timespec and __kernel_old_time_t"), nolibc builds that rely on host headers may fail. Switch to __kernel_time_t, which is the same as __kernel_old_time_t and has existed for longer. Tested in PPC VM of Open Source Lab of Oregon State University (./tools/testing/selftests/rcutorture/bin/mkinitrd.sh) Fixes: d5094bcb5bfd ("tools/nolibc: define time_t in terms of __kernel_old_time_t") Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com> [Thomas: Reformat commit and its message a bit] Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28	bonding: Add independent control state machine	Aahil Awatramani	1	-0/+1
	[ Upstream commit 240fd405528bbf7fafa0559202ca7aa524c9cd96 ] Add support for the independent control state machine per IEEE 802.1AX-2008 5.4.15 in addition to the existing implementation of the coupled control state machine. Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in the LACP MUX state machine for separated handling of an initial Collecting state before the Collecting and Distributing state. This enables a port to be in a state where it can receive incoming packets while not still distributing. This is useful for reducing packet loss when a port begins distributing before its partner is able to collect. Added new functions such as bond_set_slave_tx_disabled_flags and bond_set_slave_rx_enabled_flags to precisely manage the port's collecting and distributing states. Previously, there was no dedicated method to disable TX while keeping RX enabled, which this patch addresses. Note that the regular flow process in the kernel's bonding driver remains unaffected by this patch. The extension requires explicit opt-in by the user (in order to ensure no disruptions for existing setups) via netlink support using the new bonding parameter coupled_control. The default value for coupled_control is set to 1 so as to preserve existing behaviour. Signed-off-by: Aahil Awatramani <aahila@google.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240202175858.1573852-1-aahila@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Stable-dep-of: 0599640a21e9 ("bonding: send LACPDUs periodically in passive mode after receiving partner's LACPDU") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28	tools/nolibc: fix spelling of FD_SETBITMASK in FD_* macros	Willy Tarreau	1	-2/+2
	commit a477629baa2a0e9991f640af418e8c973a1c08e3 upstream. While nolibc-test does test syscalls, it doesn't test as much the rest of the macros, and a wrong spelling of FD_SETBITMASK in commit feaf75658783a broke programs using either FD_SET() or FD_CLR() without being noticed. Let's fix these macros. Fixes: feaf75658783a ("nolibc: fix fd_set type") Cc: stable@vger.kernel.org # v6.2+ Acked-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	tools/nolibc: define time_t in terms of __kernel_old_time_t	Thomas Weißschuh	1	-1/+3
	[ Upstream commit d5094bcb5bfdfea2cf0de8aaf77cc65db56cbdb5 ] Nolibc assumes that the kernel ABI is using a time values that are as large as a long integer. For most ABIs this holds true. But for x32 this is not correct, as it uses 32bit longs but 64bit times. Also the 'struct stat' implementation of nolibc relies on timespec::tv_sec and time_t being the same type. While timespec::tv_sec comes from the kernel and is of type __kernel_old_time_t, time_t is defined within nolibc. Switch to the __kernel_old_time_t to always get the correct type. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20250712-nolibc-x32-v1-1-6d81cb798710@weissschuh.net Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-17	kallsyms: fix build without execinfo	Achill Gilgenast	1	-0/+4
	commit a95743b53031b015e8949e845a9f6fdfb2656347 upstream. Some libc's like musl libc don't provide execinfo.h since it's not part of POSIX. In order to fix compilation on musl, only include execinfo.h if available (HAVE_BACKTRACE_SUPPORT) This was discovered with c104c16073b7 ("Kunit to check the longest symbol length") which starts to include linux/kallsyms.h with Alpine Linux' configs. Link: https://lkml.kernel.org/r/20250622014608.448718-1-fossdd@pwned.life Fixes: c104c16073b7 ("Kunit to check the longest symbol length") Signed-off-by: Achill Gilgenast <fossdd@pwned.life> Cc: Luis Henriques <luis@igalia.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-27	bpf: Fix L4 csum update on IPv6 in CHECKSUM_COMPLETE	Paul Chaignon	1	-0/+2
	commit ead7f9b8de65632ef8060b84b0c55049a33cfea1 upstream. In Cilium, we use bpf_csum_diff + bpf_l4_csum_replace to, among other things, update the L4 checksum after reverse SNATing IPv6 packets. That use case is however not currently supported and leads to invalid skb->csum values in some cases. This patch adds support for IPv6 address changes in bpf_l4_csum_update via a new flag. When calling bpf_l4_csum_replace in Cilium, it ends up calling inet_proto_csum_replace_by_diff: 1: void inet_proto_csum_replace_by_diff(__sum16 sum, struct sk_buff skb, 2: __wsum diff, bool pseudohdr) 3: { 4: if (skb->ip_summed != CHECKSUM_PARTIAL) { 5: csum_replace_by_diff(sum, diff); 6: if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr) 7: skb->csum = ~csum_sub(diff, skb->csum); 8: } else if (pseudohdr) { 9: sum = ~csum_fold(csum_add(diff, csum_unfold(sum))); 10: } 11: } The bug happens when we're in the CHECKSUM_COMPLETE state. We've just updated one of the IPv6 addresses. The helper now updates the L4 header checksum on line 5. Next, it updates skb->csum on line 7. It shouldn't. For an IPv6 packet, the updates of the IPv6 address and of the L4 checksum will cancel each other. The checksums are set such that computing a checksum over the packet including its checksum will result in a sum of 0. So the same is true here when we update the L4 checksum on line 5. We'll update it as to cancel the previous IPv6 address update. Hence skb->csum should remain untouched in this case. The same bug doesn't affect IPv4 packets because, in that case, three fields are updated: the IPv4 address, the IP checksum, and the L4 checksum. The change to the IPv4 address and one of the checksums still cancel each other in skb->csum, but we're left with one checksum update and should therefore update skb->csum accordingly. That's exactly what inet_proto_csum_replace_by_diff does. This special case for IPv6 L4 checksums is also described atop inet_proto_csum_replace16, the function we should be using in this case. This patch introduces a new bpf_l4_csum_replace flag, BPF_F_IPV6, to indicate that we're updating the L4 checksum of an IPv6 packet. When the flag is set, inet_proto_csum_replace_by_diff will skip the skb->csum update. Fixes: 7d672345ed295 ("bpf: add generic bpf_csum_diff helper") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/96a6bc3a443e6f0b21ff7b7834000e17fb549e05.1748509484.git.paul.chaignon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-04	bpf: Allow pre-ordering for bpf cgroup progs	Yonghong Song	1	-0/+1
	[ Upstream commit 4b82b181a26cff8bf7adc3a85a88d121d92edeaf ] Currently for bpf progs in a cgroup hierarchy, the effective prog array is computed from bottom cgroup to upper cgroups (post-ordering). For example, the following cgroup hierarchy root cgroup: p1, p2 subcgroup: p3, p4 have BPF_F_ALLOW_MULTI for both cgroup levels. The effective cgroup array ordering looks like p3 p4 p1 p2 and at run time, progs will execute based on that order. But in some cases, it is desirable to have root prog executes earlier than children progs (pre-ordering). For example, - prog p1 intends to collect original pkt dest addresses. - prog p3 will modify original pkt dest addresses to a proxy address for security reason. The end result is that prog p1 gets proxy address which is not what it wants. Putting p1 to every child cgroup is not desirable either as it will duplicate itself in many child cgroups. And this is exactly a use case we are encountering in Meta. To fix this issue, let us introduce a flag BPF_F_PREORDER. If the flag is specified at attachment time, the prog has higher priority and the ordering with that flag will be from top to bottom (pre-ordering). For example, in the above example, root cgroup: p1, p2 subcgroup: p3, p4 Let us say p2 and p4 are marked with BPF_F_PREORDER. The final effective array ordering will be p2 p4 p3 p1 Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250224230116.283071-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-01-17	memblock tests: fix implicit declaration of function 'numa_valid_node'	Wei Yang	1	-0/+5
	commit 9364a7e40d54e6858479f0a96e1a04aa1204be16 upstream. commit 8043832e2a12 ("memblock: use numa_valid_node() helper to check for invalid node ID") introduce a new helper numa_valid_node(), which is not defined in memblock tests. Let's add it in the corresponding header file. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: Mike Rapoport (IBM) <rppt@kernel.org> Link: https://lore.kernel.org/r/20240624015432.31134-1-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport <rppt@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-02	stddef: make __struct_group() UAPI C++-friendly	Alexander Lobakin	1	-4/+11
	[ Upstream commit 724c6ce38bbaeb4b3f109b0e066d6c0ecd15446c ] For the most part of the C++ history, it couldn't have type declarations inside anonymous unions for different reasons. At the same time, __struct_group() relies on the latters, so when the @TAG argument is not empty, C++ code doesn't want to build (even under `extern "C"`): ../linux/include/uapi/linux/pkt_cls.h:25:24: error: 'struct tc_u32_sel::<unnamed union>::tc_u32_sel_hdr,' invalid; an anonymous union may only have public non-static data members [-fpermissive] The safest way to fix this without trying to switch standards (which is impossible in UAPI anyway) etc., is to disable tag declaration for that language. This won't break anything since for now it's not buildable at all. Use a separate definition for __struct_group() when __cplusplus is defined to mitigate the error, including the version from tools/. Fixes: 50d7bd38c3aa ("stddef: Introduce struct_group() helper macro") Reported-by: Christopher Ferris <cferris@google.com> Closes: https://lore.kernel.org/linux-hardening/Z1HZpe3WE5As8UAz@google.com Suggested-by: Kees Cook <kees@kernel.org> # __struct_group_tag() Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20241219135734.2130002-1-aleksander.lobakin@intel.com Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-09	tools/nolibc: s390: include std.h	Thomas Weißschuh	1	-0/+1
	commit 711b5875814b2a0e9a5aaf7a85ba7c80f5a389b1 upstream. arch-s390.h uses types from std.h, but does not include it. Depending on the inclusion order the compilation can fail. Include std.h explicitly to avoid these errors. Fixes: 404fa87c0eaf ("tools/nolibc: s390: provide custom implementation for sys_fork") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://lore.kernel.org/r/20240927-nolibc-s390-std-h-v1-1-30442339a6b9@linutronix.de Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-01	bpf: Add cookie to perf_event bpf_link_info records	Jiri Olsa	1	-0/+6
	[ Upstream commit d5c16492c66fbfca85f36e42363d32212df5927b ] At the moment we don't store cookie for perf_event probes, while we do that for the rest of the probes. Adding cookie fields to struct bpf_link_info perf event probe records: perf_event.uprobe perf_event.kprobe perf_event.tracepoint perf_event.perf_event And the code to store that in bpf_link_info struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20240119110505.400573-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Stable-dep-of: 4deecdd29cf2 ("bpf: fix unpopulated name_len field in perf_event link info") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-01	bpf: Add missed value to kprobe perf link info	Jiri Olsa	1	-0/+1
	[ Upstream commit 3acf8ace68230e9558cf916847f1cc9f208abdf1 ] Add missed value to kprobe attached through perf link info to hold the stats of missed kprobe handler execution. The kprobe's missed counter gets incremented when kprobe handler is not executed due to another kprobe running on the same cpu. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-4-jolsa@kernel.org Stable-dep-of: 4deecdd29cf2 ("bpf: fix unpopulated name_len field in perf_event link info") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-10	tools/nolibc: powerpc: limit stack-protector workaround to GCC	Thomas Weißschuh	1	-1/+1
	[ Upstream commit 1daea158d0aae0770371f3079305a29fdb66829e ] As mentioned in the comment, the workaround for __attribute__((no_stack_protector)) is only necessary on GCC. Avoid applying the workaround on clang, as clang does not recognize __attribute__((__optimize__)) and would fail. Acked-by: Willy Tarreau <w@1wt.eu> Link: https://lore.kernel.org/r/20240807-nolibc-llvm-v2-3-c20f2f5fc7c2@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-29	tools: move alignment-related macros to new <linux/align.h>	Alexander Lobakin	3	-5/+14
	commit 10a04ff09bcc39e0044190ffe9f00f998f13647c upstream. Currently, tools have ALIGN() macros scattered across the unrelated headers, as there are only 3 of them and they were added separately each time on an as-needed basis. Anyway, let's make it more consistent with the kernel headers and allow using those macros outside of the mentioned headers. Create <linux/align.h> inside the tools/ folder and include it where needed. Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29	bitmap: introduce generic optimized bitmap_size()	Alexander Lobakin	1	-3/+4
	commit a37fbe666c016fd89e4460d0ebfcea05baba46dc upstream. The number of times yet another open coded `BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge. Some generic helper is long overdue. Add one, bitmap_size(), but with one detail. BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both divident and divisor are compile-time constants or when the divisor is not a pow-of-2. When it is however, the compilers sometimes tend to generate suboptimal code (GCC 13): 48 83 c0 3f add $0x3f,%rax 48 c1 e8 06 shr $0x6,%rax 48 8d 14 c5 00 00 00 00 lea 0x0(,%rax,8),%rdx %BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does full division of `nbits + 63` by it and then multiplication by 8. Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC: 8d 50 3f lea 0x3f(%rax),%edx c1 ea 03 shr $0x3,%edx 81 e2 f8 ff ff 1f and $0x1ffffff8,%edx Now it shifts `nbits + 63` by 3 positions (IOW performs fast division by 8) and then masks bits[2:0]. bloat-o-meter: add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617) Clang does it better and generates the same code before/after starting from -O1, except that with the ALIGN() approach it uses %edx and thus still saves some bytes: add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520) Note that we can't expand DIV_ROUND_UP() by adding a check and using this approach there, as it's used in array declarations where expressions are not allowed. Add this helper to tools/ as well. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Acked-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19	bpf: Replace bpf_lpm_trie_key 0-length array with flexible array	Kees Cook	1	-1/+18
	[ Upstream commit 896880ff30866f386ebed14ab81ce1ad3710cfc4 ] Replace deprecated 0-length array in struct bpf_lpm_trie_key with flexible array. Found with GCC 13: ../kernel/bpf/lpm_trie.c:207:51: warning: array subscript i is outside array bounds of 'const __u8[0]' {aka 'const unsigned char[]'} [-Warray-bounds=] 207 \| (__be16 )&key->data[i]); \| ^~~~~~~~~~~~~ ../include/uapi/linux/swab.h:102:54: note: in definition of macro '__swab16' 102 \| #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x)) \| ^ ../include/linux/byteorder/generic.h:97:21: note: in expansion of macro '__be16_to_cpu' 97 \| #define be16_to_cpu __be16_to_cpu \| ^~~~~~~~~~~~~ ../kernel/bpf/lpm_trie.c:206:28: note: in expansion of macro 'be16_to_cpu' 206 \| u16 diff = be16_to_cpu((__be16 )&node->data[i] ^ \| ^~~~~~~~~~~ In file included from ../include/linux/bpf.h:7: ../include/uapi/linux/bpf.h:82:17: note: while referencing 'data' 82 \| __u8 data[0]; /* Arbitrary size / \| ^~~~ And found at run-time under CONFIG_FORTIFY_SOURCE: UBSAN: array-index-out-of-bounds in kernel/bpf/lpm_trie.c:218:49 index 0 is out of range for type '__u8 []' Changing struct bpf_lpm_trie_key is difficult since has been used by userspace. For example, in Cilium: struct egress_gw_policy_key { struct bpf_lpm_trie_key lpm_key; __u32 saddr; __u32 daddr; }; While direct references to the "data" member haven't been found, there are static initializers what include the final member. For example, the "{}" here: struct egress_gw_policy_key in_key = { .lpm_key = { 32 + 24, {} }, .saddr = CLIENT_IP, .daddr = EXTERNAL_SVC_IP & 0Xffffff, }; To avoid the build time and run time warnings seen with a 0-sized trailing array for struct bpf_lpm_trie_key, introduce a new struct that correctly uses a flexible array for the trailing bytes, struct bpf_lpm_trie_key_u8. As part of this, include the "header" portion (which is just the "prefixlen" member), so it can be used by anything building a bpf_lpr_trie_key that has trailing members that aren't a u8 flexible array (like the self-test[1]), which is named struct bpf_lpm_trie_key_hdr. Unfortunately, C++ refuses to parse the __struct_group() helper, so it is not possible to define struct bpf_lpm_trie_key_hdr directly in struct bpf_lpm_trie_key_u8, so we must open-code the union directly. Adjust the kernel code to use struct bpf_lpm_trie_key_u8 through-out, and for the selftest to use struct bpf_lpm_trie_key_hdr. Add a comment to the UAPI header directing folks to the two new options. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org> Closes: https://paste.debian.net/hidden/ca500597/ Link: https://lore.kernel.org/all/202206281009.4332AA33@keescook/ [1] Link: https://lore.kernel.org/bpf/20240222155612.it.533-kees@kernel.org Stable-dep-of: 59f2f841179a ("bpf: Avoid kfree_rcu() under lock in bpf_lpm_trie.") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12	bpf: Pack struct bpf_fib_lookup	Anton Protopopov	1	-1/+1
	[ Upstream commit f91717007217d975aa975ddabd91ae1a107b9bff ] The struct bpf_fib_lookup is supposed to be of size 64. A recent commit 59b418c7063d ("bpf: Add a check for struct bpf_fib_lookup size") added a static assertion to check this property so that future changes to the structure will not accidentally break this assumption. As it immediately turned out, on some 32-bit arm systems, when AEABI=n, the total size of the structure was equal to 68, see [1]. This happened because the bpf_fib_lookup structure contains a union of two 16-bit fields: union { __u16 tot_len; __u16 mtu_result; }; which was supposed to compile to a 16-bit-aligned 16-bit field. On the aforementioned setups it was instead both aligned and padded to 32-bits. Declare this inner union as __attribute__((packed, aligned(2))) such that it always is of size 2 and is aligned to 16 bits. [1] https://lore.kernel.org/all/CA+G9fYtsoP51f-oP_Sp5MOq-Ffv8La2RztNpwvE6+R1VtFiLrw@mail.gmail.com/#t Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: e1850ea9bd9e ("bpf: bpf_fib_lookup return MTU value as output when looked up") Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240403123303.1452184-1-aspsk@isovalent.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12	tools/nolibc/stdlib: fix memory error in realloc()	Brennan Xavier McManus	1	-1/+1
	commit 791f4641142e2aced85de082e5783b4fb0b977c2 upstream. Pass user_p_len to memcpy() instead of heap->len to prevent realloc() from copying an extra sizeof(heap) bytes from beyond the allocated region. Signed-off-by: Brennan Xavier McManus <bxmcmanus@gmail.com> Cc: stable@vger.kernel.org Reviewed-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Fixes: 0e0ff638400be8f497a35b51a4751fd823f6bd6a ("tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`") Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17	memblock tests: fix undefined reference to `panic'	Wei Yang	2	-0/+20
	[ Upstream commit e0f5a8e74be88f2476e58b25d3b49a9521bdc4ec ] commit e96c6b8f212a ("memblock: report failures when memblock_can_resize is not set") introduced the usage of panic, which is not defined in memblock test. Let's define it directly in panic.h to fix it. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: Song Shuai <songshuaishuai@tinylab.org> CC: Mike Rapoport <rppt@kernel.org> Link: https://lore.kernel.org/r/20240402132701.29744-3-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-17	memblock tests: fix undefined reference to `early_pfn_to_nid'	Wei Yang	1	-0/+5
	[ Upstream commit 7d8ed162e6a92268d4b2b84d364a931216102c8e ] commit 6a9531c3a880 ("memblock: fix crash when reserved memory is not added to memory") introduce the usage of early_pfn_to_nid, which is not defined in memblock tests. The original definition of early_pfn_to_nid is defined in mm.h, so let add this in the corresponding mm.h. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: Yajun Deng <yajun.deng@linux.dev> CC: Mike Rapoport <rppt@kernel.org> Link: https://lore.kernel.org/r/20240402132701.29744-2-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-03	tools/resolve_btfids: fix build with musl libc	Natanael Copa	1	-0/+2
	commit 62248b22d01e96a4d669cde0d7005bd51ebf9e76 upstream. Include the header that defines u32. This fixes build of 6.6.23 and 6.1.83 kernels for Alpine Linux, which uses musl libc. I assume that GNU libc indirecly pulls in linux/types.h. Fixes: 9707ac4fe2f5 ("tools/resolve_btfids: Refactor set sorting with types from btf_ids.h") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218647 Cc: stable@vger.kernel.org Signed-off-by: Natanael Copa <ncopa@alpinelinux.org> Tested-by: Greg Thelen <gthelen@google.com> Link: https://lore.kernel.org/r/20240328110103.28734-1-ncopa@alpinelinux.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-26	tools/resolve_btfids: Refactor set sorting with types from btf_ids.h	Viktor Malik	1	-0/+9
	[ Upstream commit 9707ac4fe2f5bac6406d2403f8b8a64d7b3d8e43 ] Instead of using magic offsets to access BTF ID set data, leverage types from btf_ids.h (btf_id_set and btf_id_set8) which define the actual layout of the data. Thanks to this change, set sorting should also continue working if the layout changes. This requires to sync the definition of 'struct btf_id_set8' from include/linux/btf_ids.h to tools/include/linux/btf_ids.h. We don't sync the rest of the file at the moment, b/c that would require to also sync multiple dependent headers and we don't need any other defs from btf_ids.h. Signed-off-by: Viktor Malik <vmalik@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/bpf/ff7f062ddf6a00815fda3087957c4ce667f50532.1707223196.git.vmalik@redhat.com Stable-dep-of: 903fad439466 ("tools/resolve_btfids: Fix cross-compilation to non-host endianness") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-01	bpf: Derive source IP addr via bpf_*_fib_lookup()	Martynas Pumputis	1	-0/+10
	commit dab4e1f06cabb6834de14264394ccab197007302 upstream. Extend the bpf_fib_lookup() helper by making it to return the source IPv4/IPv6 address if the BPF_FIB_LOOKUP_SRC flag is set. For example, the following snippet can be used to derive the desired source IP address: struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr }; ret = bpf_skb_fib_lookup(skb, p, sizeof(p), BPF_FIB_LOOKUP_SRC \| BPF_FIB_LOOKUP_SKIP_NEIGH); if (ret != BPF_FIB_LKUP_RET_SUCCESS) return TC_ACT_SHOT; /* the p.ipv4_src now contains the source address */ The inability to derive the proper source address may cause malfunctions in BPF-based dataplanes for hosts containing netdevs with more than one routable IP address or for multi-homed hosts. For example, Cilium implements packet masquerading in BPF. If an egressing netdev to which the Cilium's BPF prog is attached has multiple IP addresses, then only one [hardcoded] IP address can be used for masquerading. This breaks connectivity if any other IP address should have been selected instead, for example, when a public and private addresses are attached to the same egress interface. The change was tested with Cilium [1]. Nikolay Aleksandrov helped to figure out the IPv6 addr selection. [1]: https://github.com/cilium/cilium/pull/28283 Signed-off-by: Martynas Pumputis <m@lambda.lt> Link: https://lore.kernel.org/r/20231007081415.33502-2-m@lambda.lt Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-23	work around gcc bugs with 'asm goto' with outputs	Linus Torvalds	1	-2/+2
	commit 4356e9f841f7fbb945521cef3577ba394c65f3fc upstream. We've had issues with gcc and 'asm goto' before, and we created a 'asm_volatile_goto()' macro for that in the past: see commits 3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation bug") and a9f180345f53 ("compiler/gcc4: Make quirk for asm_volatile_goto() unconditional"). Then, much later, we ended up removing the workaround in commit 43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR 58670") because we no longer supported building the kernel with the affected gcc versions, but we left the macro uses around. Now, Sean Christopherson reports a new version of a very similar problem, which is fixed by re-applying that ancient workaround. But the problem in question is limited to only the 'asm goto with outputs' cases, so instead of re-introducing the old workaround as-is, let's rename and limit the workaround to just that much less common case. It looks like there are at least two separate issues that all hit in this area: (a) some versions of gcc don't mark the asm goto as 'volatile' when it has outputs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420 which is easy to work around by just adding the 'volatile' by hand. (b) Internal compiler errors: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422 which are worked around by adding the extra empty 'asm' as a barrier, as in the original workaround. but the problem Sean sees may be a third thing since it involves bad code generation (not an ICE) even with the manually added 'volatile'. but the same old workaround works for this case, even if this feels a bit like voodoo programming and may only be hiding the issue. Reported-and-tested-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/ Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Andrew Pinski <quic_apinski@quicinc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-25	bpf: Add crosstask check to __bpf_get_stack	Jordan Rome	1	-0/+3
	[ Upstream commit b8e3a87a627b575896e448021e5c2f8a3bc19931 ] Currently get_perf_callchain only supports user stack walking for the current task. Passing the correct crosstask param will return 0 frames if the task passed to __bpf_get_stack isn't the current one instead of a single incorrect frame/address. This change passes the correct crosstask param but also does a preemptive check in __bpf_get_stack if the task is current and returns -EOPNOTSUPP if it is not. This issue was found using bpf_get_task_stack inside a BPF iterator ("iter/task"), which iterates over all tasks. bpf_get_task_stack works fine for fetching kernel stacks but because get_perf_callchain relies on the caller to know if the requested task is the current one (via crosstask) it was failing in a confusing way. It might be possible to get user stacks for all tasks utilizing something like access_process_vm but that requires the bpf program calling bpf_get_task_stack to be sleepable and would therefore be a breaking change. Fixes: fa28dcb82a38 ("bpf: Introduce helper bpf_get_task_stack()") Signed-off-by: Jordan Rome <jordalgo@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231108112334.3433136-1-jordalgo@meta.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-03	mm: add a NO_INHERIT flag to the PR_SET_MDWE prctl	Florent Revest	1	-0/+1
	[ Upstream commit 24e41bf8a6b424c76c5902fb999e9eca61bdf83d ] This extends the current PR_SET_MDWE prctl arg with a bit to indicate that the process doesn't want MDWE protection to propagate to children. To implement this no-inherit mode, the tag in current->mm->flags must be absent from MMF_INIT_MASK. This means that the encoding for "MDWE but without inherit" is different in the prctl than in the mm flags. This leads to a bit of bit-mangling in the prctl implementation. Link: https://lkml.kernel.org/r/20230828150858.393570-6-revest@chromium.org Signed-off-by: Florent Revest <revest@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alexey Izbyshev <izbyshev@ispras.ru> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ayush Jain <ayush.jain3@amd.com> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Cc: Topi Miettinen <toiwoton@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: 793838138c15 ("prctl: Disable prctl(PR_SET_MDWE) on parisc") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28	mm: make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long	Florent Revest	1	-1/+1
	commit 0da668333fb07805c2836d5d50e26eda915b24a1 upstream. Defining a prctl flag as an int is a footgun because on a 64 bit machine and with a variadic implementation of prctl (like in musl and glibc), when used directly as a prctl argument, it can get casted to long with garbage upper bits which would result in unexpected behaviors. This patch changes the constant to an unsigned long to eliminate that possibilities. This does not break UAPI. I think that a stable backport would be "nice to have": to reduce the chances that users build binaries that could end up with garbage bits in their MDWE prctl arguments. We are not aware of anyone having yet encountered this corner case with MDWE prctls but a backport would reduce the likelihood it happens, since this sort of issues has happened with other prctls. But If this is perceived as a backporting burden, I suppose we could also live without a stable backport. Link: https://lkml.kernel.org/r/20230828150858.393570-5-revest@chromium.org Fixes: b507808ebce2 ("mm: implement memory-deny-write-execute as a prctl") Signed-off-by: Florent Revest <revest@chromium.org> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ayush Jain <ayush.jain3@amd.com> Cc: Greg Thelen <gthelen@google.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com> Cc: Topi Miettinen <toiwoton@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-24	Merge tag 'mm-hotfixes-stable-2023-10-24-09-40' of ↵	Linus Torvalds	1	-0/+40
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "20 hotfixes. 12 are cc:stable and the remainder address post-6.5 issues or aren't considered necessary for earlier kernel versions" * tag 'mm-hotfixes-stable-2023-10-24-09-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: maple_tree: add GFP_KERNEL to allocations in mas_expected_entries() selftests/mm: include mman header to access MREMAP_DONTUNMAP identifier mailmap: correct email aliasing for Oleksij Rempel mailmap: map Bartosz's old address to the current one mm/damon/sysfs: check DAMOS regions update progress from before_terminate() MAINTAINERS: Ondrej has moved kasan: disable kasan_non_canonical_hook() for HW tags kasan: print the original fault addr when access invalid shadow hugetlbfs: close race between MADV_DONTNEED and page fault hugetlbfs: extend hugetlb_vma_lock to private VMAs hugetlbfs: clear resv_map pointer if mmap fails mm: zswap: fix pool refcount bug around shrink_worker() mm/migrate: fix do_pages_move for compat pointers riscv: fix set_huge_pte_at() for NAPOT mappings when a swap entry is set riscv: handle VM_FAULT_[HWPOISON\|HWPOISON_LARGE] faults instead of panicking mmap: fix error paths with dup_anon_vma() mmap: fix vma_iterator in error path of vma_merge() mm: fix vm_brk_flags() to not bail out while holding lock mm/mempolicy: fix set_mempolicy_home_node() previous VMA pointer mm/page_alloc: correct start page when guard page debug is enabled
2023-10-23	Merge tag 'urgent/nolibc.2023.10.16a' of ↵	Linus Torvalds	2	-1/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull nolibc fixes from Paul McKenney: - tools/nolibc: i386: Fix a stack misalign bug on _start - MAINTAINERS: nolibc: update tree location - tools/nolibc: mark start_c as weak to avoid linker errors * tag 'urgent/nolibc.2023.10.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: tools/nolibc: mark start_c as weak MAINTAINERS: nolibc: update tree location tools/nolibc: i386: Fix a stack misalign bug on _start
2023-10-18	maple_tree: add GFP_KERNEL to allocations in mas_expected_entries()	Liam R. Howlett	1	-0/+40
	Users complained about OOM errors during fork without triggering compaction. This can be fixed by modifying the flags used in mas_expected_entries() so that the compaction will be triggered in low memory situations. Since mas_expected_entries() is only used during fork, the extra argument does not need to be passed through. Additionally, the two test_maple_tree test cases and one benchmark test were altered to use the correct locking type so that allocations would not trigger sleeping and thus fail. Testing was completed with lockdep atomic sleep detection. The additional locking change requires rwsem support additions to the tools/ directory through the use of pthreads pthread_rwlock_t. With this change test_maple_tree works in userspace, as a module, and in-kernel. Users may notice that the system gave up early on attempting to start new processes instead of attempting to reclaim memory. Link: https://lkml.kernel.org/r/20230915093243epcms1p46fa00bbac1ab7b7dca94acb66c44c456@epcms1p4 Link: https://lkml.kernel.org/r/20231012155233.2272446-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: <jason.sim@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-12	tools/nolibc: mark start_c as weak	Thomas Weißschuh	1	-0/+1
	Otherwise the different instances of _start_c from each compilation unit will lead to linker errors: /usr/bin/ld: /tmp/ccSNvRqs.o: in function `_start_c': nolibc-test-foo.c:(.text.nolibc_memset+0x9): multiple definition of `_start_c'; /tmp/ccG25101.o:nolibc-test.c:(.text+0x1ea3): first defined here Fixes: 17336755150b ("tools/nolibc: add new crt.h with _start_c") Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/lkml/20231012-nolibc-start_c-multiple-v1-1-fbfc73e0283f@weissschuh.net/ Link: https://lore.kernel.org/lkml/20231012-nolibc-linkage-test-v1-1-315e682768b4@weissschuh.net/ Acked-by: Willy Tarreau <w@1wt.eu>
2023-10-12	tools/nolibc: i386: Fix a stack misalign bug on _start	Ammar Faizi	1	-1/+3
	The ABI mandates that the %esp register must be a multiple of 16 when executing a 'call' instruction. Commit 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c") simplified the _start function, but it didn't take care of the %esp alignment, causing SIGSEGV on SSE and AVX programs that use aligned move instruction (e.g., movdqa, movaps, and vmovdqa). The 'and $-16, %esp' aligns the %esp at a multiple of 16. Then 'push %eax' will subtract the %esp by 4; thus, it breaks the 16-byte alignment. Make sure the %esp is correctly aligned after the push by subtracting 12 before the push. Extra: Add 'add $12, %esp' before the 'and $-16, %esp' to avoid over-estimating for particular cases as suggested by Willy. A test program to validate the %esp alignment on _start can be found at: https://lore.kernel.org/lkml/ZOoindMFj1UKqo+s@biznet-home.integral.gnuweeb.org [ Thomas: trim Fixes tag commit id ] Cc: Zhangjin Wu <falcon@tinylab.org> Fixes: 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c") Reported-by: Nicholas Rosenberg <inori@vnlx.org> Acked-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Reviewed-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
2023-09-26	Merge tag 'perf-tools-fixes-for-v6.6-1-2023-09-25' of ↵	Linus Torvalds	3	-16/+230
	git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Namhyung Kim: "Build: - Update header files in the tools/**/include directory t