linux.git - Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git

Age	Commit message (Collapse)	Author	Files	Lines
2024-09-15	Merge tag 'for-linus-6.11' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	2	-11/+7
	Pull kvm fix from Paolo Bonzini: "Do not always honor guest PAT on CPUs that support self-snoop. This triggers an issue in the bochsdrm driver, which used ioremap() instead of ioremap_wc() to map the video RAM. The revert lets video RAM use the WB memory type instead of the slower UC memory type" * tag 'for-linus-6.11' of git://git.kernel.org/pub/scm/virt/kvm/kvm: Revert "KVM: VMX: Always honor guest PAT on CPUs that support self-snoop"
2024-09-15	riscv: Stop emitting preventive sfence.vma for new userspace mappings with ↵	Alexandre Ghiti	2	-1/+28
	Svvptc The preventive sfence.vma were emitted because new mappings must be made visible to the page table walker but Svvptc guarantees that it will happen within a bounded timeframe, so no need to sfence.vma for the uarchs that implement this extension, we will then take gratuitous (but very unlikely) page faults, similarly to x86 and arm64. This allows to drastically reduce the number of sfence.vma emitted: * Ubuntu boot to login: Before: ~630k sfence.vma After: ~200k sfence.vma * ltp - mmapstress01 Before: ~45k After: ~6.3k * lmbench - lat_pagefault Before: ~665k After: 832 (!) * lmbench - lat_mmap Before: ~546k After: 718 (!) Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240717060125.139416-5-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15	riscv: Stop emitting preventive sfence.vma for new vmalloc mappings	Alexandre Ghiti	5	-1/+120
	In 6.5, we removed the vmalloc fault path because that can't work (see [1] [2]). Then in order to make sure that new page table entries were seen by the page table walker, we had to preventively emit a sfence.vma on all harts [3] but this solution is very costly since it relies on IPI. And even there, we could end up in a loop of vmalloc faults if a vmalloc allocation is done in the IPI path (for example if it is traced, see [4]), which could result in a kernel stack overflow. Those preventive sfence.vma needed to be emitted because: - if the uarch caches invalid entries, the new mapping may not be observed by the page table walker and an invalidation may be needed. - if the uarch does not cache invalid entries, a reordered access could "miss" the new mapping and traps: in that case, we would actually only need to retry the access, no sfence.vma is required. So this patch removes those preventive sfence.vma and actually handles the possible (and unlikely) exceptions. And since the kernel stacks mappings lie in the vmalloc area, this handling must be done very early when the trap is taken, at the very beginning of handle_exception: this also rules out the vmalloc allocations in the fault path. Link: https://lore.kernel.org/linux-riscv/20230531093817.665799-1-bjorn@kernel.org/ [1] Link: https://lore.kernel.org/linux-riscv/20230801090927.2018653-1-dylan@andestech.com [2] Link: https://lore.kernel.org/linux-riscv/20230725132246.817726-1-alexghiti@rivosinc.com/ [3] Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ [4] Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com> Link: https://lore.kernel.org/r/20240717060125.139416-4-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15	riscv: Add ISA extension parsing for Svvptc	Alexandre Ghiti	2	-0/+2
	Add support to parse the Svvptc string in the riscv,isa string. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240717060125.139416-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15	riscv: select ARCH_USE_SYM_ANNOTATIONS	Jisheng Zhang	1	-0/+1
	Now, riscv has been converted to the new style SYM_ assembler annotations. So select ARCH_USE_SYM_ANNOTATIONS to ensure the deprecated macros such as ENTRY(), END(), WEAK() and so on are not available and we don't regress. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-By: Clément Léger <cleger@rivosinc.com> Link: https://lore.kernel.org/r/20240709160536.3690-3-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15	riscv: errata: sifive: Use SYM_*() assembly macros	Jisheng Zhang	1	-4/+4
	ENTRY()/END() macros are deprecated and we should make use of the new SYM_*() macros [1] for better annotation of symbols. Replace the deprecated ones with the new ones. [1] https://docs.kernel.org/core-api/asm-annotations.html Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-By: Clément Léger <cleger@rivosinc.com> Link: https://lore.kernel.org/r/20240709160536.3690-2-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14	riscv: stacktrace: Add USER_STACKTRACE support	Jinjie Ruan	3	-43/+47
	Currently, userstacktrace is unsupported for riscv. So use the perf_callchain_user() code as blueprint to implement the arch_stack_walk_user() which add userstacktrace support on riscv. Meanwhile, we can use arch_stack_walk_user() to simplify the implementation of perf_callchain_user(). A ftrace test case is shown as below: # cd /sys/kernel/debug/tracing # echo 1 > options/userstacktrace # echo 1 > options/sym-userobj # echo 1 > events/sched/sched_process_fork/enable # cat trace ...... bash-178 [000] ...1. 97.968395: sched_process_fork: comm=bash pid=178 child_comm=bash child_pid=231 bash-178 [000] ...1. 97.970075: <user stack trace> => /lib/libc.so.6[+0xb5090] Also a simple perf test is ok as below: # perf record -e cpu-clock --call-graph fp top # perf report --call-graph ..... [[31m 66.54%[[m 0.00% top [kernel.kallsyms] [k] ret_from_exception \| ---ret_from_exception \| \|--[[31m58.97%[[m--do_trap_ecall_u \| \| \| \|--[[31m17.34%[[m--__riscv_sys_read \| \| ksys_read \| \| \| \| \| --[[31m16.88%[[m--vfs_read \| \| \| \| \| \|--[[31m10.90%[[m--seq_read Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Tested-by: Jinjie Ruan <ruanjinjie@huawei.com> Cc: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240708032847.2998158-3-ruanjinjie@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14	riscv: Fix fp alignment bug in perf_callchain_user()	Jinjie Ruan	1	-1/+1
	The standard RISC-V calling convention said: "The stack grows downward and the stack pointer is always kept 16-byte aligned". So perf_callchain_user() should check whether 16-byte aligned for fp. Link: https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf Fixes: dbeb90b0c1eb ("riscv: Add perf callchain support") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Cc: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20240708032847.2998158-2-ruanjinjie@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-15	Revert "KVM: VMX: Always honor guest PAT on CPUs that support self-snoop"	Paolo Bonzini	2	-11/+7
	This reverts commit 377b2f359d1f71c75f8cc352b5c81f2210312d83. This caused a regression with the bochsdrm driver, which used ioremap() instead of ioremap_wc() to map the video RAM. After the commit, the WB memory type is used without the IGNORE_PAT, resulting in the slower UC memory type. In fact, UC is slow enough to basically cause guests to not boot... but only on new processors such as Sapphire Rapids and Cascade Lake. Coffee Lake for example works properly, though that might also be an effect of being on a larger, more NUMA system. The driver has been fixed but that does not help older guests. Until we figure out whether Cascade Lake and newer processors are working as intended, revert the commit. Long term we might add a quirk, but the details depend on whether the processors are working as intended: for example if they are, the quirk might reference bochs-compatible devices, e.g. in the name and documentation, so that userspace can disable the quirk by default and only leave it enabled if such a device is being exposed to the guest. If instead this is actually a bug in CLX+, then the actions we need to take are different and depend on the actual cause of the bug. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-09-15	Merge tag 'kvm-riscv-6.12-1' of https://github.com/kvm-riscv/linux into HEAD	Paolo Bonzini	3	-18/+21
	KVM/riscv changes for 6.12 - Fix sbiret init before forwarding to userspace - Don't zero-out PMU snapshot area before freeing data - Allow legacy PMU access from guest - Fix to allow hpmcounter31 from the guest
2024-09-15	Merge tag 'loongarch-kvm-6.12' of ↵	Paolo Bonzini	84	-311/+1118
	git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.12 1. Revert qspinlock to test-and-set simple lock on VM. 2. Add Loongson Binary Translation extension support. 3. Add PMU support for guest. 4. Enable paravirt feature control from VMM. 5. Implement function kvm_para_has_feature().
2024-09-14	riscv: Remove redundant restriction on memory size	Stuart Menefy	1	-7/+1
	The original reason for reserving the top 4GiB of the direct map (space for modules/BPF/kernel) hasn't applied since the address map was reworked for KASAN. Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240624121723.2186279-1-stuart.menefy@codasip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14	riscv: vdso: do not strip debugging info for vdso.so.dbg	Changbin Du	1	-1/+1
	The vdso.so.dbg is a debug version of vdso and could be used for debugging purpose. For example, perf-annotate requires debugging info to show source lines. So let's keep its debugging info. Signed-off-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240611040947.3024710-1-changbin.du@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-13	MIPS: Remove the obsoleted code for include/linux/mv643xx.h	Gaosheng Cui	1	-2/+5
	Most of the drivers which used this header have been deleted, most of these code is obsoleted, move the only defines that are actually used into arch/powerpc/platforms/chrp/pegasos_eth.c and delete the file completely. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Link: https://patch.msgid.link/20240912011949.2726928-1-cuigaosheng1@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-13	s390/vdso: Wire up getrandom() vdso implementation	Heiko Carstens	10	-8/+289
	Provide the s390 specific vdso getrandom() architecture backend. _vdso_rng_data required data is placed within the _vdso_data vvar page, by using a hardcoded offset larger than vdso_data. As required the chacha20 implementation does not write to the stack. The implementation follows more or less the arm64 implementations and makes use of vector instructions. It has a fallback to the getrandom() system call for machines where the vector facility is not installed. The check if the vector facility is installed, as well as an optimization for machines with the vector-enhancements facility 2, is implemented with alternatives, avoiding runtime checks. Note that __kernel_getrandom() is implemented without the vdso user wrapper which would setup a stack frame for odd cases (aka very old glibc variants) where the caller has not done that. All callers of __kernel_getrandom() are required to setup a stack frame, like the C ABI requires it. The vdso testcases vdso_test_getrandom and vdso_test_chacha pass. Benchmark on a z16: $ ./vdso_test_getrandom bench-single vdso: 25000000 times in 0.493703559 seconds syscall: 25000000 times in 6.584025337 seconds Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	arch/sparc: remove unused varible paddrbase in function leon_swprobe()	Alex Shi	1	-7/+1
	commit f22ed71cd602 ("sparc32,leon: SRMMU MMU Table probe fix") change return value from paddrbase to 'pte', but left the varible here. That causes a build warning for this varible, so we may remove it. make --keep-going CROSS_COMPILE=/home/alexs/0day/gcc-14.1.0-nolibc/sparc-linux/bin/sparc-linux- --jobs=16 KCFLAGS= -Wtautological-compare -Wno-error=return-type -Wreturn-type -Wcast-function-type -funsigned-char -Wundef -fstrict-flex-arrays=3 -Wformat-overflow -Wformat-truncation -Wrestrict -Wenum-conversion W=1 O=sparc ARCH=sparc defconfig SHELL=/bin/bash arch/sparc/mm/ mm/ -s <stdin>:1519:2: warning: #warning syscall clone3 not implemented [-Wcpp] ../arch/sparc/mm/leon_mm.c: In function 'leon_swprobe': ../arch/sparc/mm/leon_mm.c:42:32: warning: variable 'paddrbase' set but not used [-Wunused-but-set-variable] 42 \| unsigned int lvl, pte, paddrbase; \| ^~~~~~~~~ Signed-off-by: Alex Shi <alexs@kernel.org> To: linux-kernel@vger.kernel.org To: sparclinux@vger.kernel.org To: Christian Brauner <brauner@kernel.org> To: Andreas Larsson <andreas@gaisler.com> To: David S. Miller <davem@davemloft.net> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Tested-by: Andreas Larsson <andreas@gaisler.com> Link: https://lore.kernel.org/r/20240729064926.3126528-1-alexs@kernel.org Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2024-09-13	s390/vdso: Move vdso symbol handling to separate header file	Heiko Carstens	4	-14/+19
	The vdso.h header file, which is included at many places, includes generated header files. This can easily lead to recursive header file inclusions if the vdso code is changed. Therefore move the vdso symbol code, which requires the generated header files, to a separate header file, and include it at the two locations which require it. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	s390/vdso: Allow alternatives in vdso code	Heiko Carstens	2	-0/+23
	Implement the infrastructure required to allow alternatives in vdso code. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	s390/module: Provide find_section() helper	Heiko Carstens	1	-0/+14
	Provide find_section() helper function which can be used to find a section by name, similar to other architectures. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	s390/facility: Let test_facility() generate static branch if possible	Heiko Carstens	1	-8/+29
	Let test_facility() generate a branch instruction if the tested facility is a constant, and where the result cannot be evaluated during compile time. The branch instruction defaults to "false" and is patched to nop (branch not taken) if the tested facility is available. This avoids runtime checks and is similar to x86's static_cpu_has() and arm64's alternative_has_cap_likely(). Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	s390/alternatives: Remove ALT_FACILITY_EARLY	Heiko Carstens	2	-6/+2
	Patch all alternatives which depend on facilities from the decompressor. There is no technical reason which enforces to split patching of such alternatives to the decompressor and the kernel. This simplifies alternative handling a bit, since one alternative type is removed. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	s390/facility: Disable compile time optimization for decompressor code	Heiko Carstens	1	-2/+4
	Disable compile time optimizations of test_facility() for the decompressor. The decompressor should not contain any optimized code depending on the architecture level set the kernel image is compiled for to avoid unexpected operation exceptions. Add a __DECOMPRESSOR check to test_facility() to enforce that facilities are always checked during runtime for the decompressor. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO64	Christophe Leroy	5	-3/+69
	Extend getrandom() vDSO implementation to VDSO64. Tested on QEMU on both ppc64_defconfig and ppc64le_defconfig. Results from a Power9 (PowerNV): ~ # ./vdso_test_getrandom bench-single vdso: 25000000 times in 0.787943615 seconds libc: 25000000 times in 14.101887252 seconds syscall: 25000000 times in 14.047475082 seconds Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Madhavan Srinivasan <maddy@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32	Christophe Leroy	11	-4/+453
	To be consistent with other VDSO functions, the function is called __kernel_getrandom() __arch_chacha20_blocks_nostack() fonction is implemented basically with 32 bits operations. It performs 4 QUARTERROUND operations in parallele. There are enough registers to avoid using the stack: On input: r3: output bytes r4: 32-byte key input r5: 8-byte counter input/output r6: number of 64-byte blocks to write to output During operation: stack: pointer to counter (r5) and non-volatile registers (r14-131) r0: counter of blocks (initialised with r6) r4: Value '4' after key has been read, used for indexing r5-r12: key r14-r15: block counter r16-r31: chacha state At the end: r0, r6-r12: Zeroised r5, r14-r31: Restored Performance on powerpc 885 (using kernel selftest): ~# ./vdso_test_getrandom bench-single vdso: 25000000 times in 62.938002291 seconds libc: 25000000 times in 535.581916866 seconds syscall: 25000000 times in 531.525042806 seconds Performance on powerpc 8321 (using kernel selftest): ~# ./vdso_test_getrandom bench-single vdso: 25000000 times in 16.899318858 seconds libc: 25000000 times in 131.050596522 seconds syscall: 25000000 times in 129.794790389 seconds This first patch adds support for VDSO32. As selftests cannot easily be generated only for VDSO32, and because the following patch brings support for VDSO64 anyway, this patch opts out all code in __arch_chacha20_blocks_nostack() so that vdso_test_chacha will not fail to compile and will not crash on PPC64/PPC64LE, allthough the selftest itself will fail. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	powerpc/vdso: Refactor CFLAGS for CVDSO build	Christophe Leroy	1	-19/+13
	In order to avoid two much duplication when we add new VDSO functionnalities in C like getrandom, refactor common CFLAGS. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	powerpc/vdso32: Add crtsavres	Christophe Leroy	2	-14/+4
	Commit 08c18b63d965 ("powerpc/vdso32: Add missing _restgpr_31_x to fix build failure") added _restgpr_31_x to the vdso for gettimeofday, but the work on getrandom shows that we will need more of those functions. Remove _restgpr_31_x and link in crtsavres.o so that we get all save/restore functions when optimising the kernel for size. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	powerpc/vdso: Fix VDSO data access when running in a non-root time namespace	Christophe Leroy	4	-3/+20
	When running in a non-root time namespace, the global VDSO data page is replaced by a dedicated namespace data page and the global data page is mapped next to it. Detailed explanations can be found at commit 660fd04f9317 ("lib/vdso: Prepare for time namespace support"). When it happens, __kernel_get_syscall_map and __kernel_get_tbfreq and __kernel_sync_dicache don't work anymore because they read 0 instead of the data they need. To address that, clock_mode has to be read. When it is set to VDSO_CLOCKMODE_TIMENS, it means it is a dedicated namespace data page and the global data is located on the following page. Add a macro called get_realdatapage which reads clock_mode and add PAGE_SIZE to the pointer provided by get_datapage macro when clock_mode is equal to VDSO_CLOCKMODE_TIMENS. Use this new macro instead of get_datapage macro except for time functions as they handle it internally. Fixes: 74205b3fc2ef ("powerpc/vdso: Add support for time namespaces") Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Closes: https://lore.kernel.org/all/ZtnYqZI-nrsNslwy@zx2c4.com/ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	arm64: vDSO: Wire up getrandom() vDSO implementation	Adhemerval Zanella	9	-15/+279
	Hook up the generic vDSO implementation to the aarch64 vDSO data page. The _vdso_rng_data required data is placed within the _vdso_data vvar page, by using a offset larger than the vdso_data. The vDSO function requires a ChaCha20 implementation that does not write to the stack, and that can do an entire ChaCha20 permutation. The one provided uses NEON on the permute operation, with a fallback to the syscall for chips that do not support AdvSIMD. This also passes the vdso_test_chacha test along with vdso_test_getrandom. The vdso_test_getrandom bench-single result on Neoverse-N1 shows: vdso: 25000000 times in 0.783884250 seconds libc: 25000000 times in 8.780275399 seconds syscall: 25000000 times in 8.786581518 seconds A small fixup to arch/arm64/include/asm/mman.h was required to avoid pulling kernel code into the vDSO, similar to what's already done in arch/arm64/include/asm/rwonce.h. Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	arm64: alternative: make alternative_has_cap_likely() VDSO compatible	Mark Rutland	1	-0/+4
	Currently alternative_has_cap_unlikely() can be used in VDSO code, but alternative_has_cap_likely() cannot as it references alt_cb_patch_nops, which is not available when linking the VDSO. This is unfortunate as it would be useful to have alternative_has_cap_likely() available in VDSO code. The use of alt_cb_patch_nops was added in commit: d926079f17bf8aa4 ("arm64: alternatives: add shared NOP callback") ... as removing duplicate NOPs within the kernel Image saved areasonable amount of space. Given the VDSO code will have nowhere near as many alternative branches as the main kernel image, this isn't much of a concern, and a few extra nops isn't a massive problem. Change alternative_has_cap_likely() to only use alt_cb_patch_nops for the main kernel image, and allow duplicate NOPs in VDSO code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	LoongArch: vDSO: Wire up getrandom() vDSO implementation	Xi Ruoyao	9	-1/+314
	Hook up the generic vDSO implementation to the LoongArch vDSO data page by providing the required __arch_chacha20_blocks_nostack, __arch_get_k_vdso_rng_data, and getrandom_syscall implementations. Also wire up the selftests. Signed-off-by: Xi Ruoyao <xry111@xry111.site> Acked-by: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	random: vDSO: add a __vdso_getrandom prototype for all architectures	Xi Ruoyao	1	-2/+0
	Without a prototype, we'll have to add a prototype for each architecture implementing vDSO getrandom. As most architectures will likely have the vDSO getrandom implemented in a near future, and we'd like to keep the declarations compatible everywhere (to ease the libc implementor work), we should really just have one copy of the prototype. This also is what's already done inside of include/vdso/gettime.h for those vDSO functions, so this continues that convention. Suggested-by: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: Xi Ruoyao <xry111@xry111.site> Acked-by: Huacai Chen <chenhuacai@kernel.org> [Jason: rewrite docbook comment for prototype.] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	random: vDSO: minimize and simplify header includes	Christophe Leroy	1	-0/+1
	Depending on the architecture, building a 32-bit vDSO on a 64-bit kernel is problematic when some system headers are included. Minimise the amount of headers by moving needed items, such as __{get,put}_unaligned_t, into dedicated common headers and in general use more specific headers, similar to what was done in commit 8165b57bca21 ("linux/const.h: Extract common header for vDSO") and commit 8c59ab839f52 ("lib/vdso: Enable common headers"). On some architectures this results in missing PAGE_SIZE, as was described by commit 8b3843ae3634 ("vdso/datapage: Quick fix - use asm/page-def.h for ARM64"), so define this if necessary, in the same way as done prior by commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in vdso/datapage.h"). Removing linux/time64.h leads to missing 'struct timespec64' in x86's asm/pvclock.h. Add a forward declaration of that struct in that file. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	random: vDSO: add __arch_get_k_vdso_rng_data() helper for data page access	Christophe Leroy	2	-3/+10
	_vdso_data is specific to x86 and __arch_get_k_vdso_data() is provided so that all architectures can provide the requested pointer. Do the same with _vdso_rng_data, provide __arch_get_k_vdso_rng_data() and don't use x86 _vdso_rng_data directly. Until now vdso/vsyscall.h was only included by time/vsyscall.c but now it will also be included in char/random.c, leading to a duplicate declaration of _vdso_data and _vdso_rng_data. To fix this issue, move the declaration in a C file. vma.c looks like the most appropriate candidate. We don't need to replace the definitions in vsyscall.h by declarations as declarations are already in asm/vvar.h. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	random: vDSO: move prototype of arch chacha function to vdso/getrandom.h	Jason A. Donenfeld	1	-13/+0
	Having the prototype for __arch_chacha20_blocks_nostack in arch/x86/include/asm/vdso/getrandom.h meant that the prototype and large doc comment were cloned by every architecture, which has been causing unnecessary churn. Instead move it into include/vdso/getrandom.h, where it can be shared by all archs implementing it. As a side bonus, this then lets us use that prototype in the vdso_test_chacha self test, to ensure that it matches the source, and indeed doing so turned up some inconsistencies, which are rectified here. Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13	xtensa: Emulate one-byte cmpxchg	Paul E. McKenney	2	-0/+3
	Use the new cmpxchg_emu_u8() to emulate one-byte cmpxchg() on xtensa. [ paulmck: Apply kernel test robot feedback. ] [ paulmck: Drop two-byte support per Arnd Bergmann feedback. ] [ Apply Geert Uytterhoeven feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Yujie Liu <yujie.liu@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
2024-09-13	sh: Emulate one-byte cmpxchg	Paul E. McKenney	2	-0/+4
	Use the new cmpxchg_emu_u8() to emulate one-byte cmpxchg() on sh. [ paulmck: Drop two-byte support per Arnd Bergmann feedback. ] [ paulmck: Apply feedback from Naresh Kamboju. ] [ Apply Geert Uytterhoeven feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: <linux-sh@vger.kernel.org> Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2024-09-13	ARC: Emulate one-byte cmpxchg	Paul E. McKenney	2	-2/+5
	Use the new cmpxchg_emu_u8() to emulate one-byte cmpxchg() on arc. [ paulmck: Drop two-byte support per Arnd Bergmann feedback. ] [ paulmck: Apply feedback from Naresh Kamboju. ] [ paulmck: Apply kernel test robot feedback. ] [ paulmck: Apply feedback from Vineet Gupta. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: <linux-snps-arc@lists.infradead.org> Acked-by: Vineet Gupta <vgupta@kernel.org>
2024-09-13	crypto: mips/crc32 - Clean up useless assignment operations	WangYuli	1	-33/+37
	When entering the "len & sizeof(u32)" branch, len must be less than 8. So after one operation, len must be less than 4. At this time, "len -= sizeof(u32)" is not necessary for 64-bit CPUs. After that, replace `while' loops with equivalent `for' to make the code structure a little bit better by the way. Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://lore.kernel.org/all/alpine.DEB.2.21.2406281713040.43454@angie.orcam.me.uk/ Suggested-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/all/ZtqZpzMH_qMQqzyc@gondor.apana.org.au/ Signed-off-by: Guan Wentao <guanwentao@uniontech.com> Signed-off-by: WangYuli <wangyuli@uniontech.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-09-12	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	30	-137/+208
	Cross-merge networking fixes after downstream PR. No conflicts (sort of) and no adjacent changes. This merge reverts commit b3c9e65eb227 ("net: hsr: remove seqnr_lock") from net, as it was superseded by commit 430d67bdcb04 ("net: hsr: Use the seqnr lock for frames received via interlink port.") in net-next. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-13	cfi: add CONFIG_CFI_ICALL_NORMALIZE_INTEGERS	Alice Ryhl	1	-0/+16
	Introduce a Kconfig option for enabling the experimental option to normalize integer types. This ensures that integer types of the same size and signedness are considered compatible by the Control Flow Integrity sanitizer. The security impact of this flag is minimal. When Sami Tolvanen looked into it, he found that integer normalization reduced the number of unique type hashes in the kernel by ~1%, which is acceptable. This option exists for compatibility with Rust, as C and Rust do not have the same set of integer types. There are cases where C has two different integer types of the same size and signedness, but Rust only has one integer type of that size and signedness. When Rust calls into C functions using such types in their signature, this results in CFI failures. One example is 'unsigned long long' and 'unsigned long' which are both 64-bit on LP64 targets, so on those targets this flag will give both types the same CFI tag. This flag changes the ABI heavily. It is not applied automatically when CONFIG_RUST is turned on to make sure that the CONFIG_RUST option does not change the ABI of C code. For example, some build may need to make other changes atomically with toggling this flag. Having it be a separate option makes it possible to first turn on normalized integer tags, and then later turn on CONFIG_RUST. Similarly, when turning on CONFIG_RUST in a build, you may need a few attempts where the RUST=y commit gets reverted a few times. It is inconvenient if reverting RUST=y also requires reverting the changes you made to support normalized integer tags. To avoid having this flag impact builds that don't care about this, the next patch in this series will make CONFIG_RUST turn on this option using `select` rather than `depends on`. Signed-off-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Gatlin Newhouse <gatlin.newhouse@gmail.com> Acked-by: Kees Cook <kees@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240801-kcfi-v2-1-c93caed3d121@google.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-13	rust: support for shadow call stack sanitizer	Alice Ryhl	3	-2/+24
	Add all of the flags that are needed to support the shadow call stack (SCS) sanitizer with Rust, and updates Kconfig to allow only configurations that work. The -Zfixed-x18 flag is required to use SCS on arm64, and requires rustc version 1.80.0 or greater. This restriction is reflected in Kconfig. When CONFIG_DYNAMIC_SCS is enabled, the build will be configured to include unwind tables in the build artifacts. Dynamic SCS uses the unwind tables at boot to find all places that need to be patched. The -Cforce-unwind-tables=y flag ensures that unwind tables are available for Rust code. In non-dynamic mode, the -Zsanitizer=shadow-call-stack flag is what enables the SCS sanitizer. Using this flag requires rustc version 1.82.0 or greater on the targets used by Rust in the kernel. This restriction is reflected in Kconfig. It is possible to avoid the requirement of rustc 1.80.0 by using -Ctarget-feature=+reserve-x18 instead of -Zfixed-x18. However, this flag emits a warning during the build, so this patch does not add support for using it and instead requires 1.80.0 or greater. The dependency is placed on `select HAVE_RUST` to avoid a situation where enabling Rust silently turns off the sanitizer. Instead, turning on the sanitizer results in Rust being disabled. We generally do not want changes to CONFIG_RUST to result in any mitigations being changed or turned off. At the time of writing, rustc 1.82.0 only exists via the nightly release channel. There is a chance that the -Zsanitizer=shadow-call-stack flag will end up needing 1.83.0 instead, but I think it is small. Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20240829-shadow-call-stack-v7-1-2f62a4432abf@google.com [ Fixed indentation using spaces. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-12	Merge tag 'riscv-for-linus-6.11-rc8' of ↵	Linus Torvalds	1	-6/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - Two fixes for smp_processor_id() calls in preemptible sections: one if the perf driver, and one in the fence.i prctl. * tag 'riscv-for-linus-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF drivers: perf: Fix smp_processor_id() use in preemptible code
2024-09-12	um: fix time-travel syscall scheduling hack	Johannes Berg	1	-14/+20
	The schedule() call there really never did anything at least since the introduction of the EEVDF scheduler, but now I found a case where we permanently hang in a loop of -ERESTARTNOINTR (due to locking.) Work around it by making any syscalls with error return take time (and then schedule after) so we cannot hang in such a loop forever. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-09-12	um: Remove outdated asm/sysrq.h header	Tiwei Bie	4	-11/+0
	This header no longer serves a purpose after show_trace was removed by commit 9d1ee8ce92e1 ("um: Rewrite show_stack()"). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-09-12	um: Remove the declaration of user_thread function	Tiwei Bie	1	-1/+0
	This function has never been defined since its declaration was introduced by commit 1da177e4c3f4 ("Linux-2.6.12-rc2"). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-09-12	um: Remove the call to SUBARCH_EXECVE1 macro	Tiwei Bie	1	-3/+0
	This macro has never been defined by any supported sub-architectures in tree since it was introduced by commit 1d3468a6643a ("[PATCH uml: move _kern.c files"). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-09-12	um: Remove unused mm_fd field from mm_id	Tiwei Bie	6	-14/+11
	It's no longer used since the removal of the SKAS3/4 support. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-09-12	um: Remove unused fields from thread_struct	Tiwei Bie	3	-21/+11
	These fields are no longer used since the removal of tt mode. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-09-12	um: Remove the redundant newpage check in update_pte_range	Tiwei Bie	1	-10/+6
	The two checks have been identical since commit ef714f15027c ("um: remove force_flush_all from fork_handler"). And the inner one isn't necessary anymore. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-09-12	um: Remove unused kpte_clear_flush macro	Tiwei Bie	1	-7/+0
	This macro has no users, and __flush_tlb_one doesn't exist either. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>