summaryrefslogtreecommitdiff
path: root/arch/x86/entry/vdso
AgeCommit message (Collapse)AuthorFilesLines
2024-07-19x86: vdso: Wire up getrandom() vDSO implementationJason A. Donenfeld4-1/+199
Hook up the generic vDSO implementation to the x86 vDSO data page. Since the existing vDSO infrastructure is heavily based on the timekeeping functionality, which works over arrays of bases, a new macro is introduced for vvars that are not arrays. The vDSO function requires a ChaCha20 implementation that does not write to the stack, yet can still do an entire ChaCha20 permutation, so provide this using SSE2, since this is userland code that must work on all x86-64 processors. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Samuel Neves <sneves@dei.uc.pt> # for vgetrandom-chacha.S Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-05-14Makefile: remove redundant tool coverage variablesMasahiro Yamada1-26/+0
Now Kbuild provides reasonable defaults for objtool, sanitizers, and profilers. Remove redundant variables. Note: This commit changes the coverage for some objects: - include arch/mips/vdso/vdso-image.o into UBSAN, GCOV, KCOV - include arch/sparc/vdso/vdso-image-*.o into UBSAN - include arch/sparc/vdso/vma.o into UBSAN - include arch/x86/entry/vdso/extable.o into KASAN, KCSAN, UBSAN, GCOV, KCOV - include arch/x86/entry/vdso/vdso-image-*.o into KASAN, KCSAN, UBSAN, GCOV, KCOV - include arch/x86/entry/vdso/vdso32-setup.o into KASAN, KCSAN, UBSAN, GCOV, KCOV - include arch/x86/entry/vdso/vma.o into GCOV, KCOV - include arch/x86/um/vdso/vma.o into KASAN, GCOV, KCOV I believe these are positive effects because all of them are kernel space objects. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Roberto Sassu <roberto.sassu@huawei.com>
2024-05-10kbuild: use $(src) instead of $(srctree)/$(src) for source directoryMasahiro Yamada1-1/+1
Kbuild conventionally uses $(obj)/ for generated files, and $(src)/ for checked-in source files. It is merely a convention without any functional difference. In fact, $(obj) and $(src) are exactly the same, as defined in scripts/Makefile.build: src := $(obj) When the kernel is built in a separate output directory, $(src) does not accurately reflect the source directory location. While Kbuild resolves this discrepancy by specifying VPATH=$(srctree) to search for source files, it does not cover all cases. For example, when adding a header search path for local headers, -I$(srctree)/$(src) is typically passed to the compiler. This introduces inconsistency between upstream and downstream Makefiles because $(src) is used instead of $(srctree)/$(src) for the latter. To address this inconsistency, this commit changes the semantics of $(src) so that it always points to the directory in the source tree. Going forward, the variables used in Makefiles will have the following meanings: $(obj) - directory in the object tree $(src) - directory in the source tree (changed by this commit) $(objtree) - the top of the kernel object tree $(srctree) - the top of the kernel source tree Consequently, $(srctree)/$(src) in upstream Makefiles need to be replaced with $(src). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-03-26x86/vdso: Fix rethunk patching for vdso-image-x32.o tooBorislav Petkov (AMD)1-0/+1
In a similar fashion to b388e57d4628 ("x86/vdso: Fix rethunk patching for vdso-image-{32,64}.o") annotate vdso-image-x32.o too for objtool so that it gets annotated properly and the unused return thunk warning doesn't fire. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202403251454.23df6278-lkp@intel.com Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/202403251454.23df6278-lkp@intel.com
2024-03-21Merge tag 'kbuild-v6.9' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Generate a list of built DTB files (arch/*/boot/dts/dtbs-list) - Use more threads when building Debian packages in parallel - Fix warnings shown during the RPM kernel package uninstallation - Change OBJECT_FILES_NON_STANDARD_*.o etc. to take a relative path to Makefile - Support GCC's -fmin-function-alignment flag - Fix a null pointer dereference bug in modpost - Add the DTB support to the RPM package - Various fixes and cleanups in Kconfig * tag 'kbuild-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (67 commits) kconfig: tests: test dependency after shuffling choices kconfig: tests: add a test for randconfig with dependent choices kconfig: tests: support KCONFIG_SEED for the randconfig runner kbuild: rpm-pkg: add dtb files in kernel rpm kconfig: remove unneeded menu_is_visible() call in conf_write_defconfig() kconfig: check prompt for choice while parsing kconfig: lxdialog: remove unused dialog colors kconfig: lxdialog: fix button color for blackbg theme modpost: fix null pointer dereference kbuild: remove GCC's default -Wpacked-bitfield-compat flag kbuild: unexport abs_srctree and abs_objtree kbuild: Move -Wenum-{compare-conditional,enum-conversion} into W=1 kconfig: remove named choice support kconfig: use linked list in get_symbol_str() to iterate over menus kconfig: link menus to a symbol kbuild: fix inconsistent indentation in top Makefile kbuild: Use -fmin-function-alignment when available alpha: merge two entries for CONFIG_ALPHA_GAMMA alpha: merge two entries for CONFIG_ALPHA_EV4 kbuild: change DTC_FLAGS_<basetarget>.o to take the path relative to $(obj) ...
2024-02-27x86/vdso: Move vDSO to mmap regionDaniel Micay1-55/+2
The vDSO (and its initial randomization) was introduced in commit 2aae950b21e4 ("x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu"), but had very low entropy. The entropy was improved in commit 394f56fe4801 ("x86_64, vdso: Fix the vdso address randomization algorithm"), but there is still improvement to be made. In principle there should not be executable code at a low entropy offset from the stack, since the stack and executable code having separate randomization is part of what makes ASLR stronger. Remove the only executable code near the stack region and give the vDSO the same randomized base as other mmap mappings including the linker and other shared objects. This results in higher entropy being provided and there's little to no advantage in separating this from the existing executable code there. This is already how other architectures like arm64 handle the vDSO. As an side, while it's sensible for userspace to reserve the initial mmap base as a region for executable code with a random gap for other mmap allocations, along with providing randomization within that region, there isn't much the kernel can do to help due to how dynamic linkers load the shared objects. This was extracted from the PaX RANDMMAP feature. [kees: updated commit log with historical details and other tweaks] Signed-off-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Closes: https://github.com/KSPP/linux/issues/280 Link: https://lore.kernel.org/r/20240210091827.work.233-kees@kernel.org
2024-02-23kbuild: change tool coverage variables to take the path relative to $(obj)Masahiro Yamada1-0/+2
Commit 54b8ae66ae1a ("kbuild: change *FLAGS_<basetarget>.o to take the path relative to $(obj)") changed the syntax of per-file compiler flags. The situation is the same for the following variables: OBJECT_FILES_NON_STANDARD_<basetarget>.o GCOV_PROFILE_<basetarget>.o KASAN_SANITIZE_<basetarget>.o KMSAN_SANITIZE_<basetarget>.o KMSAN_ENABLE_CHECKS_<basetarget>.o UBSAN_SANITIZE_<basetarget>.o KCOV_INSTRUMENT_<basetarget>.o KCSAN_SANITIZE_<basetarget>.o KCSAN_INSTRUMENT_BARRIERS_<basetarget>.o The <basetarget> is the filename of the target with its directory and suffix stripped. This syntax comes into a trouble when two files with the same basename appear in one Makefile, for example: obj-y += dir1/foo.o obj-y += dir2/foo.o OBJECT_FILES_NON_STANDARD_foo.o := y OBJECT_FILES_NON_STANDARD_foo.o is applied to both dir1/foo.o and dir2/foo.o. This syntax is not flexbile enough to handle cases where one of them is a standard object, but the other is not. It is more sensible to use the relative path to the Makefile, like this: obj-y += dir1/foo.o OBJECT_FILES_NON_STANDARD_dir1/foo.o := y obj-y += dir2/foo.o OBJECT_FILES_NON_STANDARD_dir2/foo.o := y To maintain the current behavior, I made adjustments to the following two Makefiles: - arch/x86/entry/vdso/Makefile, which compiles vclock_gettime.o, vgetcpu.o, and their vdso32 variants. - arch/x86/kvm/Makefile, which compiles vmx/vmenter.o and svm/vmenter.o Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Acked-by: Sean Christopherson <seanjc@google.com>
2024-02-22x86/vdso/kbuild: Group non-standard build attributes and primary object file ↵Ingo Molnar1-15/+15
rules together The fresh changes to the vDSO Makefile in: 289d0a475c3e ("x86/vdso: Use CONFIG_COMPAT_32 to specify vdso32") 329b77b59f83 ("x86/vdso: Simplify obj-y addition") Conflicted with a pending change in: b388e57d4628e ("x86/vdso: Fix rethunk patching for vdso-image-{32,64}.o") Which was resolved in a simple fasion in this merge commit: f14df823a61e ("Merge branch 'x86/vdso' into x86/core, to resolve conflict and to prepare for dependent changes") ... but all these changes make me look and notice a bit of historic baggage left in the Makefile: - Disordered build rules where non-standard build attributes relating to were placed sometimes several lines after - and sometimes *before* the .o build rules of the object files... Functional but inconsistent. - Inconsistent vertical spacing, stray whitespaces, inconsistent spelling of 'vDSO' over the years, a few spelling mistakes and inconsistent capitalization of comment blocks. Tidy it all up. No functional changes intended. Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-02-22Merge branch 'x86/vdso' into x86/core, to resolve conflict and to prepare ↵Ingo Molnar1-22/+8
for dependent changes Conflicts: arch/x86/entry/vdso/Makefile We also want to change arch/x86/entry/vdso/Makefile in a followup commit, so merge the trees for this. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-02-20x86/vdso: Fix rethunk patching for vdso-image-{32,64}.oJosh Poimboeuf1-3/+6
For CONFIG_RETHUNK kernels, objtool annotates all the function return sites so they can be patched during boot. By design, after apply_returns() is called, all tail-calls to the compiler-generated default return thunk (__x86_return_thunk) should be patched out and replaced with whatever's needed for any mitigations (or lack thereof). The commit 4461438a8405 ("x86/retpoline: Ensure default return thunk isn't used at runtime") adds a runtime check and a WARN_ONCE() if the default return thunk ever gets executed after alternatives have been applied. This warning is a sanity check to make sure objtool and apply_returns() are doing their job. As Nathan reported, that check found something: Unpatched return thunk in use. This should not happen! WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/bugs.c:2856 __warn_thunk+0x27/0x40 RIP: 0010:__warn_thunk+0x27/0x40 Call Trace: <TASK> ? show_regs ? __warn ? __warn_thunk ? report_bug ? console_unlock ? handle_bug ? exc_invalid_op ? asm_exc_invalid_op ? ia32_binfmt_init ? __warn_thunk warn_thunk_thunk do_one_initcall kernel_init_freeable ? __pfx_kernel_init kernel_init ret_from_fork ? __pfx_kernel_init ret_from_fork_asm </TASK> Boris debugged to find that the unpatched return site was in init_vdso_image_64(), and its translation unit wasn't being analyzed by objtool, so it never got annotated. So it got ignored by apply_returns(). This is only a minor issue, as this function is only called during boot. Still, objtool needs full visibility to the kernel. Fix it by enabling objtool on vdso-image-{32,64}.o. Note this problem can only be seen with !CONFIG_X86_KERNEL_IBT, as that requires objtool to run individually on all translation units rather on vmlinux.o. [ bp: Massage commit message. ] Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240215032049.GA3944823@dev-arch.thelio-3990X
2024-02-14Merge branch 'x86/bugs' into x86/core, to pick up pending changes before ↵Ingo Molnar1-2/+2
dependent patches Merge in pending alternatives patching infrastructure changes, before applying more patches. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-02-08x86/vdso: Use CONFIG_COMPAT_32 to specify vdso32Masahiro Yamada1-2/+1
In arch/x86/Kconfig, COMPAT_32 is defined as (IA32_EMULATION || X86_32). Use it to eliminate redundancy in Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231121235701.239606-5-masahiroy@kernel.org
2024-02-08x86/vdso: Use $(addprefix ) instead of $(foreach )Masahiro Yamada1-3/+3
$(addprefix ) is slightly shorter and more intuitive. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231121235701.239606-4-masahiroy@kernel.org
2024-02-08x86/vdso: Simplify obj-y additionMasahiro Yamada1-12/+4
Add objects to obj-y in a more straightforward way. CONFIG_X86_32 and CONFIG_IA32_EMULATION are not enabled simultaneously, but even if they are, Kbuild graciously deduplicates obj-y entries. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231121235701.239606-3-masahiroy@kernel.org
2024-02-08x86/vdso: Consolidate targets and clean-filesMasahiro Yamada1-6/+1
'targets' and 'clean-files' do not need to list the same files because the files listed in 'targets' are cleaned up. Refactor the code. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231121235701.239606-2-masahiroy@kernel.org
2024-01-10x86/bugs: Rename CONFIG_RETPOLINE => CONFIG_MITIGATION_RETPOLINEBreno Leitao1-2/+2
Step 5/10 of the namespace unification of CPU mitigations related Kconfig options. [ mingo: Converted a few more uses in comments/messages as well. ] Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ariel Miculas <amiculas@cisco.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20231121160740.1249350-6-leitao@debian.org
2023-11-23arch: vdso: consolidate gettime prototypesArnd Bergmann1-9/+1
The VDSO functions are defined as globals in the kernel sources but intended to be called from userspace, so there is no need to declare them in a kernel side header. Without a prototype, this now causes warnings such as arch/mips/vdso/vgettimeofday.c:14:5: error: no previous prototype for '__vdso_clock_gettime' [-Werror=missing-prototypes] arch/mips/vdso/vgettimeofday.c:28:5: error: no previous prototype for '__vdso_gettimeofday' [-Werror=missing-prototypes] arch/mips/vdso/vgettimeofday.c:36:5: error: no previous prototype for '__vdso_clock_getres' [-Werror=missing-prototypes] arch/mips/vdso/vgettimeofday.c:42:5: error: no previous prototype for '__vdso_clock_gettime64' [-Werror=missing-prototypes] arch/sparc/vdso/vclock_gettime.c:254:1: error: no previous prototype for '__vdso_clock_gettime' [-Werror=missing-prototypes] arch/sparc/vdso/vclock_gettime.c:282:1: error: no previous prototype for '__vdso_clock_gettime_stick' [-Werror=missing-prototypes] arch/sparc/vdso/vclock_gettime.c:307:1: error: no previous prototype for '__vdso_gettimeofday' [-Werror=missing-prototypes] arch/sparc/vdso/vclock_gettime.c:343:1: error: no previous prototype for '__vdso_gettimeofday_stick' [-Werror=missing-prototypes] Most architectures have already added workarounds for these by adding declarations somewhere, but since these are all compatible, we should really just have one copy, with an #ifdef check for the 32-bit vs 64-bit variant and use that everywhere. Unfortunately, the sparc an um versions are currently incompatible since they never added support for __vdso_clock_gettime64() in 32-bit userland. For the moment, I'm leaving this one out, as I can't easily test it and it requires a larger rework. Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-11-04Merge tag 'kbuild-v6.7' of ↵Linus Torvalds1-27/+0
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Implement the binary search in modpost for faster symbol lookup - Respect HOSTCC when linking host programs written in Rust - Change the binrpm-pkg target to generate kernel-devel RPM package - Fix endianness issues for tee and ishtp MODULE_DEVICE_TABLE - Unify vdso_install rules - Remove unused __memexit* annotations - Eliminate stale whitelisting for __devinit/__devexit from modpost - Enable dummy-tools to handle the -fpatchable-function-entry flag - Add 'userldlibs' syntax * tag 'kbuild-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (30 commits) kbuild: support 'userldlibs' syntax kbuild: dummy-tools: pretend we understand -fpatchable-function-entry kbuild: Correct missing architecture-specific hyphens modpost: squash ALL_{INIT,EXIT}_TEXT_SECTIONS to ALL_TEXT_SECTIONS modpost: merge sectioncheck table entries regarding init/exit sections modpost: use ALL_INIT_SECTIONS for the section check from DATA_SECTIONS modpost: disallow the combination of EXPORT_SYMBOL and __meminit* modpost: remove EXIT_SECTIONS macro modpost: remove MEM_INIT_SECTIONS macro modpost: remove more symbol patterns from the section check whitelist modpost: disallow *driver to reference .meminit* sections linux/init: remove __memexit* annotations modpost: remove ALL_EXIT_DATA_SECTIONS macro kbuild: simplify cmd_ld_multi_m kbuild: avoid too many execution of scripts/pahole-flags.sh kbuild: remove ARCH_POSTLINK from module builds kbuild: unify no-compiler-targets and no-sync-config-targets kbuild: unify vdso_install rules docs: kbuild: add INSTALL_DTBS_PATH UML: remove unused cmd_vdso_install ...
2023-11-01Merge tag 'sysctl-6.7-rc1' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull sysctl updates from Luis Chamberlain: "To help make the move of sysctls out of kernel/sysctl.c not incur a size penalty sysctl has been changed to allow us to not require the sentinel, the final empty element on the sysctl array. Joel Granados has been doing all this work. On the v6.6 kernel we got the major infrastructure changes required to support this. For v6.7-rc1 we have all arch/ and drivers/ modified to remove the sentinel. Both arch and driver changes have been on linux-next for a bit less than a month. It is worth re-iterating the value: - this helps reduce the overall build time size of the kernel and run time memory consumed by the kernel by about ~64 bytes per array - the extra 64-byte penalty is no longer inncurred now when we move sysctls out from kernel/sysctl.c to their own files For v6.8-rc1 expect removal of all the sentinels and also then the unneeded check for procname == NULL. The last two patches are fixes recently merged by Krister Johansen which allow us again to use softlockup_panic early on boot. This used to work but the alias work broke it. This is useful for folks who want to detect softlockups super early rather than wait and spend money on cloud solutions with nothing but an eventual hung kernel. Although this hadn't gone through linux-next it's also a stable fix, so we might as well roll through the fixes now" * tag 'sysctl-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (23 commits) watchdog: move softlockup_panic back to early_param proc: sysctl: prevent aliased sysctls from getting passed to init intel drm: Remove now superfluous sentinel element from ctl_table array Drivers: hv: Remove now superfluous sentinel element from ctl_table array raid: Remove now superfluous sentinel element from ctl_table array fw loader: Remove the now superfluous sentinel element from ctl_table array sgi-xp: Remove the now superfluous sentinel element from ctl_table array vrf: Remove the now superfluous sentinel element from ctl_table array char-misc: Remove the now superfluous sentinel element from ctl_table array infiniband: Remove the now superfluous sentinel element from ctl_table array macintosh: Remove the now superfluous sentinel element from ctl_table array parport: Remove the now superfluous sentinel element from ctl_table array scsi: Remove now superfluous sentinel element from ctl_table array tty: Remove now superfluous sentinel element from ctl_table array xen: Remove now superfluous sentinel element from ctl_table array hpet: Remove now superfluous sentinel element from ctl_table array c-sky: Remove now superfluous sentinel element from ctl_talbe array powerpc: Remove now superfluous sentinel element from ctl_table arrays riscv: Remove now superfluous sentinel element from ctl_table array x86/vdso: Remove now superfluous sentinel element from ctl_table array ...
2023-10-30Merge tag 'x86-headers-2023-10-28' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 header file cleanup from Ingo Molnar: "Replace <asm/export.h> uses with <linux/export.h> and then remove <asm/export.h>" * tag 'x86-headers-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/headers: Remove <asm/export.h> x86/headers: Replace #include <asm/export.h> with #include <linux/export.h> x86/headers: Remove unnecessary #include <asm/export.h>
2023-10-28kbuild: unify vdso_install rulesMasahiro Yamada1-27/+0
Currently, there is no standard implementation for vdso_install, leading to various issues: 1. Code duplication Many architectures duplicate similar code just for copying files to the install destination. Some architectures (arm, sparc, x86) create build-id symlinks, introducing more code duplication. 2. Unintended updates of in-tree build artifacts The vdso_install rule depends on the vdso files to install. It may update in-tree build artifacts. This can be problematic, as explained in commit 19514fc665ff ("arm, kbuild: make "make install" not depend on vmlinux"). 3. Broken code in some architectures Makefile code is often copied from one architecture to another without proper adaptation. 'make vdso_install' for parisc does not work. 'make vdso_install' for s390 installs vdso64, but not vdso32. To address these problems, this commit introduces a generic vdso_install rule. Architectures that support vdso_install need to define vdso-install-y in arch/*/Makefile. vdso-install-y lists the files to install. For example, arch/x86/Makefile looks like this: vdso-install-$(CONFIG_X86_64) += arch/x86/entry/vdso/vdso64.so.dbg vdso-install-$(CONFIG_X86_X32_ABI) += arch/x86/entry/vdso/vdsox32.so.dbg vdso-install-$(CONFIG_X86_32) += arch/x86/entry/vdso/vdso32.so.dbg vdso-install-$(CONFIG_IA32_EMULATION) += arch/x86/entry/vdso/vdso32.so.dbg These files will be installed to $(MODLIB)/vdso/ with the .dbg suffix, if exists, stripped away. vdso-install-y can optionally take the second field after the colon separator. This is needed because some architectures install a vdso file as a different base name. The following is a snippet from arch/arm64/Makefile. vdso-install-$(CONFIG_COMPAT_VDSO) += arch/arm64/kernel/vdso32/vdso.so.dbg:vdso32.so This will rename vdso.so.dbg to vdso32.so during installation. If such architectures change their implementation so that the base names match, this workaround will go away. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Sven Schnelle <svens@linux.ibm.com> # s390 Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Reviewed-by: Guo Ren <guoren@kernel.org> Acked-by: Helge Deller <deller@gmx.de> # parisc Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-10-20x86/vdso: Run objtool on vdso32-setup.oDavid Kaplan1-1/+2
vdso32-setup.c is part of the main kernel image and should not be excluded from objtool. Objtool is necessary in part for ensuring that returns in this file are correctly patched to the appropriate return thunk at runtime. Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20231010171020.462211-3-david.kaplan@amd.com
2023-10-10x86/vdso: Remove now superfluous sentinel element from ctl_table arrayJoel Granados1-1/+0
This commit comes at the tail end of a greater effort to remove the empty elements at the end of the ctl_table arrays (sentinels) which will reduce the overall build time size of the kernel and run time memory bloat by ~64 bytes per sentinel (further information Link : https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/) Remove sentinel element from abi_table2. This removal is safe because register_sysctl implicitly uses ARRAY_SIZE() in addition to checking for the sentinel. Signed-off-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-10-03x86/headers: Remove unnecessary #include <asm/export.h>Masahiro Yamada1-1/+0
There is no EXPORT_SYMBOL() line there, hence #include <asm/export.h> is unnecessary. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230806145958.380314-1-masahiroy@kernel.org
2023-08-09x86/mm: Fix VDSO and VVAR placement on 5-level paging machinesKirill A. Shutemov1-2/+2
Yingcong has noticed that on the 5-level paging machine, VDSO and VVAR VMAs are placed above the 47-bit border: 8000001a9000-8000001ad000 r--p 00000000 00:00 0 [vvar] 8000001ad000-8000001af000 r-xp 00000000 00:00 0 [vdso] This might confuse users who are not aware of 5-level paging and expect all userspace addresses to be under the 47-bit border. So far problem has only been triggered with ASLR disabled, although it may also occur with ASLR enabled if the layout is randomized in a just right way. The problem happens due to custom placement for the VMAs in the VDSO code: vdso_addr() tries to place them above the stack and checks the result against TASK_SIZE_MAX, which is wrong. TASK_SIZE_MAX is set to the 56-bit border on 5-level paging machines. Use DEFAULT_MAP_WINDOW instead. Fixes: b569bab78d8d ("x86/mm: Prepare to expose larger address space to userspace") Reported-by: Yingcong Wu <yingcong.wu@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20230803151609.22141-1-kirill.shutemov%40linux.intel.com
2023-05-18x86/vdso: Include vdso/processor.hArnd Bergmann1-0/+1
__vdso_getcpu is declared in a header but this is not included before the definition, causing a W=1 warning: arch/x86/entry/vdso/vgetcpu.c:13:1: error: no previous prototype for '__vdso_getcpu' [-Werror=missing-prototypes] arch/x86/entry/vdso/vdso32/../vgetcpu.c:13:1: error: no previous prototype for '__vdso_getcpu' [-Werror=missing-prototypes] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/all/20230516193549.544673-17-arnd%40kernel.org
2023-04-28Merge tag 'x86_cleanups_for_v6.4_rc1' of ↵Linus Torvalds1-10/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: - Unify duplicated __pa() and __va() definitions - Simplify sysctl tables registration - Remove unused symbols - Correct function name in comment * tag 'x86_cleanups_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Centralize __pa()/__va() definitions x86: Simplify one-level sysctl registration for itmt_kern_table x86: Simplify one-level sysctl registration for abi_table2 x86/platform/intel-mid: Remove unused definitions from intel-mid.h x86/uaccess: Remove memcpy_page_flushcache() x86/entry: Change stale function name in comment to error_return()
2023-03-22x86: Simplify one-level sysctl registration for abi_table2Luis Chamberlain1-10/+1
There is no need to declare an extra tables to just create directory, this can be easily be done with a prefix path with register_sysctl(). Simplify this registration. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20230310233248.3965389-2-mcgrof%40kernel.org
2023-03-21vdso: Improve cmd_vdso_check to check all dynamic relocationsFangrui Song1-4/+1
The actual intention is that no dynamic relocation exists in the VDSO. For this the VDSO build validates that the resulting .so file does not have any relocations which are specified via $(ARCH_REL_TYPE_ABS) per architecture, which is fragile as e.g. ARM64 lacks an entry for R_AARCH64_RELATIVE. Aside of that ARCH_REL_TYPE_ABS is a misnomer as it checks for relative relocations too. However, some GNU ld ports produce unneeded R_*_NONE relocation entries. If a port fails to determine the exact .rel[a].dyn size, the trailing zeros become R_*_NONE relocations. E.g. ld's powerpc port recently fixed https://sourceware.org/bugzilla/show_bug.cgi?id=29540). R_*_NONE are generally a no-op in the dynamic loaders. So just ignore them. Remove the ARCH_REL_TYPE_ABS defines and just validate that the resulting .so file does not contain any R_* relocation entries except R_*_NONE. Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> # for aarch64 Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> # for vDSO, aarch64 Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Link: https://lore.kernel.org/r/20230310190750.3323802-1-maskray@google.com
2023-02-23Merge tag 'mm-stable-2023-02-20-13-37' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit. - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing. - Several folioification patch serieses from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work. - SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work. - Kairui Song adds a series ("Clean up and fixes for swap"). - Vernon Yang contributed the series "Clean up and refinement for maple tree". - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim. - David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups". - Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages". - Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction". - Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields". - David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs". - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC". - Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable". - Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)". - Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF". - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve". - Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics". - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction". - Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap". - Christoph Hellwig has removed block_device_operations.rw_page() in ths series "remove ->rw_page". - We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()". - Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions". - Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()" - Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas". - Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP". - SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface". - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series. - Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing". - Arnd Bergmann has some objtool fixups in "objtool warning fixes". * tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits) include/linux/migrate.h: remove unneeded externs mm/memory_hotplug: cleanup return value handing in do_migrate_range() mm/uffd: fix comment in handling pte markers mm: change to return bool for isolate_movable_page() mm: hugetlb: change to return bool for isolate_hugetlb() mm: change to return bool for isolate_lru_page() mm: change to return bool for folio_isolate_lru() objtool: add UACCESS exceptions for __tsan_volatile_read/write kmsan: disable ftrace in kmsan core code kasan: mark addr_has_metadata __always_inline mm: memcontrol: rename memcg_kmem_enabled() sh: initialize max_mapnr m68k/nommu: add missing definition of ARCH_PFN_OFFSET mm: percpu: fix incorrect size in pcpu_obj_full_size() maple_tree: reduce stack usage with gcc-9 and earlier mm: page_alloc: call panic() when memoryless node allocation fails mm: multi-gen LRU: avoid futile retries migrate_pages: move THP/hugetlb migration support check to simplify code migrate_pages: batch flushing TLB migrate_pages: share more code between _unmap and _move ...
2023-02-07x86/vdso: Fake 32bit VDSO build on 64bit compile for vgetcpuSebastian Andrzej Siewior3-26/+27
The 64bit register constrains in __arch_hweight64() cannot be fulfilled in a 32-bit build. The function is only declared but not used within vclock_gettime.c and gcc does not care. LLVM complains and aborts. Reportedly because it validates extended asm even if latter would get compiled out, see https://lore.kernel.org/r/Y%2BJ%2BUQ1vAKr6RHuH@dev-arch.thelio-3990X i.e., a long standing design difference between gcc and LLVM. Move the "fake a 32 bit kernel configuration" bits from vclock_gettime.c into a common header file. Use this from vclock_gettime.c and vgetcpu.c. [ bp: Add background info from Nathan. ] Fixes: 92d33063c081a ("x86/vdso: Provide getcpu for x86-32.") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/Y+IsCWQdXEr8d9Vy@linutronix.de
2023-02-06x86/vdso: Provide getcpu for x86-32.Sebastian Andrzej Siewior4-3/+6
Wire up __vdso_getcpu() for x86-32. The 64bit version is reused with trivial modifications. Contrary to vclock_gettime.c there is no requirement to fake any defines in the case of 32bit VDSO on a 64bit kernel because the GDT entry from which the CPU and node information is read is always the native one. Adopt vdso_getcpu.c by: - removing the unneeded time* header files which lead to compile errors for 32bit. - adding segment.h which provides vdso_read_cpunode() and the defines required by it. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20221125094216.3663444-3-bigeasy@linutronix.de
2023-01-25x86/vdso: Move VDSO image init to vdso2c generated codeBrian Gerst3-24/+10
Generate an init function for each VDSO image, replacing init_vdso() and sysenter_setup(). Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230124184019.26850-1-brgerst@gmail.com
2023-01-18mm: remove zap_page_range and create zap_vma_pagesMike Kravetz1-3/+1
zap_page_range was originally designed to unmap pages within an address range that could span multiple vmas. While working on [1], it was discovered that all callers of zap_page_range pass a range entirely within a single vma. In addition, the mmu notification call within zap_page range does not correctly handle ranges that span multiple vmas. When crossing a vma boundary, a new mmu_notifier_range_init/end call pair with the new vma should be made. Instead of fixing zap_page_range, do the following: - Create a new routine zap_vma_pages() that will remove all pages within the passed vma. Most users of zap_page_range pass the entire vma and can use this new routine. - For callers of zap_page_range not passing the entire vma, instead call zap_page_range_single(). - Remove zap_page_range. [1] https://lore.kernel.org/linux-mm/20221114235507.294320-2-mike.kravetz@oracle.com/ Link: https://lkml.kernel.org/r/20230104002732.232573-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Suggested-by: Peter Xu <peterx@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> [s390] Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-14Merge tag 'x86_core_for_v6.2' of ↵Linus Torvalds1-6/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Borislav Petkov: - Add the call depth tracking mitigation for Retbleed which has been long in the making. It is a lighterweight software-only fix for Skylake-based cores where enabling IBRS is a big hammer and causes a significant performance impact. What it basically does is, it aligns all kernel functions to 16 bytes boundary and adds a 16-byte padding before the function, objtool collects all functions' locations and when the mitigation gets applied, it patches a call accounting thunk which is used to track the call depth of the stack at any time. When that call depth reaches a magical, microarchitecture-specific value for the Return Stack Buffer, the code stuffs that RSB and avoids its underflow which could otherwise lead to the Intel variant of Retbleed. This software-only solution brings a lot of the lost performance back, as benchmarks suggest: https://lore.kernel.org/all/20220915111039.092790446@infradead.org/ That page above also contains a lot more detailed explanation of the whole mechanism - Implement a new control flow integrity scheme called FineIBT which is based on the software kCFI implementation and uses hardware IBT support where present to annotate and track indirect branches using a hash to validate them - Other misc fixes and cleanups * tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits) x86/paravirt: Use common macro for creating simple asm paravirt functions x86/paravirt: Remove clobber bitmask from .parainstructions x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit x86/Kconfig: Enable kernel IBT by default x86,pm: Force out-of-line memcpy() objtool: Fix weak hole vs prefix symbol objtool: Optimize elf_dirty_reloc_sym() x86/cfi: Add boot time hash randomization x86/cfi: Boot time selection of CFI scheme x86/ibt: Implement FineIBT objtool: Add --cfi to generate the .cfi_sites section x86: Add prefix symbols for function padding objtool: Add option to generate prefix symbols objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf objtool: Slice up elf_create_section_symbol() kallsyms: Revert "Take callthunks into account" x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces x86/retpoline: Fix crash printing warning x86/paravirt: Fix a !PARAVIRT build warning ...
2022-12-12Merge tag 'random-6.2-rc1-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: - Replace prandom_u32_max() and various open-coded variants of it, there is now a new family of functions that uses fast rejection sampling to choose properly uniformly random numbers within an interval: get_random_u32_below(ceil) - [0, ceil) get_random_u32_above(floor) - (floor, U32_MAX] get_random_u32_inclusive(floor, ceil) - [floor, ceil] Coccinelle was used to convert all current users of prandom_u32_max(), as well as many open-coded patterns, resulting in improvements throughout the tree. I'll have a "late" 6.1-rc1 pull for you that removes the now unused prandom_u32_max() function, just in case any other trees add a new use case of it that needs to converted. According to linux-next, there may be two trivial cases of prandom_u32_max() reintroductions that are fixable with a 's/.../.../'. So I'll have for you a final conversion patch doing that alongside the removal patch during the second week. This is a treewide change that touches many files throughout. - More consistent use of get_random_canary(). - Updates to comments, documentation, tests, headers, and simplification in configuration. - The arch_get_random*_early() abstraction was only used by arm64 and wasn't entirely useful, so this has been replaced by code that works in all relevant contexts. - The kernel will use and manage random seeds in non-volatile EFI variables, refreshing a variable with a fresh seed when the RNG is initialized. The RNG GUID namespace is then hidden from efivarfs to prevent accidental leakage. These changes are split into random.c infrastructure code used in the EFI subsystem, in this pull request, and related support inside of EFISTUB, in Ard's EFI tree. These are co-dependent for full functionality, but the order of merging doesn't matter. - Part of the infrastructure added for the EFI support is also used for an improvement to the way vsprintf initializes its siphash key, replacing an sleep loop wart. - The hardware RNG framework now always calls its correct random.c input function, add_hwgenerator_randomness(), rather than sometimes going through helpers better suited for other cases. - The add_latent_entropy() function has long been called from the fork handler, but is a no-op when the latent entropy gcc plugin isn't used, which is fine for the purposes of latent entropy. But it was missing out on the cycle