summaryrefslogtreecommitdiff
path: root/arch/s390
AgeCommit message (Collapse)AuthorFilesLines
2024-11-01s390: Initialize psw mask in perf_arch_fetch_caller_regs()Heiko Carstens1-0/+1
[ Upstream commit 223e7fb979fa06934f1595b6ad0ae1d4ead1147f ] Also initialize regs->psw.mask in perf_arch_fetch_caller_regs(). This way user_mode(regs) will return false, like it should. It looks like all current users initialize regs to zero, so that this doesn't fix a bug currently. However it is better to not rely on callers to do this. Fixes: 914d52e46490 ("s390: implement perf_arch_fetch_caller_regs") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-01s390/pci: Handle PCI error codes other than 0x3aNiklas Schnelle1-8/+9
[ Upstream commit 3cd03ea57e8e16cc78cc357d5e9f26078426f236 ] The Linux implementation of PCI error recovery for s390 was based on the understanding that firmware error recovery is a two step process with an optional initial error event to indicate the cause of the error if known followed by either error event 0x3A (Success) or 0x3B (Failure) to indicate whether firmware was able to recover. While this has been the case in testing and the error cases seen in the wild it turns out this is not correct. Instead firmware only generates 0x3A for some error and service scenarios and expects the OS to perform recovery for all PCI events codes except for those indicating permanent error (0x3B, 0x40) and those indicating errors on the function measurement block (0x2A, 0x2B, 0x2C). Align Linux behavior with these expectations. Fixes: 4cdf2f4e24ff ("s390/pci: implement minimal PCI error recovery") Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-22KVM: s390: Change virtual to physical address access in diag 0x258 handlerMichael Mueller1-1/+1
commit cad4b3d4ab1f062708fff33f44d246853f51e966 upstream. The parameters for the diag 0x258 are real addresses, not virtual, but KVM was using them as virtual addresses. This only happened to work, since the Linux kernel as a guest used to have a 1:1 mapping for physical vs virtual addresses. Fix KVM so that it correctly uses the addresses as real addresses. Cc: stable@vger.kernel.org Fixes: 8ae04b8f500b ("KVM: s390: Guest's memory access functions get access registers") Suggested-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Signed-off-by: Nico Boehr <nrb@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20240917151904.74314-3-nrb@linux.ibm.com Acked-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-22KVM: s390: gaccess: Check if guest address is in memslotNico Boehr2-6/+12
commit e8061f06185be0a06a73760d6526b8b0feadfe52 upstream. Previously, access_guest_page() did not check whether the given guest address is inside of a memslot. This is not a problem, since kvm_write_guest_page/kvm_read_guest_page return -EFAULT in this case. However, -EFAULT is also returned when copy_to/from_user fails. When emulating a guest instruction, the address being outside a memslot usually means that an addressing exception should be injected into the guest. Failure in copy_to/from_user however indicates that something is wrong in userspace and hence should be handled there. To be able to distinguish these two cases, return PGM_ADDRESSING in access_guest_page() when the guest address is outside guest memory. In access_guest_real(), populate vcpu->arch.pgm.code such that kvm_s390_inject_prog_cond() can be used in the caller for injecting into the guest (if applicable). Since this adds a new return value to access_guest_page(), we need to make sure that other callers are not confused by the new positive return value. There are the following users of access_guest_page(): - access_guest_with_key() does the checking itself (in guest_range_to_gpas()), so this case should never happen. Even if, the handling is set up properly. - access_guest_real() just passes the return code to its callers, which are: - read_guest_real() - see below - write_guest_real() - see below There are the following users of read_guest_real(): - ar_translation() in gaccess.c which already returns PGM_* - setup_apcb10(), setup_apcb00(), setup_apcb11() in vsie.c which always return -EFAULT on read_guest_read() nonzero return - no change - shadow_crycb(), handle_stfle() always present this as validity, this could be handled better but doesn't change current behaviour - no change There are the following users of write_guest_real(): - kvm_s390_store_status_unloaded() always returns -EFAULT on write_guest_real() failure. Fixes: 2293897805c2 ("KVM: s390: add architecture compliant guest access functions") Cc: stable@vger.kernel.org Signed-off-by: Nico Boehr <nrb@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20240917151904.74314-2-nrb@linux.ibm.com Acked-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17fs/proc/kcore.c: allow translation of physical memory addressesAlexander Gordeev1-0/+2
commit 3d5854d75e3187147613130561b58f0b06166172 upstream. When /proc/kcore is read an attempt to read the first two pages results in HW-specific page swap on s390 and another (so called prefix) pages are accessed instead. That leads to a wrong read. Allow architecture-specific translation of memory addresses using kc_xlate_dev_mem_ptr() and kc_unxlate_dev_mem_ptr() callbacks similarily to /dev/mem xlate_dev_mem_ptr() and unxlate_dev_mem_ptr() callbacks. That way an architecture can deal with specific physical memory ranges. Re-use the existing /dev/mem callback implementation on s390, which handles the described prefix pages swapping correctly. For other architectures the default callback is basically NOP. It is expected the condition (vaddr == __va(__pa(vaddr))) always holds true for KCORE_RAM memory type. Link: https://lkml.kernel.org/r/20240930122119.1651546-1-agordeev@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Suggested-by: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17s390/traps: Handle early warnings gracefullyHeiko Carstens1-2/+15
[ Upstream commit 3c4d0ae0671827f4b536cc2d26f8b9c54584ccc5 ] Add missing warning handling to the early program check handler. This way a warning is printed to the console as soon as the early console is setup, and the kernel continues to boot. Before this change a disabled wait psw was loaded instead and the machine was silently stopped without giving an idea about what happened. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17s390/cpum_sf: Remove WARN_ON_ONCE statementsThomas Richter1-8/+4
[ Upstream commit b495e710157606889f2d8bdc62aebf2aa02f67a7 ] Remove WARN_ON_ONCE statements. These have not triggered in the past. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17s390/mm: Add cond_resched() to cmm_alloc/free_pages()Gerald Schaefer1-1/+17
[ Upstream commit 131b8db78558120f58c5dc745ea9655f6b854162 ] Adding/removing large amount of pages at once to/from the CMM balloon can result in rcu_sched stalls or workqueue lockups, because of busy looping w/o cond_resched(). Prevent this by adding a cond_resched(). cmm_free_pages() holds a spin_lock while looping, so it cannot be added directly to the existing loop. Instead, introduce a wrapper function that operates on maximum 256 pages at once, and add it there. Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17s390/facility: Disable compile time optimization for decompressor codeHeiko Carstens1-2/+4
[ Upstream commit 0147addc4fb72a39448b8873d8acdf3a0f29aa65 ] Disable compile time optimizations of test_facility() for the decompressor. The decompressor should not contain any optimized code depending on the architecture level set the kernel image is compiled for to avoid unexpected operation exceptions. Add a __DECOMPRESSOR check to test_facility() to enforce that facilities are always checked during runtime for the decompressor. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-04s390/ftrace: Avoid calling unwinder in ftrace_return_address()Vasily Gorbik2-20/+16
commit a84dd0d8ae24bdc6da341187fc4c1a0adfce2ccc upstream. ftrace_return_address() is called extremely often from performance-critical code paths when debugging features like CONFIG_TRACE_IRQFLAGS are enabled. For example, with debug_defconfig, ftrace selftests on my LPAR currently execute ftrace_return_address() as follows: ftrace_return_address(0) - 0 times (common code uses __builtin_return_address(0) instead) ftrace_return_address(1) - 2,986,805,401 times (with this patch applied) ftrace_return_address(2) - 140 times ftrace_return_address(>2) - 0 times The use of __builtin_return_address(n) was replaced by return_address() with an unwinder call by commit cae74ba8c295 ("s390/ftrace: Use unwinder instead of __builtin_return_address()") because __builtin_return_address(n) simply walks the stack backchain and doesn't check for reaching the stack top. For shallow stacks with fewer than "n" frames, this results in reads at low addresses and random memory accesses. While calling the fully functional unwinder "works", it is very slow for this purpose. Moreover, potentially following stack switches and walking past IRQ context is simply wrong thing to do for ftrace_return_address(). Reimplement return_address() to essentially be __builtin_return_address(n) with checks for reaching the stack top. Since the ftrace_return_address(n) argument is always a constant, keep the implementation in the header, allowing both GCC and Clang to unroll the loop and optimize it to the bare minimum. Fixes: cae74ba8c295 ("s390/ftrace: Use unwinder instead of __builtin_return_address()") Cc: stable@vger.kernel.org Reported-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-04s390/entry: Make early program check handler relocated lowcore awareHeiko Carstens2-7/+11
[ Upstream commit f101b305a7b9513a8042a2cf09018de4ff371af2 ] Add the missing pieces so the early program check handler also works with a relocated lowcore. Right now the result of an early program check in case of a relocated lowcore would be a program check loop. Fixes: 8f1e70adb1a3 ("s390/boot: Add cmdline option to relocate lowcore") Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-04s390/entry: Move early program check handler to entry.SHeiko Carstens3-24/+16
[ Upstream commit f2bb5b97b51ce5425e8f59f899643ce4eadba667 ] Have all program check handlers in one file to make future changes easy. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Stable-dep-of: f101b305a7b9 ("s390/entry: Make early program check handler relocated lowcore aware") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-25Merge tag 's390-6.11-4' of ↵Linus Torvalds8-33/+85
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix KASLR base offset to account for symbol offsets in the vmlinux ELF file, preventing tool breakages like the drgn debugger - Fix potential memory corruption of physmem_info during kernel physical address randomization - Fix potential memory corruption due to overlap between the relocated lowcore and identity mapping by correctly reserving lowcore memory - Fix performance regression and avoid randomizing identity mapping base by default - Fix unnecessary delay of AP bus binding complete uevent to prevent startup lag in KVM guests using AP * tag 's390-6.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/boot: Fix KASLR base offset off by __START_KERNEL bytes s390/boot: Avoid possible physmem_info segment corruption s390/ap: Refine AP bus bindings complete processing s390/mm: Pin identity mapping base to zero s390/mm: Prevent lowcore vs identity mapping overlap
2024-08-22s390/boot: Fix KASLR base offset off by __START_KERNEL bytesAlexander Gordeev6-31/+52
Symbol offsets to the KASLR base do not match symbol address in the vmlinux image. That is the result of setting the KASLR base to the beginning of .text section as result of an optimization. Revert that optimization and allocate virtual memory for the whole kernel image including __START_KERNEL bytes as per the linker script. That allows keeping the semantics of the KASLR base offset in sync with other architectures. Rename __START_KERNEL to TEXT_OFFSET, since it represents the offset of the .text section within the kernel image, rather than a virtual address. Still skip mapping TEXT_OFFSET bytes to save memory on pgtables and provoke exceptions in case an attempt to access this area is made, as no kernel symbol may reside there. In case CONFIG_KASAN is enabled the location counter might exceed the value of TEXT_OFFSET, while the decompressor linker script forcefully resets it to TEXT_OFFSET, which leads to a sections overlap link failure. Use MAX() expression to avoid that. Reported-by: Omar Sandoval <osandov@osandov.com> Closes: https://lore.kernel.org/linux-s390/ZnS8dycxhtXBZVky@telecaster.dhcp.thefacebook.com/ Fixes: 56b1069c40c7 ("s390/boot: Rework deployment of the kernel image") Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-22s390/boot: Avoid possible physmem_info segment corruptionAlexander Gordeev1-2/+2
When physical memory for the kernel image is allocated it does not consider extra memory required for offsetting the image start to match it with the lower 20 bits of KASLR virtual base address. That might lead to kernel access beyond its memory range. Suggested-by: Vasily Gorbik <gor@linux.ibm.com> Fixes: 693d41f7c938 ("s390/mm: Restore mapping of kernel image using large pages") Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-21s390/mm: Pin identity mapping base to zeroAlexander Gordeev2-1/+15
SIE instruction performs faster when the virtual address of SIE block matches the physical one. Pin the identity mapping base to zero for the benefit of SIE and other instructions that have similar performance impact. Still, randomize the base when DEBUG_VM kernel configuration option is enabled. Suggested-by: Vasily Gorbik <gor@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-21s390/mm: Prevent lowcore vs identity mapping overlapAlexander Gordeev1-1/+18
The identity mapping position in virtual memory is randomized together with the kernel mapping. That position can never overlap with the lowcore even when the lowcore is relocated. Prevent overlapping with the lowcore to allow independent positioning of the identity mapping. With the current value of the alternative lowcore address of 0x70000 the overlap could happen in case the identity mapping is placed at zero. This is a prerequisite for uncoupling of randomization base of kernel image and identity mapping in virtual memory. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-07s390/uv: Panic for set and remove shared access UVC errorsClaudio Imbrenda1-1/+4
The return value uv_set_shared() and uv_remove_shared() (which are wrappers around the share() function) is not always checked. The system integrity of a protected guest depends on the Share and Unshare UVCs being successful. This means that any caller that fails to check the return value will compromise the security of the protected guest. No code path that would lead to such violation of the security guarantees is currently exercised, since all the areas that are shared never get unshared during the lifetime of the system. This might change and become an issue in the future. The Share and Unshare UVCs can only fail in case of hypervisor misbehaviour (either a bug or malicious behaviour). In such cases there is no reasonable way forward, and the system needs to panic. This patch replaces the return at the end of the share() function with a panic, to guarantee system integrity. Fixes: 5abb9351dfd9 ("s390/uv: introduce guest side ultravisor code") Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20240801112548.85303-1-imbrenda@linux.ibm.com Message-ID: <20240801112548.85303-1-imbrenda@linux.ibm.com> [frankja@linux.ibm.com: Fixed up patch subject] Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2024-08-06KVM: s390: fix validity interception issue when gisa is switched offMichael Mueller1-1/+6
We might run into a SIE validity if gisa has been disabled either via using kernel parameter "kvm.use_gisa=0" or by setting the related sysfs attribute to N (echo N >/sys/module/kvm/parameters/use_gisa). The validity is caused by an invalid value in the SIE control block's gisa designation. That happens because we pass the uninitialized gisa origin to virt_to_phys() before writing it to the gisa designation. To fix this we return 0 in kvm_s390_get_gisa_desc() if the origin is 0. kvm_s390_get_gisa_desc() is used to determine which gisa designation to set in the SIE control block. A value of 0 in the gisa designation disables gisa usage. The issue surfaces in the host kernel with the following kernel message as soon a new kvm guest start is attemted. kvm: unhandled validity intercept 0x1011 WARNING: CPU: 0 PID: 781237 at arch/s390/kvm/intercept.c:101 kvm_handle_sie_intercept+0x42e/0x4d0 [kvm] Modules linked in: vhost_net tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT xt_tcpudp nft_compat x_tables nf_nat_tftp nf_conntrack_tftp vfio_pci_core irqbypass vhost_vsock vmw_vsock_virtio_transport_common vsock vhost vhost_iotlb kvm nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables sunrpc mlx5_ib ib_uverbs ib_core mlx5_core uvdevice s390_trng eadm_sch vfio_ccw zcrypt_cex4 mdev vfio_iommu_type1 vfio sch_fq_codel drm i2c_core loop drm_panel_orientation_quirks configfs nfnetlink lcs ctcm fsm dm_service_time ghash_s390 prng chacha_s390 libchacha aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common dm_mirror dm_region_hash dm_log zfcp scsi_transport_fc scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkey zcrypt dm_multipath rng_core autofs4 [last unloaded: vfio_pci] CPU: 0 PID: 781237 Comm: CPU 0/KVM Not tainted 6.10.0-08682-gcad9f11498ea #6 Hardware name: IBM 3931 A01 701 (LPAR) Krnl PSW : 0704c00180000000 000003d93deb0122 (kvm_handle_sie_intercept+0x432/0x4d0 [kvm]) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 Krnl GPRS: 000003d900000027 000003d900000023 0000000000000028 000002cd00000000 000002d063a00900 00000359c6daf708 00000000000bebb5 0000000000001eff 000002cfd82e9000 000002cfd80bc000 0000000000001011 000003d93deda412 000003ff8962df98 000003d93de77ce0 000003d93deb011e 00000359c6daf960 Krnl Code: 000003d93deb0112: c020fffe7259 larl %r2,000003d93de7e5c4 000003d93deb0118: c0e53fa8beac brasl %r14,000003d9bd3c7e70 #000003d93deb011e: af000000 mc 0,0 >000003d93deb0122: a728ffea lhi %r2,-22 000003d93deb0126: a7f4fe24 brc 15,000003d93deafd6e 000003d93deb012a: 9101f0b0 tm 176(%r15),1 000003d93deb012e: a774fe48 brc 7,000003d93deafdbe 000003d93deb0132: 40a0f0ae sth %r10,174(%r15) Call Trace: [<000003d93deb0122>] kvm_handle_sie_intercept+0x432/0x4d0 [kvm] ([<000003d93deb011e>] kvm_handle_sie_intercept+0x42e/0x4d0 [kvm]) [<000003d93deacc10>] vcpu_post_run+0x1d0/0x3b0 [kvm] [<000003d93deaceda>] __vcpu_run+0xea/0x2d0 [kvm] [<000003d93dead9da>] kvm_arch_vcpu_ioctl_run+0x16a/0x430 [kvm] [<000003d93de93ee0>] kvm_vcpu_ioctl+0x190/0x7c0 [kvm] [<000003d9bd728b4e>] vfs_ioctl+0x2e/0x70 [<000003d9bd72a092>] __s390x_sys_ioctl+0xc2/0xd0 [<000003d9be0e9222>] __do_syscall+0x1f2/0x2e0 [<000003d9be0f9a90>] system_call+0x70/0x98 Last Breaking-Event-Address: [<000003d9bd3c7f58>] __warn_printk+0xe8/0xf0 Cc: stable@vger.kernel.org Reported-by: Christian Borntraeger <borntraeger@linux.ibm.com> Fixes: fe0ef0030463 ("KVM: s390: sort out physical vs virtual pointers usage") Signed-off-by: Michael Mueller <mimu@linux.ibm.com> Tested-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20240801123109.2782155-1-mimu@linux.ibm.com Message-ID: <20240801123109.2782155-1-mimu@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2024-07-31s390: Keep inittext section writableHeiko Carstens2-10/+2
There is no added security by making the inittext section non-writable, however it does split part of the kernel mapping into 4K mappings instead of 1M mappings: ---[ Kernel Image Start ]--- 0x000003ffe0000000-0x000003ffe0e00000 14M PMD RO X 0x000003ffe0e00000-0x000003ffe0ec7000 796K PTE RO X 0x000003ffe0ec7000-0x000003ffe0f00000 228K PTE RO NX 0x000003ffe0f00000-0x000003ffe1300000 4M PMD RO NX 0x000003ffe1300000-0x000003ffe1353000 332K PTE RO NX 0x000003ffe1353000-0x000003ffe1400000 692K PTE RW NX 0x000003ffe1400000-0x000003ffe1500000 1M PMD RW NX 0x000003ffe1500000-0x000003ffe1700000 2M PTE RW NX <--- 0x000003ffe1700000-0x000003ffe1800000 1M PMD RW NX 0x000003ffe1800000-0x000003ffe187e000 504K PTE RW NX ---[ Kernel Image End ]--- Keep the inittext writable and enable instruction execution protection (aka noexec) later to prevent this. This also allows to use the generic free_initmem() implementation. ---[ Kernel Image Start ]--- 0x000003ffe0000000-0x000003ffe0e00000 14M PMD RO X 0x000003ffe0e00000-0x000003ffe0ec7000 796K PTE RO X 0x000003ffe0ec7000-0x000003ffe0f00000 228K PTE RO NX 0x000003ffe0f00000-0x000003ffe1300000 4M PMD RO NX 0x000003ffe1300000-0x000003ffe1353000 332K PTE RO NX 0x000003ffe1353000-0x000003ffe1400000 692K PTE RW NX 0x000003ffe1400000-0x000003ffe1800000 4M PMD RW NX <--- 0x000003ffe1800000-0x000003ffe187e000 504K PTE RW NX ---[ Kernel Image End ]--- Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-31s390/vmlinux.lds.S: Move ro_after_init section behind rodata sectionHeiko Carstens1-8/+9
The .data.rel.ro and .got section were added between the rodata and ro_after_init data section, which adds an RW mapping in between all RO mapping of the kernel image: ---[ Kernel Image Start ]--- 0x000003ffe0000000-0x000003ffe0e00000 14M PMD RO X 0x000003ffe0e00000-0x000003ffe0ec7000 796K PTE RO X 0x000003ffe0ec7000-0x000003ffe0f00000 228K PTE RO NX 0x000003ffe0f00000-0x000003ffe1300000 4M PMD RO NX 0x000003ffe1300000-0x000003ffe1331000 196K PTE RO NX 0x000003ffe1331000-0x000003ffe13b3000 520K PTE RW NX <--- 0x000003ffe13b3000-0x000003ffe13d5000 136K PTE RO NX 0x000003ffe13d5000-0x000003ffe1400000 172K PTE RW NX 0x000003ffe1400000-0x000003ffe1500000 1M PMD RW NX 0x000003ffe1500000-0x000003ffe1700000 2M PTE RW NX 0x000003ffe1700000-0x000003ffe1800000 1M PMD RW NX 0x000003ffe1800000-0x000003ffe187e000 504K PTE RW NX ---[ Kernel Image End ]--- Move the ro_after_init data section again right behind the rodata section to prevent interleaving RO and RW mappings: ---[ Kernel Image Start ]--- 0x000003ffe0000000-0x000003ffe0e00000 14M PMD RO X 0x000003ffe0e00000-0x000003ffe0ec7000 796K PTE RO X 0x000003ffe0ec7000-0x000003ffe0f00000 228K PTE RO NX 0x000003ffe0f00000-0x000003ffe1300000 4M PMD RO NX 0x000003ffe1300000-0x000003ffe1353000 332K PTE RO NX 0x000003ffe1353000-0x000003ffe1400000 692K PTE RW NX 0x000003ffe1400000-0x000003ffe1500000 1M PMD RW NX 0x000003ffe1500000-0x000003ffe1700000 2M PTE RW NX 0x000003ffe1700000-0x000003ffe1800000 1M PMD RW NX 0x000003ffe1800000-0x000003ffe187e000 504K PTE RW NX ---[ Kernel Image End ]--- Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-31s390/mm: Get rid of RELOC_HIDE()Heiko Carstens1-8/+2
Since __va(0) does not translate to NULL anymore remove RELOC_HIDE() which was only added to get rid of a compile warning with clang W=1: arch/s390/mm/vmem.c:666:36: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] 666 | __set_memory_4k(__va(0), __va(0) + ident_map_size); | ~~~~~~~ ^ Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-31s390/mm/ptdump: Improve sorting of markersHeiko Carstens1-47/+53
Use the sort() from lib/sort.c to sort markers instead of the private implementation. The current implementation does not sort markers properly if they have to be moved downwards: ---[ Real Memory Copy Area Start ]--- 0x0000035b903ff000-0x0000035b90400000 4K PTE I ---[ vmalloc Area Start ]--- ---[ Real Memory Copy Area End ]--- Add a new member to each marker which indicates if a marker is start of an area. If addresses of areas are equal consider an address which defines the start of an area higher than the address which defines the end of an area. In result the output is sorted as intended: ---[ Real Memory Copy Area Start ]--- 0x0000019cedcff000-0x0000019cedd00000 4K PTE I ---[ Real Memory Copy Area End ]--- ---[ vmalloc Area Start ]--- Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-31s390/mm/ptdump: Add support for relocated lowcore mappingHeiko Carstens1-13/+22
The page table dumper contains a hard coded assumption that the first mapped area starts at address zero. With a relocated lowcore this is not true anymore. Subsequently the first entry (lowcore) is printed as if it would contain everything from address zero until the end of the location of the lowcore area. Fix this by adding a single "Kernel Virtual Address Space" entry, which always starts at address zero. It ends when the lowcore area starts which is either address zero, or its relocated address. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-31s390/mm/ptdump: Fix handling of identity mapping areaHeiko Carstens1-9/+12
Since virtual and real addresses are not the same anymore the assumption that the kernel image is contained within the identity mapping is also not true anymore. Fix this by adding two explicit areas and at the correct locations: one for the 8kb lowcore area, and one for the identity mapping. Fixes: c98d2ecae08f ("s390/mm: Uncouple physical vs virtual address spaces") Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-31s390/alternatives: Remove unused empty header fileHeiko Carstens1-0/+0
Remove the unused and empty arch/s390/kernel/alternative.h header file which was added by mistake. Fixes: 5ade5be4edf8 ("s390: Add infrastructure to patch lowcore accesses") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-31s390/fpu: Re-add exception handling in load_fpu_state()Heiko Carstens1-1/+1
With the recent rewrite of the fpu code exception handling for the lfpc instruction within load_fpu_state() was erroneously removed. Add it again to prevent that loading invalid floating point register values cause an unhandled specification exception. Fixes: 8c09871a950a ("s390/fpu: limit save and restore to used registers") Cc: stable@vger.kernel.org Reported-by: Aristeu Rozanski <aris@redhat.com> Tested-by: Aristeu Rozanski <aris@redhat.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-26Merge tag 's390-6.11-2' of ↵Linus Torvalds50-521/+745
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Vasily Gorbik: - Fix KMSAN build breakage caused by the conflict between s390 and mm-stable trees - Add KMSAN page markers for ptdump - Add runtime constant support - Fix __pa/__va for modules under non-GPL licenses by exporting necessary vm_layout struct with EXPORT_SYMBOL to prevent linkage problems - Fix an endless loop in the CF_DIAG event stop in the CPU Measurement Counter Facility code when the counter set size is zero - Remove the PROTECTED_VIRTUALIZATION_GUEST config option and enable its functionality by default - Support allocation of multiple MSI interrupts per device and improve logging of architecture-specific limitations - Add support for lowcore relocation as a debugging feature to catch all null ptr dereferences in the kernel address space, improving detection beyond the current implementation's limited write access protection - Clean up and rework CPU alternatives to allow for callbacks and early patching for the lowcore relocation * tag 's390-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (39 commits) s390: Remove protvirt and kvm config guards for uv code s390/boot: Add cmdline option to relocate lowcore s390/kdump: Make kdump ready for lowcore relocation s390/entry: Make system_call() ready for lowcore relocation s390/entry: Make ret_from_fork() ready for lowcore relocation s390/entry: Make __switch_to() ready for lowcore relocation s390/entry: Make restart_int_handler() ready for lowcore relocation s390/entry: Make mchk_int_handler() ready for lowcore relocation s390/entry: Make int handlers ready for lowcore relocation s390/entry: Make pgm_check_handler() ready for lowcore relocation s390/entry: Add base register to CHECK_VMAP_STACK/CHECK_STACK macro s390/entry: Add base register to SIEEXIT macro s390/entry: Add base register to MBEAR macro s390/entry: Make __sie64a() ready for lowcore relocation s390/head64: Make startup code ready for lowcore relocation s390: Add infrastructure to patch lowcore accesses s390/atomic_ops: Disable flag outputs constraint for GCC versions below 14.2.0 s390/entry: Move SIE indicator flag to thread info s390/nmi: Simplify ptregs setup s390/alternatives: Remove alternative facility list ...
2024-07-25Merge tag 'constfy-sysctl-6.11-rc1' of ↵Linus Torvalds4-10/+10
git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl constification from Joel Granados: "Treewide constification of the ctl_table argument of proc_handlers using a coccinelle script and some manual code formatting fixups. This is a prerequisite to moving the static ctl_table structs into read-only data section which will ensure that proc_handler function pointers cannot be modified" * tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: sysctl: treewide: constify the ctl_table argument of proc_handlers
2024-07-25Merge tag 'driver-core-6.11-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core changes for 6.11-rc1. Lots of stuff in here, with not a huge diffstat, but apis are evolving which required lots of files to be touched. Highlights of the changes in here are: - platform remove callback api final fixups (Uwe took many releases to get here, finally!) - Rust bindings for basic firmware apis and initial driver-core interactions. It's not all that useful for a "write a whole driver in rust" type of thing, but the firmware bindings do help out the phy rust drivers, and the driver core bindings give a solid base on which others can start their work. There is still a long way to go here before we have a multitude of rust drivers being added, but it's a great first step. - driver core const api changes. This reached across all bus types, and there are some fix-ups for some not-common bus types that linux-next and 0-day testing shook out. This work is being done to help make the rust bindings more safe, as well as the C code, moving toward the end-goal of allowing us to put driver structures into read-only memory. We aren't there yet, but are getting closer. - minor devres cleanups and fixes found by code inspection - arch_topology minor changes - other minor driver core cleanups All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) ARM: sa1100: make match function take a const pointer sysfs/cpu: Make crash_hotplug attribute world-readable dio: Have dio_bus_match() callback take a const * zorro: make match function take a const pointer driver core: module: make module_[add|remove]_driver take a const * driver core: make driver_find_device() take a const * driver core: make driver_[create|remove]_file take a const * firmware_loader: fix soundness issue in `request_internal` firmware_loader: annotate doctests as `no_run` devres: Correct code style for functions that return a pointer type devres: Initialize an uninitialized struct member devres: Fix memory leakage caused by driver API devm_free_percpu() devres: Fix devm_krealloc() wasting memory driver core: platform: Switch to use kmemdup_array() driver core: have match() callback in struct bus_type take a const * MAINTAINERS: add Rust device abstractions to DRIVER CORE device: rust: improve safety comments MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER firmware: rust: improve safety comments ...
2024-07-24sysctl: treewide: constify the ctl_table argument of proc_handlersJoel Granados4-10/+10
const qualify the struct ctl_table argument in the proc_handler function signatures. This is a prerequisite to moving the static ctl_table structs into .rodata data which will ensure that proc_handler function pointers cannot be modified. This patch has been generated by the following coccinelle script: ``` virtual patch @r1@ identifier ctl, write, buffer, lenp, ppos; identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)"; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos); @r2@ identifier func, ctl, write, buffer, lenp, ppos; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos) { ... } @r3@ identifier func; @@ int func( - struct ctl_table * + const struct ctl_table * ,int , void *, size_t *, loff_t *); @r4@ identifier func, ctl; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int , void *, size_t *, loff_t *); @r5@ identifier func, write, buffer, lenp, ppos; @@ int func( - struct ctl_table * + const struct ctl_table * ,int write, void *buffer, size_t *lenp, loff_t *ppos); ``` * Code formatting was adjusted in xfs_sysctl.c to comply with code conventions. The xfs_stats_clear_proc_handler, xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where adjusted. * The ctl_table argument in proc_watchdog_common was const qualified. This is called from a proc_handler itself and is calling back into another proc_handler, making it necessary to change it as part of the proc_handler migration. Co-developed-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Co-developed-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Joel Granados <j.granados@samsung.com>
2024-07-23Merge tag 'kbuild-v6.11' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove tristate choice support from Kconfig - Stop using the PROVIDE() directive in the linker script - Reduce the number of links for the combination of CONFIG_KALLSYMS and CONFIG_DEBUG_INFO_BTF - Enable the warning for symbol reference to .exit.* sections by default - Fix warnings in RPM package builds - Improve scripts/make_fit.py to generate a FIT image with separate base DTB and overlays - Improve choice value calculation in Kconfig - Fix conditional prompt behavior in choice in Kconfig - Remove support for the uncommon EMAIL environment variable in Debian package builds - Remove support for the uncommon "name <email>" form for the DEBEMAIL environment variable - Raise the minimum supported GNU Make version to 4.0 - Remove stale code for the absolute kallsyms - Move header files commonly used for host programs to scripts/include/ - Introduce the pacman-pkg target to generate a pacman package used in Arch Linux - Clean up Kconfig * tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (65 commits) kbuild: doc: gcc to CC change kallsyms: change sym_entry::percpu_absolute to bool type kallsyms: unify seq and start_pos fields of struct sym_entry kallsyms: add more original symbol type/name in comment lines kallsyms: use \t instead of a tab in printf() kallsyms: avoid repeated calculation of array size for markers kbuild: add script and target to generate pacman package modpost: use generic macros for hash table implementation kbuild: move some helper headers from scripts/kconfig/ to scripts/include/ Makefile: add comment to discourage tools/* addition for kernel builds kbuild: clean up scripts/remove-stale-files kconfig: recursive checks drop file/lineno kbuild: rpm-pkg: introduce a simple changelog section for kernel.spec kallsyms: get rid of code for absolute kallsyms kbuild: Create INSTALL_PATH directory if it does not exist kbuild: Abort make on install failures kconfig: remove 'e1' and 'e2' macros from expression deduplication kconfig: remove SYMBOL_CHOICEVAL flag kconfig: add const qualifiers to several function arguments kconfig: call expr_eliminate_yn() at least once in expr_eliminate_dups() ...
2024-07-23s390: Remove protvirt and kvm config guards for uv codeJanosch Frank10-92/+17
Removing the CONFIG_PROTECTED_VIRTUALIZATION_GUEST ifdefs and config option as well as CONFIG_KVM ifdefs in uv files. Having this configurable has been more of a pain than a help. It's time to remove the ifdefs and the config option. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/boot: Add cmdline option to relocate lowcoreSven Schnelle1-0/+2
Now that everything has been converted, add the option 'relocate_lowcore' to enable relocating the lowcore. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/kdump: Make kdump ready for lowcore relocationSven Schnelle2-16/+12
In preparation of having lowcore at different address than zero, add the base register to all lowcore accesses in store_status() and __do_machine_kdump(). Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/entry: Make system_call() ready for lowcore relocationSven Schnelle1-10/+11
In preparation of having lowcore at different address than zero, add the base register to all lowcore accesses in system_call(). Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/entry: Make ret_from_fork() ready for lowcore relocationSven Schnelle1-3/+4
In preparation of having lowcore at different address than zero, add the base register to all lowcore accesses in ret_from_fork(). Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/entry: Make __switch_to() ready for lowcore relocationSven Schnelle1-4/+5
In preparation of having lowcore at different address than zero, add the base register to all lowcore accesses in __switch_to(). Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/entry: Make restart_int_handler() ready for lowcore relocationSven Schnelle1-6/+8
In preparation of having lowcore at different address than zero, add the base register to all lowcore accesses in restart_int_handler(). Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/entry: Make mchk_int_handler() ready for lowcore relocationSven Schnelle1-23/+25
In preparation of having lowcore at different address than zero, add the base register to all lowcore accesses in mcck_int_handler(). Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/entry: Make int handlers ready for lowcore relocationSven Schnelle1-15/+16
In preparation of having lowcore at different address than zero, add the base register to all lowcore accesses in the ext/io interrupt handlers. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23s390/entry: Make pgm_check_handler() ready for lowcore relocationSven Schnelle2-21/+32
In preparation of having lowcore at different address than zero, add the base register to all lowcore accesses in pgm_check_handler(). Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sve