summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2025-08-01KVM: x86: Free vCPUs before freeing VM stateSean Christopherson1-1/+1
commit 17bcd714426386fda741a4bccd96a2870179344b upstream. Free vCPUs before freeing any VM state, as both SVM and VMX may access VM state when "freeing" a vCPU that is currently "in" L2, i.e. that needs to be kicked out of nested guest mode. Commit 6fcee03df6a1 ("KVM: x86: avoid loading a vCPU after .vm_destroy was called") partially fixed the issue, but for unknown reasons only moved the MMU unloading before VM destruction. Complete the change, and free all vCPU state prior to destroying VM state, as nVMX accesses even more state than nSVM. In addition to the AVIC, KVM can hit a use-after-free on MSR filters: kvm_msr_allowed+0x4c/0xd0 __kvm_set_msr+0x12d/0x1e0 kvm_set_msr+0x19/0x40 load_vmcs12_host_state+0x2d8/0x6e0 [kvm_intel] nested_vmx_vmexit+0x715/0xbd0 [kvm_intel] nested_vmx_free_vcpu+0x33/0x50 [kvm_intel] vmx_free_vcpu+0x54/0xc0 [kvm_intel] kvm_arch_vcpu_destroy+0x28/0xf0 kvm_vcpu_destroy+0x12/0x50 kvm_arch_destroy_vm+0x12c/0x1c0 kvm_put_kvm+0x263/0x3c0 kvm_vm_release+0x21/0x30 and an upcoming fix to process injectable interrupts on nested VM-Exit will access the PIC: BUG: kernel NULL pointer dereference, address: 0000000000000090 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page CPU: 23 UID: 1000 PID: 2658 Comm: kvm-nx-lpage-re RIP: 0010:kvm_cpu_has_extint+0x2f/0x60 [kvm] Call Trace: <TASK> kvm_cpu_has_injectable_intr+0xe/0x60 [kvm] nested_vmx_vmexit+0x2d7/0xdf0 [kvm_intel] nested_vmx_free_vcpu+0x40/0x50 [kvm_intel] vmx_vcpu_free+0x2d/0x80 [kvm_intel] kvm_arch_vcpu_destroy+0x2d/0x130 [kvm] kvm_destroy_vcpus+0x8a/0x100 [kvm] kvm_arch_destroy_vm+0xa7/0x1d0 [kvm] kvm_destroy_vm+0x172/0x300 [kvm] kvm_vcpu_release+0x31/0x50 [kvm] Inarguably, both nSVM and nVMX need to be fixed, but punt on those cleanups for the moment. Conceptually, vCPUs should be freed before VM state. Assets like the I/O APIC and PIC _must_ be allocated before vCPUs are created, so it stands to reason that they must be freed _after_ vCPUs are destroyed. Reported-by: Aaron Lewis <aaronlewis@google.com> Closes: https://lore.kernel.org/all/20240703175618.2304869-2-aaronlewis@google.com Cc: Jim Mattson <jmattson@google.com> Cc: Yan Zhao <yan.y.zhao@intel.com> Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com> Cc: Kai Huang <kai.huang@intel.com> Cc: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250224235542.2562848-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Cheng <chengkev@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01ARM: 9448/1: Use an absolute path to unified.h in KBUILD_AFLAGSNathan Chancellor1-1/+1
[ Upstream commit 87c4e1459e80bf65066f864c762ef4dc932fad4b ] After commit d5c8d6e0fa61 ("kbuild: Update assembler calls to use proper flags and language target"), which updated as-instr to use the 'assembler-with-cpp' language option, the Kbuild version of as-instr always fails internally for arch/arm with <command-line>: fatal error: asm/unified.h: No such file or directory compilation terminated. because '-include' flags are now taken into account by the compiler driver and as-instr does not have '$(LINUXINCLUDE)', so unified.h is not found. This went unnoticed at the time of the Kbuild change because the last use of as-instr in Kbuild that arch/arm could reach was removed in 5.7 by commit 541ad0150ca4 ("arm: Remove 32bit KVM host support") but a stable backport of the Kbuild change to before that point exposed this potential issue if one were to be reintroduced. Follow the general pattern of '-include' paths throughout the tree and make unified.h absolute using '$(srctree)' to ensure KBUILD_AFLAGS can be used independently. Closes: https://lore.kernel.org/CACo-S-1qbCX4WAVFA63dWfHtrRHZBTyyr2js8Lx=Az03XHTTHg@mail.gmail.com/ Cc: stable@vger.kernel.org Fixes: d5c8d6e0fa61 ("kbuild: Update assembler calls to use proper flags and language target") Reported-by: KernelCI bot <bot@kernelci.org> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [ No KBUILD_RUSTFLAGS in 6.12 ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01arm64: dts: qcom: x1-crd: Fix vreg_l2j_1p2 voltageStephan Gerhold1-2/+2
[ Upstream commit 5ce920e6a8db40e4b094c0d863cbd19fdcfbbb7a ] In the ACPI DSDT table, PPP_RESOURCE_ID_LDO2_J is configured with 1256000 uV instead of the 1200000 uV we have currently in the device tree. Use the same for consistency and correctness. Cc: stable@vger.kernel.org Fixes: bd50b1f5b6f3 ("arm64: dts: qcom: x1e80100: Add Compute Reference Device") Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250423-x1e-vreg-l2j-voltage-v1-1-24b6a2043025@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> [ Change x1e80100-crd.dts instead ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap()Roman Kisel4-42/+63
[ Upstream commit 86c48271e0d60c82665e9fd61277002391efcef7 ] To start an application processor in SNP-isolated guest, a hypercall is used that takes a virtual processor index. The hv_snp_boot_ap() function uses that START_VP hypercall but passes as VP index to it what it receives as a wakeup_secondary_cpu_64 callback: the APIC ID. As those two aren't generally interchangeable, that may lead to hung APs if the VP index and the APIC ID don't match up. Update the parameter names to avoid confusion as to what the parameter is. Use the APIC ID to the VP index conversion to provide the correct input to the hypercall. Cc: stable@vger.kernel.org Fixes: 44676bb9d566 ("x86/hyperv: Add smp support for SEV-SNP guest") Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250507182227.7421-2-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250507182227.7421-2-romank@linux.microsoft.com> [ changed HVCALL_GET_VP_INDEX_FROM_APIC_ID to HVCALL_GET_VP_ID_FROM_APIC_ID ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flushManuel Andreas1-0/+3
[ Upstream commit fa787ac07b3ceb56dd88a62d1866038498e96230 ] In KVM guests with Hyper-V hypercalls enabled, the hypercalls HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST and HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX allow a guest to request invalidation of portions of a virtual TLB. For this, the hypercall parameter includes a list of GVAs that are supposed to be invalidated. However, when non-canonical GVAs are passed, there is currently no filtering in place and they are eventually passed to checked invocations of INVVPID on Intel / INVLPGA on AMD. While AMD's INVLPGA silently ignores non-canonical addresses (effectively a no-op), Intel's INVVPID explicitly signals VM-Fail and ultimately triggers the WARN_ONCE in invvpid_error(): invvpid failed: ext=0x0 vpid=1 gva=0xaaaaaaaaaaaaa000 WARNING: CPU: 6 PID: 326 at arch/x86/kvm/vmx/vmx.c:482 invvpid_error+0x91/0xa0 [kvm_intel] Modules linked in: kvm_intel kvm 9pnet_virtio irqbypass fuse CPU: 6 UID: 0 PID: 326 Comm: kvm-vm Not tainted 6.15.0 #14 PREEMPT(voluntary) RIP: 0010:invvpid_error+0x91/0xa0 [kvm_intel] Call Trace: vmx_flush_tlb_gva+0x320/0x490 [kvm_intel] kvm_hv_vcpu_flush_tlb+0x24f/0x4f0 [kvm] kvm_arch_vcpu_ioctl_run+0x3013/0x5810 [kvm] Hyper-V documents that invalid GVAs (those that are beyond a partition's GVA space) are to be ignored. While not completely clear whether this ruling also applies to non-canonical GVAs, it is likely fine to make that assumption, and manual testing on Azure confirms "real" Hyper-V interprets the specification in the same way. Skip non-canonical GVAs when processing the list of address to avoid tripping the INVVPID failure. Alternatively, KVM could filter out "bad" GVAs before inserting into the FIFO, but practically speaking the only downside of pushing validation to the final processing is that doing so is suboptimal for the guest, and no well-behaved guest will request TLB flushes for non-canonical addresses. Fixes: 260970862c88 ("KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently") Cc: stable@vger.kernel.org Signed-off-by: Manuel Andreas <manuel.andreas@tum.de> Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/c090efb3-ef82-499f-a5e0-360fc8420fb7@tum.de Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01KVM: x86: model canonical checks more preciselyMaxim Levitsky7-22/+66
[ Upstream commit 9245fd6b8531497d129a7a6e3eef258042862f85 ] As a result of a recent investigation, it was determined that x86 CPUs which support 5-level paging, don't always respect CR4.LA57 when doing canonical checks. In particular: 1. MSRs which contain a linear address, allow full 57-bitcanonical address regardless of CR4.LA57 state. For example: MSR_KERNEL_GS_BASE. 2. All hidden segment bases and GDT/IDT bases also behave like MSRs. This means that full 57-bit canonical address can be loaded to them regardless of CR4.LA57, both using MSRS (e.g GS_BASE) and instructions (e.g LGDT). 3. TLB invalidation instructions also allow the user to use full 57-bit address regardless of the CR4.LA57. Finally, it must be noted that the CPU doesn't prevent the user from disabling 5-level paging, even when the full 57-bit canonical address is present in one of the registers mentioned above (e.g GDT base). In fact, this can happen without any userspace help, when the CPU enters SMM mode - some MSRs, for example MSR_KERNEL_GS_BASE are left to contain a non-canonical address in regard to the new mode. Since most of the affected MSRs and all segment bases can be read and written freely by the guest without any KVM intervention, this patch makes the emulator closely follow hardware behavior, which means that the emulator doesn't take in the account the guest CPUID support for 5-level paging, and only takes in the account the host CPU support. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20240906221824.491834-4-mlevitsk@redhat.com Signed-off-by: Sean Christopherson <seanjc@google.com> Stable-dep-of: fa787ac07b3c ("KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01KVM: x86: Add X86EMUL_F_MSR and X86EMUL_F_DT_LOAD to aid canonical checksMaxim Levitsky3-8/+14
[ Upstream commit c534b37b7584e2abc5d487b4e017f61a61959ca9 ] Add emulation flags for MSR accesses and Descriptor Tables loads, and pass the new flags as appropriate to emul_is_noncanonical_address(). The flags will be used to perform the correct canonical check, as the type of access affects whether or not CR4.LA57 is consulted when determining the canonical bit. No functional change is intended. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20240906221824.491834-3-mlevitsk@redhat.com [sean: split to separate patch, massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Stable-dep-of: fa787ac07b3c ("KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01KVM: x86: Route non-canonical checks in emulator through emulate_opsMaxim Levitsky3-1/+10
[ Upstream commit 16ccadefa295af434ca296e566f078223ecd79ca ] Add emulate_ops.is_canonical_addr() to perform (non-)canonical checks in the emulator, which will allow extending is_noncanonical_address() to support different flavors of canonical checks, e.g. for descriptor table bases vs. MSRs, without needing duplicate logic in the emulator. No functional change is intended. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20240906221824.491834-3-mlevitsk@redhat.com [sean: separate from additional of flags, massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Stable-dep-of: fa787ac07b3c ("KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01KVM: x86: drop x86.h include from cpuid.hMaxim Levitsky6-4/+5
[ Upstream commit e52ad1ddd0a3b07777141ec9406d5dc2c9a0de17 ] Drop x86.h include from cpuid.h to allow the x86.h to include the cpuid.h instead. Also fix various places where x86.h was implicitly included via cpuid.h Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20240906221824.491834-2-mlevitsk@redhat.com [sean: fixup a missed include in mtrr.c] Signed-off-by: Sean Christopherson <seanjc@google.com> Stable-dep-of: fa787ac07b3c ("KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01arm64: dts: qcom: x1e78100-t14s: mark l12b and l15b always-onJohan Hovold1-0/+2
[ Upstream commit 673fa129e558c5f1196adb27d97ac90ddfe4f19c ] The l12b and l15b supplies are used by components that are not (fully) described (and some never will be) and must never be disabled. Mark the regulators as always-on to prevent them from being disabled, for example, when consumers probe defer or suspend. Fixes: 7d1cbe2f4985 ("arm64: dts: qcom: Add X1E78100 ThinkPad T14s Gen 6") Cc: stable@vger.kernel.org # 6.12 Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20250314145440.11371-3-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> [ applied changes to .dts file instead of .dtsi ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01crypto: powerpc/poly1305 - add depends on BROKEN for nowEric Biggers1-0/+1
[ Upstream commit bc8169003b41e89fe7052e408cf9fdbecb4017fe ] As discussed in the thread containing https://lore.kernel.org/linux-crypto/20250510053308.GB505731@sol/, the Power10-optimized Poly1305 code is currently not safe to call in softirq context. Disable it for now. It can be re-enabled once it is fixed. Fixes: ba8f8624fde2 ("crypto: poly1305-p10 - Glue code for optmized Poly1305 implementation for ppc64le") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> [ applied to arch/powerpc/crypto/Kconfig ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01x86/bugs: Fix use of possibly uninit value in amd_check_tsa_microcode()Michael Zhivich1-0/+2
For kernels compiled with CONFIG_INIT_STACK_NONE=y, the value of __reserved field in zen_patch_rev union on the stack may be garbage. If so, it will prevent correct microcode check when consulting p.ucode_rev, resulting in incorrect mitigation selection. This is a stable-only fix. Cc: <stable@vger.kernel.org> Signed-off-by: Michael Zhivich <mzhivich@akamai.com> Fixes: 7a0395f6607a5 ("x86/bugs: Add a Transient Scheduler Attacks mitigation") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01arm64/entry: Mask DAIF in cpu_switch_to(), call_on_irq_stack()Ada Couprie Diaz2-0/+11
commit d42e6c20de6192f8e4ab4cf10be8c694ef27e8cb upstream. `cpu_switch_to()` and `call_on_irq_stack()` manipulate SP to change to different stacks along with the Shadow Call Stack if it is enabled. Those two stack changes cannot be done atomically and both functions can be interrupted by SErrors or Debug Exceptions which, though unlikely, is very much broken : if interrupted, we can end up with mismatched stacks and Shadow Call Stack leading to clobbered stacks. In `cpu_switch_to()`, it can happen when SP_EL0 points to the new task, but x18 stills points to the old task's SCS. When the interrupt handler tries to save the task's SCS pointer, it will save the old task SCS pointer (x18) into the new task struct (pointed to by SP_EL0), clobbering it. In `call_on_irq_stack()`, it can happen when switching from the task stack to the IRQ stack and when switching back. In both cases, we can be interrupted when the SCS pointer points to the IRQ SCS, but SP points to the task stack. The nested interrupt handler pushes its return addresses on the IRQ SCS. It then detects that SP points to the task stack, calls `call_on_irq_stack()` and clobbers the task SCS pointer with the IRQ SCS pointer, which it will also use ! This leads to tasks returning to addresses on the wrong SCS, or even on the IRQ SCS, triggering kernel panics via CONFIG_VMAP_STACK or FPAC if enabled. This is possible on a default config, but unlikely. However, when enabling CONFIG_ARM64_PSEUDO_NMI, DAIF is unmasked and instead the GIC is responsible for filtering what interrupts the CPU should receive based on priority. Given the goal of emulating NMIs, pseudo-NMIs can be received by the CPU even in `cpu_switch_to()` and `call_on_irq_stack()`, possibly *very* frequently depending on the system configuration and workload, leading to unpredictable kernel panics. Completely mask DAIF in `cpu_switch_to()` and restore it when returning. Do the same in `call_on_irq_stack()`, but restore and mask around the branch. Mask DAIF even if CONFIG_SHADOW_CALL_STACK is not enabled for consistency of behaviour between all configurations. Introduce and use an assembly macro for saving and masking DAIF, as the existing one saves but only masks IF. Cc: <stable@vger.kernel.org> Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Reported-by: Cristian Prundeanu <cpru@amazon.com> Fixes: 59b37fe52f49 ("arm64: Stash shadow stack pointer in the task struct on interrupt") Tested-by: Cristian Prundeanu <cpru@amazon.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250718142814.133329-1-ada.coupriediaz@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01ARM: 9450/1: Fix allowing linker DCE with binutils < 2.36Nathan Chancellor1-1/+1
commit 53e7e1fb81cc8ba2da1cb31f8917ef397caafe91 upstream. Commit e7607f7d6d81 ("ARM: 9443/1: Require linker to support KEEP within OVERLAY for DCE") accidentally broke the binutils version restriction that was added in commit 0d437918fb64 ("ARM: 9414/1: Fix build issue with LD_DEAD_CODE_DATA_ELIMINATION"), reintroducing the segmentation fault addressed by that workaround. Restore the binutils version dependency by using CONFIG_LD_CAN_USE_KEEP_IN_OVERLAY as an additional condition to ensure that CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION is only enabled with binutils >= 2.36 and ld.lld >= 21.0.0. Closes: https://lore.kernel.org/6739da7d-e555-407a-b5cb-e5681da71056@landley.net/ Closes: https://lore.kernel.org/CAFERDQ0zPoya5ZQfpbeuKVZEo_fKsonLf6tJbp32QnSGAtbi+Q@mail.gmail.com/ Cc: stable@vger.kernel.org Fixes: e7607f7d6d81 ("ARM: 9443/1: Require linker to support KEEP within OVERLAY for DCE") Reported-by: Rob Landley <rob@landley.net> Tested-by: Rob Landley <rob@landley.net> Reported-by: Martin Wetterwald <martin@wetterwald.eu> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-01x86/hyperv: Fix usage of cpu_online_mask to get valid cpuNuno Das Neves1-3/+1
[ Upstream commit bb169f80ed5a156ec3405e0e49c6b8e9ae264718 ] Accessing cpu_online_mask here is problematic because the cpus read lock is not held in this context. However, cpu_online_mask isn't needed here since the effective affinity mask is guaranteed to be valid in this callback. So, just use cpumask_first() to get the cpu instead of ANDing it with cpus_online_mask unnecessarily. Fixes: e39397d1fd68 ("x86/hyperv: implement an MSI domain for root partition") Reported-by: Michael Kelley <mhklinux@outlook.com> Closes: https://lore.kernel.org/linux-hyperv/SN6PR02MB4157639630F8AD2D8FD8F52FD475A@SN6PR02MB4157.namprd02.prod.outlook.com/ Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1751582677-30930-4-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1751582677-30930-4-git-send-email-nunodasneves@linux.microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-01x86/traps: Initialize DR7 by writing its architectural reset valueXin Li (Intel)7-11/+22
commit fa7d0f83c5c4223a01598876352473cb3d3bd4d7 upstream. Initialize DR7 by writing its architectural reset value to always set bit 10, which is reserved to '1', when "clearing" DR7 so as not to trigger unanticipated behavior if said bit is ever unreserved, e.g. as a feature enabling flag with inverted polarity. Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Sean Christopherson <seanjc@google.com> Tested-by: Sohil Mehta <sohil.mehta@intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20250620231504.2676902-3-xin%40zytor.com [ context adjusted: no KVM_DEBUGREG_AUTO_SWITCH flag test" ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24KVM: x86/xen: Fix cleanup logic in emulation of Xen schedop poll hypercallsManuel Andreas1-1/+1
commit 5a53249d149f48b558368c5338b9921b76a12f8c upstream. kvm_xen_schedop_poll does a kmalloc_array() when a VM polls the host for more than one event channel potr (nr_ports > 1). After the kmalloc_array(), the error paths need to go through the "out" label, but the call to kvm_read_guest_virt() does not. Fixes: 92c58965e965 ("KVM: x86/xen: Use kvm_read_guest_virt() instead of open-coding it badly") Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Manuel Andreas <manuel.andreas@tum.de> [Adjusted commit message. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24riscv: traps_misaligned: properly sign extend value in misaligned load handlerAndreas Schwab1-1/+1
[ Upstream commit b3510183ab7d63c71a3f5c89043d31686a76a34c ] Add missing cast to signed long. Signed-off-by: Andreas Schwab <schwab@suse.de> Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE") Tested-by: Clément Léger <cleger@rivosinc.com> Link: https://lore.kernel.org/r/mvmikk0goil.fsf@suse.de Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24riscv: Enable interrupt during exception handlingNam Cao1-4/+6
[ Upstream commit 969f028bf2c40573ef18061f702ede3ebfe12b42 ] force_sig_fault() takes a spinlock, which is a sleeping lock with CONFIG_PREEMPT_RT=y. However, exception handling calls force_sig_fault() with interrupt disabled, causing a sleeping in atomic context warning. This can be reproduced using userspace programs such as: int main() { asm ("ebreak"); } or int main() { asm ("unimp"); } There is no reason that interrupt must be disabled while handling exceptions from userspace. Enable interrupt while handling user exceptions. This also has the added benefit of avoiding unnecessary delays in interrupt handling. Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry") Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250625085630.3649485-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24arm64: dts: imx95: Correct the DMA interrupter number of pcie0_epRichard Zhu1-1/+1
[ Upstream commit 61f1065272ea3721c20c4c0a6877d346b0e237c3 ] Correct the DMA interrupter number of pcie0_ep from 317 to 311. Fixes: 3b1d5deb29ff ("arm64: dts: imx95: add pcie[0,1] and pcie-ep[0,1] support") Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> Reviewed-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24arm64: dts: rockchip: Add cd-gpios for sdcard detect on Cool Pi 4BAndy Yan1-0/+1
[ Upstream commit 98570e8cb8b0c0893810f285b4a3b1a3ab81a556 ] cd-gpios is used for sdcard detects for sdmmc. Fixes: 3f5d336d64d6 ("arm64: dts: rockchip: Add support for rk3588s based board Cool Pi 4B") Signed-off-by: Andy Yan <andyshrk@163.com> Link: https://lore.kernel.org/r/20250524064223.5741-2-andyshrk@163.com Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24arm64: dts: rockchip: Add cd-gpios for sdcard detect on Cool Pi CM5Andy Yan1-0/+1
[ Upstream commit e625e284172d235be5cd906a98c6c91c365bb9b1 ] cd-gpios is used for sdcard detects for sdmmc. Fixes: 791c154c3982 ("arm64: dts: rockchip: Add support for rk3588 based board Cool Pi CM5 EVB") Signed-off-by: Andy Yan <andyshrk@163.com> Link: https://lore.kernel.org/r/20250524064223.5741-1-andyshrk@163.com Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-24s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL againIlya Leoshkevich1-1/+9
commit 6a5abf8cf182f577c7ae6c62f14debc9754ec986 upstream. Commit 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic") has accidentally removed the critical piece of commit c730fce7c70c ("s390/bpf: Fix bpf_arch_text_poke() with new_addr == NULL"), causing intermittent kernel panics in e.g. perf's on_switch() prog to reappear. Restore the fix and add a comment. Fixes: 7ded842b356d ("s390/bpf: Fix bpf_plt pointer arithmetic") Cc: stable@vger.kernel.org Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20250716194524.48109-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24arm64: dts: rockchip: use cs-gpios for spi1 on ringneckJakob Unterwurzacher1-0/+23
commit 53b6445ad08f07b6f4a84f1434f543196009ed89 upstream. Hardware CS has a very slow rise time of about 6us, causing transmission errors when CS does not reach high between transaction. It looks like it's not driven actively when transitioning from low to high but switched to input, so only the CPU pull-up pulls it high, slowly. Transitions from high to low are fast. On the oscilloscope, CS looks like an irregular sawtooth pattern like this: _____ ^ / | ^ /| / | /| / | / | / | / | / | ___/ |___/ |_____/ |___ With cs-gpios we have a CS rise time of about 20ns, as it should be, and CS looks rectangular. This fixes the data errors when running a flashcp loop against a m25p40 spi flash. With the Rockchip 6.1 kernel we see the same slow rise time, but for some reason CS is always high for long enough to reach a solid high. The RK3399 and RK3588 SoCs use the same SPI driver, so we also checked our "Puma" (RK3399) and "Tiger" (RK3588) boards. They do not have this problem. Hardware CS rise time is good. Fixes: c484cf93f61b ("arm64: dts: rockchip: add PX30-µQ7 (Ringneck) SoM with Haikou baseboard") Cc: stable@vger.kernel.org Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de> Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzacher@cherry.de> Link: https://lore.kernel.org/r/20250627131715.1074308-1-jakob.unterwurzacher@cherry.de Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24arm64: dts: imx8mp-venice-gw73xx: fix TPM SPI frequencyTim Harvey1-1/+1
commit 1fc02c2086003c5fdaa99cde49a987992ff1aae4 upstream. The IMX8MPDS Table 37 [1] shows that the max SPI master read frequency depends on the pins the interface is muxed behind with ECSPI2 muxed behind ECSPI2 supporting up to 25MHz. Adjust the spi-max-frequency based on these findings. [1] https://www.nxp.com/webapp/Download?colCode=IMX8MPIEC Fixes: 2b3ab9d81ab4 ("arm64: dts: imx8mp-venice-gw73xx: add TPM device") Cc: stable@vger.kernel.org Signed-off-by: Tim Harvey <tharvey@gateworks.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24arm64: dts: imx8mp-venice-gw72xx: fix TPM SPI frequencyTim Harvey1-1/+1
commit b25344753c53a5524ba80280ce68f2046e559ce0 upstream. The IMX8MPDS Table 37 [1] shows that the max SPI master read frequency depends on the pins the interface is muxed behind with ECSPI2 muxed behind ECSPI2 supporting up to 25MHz. Adjust the spi-max-frequency based on these findings. [1] https://www.nxp.com/webapp/Download?colCode=IMX8MPIEC Fixes: 5016f22028e4 ("arm64: dts: imx8mp-venice-gw72xx: add TPM device") Cc: stable@vger.kernel.org Signed-off-by: Tim Harvey <tharvey@gateworks.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24arm64: dts: imx8mp-venice-gw71xx: fix TPM SPI frequencyTim Harvey1-1/+1
commit 528e2d3125ad8d783e922033a0a8e2adb17b400e upstream. The IMX8MPDS Table 37 [1] shows that the max SPI master read frequency depends on the pins the interface is muxed behind with ECSPI2 muxed behind ECSPI2 supporting up to 25MHz. Adjust the spi-max-frequency based on these findings. [1] https://www.nxp.com/webapp/Download?colCode=IMX8MPIEC Fixes: 1a8f6ff6a291 ("arm64: dts: imx8mp-venice-gw71xx: add TPM device") Cc: stable@vger.kernel.org Signed-off-by: Tim Harvey <tharvey@gateworks.com> Link: https://lore.kernel.org/stable/20250523173723.4167474-1-tharvey%40gateworks.com Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24arm64: dts: freescale: imx8mm-verdin: Keep LDO5 always onFrancesco Dolcini1-0/+1
commit fbe94be09fa81343d623a86ec64a742759b669b3 upstream. LDO5 regulator is used to power the i.MX8MM NVCC_SD2 I/O supply, that is used for the SD2 card interface and also for some GPIOs. When the SD card interface is not enabled the regulator subsystem could turn off this supply, since it is not used anywhere else, however this will also remove the power to some other GPIOs, for example one I/O that is used to power the ethernet phy, leading to a non working ethernet interface. [ 31.820515] On-module +V3.3_1.8_SD (LDO5): disabling [ 31.821761] PMIC_USDHC_VSELECT: disabling [ 32.764949] fec 30be0000.ethernet end0: Link is Down Fix this keeping the LDO5 supply always on. Cc: stable@vger.kernel.org Fixes: 6a57f224f734 ("arm64: dts: freescale: add initial support for verdin imx8m mini") Fixes: f5aab0438ef1 ("regulator: pca9450: Fix enable register for LDO5") Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24arm64: dts: add big-endian property back into watchdog nodeMeng Li1-1/+2
commit 720fd1cbc0a0f3acdb26aedb3092ab10fe05e7ae upstream. Watchdog doesn't work on NXP ls1046ardb board because in commit 7c8ffc5555cb("arm64: dts: layerscape: remove big-endian for mmc nodes"), it intended to remove the big-endian from mmc node, but the big-endian of watchdog node is also removed by accident. So, add watchdog big-endian property back. In addition, add compatible string fsl,ls1046a-wdt, which allow big-endian property. Fixes: 7c8ffc5555cb ("arm64: dts: layerscape: remove big-endian for mmc nodes") Cc: stable@vger.kernel.org Signed-off-by: Meng Li <Meng.Li@windriver.com> Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-24arm64: dts: imx8mp-venice-gw74xx: fix TPM SPI frequencyTim Harvey1-1/+1
commit 0bdaca0922175478ddeadf8e515faa5269f6fae6 upstream. The IMX8MPDS Table 37 [1] shows that the max SPI master read frequency depends on the pins the interface is muxed behind with ECSPI2 muxed behind ECSPI2 supporting up to 25MHz. Adjust the spi-max-frequency based on these findings. [1] https://www.nxp.com/webapp/Download?colCode=IMX8MPIEC Fixes: 531936b218d8 ("arm64: dts: imx8mp-venice-gw74xx: update to revB PCB") Cc: stable@vger.kernel.org Signed-off-by: Tim Harvey <tharvey@gateworks.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17KVM: SVM: Set synthesized TSA CPUID flagsBorislav Petkov (AMD)1-0/+4
VERW_CLEAR is supposed to be set only by the hypervisor to denote TSA mitigation support to a guest. SQ_NO and L1_NO are both synthesizable, and are going to be set by hw CPUID on future machines. So keep the kvm_cpu_cap_init_kvm_defined() invocation *and* set them when synthesized. This fix is stable-only. Co-developed-by: Jinpu Wang <jinpu.wang@ionos.com> Signed-off-by: Jinpu Wang <jinpu.wang@ionos.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17arm64: Filter out SME hwcaps when FEAT_SME isn't implementedMark Brown1-19/+26
commit a75ad2fc76a2ab70817c7eed3163b66ea84ca6ac upstream. We have a number of hwcaps for various SME subfeatures enumerated via ID_AA64SMFR0_EL1. Currently we advertise these without cross checking against the main SME feature, advertised in ID_AA64PFR1_EL1.SME which means that if the two are out of sync userspace can see a confusing situation where SME subfeatures are advertised without the base SME hwcap. This can be readily triggered by using the arm64.nosme override which only masks out ID_AA64PFR1_EL1.SME, and there have also been reports of VMMs which do the same thing. Fix this as we did previously for SVE in 064737920bdb ("arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented") by filtering out the SME subfeature hwcaps when FEAT_SME is not present. Fixes: 5e64b862c482 ("arm64/sme: Basic enumeration support") Reported-by: Yury Khrustalev <yury.khrustalev@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250620-arm64-sme-filter-hwcaps-v1-1-02b9d3c2d8ef@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17riscv: vdso: Exclude .rodata from the PT_DYNAMIC segmentFangrui Song1-1/+1
[ Upstream commit e0eb1b6b0cd29ca7793c501d5960fd36ba11f110 ] .rodata is implicitly included in the PT_DYNAMIC segment due to inheriting the segment of the preceding .dynamic section (in both GNU ld and LLD). When the .rodata section's size is not a multiple of 16 bytes on riscv64, llvm-readelf will report a "PT_DYNAMIC dynamic table is invalid" warning. Note: in the presence of the .dynamic section, GNU readelf and llvm-readelf's -d option decodes the dynamic section using the section. This issue arose after commit 8f8c1ff879fab60f80f3a7aec3000f47e5b03ba9 ("riscv: vdso.lds.S: remove hardcoded 0x800 .text start addr"), which placed .rodata directly after .dynamic by removing .eh_frame. This patch resolves the implicit inclusion into PT_DYNAMIC by explicitly specifying the :text output section phdr. Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://github.com/ClangBuiltLinux/linux/issues/2093 Signed-off-by: Fangrui Song <i@maskray.me> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250602-riscv-vdso-v1-1-0620cf63cff0@maskray.me Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-17um: vector: Reduce stack usage in vector_eth_configure()Tiwei Bie1-29/+13
[ Upstream commit 2d65fc13be85c336c56af7077f08ccd3a3a15a4a ] When compiling with clang (19.1.7), initializing *vp using a compound literal may result in excessive stack usage. Fix it by initializing the required fields of *vp individually. Without this patch: $ objdump -d arch/um/drivers/vector_kern.o | ./scripts/checkstack.pl x86_64 0 ... 0x0000000000000540 vector_eth_configure [vector_kern.o]:1472 ... With this patch: $ objdump -d arch/um/drivers/vector_kern.o | ./scripts/checkstack.pl x86_64 0 ... 0x0000000000000540 vector_eth_configure [vector_kern.o]:208 ... Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506221017.WtB7Usua-lkp@intel.com/ Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20250623110829.314864-1-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-17x86/mm: Disable hugetlb page table sharing on 32-bitJann Horn1-1/+1
commit 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf upstream. Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86. Page table sharing requires at least three levels because it involves shared references to PMD tables; 32-bit x86 has either two-level paging (without PAE) or three-level paging (with PAE), but even with three-level paging, having a dedicated PGD entry for hugetlb is only barely possible (because the PGD only has four entries), and it seems unlikely anyone's actually using PMD sharing on 32-bit. Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which has 2-level paging) became particularly problematic after commit 59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"), since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and the `pt_share_count` (for PMDs) share the same union storage - and with 2-level paging, PMDs are PGDs. (For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the configuration of page tables such that it is never enabled with 2-level paging.) Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.") Reported-by: Vitaly Chikunov <vt@altlinux.org> Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Oscar Salvador <osalvador@suse.de> Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Vitaly Chikunov <vt@altlinux.org> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20250702-x86-2level-hugetlb-v2-1-1a98096edf92%40google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17x86/rdrand: Disable RDSEED on AMD Cyan SkillfishMikhail Paulyshka2-0/+8
commit 5b937a1ed64ebeba8876e398110a5790ad77407c upstream. AMD Cyan Skillfish (Family 17h, Model 47h, Stepping 0h) has an error that causes RDSEED to always return 0xffffffff, while RDRAND works correctly. Mask the RDSEED cap for this CPU so that both /proc/cpuinfo and direct CPUID read report RDSEED as unavailable. [ bp: Move to amd.c, massage. ] Signed-off-by: Mikhail Paulyshka <me@mixaill.net> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/20250524145319.209075-1-me@mixaill.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17KVM: SVM: Reject SEV{-ES} intra host migration if vCPU creation is in-flightSean Christopherson1-0/+4
commit ecf371f8b02d5e31b9aa1da7f159f1b2107bdb01 upstream. Reject migration of SEV{-ES} state if either the source or destination VM is actively creating a vCPU, i.e. if kvm_vm_ioctl_create_vcpu() is in the section between incrementing created_vcpus and online_vcpus. The bulk of vCPU creation runs _outside_ of kvm->lock to allow creating multiple vCPUs in parallel, and so sev_info.es_active can get toggled from false=>true in the destination VM after (or during) svm_vcpu_create(), resulting in an SEV{-ES} VM effectively having a non-SEV{-ES} vCPU. The issue manifests most visibly as a crash when trying to free a vCPU's NULL VMSA page in an SEV-ES VM, but any number of things can go wrong. BUG: unable to handle page fault for address: ffffebde00000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP KASAN NOPTI CPU: 227 UID: 0 PID: 64063 Comm: syz.5.60023 Tainted: G U O 6.15.0-smp-DEV #2 NONE Tainted: [U]=USER, [O]=OOT_MODULE Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 12.52.0-0 10/28/2024 RIP: 0010:constant_test_bit arch/x86/include/asm/bitops.h:206 [inline] RIP: 0010:arch_test_bit arch/x86/include/asm/bitops.h:238 [inline] RIP: 0010:_test_bit include/asm-generic/bitops/instrumented-non-atomic.h:142 [inline] RIP: 0010:PageHead include/linux/page-flags.h:866 [inline] RIP: 0010:___free_pages+0x3e/0x120 mm/page_alloc.c:5067 Code: <49> f7 06 40 00 00 00 75 05 45 31 ff eb 0c 66 90 4c 89 f0 4c 39 f0 RSP: 0018:ffff8984551978d0 EFLAGS: 00010246 RAX: 0000777f80000001 RBX: 0000000000000000 RCX: ffffffff918aeb98 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffebde00000000 RBP: 0000000000000000 R08: ffffebde00000007 R09: 1ffffd7bc0000000 R10: dffffc0000000000 R11: fffff97bc0000001 R12: dffffc0000000000 R13: ffff8983e19751a8 R14: ffffebde00000000 R15: 1ffffd7bc0000000 FS: 0000000000000000(0000) GS:ffff89ee661d3000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffebde00000000 CR3: 000000793ceaa000 CR4: 0000000000350ef0 DR0: 0000000000000000 DR1: 0000000000000b5f DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: <TASK> sev_free_vcpu+0x413/0x630 arch/x86/kvm/svm/sev.c:3169 svm_vcpu_free+0x13a/0x2a0 arch/x86/kvm/svm/svm.c:1515 kvm_arch_vcpu_destroy+0x6a/0x1d0 arch/x86/kvm/x86.c:12396 kvm_vcpu_destroy virt/kvm/kvm_main.c:470 [inline] kvm_destroy_vcpus+0xd1/0x300 virt/kvm/kvm_main.c:490 kvm_arch_destroy_vm+0x636/0x820 arch/x86/kvm/x86.c:12895 kvm_put_kvm+0xb8e/0xfb0 virt/kvm/kvm_main.c:1310 kvm_vm_release+0x48/0x60 virt/kvm/kvm_main.c:1369 __fput+0x3e4/0x9e0 fs/file_table.c:465 task_work_run+0x1a9/0x220 kernel/task_work.c:227 exit_task_work include/linux/task_work.h:40 [inline] do_exit+0x7f0/0x25b0 kernel/exit.c:953 do_group_exit+0x203/0x2d0 kernel/exit.c:1102 get_signal+0x1357/0x1480 kernel/signal.c:3034 arch_do_signal_or_restart+0x40/0x690 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x67/0xb0 kernel/entry/common.c:218 do_syscall_64+0x7c/0x150 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f87a898e969 </TASK> Modules linked in: gq(O) gsmi: Log Shutdown Reason 0x03 CR2: ffffebde00000000 ---[ end trace 0000000000000000 ]--- Deliberately don't check for a NULL VMSA when freeing the vCPU, as crashing the host is likely desirable due to the VMSA being consumed by hardware. E.g. if KVM manages to allow VMRUN on the vCPU, hardware may read/write a bogus VMSA page. Accessing PFN 0 is "fine"-ish now that it's sequestered away thanks to L1TF, but panicking in this scenario is preferable to potentially running with corrupted state. Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Fixes: 0b020f5af092 ("KVM: SEV: Add support for SEV-ES intra host migration") Fixes: b56639318bb2 ("KVM: SEV: Add support for SEV intra host migration") Cc: stable@vger.kernel.org Cc: James Houghton <jthoughton@google.com> Cc: Peter Gonda <pgonda@google.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Tested-by: Liam Merwick <liam.merwick@oracle.com> Reviewed-by: James Houghton <jthoughton@google.com> Link: https://lore.kernel.org/r/20250602224459.41505-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17KVM: x86/xen: Allow 'out of range' event channel ports in IRQ routing table.David Woodhouse1-2/+13
commit a7f4dff21fd744d08fa956c243d2b1795f23cbf7 upstream. To avoid imposing an ordering constraint on userspace, allow 'invalid' event channel targets to be configured in the IRQ routing table. This is the same as accepting interrupts targeted at vCPUs which don't exist yet, which is already the case for both Xen event channels *and* for MSIs (which don't do any filtering of permitted APIC ID targets at all). If userspace actually *triggers* an IRQ with an invalid target, that will fail cleanly, as kvm_xen_set_evtchn_fast() also does the same range check. If KVM enforced that the IRQ target must be valid at the time i