summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2024-08-30Revert "powerpc/8xx: Always pin kernel text TLB"Christophe Leroy3-1/+17
This reverts commit bccc58986a2f98e3af349c85c5f49aac7fb19ef2. When STRICT_KERNEL_RWX is selected, EXEC memory must stop where RW memory start. When pinning iTLBs it means an 8M alignment for RW data start. That may be acceptable on boards with a lot of memory but one of my supported boards only has 32 Mbytes and this forced alignment leads to a waste of almost 4 Mbytes with is more than 10% of the total memory. So revert commit bccc58986a2f ("powerpc/8xx: Always pin kernel text TLB") but don't restore previous behaviour in ITLB miss handler as now kernel PGD entries are copied into each process PGDIR. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/01b6780b860c8043b51a1ba9d83acfc6f2dde910.1724173828.git.christophe.leroy@csgroup.eu
2024-08-30powerpc/8xx: Copy kernel PGD entries into all PGDIRsChristophe Leroy1-1/+7
In order to avoid having to select PGDIR at each TLB miss based on fault address, copy kernel PGD entries into all PGDIRs in pgd_alloc(). At first it will be used for ITLB misses for kernel TEXT, then for execmem then for kernel DATA. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/c6d2bf5af2ea909071a85bdca8b1f5dc2df134a8.1724173828.git.christophe.leroy@csgroup.eu
2024-08-30powerpc/8xx: Fix kernel vs user address comparisonChristophe Leroy1-3/+3
Since commit 9132a2e82adc ("powerpc/8xx: Define a MODULE area below kernel text"), module exec space is below PAGE_OFFSET so not only space above PAGE_OFFSET, but space above TASK_SIZE need to be seen as kernel space. Until now the problem went undetected because by default TASK_SIZE is 0x8000000 which means address space is determined by just checking upper address bit. But when TASK_SIZE is over 0x80000000, PAGE_OFFSET is used for comparison, leading to thinking module addresses are part of user space. Fix it by using TASK_SIZE instead of PAGE_OFFSET for address comparison. Fixes: 9132a2e82adc ("powerpc/8xx: Define a MODULE area below kernel text") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/3f574c9845ff0a023b46cb4f38d2c45aecd769bd.1724173828.git.christophe.leroy@csgroup.eu
2024-08-30powerpc/8xx: Fix initial memory mappingChristophe Leroy1-2/+2
Commit cf209951fa7f ("powerpc/8xx: Map linear memory with huge pages") introduced an initial mapping of kernel TEXT using PAGE_KERNEL_TEXT, but the pages that contain kernel TEXT may also contain kernel RODATA, and depending on selected debug options PAGE_KERNEL_TEXT may be either RWX or ROX. RODATA must be writable during init because it also contains ro_after_init data. So use PAGE_KERNEL_X instead to be sure it is RWX. Fixes: cf209951fa7f ("powerpc/8xx: Map linear memory with huge pages") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/dac7a828d8497c4548c91840575a706657baa4f1.1724173828.git.christophe.leroy@csgroup.eu
2024-08-30powerpc/powernv/pci: Remove obsoleted declaration for pnv_pci_init_ioda_hubGaosheng Cui1-1/+0
The pnv_pci_init_ioda_hub() have been removed since commit 5ac129cdb50b ("powerpc/powernv/pci: Remove ioda1 support"), and now it is useless, so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240822130043.783756-1-cuigaosheng1@huawei.com
2024-08-30powerpc: Remove obsoleted declarations for use_cop and drop_copGaosheng Cui1-3/+0
The use_cop() and drop_cop() have been removed since commit 6ff4d3e96652 ("powerpc: Remove old unused icswx based coprocessor support"), now they are useless, so remove them. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240822130609.786431-5-cuigaosheng1@huawei.com
2024-08-30powerpc/pasemi: Remove obsoleted declaration for pas_pci_irq_fixup()Gaosheng Cui1-1/+0
The pas_pci_irq_fixup() have been removed since commit 771f7404a9de ("pasemi_mac: Move the IRQ mapping from the PCI layer to the driver"), and now it is useless, so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240822130609.786431-4-cuigaosheng1@huawei.com
2024-08-30powerpc/maple: Remove obsoleted declaration for maple_calibrate_decr()Gaosheng Cui1-1/+0
The maple_calibrate_decr() have been removed since commit 10f7e7c15e6c ("[PATCH] ppc64: consolidate calibrate_decr implementations"), and now it is useless, so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240822130609.786431-3-cuigaosheng1@huawei.com
2024-08-30KVM: arm64: nv: Add support for FEAT_ATS1AMarc Zyngier4-0/+24
Handling FEAT_ATS1A (which provides the AT S1E{1,2}A instructions) is pretty easy, as it is just the usual AT without the permission check. This basically amounts to plumbing the instructions in the various dispatch tables, and handling FEAT_ATS1A being disabled in the ID registers. Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Plumb handling of AT S1* traps from EL2Marc Zyngier1-0/+45
Hooray, we're done. Plug the AT traps into the system instruction table, and let it rip. Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3Marc Zyngier1-1/+16
FEAT_PAN3 added a check for executable permissions to FEAT_PAN2. Add the required SCTLR_ELx.EPAN and descriptor checks to handle this correctly. Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configurationMarc Zyngier1-0/+8
Ensure that SCTLR_EL1.EPAN is RES0 when FEAT_PAN3 isn't supported. Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Add SW walker for AT S1 emulationMarc Zyngier1-2/+608
In order to plug the brokenness of our current AT implementation, we need a SW walker that is going to... err.. walk the S1 tables and tell us what it finds. Of course, it builds on top of our S2 walker, and share similar concepts. The beauty of it is that since it uses kvm_read_guest(), it is able to bring back pages that have been otherwise evicted. This is then plugged in the two AT S1 emulation functions as a "slow path" fallback. I'm not sure it is that slow, but hey. Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Make ps_to_output_size() generally availableMarc Zyngier2-14/+14
Make this helper visible to at.c, we are going to need it. Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}Marc Zyngier2-0/+254
On the face of it, AT S12E{0,1}{R,W} is pretty simple. It is the combination of AT S1E{0,1}{R,W}, followed by an extra S2 walk. However, there is a great deal of complexity coming from combining the S1 and S2 attributes to report something consistent in PAR_EL1. This is an absolute mine field, and I have a splitting headache. Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Add basic emulation of AT S1E2{R,W}Marc Zyngier2-0/+52
Similar to our AT S1E{0,1} emulation, we implement the AT S1E2 handling. This emulation of course suffers from the same problems, but is somehow simpler due to the lack of PAN2 and the fact that we are guaranteed to execute it from the correct context. Co-developed-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Add basic emulation of AT S1E1{R,W}PMarc Zyngier1-0/+26
Building on top of our primitive AT S1E{0,1}{R,W} emulation, add minimal support for the FEAT_PAN2 instructions, momentary context-switching PSTATE.PAN so that it takes effect in the context of the guest. Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W}Marc Zyngier3-1/+142
Emulating AT instructions is one the tasks devolved to the host hypervisor when NV is on. Here, we take the basic approach of emulating AT S1E{0,1}{R,W} using the AT instructions themselves. While this mostly work, it doesn't *always* work: - S1 page tables can be swapped out - shadow S2 can be incomplete and not contain mappings for the S1 page tables We are not trying to handle these case here, and defer it to a later patch. Suitable comments indicate where we are in dire need of better handling. Co-developed-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Honor absence of FEAT_PAN2Marc Zyngier1-0/+4
If our guest has been configured without PAN2, make sure that AT S1E1{R,W}P will generate an UNDEF. Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptorMarc Zyngier2-8/+8
The upper_attr attribute has been badly named, as it most of the time carries the full "last walked descriptor". Rename it to "desc" and make ti contain the full 64bit descriptor. This will be used by the S1 PTW. Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: nv: Enforce S2 alignment when contiguous bit is setMarc Zyngier2-5/+24
Despite KVM not using the contiguous bit for anything related to TLBs, the spec does require that the alignment defined by the contiguous bit for the page size and the level is enforced. Add the required checks to offset the point where PA and VA merge. Fixes: 61e30b9eef7f ("KVM: arm64: nv: Implement nested Stage-2 page table walk logic") Reported-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30arm64: Add ESR_ELx_FSC_ADDRSZ_L() helperMarc Zyngier1-2/+3
Although we have helpers that encode the level of a given fault type, the Address Size fault type is missing it. While we're at it, fix the bracketting for ESR_ELx_FSC_ACCESS_L() and ESR_ELx_FSC_PERM_L(). Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30arm64: Add system register encoding for PSTATE.PANMarc Zyngier1-0/+3
Although we already have the primitives to set PSTATE.PAN with an immediate, we don't have a way to read the current state nor set it ot an arbitrary value (i.e. we can generally save/restore it). Thankfully, all that is missing for this is the definition for the PAN pseudo system register, here named SYS_PSTATE_PAN. Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30arm64: Add PAR_EL1 field descriptionMarc Zyngier1-0/+18
As KVM is about to grow a full emulation for the AT instructions, add the layout of the PAR_EL1 register in its non-D128 configuration. Note that the constants are a bit ugly, as the register has two layouts, based on the state of the F bit. Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30arm64: Add missing APTable and TCR_ELx.HPD masksMarc Zyngier2-0/+10
Although Linux doesn't make use of hierarchical permissions (TFFT!), KVM needs to know where the various bits related to this feature live in the TCR_ELx registers as well as in the page tables. Add the missing bits. Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30KVM: arm64: Make kvm_at() take an OP_AT_*Joey Gouly2-2/+3
To allow using newer instructions that current assemblers don't know about, replace the `at` instruction with the underlying SYS instruction. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-08-30powerpc: Remove obsoleted declaration for _get_SPGaosheng Cui1-2/+0
The implementation of _get_SP() was removed in commit f4db196717c6 ("[POWERPC] Remove _get_SP"), remove the now obsolete declaration. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Update change log to refer to correct commit per Christophe] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240822130609.786431-2-cuigaosheng1@huawei.com
2024-08-30arm64: dts: amlogic: add C3 AW419 boardXianwei Zhao2-0/+263
Add Amlogic C3 C308L AW419 board. The corresponding binding has been applied, therefore, this series does not need to add a binding corresponding to the AW419 board. Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com> Link: https://lore.kernel.org/r/20240830-c3_add_node-v4-3-b56c0511e9dc@amlogic.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2024-08-30arm64: dts: amlogic: add some device nodes for C3Xianwei Zhao2-1/+720
Add some device nodes for SoC C3, including periphs clock controller node, PLL clock controller node, SPICC node, regulator node, NAND controller node, sdcard node, Ethernet MAC and PHY node. The sdacrd depends on regulator and pinctrl(select), so some property fields are placed at the board level. The nand chip is placed on the board, So some property fields about SPIFC and NAND controller node are placed at the board level. THe Ethernet MAC support outchip PHY, so place this property field(select PHY) at the board level. Signed-off-by: Xianwei Zhao <xianwei.zhao@amlogic.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20240830-c3_add_node-v4-2-b56c0511e9dc@amlogic.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
2024-08-29KVM: x86: Add fastpath handling of HLT VM-ExitsSean Christopherson4-3/+36
Add a fastpath for HLT VM-Exits by immediately re-entering the guest if it has a pending wake event. When virtual interrupt delivery is enabled, i.e. when KVM doesn't need to manually inject interrupts, this allows KVM to stay in the fastpath run loop when a vIRQ arrives between the guest doing CLI and STI;HLT. Without AMD's Idle HLT-intercept support, the CPU generates a HLT VM-Exit even though KVM will immediately resume the guest. Note, on bare metal, it's relatively uncommon for a modern guest kernel to actually trigger this scenario, as the window between the guest checking for a wake event and committing to HLT is quite small. But in a nested environment, the timings change significantly, e.g. rudimentary testing showed that ~50% of HLT exits where HLT-polling was successful would be serviced by this fastpath, i.e. ~50% of the time that a nested vCPU gets a wake event before KVM schedules out the vCPU, the wake event was pending even before the VM-Exit. Link: https://lore.kernel.org/all/20240528041926.3989-3-manali.shukla@amd.com Link: https://lore.kernel.org/r/20240802195120.325560-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: x86: Reorganize code in x86.c to co-locate vCPU blocking/running helpersSean Christopherson1-132/+132
Shuffle code around in x86.c so that the various helpers related to vCPU blocking/running logic are (a) located near each other and (b) ordered so that HLT emulation can use kvm_vcpu_has_events() in a future path. No functional change intended. Link: https://lore.kernel.org/r/20240802195120.325560-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: x86: Exit to userspace if fastpath triggers one on instruction skipSean Christopherson2-2/+8
Exit to userspace if a fastpath handler triggers such an exit, which can happen when skipping the instruction, e.g. due to userspace single-stepping the guest via KVM_GUESTDBG_SINGLESTEP or because of an emulation failure. Fixes: 404d5d7bff0d ("KVM: X86: Introduce more exit_fastpath_completion enum values") Link: https://lore.kernel.org/r/20240802195120.325560-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: x86: Dedup fastpath MSR post-handling logicSean Christopherson1-10/+11
Now that the WRMSR fastpath for x2APIC_ICR and TSC_DEADLINE are identical, ignoring the backend MSR handling, consolidate the common bits of skipping the instruction and setting the return value. No functional change intended. Link: https://lore.kernel.org/r/20240802195120.325560-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: x86: Re-enter guest if WRMSR(X2APIC_ICR) fastpath is successfulSean Christopherson1-1/+1
Re-enter the guest in the fastpath if WRMSR emulation for x2APIC's ICR is successful, as no additional work is needed, i.e. there is no code unique for WRMSR exits between the fastpath and the "!= EXIT_FASTPATH_NONE" check in __vmx_handle_exit(). Link: https://lore.kernel.org/r/20240802195120.325560-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: SVM: Track the per-CPU host save area as a VMCB pointerSean Christopherson2-8/+9
The host save area is a VMCB, track it as such to help readers follow along, but mostly to cleanup/simplify the retrieval of the SEV-ES host save area. Note, the compile-time assertion that offsetof(struct vmcb, save) == EXPECTED_VMCB_CONTROL_AREA_SIZE ensures that the SEV-ES save area is indeed at offset 0x400 (whoever added the expected/architectural VMCB offsets apparently likes decimal). No functional change intended. Link: https://lore.kernel.org/r/20240802204511.352017-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: SVM: Add host SEV-ES save area structure into VMCB via a unionSean Christopherson1-5/+15
Incorporate the _host_ SEV-ES save area into the VMCB as a union with the legacy save area. The SEV-ES variant used to save/load host state is larger than the legacy save area, but resides at the same offset. Prefix the field with "host" to make it as obvious as possible that the SEV-ES variant in the VMCB is only ever used for host state. Guest state for SEV-ES VMs is stored in a completely separate page (VMSA), albeit with the same layout as the host state. Add a compile-time assert to ensure the VMCB layout is correct, i.e. that KVM's layout matches the architectural definitions. No functional change intended. Link: https://lore.kernel.org/r/20240802204511.352017-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: SVM: Add a helper to convert a SME-aware PA back to a struct pageSean Christopherson2-6/+19
Add __sme_pa_to_page() to pair with __sme_page_pa() and use it to replace open coded equivalents, including for "iopm_base", which previously avoided having to do __sme_clr() by storing the raw PA in the global variable. Opportunistically convert __sme_page_pa() to a helper to provide type safety. No functional change intended. Link: https://lore.kernel.org/r/20240802204511.352017-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: x86/mmu: Reword a misleading comment about checking gpte_changed()Sean Christopherson1-2/+8
Rewrite the comment in FNAME(fetch) to explain why KVM needs to check that the gPTE is still fresh before continuing the shadow page walk, even if KVM already has a linked shadow page for the gPTE in question. No functional change intended. Link: https://lore.kernel.org/r/20240802203900.348808-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: x86/mmu: Drop pointless "return" wrapper label in FNAME(fetch)Sean Christopherson1-7/+4
Drop the pointless and poorly named "out_gpte_changed" label, in FNAME(fetch), and instead return RET_PF_RETRY directly. No functional change intended. Link: https://lore.kernel.org/r/20240802203900.348808-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: x86/mmu: Decrease indentation in logic to sync new indirect shadow pageSean Christopherson1-21/+19
Combine the back-to-back if-statements for synchronizing children when linking a new indirect shadow page in order to decrease the indentation, and to make it easier to "see" the logic in its entirety. No functional change intended. Link: https://lore.kernel.org/r/20240802203900.348808-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: x86: Re-split x2APIC ICR into ICR+ICR2 for AMD (x2AVIC)Sean Christopherson4-12/+36
Re-introduce the "split" x2APIC ICR storage that KVM used prior to Intel's IPI virtualization support, but only for AMD. While not stated anywhere in the APM, despite stating the ICR is a single 64-bit register, AMD CPUs store the 64-bit ICR as two separate 32-bit values in ICR and ICR2. When IPI virtualization (IPIv on Intel, all AVIC flavors on AMD) is enabled, KVM needs to match CPU behavior as some ICR ICR writes will be handled by the CPU, not by KVM. Add a kvm_x86_ops knob to control the underlying format used by the CPU to store the x2APIC ICR, and tune it to AMD vs. Intel regardless of whether or not x2AVIC is enabled. If KVM is handling all ICR writes, the storage format for x2APIC mode doesn't matter, and having the behavior follow AMD versus Intel will provide better test coverage and ease debugging. Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode") Cc: stable@vger.kernel.org Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20240719235107.3023592-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: x86: Move x2APIC ICR helper above kvm_apic_write_nodecode()Sean Christopherson1-23/+23
Hoist kvm_x2apic_icr_write() above kvm_apic_write_nodecode() so that a local helper to _read_ the x2APIC ICR can be added and used in the nodecode path without needing a forward declaration. No functional change intended. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240719235107.3023592-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29KVM: x86: Enforce x2APIC's must-be-zero reserved ICR bitsSean Christopherson1-1/+14
Inject a #GP on a WRMSR(ICR) that attempts to set any reserved bits that are must-be-zero on both Intel and AMD, i.e. any reserved bits other than the BUSY bit, which Intel ignores and basically says is undefined. KVM's xapic_state_test selftest has been fudging the bug since commit 4b88b1a518b3 ("KVM: selftests: Enhance handling WRMSR ICR register in x2APIC mode"), which essentially removed the testcase instead of fixing the bug. WARN if the nodecode path triggers a #GP, as the CPU is supposed to check reserved bits for ICR when it's partially virtualized. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240719235107.3023592-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-08-29arm64: dts: rockchip: drop unsupported regulator-property from NanoPC-T6Heiko Stuebner1-1/+0
vcc3v3-sd-s0-regulator used enable-active-low. According the binding of the fixed regulator, that is the assumed mode of operation if enable-active-high is not specified. So this is property is not part of the binding, therefore remove it. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240829132100.1723127-4-heiko@sntech.de Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2024-08-29arm64: dts: rockchip: drop unsupported regulator property from NanoPC-T6Heiko Stuebner1-1/+0
regulator-init-microvolt is used in the vendor-kernel, but not part of the specification. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240829132100.1723127-3-heiko@sntech.de Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2024-08-29arm64: dts: rockchip: use correct fcs,suspend-voltage-selector on NanoPC-T6Heiko Stuebner1-1/+1
A remant from moving from the vendor kernel, the regulator is using a fairchild fcs prefix instead of rockchip,* in the mainline kernel according to its binding. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240829132100.1723127-2-heiko@sntech.de Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2024-08-29s390/hiperdispatch: Add hiperdispatch debug countersMete Durlu1-0/+77
Add three counters to follow and understand hiperdispatch behavior; * adjustment_count (amount of capacity adjustments triggered) * greedy_time_ms (time spent while all cpus are on high capacity) * conservative_time_ms (time spent while only entitled cpus are on high capacity) These counters can be found under /sys/kernel/debug/s390/hiperdispatch/ Time counters are in <msec> format and only cover the time spent when hiperdispatch is active. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-29s390/hiperdispatch: Add hiperdispatch debug attributesMete Durlu1-2/+82
Add two attributes for debug purposes. They can be found under; /sys/devices/system/cpu/hiperdispatch/ * hd_stime_threshold : allows user to adjust steal time threshold * hd_delay_factor : allows user to adjust delay factor of hiperdispatch work (after topology updates, delayed work is always delayed extra by this factor) hd_stime_threshold can have values between 0-100 as it represents a percentage value. hd_delay_factor can have values greater than 1. It is multiplied with the default delay to achieve a longer interval, pushing back the next hiperdispatch adjustment after a topology update. Ex: if delay interval is 250ms and the delay factor is 4; delayed interval is now 1000ms(1sec). After each capacity adjustment or topology change, work has a delayed interval of 1 sec for one interval. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-29s390/hiperdispatch: Add hiperdispatch sysctl interfaceMete Durlu2-0/+72
Expose hiperdispatch controls via sysctl. The user can now toggle hiperdispatch via assigning 0 or 1 to s390.hiperdispatch attribute. When hiperdipatch is toggled on, it tries to adjust CPU capacities, while system is in vertical polarization to gain performance benefits from different CPU polarizations. Disabling hiperdispatch reverts the CPU capacities to their default (HIGH_CAPACITY) and stops the dynamic adjustments. Introduce a kconfig option HIPERDISPATCH_ON which allows users to use hiperdispatch by default on vertical polarization. Using the sysctl attribute s390.hiperdispatch would overwrite this behavior. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-08-29s390/hiperdispatch: Add trace eventsMete Durlu2-0/+63
Add trace events to debug hiperdispatch behavior and track domain rebuilding. Two events provide information about the decision making of hiperdispatch and the adjustments made. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Co-developed-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Tobias Huschle <huschle@linux.ibm.com> Signed-off-by: Mete Durlu <meted@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>