summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2021-09-18KVM: PPC: Book3S HV: Fix copy_tofrom_guest routinesFabiano Rosas1-2/+4
[ Upstream commit 5d7d6dac8fe99ed59eee2300e4a03370f94d5222 ] The __kvmhv_copy_tofrom_guest_radix function was introduced along with nested HV guest support. It uses the platform's Radix MMU quadrants to provide a nested hypervisor with fast access to its nested guests memory (H_COPY_TOFROM_GUEST hypercall). It has also since been added as a fast path for the kvmppc_ld/st routines which are used during instruction emulation. The commit def0bfdbd603 ("powerpc: use probe_user_read() and probe_user_write()") changed the low level copy function from raw_copy_from_user to probe_user_read, which adds a check to access_ok. In powerpc that is: static inline bool __access_ok(unsigned long addr, unsigned long size) { return addr < TASK_SIZE_MAX && size <= TASK_SIZE_MAX - addr; } and TASK_SIZE_MAX is 0x0010000000000000UL for 64-bit, which means that setting the two MSBs of the effective address (which correspond to the quadrant) now cause access_ok to reject the access. This was not caught earlier because the most common code path via kvmppc_ld/st contains a fallback (kvm_read_guest) that is likely to succeed for L1 guests. For nested guests there is no fallback. Another issue is that probe_user_read (now __copy_from_user_nofault) does not return the number of bytes not copied in case of failure, so the destination memory is not being cleared anymore in kvmhv_copy_from_guest_radix: ret = kvmhv_copy_tofrom_guest_radix(vcpu, eaddr, to, NULL, n); if (ret > 0) <-- always false! memset(to + (n - ret), 0, ret); This patch fixes both issues by skipping access_ok and open-coding the low level __copy_to/from_user_inatomic. Fixes: def0bfdbd603 ("powerpc: use probe_user_read() and probe_user_write()") Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210805212616.2641017-2-farosas@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18powerpc/config: Renable MTD_PHYSMAP_OFJoel Stanley1-0/+1
[ Upstream commit d0e28a6145c3455b69991245e7f6147eb914b34a ] CONFIG_MTD_PHYSMAP_OF is not longer enabled as it depends on MTD_PHYSMAP which is not enabled. This is a regression from commit 642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c"), which added the extra dependency. Add CONFIG_MTD_PHYSMAP=y so this stays in the config, as Christophe said it is useful for build coverage. Fixes: 642b1e8dbed7 ("mtd: maps: Merge physmap_of.c into physmap-core.c") Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210817045407.2445664-3-joel@jms.id.au Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18powerpc/numa: Consider the max NUMA node for migratable LPARLaurent Dufour1-3/+10
[ Upstream commit 9c7248bb8de31f51c693bfa6a6ea53b1c07e0fa8 ] When a LPAR is migratable, we should consider the maximum possible NUMA node instead of the number of NUMA nodes from the actual system. The DT property 'ibm,current-associativity-domains' defines the maximum number of nodes the LPAR can see when running on that box. But if the LPAR is being migrated on another box, it may see up to the nodes defined by 'ibm,max-associativity-domains'. So if a LPAR is migratable, that value should be used. Unfortunately, there is no easy way to know if an LPAR is migratable or not. The hypervisor exports the property 'ibm,migratable-partition' in the case it set to migrate partition, but that would not mean that the current partition is migratable. Without this patch, when a LPAR is started on a 2 node box and then migrated to a 3 node box, the hypervisor may spread the LPAR's CPUs on the 3rd node. In that case if a CPU from that 3rd node is added to the LPAR, it will be wrongly assigned to the node because the kernel has been set to use up to 2 nodes (the configuration of the departure node). With this patch applies, the CPU is correctly added to the 3rd node. Fixes: f9f130ff2ec9 ("powerpc/numa: Detect support for coregroup") Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210511073136.17795-1-ldufour@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18openrisc: don't printk() unconditionallyRandy Dunlap1-0/+2
[ Upstream commit 946e1052cdcc7e585ee5d1e72528ca49fb295243 ] Don't call printk() when CONFIG_PRINTK is not set. Fixes the following build errors: or1k-linux-ld: arch/openrisc/kernel/entry.o: in function `_external_irq_handler': (.text+0x804): undefined reference to `printk' (.text+0x804): relocation truncated to fit: R_OR1K_INSN_REL_26 against undefined symbol `printk' Fixes: 9d02a4283e9c ("OpenRISC: Boot code") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: openrisc@lists.librecores.org Signed-off-by: Stafford Horne <shorne@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18powerpc/stacktrace: Include linux/delay.hMichal Suchanek1-0/+1
[ Upstream commit a6cae77f1bc89368a4e2822afcddc45c3062d499 ] commit 7c6986ade69e ("powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()") introduces udelay() call without including the linux/delay.h header. This may happen to work on master but the header that declares the functionshould be included nonetheless. Fixes: 7c6986ade69e ("powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210729180103.15578-1-msuchanek@suse.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18ARM: 9105/1: atags_to_fdt: don't warn about stack sizeDavid Heidelberg1-0/+2
commit b30d0289de72c62516df03fdad8d53f552c69839 upstream. The merge_fdt_bootargs() function by definition consumes more than 1024 bytes of stack because it has a 1024 byte command line on the stack, meaning that we always get a warning when building this file: arch/arm/boot/compressed/atags_to_fdt.c: In function 'merge_fdt_bootargs': arch/arm/boot/compressed/atags_to_fdt.c:98:1: warning: the frame size of 1032 bytes is larger than 1024 bytes [-Wframe-larger-than=] However, as this is the decompressor and we know that it has a very shallow call chain, and we do not actually risk overflowing the kernel stack at runtime here. This just shuts up the warning by disabling the warning flag for this file. Tested on Nexus 7 2012 builds. Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: David Heidelberg <david@ixit.cz> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: <stable@vger.kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-18arm64: head: avoid over-mapping in map_memoryMark Rutland2-7/+8
commit 90268574a3e8a6b883bd802d702a2738577e1006 upstream. The `compute_indices` and `populate_entries` macros operate on inclusive bounds, and thus the `map_memory` macro which uses them also operates on inclusive bounds. We pass `_end` and `_idmap_text_end` to `map_memory`, but these are exclusive bounds, and if one of these is sufficiently aligned (as a result of kernel configuration, physical placement, and KASLR), then: * In `compute_indices`, the computed `iend` will be in the page/block *after* the final byte of the intended mapping. * In `populate_entries`, an unnecessary entry will be created at the end of each level of table. At the leaf level, this entry will map up to SWAPPER_BLOCK_SIZE bytes of physical addresses that we did not intend to map. As we may map up to SWAPPER_BLOCK_SIZE bytes more than intended, we may violate the boot protocol and map physical address past the 2MiB-aligned end address we are permitted to map. As we map these with Normal memory attributes, this may result in further problems depending on what these physical addresses correspond to. The final entry at each level may require an additional table at that level. As EARLY_ENTRIES() calculates an inclusive bound, we allocate enough memory for this. Avoid the extraneous mapping by having map_memory convert the exclusive end address to an inclusive end address by subtracting one, and do likewise in EARLY_ENTRIES() when calculating the number of required tables. For clarity, comments are updated to more clearly document which boundaries the macros operate on. For consistency with the other macros, the comments in map_memory are also updated to describe `vstart` and `vend` as virtual addresses. Fixes: 0370b31e4845 ("arm64: Extend early page table code to allow for larger kernels") Cc: <stable@vger.kernel.org> # 4.16.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210823101253.55567-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-18arm64: mm: Fix TLBI vs ASID rolloverWill Deacon2-9/+31
commit 5e10f9887ed85d4f59266d5c60dd09be96b5dbd4 upstream. When switching to an 'mm_struct' for the first time following an ASID rollover, a new ASID may be allocated and assigned to 'mm->context.id'. This reassignment can happen concurrently with other operations on the mm, such as unmapping pages and subsequently issuing TLB invalidation. Consequently, we need to ensure that (a) accesses to 'mm->context.id' are atomic and (b) all page-table updates made prior to a TLBI using the old ASID are guaranteed to be visible to CPUs running with the new ASID. This was found by inspection after reviewing the VMID changes from Shameer but it looks like a real (yet hard to hit) bug. Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Jade Alglave <jade.alglave@arm.com> Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210806113109.2475-2-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-18xen: fix setting of max_pfn in shared_infoJuergen Gross1-2/+2
commit 4b511d5bfa74b1926daefd1694205c7f1bcf677f upstream. Xen PV guests are specifying the highest used PFN via the max_pfn field in shared_info. This value is used by the Xen tools when saving or migrating the guest. Unfortunately this field is misnamed, as in reality it is specifying the number of pages (including any memory holes) of the guest, so it is the highest used PFN + 1. Renaming isn't possible, as this is a public Xen hypervisor interface which needs to be kept stable. The kernel will set the value correctly initially at boot time, but when adding more pages (e.g. due to memory hotplug or ballooning) a real PFN number is stored in max_pfn. This is done when expanding the p2m array, and the PFN stored there is even possibly wrong, as it should be the last possible PFN of the just added P2M frame, and not one which led to the P2M expansion. Fix that by setting shared_info->max_pfn to the last possible PFN + 1. Fixes: 98dd166ea3a3c3 ("x86/xen/p2m: hint at the last populated P2M entry") Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Link: https://lore.kernel.org/r/20210730092622.9973-2-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-18powerpc/perf/hv-gpci: Fix counter value parsingKajol Jain1-1/+1
commit f9addd85fbfacf0d155e83dbee8696d6df5ed0c7 upstream. H_GetPerformanceCounterInfo (0xF080) hcall returns the counter data in the result buffer. Result buffer has specific format defined in the PAPR specification. One of the fields is counter offset and width of the counter data returned. Counter data are returned in a unsigned char array in big endian byte order. To get the final counter data, the values must be left shifted byte at a time. But commit 220a0c609ad17 ("powerpc/perf: Add support for the hv gpci (get performance counter info) interface") made the shifting bitwise and also assumed little endian order. Because of that, hcall counters values are reported incorrectly. In particular this can lead to counters go backwards which messes up the counter prev vs now calculation and leads to huge counter value reporting: #: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ -C 0 -I 1000 time counts unit events 1.000078854 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 2.000213293 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 3.000320107 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 4.000428392 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 5.000537864 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 6.000649087 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 7.000760312 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 8.000865218 16,448 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 9.000978985 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 10.001088891 16,384 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 11.001201435 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ 12.001307937 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ Fix the shifting logic to correct match the format, ie. read bytes in big endian order. Fixes: e4f226b1580b ("powerpc/perf/hv-gpci: Increase request buffer size") Cc: stable@vger.kernel.org # v4.6+ Reported-by: Nageswara R Sastry<rnsastry@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Nageswara R Sastry<rnsastry@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210813082158.429023-1-kjain@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-15ARM: dts: at91: add pinctrl-{names, 0} for all gpiosClaudiu Beznea3-1/+63
commit bf781869e5cf3e4ec1a47dad69b6f0df97629cbd upstream. Add pinctrl-names and pinctrl-0 properties on controllers that claims to use pins to avoid failures due to commit 2ab73c6d8323 ("gpio: Support GPIO controllers without pin-ranges") and also to avoid using pins that may be claimed my other IPs. Fixes: b7c2b6157079 ("ARM: at91: add Atmel's SAMA5D3 Xplained board") Fixes: 1e5f532c2737 ("ARM: dts: at91: sam9x60: add device tree for soc and board") Fixes: 38153a017896 ("ARM: at91/dt: sama5d4: add dts for sama5d4 xplained board") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Link: https://lore.kernel.org/r/20210727074006.1609989-1-claudiu.beznea@microchip.com Cc: <stable@vger.kernel.org> # v5.7+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-15KVM: nVMX: Unconditionally clear nested.pi_pending on nested VM-EnterSean Christopherson1-4/+3
commit f7782bb8d818d8f47c26b22079db10599922787a upstream. Clear nested.pi_pending on nested VM-Enter even if L2 will run without posted interrupts enabled. If nested.pi_pending is left set from a previous L2, vmx_complete_nested_posted_interrupt() will pick up the stale flag and exit to userspace with an "internal emulation error" due the new L2 not having a valid nested.pi_desc. Arguably, vmx_complete_nested_posted_interrupt() should first check for posted interrupts being enabled, but it's also completely reasonable that KVM wouldn't screw up a fundamental flag. Not to mention that the mere existence of nested.pi_pending is a long-standing bug as KVM shouldn't move the posted interrupt out of the IRR until it's actually processed, e.g. KVM effectively drops an interrupt when it performs a nested VM-Exit with a "pending" posted interrupt. Fixing the mess is a future problem. Prior to vmx_complete_nested_posted_interrupt() interpreting a null PI descriptor as an error, this was a benign bug as the null PI descriptor effectively served as a check on PI not being enabled. Even then, the new flow did not become problematic until KVM started checking the result of kvm_check_nested_events(). Fixes: 705699a13994 ("KVM: nVMX: Enable nested posted interrupt processing") Fixes: 966eefb89657 ("KVM: nVMX: Disable vmcs02 posted interrupts if vmcs12 PID isn't mappable") Fixes: 47d3530f86c0 ("KVM: x86: Exit to userspace when kvm_check_nested_events fails") Cc: stable@vger.kernel.org Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210810144526.2662272-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-15KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulationMaxim Levitsky1-0/+3
commit 81b4b56d4f8130bbb99cf4e2b48082e5b4cfccb9 upstream. If we are emulating an invalid guest state, we don't have a correct exit reason, and thus we shouldn't do anything in this function. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210826095750.1650467-2-mlevitsk@redhat.com> Cc: stable@vger.kernel.org Fixes: 95b5a48c4f2b ("KVM: VMX: Handle NMIs, #MCs and async #PFs in common irqs-disabled fn", 2019-06-18) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-15KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is ↵Zelin Deng1-0/+4
adjusted commit d9130a2dfdd4b21736c91b818f87dbc0ccd1e757 upstream. When MSR_IA32_TSC_ADJUST is written by guest due to TSC ADJUST feature especially there's a big tsc warp (like a new vCPU is hot-added into VM which has been up for a long time), tsc_offset is added by a large value then go back to guest. This causes system time jump as tsc_timestamp is not adjusted in the meantime and pvclock monotonic character. To fix this, just notify kvm to update vCPU's guest time before back to guest. Cc: stable@vger.kernel.org Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1619576521-81399-2-git-send-email-zelin.deng@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-15KVM: s390: index kvm->arch.idle_mask by vcpu_idxHalil Pasic4-8/+9
commit a3e03bc1368c1bc16e19b001fc96dc7430573cc8 upstream. While in practice vcpu->vcpu_idx == vcpu->vcp_id is often true, it may not always be, and we must not rely on this. Reason is that KVM decides the vcpu_idx, userspace decides the vcpu_id, thus the two might not match. Currently kvm->arch.idle_mask is indexed by vcpu_id, which implies that code like for_each_set_bit(vcpu_id, kvm->arch.idle_mask, online_vcpus) { vcpu = kvm_get_vcpu(kvm, vcpu_id); do_stuff(vcpu); } is not legit. Reason is that kvm_get_vcpu expects an vcpu_idx, not an vcpu_id. The trouble is, we do actually use kvm->arch.idle_mask like this. To fix this problem we have two options. Either use kvm_get_vcpu_by_id(vcpu_id), which would loop to find the right vcpu_id, or switch to indexing via vcpu_idx. The latter is preferable for obvious reasons. Let us make switch from indexing kvm->arch.idle_mask by vcpu_id to indexing it by vcpu_idx. To keep gisa_int.kicked_mask indexed by the same index as idle_mask lets make the same change for it as well. Fixes: 1ee0bc559dc3 ("KVM: s390: get rid of local_int array") Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Christian Bornträger <borntraeger@de.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: <stable@vger.kernel.org> # 3.15+ Link: https://lore.kernel.org/r/20210827125429.1912577-1-pasic@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-15Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()"Sean Christopherson1-6/+0
commit e7177339d7b5f9594b316842122b5fda9513d5e2 upstream. Revert a misguided illegal GPA check when "translating" a non-nested GPA. The check is woefully incomplete as it does not fill in @exception as expected by all callers, which leads to KVM attempting to inject a bogus exception, potentially exposing kernel stack information in the process. WARNING: CPU: 0 PID: 8469 at arch/x86/kvm/x86.c:525 exception_type+0x98/0xb0 arch/x86/kvm/x86.c:525 CPU: 1 PID: 8469 Comm: syz-executor531 Not tainted 5.14.0-rc7-syzkaller #0 RIP: 0010:exception_type+0x98/0xb0 arch/x86/kvm/x86.c:525 Call Trace: x86_emulate_instruction+0xef6/0x1460 arch/x86/kvm/x86.c:7853 kvm_mmu_page_fault+0x2f0/0x1810 arch/x86/kvm/mmu/mmu.c:5199 handle_ept_misconfig+0xdf/0x3e0 arch/x86/kvm/vmx/vmx.c:5336 __vmx_handle_exit arch/x86/kvm/vmx/vmx.c:6021 [inline] vmx_handle_exit+0x336/0x1800 arch/x86/kvm/vmx/vmx.c:6038 vcpu_enter_guest+0x2a1c/0x4430 arch/x86/kvm/x86.c:9712 vcpu_run arch/x86/kvm/x86.c:9779 [inline] kvm_arch_vcpu_ioctl_run+0x47d/0x1b20 arch/x86/kvm/x86.c:10010 kvm_vcpu_ioctl+0x49e/0xe50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3652 The bug has escaped notice because practically speaking the GPA check is useless. The GPA check in question only comes into play when KVM is walking guest page tables (or "translating" CR3), and KVM already handles illegal GPA checks by setting reserved bits in rsvd_bits_mask for each PxE, or in the case of CR3 for loading PTDPTRs, manually checks for an illegal CR3. This particular failure doesn't hit the existing reserved bits checks because syzbot sets guest.MAXPHYADDR=1, and IA32 architecture simply doesn't allow for such an absurd MAXPHYADDR, e.g. 32-bit paging doesn't define any reserved PA bits checks, which KVM emulates by only incorporating the reserved PA bits into the "high" bits, i.e. bits 63:32. Simply remove the bogus check. There is zero meaningful value and no architectural justification for supporting guest.MAXPHYADDR < 32, and properly filling the exception would introduce non-trivial complexity. This reverts commit ec7771ab471ba6a945350353617e2e3385d0e013. Fixes: ec7771ab471b ("KVM: x86: mmu: Add guest physical address check in translate_gpa()") Cc: stable@vger.kernel.org Reported-by: syzbot+200c08e88ae818f849ce@syzkaller.appspotmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210831164224.1119728-2-seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-15x86/resctrl: Fix a maybe-uninitialized build warning treated as errorBabu Moger1-0/+6
commit 527f721478bce3f49b513a733bacd19d6f34b08c upstream. The recent commit 064855a69003 ("x86/resctrl: Fix default monitoring groups reporting") caused a RHEL build failure with an uninitialized variable warning treated as an error because it removed the default case snippet. The RHEL Makefile uses '-Werror=maybe-uninitialized' to force possibly uninitialized variable warnings to be treated as errors. This is also reported by smatch via the 0day robot. The error from the RHEL build is: arch/x86/kernel/cpu/resctrl/monitor.c: In function ‘__mon_event_count’: arch/x86/kernel/cpu/resctrl/monitor.c:261:12: error: ‘m’ may be used uninitialized in this function [-Werror=maybe-uninitialized] m->chunks += chunks; ^~ The upstream Makefile does not build using '-Werror=maybe-uninitialized'. So, the problem is not seen there. Fix the problem by putting back the default case snippet. [ bp: note that there's nothing wrong with the code and other compilers do not trigger this warning - this is being done just so the RHEL compiler is happy. ] Fixes: 064855a69003 ("x86/resctrl: Fix default monitoring groups reporting") Reported-by: Terry Bowman <Terry.Bowman@amd.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/162949631908.23903.17090272726012848523.stgit@bmoger-ubuntu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-15perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS OpKim Phillips1-0/+1
commit f11dd0d80555cdc8eaf5cfc9e19c9e198217f9f1 upstream. Commit: 2ff40250691e ("perf/core, arch/x86: Use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs") neglected to do so. Fixes: 2ff40250691e ("perf/core, arch/x86: Use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210817221048.88063-2-kim.phillips@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-15arm64: dts: marvell: armada-37xx: Extend PCIe MEM spacePali Rohár2-2/+26
[ Upstream commit 514ef1e62d6521c2199d192b1c71b79d2aa21d5a ] Current PCIe MEM space of size 16 MB is not enough for some combination of PCIe cards (e.g. NVMe disk together with ath11k wifi card). ARM Trusted Firmware for Armada 3700 platform already assigns 128 MB for PCIe window, so extend PCIe MEM space to the end of 128 MB PCIe window which allows to allocate more PCIe BARs for more PCIe cards. Without this change some combination of PCIe cards cannot be used and kernel show error messages in dmesg during initialization: pci 0000:00:00.0: BAR 8: no space for [mem size 0x01800000] pci 0000:00:00.0: BAR 8: failed to assign [mem size 0x01800000] pci 0000:00:00.0: BAR 6: assigned [mem 0xe8000000-0xe80007ff pref] pci 0000:01:00.0: BAR 8: no space for [mem size 0x01800000] pci 0000:01:00.0: BAR 8: failed to assign [mem size 0x01800000] pci 0000:02:03.0: BAR 8: no space for [mem size 0x01000000] pci 0000:02:03.0: BAR 8: failed to assign [mem size 0x01000000] pci 0000:02:07.0: BAR 8: no space for [mem size 0x00100000] pci 0000:02:07.0: BAR 8: failed to assign [mem size 0x00100000] pci 0000:03:00.0: BAR 0: no space for [mem size 0x01000000 64bit] pci 0000:03:00.0: BAR 0: failed to assign [mem size 0x01000000 64bit] Due to bugs in U-Boot port for Turris Mox, the second range in Turris Mox kernel DTS file for PCIe must start at 16 MB offset. Otherwise U-Boot crashes during loading of kernel DTB file. This bug is present only in U-Boot code for Turris Mox and therefore other Armada 3700 devices are not affected by this bug. Bug is fixed in U-Boot version 2021.07. To not break booting new kernels on existing versions of U-Boot on Turris Mox, use first 16 MB range for IO and second range with rest of PCIe window for MEM. Signed-off-by: Pali Rohár <pali@kernel.org> Fixes: 76f6386b25cc ("arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700") Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15arm64: dts: exynos: correct GIC CPU interfaces address range on Exynos7Krzysztof Kozlowski1-1/+1
[ Upstream commit 01c72cad790cb6cd3ccbe4c1402b6cb6c6bbffd0 ] The GIC-400 CPU interfaces address range is defined as 0x2000-0x3FFF (by ARM). Reported-by: Sam Protsenko <semen.protsenko@linaro.org> Reported-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Fixes: b9024cbc937d ("arm64: dts: Add initial device tree support for exynos7") Link: https://lore.kernel.org/r/20210805072110.4730-1-krzysztof.kozlowski@canonical.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15arm64: dts: renesas: hihope-rzg2-ex: Add EtherAVB internal rx delayBiju Das1-0/+1
[ Upstream commit c96ca5604a889a142d6b60889cc6da48498806e9 ] Hihope boards use Realtek PHY. From the very beginning it use only tx delays. However the phy driver commit bbc4d71d63549bcd003 ("net: phy: realtek: fix rtl8211e rx/tx delay config") introduced NFS mount failure. Now it needs rx delay inaddition to tx delay for NFS mount to work. This patch fixes NFS mount failure issue by adding MAC internal rx delay. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Fixes: bbc4d71d63549bcd ("net: phy: realtek: fix rtl8211e rx/tx delay config") Link: https://lore.kernel.org/r/20210721180632.15080-1-biju.das.jz@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15arm64: dts: renesas: rzg2: Convert EtherAVB to explicit delay handlingGeert Uytterhoeven6-2/+10
[ Upstream commit a5200e63af57d05ed8bf0ffd9a6ffefc40e01e89 ] Some EtherAVB variants support internal clock delay configuration, which can add larger delays than the delays that are typically supported by the PHY (using an "rgmii-*id" PHY mode, and/or "[rt]xc-skew-ps" properties). Historically, the EtherAVB driver configured these delays based on the "rgmii-*id" PHY mode. This was wrong, as these are meant solely for the PHY, not for the MAC. Hence properties were introduced for explicit configuration of these delays. Convert the RZ/G2 DTS files from the old to the new scheme: - Add default "rx-internal-delay-ps" and "tx-internal-delay-ps" properties to the SoC .dtsi files, to be overridden by board files where needed, - Convert board files from "rgmii-*id" PHY modes to "rgmii", adding the appropriate "rx-internal-delay-ps" and/or "tx-internal-delay-ps" overrides. Notes: - RZ/G2E does not support TX internal delay handling. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20200819134344.27813-8-geert+renesas@glider.be Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15ARM: dts: meson8b: ec100: Fix the pwm regulator supply propertiesAnand Moon1-2/+2
[ Upstream commit 72ccc373b064ae3ac0c5b5f2306069b60ca118df ] After enabling CONFIG_REGULATOR_DEBUG=y we observer below debug logs. Changes help link VCCK and VDDEE pwm regulator to 5V regulator supply instead of dummy regulator. [ 7.117140] pwm-regulator regulator-vcck: Looking up pwm-supply from device tree [ 7.117153] pwm-regulator regulator-vcck: Looking up pwm-supply property in node /regulator-vcck failed [ 7.117184] VCCK: supplied by regulator-dummy [ 7.117194] regulator-dummy: could not add device link regulator.8: -ENOENT [ 7.117266] VCCK: 860 <--> 1140 mV at 986 mV, enabled [ 7.118498] VDDEE: will resolve supply early: pwm [ 7.118515] pwm-regulator regulator-vddee: Looking up pwm-supply from device tree [ 7.118526] pwm-regulator regulator-vddee: Looking up pwm-supply property in node /regulator-vddee failed [ 7.118553] VDDEE: supplied by regulator-dummy [ 7.118563] regulator-dummy: could not add device link regulator.9: -ENOENT Fixes: 087a1d8b4e4c ("ARM: dts: meson8b: ec100: add the VDDEE regulator") Fixes: 3e7db1c1b7a3 ("ARM: dts: meson8b: ec100: improve the description of the regulators") Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Anand Moon <linux.amoon@gmail.com> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://lore.kernel.org/r/20210705112358.3554-4-linux.amoon@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15ARM: dts: meson8b: mxq: Fix the pwm regulator supply propertiesAnand Moon1-1/+3
[ Upstream commit 632062e540becbbcb067523ec8bcadb1239d9578 ] After enabling CONFIG_REGULATOR_DEBUG=y we observer below debug logs. Changes help link VCCK and VDDEE pwm regulator to 5V regulator supply instead of dummy regulator. Add missing pwm-supply for regulator-vcck regulator node. [ 7.117140] pwm-regulator regulator-vcck: Looking up pwm-supply from device tree [ 7.117153] pwm-regulator regulator-vcck: Looking up pwm-supply property in node /regulator-vcck failed [ 7.117184] VCCK: supplied by regulator-dummy [ 7.117194] regulator-dummy: could not add device link regulator.8: -ENOENT [ 7.117266] VCCK: 860 <--> 1140 mV at 986 mV, enabled [ 7.118498] VDDEE: will resolve supply early: pwm [ 7.118515] pwm-regulator regulator-vddee: Looking up pwm-supply from device tree [ 7.118526] pwm-regulator regulator-vddee: Looking up pwm-supply property in node /regulator-vddee failed [ 7.118553] VDDEE: supplied by regulator-dummy [ 7.118563] regulator-dummy: could not add device link regulator.9: -ENOENT Fixes: dee51cd0d2e8 ("ARM: dts: meson8b: mxq: add the VDDEE regulator") Fixes: d94f60e3dfa0 ("ARM: dts: meson8b: mxq: improve support for the TRONFY MXQ S805") Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Anand Moon <linux.amoon@gmail.com> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://lore.kernel.org/r/20210705112358.3554-3-linux.amoon@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15ARM: dts: meson8b: odroidc1: Fix the pwm regulator supply propertiesAnand Moon1-2/+2
[ Upstream commit 876228e9f935f19c7afc7ba394d17e2ec9143b65 ] After enabling CONFIG_REGULATOR_DEBUG=y we observe below debug logs. Changes help link VCCK and VDDEE pwm regulator to 5V regulator supply instead of dummy regulator. [ 7.117140] pwm-regulator regulator-vcck: Looking up pwm-supply from device tree [ 7.117153] pwm-regulator regulator-vcck: Looking up pwm-supply property in node /regulator-vcck failed [ 7.117184] VCCK: supplied by regulator-dummy [ 7.117194] regulator-dummy: could not add device link regulator.8: -ENOENT [ 7.117266] VCCK: 860 <--> 1140 mV at 986 mV, enabled [ 7.118498] VDDEE: will resolve supply early: pwm [ 7.118515] pwm-regulator regulator-vddee: Looking up pwm-supply from device tree [ 7.118526] pwm-regulator regulator-vddee: Looking up pwm-supply property in node /regulator-vddee failed [ 7.118553] VDDEE: supplied by regulator-dummy [ 7.118563] regulator-dummy: could not add device link regulator.9: -ENOENT Fixes: 524d96083b66 ("ARM: dts: meson8b: odroidc1: add the CPU voltage regulator") Fixes: 8bdf38be712d ("ARM: dts: meson8b: odroidc1: add the VDDEE regulator") Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Anand Moon <linux.amoon@gmail.com> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> [narmstrong: fixed typo in commit s/observer/observe/] Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://lore.kernel.org/r/20210705112358.3554-2-linux.amoon@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15ARM: dts: meson8: Use a higher default GPU clock frequencyMartin Blumenstingl1-0/+5
[ Upstream commit 44cf630bcb8c5ec78125805c9447dd5766792224 ] We are seeing "imprecise external abort (0x1406)" errors during boot (which then cause the whole board to hang) on Meson8 (but not Meson8m2). These are observed while trying to access the GPU's registers when the MALI clock is running at it's default setting of 24MHz. The 3.10 vendor kernel uses 318.75MHz as "default" GPU frequency. Using that makes the "imprecise external aborts" go away. Add the assigned-clocks and assigned-clock-rates properties to also bump the MALI clock to 318.75MHz before accessing any of it's registers. Fixes: 7d3f6b536e72c9 ("ARM: dts: meson8: add the Mali-450 MP6 GPU") Reported-by: Demetris Ierokipides <ierokipides.dem@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://lore.kernel.org/r/20210711214023.2163565-1-martin.blumenstingl@googlemail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15arm64: dts: renesas: r8a77995: draak: Remove bogus adv7511w propertiesGeert Uytterhoeven1-4/+0
[ Upstream commit 4ec82a7bb3db8c6005e715c63224c32d458917a2 ] The "max-clock" and "min-vrefresh" properties fail to validate with commit cfe34bb7a770c5d8 ("dt-bindings: drm: bridge: adi,adv7511.txt: convert to yaml"). Drop them, as they are parts of an out-of-tree workaround that is not needed upstream. Fixes: bcf3003438ea4645 ("arm64: dts: renesas: r8a77995: draak: Enable HDMI display output") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Ulrich Hecht <uli+renesas@fpond.eu> Link: https://lore.kernel.org/r/975b6686bc423421b147d367fe7fb9a0db99c5af.1625134398.git.geert+renesas@glider.be Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15ARM: dts: aspeed-g6: Fix HVI3C function-group in pinctrl dtsiDylan Hung1-2/+2
[ Upstream commit 8c295b7f3d01359ff4336fcb6e406e6ed37957d6 ] The HVI3C shall be a group of I3C function, not an independent function. Correct the function name from "HVI3C" to "I3C". Signed-off-by: Dylan Hung <dylan_hung@aspeedtech.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Fixes: f510f04c8c83 ("ARM: dts: aspeed: Add AST2600 pinmux nodes") Link: https://lore.kernel.org/r/20201029062723.20798-1-dylan_hung@aspeedtech.com Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15x86/mce: Defer processing of early errorsBorislav Petkov2-3/+9
[ Upstream commit 3bff147b187d5dfccfca1ee231b0761a89f1eff5 ] When a fatal machine check results in a system reset, Linux does not clear the error(s) from machine check bank(s) - hardware preserves the machine check banks across a warm reset. During initialization of the kernel after the reboot, Linux reads, logs, and clears all machine check banks. But there is a problem. In: 5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver") the call to mce_register_decode_chain() moved later in the boot sequence. This means that /dev/mcelog doesn't see those early error logs. This was partially fixed by: cd9c57cad3fe ("x86/MCE: Dump MCE to dmesg if no consumers") which made sure that the logs were not lost completely by printing to the console. But parsing console logs is error prone. Users of /dev/mcelog should expect to find any early errors logged to standard places. Add a new flag MCP_QUEUE_LOG to machine_check_poll() to be used in early machine check initialization to indicate that any errors found should just be queued to genpool. When mcheck_late_init() is called it will call mce_schedule_work() to actually log and flush any errors queued in the genpool. [ Based on an original patch, commit message by and completely productized by Tony Luck. ] Fixes: 5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver") Reported-by: Sumanth Kamatala <skamatala@juniper.net> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210824003129.GA1642753@agluck-desk2.amr.corp.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15m68k: Fix invalid RMW_INSNS on CPUs that lack CASGeert Uytterhoeven1-1/+7
[ Upstream commit 2189e928b62e91d8efbc9826ae7c0968f0d55790 ] When enabling CONFIG_RMW_INSNS in e.g. a Coldfire build: {standard input}:3068: Error: invalid instruction for this architecture; needs 68020 or higher (68020 [68k, 68ec020], 68030 [68ec030], 68040 [68ec040], 68060 [68ec060]) -- statement `casl %d4,%d0,(%a6)' ignored Fix this by (a) adding a new config symbol to track if support for any CPU that lacks the CAS instruction is enabled, and (b) making CONFIG_RMW_INSNS depend on the new symbol not being set. Fixes: 0e152d80507b75c0 ("m68k: reorganize Kconfig options to improve mmu/non-mmu selections") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20210725104413.318932-1-geert@linux-m68k.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15m68k: emu: Fix invalid free in nfeth_cleanup()Pavel Skripkin1-2/+2
[ Upstream commit 761608f5cf70e8876c2f0e39ca54b516bdcb7c12 ] In the for loop all nfeth_dev array members should be freed, not only the first one. Freeing only the first array member can cause double-free bugs and memory leaks. Fixes: 9cd7b148312f ("m68k/atari: ARAnyM - Add support for network access") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20210705204727.10743-1-paskripkin@gmail.com Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15s390/debug: fix debug area life cyclePeter Oberparleiter1-46/+56
[ Upstream commit 9372a82892c2caa6bccab9a4081166fa769699f8 ] Currently allocation and registration of s390dbf debug areas are tied together. As a result, a debug area cannot be unregistered and re-registered while any process has an associated debugfs file open. Fix this by splitting alloc/release from register/unregister. Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15s390/debug: keep debug data on resizePeter Oberparleiter1-21/+53
[ Upstream commit 1204777867e8486a88dbb4793fe256b31ea05eeb ] Any previously recorded s390dbf debug data is reset when a debug area is resized using the 'pages' sysfs attribute. This can make live-debugging unnecessarily complex. Fix this by copying existing debug data to the newly allocated debug area when resizing. Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15s390/pci: fix misleading rc in clp_set_pci_fn()Niklas Schnelle2-20/+20
[ Upstream commit f7addcdd527a6dddfebe20c358b87bdb95624612 ] Currently clp_set_pci_fn() always returns 0 as long as the CLP request itself succeeds even if the operation itself returns a response code other than CLP_RC_OK or CLP_RC_SETPCIFN_ALRDY. This is highly misleading because calling code assumes that a zero rc means that the operation was successful. Fix this by returning the response code or cc on failure with the exception of the special handling for CLP_RC_SETPCIFN_ALRDY. Also let's not assume that the returned function handle for CLP_RC_SETPCIFN_ALRDY is 0, we don't need it anyway. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15s390/kasan: fix large PMD pages address alignment checkAlexander Gordeev1-21/+20
[ Upstream commit ddd63c85ef67ea9ea7282ad35eafb6568047126e ] It is currently possible to initialize a large PMD page when the address is not aligned on page boundary. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-12x86/reboot: Limit Dell Optiplex 990 quirk to early BIOS versionsPaul Gortmaker1-1/+2
commit a729691b541f6e63043beae72e635635abe5dc09 upstream. When this platform was relatively new in November 2011, with early BIOS revisions, a reboot quirk was added in commit 6be30bb7d750 ("x86/reboot: Blacklist Dell OptiPlex 990 known to require PCI reboot") However, this quirk (and several others) are open-ended to all BIOS versions and left no automatic expiry if/when the system BIOS fixed the issue, meaning that nobody is likely to come along and re-test. What is really problematic with using PCI reboot as this quirk does, is that it causes this platform to do a full power down, wait one second, and then power back on. This is less than ideal if one is using it for boot testing and/or bisecting kernels when legacy rotating hard disks are installed. It was only by chance that the quirk was noticed in dmesg - and when disabled it turned out that it wasn't required anymore (BIOS A24), and a default reboot would work fine without the "harshness" of power cycling the machine (and disks) down and up like the PCI reboot does. Doing a bit more research, it seems that the "newest" BIOS for which the issue was reported[1] was version A06, however Dell[2] seemed to suggest only up to and including version A05, with the A06 having a large number of fixes[3] listed. As is typical with a new platform, the initial BIOS updates come frequently and then taper off (and in this case, with a revival for CPU CVEs); a search for O990-A<ver>.exe reveals the following dates: A02 16 Mar 2011 A03 11 May 2011 A06 14 Sep 2011 A07 24 Oct 2011 A10 08 Dec 2011 A14 06 Sep 2012 A16 15 Oct 2012 A18 30 Sep 2013 A19 23 Sep 2015 A20 02 Jun 2017 A23 07 Mar 2018 A24 21 Aug 2018 While it's overkill to flash and test each of the above, it would seem likely that the issue was contained within A0x BIOS versions, given the dates above and the dates of issue reports[4] from distros. So rather than just throw out the quirk entirely, limit the scope to just those early BIOS versions, in case people are still running systems from 2011 with the original as-shipped early A0x BIOS versions. [1] https://lore.kernel.org/lkml/1320373471-3942-1-git-sen