summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2021-07-19arm64: tlb: fix the TTL value of tlb_get_levelZhenyu Ye1-0/+4
commit 52218fcd61cb42bde0d301db4acb3ffdf3463cc7 upstream. The TTL field indicates the level of page table walk holding the *leaf* entry for the address being invalidated. But currently, the TTL field may be set to an incorrent value in the following stack: pte_free_tlb __pte_free_tlb tlb_remove_table tlb_table_invalidate tlb_flush_mmu_tlbonly tlb_flush In this case, we just want to flush a PTE page, but the tlb->cleared_pmds is set and we get tlb_level = 2 in the tlb_get_level() function. This may cause some unexpected problems. This patch set the TTL field to 0 if tlb->freed_tables is set. The tlb->freed_tables indicates page table pages are freed, not the leaf entry. Cc: <stable@vger.kernel.org> # 5.9.x Fixes: c4ab2cbc1d87 ("arm64: tlb: Set the TTL field in flush_tlb_range") Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: ZhuRui <zhurui3@huawei.com> Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com> Link: https://lore.kernel.org/r/b80ead47-1f88-3a00-18e1-cacc22f54cc4@huawei.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-19powerpc/powernv/vas: Release reference to tgid during window closeHaren Myneni1-4/+5
commit 91cdbb955aa94ee0841af4685be40937345d29b8 upstream. The kernel handles the NX fault by updating CSB or sending signal to process. In multithread applications, children can open VAS windows and can exit without closing them. But the parent can continue to send NX requests with these windows. To prevent pid reuse, reference will be taken on pid and tgid when the window is opened and release them during window close. The current code is not releasing the tgid reference which can cause pid leak and this patch fixes the issue. Fixes: db1c08a740635 ("powerpc/vas: Take reference to PID and mm for user space windows") Cc: stable@vger.kernel.org # 5.8+ Reported-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/6020fc4d444864fe20f7dcdc5edfe53e67480a1c.camel@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-19powerpc/barrier: Avoid collision with clang's __lwsync macroNathan Chancellor1-0/+2
commit 015d98149b326e0f1f02e44413112ca8b4330543 upstream. A change in clang 13 results in the __lwsync macro being defined as __builtin_ppc_lwsync, which emits 'lwsync' or 'msync' depending on what the target supports. This breaks the build because of -Werror in arch/powerpc, along with thousands of warnings: In file included from arch/powerpc/kernel/pmc.c:12: In file included from include/linux/bug.h:5: In file included from arch/powerpc/include/asm/bug.h:109: In file included from include/asm-generic/bug.h:20: In file included from include/linux/kernel.h:12: In file included from include/linux/bitops.h:32: In file included from arch/powerpc/include/asm/bitops.h:62: arch/powerpc/include/asm/barrier.h:49:9: error: '__lwsync' macro redefined [-Werror,-Wmacro-redefined] #define __lwsync() __asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory") ^ <built-in>:308:9: note: previous definition is here #define __lwsync __builtin_ppc_lwsync ^ 1 error generated. Undefine this macro so that the runtime patching introduced by commit 2d1b2027626d ("powerpc: Fixup lwsync at runtime") continues to work properly with clang and the build no longer breaks. Cc: stable@vger.kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/ClangBuiltLinux/linux/issues/1386 Link: https://github.com/llvm/llvm-project/commit/62b5df7fe2b3fda1772befeda15598fbef96a614 Link: https://lore.kernel.org/r/20210528182752.1852002-1-nathan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-19powerpc/mm: Fix lockup on kernel exec faultChristophe Leroy1-3/+1
commit cd5d5e602f502895e47e18cd46804d6d7014e65c upstream. The powerpc kernel is not prepared to handle exec faults from kernel. Especially, the function is_exec_fault() will return 'false' when an exec fault is taken by kernel, because the check is based on reading current->thread.regs->trap which contains the trap from user. For instance, when provoking a LKDTM EXEC_USERSPACE test, current->thread.regs->trap is set to SYSCALL trap (0xc00), and the fault taken by the kernel is not seen as an exec fault by set_access_flags_filter(). Commit d7df2443cd5f ("powerpc/mm: Fix spurious segfaults on radix with autonuma") made it clear and handled it properly. But later on commit d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults") removed that handling, introducing test based on error_code. And here is the problem, because on the 603 all upper bits of SRR1 get cleared when the TLB instruction miss handler bails out to ISI. Until commit cbd7e6ca0210 ("powerpc/fault: Avoid heavy search_exception_tables() verification"), an exec fault from kernel at a userspace address was indirectly caught by the lack of entry for that address in the exception tables. But after that commit the kernel mainly relies on KUAP or on core mm handling to catch wrong user accesses. Here the access is not wrong, so mm handles it. It is a minor fault because PAGE_EXEC is not set, set_access_flags_filter() should set PAGE_EXEC and voila. But as is_exec_fault() returns false as explained in the beginning, set_access_flags_filter() bails out without setting PAGE_EXEC flag, which leads to a forever minor exec fault. As the kernel is not prepared to handle such exec faults, the thing to do is to fire in bad_kernel_fault() for any exec fault taken by the kernel, as it was prior to commit d3ca587404b3. Fixes: d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/024bb05105050f704743a0083fe3548702be5706.1625138205.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-19arm64: dts: rockchip: Enable USB3 for rk3328 Rock64Cameron Nemo1-0/+5
commit bbac8bd65f5402281cb7b0452c1c5f367387b459 upstream. Enable USB3 nodes for the rk3328-based PINE Rock64 board. The separate power regulator is not added as it is controlled by the same GPIO line as the existing VBUS regulators, so it is already enabled. Also there is no port representation to tie the regulator to. [wens@csie.org: Rebased onto v5.12] Signed-off-by: Cameron Nemo <cnemo@tutanota.com> [wens@csie.org: Rewrote commit message] Signed-off-by: Chen-Yu Tsai <wens@csie.org> Link: https://lore.kernel.org/r/20210504083616.9654-2-wens@kernel.org Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-19arm64: dts: rockchip: add rk3328 dwc3 usb controller nodeCameron Nemo1-0/+19
commit 44dd5e2106dc2fd01697b539085818d1d1c58df0 upstream. RK3328 SoCs have one USB 3.0 OTG controller which uses DWC_USB3 core's general architecture. It can act as static xHCI host controller, static device controller, USB 3.0/2.0 OTG basing on ID of USB3.0 PHY. Signed-off-by: William Wu <william.wu@rock-chips.com> Signed-off-by: Cameron Nemo <cnemo@tutanota.com> Signed-off-by: Johan Jonker <jbx6244@gmail.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20210209192350.7130-7-jbx6244@gmail.com Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-19MIPS: MT extensions are not available on MIPS32r1Paul Cercueil1-1/+3
commit cad065ed8d8831df67b9754cc4437ed55d8b48c0 upstream. MIPS MT extensions were added with the MIPS 34K processor, which was based on the MIPS32r2 ISA. This fixes a build error when building a generic kernel for a MIPS32r1 CPU. Fixes: c434b9f80b09 ("MIPS: Kconfig: add MIPS_GENERIC_KERNEL symbol") Cc: stable@vger.kernel.org # v5.9 Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-19MIPS: set mips32r5 for virt extensionsNick Desaulniers1-4/+4
[ Upstream commit c994a3ec7ecc8bd2a837b2061e8a76eb8efc082b ] Clang's integrated assembler only accepts these instructions when the cpu is set to mips32r5. With this change, we can assemble malta_defconfig with Clang via `make LLVM_IAS=1`. Link: https://github.com/ClangBuiltLinux/linux/issues/763 Reported-by: Dmitry Golovin <dima@golovin.in> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19MIPS: loongsoon64: Reserve memory below starting pfn to prevent Oopszhanglianjie1-0/+3
[ Upstream commit 6817c944430d00f71ccaa9c99ff5b0096aeb7873 ] The cause of the problem is as follows: 1. when cat /sys/devices/system/memory/memory0/valid_zones, test_pages_in_a_zone() will be called. 2. test_pages_in_a_zone() finds the zone according to stat_pfn = 0. The smallest pfn of the numa node in the mips architecture is 128, and the page corresponding to the previous 0~127 pfn is not initialized (page->flags is 0xFFFFFFFF) 3. The nid and zonenum obtained using page_zone(pfn_to_page(0)) are out of bounds in the corresponding array, &NODE_DATA(page_to_nid(page))->node_zones[page_zonenum(page)], access to the out-of-bounds zone member variables appear abnormal, resulting in Oops. Therefore, it is necessary to keep the page between 0 and the minimum pfn to prevent Oops from appearing. Signed-off-by: zhanglianjie <zhanglianjie@uniontech.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19MIPS: add PMD table accounting into MIPS'pmd_alloc_oneHuang Pei1-3/+7
[ Upstream commit ed914d48b6a1040d1039d371b56273d422c0081e ] This fixes Page Table accounting bug. MIPS is the ONLY arch just defining __HAVE_ARCH_PMD_ALLOC_ONE alone. Since commit b2b29d6d011944 (mm: account PMD tables like PTE tables), "pmd_free" in asm-generic with PMD table accounting and "pmd_alloc_one" in MIPS without PMD table accounting causes PageTable accounting number negative, which read by global_zone_page_state(), always returns 0. Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19MIPS: ingenic: Select CPU_SUPPORTS_CPUFREQ && MIPS_EXTERNAL_TIMERPaul Cercueil1-0/+2
[ Upstream commit eb3849370ae32b571e1f9a63ba52c61adeaf88f7 ] The clock driving the XBurst CPUs in Ingenic SoCs is integer divided from the main PLL. As such, it is possible to control the frequency of the CPU, either by changing the divider, or by changing the rate of the main PLL. The XBurst CPUs also lack the CP0 timer; the TCU, a separate piece of hardware in the SoC, provides this functionality. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19MIPS: cpu-probe: Fix FPU detection on Ingenic JZ4760(B)Paul Cercueil1-0/+5
[ Upstream commit fc52f92a653215fbd6bc522ac5311857b335e589 ] Ingenic JZ4760 and JZ4760B do have a FPU, but the config registers don't report it. Force the FPU detection in case the processor ID match the JZ4760(B) one. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19hugetlb: clear huge pte during flush function on mips platformBibo Mao1-1/+7
[ Upstream commit 33ae8f801ad8bec48e886d368739feb2816478f2 ] If multiple threads are accessing the same huge page at the same time, hugetlb_cow will be called if one thread write the COW huge page. And function huge_ptep_clear_flush is called to notify other threads to clear the huge pte tlb entry. The other threads clear the huge pte tlb entry and reload it from page table, the reload huge pte entry may be old. This patch fixes this issue on mips platform, and it clears huge pte entry before notifying other threads to flush current huge page entry, it is similar with other architectures. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14powerpc/preempt: Don't touch the idle task's preempt_count during hotplugValentin Schneider2-6/+0
commit 2c669ef6979c370f98d4b876e54f19613c81e075 upstream. Powerpc currently resets a CPU's idle task preempt_count to 0 before said task starts executing the secondary startup routine (and becomes an idle task proper). This conflicts with commit f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled"). which initializes all of the idle tasks' preempt_count to PREEMPT_DISABLED during smp_init(). Note that this was superfluous before said commit, as back then the hotplug machinery would invoke init_idle() via idle_thread_get(), which would have already reset the CPU's idle task's preempt_count to PREEMPT_ENABLED. Get rid of this preempt_count write. Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled") Reported-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210707183831.2106509-1-valentin.schneider@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-14s390: preempt: Fix preempt_count initializationValentin Schneider3-12/+6
commit 6a942f5780545ebd11aca8b3ac4b163397962322 upstream. S390's init_idle_preempt_count(p, cpu) doesn't actually let us initialize the preempt_count of the requested CPU's idle task: it unconditionally writes to the current CPU's. This clearly conflicts with idle_threads_init(), which intends to initialize *all* the idle tasks, including their preempt_count (or their CPU's, if the arch uses a per-CPU preempt_count). Unfortunately, it seems the way s390 does things doesn't let us initialize every possible CPU's preempt_count early on, as the pages where this resides are only allocated when a CPU is brought up and are freed when it is brought down. Let the arch-specific code set a CPU's preempt_count when its lowcore is allocated, and turn init_idle_preempt_count() into an empty stub. Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20210707163338.1623014-1-valentin.schneider@arm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-14csky: syscache: Fixup duplicate cache flushGuo Ren1-5/+7
[ Upstream commit 6ea42c84f33368eb3fe1ec1bff8d7cb1a5c7b07a ] The current csky logic of sys_cacheflush is wrong, it'll cause icache flush call dcache flush again. Now fixup it with a conditional "break & fallthrough". Fixes: 997153b9a75c ("csky: Add flush_icache_mm to defer flush icache all") Fixes: 0679d29d3e23 ("csky: fix syscache.c fallthrough warning") Acked-by: Randy Dunlap <rdunlap@infradead.org> Co-Developed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14csky: fix syscache.c fallthrough warningRandy Dunlap1-0/+1
[ Upstream commit 0679d29d3e2351a1c3049c26a63ce1959cad5447 ] This case of the switch statement falls through to the following case. This appears to be on purpose, so declare it as OK. ../arch/csky/mm/syscache.c: In function '__do_sys_cacheflush': ../arch/csky/mm/syscache.c:17:3: warning: this statement may fall through [-Wimplicit-fallthrough=] 17 | flush_icache_mm_range(current->mm, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 18 | (unsigned long)addr, | ~~~~~~~~~~~~~~~~~~~~ 19 | (unsigned long)addr + bytes); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../arch/csky/mm/syscache.c:20:2: note: here 20 | case DCACHE: | ^~~~ Fixes: 997153b9a75c ("csky: Add flush_icache_mm to defer flush icache all") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: linux-csky@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14arm64: dts: marvell: armada-37xx: Fix reg for standard variant of UARTPali Rohár1-1/+1
[ Upstream commit 2cbfdedef39fb5994b8f1e1df068eb8440165975 ] UART1 (standard variant with DT node name 'uart0') has register space 0x12000-0x12018 and not whole size 0x200. So fix also this in example. Signed-off-by: Pali Rohár <pali@kernel.org> Fixes: c737abc193d1 ("arm64: dts: marvell: Fix A37xx UART0 register size") Link: https://lore.kernel.org/r/20210624224909.6350-6-pali@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14powerpc/papr_scm: Make 'perf_stats' invisible if perf-stats unavailableVaibhav Jain1-11/+24
[ Upstream commit ed78f56e1271f108e8af61baeba383dcd77adbec ] In case performance stats for an nvdimm are not available, reading the 'perf_stats' sysfs file returns an -ENOENT error. A better approach is to make the 'perf_stats' file entirely invisible to indicate that performance stats for an nvdimm are unavailable. So this patch updates 'papr_nd_attribute_group' to add a 'is_visible' callback implemented as newly introduced 'papr_nd_attribute_visible()' that returns an appropriate mode in case performance stats aren't supported in a given nvdimm. Also the initialization of 'papr_scm_priv.stat_buffer_len' is moved from papr_scm_nvdimm_init() to papr_scm_probe() so that it value is available when 'papr_nd_attribute_visible()' is called during nvdimm initialization. Even though 'perf_stats' attribute is available since v5.9, there are no known user-space tools/scripts that are dependent on presence of its sysfs file. Hence I dont expect any user-space breakage with this patch. Fixes: 2d02bf835e57 ("powerpc/papr_scm: Fetch nvdimm performance stats from PHYP") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210513092349.285021-1-vaibhav@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14powerpc/64s: Fix copy-paste data exposure into newly created tasksNicholas Piggin1-16/+32
[ Upstream commit f35d2f249ef05b9671e7898f09ad89aa78f99122 ] copy-paste contains implicit "copy buffer" state that can contain arbitrary user data (if the user process executes a copy instruction). This could be snooped by another process if a context switch hits while the state is live. So cp_abort is executed on context switch to clear out possible sensitive data and prevent the leak. cp_abort is done after the low level _switch(), which means it is never reached by newly created tasks, so they could snoop on this buffer between their first and second context switch. Fix this by doing the cp_abort before calling _switch. Add some comments which should make the issue harder to miss. Fixes: 07d2a628bc000 ("powerpc/64s: Avoid cpabort in context switch when possible") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210622053036.474678-1-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14powerpc/papr_scm: Properly handle UUID types and APIAndy Shevchenko1-9/+18
[ Upstream commit 0e8554b5d7801b0aebc6c348a0a9f7706aa17b3b ] Parse to and export from UUID own type, before dereferencing. This also fixes wrong comment (Little Endian UUID is something else) and should eliminate the direct strict types assignments. Fixes: 43001c52b603 ("powerpc/papr_scm: Use ibm,unit-guid as the iset cookie") Fixes: 259a948c4ba1 ("powerpc/pseries/scm: Use a specific endian format for storing uuid from the device tree") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210616134303.58185-1-andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14powerpc: Offline CPU in stop_this_cpu()Nicholas Piggin1-0/+11
[ Upstream commit bab26238bbd44d5a4687c0a64fd2c7f2755ea937 ] printk_safe_flush_on_panic() has special lock breaking code for the case where we panic()ed with the console lock held. It relies on panic IPI causing other CPUs to mark themselves offline. Do as most other architectures do. This effectively reverts commit de6e5d38417e ("powerpc: smp_send_stop do not offline stopped CPUs"), unfortunately it may result in some false positive warnings, but the alternative is more situations where we can crash without getting messages out. Fixes: de6e5d38417e ("powerpc: smp_send_stop do not offline stopped CPUs") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210623041245.865134-1-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14powerpc/powernv: Fix machine check reporting of async store errorsNicholas Piggin1-8/+40
[ Upstream commit 3729e0ec59a20825bd4c8c70996b2df63915e1dd ] POWER9 and POWER10 asynchronous machine checks due to stores have their cause reported in SRR1 but SRR1[42] is set, which in other cases indicates DSISR cause. Check for these cases and clear SRR1[42], so the cause matching uses the i-side (SRR1) table. Fixes: 7b9f71f974a1 ("powerpc/64s: POWER9 machine check handler") Fixes: 201220bb0e8c ("powerpc/powernv: Machine check handler for POWER10") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210517140355.2325406-1-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14s390: appldata depends on PROC_SYSCTLRandy Dunlap1-1/+1
[ Upstream commit 5d3516b3647621d5a1180672ea9e0817fb718ada ] APPLDATA_BASE should depend on PROC_SYSCTL instead of PROC_FS. Building with PROC_FS but not PROC_SYSCTL causes a build error, since appldata_base.c uses data and APIs from fs/proc/proc_sysctl.c. arch/s390/appldata/appldata_base.o: in function `appldata_generic_handler': appldata_base.c:(.text+0x192): undefined reference to `sysctl_vals' Fixes: c185b783b099 ("[S390] Remove config options.") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20210528002420.17634-1-rdunlap@infradead.org Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14s390: enable HAVE_IOREMAP_PROTNiklas Schnelle2-0/+20
[ Upstream commit d460bb6c6417588dd8b0907d34f69b237918812a ] In commit b02002cc4c0f ("s390/pci: Implement ioremap_wc/prot() with MIO") we implemented both ioremap_wc() and ioremap_prot() however until now we had not set HAVE_IOREMAP_PROT in Kconfig, do so now. This also requires implementing pte_pgprot() as this is used in the generic_access_phys() code enabled by CONFIG_HAVE_IOREMAP_PROT. As with ioremap_wc() we need to take the MMIO Write Back bit index into account. Moreover since the pgprot value returned from pte_pgprot() is to be used for mappings into kernel address space we must make sure that it uses appropriate kernel page table protection bits. In particular a pgprot value originally coming from userspace could have the _PAGE_PROTECT bit set to enable fault based dirty bit accounting which would then make the mapping inaccessible when used in kernel address space. Fixes: b02002cc4c0f ("s390/pci: Implement ioremap_wc/prot() with MIO") Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14s390/irq: select HAVE_IRQ_EXIT_ON_IRQ_STACKHeiko Carstens1-0/+1
[ Upstream commit 9ceed9988a8e6a1656ed2bdaa30501cf0f3dd925 ] irq_exit() is always called on async stack. Therefore select HAVE_IRQ_EXIT_ON_IRQ_STACK and get a tiny optimization in invoke_softirq(). Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14MIPS: Fix PKMAP with 32-bit MIPS huge page supportWei Li1-1/+1
[ Upstream commit cf02ce742f09188272bcc8b0e62d789eb671fc4c ] When 32-bit MIPS huge page support is enabled, we halve the number of pointers a PTE page holds, making its last half go to waste. Correspondingly, we should halve the number of kmap entries, as we just initialized only a single pte table for that in pagetable_init(). Fixes: 35476311e529 ("MIPS: Add partial 32-bit huge page support") Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14x86/sev: Split up runtime #VC handler for correct state trackingJoerg Roedel3-90/+91
[ Upstream commit be1a5408868af341f61f93c191b5e346ee88c82a ] Split up the #VC handler code into a from-user and a from-kernel part. This allows clean and correct state tracking, as the #VC handler needs to enter NMI-state when raised from kernel mode and plain IRQ state when raised from user-mode. Fixes: 62441a1fb532 ("x86/sev-es: Correctly track IRQ states in runtime #VC handler") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210618115409.22735-3-joro@8bytes.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14x86/sev: Make sure IRQs are disabled while GHCB is activeJoerg Roedel1-12/+22
[ Upstream commit d187f217335dba2b49fc9002aab2004e04acddee ] The #VC handler only cares about IRQs being disabled while the GHCB is active, as it must not be interrupted by something which could cause another #VC while it holds the GHCB (NMI is the exception for which the backup GHCB exits). Make sure nothing interrupts the code path while the GHCB is active by making sure that callers of __sev_{get,put}_ghcb() have disabled interrupts upfront. [ bp: Massage commit message. ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210618115409.22735-2-joro@8bytes.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14KVM: arm64: Don't zero the cycle count register when PMCR_EL0.P is setAlexandru Elisei1-0/+1
[ Upstream commit 2a71fabf6a1bc9162a84e18d6ab991230ca4d588 ] According to ARM DDI 0487G.a, page D13-3895, setting the PMCR_EL0.P bit to 1 has the following effect: "Reset all event counters accessible in the current Exception level, not including PMCCNTR_EL0, to zero." Similar behaviour is described for AArch32 on page G8-7022. Make it so. Fixes: c01d6a18023b ("KVM: arm64: pmu: Only handle supported event counters") Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210618105139.83795-1-alexandru.elisei@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14KVM: x86/mmu: Fix return value in tdp_mmu_map_handle_target_level()Kai Huang1-1/+1
[ Upstream commit 57a3e96d6d17ae5ac9861ef34af024a627f1c3bb ] Currently tdp_mmu_map_handle_target_level() returns 0, which is RET_PF_RETRY, when page fault is actually fixed. This makes kvm_tdp_mmu_map() also return RET_PF_RETRY in this case, instead of RET_PF_FIXED. Fix by initializing ret to RET_PF_FIXED. Note that kvm_mmu_page_fault() resumes guest on both RET_PF_RETRY and RET_PF_FIXED, which means in practice returning the two won't make difference, so this fix alone won't be necessary for stable tree. Fixes: bb18842e2111 ("kvm: x86/mmu: Add TDP MMU PF handler") Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Message-Id: <f9e8956223a586cd28c090879a8ff40f5eb6d609.1623717884.git.kai.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14KVM: nVMX: Don't clobber nested MMU's A/D status on EPTP switchSean Christopherson1-7/+0
[ Upstream commit 272b0a998d084e7667284bdd2d0c675c6a2d11de ] Drop bogus logic that incorrectly clobbers the accessed/dirty enabling status of the nested MMU on an EPTP switch. When nested EPT is enabled, walk_mmu points at L2's _legacy_ page tables, not L1's EPT for L2. This is likely a benign bug, as mmu->ept_ad is never consumed (since the MMU is not a nested EPT MMU), and stuffing mmu_role.base.ad_disabled will never propagate into future shadow pages since the nested MMU isn't used to map anything, just to walk L2's page tables. Note, KVM also does a full MMU reload, i.e. the guest_mmu will be recreated using the new EPTP, and thus any change in A/D enabling will be properly recognized in the relevant MMU. Fixes: 41ab93727467 ("KVM: nVMX: Emulate EPTP switching for the L1 hypervisor") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210609234235.1244004-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14KVM: nVMX: Ensure 64-bit shift when checking VMFUNC bitmapSean Christopherson1-1/+1
[ Upstream commit 0e75225dfa4c5d5d51291f54a3d2d5895bad38da ] Use BIT_ULL() instead of an open-coded shift to check whether or not a function is enabled in L1's VMFUNC bitmap. This is a benign bug as KVM supports only bit 0, and will fail VM-Enter if any other bits are set, i.e. bits 63:32 are guaranteed to be zero. Note, "function" is bounded by hardware as VMFUNC will #UD before taking a VM-Exit if the function is greater than 63. Before: if ((vmcs12->vm_function_control & (1 << function)) == 0) 0x000000000001a916 <+118>: mov $0x1,%eax 0x000000000001a91b <+123>: shl %cl,%eax 0x000000000001a91d <+125>: cltq 0x000000000001a91f <+127>: and 0x128(%rbx),%rax After: if (!(vmcs12->vm_function_control & BIT_ULL(function & 63))) 0x000000000001a955 <+117>: mov 0x128(%rbx),%rdx 0x000000000001a95c <+124>: bt %rax,%rdx Fixes: 27c42a1bb867 ("KVM: nVMX: Enable VMFUNC for the L1 hypervisor") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210609234235.1244004-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14KVM: nVMX: Sync all PGDs on nested transition with shadow pagingSean Christopherson4-8/+15
[ Upstream commit 07ffaf343e34b555c9e7ea39a9c81c439a706f13 ] Trigger a full TLB flush on behalf of the guest on nested VM-Enter and VM-Exit when VPID is disabled for L2. kvm_mmu_new_pgd() syncs only the current PGD, which can theoretically leave stale, unsync'd entries in a previous guest PGD, which could be consumed if L2 is allowed to load CR3 with PCID_NOFLUSH=1. Rename KVM_REQ_HV_TLB_FLUSH to KVM_REQ_TLB_FLUSH_GUEST so that it can be utilized for its obvious purpose of emulating a guest TLB flush. Note, there is no change the actual TLB flush executed by KVM, even though the fast PGD switch uses KVM_REQ_TLB_FLUSH_CURRENT. When VPID is disabled for L2, vpid02 is guaranteed to be '0', and thus nested_get_vpid02() will return the VPID that is shared by L1 and L2. Generate the request outside of kvm_mmu_new_pgd(), as getting the common helper to correctly identify which requested is needed is quite painful. E.g. using KVM_REQ_TLB_FLUSH_GUEST when nested EPT is in play is wrong as a TLB flush from the L1 kernel's perspective does not invalidate EPT mappings. And, by using KVM_REQ_TLB_FLUSH_GUEST, nVMX can do future simplification by moving the logic into nested_vmx_transition_tlb_flush(). Fixes: 41fab65e7c44 ("KVM: nVMX: Skip MMU sync on nested VMX transition when possible") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210609234235.1244004-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14arm64/mm: Fix ttbr0 values stored in struct thread_info for software-panAnshuman Khandual2-3/+3
[ Upstream commit 9163f01130304fab1f74683d7d44632da7bda637 ] When using CONFIG_ARM64_SW_TTBR0_PAN, a task's thread_info::ttbr0 must be the TTBR0_EL1 value used to run userspace. With 52-bit PAs, the PA must be packed into the TTBR using phys_to_ttbr(), but we forget to do this in some of the SW PAN code. Thus, if the value is installed into TTBR0_EL1 (as may happen in the uaccess routines), this could result in UNPREDICTABLE behaviour. Since hardware with 52-bit PA support almost certainly has HW PAN, which will be used in preference, this shouldn't be a practical issue, but let's fix this for consistency. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Fixes: 529c4b05a3cb ("arm64: handle 52-bit addresses in TTBR") Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/1623749578-11231-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14arm64: consistently use reserved_pg_dirMark Rutland9-22/+17
[ Upstream commit 833be850f1cabd0e3b5337c0fcab20a6e936dd48 ] Depending on configuration options and specific code paths, we either use the empty_zero_page or the configuration-dependent reserved_ttbr0 as a reserved value for TTBR{0,1}_EL1. To simplify this code, let's always allocate and use the same reserved_pg_dir, replacing reserved_ttbr0. Note that this is allocated (and hence pre-zeroed), and is also marked as read-only in the kernel Image mapping. Keeping this separate from the empty_zero_page potentially helps with robustness as the empty_zero_page is used in a number of cases where a failure to map it read-only could allow it to become corrupted. The (presently unused) swapper_pg_end symbol is also removed, and comments are added wherever we rely on the offsets between the pre-allocated pg_dirs to keep these cases easily identifiable. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201103102229.8542-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14crypto: x86/curve25519 - fix cpu feature checking logic in mod_exitHangbin Liu1-1/+1
[ Upstream commit 1b82435d17774f3eaab35dce239d354548aa9da2 ] In curve25519_mod_init() the curve25519_alg will be registered only when (X86_FEATURE_BMI2 && X86_FEATURE_ADX). But in curve25519_mod_exit() it still checks (X86_FEATURE_BMI2 || X86_FEATURE_ADX) when do crypto unregister. This will trigger a BUG_ON in crypto_unregister_alg() as alg->cra_refcnt is 0 if the cpu only supports one of X86_FEATURE_BMI2 and X86_FEATURE_ADX. Fixes: 07b586fe0662 ("crypto: x86/curve25519 - replace with formally verified implementation") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14m68k: atari: Fix ATARI_KBD_CORE kconfig unmet dependency warningRandy Dunlap1-0/+3
[ Upstream commit c1367ee016e3550745315fb9a2dd1e4ce02cdcf6 ] Since the code for ATARI_KBD_CORE does not use drivers/input/keyboard/ code, just move ATARI_KBD_CORE to arch/m68k/Kconfig.machine to remove the dependency on INPUT_KEYBOARD. Removes this kconfig warning: WARNING: unmet direct dependencies detected for ATARI_KBD_CORE Depends on [n]: !UML && INPUT [=y] && INPUT_KEYBOARD [=n] Selected by [y]: - MOUSE_ATARI [=y] && !UML && INPUT [=y] && INPUT_MOUSE [=y] && ATARI [=y] Fixes: c04cb856e20a ("m68k: Atari keyboard and mouse support.") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Suggested-by: Michael Schmitz <schmitzmic@gmail.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/20210527001251.8529-1-rdunlap@infradead.org Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14x86/elf: Use _BITUL() macro in UAPI headersJoe Richey1-2/+4
[ Upstream commit d06aca989c243dd9e5d3e20aa4e5c2ecfdd07050 ] Replace BIT() in x86's UAPI header with _BITUL(). BIT() is not defined in the UAPI headers and its usage may cause userspace build errors. Fixes: 742c45c3ecc9 ("x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2") Signed-off-by: Joe Richey <joerichey@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20210521085849.37676-2-joerichey94@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14ia64: mca_drv: fix incorrect array size calculationArnd Bergmann1-1/+1
[ Upstream commit c5f320ff8a79501bb59338278336ec43acb9d7e2 ] gcc points out a mistake in the mca driver that goes back to before the git history: arch/ia64/kernel/mca_drv.c: In function 'init_record_index_pools': arch/ia64/kernel/mca_drv.c:346:54: error: expression does not compute the number of elements in this array; element typ e is 'int', not 'size_t' {aka 'long unsigned int'} [-Werror=sizeof-array-div] 346 | for (i = 1; i < sizeof sal_log_sect_min_sizes/sizeof(size_t); i++) | ^ This is the same as sizeof(size_t), which is two shorter than the actual array. Use the ARRAY_SIZE() macro to get the correct calculation instead. Link: https://lkml.kernel.org/r/20210514214123.875971-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14clocksource: Check per-CPU clock synchronization when marked unstablePaul E. McKenney1-1/+2
[ Upstream commit 7560c02bdffb7c52d1457fa551b9e745d4b9e754 ] Some sorts of per-CPU clock sources have a history of going out of synchronization with each other. However, this problem has purportedy been solved in the past ten years. Except that it is all too possible that the problem has instead simply been made less likely, which might mean that some of the occasional "Marking clocksource 'tsc' as unstable" messages might be due to desynchronization. How would anyone know? Therefore apply CPU-to-CPU synchronization checking to newly unstable clocksource that are marked with the new CLOCK_SOURCE_VERIFY_PERCPU flag. Lists of desynchronized CPUs are printed, with the caveat that if it is the reporting CPU that is itself desynchronized, it will appear that all the other clocks are wrong. Just like in real life. Reported-by: Chris Mason <clm@fb.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Feng Tang <feng.tang@intel.com> Link: https://lore.kernel.org/r/20210527190124.440372-2-paulmck@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14KVM: s390: get rid of register asm usageHeiko Carstens1-9/+9
[ Upstream commit 4fa3b91bdee1b08348c82660668ca0ca34e271ad ] Using register asm statements has been proven to be very error prone, especially when using code instrumentation where gcc may add function calls, which clobbers register contents in an unexpected way. Therefore get rid of register asm statements in kvm code, even though there is currently nothing wrong with them. This way we know for sure that this bug class won't be introduced here. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20210621140356.1210771-1-hca@linux.ibm.com [borntraeger@de.ibm.com: checkpatch strict fix] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14KVM: PPC: Book3S HV: Fix TLB management on SMT8 POWER9 and POWER10 processorsSuraj Jitindar Singh4-8/+39
[ Upstream commit 77bbbc0cf84834ed130838f7ac1988567f4d0288 ] The POWER9 vCPU TLB management code assumes all threads in a core share a TLB, and that TLBIEL execued by one thread will invalidate TLBs for all threads. This is not the case for SMT8 capable POWER9 and POWER10 (big core) processors, where the TLB is split between groups of threads. This results in TLB multi-hits, random data corruption, etc. Fix this by introducing cpu_first_tlb_thread_sibling etc., to determine which siblings share TLBs, and use that in the guest TLB flushing code. [npiggin@gmail.com: add changelog and comment] Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210602040441.3984352-1-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14arm64: perf: Convert snprintf to sysfs_emitTian Tao1-1/+1
[ Upstream commit a5740e955540181f4ab8f076cc9795c6bbe4d730 ] Use sysfs_emit instead of snprintf to avoid buf overrun,because in sysfs_emit it strictly checks whether buf is null or buf whether pagesize aligned, otherwise it returns an error. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Link: https://lore.kernel.org/r/1621497585-30887-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14sched/core: Initialize the idle task with preemption disabledValentin Schneider20