summaryrefslogtreecommitdiff
path: root/arch/powerpc/sysdev
AgeCommit message (Collapse)AuthorFilesLines
2021-04-14powerpc/xive: Simplify the dump of XIVE interrupts under xmonCédric Le Goater1-0/+14
Move the xmon routine under XIVE subsystem and rework the loop on the interrupts taking into account the xive_irq_domain to filter out IPIs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-7-clg@kaod.org
2021-04-14powerpc/xive: Drop check on irq_data in xive_core_debug_show()Cédric Le Goater1-11/+10
When looping on IRQ descriptor, irq_data is always valid. Fixes: 930914b7d528 ("powerpc/xive: Add a debugfs file to dump internal XIVE state") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-6-clg@kaod.org
2021-04-14powerpc/xive: Simplify xive_core_debug_show()Cédric Le Goater1-14/+4
Now that the IPI interrupt has its own domain, the checks on the HW interrupt number XIVE_IPI_HW_IRQ and on the chip can be replaced by a check on the domain. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-5-clg@kaod.org
2021-04-14powerpc/xive: Remove useless check on XIVE_IPI_HW_IRQCédric Le Goater1-2/+1
The IPI interrupt has its own domain now. Testing the HW interrupt number is not needed anymore. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-4-clg@kaod.org
2021-04-14powerpc/xive: Introduce an IPI interrupt domainCédric Le Goater1-33/+46
The IPI interrupt is a special case of the XIVE IRQ domain. When mapping and unmapping the interrupts in the Linux interrupt number space, the HW interrupt number 0 (XIVE_IPI_HW_IRQ) is checked to distinguish the IPI interrupt from other interrupts of the system. Simplify the XIVE interrupt domain by introducing a specific domain for the IPI. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-3-clg@kaod.org
2021-04-13of: net: pass the dst buffer to of_get_mac_address()Michael Walle1-4/+1
of_get_mac_address() returns a "const void*" pointer to a MAC address. Lately, support to fetch the MAC address by an NVMEM provider was added. But this will only work with platform devices. It will not work with PCI devices (e.g. of an integrated root complex) and esp. not with DSA ports. There is an of_* variant of the nvmem binding which works without devices. The returned data of a nvmem_cell_read() has to be freed after use. On the other hand the return of_get_mac_address() points to some static data without a lifetime. The trick for now, was to allocate a device resource managed buffer which is then returned. This will only work if we have an actual device. Change it, so that the caller of of_get_mac_address() has to supply a buffer where the MAC address is written to. Unfortunately, this will touch all drivers which use the of_get_mac_address(). Usually the code looks like: const char *addr; addr = of_get_mac_address(np); if (!IS_ERR(addr)) ether_addr_copy(ndev->dev_addr, addr); This can then be simply rewritten as: of_get_mac_address(np, ndev->dev_addr); Sometimes is_valid_ether_addr() is used to test the MAC address. of_get_mac_address() already makes sure, it just returns a valid MAC address. Thus we can just test its return code. But we have to be careful if there are still other sources for the MAC address before the of_get_mac_address(). In this case we have to keep the is_valid_ether_addr() call. The following coccinelle patch was used to convert common cases to the new style. Afterwards, I've manually gone over the drivers and fixed the return code variable: either used a new one or if one was already available use that. Mansour Moufid, thanks for that coccinelle patch! <spml> @a@ identifier x; expression y, z; @@ - x = of_get_mac_address(y); + x = of_get_mac_address(y, z); <... - ether_addr_copy(z, x); ...> @@ identifier a.x; @@ - if (<+... x ...+>) {} @@ identifier a.x; @@ if (<+... x ...+>) { ... } - else {} @@ identifier a.x; expression e; @@ - if (<+... x ...+>@e) - {} - else + if (!(e)) {...} @@ expression x, y, z; @@ - x = of_get_mac_address(y, z); + of_get_mac_address(y, z); ... when != x </spml> All drivers, except drivers/net/ethernet/aeroflex/greth.c, were compile-time tested. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-29powerpc/fsl-pci: Fix section mismatch warningMichael Ellerman1-1/+1
Section mismatch in reference from the function .fsl_add_bridge() to the function .init.text:.setup_pci_cmd() fsl_add_bridge() is not __init, and can't be, and is the only caller of setup_pci_cmd(). Fix it by making setup_pci_cmd() non-init. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210314093341.132986-1-mpe@ellerman.id.au
2021-03-29powerpc/xive: use true and false for bool variableYang Li1-1/+1
fixed the following coccicheck: ./arch/powerpc/sysdev/xive/spapr.c:552:8-9: WARNING: return of 0/1 in function 'xive_spapr_match' with return type bool Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1615793096-83758-1-git-send-email-yang.lee@linux.alibaba.com
2020-12-11powerpc/xive: Improve error reporting of OPAL callsCédric Le Goater1-12/+16
Introduce a vp_err() macro to standardize error reporting. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-13-clg@kaod.org
2020-12-11powerpc/xive: Simplify xive_do_source_eoi()Cédric Le Goater1-39/+33
Previous patches removed the need of the first argument which was a hack for Firwmware EOI. Remove it and flatten the routine which has became simpler. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-12-clg@kaod.org
2020-12-11powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FWCédric Le Goater4-32/+1
This flag was used to support the P9 DD1 and we have stopped supporting this CPU when DD2 came out. See skiboot commit: https://github.com/open-power/skiboot/commit/0b0d15e3c170 Also, remove eoi handler which is now unused. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-11-clg@kaod.org
2020-12-11powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FWCédric Le Goater2-41/+1
This flag was used to support the PHB4 LSIs on P9 DD1 and we have stopped supporting this CPU when DD2 came out. See skiboot commit: https://github.com/open-power/skiboot/commit/0b0d15e3c170 Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-10-clg@kaod.org
2020-12-11powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUGCédric Le Goater2-11/+0
This flag was used to support the PHB4 LSIs on P9 DD1 and we have stopped supporting this CPU when DD2 came out. See skiboot commit: https://github.com/open-power/skiboot/commit/0b0d15e3c170 Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-9-clg@kaod.org
2020-12-11powerpc/xive: Add a debug_show handler to the XIVE irq_domainCédric Le Goater1-0/+58
Full state of the Linux interrupt descriptors can be dumped under debugfs when compiled with CONFIG_GENERIC_IRQ_DEBUGFS. Add support for the XIVE interrupt controller. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-7-clg@kaod.org
2020-12-11powerpc/xive: Add a name to the IRQ domainCédric Le Goater4-9/+9
We hope one day to handle multiple irq_domain in the XIVE driver. Start simple by setting the name using the DT node. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-6-clg@kaod.org
2020-12-11powerpc/xive: Introduce XIVE_IPI_HW_IRQCédric Le Goater2-5/+7
The XIVE driver deals with CPU IPIs in a peculiar way. Each CPU has its own XIVE IPI interrupt allocated at the HW level, for PowerNV, or at the hypervisor level for pSeries. In practice, these interrupts are not always used. pSeries/PowerVM prefers local doorbells for local threads since they are faster. On PowerNV, global doorbells are also preferred for the same reason. The mapping in the Linux is reduced to a single interrupt using HW interrupt number 0 and a custom irq_chip to handle EOI. This can cause performance issues in some benchmark (ipistorm) on multichip systems. Clarify the use of the 0 value, it will help in improving multichip support. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-4-clg@kaod.org
2020-12-11powerpc/xive: Rename XIVE_IRQ_NO_EOI to show its a flagCédric Le Goater1-1/+1
This is a simple cleanup to identify easily all flags of the XIVE interrupt structure. The interrupts flagged with XIVE_IRQ_FLAG_NO_EOI are the escalations used to wake up vCPUs in KVM. They are handled very differently from the rest. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201210171450.1933725-3-clg@kaod.org
2020-11-19powerpc: sysdev: add missing iounmap() on error in mpic_msgr_probe()Qinglang Miao1-1/+1
I noticed that iounmap() of msgr_block_addr before return from mpic_msgr_probe() in the error handling case is missing. So use devm_ioremap() instead of just ioremap() when remapping the message register block, so the mapping will be automatically released on probe failure. Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201028091551.136400-1-miaoqinglang@huawei.com
2020-09-18powerpc/xive: Make debug routines staticCédric Le Goater1-2/+2
This fixes a compile error with W=1. CC arch/powerpc/sysdev/xive/common.o ../arch/powerpc/sysdev/xive/common.c:1568:6: error: no previous prototype for ‘xive_debug_show_cpu’ [-Werror=missing-prototypes] void xive_debug_show_cpu(struct seq_file *m, int cpu) ^~~~~~~~~~~~~~~~~~~ ../arch/powerpc/sysdev/xive/common.c:1602:6: error: no previous prototype for ‘xive_debug_show_irq’ [-Werror=missing-prototypes] void xive_debug_show_irq(struct seq_file *m, u32 hw_irq, struct irq_data *d) ^~~~~~~~~~~~~~~~~~~ Fixes: 930914b7d528 ("powerpc/xive: Add a debugfs file to dump internal XIVE state") Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200914211007.2285999-5-clg@kaod.org
2020-08-25powerpc/icp-hv: Fix missing of_node_put() in success pathNicholas Mc Guire1-0/+1
Both of_find_compatible_node() and of_find_node_by_type() will return a refcounted node on success - thus for the success path the node must be explicitly released with a of_node_put(). Fixes: 0b05ac6e2480 ("powerpc/xics: Rewrite XICS driver") Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1530691407-3991-1-git-send-email-hofrat@osadl.org
2020-07-30powerpc: fix function annotations to avoid section mismatch warnings with gcc-10Vladis Dronov1-1/+1
Certain warnings are emitted for powerpc code when building with a gcc-10 toolset: WARNING: modpost: vmlinux.o(.text.unlikely+0x377c): Section mismatch in reference from the function remove_pmd_table() to the function .meminit.text:split_kernel_mapping() The function remove_pmd_table() references the function __meminit split_kernel_mapping(). This is often because remove_pmd_table lacks a __meminit annotation or the annotation of split_kernel_mapping is wrong. Add the appropriate __init and __meminit annotations to make modpost not complain. In all the cases there are just a single callsite from another __init or __meminit function: __meminit remove_pagetable() -> remove_pud_table() -> remove_pmd_table() __init prom_init() -> setup_secure_guest() __init xive_spapr_init() -> xive_spapr_disabled() Signed-off-by: Vladis Dronov <vdronov@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200729133741.62789-1-vdronov@redhat.com
2020-06-22powerpc/xive: Ignore kmemleak false positivesAlexey Kardashevskiy1-0/+2
xive_native_provision_pages() allocates memory and passes the pointer to OPAL so kmemleak cannot find the pointer usage in the kernel memory and produces a false positive report (below) (even if the kernel did scan OPAL memory, it is unable to deal with __pa() addresses anyway). This silences the warning. unreferenced object 0xc000200350c40000 (size 65536): comm "qemu-system-ppc", pid 2725, jiffies 4294946414 (age 70776.530s) hex dump (first 32 bytes): 02 00 00 00 50 00 00 00 00 00 00 00 00 00 00 00 ....P........... 01 00 08 07 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000081ff046c>] xive_native_alloc_vp_block+0x120/0x250 [<00000000d555d524>] kvmppc_xive_compute_vp_id+0x248/0x350 [kvm] [<00000000d69b9c9f>] kvmppc_xive_connect_vcpu+0xc0/0x520 [kvm] [<000000006acbc81c>] kvm_arch_vcpu_ioctl+0x308/0x580 [kvm] [<0000000089c69580>] kvm_vcpu_ioctl+0x19c/0xae0 [kvm] [<00000000902ae91e>] ksys_ioctl+0x184/0x1b0 [<00000000f3e68bd7>] sys_ioctl+0x48/0xb0 [<0000000001b2c127>] system_call_exception+0x124/0x1f0 [<00000000d2b2ee40>] system_call_common+0xe8/0x214 Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200612043303.84894-1-aik@ozlabs.ru
2020-06-18maccess: rename probe_kernel_address to get_kernel_nofaultChristoph Hellwig1-1/+1
Better describe what this helper does, and match the naming of copy_from_kernel_nofault. Also switch the argument order around, so that it acts and looks like get_user(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-17maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofaultChristoph Hellwig1-2/+2
Better describe what these functions do. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: reorder includes after introduction of linux/pgtable.hMike Rapoport1-1/+1
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include of the latter in the middle of asm includes. Fix this up with the aid of the below script and manual adjustments here and there. import sys import re if len(sys.argv) is not 3: print "USAGE: %s <file> <header>" % (sys.argv[0]) sys.exit(1) hdr_to_move="#include <linux/%s>" % sys.argv[2] moved = False in_hdrs = False with open(sys.argv[1], "r") as f: lines = f.readlines() for _line in lines: line = _line.rstrip(' ') if line == hdr_to_move: continue if line.startswith("#include <linux/"): in_hdrs = True elif not moved and in_hdrs: moved = True print hdr_to_move print line Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: introduce include/linux/pgtable.hMike Rapoport2-2/+2
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: don't include asm/pgtable.h if linux/mm.h is already includedMike Rapoport1-1/+0
Patch series "mm: consolidate definitions of page table accessors", v2. The low level page table accessors (pXY_index(), pXY_offset()) are duplicated across all architectures and sometimes more than once. For instance, we have 31 definition of pgd_offset() for 25 supported architectures. Most of these definitions are actually identical and typically it boils down to, e.g. static inline unsigned long pmd_index(unsigned long address) { return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1); } static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address) { return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address); } These definitions can be shared among 90% of the arches provided XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined. For architectures that really need a custom version there is always possibility to override the generic version with the usual ifdefs magic. These patches introduce include/linux/pgtable.h that replaces include/asm-generic/pgtable.h and add the definitions of the page table accessors to the new header. This patch (of 12): The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the functions involving page table manipulations, e.g. pte_alloc() and pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h> in the files that include <linux/mm.h>. The include statements in such cases are remove with a simple loop: for f in $(git grep -l "include <linux/mm.h>") ; do sed -i -e '/include <asm\/pgtable.h>/ d' $f done Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02powerpc/rtas: Implement reentrant rtas callLeonardo Bras1-11/+11
Implement rtas_call_reentrant() for reentrant rtas-calls: "ibm,int-on", "ibm,int-off",ibm,get-xive" and "ibm,set-xive". On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4, items 2 and 3 say: 2 - For the PowerPC External Interrupt option: The * call must be reentrant to the number of processors on the platform. 3 - For the PowerPC External Interrupt option: The * argument call buffer for each simultaneous call must be physically unique. So, these rtas-calls can be called in a lockless way, if using a different buffer for each cpu doing such rtas call. For this, it was suggested to add the buffer (struct rtas_args) in the PACA struct, so each cpu can have it's own buffer. The PACA struct received a pointer to rtas buffer, which is allocated in the memory range available to rtas 32-bit. Reentrant rtas calls are useful to avoid deadlocks in crashing, where rtas-calls are needed, but some other thread crashed holding the rtas.lock. This is a backtrace of a deadlock from a kdump testing environment: #0 arch_spin_lock #1 lock_rtas () #2 rtas_call (token=8204, nargs=1, nret=1, outputs=0x0) #3 ics_rtas_mask_real_irq (hw_irq=4100) #4 machine_kexec_mask_interrupts #5 default_machine_crash_shutdown #6 machine_crash_shutdown #7 __crash_kexec #8 crash_kexec #9 oops_end Signed-off-by: Leonardo Bras <leobras.c@gmail.com> [mpe: Move under #ifdef PSERIES to avoid build breakage] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200518234245.200672-3-leobras.c@gmail.com
2020-05-28powerpc/xive: Share the event-queue page with the Hypervisor.Ram Pai1-0/+7
XIVE interrupt controller uses an Event Queue (EQ) to enqueue event notifications when an exception occurs. The EQ is a single memory page provided by the O/S defining a circular buffer, one per server and priority couple. On baremetal, the EQ page is configured with an OPAL call. On pseries, an extra hop is necessary and the guest OS uses the hcall H_INT_SET_QUEUE_CONFIG to configure the XIVE interrupt controller. The XIVE controller being Hypervisor privileged, it will not be allowed to enqueue event notifications for a Secure VM unless the EQ pages are shared by the Secure VM. Hypervisor/Ultravisor still requires support for the TIMA and ESB page fault handlers. Until this is complete, QEMU can use the emulated XIVE device for Secure VMs, option "kernel_irqchip=off" on the QEMU pseries machine. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Reviewed-by: Cedric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200426020518.GC5853@oc0525413822.ibm.com
2020-05-28powerpc: Remove Xilinx PPC405/PPC440 supportMichal Simek3-222/+0
The latest Xilinx design tools called ISE and EDK has been released in October 2013. New tool doesn't support any PPC405/PPC440 new designs. These platforms are no longer supported and tested. PowerPC 405/440 port is orphan from 2013 by commit cdeb89943bfc ("MAINTAINERS: Fix incorrect status tag") and commit 19624236cce1 ("MAINTAINERS: Update Grant's email address and maintainership") that's why it is time to remove the support fot these platforms. Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8c593895e2cb57d232d85ce4d8c3a1aa7f0869cc.1590079968.git.christophe.leroy@csgroup.eu
2020-05-28powerpc/xive: Do not expose a debugfs file when XIVE is disabledCédric Le Goater1-2/+3
The XIVE interrupt mode can be disabled with the "xive=off" kernel parameter, in which case there is nothing to present to the user in the associated /sys/kernel/debug/powerpc/xive file. Fixes: 930914b7d528 ("powerpc/xive: Add a debugfs file to dump internal XIVE state") Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200429075122.1216388-4-clg@kaod.org
2020-05-26powerpc/xive: Clear the page tables for the ESB IO mappingCédric Le Goater1-0/+5
Commit 1ca3dec2b2df ("powerpc/xive: Prevent page fault issues in the machine crash handler") fixed an issue in the FW assisted dump of machines using hash MMU and the XIVE interrupt mode under the POWER hypervisor. It forced the mapping of the ESB page of interrupts being mapped in the Linux IRQ number space to make sure the 'crash kexec' sequence worked during such an event. But it didn't handle the un-mapping. This mapping is now blocking the removal of a passthrough IO adapter under the POWER hypervisor because it expects the guest OS to have cleared all page table entries related to the adapter. If some are still present, the RTAS call which isolates the PCI slot returns error 9001 "valid outstanding translations". Remove these mapping in the IRQ data cleanup routine. Under KVM, this cleanup is not required because the ESB pages for the adapter interrupts are un-mapped from the guest by the hypervisor in the KVM XIVE native device. This is now redundant but it's harmless. Fixes: 1ca3dec2b2df ("powerpc/xive: Prevent page fault issues in the machine crash handler") Cc: stable@vger.kernel.org # v5.5+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200429075122.1216388-2-clg@kaod.org
2020-05-26Merge "Use hugepages to map kernel mem on 8xx" into nextMichael Ellerman1-0/+2
Merge Christophe's large series to use huge pages for the linear mapping on 8xx. From his cover letter: The main purpose of this big series is to: - reorganise huge page handling to avoid using mm_slices. - use huge pages to map kernel memory on the 8xx. The 8xx supports 4 page sizes: 4k, 16k, 512k and 8M. It uses 2 Level page tables, PGD having 1024 entries, each entry covering 4M address space. Then each page table has 1024 entries. At the time being, page sizes are managed in PGD entries, implying the use of mm_slices as it can't mix several pages of the same size in one page table. The first purpose of this series is to reorganise things so that standard page tables can also handle 512k pages. This is done by adding a new _PAGE_HUGE flag which will be copied into the Level 1 entry in the TLB miss handler. That done, we have 2 types of pages: - PGD entries to regular page tables handling 4k/16k and 512k pages - PGD entries to hugepd tables handling 8M pages. There is no need to mix 8M pages with other sizes, because a 8M page will use more than what a single PGD covers. Then comes the second purpose of this series. At the time being, the 8xx has implemented special handling in the TLB miss handlers in order to transparently map kernel linear address space and the IMMR using huge pages by building the TLB entries in assembly at the time of the exception. As mm_slices is only for user space pages, and also because it would anyway not be convenient to slice kernel address space, it was not possible to use huge pages for kernel address space. But after step one of the series, it is now more flexible to use huge pages. This series drop all assembly 'just in time' handling of huge pages and use huge pages in page tables instead. Once the above is done, then comes icing on the cake: - Use huge pages for KASAN shadow mapping - Allow pinned TLBs with strict kernel rwx - Allow pinned TLBs with debug pagealloc Then, last but not least, those modifications for the 8xx allows the following improvement on book3s/32: - Mapping KASAN shadow with BATs - Allowing BATs with debug pagealloc All this allows to considerably simplify TLB miss handlers and associated initialisation. The overhead of reading page tables is negligible compared to the reduction of the miss handlers. While we were at touching pte_update(), some cleanup was done there too. Tested widely on 8xx and 832x. Boot tested on QEMU MAC99.
2020-05-26powerpc/8xx: Don't set IMMR map anymore at bootChristophe Leroy1-0/+2
Only early debug requires IMMR to be mapped early. No need to set it up and pin it in assembly. Map it through page tables at udbg init when necessary. If CONFIG_PIN_TLB_IMMR is selected, pin it once we don't need the 32 Mb pinned RAM anymore. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/13c1e8539fdf363d3146f4884e5c3c76c6c308b5.1589866984.git.christophe.leroy@csgroup.eu
2020-05-20Merge branch 'topic/ppc-kvm' into nextMichael Ellerman1-0/+3
Merge our topic branch shared with the kvm-ppc tree. This brings in one commit that touches the XIVE interrupt controller logic across core and KVM code.
2020-05-07powerpc/xive: Enforce load-after-store ordering when StoreEOI is activeCédric Le Goater1-0/+3
When an interrupt has been handled, the OS notifies the interrupt controller with a EOI sequence. On a POWER9 system using the XIVE interrupt controller, this can be done with a load or a store operation on the ESB interrupt management page of the interrupt. The StoreEOI operation has less latency and improves interrupt handling performance but it was deactivated during the POWER9 DD2.0 timeframe because of ordering issues. We use the LoadEOI today but we plan to reactivate StoreEOI in future architectures. There is usually no need to enforce ordering between ESB load and store operations as they should lead to the same result. E.g. a store trigger and a load EOI can be executed in any order. Assuming the interrupt state is PQ=10, a store trigger followed by a load EOI will return a Q bit. In the reverse order, it will create a new interrupt trigger from HW. In both cases, the handler processing interrupts is notified. In some cases, the XIVE_ESB_SET_PQ_10 load operation is used to disable temporarily the interrupt source (mask/unmask). When the source is reenabled, the OS can detect if interrupts were received while the source was disabled and reinject them. This process needs special care when StoreEOI is activated. The ESB load and store operations should be correctly ordered because a XIVE_ESB_STORE_EOI operation could leave the source enabled if it has not completed before the loads. For those cases, we enforce Load-after-Store ordering with a special load operation offset. To avoid performance impact, this ordering is only enforced when really needed, that is when interrupt sources are temporarily disabled with the XIVE_ESB_SET_PQ_10 load. It should not be needed for other loads. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200220081506.31209-1-clg@kaod.org
2020-04-20powerpc/xive: Define xive_native_alloc_irq_on_chip()Haren Myneni1-3/+3
This function allocates IRQ on a specific chip. VAS needs per chip IRQ allocation and will have IRQ handler per VAS instance. Signed-off-by: Haren Myneni <haren@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1587016720.2275.1047.camel@hbabu-laptop
2020-03-27powerpc/xive: Add a debugfs file to dump internal XIVE stateCédric Le Goater4-0/+129
As does XMON, the debugfs file /sys/kernel/debug/powerpc/xive exposes the XIVE internal state of the machine CPUs and interrupts. Available on the PowerNV and sPAPR platforms. Signed-off-by: Cédric Le Goater <clg@kaod.org> [mpe: Make the debugfs file 0400] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-5-clg@kaod.org
2020-03-27powerpc/xmon: Add source flags to output of XIVE interruptsCédric Le Goater1-1/+4
Some firmwares or hypervisors can advertise different source characteristics. Track their value under XMON. What we are mostly interested in is the StoreEOI flag. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-4-clg@kaod.org
2020-03-27powerpc/xive: Fix xmon support on the PowerNV platformCédric Le Goater1-0/+4
The PowerNV platform has multiple IRQ chips and the xmon command dumping the state of the XIVE interrupt should only operate on the XIVE IRQ chip. Fixes: 5896163f7f91 ("powerpc/xmon: Improve output of XIVE interrupts") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-3-clg@kaod.org
2020-03-27powerpc/xive: Use XIVE_BAD_IRQ instead of zero to catch non configured IPIsCédric Le Goater4-13/+14
When a CPU is brought up, an IPI number is allocated and recorded under the XIVE CPU structure. Invalid IPI numbers are tracked with interrupt number 0x0. On the PowerNV platform, the interrupt number space starts at 0x10 and this works fine. However, on the sPAPR platform, it is possible to allocate the interrupt number 0x0 and this raises an issue when CPU 0 is unplugged. The XIVE spapr driver tracks allocated interrupt numbers in a bitmask and it is not correctly updated when interrupt number 0x0 is freed. It stays allocated and it is then impossible to reallocate. Fix by using the XIVE_BAD_IRQ value instead of zero on both platforms. Reported-by: David Gibson <david@gibson.dropbear.id.au> Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-2-clg@kaod.org
2020-02-04Merge tag 'powerpc-5.6-1' of ↵Linus Torvalds2-8/+6
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "A pretty small batch for us, and apologies for it being a bit late, I wanted to sneak Christophe's user_access_begin() series in. Summary: - Implement user_access_begin() and friends for our platforms that support controlling kernel access to userspace. - Enable CONFIG_VMAP_STACK on 32-bit Book3S and 8xx. - Some tweaks to our pseries IOMMU code to allow SVMs ("secure" virtual machines) to use the IOMMU. - Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE to the 32-bit VDSO, and some other improvements. - A series to use the PCI hotplug framework to control opencapi card's so that they can be reset and re-read after flashing a new FPGA image. As well as other minor fixes and improvements as usual. Thanks to: Alastair D'Silva, Alexandre Ghiti, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Bai Yingjie, Chen Zhou, Christophe Leroy, Frederic Barrat, Greg Kurz, Jason A. Donenfeld, Joel Stanley, Jordan Niethe, Julia Lawall, Krzysztof Kozlowski, Laurent Dufour, Laurentiu Tudor, Linus Walleij, Michael Bringmann, Nathan Chancellor, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Peter Ujfalusi, Pingfan Liu, Ram Pai, Randy Dunlap, Russell Currey, Sam Bobroff, Sebastian Andrzej Siewior, Shawn Anastasio, Stephen Rothwell, Steve Best, Sukadev Bhattiprolu, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain" * tag 'powerpc-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (131 commits) powerpc: configs: Cleanup old Kconfig options powerpc/configs/skiroot: Enable some more hardening options powerpc/configs/skiroot: Disable xmon default & enable reboot on panic powerpc/configs/skiroot: Enable security features powerpc/configs/skiroot: Update for symbol movement only powerpc/configs/skiroot: Drop default n CONFIG_CRYPTO_ECHAINIV powerpc/configs/skiroot: Drop HID_LOGITECH powerpc/configs: Drop NET_VENDOR_HP which moved to staging powerpc/configs: NET_CADENCE became NET_VENDOR_CADENCE powerpc/configs: Drop CONFIG_QLGE which moved to staging powerpc: Do not consider weak unresolved symbol relocations as bad powerpc/32s: Fix kasan_early_hash_table() for CONFIG_VMAP_STACK powerpc: indent to improve Kconfig readability powerpc: Provide initial documentation for PAPR hcalls powerpc: Implement user_access_save() and user_access_restore() powerpc: Implement user_access_begin and friends powerpc/32s: Prepare prevent_user_access() for user_access_end() powerpc/32s: Drop NULL addr verification powerpc/kuap: Fix set direction in allow/prevent_user_access() powerpc/32s: Fix bad_kuap_fault() ...
2020-01-26powerpc: use probe_user_read() and probe_user_write()Christophe Leroy1-6/+4
Instead of opencoding, use probe_user_read() to failessly read a user location and probe_user_write() for writing to user. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e041f5eedb23f09ab553be8a91c3de2087147320.1579800517.git.christophe.leroy@c-s.fr
2020-01-22powerpc/xive: Discard ESB load value when interrupt is invalidFrederic Barrat1-3/+12
A load on an ESB page returning all 1's means that the underlying device has invalidated the access to the PQ state of the interrupt through mmio. It may happen, for example when querying a PHB interrupt while the PHB is in an error state. In that case, we should consider the interrupt to be invalid when checking its state in the irq_get_irqchip_state() handler. Fixes: da15c03b047d ("powerpc/xive: Implement get_irqchip_state method for XIVE to fix shutdown race") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> [clg: wrote a commit log, introduced XIVE_ESB_INVALID ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200113130118.27969-1-clg@kaod.org
2020-01-07powerpc/mpic: constify copied structureJulia Lawall1-2/+2
The mpic_ipi_chip and mpic_irq_ht_chip structures are only copied into other structures, so make them const. The opportunity for this change was found using Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1577864614-5543-10-git-send-email-Julia.Lawall@inria.fr
2019-12-05powerpc/xive: Skip ioremap() of ESB pages for LSI interruptsCédric Le Goater<