summaryrefslogtreecommitdiff
path: root/Documentation/admin-guide
AgeCommit message (Collapse)AuthorFilesLines
2022-06-27x86/bugs: Add retbleed=ibpbPeter Zijlstra1-0/+3
jmp2ret mitigates the easy-to-attack case at relatively low overhead. It mitigates the long speculation windows after a mispredicted RET, but it does not mitigate the short speculation window from arbitrary instruction boundaries. On Zen2, there is a chicken bit which needs setting, which mitigates "arbitrary instruction boundaries" down to just "basic block boundaries". But there is no fix for the short speculation window on basic block boundaries, other than to flush the entire BTB to evict all attacker predictions. On the spectrum of "fast & blurry" -> "safe", there is (on top of STIBP or no-SMT): 1) Nothing System wide open 2) jmp2ret May stop a script kiddy 3) jmp2ret+chickenbit Raises the bar rather further 4) IBPB Only thing which can count as "safe". Tentative numbers put IBPB-on-entry at a 2.5x hit on Zen2, and a 10x hit on Zen1 according to lmbench. [ bp: Fixup feature bit comments, document option, 32-bit build fix. ] Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-27x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRSPawan Gupta1-0/+1
Extend spectre_v2= boot option with Kernel IBRS. [jpoimboe: no STIBP with IBRS] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-27x86/bugs: Enable STIBP for JMP2RETKim Phillips1-5/+11
For untrained return thunks to be fully effective, STIBP must be enabled or SMT disabled. Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-27x86/bugs: Add AMD retbleed= boot parameterAlexandre Chartre1-0/+15
Add the "retbleed=<value>" boot parameter to select a mitigation for RETBleed. Possible values are "off", "auto" and "unret" (JMP2RET mitigation). The default value is "auto". Currently, "retbleed=auto" will select the unret mitigation on AMD and Hygon and no mitigation on Intel (JMP2RET is not effective on Intel). [peterz: rebase; add hygon] [jpoimboe: cleanups] Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-14Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+0
Pull kvm fixes from Paolo Bonzini: "While last week's pull request contained miscellaneous fixes for x86, this one covers other architectures, selftests changes, and a bigger series for APIC virtualization bugs that were discovered during 5.20 development. The idea is to base 5.20 development for KVM on top of this tag. ARM64: - Properly reset the SVE/SME flags on vcpu load - Fix a vgic-v2 regression regarding accessing the pending state of a HW interrupt from userspace (and make the code common with vgic-v3) - Fix access to the idreg range for protected guests - Ignore 'kvm-arm.mode=protected' when using VHE - Return an error from kvm_arch_init_vm() on allocation failure - A bunch of small cleanups (comments, annotations, indentation) RISC-V: - Typo fix in arch/riscv/kvm/vmid.c - Remove broken reference pattern from MAINTAINERS entry x86-64: - Fix error in page tables with MKTME enabled - Dirty page tracking performance test extended to running a nested guest - Disable APICv/AVIC in cases that it cannot implement correctly" [ This merge also fixes a misplaced end parenthesis bug introduced in commit 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base") pointed out by Sean Christopherson ] Link: https://lore.kernel.org/all/20220610191813.371682-1-seanjc@google.com/ * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (34 commits) KVM: selftests: Restrict test region to 48-bit physical addresses when using nested KVM: selftests: Add option to run dirty_log_perf_test vCPUs in L2 KVM: selftests: Clean up LIBKVM files in Makefile KVM: selftests: Link selftests directly with lib object files KVM: selftests: Drop unnecessary rule for STATIC_LIBS KVM: selftests: Add a helper to check EPT/VPID capabilities KVM: selftests: Move VMX_EPT_VPID_CAP_AD_BITS to vmx.h KVM: selftests: Refactor nested_map() to specify target level KVM: selftests: Drop stale function parameter comment for nested_map() KVM: selftests: Add option to create 2M and 1G EPT mappings KVM: selftests: Replace x86_page_size with PG_LEVEL_XX KVM: x86: SVM: fix nested PAUSE filtering when L0 intercepts PAUSE KVM: x86: SVM: drop preempt-safe wrappers for avic_vcpu_load/put KVM: x86: disable preemption around the call to kvm_arch_vcpu_{un|}blocking KVM: x86: disable preemption while updating apicv inhibition KVM: x86: SVM: fix avic_kick_target_vcpus_fast KVM: x86: SVM: remove avic's broken code that updated APIC ID KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base KVM: x86: document AVIC/APICv inhibit reasons KVM: x86/mmu: Set memory encryption "value", not "mask", in shadow PDPTRs ...
2022-06-14Merge tag 'x86-bugs-2022-06-01' of ↵Linus Torvalds3-0/+283
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 MMIO stale data fixes from Thomas Gleixner: "Yet another hw vulnerability with a software mitigation: Processor MMIO Stale Data. They are a class of MMIO-related weaknesses which can expose stale data by propagating it into core fill buffers. Data which can then be leaked using the usual speculative execution methods. Mitigations include this set along with microcode updates and are similar to MDS and TAA vulnerabilities: VERW now clears those buffers too" * tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation/mmio: Print SMT warning KVM: x86/speculation: Disable Fill buffer clear within guests x86/speculation/mmio: Reuse SRBDS mitigation for SBDS x86/speculation/srbds: Update SRBDS mitigation selection x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data x86/speculation/mmio: Enable CPU Fill buffer clearing on idle x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data x86/speculation: Add a common function for MD_CLEAR mitigation update x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug Documentation: Add documentation for Processor MMIO Stale Data
2022-06-09KVM: arm64: Ignore 'kvm-arm.mode=protected' when using VHEWill Deacon1-1/+0
Ignore 'kvm-arm.mode=protected' when using VHE so that kvm_get_mode() only returns KVM_MODE_PROTECTED on systems where the feature is available. Cc: David Brazdil <dbrazdil@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-4-will@kernel.org
2022-06-03Merge tag 'driver-core-5.19-rc1' of ↵Linus Torvalds1-3/+8
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core changes for 5.19-rc1. Lots of tiny driver core changes and cleanups happened this cycle, but the two major things are: - firmware_loader reorganization and additions including the ability to have XZ compressed firmware images and the ability for userspace to initiate the firmware load when it needs to, instead of being always initiated by the kernel. FPGA devices specifically want this ability to have their firmware changed over the lifetime of the system boot, and this allows them to work without having to come up with yet-another-custom-uapi interface for loading firmware for them. - physical location support added to sysfs so that devices that know this information, can tell userspace where they are located in a common way. Some ACPI devices already support this today, and more bus types should support this in the future. Smaller changes include: - driver_override api cleanups and fixes - error path cleanups and fixes - get_abi script fixes - deferred probe timeout changes. It's that last change that I'm the most worried about. It has been reported to cause boot problems for a number of systems, and I have a tested patch series that resolves this issue. But I didn't get it merged into my tree before 5.18-final came out, so it has not gotten any linux-next testing. I'll send the fixup patches (there are 2) as a follow-on series to this pull request. All have been tested in linux-next for weeks, with no reported issues other than the above-mentioned boot time-outs" * tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) driver core: fix deadlock in __device_attach kernfs: Separate kernfs_pr_cont_buf and rename_lock. topology: Remove unused cpu_cluster_mask() driver core: Extend deferred probe timeout on driver registration MAINTAINERS: add Russ Weight as a firmware loader maintainer driver: base: fix UAF when driver_attach failed test_firmware: fix end of loop test in upload_read_show() driver core: location: Add "back" as a possible output for panel driver core: location: Free struct acpi_pld_info *pld driver core: Add "*" wildcard support to driver_async_probe cmdline param driver core: location: Check for allocations failure arch_topology: Trace the update thermal pressure kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file. export: fix string handling of namespace in EXPORT_SYMBOL_NS rpmsg: use local 'dev' variable rpmsg: Fix calling device_lock() on non-initialized device firmware_loader: describe 'module' parameter of firmware_upload_register() firmware_loader: Move definitions from sysfs_upload.h to sysfs.h firmware_loader: Fix configs for sysfs split selftests: firmware: Add firmware upload selftests ...
2022-06-02Merge tag 'docs-5.19-2' of git://git.lwn.net/linuxLinus Torvalds1-3/+3
Pull documentation fixes from Jonathan Corbet: "A handful of late-arriving documentation fixes and the addition of an SVG tux logo which, I'm assured, we're going to want" * tag 'docs-5.19-2' of git://git.lwn.net/linux: documentation: Format button_dev as a pointer. docs: add SVG version of the Linux logo docs: move Linux logo into a new `images` folder docs: blockdev: change title to match section content docs/conf.py: Cope with removal of language=None in Sphinx 5.0.0
2022-06-01docs: blockdev: change title to match section contentJoel Colledge1-3/+3
This index.rst was added in commit 39443104c7d3 docs: blockdev: convert to ReST It appears that the title from the RapidIO index page was copied. This title does not match the content of this directory. Change it to match. Signed-off-by: Joel Colledge <joel.colledge@linbit.com> Link: https://lore.kernel.org/r/20220530142849.717-1-joel.colledge@linbit.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2022-05-31Merge tag 'nfs-for-5.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds1-6/+9
Pull NFS client updates from Anna Schumaker: "New Features: - Add support for 'dacl' and 'sacl' attributes Bugfixes and Cleanups: - Fixes for reporting mapping errors - Fixes for memory allocation errors - Improve warning message when locks are lost - Update documentation for the nfs4_unique_id parameter - Add an explanation of NFSv4 client identifiers - Ensure the i_size attribute is written to the fscache storage - Fix freeing uninitialized nfs4_labels - Better handling when xprtrdma bc_serv is NULL - Mark qualified async operations as MOVEABLE tasks" * tag 'nfs-for-5.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFSv4.1 mark qualified async operations as MOVEABLE tasks xprtrdma: treat all calls not a bcall when bc_serv is NULL NFSv4: Fix free of uninitialized nfs4_label on referral lookup. NFS: Pass i_size to fscache_unuse_cookie() when a file is released Documentation: Add an explanation of NFSv4 client identifiers NFS: update documentation for the nfs4_unique_id parameter NFS: Improve warning message when locks are lost. NFSv4.1: Enable access to the NFSv4.1 'dacl' and 'sacl' attributes NFSv4: Add encoders/decoders for the NFSv4.1 dacl and sacl attributes NFSv4: Specify the type of ACL to cache NFSv4: Don't hold the layoutget locks across multiple RPC calls pNFS/files: Fall back to I/O through the MDS on non-fatal layout errors NFS: Further fixes to the writeback error handling NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout NFS: Memory allocation failures are not server fatal errors NFS: Don't report errors from nfs_pageio_complete() more than once NFS: Do not report flush errors in nfs_write_end() NFS: Don't report ENOSPC write errors twice NFS: fsync() should report filesystem errors over EINTR/ERESTARTSYS NFS: Do not report EINTR/ERESTARTSYS as mapping errors
2022-05-30Merge tag 'pm-5.19-rc1-2' of ↵Linus Torvalds1-0/+22
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These update the ARM cpufreq drivers and fix up the CPPC cpufreq driver after recent changes, update the OPP code and PM documentation and add power sequences support to the system reboot and power off code. Specifics: - Add Tegra234 cpufreq support (Sumit Gupta) - Clean up and enhance the Mediatek cpufreq driver (Wan Jiabing, Rex-BC Chen, and Jia-Wei Chang) - Fix up the CPPC cpufreq driver after recent changes (Zheng Bin, Pierre Gondois) - Minor update to dt-binding for Qcom's opp-v2-kryo-cpu (Yassine Oudjana) - Use list iterator only inside the list_for_each_entry loop (Xiaomeng Tong, and Jakob Koschel) - New APIs related to finding OPP based on interconnect bandwidth (Krzysztof Kozlowski) - Fix the missing of_node_put() in _bandwidth_supported() (Dan Carpenter) - Cleanups (Krzysztof Kozlowski, and Viresh Kumar) - Add Out of Band mode description to the intel-speed-select utility documentation (Srinivas Pandruvada) - Add power sequences support to the system reboot and power off code and make related platform-specific changes for multiple platforms (Dmitry Osipenko, Geert Uytterhoeven)" * tag 'pm-5.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (60 commits) cpufreq: CPPC: Fix unused-function warning cpufreq: CPPC: Fix build error without CONFIG_ACPI_CPPC_CPUFREQ_FIE Documentation: admin-guide: PM: Add Out of Band mode kernel/reboot: Change registration order of legacy power-off handler m68k: virt: Switch to new sys-off handler API kernel/reboot: Add devm_register_restart_handler() kernel/reboot: Add devm_register_power_off_handler() soc/tegra: pmc: Use sys-off handler API to power off Nexus 7 properly reboot: Remove pm_power_off_prepare() regulator: pfuze100: Use devm_register_sys_off_handler() ACPI: power: Switch to sys-off handler API memory: emif: Use kernel_can_power_off() mips: Use do_kernel_power_off() ia64: Use do_kernel_power_off() x86: Use do_kernel_power_off() sh: Use do_kernel_power_off() m68k: Switch to new sys-off handler API powerpc: Use do_kernel_power_off() xen/x86: Use do_kernel_power_off() parisc: Use do_kernel_power_off() ...
2022-05-29Merge tag 'trace-v5.19' of ↵Linus Torvalds1-3/+28
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "The majority of the changes are for fixes and clean ups. Notable changes: - Rework trace event triggers code to be easier to interact with. - Support for embedding bootconfig with the kernel (as suppose to having it embedded in initram). This is useful for embedded boards without initram disks. - Speed up boot by parallelizing the creation of tracefs files. - Allow absolute ring buffer timestamps handle timestamps that use more than 59 bits. - Added new tracing clock "TAI" (International Atomic Time) - Have weak functions show up in available_filter_function list as: __ftrace_invalid_address___<invalid-offset> instead of using the name of the function before it" * tag 'trace-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (52 commits) ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function tracing: Fix comments for event_trigger_separate_filter() x86/traceponit: Fix comment about irq vector tracepoints x86,tracing: Remove unused headers ftrace: Clean up hash direct_functions on register failures tracing: Fix comments of create_filter() tracing: Disable kcov on trace_preemptirq.c tracing: Initialize integer variable to prevent garbage return value ftrace: Fix typo in comment ftrace: Remove return value of ftrace_arch_modify_*() tracing: Cleanup code by removing init "char *name" tracing: Change "char *" string form to "char []" tracing/timerlat: Do not wakeup the thread if the trace stops at the IRQ tracing/timerlat: Print stacktrace in the IRQ handler if needed tracing/timerlat: Notify IRQ new max latency only if stop tracing is set kprobes: Fix build errors with CONFIG_KRETPROBES=n tracing: Fix return value of trace_pid_write() tracing: Fix potential double free in create_var_ref() tracing: Use strim() to remove whitespace instead of doing it manually ftrace: Deal with error return code of the ftrace_process_locs() function ...
2022-05-27Merge tag 'pci-v5.19-changes' of ↵Linus Torvalds1-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Resource management: - Restrict E820 clipping to PCI host bridge windows (Bjorn Helgaas) - Log E820 clipping better (Bjorn Helgaas) - Add kernel cmdline options to enable/disable E820 clipping (Hans de Goede) - Disable E820 reserved region clipping for IdeaPads, Yoga, Yoga Slip, Acer Spin 5, Clevo Barebone systems where clipping leaves no usable address space for touchpads, Thunderbolt devices, etc (Hans de Goede) - Disable E820 clipping by default starting in 2023 (Hans de Goede) PCI device hotplug: - Include files to remove implicit dependencies (Christophe Leroy) - Only put Root Ports in D3 if they can signal and wake from D3 so AMD Yellow Carp doesn't miss hotplug events (Mario Limonciello) Power management: - Define pci_restore_standard_config() only for CONFIG_PM_SLEEP since it's unused otherwise (Krzysztof Kozlowski) - Power up devices completely, including anything platform firmware needs to do, during runtime resume (Rafael J. Wysocki) - Move pci_resume_bus() to PM callbacks so we observe the required bridge power-up delays (Rafael J. Wysocki) - Drop unneeded runtime_d3cold device flag (Rafael J. Wysocki) - Split pci_raw_set_power_state() between pci_power_up() and a new pci_set_low_power_state() (Rafael J. Wysocki) - Set current_state to D3cold if config read returns ~0, indicating the device is not accessible (Rafael J. Wysocki) - Do not call pci_update_current_state() from pci_power_up() so BARs and ASPM config are restored correctly (Rafael J. Wysocki) - Write 0 to PMCSR in pci_power_up() in all cases (Rafael J. Wysocki) - Split pci_power_up() to pci_set_full_power_state() to avoid some redundant operations (Rafael J. Wysocki) - Skip restoring BARs if device is not in D0 (Rafael J. Wysocki) - Rearrange and clarify pci_set_power_state() (Rafael J. Wysocki) - Remove redundant BAR restores from pci_pm_thaw_noirq() (Rafael J. Wysocki) Virtualization: - Acquire device lock before config space access lock to avoid AB/BA deadlock with sriov_numvfs_store() (Yicong Yang) Error handling: - Clear MULTI_ERR_COR/UNCOR_RCV bits, which a race could previously leave permanently set (Kuppuswamy Sathyanarayanan) Peer-to-peer DMA: - Whitelist Intel Skylake-E Root Ports regardless of which devfn they are (Shlomo Pongratz) ASPM: - Override L1 acceptable latency advertised by Intel DG2 so ASPM L1 can be enabled (Mika Westerberg) Cadence PCIe controller driver: - Set up device-specific register to allow PTM Responder to be enabled by the normal architected bit (Christian Gmeiner) - Override advertised FLR support since the controller doesn't implement FLR correctly (Parshuram Thombare) Cadence PCIe endpoint driver: - Correct bitmap size for the ob_region_map of outbound window usage (Dan Carpenter) Freescale i.MX6 PCIe controller driver: - Fix PERST# assertion/deassertion so we observe the required delays before accessing device (Francesco Dolcini) Freescale Layerscape PCIe controller driver: - Add "big-endian" DT property (Hou Zhiqiang) - Update SCFG DT property (Hou Zhiqiang) - Add "aer", "pme", "intr" DT properties (Li Yang) - Add DT compatible strings for ls1028a (Xiaowei Bao) Intel VMD host bridge driver: - Assign VMD IRQ domain before enumeration to avoid IOMMU interrupt remapping errors when MSI-X remapping is disabled (Nirmal Patel) - Revert VMD workaround that kept MSI-X remapping enabled when IOMMU remapping was enabled (Nirmal Patel) Marvell MVEBU PCIe controller driver: - Add of_pci_get_slot_power_limit() to parse the 'slot-power-limit-milliwatt' DT property (Pali Rohár) - Add mvebu support for sending Set_Slot_Power_Limit message (Pali Rohár) MediaTek PCIe controller driver: - Fix refcount leak in mtk_pcie_subsys_powerup() (Miaoqian Lin) MediaTek PCIe Gen3 controller driver: - Reset PHY and MAC at probe time (AngeloGioacchino Del Regno) Microchip PolarFlare PCIe controller driver: - Add chained_irq_enter()/chained_irq_exit() calls to mc_handle_msi() and mc_handle_intx() to avoid lost interrupts (Conor Dooley) - Fix interrupt handling race (Daire McNamara) NVIDIA Tegra194 PCIe controller driver: - Drop tegra194 MSI register save/restore, which is unnecessary since the DWC core does it (Jisheng Zhang) Qualcomm PCIe controller driver: - Add SM8150 SoC DT binding and support (Bhupesh Sharma) - Fix pipe clock imbalance (Johan Hovold) - Fix runtime PM imbalance on probe errors (Johan Hovold) - Fix PHY init imbalance on probe errors (Johan Hovold) - Convert DT binding to YAML (Dmitry Baryshkov) - Update DT binding to show that resets aren't required for MSM8996/APQ8096 platforms (Dmitry Baryshkov) - Add explicit register names per chipset in DT binding (Dmitry Baryshkov) - Add sc7280-specific clock and reset definitions to DT binding (Dmitry Baryshkov) Rockchip PCIe controller driver: - Fix bitmap size when searching for free outbound region (Dan Carpenter) Rockchip DesignWare PCIe controller driver: - Remove "snps,dw-pcie" from rockchip-dwc DT "compatible" property because it's not fully compatible with rockchip (Peter Geis) - Reset rockchip-dwc controller at probe (Peter Geis) - Add rockchip-dwc INTx support (Peter Geis) Synopsys DesignWare PCIe controller driver: - Return error instead of success if DMA mapping of MSI area fails (Jiantao Zhang) Miscellaneous: - Change pci_set_dma_mask() documentation references to dma_set_mask() (Alex Williamson)" * tag 'pci-v5.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (64 commits) dt-bindings: PCI: qcom: Add schema for sc7280 chipset dt-bindings: PCI: qcom: Specify reg-names explicitly dt-bindings: PCI: qcom: Do not require resets on msm8996 platforms dt-bindings: PCI: qcom: Convert to YAML PCI: qcom: Fix unbalanced PHY init on probe errors PCI: qcom: Fix runtime PM imbalance on probe errors PCI: qcom: Fix pipe clock imbalance PCI: qcom: Add SM8150 SoC support dt-bindings: pci: qcom: Document PCIe bindings for SM8150 SoC x86/PCI: Disable E820 reserved region clipping starting in 2023 x86/PCI: Disable E820 reserved region clipping via quirks x86/PCI: Add kernel cmdline options to use/ignore E820 reserved regions PCI: microchip: Fix potential race in interrupt handling PCI/AER: Clear MULTI_ERR_COR/UNCOR_RCV bits PCI: cadence: Clear FLR in device capabilities register PCI: cadence: Allow PTM Responder to be enabled PCI: vmd: Revert 2565e5b69c44 ("PCI: vmd: Do not disable MSI-X remapping if interrupt remapping is enabled by IOMMU.") PCI: vmd: Assign VMD IRQ domain before enumeration PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store() PCI: rockchip-dwc: Add legacy interrupt support ...
2022-05-26Merge tag 'mm-stable-2022-05-25' of ↵Linus Torvalds8-19/+165
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Almost all of MM here. A few things are still getting finished off, reviewed, etc. - Yang Shi has improved the behaviour of khugepaged collapsing of readonly file-backed transparent hugepages. - Johannes Weiner has arranged for zswap memory use to be tracked and managed on a per-cgroup basis. - Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime enablement of the recent huge page vmemmap optimization feature. - Baolin Wang contributes a series to fix some issues around hugetlb pagetable invalidation. - Zhenwei Pi has fixed some interactions between hwpoisoned pages and virtualization. - Tong Tiangen has enabled the use of the presently x86-only page_table_check debugging feature on arm64 and riscv. - David Vernet has done some fixup work on the memcg selftests. - Peter Xu has taught userfaultfd to handle write protection faults against shmem- and hugetlbfs-backed files. - More DAMON development from SeongJae Park - adding online tuning of the feature and support for monitoring of fixed virtual address ranges. Also easier discovery of which monitoring operations are available. - Nadav Amit has done some optimization of TLB flushing during mprotect(). - Neil Brown continues to labor away at improving our swap-over-NFS support. - David Hildenbrand has some fixes to anon page COWing versus get_user_pages(). - Peng Liu fixed some errors in the core hugetlb code. - Joao Martins has reduced the amount of memory consumed by device-dax's compound devmaps. - Some cleanups of the arch-specific pagemap code from Anshuman Khandual. - Muchun Song has found and fixed some errors in the TLB flushing of transparent hugepages. - Roman Gushchin has done more work on the memcg selftests. ... and, of course, many smaller fixes and cleanups. Notably, the customary million cleanup serieses from Miaohe Lin" * tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits) mm: kfence: use PAGE_ALIGNED helper selftests: vm: add the "settings" file with timeout variable selftests: vm: add "test_hmm.sh" to TEST_FILES selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests selftests: vm: add migration to the .gitignore selftests/vm/pkeys: fix typo in comment ksm: fix typo in comment selftests: vm: add process_mrelease tests Revert "mm/vmscan: never demote for memcg reclaim" mm/kfence: print disabling or re-enabling message include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace" include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion" mm: fix a potential infinite loop in start_isolate_page_range() MAINTAINERS: add Muchun as co-maintainer for HugeTLB zram: fix Kconfig dependency warning mm/shmem: fix shmem folio swapoff hang cgroup: fix an error handling path in alloc_pagecache_max_30M() mm: damon: use HPAGE_PMD_SIZE tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate nodemask.h: fix compilation error with GCC12 ...
2022-05-25Merge tag 'net-next-5.19' of ↵Linus Torvalds2-1/+18
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection" * tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits) ptp: ocp: Add firmware header checks ptp: ocp: fix PPS source selector debugfs reporting ptp: ocp: add .init function for sma_op vector ptp: ocp: vectorize the sma accessor functions ptp: ocp: constify selectors ptp: ocp: parameterize input/output sma selectors ptp: ocp: revise firmware display ptp: ocp: add Celestica timecard PCI ids ptp: ocp: Remove #ifdefs around PCI IDs ptp: ocp: 32-bit fixups for pci start address Revert "net/smc: fix listen processing for SMC-Rv2" ath6kl: Use cc-disable-warning to disable -Wdangling-pointer selftests/bpf: Dynptr tests bpf: Add dynptr data slices bpf: Add bpf_dynptr_read and bpf_dynptr_write bpf: Dynptr support for ring buffers bpf: Add bpf_dynptr_from_mem for local dynptrs bpf: Add verifier support for dynptrs bpf: Suppress 'passing zero to PTR_ERR' warning bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack ...
2022-05-25Merge tag 'docs-5.19' of git://git.lwn.net/linuxLinus Torvalds4-179/+189
Pull documentation updates from Jonathan Corbet: "It was a moderately busy cycle for documentation; highlights include: - After a long period of inactivity, the Japanese translations are seeing some much-needed maintenance and updating. - Reworked IOMMU documentation - Some new documentation for static-analysis tools - A new overall structure for the memory-management documentation. This is an LSFMM outcome that, it is hoped, will help encourage developers to fill in the many gaps. Optimism is eternal...but hopefully it will work. - More Chinese translations. Plus the usual typo fixes, updates, etc" * tag 'docs-5.19' of git://git.lwn.net/linux: (70 commits) docs: pdfdocs: Add space for chapter counts >= 100 in TOC docs/zh_CN: Add dev-tools/gdb-kernel-debugging.rst Chinese translation input: Docs: correct ntrig.rst typo input: Docs: correct atarikbd.rst typos MAINTAINERS: Become the docs/zh_CN maintainer docs/zh_CN: fix devicetree usage-model translation mm,doc: Add new documentation structure Documentation: drop more IDE boot options and ide-cd.rst Documentation/process: use scripts/get_maintainer.pl on patches MAINTAINERS: Add entry for DOCUMENTATION/JAPANESE docs/trans/ja_JP/howto: Don't mention specific kernel versions docs/ja_JP/SubmittingPatches: Request summaries for commit references docs/ja_JP/SubmittingPatches: Add Suggested-by as a standard signature docs/ja_JP/SubmittingPatches: Randy has moved docs/ja_JP/SubmittingPatches: Suggest the use of scripts/get_maintainer.pl docs/ja_JP/SubmittingPatches: Update GregKH links Documentation/sysctl: document max_rcu_stall_to_panic Documentation: add missing angle bracket in cgroup-v2 doc Documentation: dev-tools: use literal block instead of code-block docs/zh_CN: add vm numa translation ...
2022-05-25Documentation: admin-guide: PM: Add Out of Band modeSrinivas Pandruvada1-0/+22
Update documentation for using the tool to support performance level change via OOB (Out of Band) interface. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-24Merge tag 'media/v5.19-1' of ↵Linus Torvalds1-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - dvb-usb drivers entries got reworked to avoid usage of magic numbers to refer to data position inside tables - vcodec driver has gained support for MT8186 and for vp8 and vp9 stateless codecs - hantro has gained support for Hantro G1 on RK366x - Added more h264 levels on coda960 - ccs gained support for MIPI CSI-2 28 bits per pixel raw data type - venus driver gained support for Qualcomm custom compressed pixel formats - lots of driver fixes and updates * tag 'media/v5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (308 commits) media: hantro: Enable HOLD_CAPTURE_BUF for H.264 media: hantro: Add H.264 field decoding support media: hantro: h264: Make dpb entry management more robust media: hantro: Stop using H.264 parameter pic_num media: rkvdec: Enable capture buffer holding for H264 media: rkvdec-h264: Add field decoding support media: rkvdec: Ensure decoded resolution fit coded resolution media: rkvdec: h264: Fix reference frame_num wrap for second field media: rkvdec: h264: Validate and use pic width and height in mbs media: rkvdec: Move H264 SPS validation in rkvdec-h264 media: rkvdec: h264: Fix bit depth wrap in pps packet media: rkvdec: h264: Fix dpb_valid implementation media: rkvdec: Stop overclocking the decoder media: v4l2: Reorder field reflist media: h264: Sort p/b reflist using frame_num media: v4l2: Trace calculated p/b0/b1 initial reflist media: h264: Store all fields into the unordered list media: h264: Store current picture fields media: h264: Increase reference lists size to 32 media: h264: Use v4l2_h264_reference for reflist ...
2022-05-24Merge tag 'integrity-v5.19' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull IMA updates from Mimi Zohar: "New is IMA support for including fs-verity file digests and signatures in the IMA measurement list as well as verifying the fs-verity file digest based signatures, both based on policy. In addition, are two bug fixes: - avoid reading UEFI variables, which cause a page fault, on Apple Macs with T2 chips. - remove the original "ima" template Kconfig option to address a boot command line ordering issue. The rest is a mixture of code/documentation cleanup" * tag 'integrity-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: integrity: Fix sparse warnings in keyring_handler evm: Clean up some variables evm: Return INTEGRITY_PASS for enum integrity_status value '0' efi: Do not import certificates from UEFI Secure Boot for T2 Macs fsverity: update the documentation ima: support fs-verity file digest based version 3 signatures ima: permit fsverity's file digests in the IMA measurement list ima: define a new template field named 'd-ngv2' and templates fs-verity: define a function to return the integrity protected file digest ima: use IMA default hash algorithm for integrity violations ima: fix 'd-ng' comments and documentation ima: remove the IMA_TEMPLATE Kconfig option ima: remove redundant initialization of pointer 'file'.
2022-05-24Merge tag 'tpmdd-next-v5.19-rc1' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm updates from Jarkko Sakkinen: - Tightened validation of key hashes for SYSTEM_BLACKLIST_HASH_LIST. An invalid hash format causes a compilation error. Previously, they got included to the kernel binary but were silently ignored at run-time. - Allow root user to append new hashes to the blacklist keyring. - Trusted keys backed with Cryptographic Acceleration and Assurance Module (CAAM), which part of some of the new NXP's SoC's. Now there is total three hardware backends for trusted keys: TPM, ARM TEE and CAAM. - A scattered set of fixes and small improvements for the TPM driver. * tag 'tpmdd-next-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: MAINTAINERS: add KEYS-TRUSTED-CAAM doc: trusted-encrypted: describe new CAAM trust source KEYS: trusted: Introduce support for NXP CAAM-based trusted keys crypto: caam - add in-kernel interface for blob generator crypto: caam - determine whether CAAM supports blob encap/decap KEYS: trusted: allow use of kernel RNG for key material KEYS: trusted: allow use of TEE as backend without TCG_TPM support tpm: Add field upgrade mode support for Infineon TPM2 modules tpm: Fix buffer access in tpm2_get_tpm_pt() char: tpm: cr50_i2c: Suppress duplicated error message in .remove() tpm: cr50: Add new device/vendor ID 0x504a6666 tpm: Remove read16/read32/write32 calls from tpm_tis_phy_ops tpm: ibmvtpm: Correct the return value in tpm_ibmvtpm_probe() tpm/tpm_ftpm_tee: Return true/false (not 1/0) from bool functions certs: Explain the rationale to call panic() certs: Allow root user to append signed hashes to the blacklist keyring certs: Check that builtin blacklist hashes are valid certs: Make blacklist_vet_description() more strict certs: Factor out the blacklist hash creation tools/certs: Add print-cert-tbs-hash.sh
2022-05-24Merge tag 'random-5.19-rc1-for-linus' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "These updates continue to refine the work began in 5.17 and 5.18 of modernizing the RNG's crypto and streamlining and documenting its code. New for 5.19, the updates aim to improve entropy collection methods and make some initial decisions regarding the "premature next" problem and our threat model. The cloc utility now reports that random.c is 931 lines of code and 466 lines of comments, not that basic metrics like that mean all that much, but at the very least it tells you that this is very much a manageable driver now. Here's a summary of the various updates: - The random_get_entropy() function now always returns something at least minimally useful. This is the primary entropy source in most collectors, which in the best case expands to something like RDTSC, but prior to this change, in the worst case it would just return 0, contributing nothing. For 5.19, additional architectures are wired up, and architectures that are entirely missing a cycle counter now have a generic fallback path, which uses the highest resolution clock available from the timekeeping subsystem. Some of those clocks can actually be quite good, despite the CPU not having a cycle counter of its own, and going off-core for a stamp is generally thought to increase jitter, something positive from the perspective of entropy gathering. Done very early on in the development cycle, this has been sitting in next getting some testing for a while now and has relevant acks from the archs, so it should be pretty well tested and fine, but is nonetheless the thing I'll be keeping my eye on most closely. - Of particular note with the random_get_entropy() improvements is MIPS, which, on CPUs that lack the c0 count register, will now combine the high-speed but short-cycle c0 random register with the lower-speed but long-cycle generic fallback path. - With random_get_entropy() now always returning something useful, the interrupt handler now collects entropy in a consistent construction. - Rather than comparing two samples of random_get_entropy() for the jitter dance, the algorithm now tests many samples, and uses the amount of differing ones to determine whether or not jitter entropy is usable and how laborious it must be. The problem with comparing only two samples was that if the cycle counter was extremely slow, but just so happened to be on the cusp of a change, the slowness wouldn't be detected. Taking many samples fixes that to some degree. This, combined with the other improvements to random_get_entropy(), should make future unification of /dev/random and /dev/urandom maybe more possible. At the very least, were we to attempt it again today (we're not), it wouldn't break any of Guenter's test rigs that broke when we tried it with 5.18. So, not today, but perhaps down the road, that's something we can revisit. - We attempt to reseed the RNG immediately upon waking up from system suspend or hibernation, making use of the various timestamps about suspend time and such available, as well as the usual inputs such as RDRAND when available. - Batched randomness now falls back to ordinary randomness before the RNG is initialized. This provides more consistent guarantees to the types of random numbers being returned by the various accessors. - The "pre-init injection" code is now gone for good. I suspect you in particular will be happy to read that, as I recall you expressing your distaste for it a few months ago. Instead, to avoid a "premature first" issue, while still allowing for maximal amount of entropy availability during system boot, the first 128 bits of estimated entropy are used immediately as it arrives, with the next 128 bits being buffered. And, as before, after the RNG has been fully initialized, it winds up reseeding anyway a few seconds later in most cases. This resulted in a pretty big simplification of the initialization code and let us remove various ad-hoc mechanisms like the ugly crng_pre_init_inject(). - The RNG no longer pretends to handle the "premature next" security model, something that various academics and other RNG designs have tried to care about in the past. After an interesting mailing list thread, these issues are thought to be a) mainly academic and not practical at all, and b) actively harming the real security of the RNG by delaying new entropy additions after a potential compromise, making a potentially bad situation even worse. As well, in the first place, our RNG never even properly handled the premature next issue, so removing an incomplete solution to a fake problem was particularly nice. This allowed for numerous other simplifications in the code, which is a lot cleaner as a consequence. If you didn't see it before, https://lore.kernel.org/lkml/YmlMGx6+uigkGiZ0@zx2c4.com/ may be a thread worth skimming through. - While the interrupt handler received a separate code path years ago that avoids locks by using per-cpu data structures and a faster mixing algorithm, in order to reduce interrupt latency, input and disk events that are triggered in hardirq handlers were still hitting locks and more expensive algorithms. Those are now redirected to use the faster per-cpu data structures. - Rather than having the fake-crypto almost-siphash-based random32 implementation be used right and left, and in many places where cryptographically secure randomness is desirable, the batched entropy code is now fast enough to replace that. - As usual, numerous code quality and documentation cleanups. For example, the initialization state machine now uses enum symbolic constants instead of just hard coding numbers everywhere. - Since the RNG initializes once, and then is always initialized thereafter, a pretty heavy amount of code used during that initialization is never used again. It is now completely cordoned off using static branches and it winds up in the .text.unlikely section so that it doesn't reduce cache compactness after the RNG is ready. - A variety of functions meant for waiting on the RNG to be initialized were only used by vsprintf, and in not a particularly optimal way. Replacing that usage with a more ordinary setup made it possible to remove those functions. - A cleanup of how we warn userspace about the use of uninitialized /dev/urandom and uninitialized get_random_bytes() usage. Interestingly, with the change you merged for 5.18 that attempts to use jitter (but does not block if it can't), the majority of users should never see those warnings for /dev/urandom at all now, and the one for in-kernel usage is mainly a debug thing. - The file_operations struct for /dev/[u]random now implements .read_iter and .write_iter instead of .read and .write, allowing it to also implement .splice_read and .splice_write, which makes splice(2) work again after it was broken here (and in many other places in the tree) during the set_fs() removal. This was a bit of a last minute arrival from Jens that hasn't had as much time to bake, so I'll be keeping my eye on this as well, but it seems fairly ordinary. Unfortunately, read_iter() is around 3% slower than read() in my tests, which I'm not thrilled about. But Jens and Al, spurred by this observation, seem to be making progress in removing the bottlenecks on the iter paths in the VFS layer in general, which should remove the performance gap for all drivers. - Assorted other bug fixes, cleanups, and optimizations. - A small SipHash cleanup" * tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (49 commits) random: check for signals after page of pool writes random: wire up fops->splice_{read,write}_iter() random: convert to using fops->write_iter() random: convert to using fops->read_iter() random: unify batched entropy implementations random: move randomize_page() into mm where it belongs random: remove mostly unused async readiness notifier random: remove get