Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 7eced90b202d63cdc1b9b11b1353adb1389830f9 ]
The reasons for PTEs in the kernel direct map to be marked invalid are not
limited to kfence / debug pagealloc machinery. In particular,
memfd_secret() also steals pages with set_direct_map_invalid_noflush().
When building the transitional page tables for kexec from the current
kernel's page tables, those pages need to become regular writable pages,
otherwise, if the relocation places kexec segments over such pages, a fault
will occur during kexec, leading to host going dark during kexec.
This patch addresses the kexec issue by marking any PTE as valid if it is
not none. While this fixes the kexec crash, it does not address the
security concern that if processes owning secret memory are not terminated
before kexec, the secret content will be mapped in the new kernel without
being scrubbed.
Suggested-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Fares Mehanna <faresx@amazon.de>
Link: https://lore.kernel.org/r/20240902163309.97113-1-faresx@amazon.de
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2b9ac0b84c2cae91bbaceab62df4de6d503421ec ]
Currently, struct snp_guest_msg includes a message header (96 bytes) and
a payload (4000 bytes). There is an implicit assumption here that the
SNP message header will always be 96 bytes, and with that assumption the
payload array size has been set to 4000 bytes - a magic number. If any
new member is added to the SNP message header, the SNP guest message
will span more than a page.
Instead of using a magic number for the payload, declare struct
snp_guest_msg in a way that payload plus the message header do not
exceed a page.
[ bp: Massage. ]
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240731150811.156771-5-nikunj@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3c44d31cb34ce4eb8311a2e73634d57702948230 ]
Algorithm registration is usually carried out during module init,
where as little work as possible should be carried out. The SIMD
code violated this rule by allocating a tfm, this then triggers a
full test of the algorithm which may dead-lock in certain cases.
SIMD is only allocating the tfm to get at the alg object, which is
in fact already available as it is what we are registering. Use
that directly and remove the crypto_alloc_tfm call.
Also remove some obsolete and unused SIMD API.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3363c460ef726ba693704dbcd73b7e7214ccc788 ]
The macros FOUR_ROUNDS_AND_SCHED and DO_4ROUNDS rely on an
unexpected/undocumented behavior of the GNU assembler, which might
change in the future
(https://sourceware.org/bugzilla/show_bug.cgi?id=32073).
M (1) (2) // 1 arg !? Future: 2 args
M 1 + 2 // 1 arg !? Future: 3 args
M 1 2 // 2 args
Add parentheses around the single arguments to support future GNU
assembler and LLVM integrated assembler (when the IsOperator hack from
the following link is dropped).
Link: https://github.com/llvm/llvm-project/commit/055006475e22014b28a070db1bff41ca15f322f0
Signed-off-by: Fangrui Song <maskray@google.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit a84dd0d8ae24bdc6da341187fc4c1a0adfce2ccc upstream.
ftrace_return_address() is called extremely often from
performance-critical code paths when debugging features like
CONFIG_TRACE_IRQFLAGS are enabled. For example, with debug_defconfig,
ftrace selftests on my LPAR currently execute ftrace_return_address()
as follows:
ftrace_return_address(0) - 0 times (common code uses __builtin_return_address(0) instead)
ftrace_return_address(1) - 2,986,805,401 times (with this patch applied)
ftrace_return_address(2) - 140 times
ftrace_return_address(>2) - 0 times
The use of __builtin_return_address(n) was replaced by return_address()
with an unwinder call by commit cae74ba8c295 ("s390/ftrace:
Use unwinder instead of __builtin_return_address()") because
__builtin_return_address(n) simply walks the stack backchain and doesn't
check for reaching the stack top. For shallow stacks with fewer than
"n" frames, this results in reads at low addresses and random
memory accesses.
While calling the fully functional unwinder "works", it is very slow
for this purpose. Moreover, potentially following stack switches and
walking past IRQ context is simply wrong thing to do for
ftrace_return_address().
Reimplement return_address() to essentially be __builtin_return_address(n)
with checks for reaching the stack top. Since the ftrace_return_address(n)
argument is always a constant, keep the implementation in the header,
allowing both GCC and Clang to unroll the loop and optimize it to the
bare minimum.
Fixes: cae74ba8c295 ("s390/ftrace: Use unwinder instead of __builtin_return_address()")
Cc: stable@vger.kernel.org
Reported-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3dedd4889cfc2851444a1f7626b293c0bfd1e42c upstream.
The property is "fsl,pins", not "fsl,pin". Wrong property means the pin
configuration was not applied. Fixes dtbs_check warnings:
imx6ull-seeed-npi-dev-board-emmc.dtb: pinctrl@20e0000: uart1grp: 'fsl,pins' is a required property
imx6ull-seeed-npi-dev-board-emmc.dtb: pinctrl@20e0000: uart1grp: 'fsl,pin' does not match any of the regexes: 'pinctrl-[0-9]+'
Cc: stable@vger.kernel.org
Fixes: e3b5697195c8 ("ARM: dts: imx6ull: add seeed studio NPi dev board")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Parthiban Nallathambi <parthiban@linumiz.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1b0e32753d8550908dff8982410357b5114be78c upstream.
The property is "fsl,pins", not "fsl,pin". Wrong property means the pin
configuration was not applied. Fixes dtbs_check warnings:
imx6ul-geam.dtb: pinctrl@20e0000: tscgrp: 'fsl,pins' is a required property
imx6ul-geam.dtb: pinctrl@20e0000: tscgrp: 'fsl,pin' does not match any of the regexes: 'pinctrl-[0-9]+'
Cc: stable@vger.kernel.org
Fixes: a58e4e608bc8 ("ARM: dts: imx6ul-geam: Add Engicam IMX6UL GEA M6UL initial support")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Michael Trimarchi <michael@amarulasolutions.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit def33fb1191207f5afa6dcb681d71fef2a6c1293 upstream.
All batches of the Pine64 Pinebook Pro, except the latest batch (as of 2024)
whose hardware design was revised due to the component shortage, use a 1S
lithium battery whose nominal/design capacity is 10,000 mAh, according to the
battery datasheet. [1][2] Let's correct the design full-charge value in the
Pinebook Pro board dts, to improve the accuracy of the hardware description,
and to hopefully improve the accuracy of the fuel gauge a bit on all units
that don't belong to the latest batch.
The above-mentioned latest batch uses a different 1S lithium battery with
a slightly lower capacity, more precisely 9,600 mAh. To make the fuel gauge
work reliably on the latest batch, a sample battery would need to be sent to
CellWise, to obtain its proprietary battery profile, whose data goes into
"cellwise,battery-profile" in the Pinebook Pro board dts. Without that data,
the fuel gauge reportedly works unreliably, so changing the design capacity
won't have any negative effects on the already unreliable operation of the
fuel gauge in the Pinebook Pros that belong to the latest batch.
According to the battery datasheet, its voltage can go as low as 2.75 V while
discharging, but it's better to leave the current 3.0 V value in the dts file,
because of the associated Pinebook Pro's voltage regulation issues.
[1] https://wiki.pine64.org/index.php/Pinebook_Pro#Battery
[2] https://files.pine64.org/doc/datasheet/pinebook/40110175P%203.8V%2010000mAh%E8%A7%84%E6%A0%BC%E4%B9%A6-14.pdf
Fixes: c7c4d698cd28 ("arm64: dts: rockchip: add fuel gauge to Pinebook Pro dts")
Cc: stable@vger.kernel.org
Cc: Marek Kraus <gamiee@pine64.org>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Link: https://lore.kernel.org/r/731f8ef9b1a867bcc730d19ed277c8c0534c0842.1721065172.git.dsimic@manjaro.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 421688265d7f5d3ff4211982e7231765378bb64f upstream.
The SMMUs on sa8775p are cache-coherent. GPU SMMU is marked as such,
mark the APPS and PCIe ones as well.
Fixes: 603f96d4c9d0 ("arm64: dts: qcom: add initial support for qcom sa8775p-ride")
Fixes: 2dba7a613a6e ("arm64: dts: qcom: sa8775p: add the pcie smmu node")
Cc: stable@vger.kernel.org
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Qingqing Zhou <quic_qqzhou@quicinc.com>
Rule: add
Link: https://lore.kernel.org/stable/20240723075948.9545-1-quic_qqzhou%40quicinc.com
Link: https://lore.kernel.org/r/20240725072117.22425-1-quic_qqzhou@quicinc.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8c51521de18755d4112a77a598a348b38d0af370 upstream.
Increase the frequency of the PWM signal that drives the LED backlight of
the Pinebook Pro's panel, from about 1.35 KHz (which equals to the PWM
period of 740,740 ns), to exactly 8 kHz (which equals to the PWM period of
125,000 ns). Using a higher PWM frequency for the panel backlight, which
reduces the flicker, can only be beneficial to the end users' eyes.
On top of that, increasing the backlight PWM signal frequency reportedly
eliminates the buzzing emitted from the Pinebook Pro's built-in speakers
when certain backlight levels are set, which cause some weird interference
with some of the components of the Pinebook Pro's audio chain.
The old value for the backlight PWM period, i.e. 740,740 ns, is pretty much
an arbitrary value that was selected during the very early bring-up of the
Pinebook Pro, only because that value seemed to minimize horizontal line
distortion on the display, which resulted from the old X.org drivers causing
screen tearing when dragging windows around. That's no longer an issue, so
there are no reasons to stick with the old PWM period value.
The lower and the upper backlight PWM frequency limits for the Pinebook Pro's
panel, according to its datasheet, are 200 Hz and 10 kHz, respectively. [1]
These changes still leave some headroom, which may have some positive effects
on the lifetime expectancy of the panel's backlight LEDs.
[1] https://files.pine64.org/doc/datasheet/PinebookPro/NV140FHM-N49_Rev.P0_20160804_201710235838.pdf
Fixes: 5a65505a6988 ("arm64: dts: rockchip: Add initial support for Pinebook Pro")
Cc: stable@vger.kernel.org
Reported-by: Nikola Radojevic <nikola@radojevic.rs>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Tested-by: Nikola Radojević <nikola@radojevic.rs>
Link: https://lore.kernel.org/r/2a23b6cfd8c0513e5b233b4006ee3d3ed09b824f.1722805655.git.dsimic@manjaro.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3079fb09ddac159bd8bb87f6f15b924e265f8d4d upstream.
The DPI display interface feeds the external display pipeline. However
the pipeline representation is currently incomplete. Efforts are still
under way to come up with a way to represent the "creative" repurposing
of the DP bridge chip's internal output mux, which is meant to support
USB type-C orientation changes, to output to one of two type-C ports.
Until that is finalized, the external display can't be fully described,
and thus won't work. Even worse, the half complete graph potentially
confuses the OS, breaking the internal display as well.
Disable the external display interface across the whole Corsola family
until the DP / USB Type-C muxing graph binding is ready.
Reported-by: Alper Nebi Yasak <alpernebiyasak@gmail.com>
Closes: https://lore.kernel.org/linux-mediatek/38a703a9-6efb-456a-a248-1dd3687e526d@gmail.com/
Fixes: 8855d01fb81f ("arm64: dts: mediatek: Add MT8186 Krabby platform based Tentacruel / Tentacool")
Cc: <stable@vger.kernel.org>
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Tested-by: Alper Nebi Yasak <alpernebiyasak@gmail.com>
Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://lore.kernel.org/r/20240821042836.2631815-1-wenst@chromium.org
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit db0d8a84348b876df7c4276f0cbce5df3b769f5f upstream.
The ampere1a cpu is affected by erratum AC04_CPU_10 which is the same
bug as AC03_CPU_38. Add ampere1a to the AC03_CPU_38 workaround midr list.
Cc: <stable@vger.kernel.org>
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240827211701.2216719-1-scott@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b6db3eb6c373b97d9e433530d748590421bbeea7 upstream.
Add explicit casting to prevent expantion of 32th bit of
u32 into highest half of u64 in several places.
For example, in inject_abt64:
ESR_ELx_EC_DABT_LOW << ESR_ELx_EC_SHIFT = 0x24 << 26.
This operation's result is int with 1 in 32th bit.
While casting this value into u64 (esr is u64) 1
fills 32 highest bits.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Cc: <stable@vger.kernel.org>
Fixes: aa8eff9bfbd5 ("arm64: KVM: fault injection into a guest")
Signed-off-by: Anastasia Belova <abelova@astralinux.ru>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/stable/20240910085016.32120-1-abelova%40astralinux.ru
Link: https://lore.kernel.org/r/20240910085016.32120-1-abelova@astralinux.ru
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d92792a4b26e50b96ab734cbe203d8a4c932a7a9 upstream.
pt_event_snapshot_aux() uses pt->handle_nmi to determine if tracing
needs to be stopped, however tracing can still be going because
pt->handle_nmi is set to zero before tracing is stopped in pt_event_stop,
whereas pt_event_snapshot_aux() requires that tracing must be stopped in
order to copy a sample of trace from the buffer.
Instead call pt_config_stop() always, which anyway checks config for
RTIT_CTL_TRACEEN and does nothing if it is already clear.
Note pt_event_snapshot_aux() can continue to use pt->handle_nmi to
determine if the trace needs to be restarted afterwards.
Fixes: 25e8920b301c ("perf/x86/intel/pt: Add sampling support")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20240715160712.127117-2-adrian.hunter@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit ef493f4b122d6b14a6de111d1acac1eab1d673b0 upstream.
The BPF subsystem may capture LBR data on a counting event. However, the
current implementation assumes that LBR can/should only be used with
sampling events.
For instance, retsnoop tool ([0]) makes an extensive use of this
functionality and sets up perf event as follows:
struct perf_event_attr attr;
memset(&attr, 0, sizeof(attr));
attr.size = sizeof(attr);
attr.type = PERF_TYPE_HARDWARE;
attr.config = PERF_COUNT_HW_CPU_CYCLES;
attr.sample_type = PERF_SAMPLE_BRANCH_STACK;
attr.branch_sample_type = PERF_SAMPLE_BRANCH_KERNEL;
To limit the LBR for a sampling event is to avoid unnecessary branch
stack setup for a counting event in the sample read. Because LBR is only
read in the sampling event's overflow.
Although in most cases LBR is used in sampling, there is no HW limit to
bind LBR to the sampling mode. Allow an LBR setup for a counting event
unless in the sample read mode.
Fixes: 85846b27072d ("perf/x86: Add PERF_X86_EVENT_NEEDS_BRANCH_STACK flag")
Closes: https://lore.kernel.org/lkml/20240905180055.1221620-1-andrii@kernel.org/
Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240909155848.326640-1-kan.liang@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 477d81a1c47a1b79b9c08fc92b5dea3c5143800b upstream.
common_interrupt() and related variants call kvm_set_cpu_l1tf_flush_l1d(),
which is neither marked noinstr nor __always_inline.
So compiler puts it out of line and adds instrumentation to it. Since the
call is inside of instrumentation_begin/end(), objtool does not warn about
it.
The manifestation is that KCOV produces spurious coverage in
kvm_set_cpu_l1tf_flush_l1d() in random places because the call happens when
preempt count is not yet updated to say that the kernel is in an interrupt.
Mark kvm_set_cpu_l1tf_flush_l1d() as __always_inline and move it out of the
instrumentation_begin/end() section. It only calls __this_cpu_write()
which is already safe to call in noinstr contexts.
Fixes: 6368558c3710 ("x86/entry: Provide IDTENTRY_SYSVEC")
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/3f9a1de9e415fcb53d07dc9e19fa8481bb021b1b.1718092070.git.dvyukov@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit be985531a5dd9ca50fc9f3f85b8adeb2a4a75a58 upstream.
USB 3.0 on xhci1 is not used, as the controller shares the same PHY as
pcie1. The latter is enabled to support the M.2 PCIe WLAN card on this
design.
Mark USB 3.0 as disabled on this controller using the
"mediatek,u3p-dis-msk" property.
Fixes: 96564b1e2ea4 ("arm64: dts: mediatek: Introduce the MT8395 Radxa NIO 12L board")
Cc: stable@vger.kernel.org
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Link: https://lore.kernel.org/r/20240731034411.371178-3-wenst@chromium.org
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 09d385679487c58f0859c1ad4f404ba3df2f8830 upstream.
USB 3.0 on xhci1 is not used, as the controller shares the same PHY as
pcie1. The latter is enabled to support the M.2 PCIe WLAN card on this
design.
Mark USB 3.0 as disabled on this controller using the
"mediatek,u3p-dis-msk" property.
Reported-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> #KernelCI
Closes: https://lore.kernel.org/all/9fce9838-ef87-4d1b-b3df-63e1ddb0ec51@notapiano/
Fixes: b6267a396e1c ("arm64: dts: mediatek: cherry: Enable T-PHYs and USB XHCI controllers")
Cc: stable@vger.kernel.org
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Link: https://lore.kernel.org/r/20240731034411.371178-2-wenst@chromium.org
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 39190ac7cff1fd15135fa8e658030d9646fdb5f2 upstream.
The 'ld' and 'std' instructions require a 4-byte aligned displacement
because they are DS-form instructions. But the "m" asm constraint
doesn't enforce that.
That can lead to build errors if the compiler chooses a non-aligned
displacement, as seen with GCC 14:
/tmp/ccuSzwiR.s: Assembler messages:
/tmp/ccuSzwiR.s:2579: Error: operand out of domain (39 is not a multiple of 4)
make[5]: *** [scripts/Makefile.build:229: net/core/page_pool.o] Error 1
Dumping the generated assembler shows:
ld 8,39(8) # MEM[(const struct atomic64_t *)_29].counter, t
Use the YZ constraints to tell the compiler either to generate a DS-form
displacement, or use an X-form instruction, either of which prevents the
build error.
See commit 2d43cc701b96 ("powerpc/uaccess: Fix build errors seen with
GCC 13/14") for more details on the constraint letters.
Fixes: 9f0cbea0d8cc ("[POWERPC] Implement atomic{, 64}_{read, write}() without volatile")
Cc: stable@vger.kernel.org # v2.6.24+
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/all/20240913125302.0a06b4c7@canb.auug.org.au
Tested-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240916120510.2017749-1-mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d4fc4d01471528da8a9797a065982e05090e1d81 upstream.
TDX only supports kernel-initiated MMIO operations. The handle_mmio()
function checks if the #VE exception occurred in the kernel and rejects
the operation if it did not.
However, userspace can deceive the kernel into performing MMIO on its
behalf. For example, if userspace can point a syscall to an MMIO address,
syscall does get_user() or put_user() on it, triggering MMIO #VE. The
kernel will treat the #VE as in-kernel MMIO.
Ensure that the target MMIO address is within the kernel before decoding
instruction.
Fixes: 31d58c4e557d ("x86/tdx: Handle in-kernel MMIO")
Signed-off-by: Alexey Gladkov (Intel) <legion@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/565a804b80387970460a4ebc67c88d1380f61ad1.1726237595.git.legion%40kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 73b42dc69be8564d4951a14d00f827929fe5ef79 upstream.
Re-introduce the "split" x2APIC ICR storage that KVM used prior to Intel's
IPI virtualization support, but only for AMD. While not stated anywhere
in the APM, despite stating the ICR is a single 64-bit register, AMD CPUs
store the 64-bit ICR as two separate 32-bit values in ICR and ICR2. When
IPI virtualization (IPIv on Intel, all AVIC flavors on AMD) is enabled,
KVM needs to match CPU behavior as some ICR ICR writes will be handled by
the CPU, not by KVM.
Add a kvm_x86_ops knob to control the underlying format used by the CPU to
store the x2APIC ICR, and tune it to AMD vs. Intel regardless of whether
or not x2AVIC is enabled. If KVM is handling all ICR writes, the storage
format for x2APIC mode doesn't matter, and having the behavior follow AMD
versus Intel will provide better test coverage and ease debugging.
Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20240719235107.3023592-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d33234342f8b468e719e05649fd26549fb37ef8a upstream.
Hoist kvm_x2apic_icr_write() above kvm_apic_write_nodecode() so that a
local helper to _read_ the x2APIC ICR can be added and used in the
nodecode path without needing a forward declaration.
No functional change intended.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240719235107.3023592-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 71bf395a276f0578d19e0ae137a7d1d816d08e0e upstream.
Inject a #GP on a WRMSR(ICR) that attempts to set any reserved bits that
are must-be-zero on both Intel and AMD, i.e. any reserved bits other than
the BUSY bit, which Intel ignores and basically says is undefined.
KVM's xapic_state_test selftest has been fudging the bug since commit
4b88b1a518b3 ("KVM: selftests: Enhance handling WRMSR ICR register in
x2APIC mode"), which essentially removed the testcase instead of fixing
the bug.
WARN if the nodecode path triggers a #GP, as the CPU is supposed to check
reserved bits for ICR when it's partially virtualized.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240719235107.3023592-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f26a525b77e040d584e967369af1e018d2d59112 upstream.
When we share memory through FF-A and the description of the buffers
exceeds the size of the mapped buffer, the fragmentation API is used.
The fragmentation API allows specifying chunks of descriptors in subsequent
FF-A fragment calls and no upper limit has been established for this.
The entire memory region transferred is identified by a handle which can be
used to reclaim the transferred memory.
To be able to reclaim the memory, the description of the buffers has to fit
in the ffa_desc_buf.
Add a bounds check on the FF-A sharing path to prevent the memory reclaim
from failing.
Also do_ffa_mem_xfer() does not need __always_inline, except for the
BUILD_BUG_ON() aspect, which gets moved to a macro.
[maz: fixed the BUILD_BUG_ON() breakage with LLVM, thanks to Wei-Lin Chang
for the timely report]
Fixes: 634d90cf0ac65 ("KVM: arm64: Handle FFA_MEM_LEND calls from the host")
Cc: stable@vger.kernel.org
Reviewed-by: Sebastian Ene <sebastianene@google.com>
Signed-off-by: Snehal Koukuntla <snehalreddy@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240909180154.3267939-1-snehalreddy@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9221222c717dbddac1e3c49906525475d87a3a44 upstream.
When running as a Xen PV dom0 the system needs to map ACPI data of the
host using host physical addresses, while those addresses can conflict
with the guest physical addresses of the loaded linux kernel. The same
problem might apply in case a PV guest is configured to use the host
memory map.
This conflict can be solved by mapping the ACPI data to a different
guest physical address, but mapping the data via acpi_os_ioremap()
must still be possible using the host physical address, as this
address might be generated by AML when referencing some of the ACPI
data.
When configured to support running as a Xen PV domain, have an
implementation of acpi_os_ioremap() being aware of the possibility to
need above mentioned translation of a host physical address to the
guest physical address.
This modification requires to #include linux/acpi.h in some sources
which need to include asm/acpi.h directly.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c4498ae316da5b5786ccd448fc555f3339b8e4ca upstream.
Move the checks for e820 memory map conflicts using the
xen_chk_is_e820_usable() helper further up in order to prepare
resolving some of the possible conflicts by doing some e820 map
modifications, which must happen before evaluating the RAM layout.
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 64dcf0b8779363ca07dfb5649a4cc71f9fdf390b ]
There's no such symbol and we currently don't have any of the
mechanisms to make boot-time selection cheap enough, so we can't
have HAVE_PREEMPT_DYNAMIC_CALL or HAVE_PREEMPT_DYNAMIC_KEY.
Remove the select statement.
Reported-by: Lukas Bulwahn <lbulwahn@redhat.com>
Fixes: cd01672d64a3 ("um: Enable preemption in UML")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c7f06284a6427475e3df742215535ec3f6cd9662 ]
The psc->div[] array has psc->num_div elements. These values come from
when we call clk_hw_register_div(). It's adc_divisors and
ARRAY_SIZE(adc_divisors)) and so on. So this condition needs to be >=
instead of > to prevent an out of bounds read.
Fixes: 9645ccc7bd7a ("ep93xx: clock: convert in-place to COMMON_CLK")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Acked-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Reviewed-by: Nikita Shubin <nikita.shubin@maquefel.me>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Link: https://lore.kernel.org/r/1caf01ad4c0a8069535813c26c7f0b8ea011155e.camel@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 44ac4625ea002deecd0c227336c95b724206c698 ]
Data mismatch found when testing ipsec tunnel with AES/GCM crypto.
Disabling CRYPTO_AES_GCM_P10 in Kconfig for this feature.
Fixes: fd0e9b3e2ee6 ("crypto: p10-aes-gcm - An accelerated AES/GCM stitched implementation")
Fixes: cdcecfd9991f ("crypto: p10-aes-gcm - Glue code for AES/GCM stitched implementation")
Fixes: 45a4672b9a6e2 ("crypto: p10-aes-gcm - Update Kconfig and Makefile")
Signed-off-by: Danny Tsen <dtsen@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 22ab08955ea13be04a8efd20cc30890e0afaa49c ]
The standard RISC-V calling convention said:
"The stack grows downward and the stack pointer is always
kept 16-byte aligned".
So perf_callchain_user() should check whether 16-byte aligned for fp.
Link: https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf
Fixes: dbeb90b0c1eb ("riscv: Add perf callchain support")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-2-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dbc3171194403d0d40e4bdeae666f6e76e428b53 ]
If pcie_find_root_port() is unable to locate a Root Port, it will return
NULL. Check the pointer for NULL before dereferencing it.
This particular case is in a quirk for devices that are always below a Root
Port, so this won't avoid a problem and doesn't need to be backported, but
check as a matter of style and to prevent copy/paste mistakes.
Link: https://lore.kernel.org/r/20240812202659.1649121-1-samasth.norway.ananda@oracle.com
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
[bhelgaas: drop Fixes: and explain why there's no problem in this case]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f101b305a7b9513a8042a2cf09018de4ff371af2 ]
Add the missing pieces so the early program check handler also works
with a relocated lowcore. Right now the result of an early program
check in case of a relocated lowcore would be a program check loop.
Fixes: 8f1e70adb1a3 ("s390/boot: Add cmdline option to relocate lowcore")
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f2bb5b97b51ce5425e8f59f899643ce4eadba667 ]
Have all program check handlers in one file to make future changes easy.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Stable-dep-of: f101b305a7b9 ("s390/entry: Make early program check handler relocated lowcore aware")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 66ff4d61dc124eafe9efaeaef696a09b7f236da2 ]
This patch fixes a tailcall issue caused by abusing the tailcall in
bpf2bpf feature on arm64 like the way of "bpf, x64: Fix tailcall
hierarchy".
On arm64, when a tail call happens, it uses tail_call_cnt_ptr to
increment tail_call_cnt, too.
At the prologue of main prog, it has to initialize tail_call_cnt and
prepare tail_call_cnt_ptr.
At the prologue of subprog, it pushes x26 register twice, and does not
initialize tail_call_cnt.
At the epilogue, it pops x26 twice, no matter whether it is main prog or
subprog.
Fixes: d4609a5d8c70 ("bpf, arm64: Keep tail call count across bpf2bpf calls")
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
Link: https://lore.kernel.org/r/20240714123902.32305-3-hffilwlqm@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 116e04ba1459fc08f80cf27b8c9f9f188be0fcb2 ]
This patch fixes a tailcall issue caused by abusing the tailcall in
bpf2bpf feature.
As we know, tail_call_cnt propagates by rax from caller to callee when
to call subprog in tailcall context. But, like the following example,
MAX_TAIL_CALL_CNT won't work because of missing tail_call_cnt
back-propagation from callee to caller.
\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"
struct {
__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
__uint(max_entries, 1);
__uint(key_size, sizeof(__u32));
__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");
int count = 0;
static __noinline
int subprog_tail1(struct __sk_buff *skb)
{
bpf_tail_call_static(skb, &jmp_table, 0);
return 0;
}
static __noinline
int subprog_tail2(struct __sk_buff *skb)
{
bpf_tail_call_static(skb, &jmp_table, 0);
return 0;
}
SEC("tc")
int entry(struct __sk_buff *skb)
{
volatile int ret = 1;
count++;
subprog_tail1(skb);
subprog_tail2(skb);
return ret;
}
char __license[] SEC("license") = "GPL";
At run time, the tail_call_cnt in entry() will be propagated to
subprog_tail1() and subprog_tail2(). But, when the tail_call_cnt in
subprog_tail1() updates when bpf_tail_call_static(), the tail_call_cnt
in entry() won't be updated at the same time. As a result, in entry(),
when tail_call_cnt in entry() is less than MAX_TAIL_CALL_CNT and
subprog_tail1() returns because of MAX_TAIL_CALL_CNT limit,
bpf_tail_call_static() in suprog_tail2() is able to run because the
tail_call_cnt in subprog_tail2() propagated from entry() is less than
MAX_TAIL_CALL_CNT.
So, how many tailcalls are there for this case if no error happens?
From top-down view, does it look like hierarchy layer and layer?
With this view, there will be 2+4+8+...+2^33 = 2^34 - 2 = 17,179,869,182
tailcalls for this case.
How about there are N subprog_tail() in entry()? There will be almost
N^34 tailcalls.
Then, in this patch, it resolves this case on x86_64.
In stead of propagating tail_call_cnt from caller to callee, it
propagates its pointer, tail_call_cnt_ptr, tcc_ptr for short.
However, where does it store tail_call_cnt?
It stores tail_call_cnt on the stack of main prog. When tail call
happens in subprog, it increments tail_call_cnt by tcc_ptr.
Meanwhile, it stores tail_call_cnt_ptr on the stack of main prog, too.
And, before jump to tail callee, it has to pop tail_call_cnt and
tail_call_cnt_ptr.
Then, at the prologue of subprog, it must not make rax as
tail_call_cnt_ptr again. It has to reuse tail_call_cnt_ptr from caller.
As a result, at run time, it has to recognize rax is tail_call_cnt or
tail_call_cnt_ptr at prologue by:
1. rax is tail_call_cnt if rax is <= MAX_TAIL_CALL_CNT.
2. rax is tail_call_cnt_ptr if rax is > MAX_TAIL_CALL_CNT, because a
pointer won't be <= MAX_TAIL_CALL_CNT.
Here's an example to dump JITed.
struct {
__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
__uint(max_entries, 1);
__uint(key_size, sizeof(__u32));
__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");
int count = 0;
static __noinline
int subprog_tail(struct __sk_buff *skb)
{
bpf_tail_call_static(skb, &jmp_table, 0);
return 0;
}
SEC("tc")
int entry(struct __sk_buff *skb)
{
int ret = 1;
count++;
subprog_tail(skb);
subprog_tail(skb);
return ret;
}
When bpftool p d j id 42:
int entry(struct __sk_buff * skb):
bpf_prog_0c0f4c2413ef19b1_entry:
; int entry(struct __sk_buff *skb)
0: endbr64
4: nopl (%rax,%rax)
9: xorq %rax, %rax ;; rax = 0 (tail_call_cnt)
c: pushq %rbp
d: movq %rsp, %rbp
10: endbr64
14: cmpq $33, %rax ;; if rax > 33, rax = tcc_ptr
18: ja 0x20 ;; if rax > 33 goto 0x20 ---+
1a: pushq %rax ;; [rbp - 8] = rax = 0 |
1b: movq %rsp, %rax ;; rax = rbp - 8 |
1e: jmp 0x21 ;; ---------+ |
20: pushq %rax ;; <--------|---------------+
21: pushq %rax ;; <--------+ [rbp - 16] = rax
22: pushq %rbx ;; callee saved
23: movq %rdi, %rbx ;; rbx = skb (callee saved)
; count++;
26: movabsq $-82417199407104, %rdi
30: movl (%rdi), %esi
33: addl $1, %esi
36: movl %esi, (%rdi)
; subprog_tail(skb);
39: movq %rbx, %rdi ;; rdi = skb
3c: movq -16(%rbp), %rax ;; rax = tcc_ptr
43: callq 0x80 ;; call subprog_tail()
; subprog_tail(skb);
48: movq %rbx, %rdi ;; rdi = skb
4b: movq -16(%rbp), %rax ;; rax = tcc_ptr
52: callq 0x80 ;; call subprog_tail()
; return ret;
57: movl $1, %eax
5c: popq %rbx
5d: leave
5e: retq
int subprog_tail(struct __sk_buff * skb):
bpf_prog_3a140cef239a4b4f_subprog_tail:
; int subprog_tail(struct __sk_buff *skb)
0: endbr64
4: nopl (%rax,%rax)
9: nopl (%rax) ;; do not touch tail_call_cnt
c: pushq %rbp
d: movq %rsp, %rbp
10: endbr64
14: pushq %rax ;; [rbp - 8] = rax (tcc_ptr)
15: pushq %rax ;; [rbp - 16] = rax (tcc_ptr)
16: pushq %rbx ;; callee saved
17: pushq %r13 ;; callee saved
19: movq %rdi, %rbx ;; rbx = skb
; asm volatile("r1 = %[ctx]\n\t"
1c: movabsq $-105487587488768, %r13 ;; r13 = jmp_table
26: movq %rbx, %rdi ;; 1st arg, skb
29: movq %r13, %rsi ;; 2nd arg, jmp_table
2c: xorl %edx, %edx ;; 3rd arg, index = 0
2e: movq -16(%rbp), %rax ;; rax = [rbp - 16] (tcc_ptr)
35: cmpq $33, (%rax)
39: jae 0x4e ;; if *tcc_ptr >= 33 goto 0x4e --------+
3b: jmp 0x4e ;; jmp bypass, toggled by poking |
40: addq $1, (%rax) ;; (*tcc_ptr)++ |
44: popq %r13 ;; callee saved |
46: popq %rbx ;; callee saved |
47: popq %rax ;; undo rbp-16 push |
48: popq %rax ;; undo rbp-8 push |
49: nopl (%rax,%rax) ;; tail call target, toggled by poking |
; return 0; ;; |
4e: popq %r13 ;; restore callee saved <--------------+
50: popq %rbx ;; restore callee saved
51: leave
52: retq
Furthermore, when trampoline is the caller of bpf prog, which is
tail_call_reachable, it is required to propagate rax through trampoline.
Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
Fixes: e411901c0b77 ("bpf: allow for tailcalls in BPF subprograms for x64 JIT")
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
Link: https://lore.kernel.org/r/20240714123902.32305-2-hffilwlqm@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit be35d91c8880650404f3bf813573222dfb106935 ]
In order to minimize required special handling for running as Xen PV
dom0, the memory layout is modified to match that of the host. This
requires to have only RAM at the locations where Xen allocated memory
is living. Unfortunately there seem to be some machines, where ACPI
NVS is located at 64 MB, resulting in a conflict with the loaded
kernel or the initial page tables built by Xen.
Avoid this conflict by swapping the ACPI NVS area in the memory map
with unused RAM. This is possible via modification of the dom0 P2M map.
Accesses to the ACPI NVS area are done either for saving and restoring
it across suspend operations (this will work the same way as before),
or by ACPI code when NVS memory is referenced from other ACPI tables.
The latter case is handled by a Xen specific indirection of
acpi_os_ioremap().
While the E820 map can (and should) be modified right away, the P2M
map can be updated only after memory allocation is working, as the P2M
map might need to be extended.
Fixes: 808fdb71936c ("xen: check for kernel memory conflicting with memory layout")
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d05208cf7f05420ad10cc7f9550f91d485523659 ]
When running as a Xen PV dom0 it can happen that the kernel is being
loaded to a guest physical address conflicting with the host memory
map.
In order to be able to resolve this conflict, add the capability to
remap non-RAM areas to different guest PFNs. A function to use this
remapping information for other purposes than doing the remap will be
added when needed.
As the number of conflicts should be rather low (currently only
machines with max. 1 conflict are known), save the remap data in a
small statically allocated array.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Stable-dep-of: be35d91c8880 ("xen: tolerate ACPI NVS memory overlapping with Xen allocated memory")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 43dc2a0f479b9cd30f6674986d7a40517e999d31 ]
Instead of having max_pfn as a local variable of xen_memory_setup(),
make it a static variable in setup.c instead. This avoids having to
pass it to subfunctions, which will be needed in more cases in future.
Rename it to ini_nr_pages, as the value denotes the currently usable
number of memory pages as passed from the hypervisor at boot time.
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Stable-dep-of: be35d91c8880 ( |