diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-07-16 11:12:25 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-07-16 11:12:25 -0700 |
| commit | 408323581b722c9bd504dd296920f392049a7f52 (patch) | |
| tree | 651e7d137b01ee1a3cca49787c014aba1e42652e | |
| parent | b84b3381907a3c5c6f1d524185eddc55547068b7 (diff) | |
| parent | 5fa96c7ab3dc666c2904a35895635156c17a8f05 (diff) | |
| download | linux-408323581b722c9bd504dd296920f392049a7f52.tar.gz linux-408323581b722c9bd504dd296920f392049a7f52.tar.bz2 linux-408323581b722c9bd504dd296920f392049a7f52.zip | |
Merge tag 'x86_sev_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov:
- Add support for running the kernel in a SEV-SNP guest, over a Secure
VM Service Module (SVSM).
When running over a SVSM, different services can run at different
protection levels, apart from the guest OS but still within the
secure SNP environment. They can provide services to the guest, like
a vTPM, for example.
This series adds the required facilities to interface with such a
SVSM module.
- The usual fixlets, refactoring and cleanups
[ And as always: "SEV" is AMD's "Secure Encrypted Virtualization".
I can't be the only one who gets all the newer x86 TLA's confused,
can I?
- Linus ]
* tag 'x86_sev_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation/ABI/configfs-tsm: Fix an unexpected indentation silly
x86/sev: Do RMP memory coverage check after max_pfn has been set
x86/sev: Move SEV compilation units
virt: sev-guest: Mark driver struct with __refdata to prevent section mismatch
x86/sev: Allow non-VMPL0 execution when an SVSM is present
x86/sev: Extend the config-fs attestation support for an SVSM
x86/sev: Take advantage of configfs visibility support in TSM
fs/configfs: Add a callback to determine attribute visibility
sev-guest: configfs-tsm: Allow the privlevel_floor attribute to be updated
virt: sev-guest: Choose the VMPCK key based on executing VMPL
x86/sev: Provide guest VMPL level to userspace
x86/sev: Provide SVSM discovery support
x86/sev: Use the SVSM to create a vCPU when not in VMPL0
x86/sev: Perform PVALIDATE using the SVSM when not at VMPL0
x86/sev: Use kernel provided SVSM Calling Areas
x86/sev: Check for the presence of an SVSM in the SNP secrets page
x86/irqflags: Provide native versions of the local_irq_save()/restore()
24 files changed, 1658 insertions, 189 deletions
diff --git a/Documentation/ABI/testing/configfs-tsm b/Documentation/ABI/testing/configfs-tsm index dd24202b5ba5..534408bc1408 100644 --- a/Documentation/ABI/testing/configfs-tsm +++ b/Documentation/ABI/testing/configfs-tsm @@ -31,6 +31,18 @@ Description: Standardization v2.03 Section 4.1.8.1 MSG_REPORT_REQ. https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/specifications/56421.pdf +What: /sys/kernel/config/tsm/report/$name/manifestblob +Date: January, 2024 +KernelVersion: v6.10 +Contact: linux-coco@lists.linux.dev +Description: + (RO) Optional supplemental data that a TSM may emit, visibility + of this attribute depends on TSM, and may be empty if no + manifest data is available. + + See 'service_provider' for information on the format of the + manifest blob. + What: /sys/kernel/config/tsm/report/$name/provider Date: September, 2023 KernelVersion: v6.7 @@ -80,3 +92,54 @@ Contact: linux-coco@lists.linux.dev Description: (RO) Indicates the minimum permissible value that can be written to @privlevel. + +What: /sys/kernel/config/tsm/report/$name/service_provider +Date: January, 2024 +KernelVersion: v6.10 +Contact: linux-coco@lists.linux.dev +Description: + (WO) Attribute is visible if a TSM implementation provider + supports the concept of attestation reports from a service + provider for TVMs, like SEV-SNP running under an SVSM. + Specifying the service provider via this attribute will create + an attestation report as specified by the service provider. + The only currently supported service provider is "svsm". + + For the "svsm" service provider, see the Secure VM Service Module + for SEV-SNP Guests v1.00 Section 7. For the doc, search for + "site:amd.com "Secure VM Service Module for SEV-SNP + Guests", docID: 58019" + +What: /sys/kernel/config/tsm/report/$name/service_guid +Date: January, 2024 +KernelVersion: v6.10 +Contact: linux-coco@lists.linux.dev +Description: + (WO) Attribute is visible if a TSM implementation provider + supports the concept of attestation reports from a service + provider for TVMs, like SEV-SNP running under an SVSM. + Specifying an empty/null GUID (00000000-0000-0000-0000-000000) + requests all active services within the service provider be + part of the attestation report. Specifying a GUID request + an attestation report of just the specified service using the + manifest form specified by the service_manifest_version + attribute. + + See 'service_provider' for information on the format of the + service guid. + +What: /sys/kernel/config/tsm/report/$name/service_manifest_version +Date: January, 2024 +KernelVersion: v6.10 +Contact: linux-coco@lists.linux.dev +Description: + (WO) Attribute is visible if a TSM implementation provider + supports the concept of attestation reports from a service + provider for TVMs, like SEV-SNP running under an SVSM. + Indicates the service manifest version requested for the + attestation report (default 0). If this field is not set by + the user, the default manifest version of the service (the + service's initial/first manifest version) is returned. + + See 'service_provider' for information on the format of the + service manifest version. diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index 53ed1a803422..325873385b71 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -605,6 +605,18 @@ Description: Umwait control Note that a value of zero means there is no limit. Low order two bits must be zero. +What: /sys/devices/system/cpu/sev + /sys/devices/system/cpu/sev/vmpl +Date: May 2024 +Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> +Description: Secure Encrypted Virtualization (SEV) information + + This directory is only present when running as an SEV-SNP guest. + + vmpl: Reports the Virtual Machine Privilege Level (VMPL) at which + the SEV-SNP guest is running. + + What: /sys/devices/system/cpu/svm Date: August 2019 Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> diff --git a/Documentation/arch/x86/amd-memory-encryption.rst b/Documentation/arch/x86/amd-memory-encryption.rst index 414bc7402ae7..6df3264f23b9 100644 --- a/Documentation/arch/x86/amd-memory-encryption.rst +++ b/Documentation/arch/x86/amd-memory-encryption.rst @@ -130,4 +130,31 @@ SNP feature support. More details in AMD64 APM[1] Vol 2: 15.34.10 SEV_STATUS MSR -[1] https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf +Secure VM Service Module (SVSM) +=============================== +SNP provides a feature called Virtual Machine Privilege Levels (VMPL) which +defines four privilege levels at which guest software can run. The most +privileged level is 0 and numerically higher numbers have lesser privileges. +More details in the AMD64 APM Vol 2, section "15.35.7 Virtual Machine +Privilege Levels", docID: 24593. + +When using that feature, different services can run at different protection +levels, apart from the guest OS but still within the secure SNP environment. +They can provide services to the guest, like a vTPM, for example. + +When a guest is not running at VMPL0, it needs to communicate with the software +running at VMPL0 to perform privileged operations or to interact with secure +services. An example fur such a privileged operation is PVALIDATE which is +*required* to be executed at VMPL0. + +In this scenario, the software running at VMPL0 is usually called a Secure VM +Service Module (SVSM). Discovery of an SVSM and the API used to communicate +with it is documented in "Secure VM Service Module for SEV-SNP Guests", docID: +58019. + +(Latest versions of the above-mentioned documents can be found by using +a search engine like duckduckgo.com and typing in: + + site:amd.com "Secure VM Service Module for SEV-SNP Guests", docID: 58019 + +for example.) diff --git a/Documentation/virt/coco/sev-guest.rst b/Documentation/virt/coco/sev-guest.rst index e1eaf6a830ce..9d00967a5b2b 100644 --- a/Documentation/virt/coco/sev-guest.rst +++ b/Documentation/virt/coco/sev-guest.rst @@ -204,6 +204,17 @@ has taken care to make use of the SEV-SNP CPUID throughout all stages of boot. Otherwise, guest owner attestation provides no assurance that the kernel wasn't fed incorrect values at some point during boot. +4. SEV Guest Driver Communication Key +===================================== + +Communication between an SEV guest and the SEV firmware in the AMD Secure +Processor (ASP, aka PSP) is protected by a VM Platform Communication Key +(VMPCK). By default, the sev-guest driver uses the VMPCK associated with the +VM Privilege Level (VMPL) at which the guest is running. Should this key be +wiped by the sev-guest driver (see the driver for reasons why a VMPCK can be +wiped), a different key can be used by reloading the sev-guest driver and +specifying the desired key using the vmpck_id module parameter. + Reference --------- diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c index 0457a9d7e515..cd44e120fe53 100644 --- a/arch/x86/boot/compressed/sev.c +++ b/arch/x86/boot/compressed/sev.c @@ -127,7 +127,35 @@ static bool fault_in_kernel_space(unsigned long address) #include "../../lib/insn.c" /* Include code for early handlers */ -#include "../../kernel/sev-shared.c" +#include "../../coco/sev/shared.c" + +static struct svsm_ca *svsm_get_caa(void) +{ + return boot_svsm_caa; +} + +static u64 svsm_get_caa_pa(void) +{ + return boot_svsm_caa_pa; +} + +static int svsm_perform_call_protocol(struct svsm_call *call) +{ + struct ghcb *ghcb; + int ret; + + if (boot_ghcb) + ghcb = boot_ghcb; + else + ghcb = NULL; + + do { + ret = ghcb ? svsm_perform_ghcb_protocol(ghcb, call) + : svsm_perform_msr_protocol(call); + } while (ret == -EAGAIN); + + return ret; +} bool sev_snp_enabled(void) { @@ -145,8 +173,8 @@ static void __page_state_change(unsigned long paddr, enum psc_op op) * If private -> shared then invalidate the page before requesting the * state change in the RMP table. */ - if (op == SNP_PAGE_STATE_SHARED && pvalidate(paddr, RMP_PG_SIZE_4K, 0)) - sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE); + if (op == SNP_PAGE_STATE_SHARED) + pvalidate_4k_page(paddr, paddr, false); /* Issue VMGEXIT to change the page state in RMP table. */ sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op)); @@ -161,8 +189,8 @@ static void __page_state_change(unsigned long paddr, enum psc_op op) * Now that page state is changed in the RMP table, validate it so that it is * consistent with the RMP entry. */ - if (op == SNP_PAGE_STATE_PRIVATE && pvalidate(paddr, RMP_PG_SIZE_4K, 1)) - sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE); + if (op == SNP_PAGE_STATE_PRIVATE) + pvalidate_4k_page(paddr, paddr, true); } void snp_set_page_private(unsigned long paddr) @@ -256,6 +284,16 @@ void sev_es_shutdown_ghcb(void) error("SEV-ES CPU Features missing."); /* + * This denotes whether to use the GHCB MSR protocol or the GHCB + * shared page to perform a GHCB request. Since the GHCB page is + * being changed to encrypted, it can't be used to perform GHCB + * requests. Clear the boot_ghcb variable so that the GHCB MSR + * protocol is used to change the GHCB page over to an encrypted + * page. + */ + boot_ghcb = NULL; + + /* * GHCB Page must be flushed from the cache and mapped encrypted again. * Otherwise the running kernel will see strange cache effects when * trying to use that page. @@ -463,6 +501,13 @@ static bool early_snp_init(struct boot_params *bp) setup_cpuid_table(cc_info); /* + * Record the SVSM Calling Area (CA) address if the guest is not + * running at VMPL0. The CA will be used to communicate with the + * SVSM and request its services. + */ + svsm_setup_ca(cc_info); + + /* * Pass run-time kernel a pointer to CC info via boot_params so EFI * config table doesn't need to be searched again during early startup * phase. @@ -565,22 +610,31 @@ void sev_enable(struct boot_params *bp) * features. */ if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) { - if (!(get_hv_features() & GHCB_HV_FT_SNP)) + u64 hv_features; + int ret; + + hv_features = get_hv_features(); + if (!(hv_features & GHCB_HV_FT_SNP)) sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED); /* - * Enforce running at VMPL0. - * - * RMPADJUST modifies RMP permissions of a lesser-privileged (numerically - * higher) privilege level. Here, clear the VMPL1 permission mask of the - * GHCB page. If the guest is not running at VMPL0, this will fail. + * Enforce running at VMPL0 or with an SVSM. * - * If the guest is running at VMPL0, it will succeed. Even if that operation - * modifies permission bits, it is still ok to do so currently because Linux - * SNP guests running at VMPL0 only run at VMPL0, so VMPL1 or higher - * permission mask changes are a don't-care. + * Use RMPADJUST (see the rmpadjust() function for a description of + * what the instruction does) to update the VMPL1 permissions of a + * page. If the guest is running at VMPL0, this will succeed. If the + * guest is running at any other VMPL, this will fail. Linux SNP guests + * only ever run at a single VMPL level so permission mask changes of a + * lesser-privileged VMPL are a don't-care. + */ + ret = rmpadjust((unsigned long)&boot_ghcb_page, RMP_PG_SIZE_4K, 1); + + /* + * Running at VMPL0 is not required if an SVSM is present and the hypervisor + * supports the required SVSM GHCB events. */ - if (rmpadjust((unsigned long)&boot_ghcb_page, RMP_PG_SIZE_4K, 1)) + if (ret && + !(snp_vmpl && (hv_features & GHCB_HV_FT_SNP_MULTI_VMPL))) sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0); } diff --git a/arch/x86/coco/Makefile b/arch/x86/coco/Makefile index c816acf78b6a..eabdc7486538 100644 --- a/arch/x86/coco/Makefile +++ b/arch/x86/coco/Makefile @@ -6,3 +6,4 @@ CFLAGS_core.o += -fno-stack-protector obj-y += core.o obj-$(CONFIG_INTEL_TDX_GUEST) += tdx/ +obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev/ diff --git a/arch/x86/coco/sev/Makefile b/arch/x86/coco/sev/Makefile new file mode 100644 index 000000000000..4e375e7305ac --- /dev/null +++ b/arch/x86/coco/sev/Makefile @@ -0,0 +1,15 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-y += core.o + +ifdef CONFIG_FUNCTION_TRACER +CFLAGS_REMOVE_core.o = -pg +endif + +KASAN_SANITIZE_core.o := n +KMSAN_SANITIZE_core.o := n +KCOV_INSTRUMENT_core.o := n + +# With some compiler versions the generated code results in boot hangs, caused +# by several compilation units. To be safe, disable all instrumentation. +KCSAN_SANITIZE := n diff --git a/arch/x86/kernel/sev.c b/arch/x86/coco/sev/core.c index 3342ed58e168..082d61d85dfc 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/coco/sev/core.c @@ -133,16 +133,20 @@ struct ghcb_state { struct ghcb *ghcb; }; +/* For early boot SVSM communication */ +static struct svsm_ca boot_svsm_ca_page __aligned(PAGE_SIZE); + static DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data); static DEFINE_PER_CPU(struct sev_es_save_area *, sev_vmsa); +static DEFINE_PER_CPU(struct svsm_ca *, svsm_caa); +static DEFINE_PER_CPU(u64, svsm_caa_pa); struct sev_config { __u64 debug : 1, /* - * A flag used by __set_pages_state() that indicates when the - * per-CPU GHCB has been created and registered and thus can be - * used by the BSP instead of the early boot GHCB. + * Indicates when the per-CPU GHCB has been created and registered + * and thus can be used by the BSP instead of the early boot GHCB. * * For APs, the per-CPU GHCB is created before they are started * and registered upon startup, so this flag can be used globally @@ -150,6 +154,15 @@ struct sev_config { */ ghcbs_initialized : 1, + /* + * Indicates when the per-CPU SVSM CA is to be used instead of the + * boot SVSM CA. + * + * For APs, the per-CPU SVSM CA is created as part of the AP + * bringup, so this flag can be used globally for the BSP and APs. + */ + use_cas : 1, + __reserved : 62; }; @@ -572,8 +585,61 @@ fault: return ES_EXCEPTION; } +static __always_inline void vc_forward_exception(struct es_em_ctxt *ctxt) +{ + long error_code = ctxt->fi.error_code; + int trapnr = ctxt->fi.vector; + + ctxt->regs->orig_ax = ctxt->fi.error_code; + + switch (trapnr) { + case X86_TRAP_GP: + exc_general_protection(ctxt->regs, error_code); + break; + case X86_TRAP_UD: + exc_invalid_op(ctxt->regs); + break; + case X86_TRAP_PF: + write_cr2(ctxt->fi.cr2); + exc_page_fault(ctxt->regs, error_code); + break; + case X86_TRAP_AC: + exc_alignment_check(ctxt->regs, error_code); + break; + default: + pr_emerg("Unsupported exception in #VC instruction emulation - can't continue\n"); + BUG(); + } +} + /* Include code shared with pre-decompression boot stage */ -#include "sev-shared.c" +#include "shared.c" + +static inline struct svsm_ca *svsm_get_caa(void) +{ + /* + * Use rIP-relative references when called early in the boot. If + * ->use_cas is set, then it is late in the boot and no need + * to worry about rIP-relative references. + */ + if (RIP_REL_REF(sev_cfg).use_cas) + return this_cpu_read(svsm_caa); + else + return RIP_REL_REF(boot_svsm_caa); +} + +static u64 svsm_get_caa_pa(void) +{ + /* + * Use rIP-relative references when called early in the boot. If + * ->use_cas is set, then it is late in the boot and no need + * to worry about rIP-relative references. + */ + if (RIP_REL_REF(sev_cfg).use_cas) + return this_cpu_read(svsm_caa_pa); + else + return RIP_REL_REF(boot_svsm_caa_pa); +} static noinstr void __sev_put_ghcb(struct ghcb_state *state) { @@ -600,6 +666,44 @@ static noinstr void __sev_put_ghcb(struct ghcb_state *state) } } +static int svsm_perform_call_protocol(struct svsm_call *call) +{ + struct ghcb_state state; + unsigned long flags; + struct ghcb *ghcb; + int ret; + + /* + * This can be called very early in the boot, use native functions in + * order to avoid paravirt issues. + */ + flags = native_local_irq_save(); + + /* + * Use rip-relative references when called early in the boot. If + * ghcbs_initialized is set, then it is late in the boot and no need + * to worry about rip-relative references in called functions. + */ + if (RIP_REL_REF(sev_cfg).ghcbs_initialized) + ghcb = __sev_get_ghcb(&state); + else if (RIP_REL_REF(boot_ghcb)) + ghcb = RIP_REL_REF(boot_ghcb); + else + ghcb = NULL; + + do { + ret = ghcb ? svsm_perform_ghcb_protocol(ghcb, call) + : svsm_perform_msr_protocol(call); + } while (ret == -EAGAIN); + + if (RIP_REL_REF(sev_cfg).ghcbs_initialized) + __sev_put_ghcb(&state); + + native_local_irq_restore(flags); + + return ret; +} + void noinstr __sev_es_nmi_complete(void) { struct ghcb_state state; @@ -709,7 +813,6 @@ early_set_pages_state(unsigned long vaddr, unsigned long paddr, { unsigned long paddr_end; u64 val; - int ret; vaddr = vaddr & PAGE_MASK; @@ -717,12 +820,9 @@ early_set_pages_state(unsigned long vaddr, unsigned long paddr, paddr_end = paddr + (npages << PAGE_SHIFT); while (paddr < paddr_end) { - if (op == SNP_PAGE_STATE_SHARED) { - /* Page validation must be rescinded before changing to shared */ - ret = pvalidate(vaddr, RMP_PG_SIZE_4K, false); - if (WARN(ret, "Failed to validate address 0x%lx ret %d", paddr, ret)) - goto e_term; - } + /* Page validation must be rescinded before changing to shared */ + if (op == SNP_PAGE_STATE_SHARED) + pvalidate_4k_page(vaddr, paddr, false); /* * Use the MSR protocol because this function can be called before @@ -744,12 +844,9 @@ early_set_pages_state(unsigned long vaddr, unsigned long paddr, paddr, GHCB_MSR_PSC_RESP_VAL(val))) goto e_term; - if (op == SNP_PAGE_STATE_PRIVATE) { - /* Page validation must be performed after changing to private */ - ret = pvalidate(vaddr, RMP_PG_SIZE_4K, true); - if (WARN(ret, "Failed to validate address 0x%lx ret %d", paddr, ret)) - goto e_term; - } + /* Page validation must be performed after changing to private */ + if (op == SNP_PAGE_STATE_PRIVATE) + pvalidate_4k_page(vaddr, paddr, true); vaddr += PAGE_SIZE; paddr += PAGE_SIZE; @@ -913,22 +1010,49 @@ void snp_accept_memory(phys_addr_t start, phys_addr_t end) set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE); } -static int snp_set_vmsa(void *va, bool vmsa) +static int snp_set_vmsa(void *va, void *caa, int apic_id, bool make_vmsa) { - u64 attrs; + int ret; - /* - * Running at VMPL0 allows the kernel to change the VMSA bit for a page - * using the RMPADJUST instruction. However, for the instruction to - * succeed it must target the permissions of a lesser privileged - * (higher numbered) VMPL level, so use VMPL1 (refer to the RMPADJUST - * instruction in the AMD64 APM Volume 3). - */ - attrs = 1; - if (vmsa) - attrs |= RMPADJUST_VMSA_PAGE_BIT; + if (snp_vmpl) { + struct svsm_call call = {}; + unsigned long flags; + + local_irq_save(flags); - return rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs); + call.caa = this_cpu_read(svsm_caa); + call.rcx = __pa(va); + + if (make_vmsa) { + /* Protocol 0, Call ID 2 */ + call.rax = SVSM_CORE_CALL(SVSM_CORE_CREATE_VCPU); + call.rdx = __pa(caa); + call.r8 = apic_id; + } else { + /* Protocol 0, Call ID 3 */ + call.rax = SVSM_CORE_CALL(SVSM_CORE_DELETE_VCPU); + } + + ret = svsm_perform_call_protocol(&call); + + local_irq_restore(flags); + } else { + /* + * If the kernel runs at VMPL0, it can change the VMSA + * bit for a page using the RMPADJUST instruction. + * However, for the instruction to succeed it must + * target the permissions of a lesser privileged (higher + * numbered) VMPL level, so use VMPL1. + */ + u64 attrs = 1; + + if (make_vmsa) + attrs |= RMPADJUST_VMSA_PAGE_BIT; + + ret = rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs); + } + + return ret; } #define __ATTR_BASE (SVM_SELECTOR_P_MASK | SVM_SELECTOR_S_MASK) @@ -962,11 +1086,11 @@ static void *snp_alloc_vmsa_page(int cpu) return page_address(p + 1); } -static void snp_cleanup_vmsa(struct sev_es_save_area *vmsa) +static void snp_cleanup_vmsa(struct sev_es_save_area *vmsa, int apic_id) { int err; - err = snp_set_vmsa(vmsa, false); + err = snp_set_vmsa(vmsa, NULL, apic_id, false); if (err) pr_err("clear VMSA page failed (%u), leaking page\n", err); else @@ -977,6 +1101,7 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip) { struct sev_es_save_area *cur_vmsa, *vmsa; struct ghcb_state state; + struct svsm_ca *caa; unsigned long flags; struct ghcb *ghcb; u8 sipi_vector; @@ -1023,6 +1148,9 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip) if (!vmsa) return -ENOMEM; + /* If an SVSM is present, the SVSM per-CPU CAA will be !NULL */ + caa = per_cpu(svsm_caa, cpu); + /* CR4 should maintain the MCE value */ cr4 = native_read_cr4() & X86_CR4_MCE; @@ -1070,11 +1198,11 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip) * VMPL level * SEV_FEATURES (matches the SEV STATUS MSR right shifted 2 bits) */ - vmsa->vmpl = 0; + vmsa->vmpl = snp_vmpl; vmsa->sev_features = sev_status >> 2; /* Switch the page over to a VMSA page now that it is initialized */ - ret = snp_set_vmsa(vmsa, true); + ret = snp_set_vmsa(vmsa, caa, apic_id, true); if (ret) { pr_err("set VMSA page failed (%u)\n", ret); free_page((unsigned long)vmsa); @@ -1090,7 +1218,10 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip) vc_ghcb_invalidate(ghcb); ghcb_set_rax(ghcb, vmsa->sev_features); ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION); - ghcb_set_sw_exit_info_1(ghcb, ((u64)apic_id << 32) | SVM_VMGEXIT_AP_CREATE); + ghcb_set_sw_exit_info_1(ghcb, + ((u64)apic_id << 32) | + ((u64)snp_vmpl << 16) | + SVM_VMGEXIT_AP_CREATE); ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa)); sev_es_wr_ghcb_msr(__pa(ghcb)); @@ -1108,13 +1239,13 @@ static int wakeup_cpu_via_vmgexit(u32 apic_id, unsigned long start_ip) /* Perform cleanup if there was an error */ if (ret) { - snp_cleanup_vmsa(vmsa); + snp_cleanup_vmsa(vmsa, apic_id); vmsa = NULL; } /* Free up any previous VMSA page */ if (cur_vmsa) - snp_cleanup_vmsa(cur_vmsa); + snp_cleanup_vmsa(cur_vmsa, apic_id); /* Record the current VMSA page */ per_cpu(sev_vmsa, cpu) = vmsa; @@ -1209,6 +1340,17 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt) /* Is it a WRMSR? */ exit_info_1 = (ctxt->insn.opcode.bytes[1] == 0x30) ? 1 : 0; + if (regs->cx == MSR_SVSM_CAA) { + /* Writes to the SVSM CAA msr are ignored */ + if (exit_info_1) + return ES_OK; + + regs->ax = lower_32_bits(this_cpu_read(svsm_caa_pa)); + regs->dx = upper_32_bits(this_cpu_read(svsm_caa_pa)); + + return ES_OK; + } + ghcb_set_rcx(ghcb, regs->cx); if (exit_info_1) { ghcb_set_rax(ghcb, regs->ax); @@ -1346,6 +1488,18 @@ static void __init alloc_runtime_data(int cpu) panic("Can't allocate SEV-ES runtime data"); per_cpu(runtime_data, cpu) = data; + + if (snp_vmpl) { + struct svsm_ca *caa; + + /* Allocate the SVSM CA page if an SVSM is present */ + caa = memblock_alloc(sizeof(*caa), PAGE_SIZE); + if (!caa) + panic("Can't allocate SVSM CA page\n"); + + per_cpu(svsm_caa, cpu) = caa; + per_cpu(svsm_caa_pa, cpu) = __pa(caa); + } } static void __init init_ghcb(int cpu) @@ -1395,6 +1549,32 @@ void __init sev_es_init_vc_handling(void) init_ghcb(cpu); } + /* If running under an SVSM, switch to the per-cpu CA */ + if (snp_vmpl) { + struct svsm_call call = {}; + unsigned long flags; + int ret; + + local_irq_save(flags); + + /* + * SVSM_CORE_REMAP_CA call: + * RAX = 0 (Protocol=0, CallID=0) + * RCX = New CA GPA + */ + call.caa = svsm_get_caa(); + call.rax = SVSM_CORE_CALL(SVSM_CORE_REMAP_CA); + call.rcx = this_cpu_read(svsm_caa_pa); + ret = svsm_perform_call_protocol(&call); + if (ret) + panic("Can't remap the SVSM CA, ret=%d, rax_out=0x%llx\n", + ret, call.rax_out); + + sev_cfg.use_cas = true; + + local_irq_restore(flags); + } + sev_es_setup_play_dead(); /* Secondary CPUs use the runtime #VC handler */ @@ -1819,33 +1999,6 @@ static enum es_result vc_handle_exitcode(struct es_em_ctxt *ctxt, return result; } -static __always_inline void vc_forward_exception(struct es_em_ctxt *ctxt) -{ - long error_code = ctxt->fi.error_code; - int trapnr = ctxt->fi.vector; - - ctxt->regs->orig_ax = ctxt->fi.error_code; - - switch (trapnr) { - case X86_TRAP_GP: - exc_general_protection(ctxt->regs, error_code); - break; - case X86_TRAP_UD: - exc_invalid_op(ctxt->regs); - break; - case X86_TRAP_PF: - write_cr2(ctxt->fi.cr2); - exc_page_fault(ctxt->regs, error_code); - break; - case X86_TRAP_AC: - exc_alignment_check(ctxt->regs, error_code); - break; - default: - pr_emerg("Unsupported exception in #VC instruction emulation - can't continue\n"); - BUG(); - } -} - static __always_inline bool is_vc2_stack(unsigned long sp) { return (sp >= __this_cpu_ist_bottom_va(VC2) && sp < __this_cpu_ist_top_va(VC2)); @@ -2095,6 +2248,47 @@ found_cc_info: return cc_info; } +static __head void svsm_setup(struct cc_blob_sev_info *cc_info) +{ + struct svsm_call call = {}; + int ret; + u64 pa; + + /* + * Record the SVSM Calling Area address (CAA) if the guest is not + * running at VMPL0. The CA will be used to communicate with the + * SVSM to perform the SVSM services. + */ + if (!svsm_setup_ca(cc_info)) + return; + + /* + * It is very early in the boot and the kernel is running identity + * mapped but without having adjusted the pagetables to where the + * kernel was loaded (physbase), so the get the CA address using + * RIP-relative addressing. + */ + pa = (u64)&RIP_REL_REF(boot_svsm_ca_page); + + /* + * Switch over to the boot SVSM CA while the current CA is still + * addressable. There is no GHCB at this point so use the MSR protocol. + * + * SVSM_CORE_REMAP_CA call: + * RAX = 0 (Protocol=0, CallID=0) + * RCX = New CA GPA + */ + call.caa = svsm_get_caa(); + call.rax = SVSM_CORE_CALL(SVSM_CORE_REMAP_CA); + call.rcx = pa; + ret = svsm_perform_call_protocol(&call); + if (ret) + panic("Can't remap the SVSM CA, ret=%d, rax_out=0x%llx\n", ret, call.rax_out); + + RIP_REL_REF(boot_svsm_caa) = (struct svsm_ca *)pa; + RIP_REL_REF(boot_svsm_caa_pa) = pa; +} + |
