From 49ae248b61aefa0eff84dca8e81bd9306cdaa6c9 Mon Sep 17 00:00:00 2001 From: Janis Schoetterl-Glausch Date: Thu, 18 Nov 2021 11:25:22 +0100 Subject: KVM: s390: Fix names of skey constants in api documentation They are defined in include/uapi/linux/kvm.h as KVM_S390_GET_SKEYS_NONE and KVM_S390_SKEYS_MAX, but the api documetation talks of KVM_S390_GET_KEYS_NONE and KVM_S390_SKEYS_ALLOC_MAX respectively. Signed-off-by: Janis Schoetterl-Glausch Reviewed-by: Janosch Frank Message-Id: <20211118102522.569660-1-scgl@linux.ibm.com> Signed-off-by: Janosch Frank --- Documentation/virt/kvm/api.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'Documentation/virt/kvm/api.rst') diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index aeeb071c7688..b86c7edae888 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -3701,7 +3701,7 @@ KVM with the currently defined set of flags. :Architectures: s390 :Type: vm ioctl :Parameters: struct kvm_s390_skeys -:Returns: 0 on success, KVM_S390_GET_KEYS_NONE if guest is not using storage +:Returns: 0 on success, KVM_S390_GET_SKEYS_NONE if guest is not using storage keys, negative value on error This ioctl is used to get guest storage key values on the s390 @@ -3720,7 +3720,7 @@ you want to get. The count field is the number of consecutive frames (starting from start_gfn) whose storage keys to get. The count field must be at least 1 and the maximum -allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX. Values outside this range +allowed value is defined as KVM_S390_SKEYS_MAX. Values outside this range will cause the ioctl to return -EINVAL. The skeydata_addr field is the address to a buffer large enough to hold count @@ -3744,7 +3744,7 @@ you want to set. The count field is the number of consecutive frames (starting from start_gfn) whose storage keys to get. The count field must be at least 1 and the maximum -allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX. Values outside this range +allowed value is defined as KVM_S390_SKEYS_MAX. Values outside this range will cause the ioctl to return -EINVAL. The skeydata_addr field is the address to a buffer containing count bytes of -- cgit v1.2.3 From 1cfc9c4b9d4606a1e90e7dbc50058b9f0c1d43a6 Mon Sep 17 00:00:00 2001 From: David Woodhouse Date: Fri, 10 Dec 2021 16:36:22 +0000 Subject: KVM: x86/xen: Maintain valid mapping of Xen shared_info page Use the newly reinstated gfn_to_pfn_cache to maintain a kernel mapping of the Xen shared_info page so that it can be accessed in atomic context. Note that we do not participate in dirty tracking for the shared info page and we do not explicitly mark it dirty every single tim we deliver an event channel interrupts. We wouldn't want to do that even if we *did* have a valid vCPU context with which to do so. Signed-off-by: David Woodhouse Message-Id: <20211210163625.2886-4-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini --- Documentation/virt/kvm/api.rst | 12 ++++++++++++ 1 file changed, 12 insertions(+) (limited to 'Documentation/virt/kvm/api.rst') diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index b86c7edae888..c168be764707 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -371,6 +371,9 @@ The bits in the dirty bitmap are cleared before the ioctl returns, unless KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is enabled. For more information, see the description of the capability. +Note that the Xen shared info page, if configured, shall always be assumed +to be dirty. KVM will not explicitly mark it such. + 4.9 KVM_SET_MEMORY_ALIAS ------------------------ @@ -5134,6 +5137,15 @@ KVM_XEN_ATTR_TYPE_SHARED_INFO not aware of the Xen CPU id which is used as the index into the vcpu_info[] array, so cannot know the correct default location. + Note that the shared info page may be constantly written to by KVM; + it contains the event channel bitmap used to deliver interrupts to + a Xen guest, amongst other things. It is exempt from dirty tracking + mechanisms — KVM will not explicitly mark the page as dirty each + time an event channel interrupt is delivered to the guest! Thus, + userspace should always assume that the designated GFN is dirty if + any vCPU has been running or any event channel interrupts can be + routed to the guest. + KVM_XEN_ATTR_TYPE_UPCALL_VECTOR Sets the exception vector used to deliver Xen event channel upcalls. -- cgit v1.2.3 From 14243b387137a4afbe1df5d9dc15182d6657bb79 Mon Sep 17 00:00:00 2001 From: David Woodhouse Date: Fri, 10 Dec 2021 16:36:23 +0000 Subject: KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery This adds basic support for delivering 2 level event channels to a guest. Initially, it only supports delivery via the IRQ routing table, triggered by an eventfd. In order to do so, it has a kvm_xen_set_evtchn_fast() function which will use the pre-mapped shared_info page if it already exists and is still valid, while the slow path through the irqfd_inject workqueue will remap the shared_info page if necessary. It sets the bits in the shared_info page but not the vcpu_info; that is deferred to __kvm_xen_has_interrupt() which raises the vector to the appropriate vCPU. Add a 'verbose' mode to xen_shinfo_test while adding test cases for this. Signed-off-by: David Woodhouse Message-Id: <20211210163625.2886-5-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini --- Documentation/virt/kvm/api.rst | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) (limited to 'Documentation/virt/kvm/api.rst') diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index c168be764707..6b683dfea8f2 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1799,6 +1799,7 @@ No flags are specified so far, the corresponding field must be set to zero. struct kvm_irq_routing_msi msi; struct kvm_irq_routing_s390_adapter adapter; struct kvm_irq_routing_hv_sint hv_sint; + struct kvm_irq_routing_xen_evtchn xen_evtchn; __u32 pad[8]; } u; }; @@ -1808,6 +1809,7 @@ No flags are specified so far, the corresponding field must be set to zero. #define KVM_IRQ_ROUTING_MSI 2 #define KVM_IRQ_ROUTING_S390_ADAPTER 3 #define KVM_IRQ_ROUTING_HV_SINT 4 + #define KVM_IRQ_ROUTING_XEN_EVTCHN 5 flags: @@ -1859,6 +1861,20 @@ address_hi must be zero. __u32 sint; }; + struct kvm_irq_routing_xen_evtchn { + __u32 port; + __u32 vcpu; + __u32 priority; + }; + + +When KVM_CAP_XEN_HVM includes the KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL bit +in its indication of supported features, routing to Xen event channels +is supported. Although the priority field is present, only the value +KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL is supported, which means delivery by +2 level event channels. FIFO event channel support may be added in +the future. + 4.55 KVM_SET_TSC_KHZ -------------------- @@ -7413,6 +7429,7 @@ PVHVM guests. Valid flags are:: #define KVM_XEN_HVM_CONFIG_INTERCEPT_HCALL (1 << 1) #define KVM_XEN_HVM_CONFIG_SHARED_INFO (1 << 2) #define KVM_XEN_HVM_CONFIG_RUNSTATE (1 << 2) + #define KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL (1 << 3) The KVM_XEN_HVM_CONFIG_HYPERCALL_MSR flag indicates that the KVM_XEN_HVM_CONFIG ioctl is available, for the guest to set its hypercall page. @@ -7432,6 +7449,10 @@ The KVM_XEN_HVM_CONFIG_RUNSTATE flag indicates that the runstate-related features KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR/_CURRENT/_DATA/_ADJUST are supported by the KVM_XEN_VCPU_SET_ATTR/KVM_XEN_VCPU_GET_ATTR ioctls. +The KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL flag indicates that IRQ routing entries +of the type KVM_IRQ_ROUTING_XEN_EVTCHN are supported, with the priority +field set to indicate 2 level event channel delivery. + 8.31 KVM_CAP_PPC_MULTITCE ------------------------- -- cgit v1.2.3 From 445ecdf79be0c71ca248f7611aeefceaea3ec59f Mon Sep 17 00:00:00 2001 From: Jing Liu Date: Wed, 5 Jan 2022 04:35:15 -0800 Subject: kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUID KVM_GET_SUPPORTED_CPUID should not include any dynamic xstates in CPUID[0xD] if they have not been requested with prctl. Otherwise a process which directly passes KVM_GET_SUPPORTED_CPUID to KVM_SET_CPUID2 would now fail even if it doesn't intend to use a dynamically enabled feature. Userspace must know that prctl is required and allocate >4K xstate buffer before setting any dynamic bit. Suggested-by: Paolo Bonzini Signed-off-by: Jing Liu Signed-off-by: Yang Zhong Message-Id: <20220105123532.12586-5-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini --- Documentation/virt/kvm/api.rst | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'Documentation/virt/kvm/api.rst') diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 6b683dfea8f2..f4ea5e41a4d0 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1687,6 +1687,10 @@ userspace capabilities, and with user requirements (for example, the user may wish to constrain cpuid to emulate older hardware, or for feature consistency across a cluster). +Dynamically-enabled feature bits need to be requested with +``arch_prctl()`` before calling this ioctl. Feature bits that have not +been requested are excluded from the result. + Note that certain capabilities, such as KVM_CAP_X86_DISABLE_EXITS, may expose cpuid features (e.g. MONITOR) which are not supported by kvm in its default configuration. If userspace enables such capabilities, it -- cgit v1.2.3 From be50b2065dfa3d88428fdfdc340d154d96bf6848 Mon Sep 17 00:00:00 2001 From: Guang Zeng Date: Wed, 5 Jan 2022 04:35:29 -0800 Subject: kvm: x86: Add support for getting/setting expanded xstate buffer With KVM_CAP_XSAVE, userspace uses a hardcoded 4KB buffer to get/set xstate data from/to KVM. This doesn't work when dynamic xfeatures (e.g. AMX) are exposed to the guest as they require a larger buffer size. Introduce a new capability (KVM_CAP_XSAVE2). Userspace VMM gets the required xstate buffer size via KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2). KVM_SET_XSAVE is extended to work with both legacy and new capabilities by doing properly-sized memdup_user() based on the guest fpu container. KVM_GET_XSAVE is kept for backward-compatible reason. Instead, KVM_GET_XSAVE2 is introduced under KVM_CAP_XSAVE2 as the preferred interface for getting xstate buffer (4KB or larger size) from KVM (Link: https://lkml.org/lkml/2021/12/15/510) Also, update the api doc with the new KVM_GET_XSAVE2 ioctl. Signed-off-by: Guang Zeng Signed-off-by: Wei Wang Signed-off-by: Jing Liu Signed-off-by: Kevin Tian Signed-off-by: Yang Zhong Message-Id: <20220105123532.12586-19-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini --- Documentation/virt/kvm/api.rst | 42 ++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 40 insertions(+), 2 deletions(-) (limited to 'Documentation/virt/kvm/api.rst') diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index f4ea5e41a4d0..d3791a14eb9a 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1569,6 +1569,7 @@ otherwise it will return EBUSY error. struct kvm_xsave { __u32 region[1024]; + __u32 extra[0]; }; This ioctl would copy current vcpu's xsave struct to the userspace. @@ -1577,7 +1578,7 @@ This ioctl would copy current vcpu's xsave struct to the userspace. 4.43 KVM_SET_XSAVE ------------------ -:Capability: KVM_CAP_XSAVE +:Capability: KVM_CAP_XSAVE and KVM_CAP_XSAVE2 :Architectures: x86 :Type: vcpu ioctl :Parameters: struct kvm_xsave (in) @@ -1588,9 +1589,18 @@ This ioctl would copy current vcpu's xsave struct to the userspace. struct kvm_xsave { __u32 region[1024]; + __u32 extra[0]; }; -This ioctl would copy userspace's xsave struct to the kernel. +This ioctl would copy userspace's xsave struct to the kernel. It copies +as many bytes as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2), +when invoked on the vm file descriptor. The size value returned by +KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) will always be at least 4096. +Currently, it is only greater than 4096 if a dynamic feature has been +enabled with ``arch_prctl()``, but this may change in the future. + +The offsets of the state save areas in struct kvm_xsave follow the +contents of CPUID leaf 0xD on the host. 4.44 KVM_GET_XCRS @@ -5535,6 +5545,34 @@ the trailing ``'\0'``, is indicated by ``name_size`` in the header. The Stats Data block contains an array of 64-bit values in the same order as the descriptors in Descriptors block. +4.42 KVM_GET_XSAVE2 +------------------ + +:Capability: KVM_CAP_XSAVE2 +:Architectures: x86 +:Type: vcpu ioctl +:Parameters: struct kvm_xsave (out) +:Returns: 0 on success, -1 on error + + +:: + + struct kvm_xsave { + __u32 region[1024]; + __u32 extra[0]; + }; + +This ioctl would copy current vcpu's xsave struct to the userspace. It +copies as many bytes as are returned by KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) +when invoked on the vm file descriptor. The size value returned by +KVM_CHECK_EXTENSION(KVM_CAP_XSAVE2) will always be at least 4096. +Currently, it is only greater than 4096 if a dynamic feature has been +enabled with ``arch_prctl()``, but this may change in the future. + +The offsets of the state save areas in struct kvm_xsave follow the contents +of CPUID leaf 0xD on the host. + + 5. The kvm_run structure ======================== -- cgit v1.2.3