|
Pablo Greco |
e6a3ae |
From 2427e21de274cf7b56ef79e4a7ba78a08def7a58 Mon Sep 17 00:00:00 2001
|
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Mon, 22 Jul 2019 18:22:18 +0100
Subject: [PATCH 37/39] target/i386: kvm: Demand nested migration kernel
 capabilities only when vCPU may have enabled VMX

RH-Author: Paolo Bonzini <pbonzini@redhat.com>
Message-id: <20190722182220.19374-17-pbonzini@redhat.com>
Patchwork-id: 89634
O-Subject: [RHEL-8.1.0 PATCH qemu-kvm v3 16/18] target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX
Bugzilla: 1689269
RH-Acked-by: Peter Xu <zhexu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
RH-Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

From: Liran Alon <liran.alon@oracle.com>

Previous to this change, a vCPU exposed with VMX running on a kernel
without KVM_CAP_NESTED_STATE or KVM_CAP_EXCEPTION_PAYLOAD resulted in
adding a migration blocker. This was because when the code was written
it was thought there is no way to reliably know if a vCPU is utilising
VMX or not at runtime. However, it turns out that this can be known to
some extent:

In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
Since it was set, CR4.VMXE must remain set as long as the vCPU is in
VMX operation. This is because CR4.VMXE is one of the bits set
in MSR_IA32_VMX_CR4_FIXED1.
There is one exception to the above statement when vCPU enters SMM mode.
When a vCPU enters SMM mode, it temporarily exits VMX operation and
may also reset CR4.VMXE during execution in SMM mode.
When the vCPU exits SMM mode, vCPU state is restored to be in VMX operation
and CR4.VMXE is restored to its original state of being set.
Therefore, when the vCPU is not in SMM mode, we can infer whether
VMX is being used by examining CR4.VMXE. Otherwise, we cannot
know for certain but assume the worst, that the vCPU may utilise VMX.

Summarizing all the above, a vCPU may have enabled VMX in case
CR4.VMXE is set or vCPU is in SMM mode.
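
As a rough illustration of the rule summarized above, the following
standalone sketch mirrors the predicate. The struct and the sketch_*
names are hypothetical stand-ins, not QEMU's CPUX86State or its masks;
only the CPUID.01H:ECX bit 5 (VMX) and CR4 bit 13 (VMXE) positions are
architectural.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions. */
#define SKETCH_CPUID_1_ECX_VMX (1u << 5)   /* CPUID.01H:ECX.VMX */
#define SKETCH_CR4_VMXE        (1u << 13)  /* CR4.VMXE */

/* Hypothetical, simplified vCPU state; not QEMU's CPUX86State. */
struct sketch_vcpu {
    uint32_t cpuid_1_ecx;  /* feature bits exposed to the guest */
    uint64_t cr4;          /* current CR4 value */
    bool in_smm;           /* vCPU is currently in SMM */
};

/* True if the vCPU may be in VMX operation; false only if it cannot be. */
static bool sketch_vmx_maybe_enabled(const struct sketch_vcpu *v)
{
    if (!(v->cpuid_1_ecx & SKETCH_CPUID_1_ECX_VMX)) {
        return false;  /* VMX is not exposed to the guest at all */
    }
    /* In SMM, CR4.VMXE may be temporarily clear, so assume the worst. */
    return (v->cr4 & SKETCH_CR4_VMXE) != 0 || v->in_smm;
}

/* Small self-check covering the three cases described in the text. */
static void sketch_vmx_self_check(void)
{
    struct sketch_vcpu v = { .cpuid_1_ecx = SKETCH_CPUID_1_ECX_VMX };

    assert(!sketch_vmx_maybe_enabled(&v));  /* VMXE clear, not in SMM */
    v.cr4 = SKETCH_CR4_VMXE;
    assert(sketch_vmx_maybe_enabled(&v));   /* VMXE set */
    v.cr4 = 0;
    v.in_smm = true;
    assert(sketch_vmx_maybe_enabled(&v));   /* in SMM, VMXE unknown */
}
```

Note that the check is deliberately conservative: it can report "maybe"
for a guest that never used VMX, but it should not report "no" for one
that did.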

Therefore, remove migration blocker and check before migration
(cpu_pre_save()) if the vCPU may have enabled VMX. If true, only then
require relevant kernel capabilities.

While at it, demand KVM_CAP_EXCEPTION_PAYLOAD only when the vCPU is in
guest-mode and there is a pending/injected exception. Otherwise, this
kernel capability is not required for proper migration.
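
The resulting pre-save decision can be sketched as below. This is only
an illustration of the conditions described above: the struct, its
field names and sketch_pre_save_check() are hypothetical and flatten
state that the real code reads from the vCPU and from KVM.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of the inputs to the pre-save check. */
struct sketch_presave {
    bool kvm_enabled;                    /* running under KVM */
    bool vmx_maybe_enabled;              /* CR4.VMXE set or vCPU in SMM */
    bool have_nested_state;              /* kernel provided nested state */
    bool have_exception_payload;         /* KVM_CAP_EXCEPTION_PAYLOAD */
    bool in_guest_mode;                  /* vCPU is running a nested guest */
    bool exception_pending_or_injected;  /* an exception is outstanding */
};

/* Returns 0 if migration may proceed, -EINVAL if a capability is missing. */
static int sketch_pre_save_check(const struct sketch_presave *s)
{
    if (!s->kvm_enabled || !s->vmx_maybe_enabled) {
        return 0;  /* no nested state to worry about */
    }
    if (!s->have_nested_state) {
        return -EINVAL;  /* cannot extract nested state from the kernel */
    }
    if (!s->have_exception_payload && s->in_guest_mode &&
        s->exception_pending_or_injected) {
        return -EINVAL;  /* cannot tell pending from injected */
    }
    return 0;
}

/* Small self-check: the payload capability only matters in guest-mode. */
static void sketch_pre_save_self_check(void)
{
    struct sketch_presave s = {
        .kvm_enabled = true,
        .vmx_maybe_enabled = true,
        .have_nested_state = true,
        .exception_pending_or_injected = true,
    };

    assert(sketch_pre_save_check(&s) == 0);        /* not in guest-mode */
    s.in_guest_mode = true;
    assert(sketch_pre_save_check(&s) == -EINVAL);  /* payload cap missing */
    s.have_exception_payload = true;
    assert(sketch_pre_save_check(&s) == 0);        /* capability present */
}
```

A guest that never sets CR4.VMXE and is not in SMM passes the check
even on a kernel without either capability, which is the point of
replacing the unconditional blocker.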

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 79a197ab180e75838523c58973b1221ad7bf51eb)
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
---
 target/i386/cpu.h      | 22 ++++++++++++++++++++++
 target/i386/kvm.c      | 26 ++++++--------------------
 target/i386/kvm_i386.h |  1 +
 target/i386/machine.c  | 24 ++++++++++++++++++++----
 4 files changed, 49 insertions(+), 24 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d120f62..273c90b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1848,6 +1848,28 @@ static inline bool cpu_has_vmx(CPUX86State *env)
     return env->features[FEAT_1_ECX] & CPUID_EXT_VMX;
 }

+/*
+ * In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
+ * Since it was set, CR4.VMXE must remain set as long as vCPU is in
+ * VMX operation. This is because CR4.VMXE is one of the bits set
+ * in MSR_IA32_VMX_CR4_FIXED1.
+ *
+ * There is one exception to above statement when vCPU enters SMM mode.
+ * When a vCPU enters SMM mode, it temporarily exit VMX operation and
+ * may also reset CR4.VMXE during execution in SMM mode.
+ * When vCPU exits SMM mode, vCPU state is restored to be in VMX operation
+ * and CR4.VMXE is restored to it's original value of being set.
+ *
+ * Therefore, when vCPU is not in SMM mode, we can infer whether
+ * VMX is being used by examining CR4.VMXE. Otherwise, we cannot
+ * know for certain.
+ */
+static inline bool cpu_vmx_maybe_enabled(CPUX86State *env)
+{
+    return cpu_has_vmx(env) &&
+           ((env->cr[4] & CR4_VMXE_MASK) || (env->hflags & HF_SMM_MASK));
+}
+
 /* fpu_helper.c */
 void update_fp_status(CPUX86State *env);
 void update_mxcsr_status(CPUX86State *env);
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 0619aba..0bd286e 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -127,6 +127,11 @@ bool kvm_has_adjust_clock_stable(void)
     return (ret == KVM_CLOCK_TSC_STABLE);
 }

+bool kvm_has_exception_payload(void)
+{
+    return has_exception_payload;
+}
+
 bool kvm_allows_irq0_override(void)
 {
     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
@@ -814,7 +819,6 @@ static int hyperv_handle_properties(CPUState *cs)
 }

 static Error *invtsc_mig_blocker;
-static Error *nested_virt_mig_blocker;

 #define KVM_MAX_CPUID_ENTRIES 100

@@ -1159,22 +1163,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
                                   !!(c->ecx & CPUID_EXT_SMX);
     }

-    if (cpu_has_vmx(env) && !nested_virt_mig_blocker &&
-        ((kvm_max_nested_state_length() <= 0) || !has_exception_payload)) {
-        error_setg(&nested_virt_mig_blocker,
-                   "Kernel do not provide required capabilities for "
-                   "nested virtualization migration. "
-                   "(CAP_NESTED_STATE=%d, CAP_EXCEPTION_PAYLOAD=%d)",
-                   kvm_max_nested_state_length() > 0,
-                   has_exception_payload);
-        r = migrate_add_blocker(nested_virt_mig_blocker, &local_err);
-        if (local_err) {
-            error_report_err(local_err);
-            error_free(nested_virt_mig_blocker);
-            return r;
-        }
-    }
-
     if (env->mcg_cap & MCG_LMCE_P) {
         has_msr_mcg_ext_ctl = has_msr_feature_control = true;
     }
@@ -1190,7 +1178,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
             if (local_err) {
                 error_report_err(local_err);
                 error_free(invtsc_mig_blocker);
-                goto fail2;
+                return r;
             }
             /* for savevm */
             vmstate_x86_cpu.unmigratable = 1;
@@ -1256,8 +1244,6 @@ int kvm_arch_init_vcpu(CPUState *cs)

  fail:
     migrate_del_blocker(invtsc_mig_blocker);
- fail2:
-    migrate_del_blocker(nested_virt_mig_blocker);

     return r;
 }
diff --git a/target/i386/kvm_i386.h b/target/i386/kvm_i386.h
index 1de9876..df9bbf3 100644
--- a/target/i386/kvm_i386.h
+++ b/target/i386/kvm_i386.h
@@ -41,6 +41,7 @@
 bool kvm_allows_irq0_override(void);
 bool kvm_has_smm(void);
 bool kvm_has_adjust_clock_stable(void);
+bool kvm_has_exception_payload(void);
 void kvm_synchronize_all_tsc(void);
 void kvm_arch_reset_vcpu(X86CPU *cs);
 void kvm_arch_do_init_vcpu(X86CPU *cs);
diff --git a/target/i386/machine.c b/target/i386/machine.c
index 5ffee8f..8d90d98 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -7,6 +7,7 @@
 #include "hw/i386/pc.h"
 #include "hw/isa/isa.h"
 #include "migration/cpu.h"
+#include "kvm_i386.h"

 #include "sysemu/kvm.h"

@@ -231,10 +232,25 @@ static int cpu_pre_save(void *opaque)
     }

 #ifdef CONFIG_KVM
-    /* Verify we have nested virtualization state from kernel if required */
-    if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) {
-        error_report("Guest enabled nested virtualization but kernel "
-                     "does not support saving of nested state");
+    /*
+     * In case vCPU may have enabled VMX, we need to make sure kernel have
+     * required capabilities in order to perform migration correctly:
+     *
+     * 1) We must be able to extract vCPU nested-state from KVM.
+     *
+     * 2) In case vCPU is running in guest-mode and it has a pending exception,
+     * we must be able to determine if it's in a pending or injected state.
+     * Note that in case KVM don't have required capability to do so,
+     * a pending/injected exception will always appear as an
+     * injected exception.
+     */
+    if (kvm_enabled() && cpu_vmx_maybe_enabled(env) &&
+        (!env->nested_state ||
+         (!kvm_has_exception_payload() && (env->hflags & HF_GUEST_MASK) &&
+          env->exception_injected))) {
+        error_report("Guest maybe enabled nested virtualization but kernel "
+                     "does not support required capabilities to save vCPU "
+                     "nested state");
         return -EINVAL;
     }
 #endif
-- 
1.8.3.1