From 2427e21de274cf7b56ef79e4a7ba78a08def7a58 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Mon, 22 Jul 2019 18:22:18 +0100
Subject: [PATCH 37/39] target/i386: kvm: Demand nested migration kernel
 capabilities only when vCPU may have enabled VMX

RH-Author: Paolo Bonzini <pbonzini@redhat.com>
Message-id: <20190722182220.19374-17-pbonzini@redhat.com>
Patchwork-id: 89634
O-Subject: [RHEL-8.1.0 PATCH qemu-kvm v3 16/18] target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX
Bugzilla: 1689269
RH-Acked-by: Peter Xu <zhexu@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
RH-Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

From: Liran Alon <liran.alon@oracle.com>

Previous to this change, a vCPU exposed with VMX running on a kernel
without KVM_CAP_NESTED_STATE or KVM_CAP_EXCEPTION_PAYLOAD resulted in
adding a migration blocker. This was because, when the code was written,
it was thought that there was no way to reliably know at runtime whether
a vCPU is utilising VMX. However, it turns out that this can be known to
some extent:

In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
Once set, CR4.VMXE must remain set as long as the vCPU is in
VMX operation. This is because CR4.VMXE is one of the bits set
in MSR_IA32_VMX_CR4_FIXED1.
There is one exception to the above statement: when the vCPU enters SMM
mode. When a vCPU enters SMM mode, it temporarily exits VMX operation and
may also reset CR4.VMXE during execution in SMM mode.
When the vCPU exits SMM mode, vCPU state is restored to be in VMX operation
and CR4.VMXE is restored to its original state of being set.
Therefore, when the vCPU is not in SMM mode, we can infer whether
VMX is being used by examining CR4.VMXE. Otherwise, we cannot
know for certain and must assume the worst: that the vCPU may be
utilising VMX.

Summarizing all of the above, a vCPU may have enabled VMX if
CR4.VMXE is set or the vCPU is in SMM mode.

Therefore, remove the migration blocker and check before migration
(cpu_pre_save()) whether the vCPU may have enabled VMX. Only if so,
require the relevant kernel capabilities.

While at it, demand KVM_CAP_EXCEPTION_PAYLOAD only when the vCPU is in
guest-mode and there is a pending/injected exception. Otherwise, this
kernel capability is not required for proper migration.

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 79a197ab180e75838523c58973b1221ad7bf51eb)
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
---
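The pre-save decision described in the commit message can be modelled in
isolation. What follows is a minimal standalone sketch of that check, not
the QEMU code itself. The sketch_* names, the struct layout, and the bit
positions chosen for the SMM and guest-mode flags are assumptions made for
illustration; only CR4.VMXE (bit 13) follows the architectural definition,
and the kvm_enabled() test from the real code is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins; the real masks live in target/i386/cpu.h. */
#define SKETCH_CR4_VMXE  (1u << 13) /* architectural position of CR4.VMXE */
#define SKETCH_HF_SMM    (1u << 0)  /* hypothetical bit: vCPU is in SMM */
#define SKETCH_HF_GUEST  (1u << 1)  /* hypothetical bit: vCPU is in guest-mode */

/* Hypothetical, reduced view of the vCPU state the check looks at. */
struct sketch_vcpu {
    bool cpuid_vmx;          /* VMX is exposed to the guest via CPUID */
    uint32_t cr4;            /* guest CR4 */
    uint32_t hflags;         /* SMM / guest-mode flags */
    bool have_nested_state;  /* nested state was obtained from the kernel */
    bool exception_injected; /* an exception is pending/injected */
};

/* VMX may be in use if CR4.VMXE is set, or if SMM hides the real CR4. */
bool sketch_vmx_maybe_enabled(const struct sketch_vcpu *v)
{
    return v->cpuid_vmx &&
           ((v->cr4 & SKETCH_CR4_VMXE) || (v->hflags & SKETCH_HF_SMM));
}

/* Returns true when saving the vCPU has to fail. */
bool sketch_must_refuse_save(const struct sketch_vcpu *v,
                             bool kernel_has_exception_payload)
{
    if (!sketch_vmx_maybe_enabled(v)) {
        return false; /* no nested state to lose */
    }
    if (!v->have_nested_state) {
        return true;  /* nested state cannot be extracted */
    }
    /* The exception payload only matters in guest-mode with an exception. */
    return !kernel_has_exception_payload &&
           (v->hflags & SKETCH_HF_GUEST) && v->exception_injected;
}
```

With CR4.VMXE clear and the vCPU outside SMM, the sketch never refuses the
save, whatever the kernel supports. That is the case in which the old
unconditional blocker was stricter than necessary.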
 target/i386/cpu.h      | 22 ++++++++++++++++++++++
 target/i386/kvm.c      | 26 ++++++--------------------
 target/i386/kvm_i386.h |  1 +
 target/i386/machine.c  | 24 ++++++++++++++++++++----
 4 files changed, 49 insertions(+), 24 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d120f62..273c90b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1848,6 +1848,28 @@ static inline bool cpu_has_vmx(CPUX86State *env)
     return env->features[FEAT_1_ECX] & CPUID_EXT_VMX;
 }
 
+/*
+ * In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
+ * Once set, CR4.VMXE must remain set as long as the vCPU is in
+ * VMX operation. This is because CR4.VMXE is one of the bits set
+ * in MSR_IA32_VMX_CR4_FIXED1.
+ *
+ * There is one exception to the above statement when vCPU enters SMM mode.
+ * When a vCPU enters SMM mode, it temporarily exits VMX operation and
+ * may also reset CR4.VMXE during execution in SMM mode.
+ * When vCPU exits SMM mode, vCPU state is restored to be in VMX operation
+ * and CR4.VMXE is restored to its original value of being set.
+ *
+ * Therefore, when the vCPU is not in SMM mode, we can infer whether
+ * VMX is being used by examining CR4.VMXE. Otherwise, we cannot
+ * know for certain.
+ */
+static inline bool cpu_vmx_maybe_enabled(CPUX86State *env)
+{
+    return cpu_has_vmx(env) &&
+           ((env->cr[4] & CR4_VMXE_MASK) || (env->hflags & HF_SMM_MASK));
+}
+
 /* fpu_helper.c */
 void update_fp_status(CPUX86State *env);
 void update_mxcsr_status(CPUX86State *env);
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 0619aba..0bd286e 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -127,6 +127,11 @@ bool kvm_has_adjust_clock_stable(void)
     return (ret == KVM_CLOCK_TSC_STABLE);
 }
 
+bool kvm_has_exception_payload(void)
+{
+    return has_exception_payload;
+}
+
 bool kvm_allows_irq0_override(void)
 {
     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
@@ -814,7 +819,6 @@ static int hyperv_handle_properties(CPUState *cs)
 }
 
 static Error *invtsc_mig_blocker;
-static Error *nested_virt_mig_blocker;
 
 #define KVM_MAX_CPUID_ENTRIES  100
 
@@ -1159,22 +1163,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
                                   !!(c->ecx & CPUID_EXT_SMX);
     }
 
-    if (cpu_has_vmx(env) && !nested_virt_mig_blocker &&
-        ((kvm_max_nested_state_length() <= 0) || !has_exception_payload)) {
-        error_setg(&nested_virt_mig_blocker,
-                   "Kernel do not provide required capabilities for "
-                   "nested virtualization migration. "
-                   "(CAP_NESTED_STATE=%d, CAP_EXCEPTION_PAYLOAD=%d)",
-                   kvm_max_nested_state_length() > 0,
-                   has_exception_payload);
-        r = migrate_add_blocker(nested_virt_mig_blocker, &local_err);
-        if (local_err) {
-            error_report_err(local_err);
-            error_free(nested_virt_mig_blocker);
-            return r;
-        }
-    }
-
     if (env->mcg_cap & MCG_LMCE_P) {
         has_msr_mcg_ext_ctl = has_msr_feature_control = true;
     }
@@ -1190,7 +1178,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
             if (local_err) {
                 error_report_err(local_err);
                 error_free(invtsc_mig_blocker);
-                goto fail2;
+                return r;
             }
             /* for savevm */
             vmstate_x86_cpu.unmigratable = 1;
@@ -1256,8 +1244,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
  fail:
     migrate_del_blocker(invtsc_mig_blocker);
- fail2:
-    migrate_del_blocker(nested_virt_mig_blocker);
 
     return r;
 }
diff --git a/target/i386/kvm_i386.h b/target/i386/kvm_i386.h
index 1de9876..df9bbf3 100644
--- a/target/i386/kvm_i386.h
+++ b/target/i386/kvm_i386.h
@@ -41,6 +41,7 @@
 bool kvm_allows_irq0_override(void);
 bool kvm_has_smm(void);
 bool kvm_has_adjust_clock_stable(void);
+bool kvm_has_exception_payload(void);
 void kvm_synchronize_all_tsc(void);
 void kvm_arch_reset_vcpu(X86CPU *cs);
 void kvm_arch_do_init_vcpu(X86CPU *cs);
diff --git a/target/i386/machine.c b/target/i386/machine.c
index 5ffee8f..8d90d98 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -7,6 +7,7 @@
 #include "hw/i386/pc.h"
 #include "hw/isa/isa.h"
 #include "migration/cpu.h"
+#include "kvm_i386.h"
 
 #include "sysemu/kvm.h"
 
@@ -231,10 +232,25 @@ static int cpu_pre_save(void *opaque)
     }
 
 #ifdef CONFIG_KVM
-    /* Verify we have nested virtualization state from kernel if required */
-    if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) {
-        error_report("Guest enabled nested virtualization but kernel "
-                "does not support saving of nested state");
+    /*
+     * In case vCPU may have enabled VMX, we need to make sure kernel has
+     * required capabilities in order to perform migration correctly:
+     *
+     * 1) We must be able to extract vCPU nested-state from KVM.
+     *
+     * 2) In case vCPU is running in guest-mode and it has a pending exception,
+     * we must be able to determine if it's in a pending or injected state.
+     * Note that in case KVM doesn't have the required capability to do so,
+     * a pending/injected exception will always appear as an
+     * injected exception.
+     */
+    if (kvm_enabled() && cpu_vmx_maybe_enabled(env) &&
+        (!env->nested_state ||
+         (!kvm_has_exception_payload() && (env->hflags & HF_GUEST_MASK) &&
+          env->exception_injected))) {
+        error_report("Guest maybe enabled nested virtualization but kernel "
+                "does not support required capabilities to save vCPU "
+                "nested state");
         return -EINVAL;
     }
 #endif
-- 
1.8.3.1