The ept identity pagetable and apic access page in kvm are pinned in memory.
As a result, they cannot be migrated or hot-removed. But actually, neither of
them needs to be pinned.
[For ept identity page]
Simply don't pin it. When the page is migrated, the guest will find the new
page on the next ept violation.
[For apic access page]
The hpa of the apic access page is stored in the VMCS APIC_ACCESS_ADDR
pointer. When the page is migrated, we additionally update the
APIC_ACCESS_ADDR pointer in each vcpu's VMCS.
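Roughly, the reload flow introduced by this series looks like this (a
simplified sketch; the functions are the ones added in the patches below):

    kvm_mmu_notifier_invalidate_page()              /* page is migrated */
        -> kvm_reload_apic_access_page()            /* patch 5 */
            -> make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD)

    vcpu_enter_guest()                              /* on each vcpu */
        -> vcpu_reload_apic_access_page()           /* patch 5 */
            -> gfn_to_page_no_pin()                 /* patch 1 */
            -> kvm_x86_ops->set_apic_access_page_addr()  /* new hpa -> VMCS */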
NOTE: Patches 1~5 have been tested with the -cpu xxx,-x2apic option and work
well. Patch 6 has not been tested yet, so I am not sure whether it is correct.
Change log v2 -> v3:
1. Remove original [PATCH 3/6] since ept_identity_pagetable has been removed
in new [PATCH 3/6].
2. In [PATCH 3/6], fix the problem that the check of
kvm->arch.ept_identity_pagetable_done was not protected by kvm->slots_lock.
3. In [PATCH 3/6], drop gfn_to_page() since ept_identity_pagetable has been
removed.
4. Add new [PATCH 4/6], remove redundant variable in init_rmode_identity_map(),
and make it return 0 on success.
5. In [PATCH 5/6], drop put_page(kvm->arch.apic_access_page) from x86.c.
6. In [PATCH 5/6], update kvm->arch.apic_access_page in vcpu_reload_apic_access_page().
7. Add new [PATCH 6/6], reload apic access page in L2->L1 exit.
Change log v1 -> v2:
1. Add [PATCH 4/5] to remove unnecessary kvm_arch->ept_identity_pagetable.
2. In [PATCH 5/5], only introduce KVM_REQ_APIC_PAGE_RELOAD request.
3. In [PATCH 5/5], add set_apic_access_page_addr() for svm.
Tang Chen (6):
kvm: Add gfn_to_page_no_pin() to translate gfn to page without
pinning.
kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address.
kvm: Remove ept_identity_pagetable from struct kvm_arch.
kvm: Make init_rmode_identity_map() return 0 on success.
kvm, mem-hotplug: Do not pin apic access page in memory.
kvm, mem-hotplug: Reload L1's apic access page if it is migrated when
L2 is running.
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/kvm/svm.c | 15 +++++-
arch/x86/kvm/vmx.c | 108 +++++++++++++++++++++++++++-------------
arch/x86/kvm/x86.c | 22 ++++++--
include/linux/kvm_host.h | 3 ++
virt/kvm/kvm_main.c | 29 ++++++++++-
6 files changed, 139 insertions(+), 41 deletions(-)
--
1.8.3.1
gfn_to_page() eventually calls hva_to_pfn() to get the pfn, and pins the page
in memory by calling GUP functions. The new gfn_to_page_no_pin() drops that
pin again before returning the page.
It will be used by the following patches.
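For illustration, a minimal caller sketch (hypothetical; actual users are
added in the following patches). Since no reference is held on the returned
page, the caller must tolerate the page being migrated and re-resolve the gfn
when that happens:

    struct page *page;
    hpa_t hpa;

    page = gfn_to_page_no_pin(kvm, gfn);
    if (is_error_page(page))
        return -EFAULT;

    /*
     * No reference is held: the page may be migrated at any time, so
     * this hpa is only valid until the next mmu_notifier invalidation.
     */
    hpa = page_to_phys(page);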
Signed-off-by: Tang Chen <[email protected]>
---
include/linux/kvm_host.h | 1 +
virt/kvm/kvm_main.c | 17 ++++++++++++++++-
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ec4e3bd..7c58d9d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -541,6 +541,7 @@ int gfn_to_page_many_atomic(struct kvm *kvm, gfn_t gfn, struct page **pages,
int nr_pages);
struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
+struct page *gfn_to_page_no_pin(struct kvm *kvm, gfn_t gfn);
unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn);
unsigned long gfn_to_hva_prot(struct kvm *kvm, gfn_t gfn, bool *writable);
unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4b6c01b..6091849 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1371,9 +1371,24 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
return kvm_pfn_to_page(pfn);
}
-
EXPORT_SYMBOL_GPL(gfn_to_page);
+struct page *gfn_to_page_no_pin(struct kvm *kvm, gfn_t gfn)
+{
+ struct page *page = gfn_to_page(kvm, gfn);
+
+ /*
+ * gfn_to_page() eventually calls hva_to_pfn() to get the pfn, and pins
+ * the page in memory by calling GUP functions. This function drops
+ * that pin again before returning the page.
+ */
+ if (!is_error_page(page))
+ put_page(page);
+
+ return page;
+}
+EXPORT_SYMBOL_GPL(gfn_to_page_no_pin);
+
void kvm_release_page_clean(struct page *page)
{
WARN_ON(is_error_page(page));
--
1.8.3.1
We have APIC_DEFAULT_PHYS_BASE defined as 0xfee00000, which is also the
address of the apic access page. So use this macro instead of the open-coded
constant.
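For reference, the macro is defined in arch/x86/include/asm/apicdef.h:

    #define APIC_DEFAULT_PHYS_BASE 0xfee00000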
Signed-off-by: Tang Chen <[email protected]>
---
arch/x86/kvm/svm.c | 3 ++-
arch/x86/kvm/vmx.c | 6 +++---
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ec8366c..576b525 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1257,7 +1257,8 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
svm->asid_generation = 0;
init_vmcb(svm);
- svm->vcpu.arch.apic_base = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
+ svm->vcpu.arch.apic_base = APIC_DEFAULT_PHYS_BASE |
+ MSR_IA32_APICBASE_ENABLE;
if (kvm_vcpu_is_bsp(&svm->vcpu))
svm->vcpu.arch.apic_base |= MSR_IA32_APICBASE_BSP;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 801332e..0e1117c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3982,13 +3982,13 @@ static int alloc_apic_access_page(struct kvm *kvm)
goto out;
kvm_userspace_mem.slot = APIC_ACCESS_PAGE_PRIVATE_MEMSLOT;
kvm_userspace_mem.flags = 0;
- kvm_userspace_mem.guest_phys_addr = 0xfee00000ULL;
+ kvm_userspace_mem.guest_phys_addr = APIC_DEFAULT_PHYS_BASE;
kvm_userspace_mem.memory_size = PAGE_SIZE;
r = __kvm_set_memory_region(kvm, &kvm_userspace_mem);
if (r)
goto out;
- page = gfn_to_page(kvm, 0xfee00);
+ page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
if (is_error_page(page)) {
r = -EFAULT;
goto out;
@@ -4460,7 +4460,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu)
vmx->vcpu.arch.regs[VCPU_REGS_RDX] = get_rdx_init_val();
kvm_set_cr8(&vmx->vcpu, 0);
- apic_base_msr.data = 0xfee00000 | MSR_IA32_APICBASE_ENABLE;
+ apic_base_msr.data = APIC_DEFAULT_PHYS_BASE | MSR_IA32_APICBASE_ENABLE;
if (kvm_vcpu_is_bsp(&vmx->vcpu))
apic_base_msr.data |= MSR_IA32_APICBASE_BSP;
apic_base_msr.host_initiated = true;
--
1.8.3.1
The apic access page is pinned in memory, so it cannot be migrated or
hot-removed. Actually, it does not need to be pinned.
The hpa of the apic access page is stored in the VMCS APIC_ACCESS_ADDR
pointer. When the page is migrated, kvm_mmu_notifier_invalidate_page() will
invalidate the corresponding ept entry. At that point, this patch makes a new
vcpu request, KVM_REQ_APIC_PAGE_RELOAD, to all vcpus, forcing them to exit
the guest. Before re-entering, each vcpu updates its VMCS APIC_ACCESS_ADDR
pointer to the new apic access page address and updates
kvm->arch.apic_access_page to the new page.
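The new request follows the existing KVM_REQ_* pattern. Roughly (a sketch
mirroring the hunks below):

    /* Producer, called from kvm_mmu_notifier_invalidate_page(): */
    kvm_reload_apic_access_page(kvm);   /* -> make_all_cpus_request() */

    /* Consumer, in vcpu_enter_guest(), before re-entering the guest: */
    if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu))
        vcpu_reload_apic_access_page(vcpu);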
Signed-off-by: Tang Chen <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm.c | 6 ++++++
arch/x86/kvm/vmx.c | 8 +++++++-
arch/x86/kvm/x86.c | 17 +++++++++++++++--
include/linux/kvm_host.h | 2 ++
virt/kvm/kvm_main.c | 12 ++++++++++++
6 files changed, 43 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 62f973e..9ce6bfd 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -737,6 +737,7 @@ struct kvm_x86_ops {
void (*hwapic_isr_update)(struct kvm *kvm, int isr);
void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
+ void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 576b525..dc76f29 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3612,6 +3612,11 @@ static void svm_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
return;
}
+static void svm_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
+{
+ return;
+}
+
static int svm_vm_has_apicv(struct kvm *kvm)
{
return 0;
@@ -4365,6 +4370,7 @@ static struct kvm_x86_ops svm_x86_ops = {
.enable_irq_window = enable_irq_window,
.update_cr8_intercept = update_cr8_intercept,
.set_virtual_x2apic_mode = svm_set_virtual_x2apic_mode,
+ .set_apic_access_page_addr = svm_set_apic_access_page_addr,
.vm_has_apicv = svm_vm_has_apicv,
.load_eoi_exitmap = svm_load_eoi_exitmap,
.hwapic_isr_update = svm_hwapic_isr_update,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6ab4f87..c123c1d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3995,7 +3995,7 @@ static int alloc_apic_access_page(struct kvm *kvm)
if (r)
goto out;
- page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
+ page = gfn_to_page_no_pin(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
if (is_error_page(page)) {
r = -EFAULT;
goto out;
@@ -7072,6 +7072,11 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
vmx_set_msr_bitmap(vcpu);
}
+static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
+{
+ vmcs_write64(APIC_ACCESS_ADDR, hpa);
+}
+
static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
{
u16 status;
@@ -8841,6 +8846,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
.enable_irq_window = enable_irq_window,
.update_cr8_intercept = update_cr8_intercept,
.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
+ .set_apic_access_page_addr = vmx_set_apic_access_page_addr,
.vm_has_apicv = vmx_vm_has_apicv,
.load_eoi_exitmap = vmx_load_eoi_exitmap,
.hwapic_irr_update = vmx_hwapic_irr_update,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ffbe557..7541a66 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5929,6 +5929,19 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
kvm_apic_update_tmr(vcpu, tmr);
}
+static void vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
+{
+ /*
+ * The apic access page could be migrated. While the page is being
+ * migrated, GUP will wait until the migration entry is replaced with
+ * the new pte entry pointing to the new page.
+ */
+ vcpu->kvm->arch.apic_access_page = gfn_to_page_no_pin(vcpu->kvm,
+ APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
+ kvm_x86_ops->set_apic_access_page_addr(vcpu->kvm,
+ page_to_phys(vcpu->kvm->arch.apic_access_page));
+}
+
/*
* Returns 1 to let __vcpu_run() continue the guest execution loop without
* exiting to the userspace. Otherwise, the value will be returned to the
@@ -5989,6 +6002,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
kvm_deliver_pmi(vcpu);
if (kvm_check_request(KVM_REQ_SCAN_IOAPIC, vcpu))
vcpu_scan_ioapic(vcpu);
+ if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu))
+ vcpu_reload_apic_access_page(vcpu);
}
if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
@@ -7175,8 +7190,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
kfree(kvm->arch.vpic);
kfree(kvm->arch.vioapic);
kvm_free_vcpus(kvm);
- if (kvm->arch.apic_access_page)
- put_page(kvm->arch.apic_access_page);
kfree(rcu_dereference_check(kvm->arch.apic_map, 1));
}
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7c58d9d..f49be86 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -136,6 +136,7 @@ static inline bool is_error_page(struct page *page)
#define KVM_REQ_GLOBAL_CLOCK_UPDATE 22
#define KVM_REQ_ENABLE_IBS 23
#define KVM_REQ_DISABLE_IBS 24
+#define KVM_REQ_APIC_PAGE_RELOAD 25
#define KVM_USERSPACE_IRQ_SOURCE_ID 0
#define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1
@@ -596,6 +597,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm);
void kvm_reload_remote_mmus(struct kvm *kvm);
void kvm_make_mclock_inprogress_request(struct kvm *kvm);
void kvm_make_scan_ioapic_request(struct kvm *kvm);
+void kvm_reload_apic_access_page(struct kvm *kvm);
long kvm_arch_dev_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6091849..965b702 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -210,6 +210,11 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
}
+void kvm_reload_apic_access_page(struct kvm *kvm)
+{
+ make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
+}
+
int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
{
struct page *page;
@@ -294,6 +299,13 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
if (need_tlb_flush)
kvm_flush_remote_tlbs(kvm);
+ /*
+ * The physical address of the apic access page is stored in the VMCS,
+ * so it needs to be updated when the page becomes invalid.
+ */
+ if (address == gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT))
+ kvm_reload_apic_access_page(kvm);
+
spin_unlock(&kvm->mmu_lock);
srcu_read_unlock(&kvm->srcu, idx);
}
--
1.8.3.1
This patch only handles the situation where L1 and L2 share one apic access
page.
When L1 is running and the shared apic access page is migrated, mmu_notifier
will request all vcpus to exit to L0 and reload the apic access page physical
address in all the vcpus' vmcs (which is done by patch 5/6). When L1 then
enters L2, L2's vmcs will be updated in prepare_vmcs02(), called by
nested_vmx_run(). So nothing more needs to be done.
When L2 is running and the shared apic access page is migrated, mmu_notifier
will request all vcpus to exit to L0 and reload the apic access page physical
address in all L2 vmcs. This patch additionally requests an apic access page
reload on the L2->L1 vmexit.
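In short, the nested flow is roughly (a simplified sketch of what this patch
adds, assuming L1 and L2 share the apic access page):

    /* Page migrated while L2 runs: the reload request updates vmcs02. */
    vcpu_reload_apic_access_page(vcpu);
        -> kvm_x86_ops->set_apic_access_page_addr()           /* L2 vmcs */
        -> kvm_x86_ops->set_nested_apic_page_migrated(vcpu, true);

    /* Later, on the L2->L1 vmexit: */
    nested_vmx_vmexit()
        -> kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu)   /* each vcpu */
        /* the next vcpu_enter_guest() then fixes up the L1 vmcs */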
Signed-off-by: Tang Chen <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm.c | 6 ++++++
arch/x86/kvm/vmx.c | 37 +++++++++++++++++++++++++++++++++++++
arch/x86/kvm/x86.c | 3 +++
4 files changed, 47 insertions(+)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9ce6bfd..613ee7f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -738,6 +738,7 @@ struct kvm_x86_ops {
void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
+ void (*set_nested_apic_page_migrated)(struct kvm_vcpu *vcpu, bool set);
void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index dc76f29..87273ef 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3617,6 +3617,11 @@ static void svm_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
return;
}
+static void svm_set_nested_apic_page_migrated(struct kvm_vcpu *vcpu, bool set)
+{
+ return;
+}
+
static int svm_vm_has_apicv(struct kvm *kvm)
{
return 0;
@@ -4371,6 +4376,7 @@ static struct kvm_x86_ops svm_x86_ops = {
.update_cr8_intercept = update_cr8_intercept,
.set_virtual_x2apic_mode = svm_set_virtual_x2apic_mode,
.set_apic_access_page_addr = svm_set_apic_access_page_addr,
+ .set_nested_apic_page_migrated = svm_set_nested_apic_page_migrated,
.vm_has_apicv = svm_vm_has_apicv,
.load_eoi_exitmap = svm_load_eoi_exitmap,
.hwapic_isr_update = svm_hwapic_isr_update,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c123c1d..9231afe 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -379,6 +379,16 @@ struct nested_vmx {
* we must keep them pinned while L2 runs.
*/
struct page *apic_access_page;
+ /*
+ * L1's apic access page can be migrated. When L1 and L2 share the
+ * apic access page and the page is migrated while L2 is running,
+ * we have to reload it into L1's vmcs before we enter L1.
+ *
+ * When the shared apic access page is migrated in L1 mode, nothing
+ * else needs to be done, because the apic access page is reloaded
+ * each time we enter L2 in prepare_vmcs02().
+ */
+ bool apic_access_page_migrated;
u64 msr_ia32_feature_control;
struct hrtimer preemption_timer;
@@ -7077,6 +7087,12 @@ static void vmx_set_apic_access_page_addr(struct kvm *kvm, hpa_t hpa)
vmcs_write64(APIC_ACCESS_ADDR, hpa);
}
+static void vmx_set_nested_apic_page_migrated(struct kvm_vcpu *vcpu, bool set)
+{
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ vmx->nested.apic_access_page_migrated = set;
+}
+
static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
{
u16 status;
@@ -8727,6 +8743,26 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
}
/*
+ * When the shared (L1 & L2) apic access page is migrated while L2 is
+ * running, mmu_notifier will force a reload of the page's hpa for the
+ * L2 vmcs. We need to reload it for L1 before entering L1.
+ */
+ if (vmx->nested.apic_access_page_migrated) {
+ /*
+ * Do not call kvm_reload_apic_access_page() because we are now
+ * in L2. We should not call make_all_cpus_request() to exit to
+ * L0; otherwise we would reload the L2 vmcs again.
+ */
+ int i;
+
+ for (i = 0; i < atomic_read(&vcpu->kvm->online_vcpus); i++)
+ kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD,
+ vcpu->kvm->vcpus[i]);
+
+ vmx->nested.apic_access_page_migrated = false;
+ }
+
+ /*
* Exiting from L2 to L1, we're now back to L1 which thinks it just
* finished a VMLAUNCH or VMRESUME instruction, so we need to set the
* success or failure flag accordingly.
@@ -8847,6 +8883,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
.update_cr8_intercept = update_cr8_intercept,
.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
+ .set_nested_apic_page_migrated = vmx_set_nested_apic_page_migrated,
.vm_has_apicv = vmx_vm_has_apicv,
.load_eoi_exitmap = vmx_load_eoi_exitmap,
.hwapic_irr_update = vmx_hwapic_irr_update,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7541a66..0c11e12 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5940,6 +5940,9 @@ static void vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
kvm_x86_ops->set_apic_access_page_addr(vcpu->kvm,
page_to_phys(vcpu->kvm->arch.apic_access_page));
+
+ if (is_guest_mode(vcpu))
+ kvm_x86_ops->set_nested_apic_page_migrated(vcpu, true);
}
/*
--
1.8.3.1
kvm_arch->ept_identity_pagetable holds the ept identity pagetable page, but
the pointer is never actually used to access the page.
In vcpu initialization, it only indicates two things:
1. whether the ept identity page has been allocated
2. whether a memory slot for the identity page has been initialized
Actually, kvm_arch->ept_identity_pagetable_done is enough to tell whether the
ept identity pagetable has been initialized, so we can remove
ept_identity_pagetable.
NOTE: In the original code, the ept identity pagetable page is pinned in
memory. As a result, it cannot be migrated or hot-removed. After this patch,
since kvm_arch->ept_identity_pagetable is removed, the ept identity pagetable
page is no longer pinned in memory, and it can be migrated or hot-removed.
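After this change, the identity pagetable is set up lazily, under
kvm->slots_lock, the first time a vcpu is created. Roughly (a simplified
sketch of the new flow in the hunks below):

    init_rmode_identity_map(kvm)
        mutex_lock(&kvm->slots_lock);
        if (kvm->arch.ept_identity_pagetable_done)
            goto out2;                      /* already initialized */
        alloc_identity_pagetable(kvm);      /* memslot only, no gfn_to_page() */
        /* ... write the identity-mapped pagetable entries ... */
        kvm->arch.ept_identity_pagetable_done = true;
        mutex_unlock(&kvm->slots_lock);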
Signed-off-by: Tang Chen <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/kvm/vmx.c | 50 ++++++++++++++++++++---------------------
arch/x86/kvm/x86.c | 2 --
3 files changed, 25 insertions(+), 28 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4931415..62f973e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -578,7 +578,6 @@ struct kvm_arch {
gpa_t wall_clock;
- struct page *ept_identity_pagetable;
bool ept_identity_pagetable_done;
gpa_t ept_identity_map_addr;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 0e1117c..b8bf47d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -741,6 +741,7 @@ static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu);
static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx);
static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx);
static bool vmx_mpx_supported(void);
+static int alloc_identity_pagetable(struct kvm *kvm);
static DEFINE_PER_CPU(struct vmcs *, vmxarea);
static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -3921,21 +3922,27 @@ out:
static int init_rmode_identity_map(struct kvm *kvm)
{
- int i, idx, r, ret;
+ int i, idx, r, ret = 0;
pfn_t identity_map_pfn;
u32 tmp;
if (!enable_ept)
return 1;
- if (unlikely(!kvm->arch.ept_identity_pagetable)) {
- printk(KERN_ERR "EPT: identity-mapping pagetable "
- "haven't been allocated!\n");
- return 0;
+
+ /* Protect kvm->arch.ept_identity_pagetable_done. */
+ mutex_lock(&kvm->slots_lock);
+
+ if (likely(kvm->arch.ept_identity_pagetable_done)) {
+ ret = 1;
+ goto out2;
}
- if (likely(kvm->arch.ept_identity_pagetable_done))
- return 1;
- ret = 0;
+
identity_map_pfn = kvm->arch.ept_identity_map_addr >> PAGE_SHIFT;
+
+ r = alloc_identity_pagetable(kvm);
+ if (r)
+ goto out2;
+
idx = srcu_read_lock(&kvm->srcu);
r = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE);
if (r < 0)
@@ -3953,6 +3960,9 @@ static int init_rmode_identity_map(struct kvm *kvm)
ret = 1;
out:
srcu_read_unlock(&kvm->srcu, idx);
+
+out2:
+ mutex_unlock(&kvm->slots_lock);
return ret;
}
@@ -4002,31 +4012,23 @@ out:
static int alloc_identity_pagetable(struct kvm *kvm)
{
- struct page *page;
+ /*
+ * In init_rmode_identity_map(), kvm->arch.ept_identity_pagetable_done
+ * is checked before this function is called and set to true after it
+ * returns. Access to kvm->arch.ept_identity_pagetable_done must be
+ * protected by kvm->slots_lock.
+ */
+
struct kvm_userspace_memory_region kvm_userspace_mem;
int r = 0;
- mutex_lock(&kvm->slots_lock);
- if (kvm->arch.ept_identity_pagetable)
- goto out;
kvm_userspace_mem.slot = IDENTITY_PAGETABLE_PRIVATE_MEMSLOT;
kvm_userspace_mem.flags = 0;
kvm_userspace_mem.guest_phys_addr =
kvm->arch.ept_identity_map_addr;
kvm_userspace_mem.memory_size = PAGE_SIZE;
r = __kvm_set_memory_region(kvm, &kvm_userspace_mem);
- if (r)
- goto out;
- page = gfn_to_page(kvm, kvm->arch.ept_identity_map_addr >> PAGE_SHIFT);
- if (is_error_page(page)) {
- r = -EFAULT;
- goto out;
- }
-
- kvm->arch.ept_identity_pagetable = page;
-out:
- mutex_unlock(&kvm->slots_lock);
return r;
}
@@ -7582,8 +7584,6 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
kvm->arch.ept_identity_map_addr =
VMX_EPT_IDENTITY_PAGETABLE_ADDR;
err = -ENOMEM;
- if (alloc_identity_pagetable(kvm) != 0)
- goto free_vmcs;
if (!init_rmode_identity_map(kvm))
goto free_vmcs;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f32a025..ffbe557 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7177,8 +7177,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
kvm_free_vcpus(kvm);
if (kvm->arch.apic_access_page)
put_page(kvm->arch.apic_access_page);
- if (kvm->arch.ept_identity_pagetable)
- put_page(kvm->arch.ept_identity_pagetable);
kfree(rcu_dereference_check(kvm->arch.apic_map, 1));
}
--
1.8.3.1
In init_rmode_identity_map(), there are two variables indicating the return
value, r and ret, and the function returns 0 on error and 1 on success. It is
only called by vmx_create_vcpu(), so r is redundant.
This patch removes the redundant variable r and makes
init_rmode_identity_map() return 0 on success and -errno on failure.
Signed-off-by: Tang Chen <[email protected]>
---
arch/x86/kvm/vmx.c | 25 +++++++++++--------------
1 file changed, 11 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index b8bf47d..6ab4f87 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3922,45 +3922,42 @@ out:
static int init_rmode_identity_map(struct kvm *kvm)
{
- int i, idx, r, ret = 0;
+ int i, idx, ret = 0;
pfn_t identity_map_pfn;
u32 tmp;
if (!enable_ept)
- return 1;
+ return 0;
/* Protect kvm->arch.ept_identity_pagetable_done. */
mutex_lock(&kvm->slots_lock);
- if (likely(kvm->arch.ept_identity_pagetable_done)) {
- ret = 1;
+ if (likely(kvm->arch.ept_identity_pagetable_done))
goto out2;
- }
identity_map_pfn = kvm->arch.ept_identity_map_addr >> PAGE_SHIFT;
- r = alloc_identity_pagetable(kvm);
- if (r)
+ ret = alloc_identity_pagetable(kvm);
+ if (ret)
goto out2;
idx = srcu_read_lock(&kvm->srcu);
- r = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE);
- if (r < 0)
+ ret = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE);
+ if (ret)
goto out;
/* Set up identity-mapping pagetable for EPT in real mode */
for (i = 0; i < PT32_ENT_PER_PAGE; i++) {
tmp = (i << 22) + (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
_PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
- r = kvm_write_guest_page(kvm, identity_map_pfn,
+ ret = kvm_write_guest_page(kvm, identity_map_pfn,
&tmp, i * sizeof(tmp), sizeof(tmp));
- if (r < 0)
+ if (ret)
goto out;
}
kvm->arch.ept_identity_pagetable_done = true;
- ret = 1;
+
out:
srcu_read_unlock(&kvm->srcu, idx);
-
out2:
mutex_unlock(&kvm->slots_lock);
return ret;
@@ -7584,7 +7581,7 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
kvm->arch.ept_identity_map_addr =
VMX_EPT_IDENTITY_PAGETABLE_ADDR;
err = -ENOMEM;
- if (!init_rmode_identity_map(kvm))
+ if (init_rmode_identity_map(kvm))
goto free_vmcs;
}
--
1.8.3.1
On 2014-07-23 21:42, Tang Chen wrote:
> This patch only handles the situation where L1 and L2 share one apic access page.
>
> When L1 is running and the shared apic access page is migrated, mmu_notifier will
> request all vcpus to exit to L0 and reload the apic access page physical address in
> all the vcpus' vmcs (which is done by patch 5/6). When L1 then enters L2, L2's vmcs
> will be updated in prepare_vmcs02(), called by nested_vmx_run(). So nothing more
> needs to be done.
>
> When L2 is running and the shared apic access page is migrated, mmu_notifier will
> request all vcpus to exit to L0 and reload the apic access page physical address in
> all L2 vmcs. This patch additionally requests an apic access page reload on the
> L2->L1 vmexit.
Shouldn't this patch come before we allow apic access page migration?
Jan
On 07/26/2014 04:44 AM, Jan Kiszka wrote:
> On 2014-07-23 21:42, Tang Chen wrote:
>> This patch only handles the situation where L1 and L2 share one apic access page.
>>
>> When L1 is running and the shared apic access page is migrated, mmu_notifier will
>> request all vcpus to exit to L0 and reload the apic access page physical address in
>> all the vcpus' vmcs (which is done by patch 5/6). When L1 then enters L2, L2's vmcs
>> will be updated in prepare_vmcs02(), called by nested_vmx_run(). So nothing more
>> needs to be done.
>>
>> When L2 is running and the shared apic access page is migrated, mmu_notifier will
>> request all vcpus to exit to L0 and reload the apic access page physical address in
>> all L2 vmcs. This patch additionally requests an apic access page reload on the
>> L2->L1 vmexit.
> Shouldn't this patch come before we allow apic access page migration?
Yes, it should come before patch 5.
Thanks.