2018-02-21 19:11:09

by KarimAllah Ahmed

Subject: [PATCH 00/10] KVM/X86: Handle guest memory that does not have a struct page

For the most part, KVM can handle guest memory that does not have a struct
page (i.e. is not directly managed by the kernel). However, there are a few
places in the code, especially in the nested code, that do not support that.

Patches 1, 2, and 3 avoid the mapping and unmapping altogether and just
use kvm_read_guest() and kvm_write_guest() directly.

Patch 4 introduces a new guest mapping interface that encapsulates all the
boilerplate code needed to map and unmap guest memory. It also supports
guest memory without a "struct page".

Patches 5 through 10 switch most of the offending code in VMX and hyperv
over to the new guest mapping API.

This patch series is the first set of fixes. SVM and the APIC-access page
will be handled in a separate patch series.

KarimAllah Ahmed (10):
X86/nVMX: handle_vmon: Read 4 bytes from guest memory instead of
map->read->unmap sequence
X86/nVMX: handle_vmptrld: Copy the VMCS12 directly from guest memory
instead of map->copy->unmap sequence.
X86/nVMX: Update the PML table without mapping and unmapping the page
KVM: Introduce a new guest mapping API
KVM/nVMX: Use kvm_vcpu_map when mapping the L1 MSR bitmap
KVM/nVMX: Use kvm_vcpu_map when mapping the virtual APIC page
KVM/nVMX: Use kvm_vcpu_map when mapping the posted interrupt
descriptor table
KVM/X86: Use kvm_vcpu_map in emulator_cmpxchg_emulated
KVM/X86: hyperv: Use kvm_vcpu_map in synic_clear_sint_msg_pending
KVM/X86: hyperv: Use kvm_vcpu_map in synic_deliver_msg

arch/x86/kvm/hyperv.c | 28 ++++-----
arch/x86/kvm/vmx.c | 144 +++++++++++++++--------------------------------
arch/x86/kvm/x86.c | 13 ++---
include/linux/kvm_host.h | 15 +++++
virt/kvm/kvm_main.c | 50 ++++++++++++++++
5 files changed, 129 insertions(+), 121 deletions(-)

--
2.7.4



2018-02-21 19:10:44

by KarimAllah Ahmed

Subject: [PATCH 01/10] X86/nVMX: handle_vmon: Read 4 bytes from guest memory instead of map->read->unmap sequence

... which also avoids using kvm_vcpu_gpa_to_page() and kmap(), which assume
that there is a "struct page" for guest memory.

Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/vmx.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 19a150b..cf696e2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7562,7 +7562,7 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
{
int ret;
gpa_t vmptr;
- struct page *page;
+ uint32_t revision;
struct vcpu_vmx *vmx = to_vmx(vcpu);
const u64 VMXON_NEEDED_FEATURES = FEATURE_CONTROL_LOCKED
| FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
@@ -7608,19 +7608,11 @@ static int handle_vmon(struct kvm_vcpu *vcpu)
return kvm_skip_emulated_instruction(vcpu);
}

- page = kvm_vcpu_gpa_to_page(vcpu, vmptr);
- if (is_error_page(page)) {
+ if (kvm_read_guest(vcpu->kvm, vmptr, &revision, sizeof(revision)) ||
+ revision != VMCS12_REVISION) {
nested_vmx_failInvalid(vcpu);
return kvm_skip_emulated_instruction(vcpu);
}
- if (*(u32 *)kmap(page) != VMCS12_REVISION) {
- kunmap(page);
- kvm_release_page_clean(page);
- nested_vmx_failInvalid(vcpu);
- return kvm_skip_emulated_instruction(vcpu);
- }
- kunmap(page);
- kvm_release_page_clean(page);

vmx->nested.vmxon_ptr = vmptr;
ret = enter_vmx_operation(vcpu);
--
2.7.4


2018-02-21 19:10:47

by KarimAllah Ahmed

Subject: [PATCH 05/10] KVM/nVMX: Use kvm_vcpu_map when mapping the L1 MSR bitmap

... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
memory that has a "struct page".

The life-cycle of the mapping also changes to avoid doing map and unmap on
every single exit (which becomes very expensive once we use memremap). Now
the memory is mapped and only unmapped when a new VMCS12 is loaded into the
vCPU (or when the vCPU is freed!).

Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/vmx.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 0a98d1a..4bfef58 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -462,6 +462,9 @@ struct nested_vmx {
struct page *apic_access_page;
struct page *virtual_apic_page;
struct page *pi_desc_page;
+
+ struct kvm_host_map msr_bitmap_map;
+
struct pi_desc *pi_desc;
bool pi_pending;
u16 posted_intr_nv;
@@ -7664,6 +7667,8 @@ static inline void nested_release_vmcs12(struct vcpu_vmx *vmx)
vmx->nested.current_vmptr >> PAGE_SHIFT,
vmx->nested.cached_vmcs12, 0, VMCS12_SIZE);

+ kvm_vcpu_unmap(&vmx->nested.msr_bitmap_map);
+
vmx->nested.current_vmptr = -1ull;
}

@@ -7704,6 +7709,8 @@ static void free_nested(struct vcpu_vmx *vmx)
vmx->nested.pi_desc = NULL;
}

+ kvm_vcpu_unmap(&vmx->nested.msr_bitmap_map);
+
free_loaded_vmcs(&vmx->nested.vmcs02);
}

@@ -10374,9 +10381,10 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
struct vmcs12 *vmcs12)
{
int msr;
- struct page *page;
unsigned long *msr_bitmap_l1;
unsigned long *msr_bitmap_l0 = to_vmx(vcpu)->nested.vmcs02.msr_bitmap;
+ struct kvm_host_map *map = &to_vmx(vcpu)->nested.msr_bitmap_map;
+
/*
* pred_cmd & spec_ctrl are trying to verify two things:
*
@@ -10402,11 +10410,11 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
!pred_cmd && !spec_ctrl)
return false;

- page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->msr_bitmap);
- if (is_error_page(page))
+ if (!kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->msr_bitmap), map))
return false;

- msr_bitmap_l1 = (unsigned long *)kmap(page);
+ msr_bitmap_l1 = (unsigned long *)map->kaddr;
+
if (nested_cpu_has_apic_reg_virt(vmcs12)) {
/*
* L0 need not intercept reads for MSRs between 0x800 and 0x8ff, it
@@ -10454,9 +10462,6 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
MSR_IA32_PRED_CMD,
MSR_TYPE_W);

- kunmap(page);
- kvm_release_page_clean(page);
-
return true;
}

--
2.7.4


2018-02-21 19:10:52

by KarimAllah Ahmed

Subject: [PATCH 07/10] KVM/nVMX: Use kvm_vcpu_map when mapping the posted interrupt descriptor table

... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
memory that has a "struct page".

The life-cycle of the mapping also changes to avoid doing map and unmap on
every single exit (which becomes very expensive once we use memremap). Now
the memory is mapped and only unmapped when a new VMCS12 is loaded into the
vCPU (or when the vCPU is freed!).

Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/vmx.c | 45 +++++++++++++--------------------------------
1 file changed, 13 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a700338..7b29419 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -461,7 +461,7 @@ struct nested_vmx {
*/
struct page *apic_access_page;
struct kvm_host_map virtual_apic_map;
- struct page *pi_desc_page;
+ struct kvm_host_map pi_desc_map;
struct kvm_host_map msr_bitmap_map;

struct pi_desc *pi_desc;
@@ -7666,6 +7666,7 @@ static inline void nested_release_vmcs12(struct vcpu_vmx *vmx)
vmx->nested.cached_vmcs12, 0, VMCS12_SIZE);

kvm_vcpu_unmap(&vmx->nested.virtual_apic_map);
+ kvm_vcpu_unmap(&vmx->nested.pi_desc_map);
kvm_vcpu_unmap(&vmx->nested.msr_bitmap_map);

vmx->nested.current_vmptr = -1ull;
@@ -7698,14 +7699,9 @@ static void free_nested(struct vcpu_vmx *vmx)
vmx->nested.apic_access_page = NULL;
}
kvm_vcpu_unmap(&vmx->nested.virtual_apic_map);
- if (vmx->nested.pi_desc_page) {
- kunmap(vmx->nested.pi_desc_page);
- kvm_release_page_dirty(vmx->nested.pi_desc_page);
- vmx->nested.pi_desc_page = NULL;
- vmx->nested.pi_desc = NULL;
- }
-
+ kvm_vcpu_unmap(&vmx->nested.pi_desc_map);
kvm_vcpu_unmap(&vmx->nested.msr_bitmap_map);
+ vmx->nested.pi_desc = NULL;

free_loaded_vmcs(&vmx->nested.vmcs02);
}
@@ -10278,24 +10274,16 @@ static void nested_get_vmcs12_pages(struct kvm_vcpu *vcpu,
}

if (nested_cpu_has_posted_intr(vmcs12)) {
- if (vmx->nested.pi_desc_page) { /* shouldn't happen */
- kunmap(vmx->nested.pi_desc_page);
- kvm_release_page_dirty(vmx->nested.pi_desc_page);
- vmx->nested.pi_desc_page = NULL;
+ map = &vmx->nested.pi_desc_map;
+
+ if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->posted_intr_desc_addr), map)) {
+ vmx->nested.pi_desc =
+ (struct pi_desc *)(((void *)map->kaddr) +
+ offset_in_page(vmcs12->posted_intr_desc_addr));
+ vmcs_write64(POSTED_INTR_DESC_ADDR, pfn_to_hpa(map->pfn) +
+ offset_in_page(vmcs12->posted_intr_desc_addr));
}
- page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->posted_intr_desc_addr);
- if (is_error_page(page))
- return;
- vmx->nested.pi_desc_page = page;
- vmx->nested.pi_desc = kmap(vmx->nested.pi_desc_page);
- vmx->nested.pi_desc =
- (struct pi_desc *)((void *)vmx->nested.pi_desc +
- (unsigned long)(vmcs12->posted_intr_desc_addr &
- (PAGE_SIZE - 1)));
- vmcs_write64(POSTED_INTR_DESC_ADDR,
- page_to_phys(vmx->nested.pi_desc_page) +
- (unsigned long)(vmcs12->posted_intr_desc_addr &
- (PAGE_SIZE - 1)));
+
}
if (nested_vmx_prepare_msr_bitmap(vcpu, vmcs12))
vmcs_set_bits(CPU_BASED_VM_EXEC_CONTROL,
@@ -11893,13 +11881,6 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
kvm_release_page_dirty(vmx->nested.apic_access_page);
vmx->nested.apic_access_page = NULL;
}
- if (vmx->nested.pi_desc_page) {
- kunmap(vmx->nested.pi_desc_page);
- kvm_release_page_dirty(vmx->nested.pi_desc_page);
- vmx->nested.pi_desc_page = NULL;
- vmx->nested.pi_desc = NULL;
- }
-
/*
* We are now running in L2, mmu_notifier will force to reload the
* page's hpa for L2 vmcs. Need to reload it for L1 before entering L1.
--
2.7.4


2018-02-21 19:10:56

by KarimAllah Ahmed

Subject: [PATCH 04/10] KVM: Introduce a new guest mapping API

In KVM, especially for nested guests, there is a dominant pattern of:

=> map guest memory -> do_something -> unmap guest memory

In addition to all the unnecessary noise in the code due to boilerplate,
most of the time the mapping code does not properly handle memory that is
not backed by a "struct page". This new guest mapping API encapsulates most
of this boilerplate and also handles guest memory that is not backed by
"struct page".

Keep in mind that memremap is horribly slow, so this mapping API should not
be used for high-frequency mapping operations, but rather for low-frequency
ones.
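
For illustration only, a minimal caller-side sketch of the intended
pattern (error handling abbreviated; do_something() is a placeholder for
whatever the caller actually does with the guest page, not a real
function):

```c
/* Sketch: map a guest page whether or not it has a "struct page". */
struct kvm_host_map map;

if (!kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
	return -EFAULT;

/* do_something() stands in for the caller's work on the guest page. */
do_something(map.kaddr + offset_in_page(gpa));

/* Tears down the vmap/memremap mapping and marks the pfn dirty. */
kvm_vcpu_unmap(&map);
```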

Signed-off-by: KarimAllah Ahmed <[email protected]>
---
include/linux/kvm_host.h | 15 +++++++++++++++
virt/kvm/kvm_main.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 65 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ac0062b..6cc2c29 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -204,6 +204,13 @@ enum {
READING_SHADOW_PAGE_TABLES,
};

+struct kvm_host_map {
+ struct page *page;
+ void *kaddr;
+ kvm_pfn_t pfn;
+ gfn_t gfn;
+};
+
/*
* Sometimes a large or cross-page mmio needs to be broken up into separate
* exits for userspace servicing.
@@ -700,6 +707,9 @@ struct kvm_memslots *kvm_vcpu_memslots(struct kvm_vcpu *vcpu);
struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn);
kvm_pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn);
kvm_pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
+bool kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn,
+ struct kvm_host_map *map);
+void kvm_vcpu_unmap(struct kvm_host_map *map);
struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn);
unsigned long kvm_vcpu_gfn_to_hva(struct kvm_vcpu *vcpu, gfn_t gfn);
unsigned long kvm_vcpu_gfn_to_hva_prot(struct kvm_vcpu *vcpu, gfn_t gfn, bool *writable);
@@ -996,6 +1006,11 @@ static inline struct page *kvm_vcpu_gpa_to_page(struct kvm_vcpu *vcpu,
return kvm_vcpu_gfn_to_page(vcpu, gpa_to_gfn(gpa));
}

+static inline bool kvm_vcpu_map_valid(struct kvm_host_map *map)
+{
+ return map->kaddr != NULL;
+}
+
static inline bool kvm_is_error_gpa(struct kvm *kvm, gpa_t gpa)
{
unsigned long hva = gfn_to_hva(kvm, gpa_to_gfn(gpa));
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4501e65..54e7329 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1632,6 +1632,56 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
}
EXPORT_SYMBOL_GPL(gfn_to_page);

+bool kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map)
+{
+ kvm_pfn_t pfn;
+ void *kaddr = NULL;
+ struct page *page = NULL;
+
+ if (map->kaddr && map->gfn == gfn)
+ /* If the mapping is valid and guest memory is already mapped */
+ return true;
+ else if (map->kaddr)
+ /* If the mapping is valid but trying to map a different guest pfn */
+ kvm_vcpu_unmap(map);
+
+ pfn = kvm_vcpu_gfn_to_pfn(vcpu, gfn);
+ if (is_error_pfn(pfn))
+ return false;
+
+ if (pfn_valid(pfn)) {
+ page = pfn_to_page(pfn);
+ kaddr = vmap(&page, 1, VM_MAP, PAGE_KERNEL);
+ } else {
+ kaddr = memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
+ }
+
+ if (!kaddr)
+ return false;
+
+ map->page = page;
+ map->kaddr = kaddr;
+ map->pfn = pfn;
+ map->gfn = gfn;
+
+ return true;
+}
+EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_kaddr);
+
+void kvm_vcpu_unmap(struct kvm_host_map *map)
+{
+ if (!map->kaddr)
+ return;
+
+ if (map->page)
+ vunmap(map->kaddr);
+ else
+ memunmap(map->kaddr);
+
+ kvm_release_pfn_dirty(map->pfn);
+ memset(map, 0, sizeof(*map));
+}
+
struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn)
{
kvm_pfn_t pfn;
--
2.7.4


2018-02-21 19:11:04

by KarimAllah Ahmed

Subject: [PATCH 06/10] KVM/nVMX: Use kvm_vcpu_map when mapping the virtual APIC page

... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
memory that has a "struct page".

The life-cycle of the mapping also changes to avoid doing map and unmap on
every single exit (which becomes very expensive once we use memremap). Now
the memory is mapped and only unmapped when a new VMCS12 is loaded into the
vCPU (or when the vCPU is freed!).

Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/vmx.c | 31 +++++++++----------------------
1 file changed, 9 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 4bfef58..a700338 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -460,9 +460,8 @@ struct nested_vmx {
* pointers, so we must keep them pinned while L2 runs.
*/
struct page *apic_access_page;
- struct page *virtual_apic_page;
+ struct kvm_host_map virtual_apic_map;
struct page *pi_desc_page;
-
struct kvm_host_map msr_bitmap_map;

struct pi_desc *pi_desc;
@@ -5444,10 +5443,9 @@ static void vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)

max_irr = find_last_bit((unsigned long *)vmx->nested.pi_desc->pir, 256);
if (max_irr != 256) {
- vapic_page = kmap(vmx->nested.virtual_apic_page);
+ vapic_page = vmx->nested.virtual_apic_map.kaddr;
__kvm_apic_update_irr(vmx->nested.pi_desc->pir,
vapic_page, &max_irr);
- kunmap(vmx->nested.virtual_apic_page);

status = vmcs_read16(GUEST_INTR_STATUS);
if ((u8)max_irr > ((u8)status & 0xff)) {
@@ -7667,6 +7665,7 @@ static inline void nested_release_vmcs12(struct vcpu_vmx *vmx)
vmx->nested.current_vmptr >> PAGE_SHIFT,
vmx->nested.cached_vmcs12, 0, VMCS12_SIZE);

+ kvm_vcpu_unmap(&vmx->nested.virtual_apic_map);
kvm_vcpu_unmap(&vmx->nested.msr_bitmap_map);

vmx->nested.current_vmptr = -1ull;
@@ -7698,10 +7697,7 @@ static void free_nested(struct vcpu_vmx *vmx)
kvm_release_page_dirty(vmx->nested.apic_access_page);
vmx->nested.apic_access_page = NULL;
}
- if (vmx->nested.virtual_apic_page) {
- kvm_release_page_dirty(vmx->nested.virtual_apic_page);
- vmx->nested.virtual_apic_page = NULL;
- }
+ kvm_vcpu_unmap(&vmx->nested.virtual_apic_map);
if (vmx->nested.pi_desc_page) {
kunmap(vmx->nested.pi_desc_page);
kvm_release_page_dirty(vmx->nested.pi_desc_page);
@@ -10222,6 +10218,7 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
static void nested_get_vmcs12_pages(struct kvm_vcpu *vcpu,
struct vmcs12 *vmcs12)
{
+ struct kvm_host_map *map;
struct vcpu_vmx *vmx = to_vmx(vcpu);
struct page *page;
u64 hpa;
@@ -10260,11 +10257,7 @@ static void nested_get_vmcs12_pages(struct kvm_vcpu *vcpu,
}

if (nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW)) {
- if (vmx->nested.virtual_apic_page) { /* shouldn't happen */
- kvm_release_page_dirty(vmx->nested.virtual_apic_page);
- vmx->nested.virtual_apic_page = NULL;
- }
- page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->virtual_apic_page_addr);
+ map = &vmx->nested.virtual_apic_map;

/*
* If translation failed, VM entry will fail because
@@ -10279,11 +10272,9 @@ static void nested_get_vmcs12_pages(struct kvm_vcpu *vcpu,
* control. But such a configuration is useless, so
* let's keep the code simple.
*/
- if (!is_error_page(page)) {
- vmx->nested.virtual_apic_page = page;
- hpa = page_to_phys(vmx->nested.virtual_apic_page);
- vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, hpa);
- }
+
+ if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->virtual_apic_page_addr), map))
+ vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, pfn_to_hpa(map->pfn));
}

if (nested_cpu_has_posted_intr(vmcs12)) {
@@ -11902,10 +11893,6 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
kvm_release_page_dirty(vmx->nested.apic_access_page);
vmx->nested.apic_access_page = NULL;
}
- if (vmx->nested.virtual_apic_page) {
- kvm_release_page_dirty(vmx->nested.virtual_apic_page);
- vmx->nested.virtual_apic_page = NULL;
- }
if (vmx->nested.pi_desc_page) {
kunmap(vmx->nested.pi_desc_page);
kvm_release_page_dirty(vmx->nested.pi_desc_page);
--
2.7.4


2018-02-21 19:11:05

by KarimAllah Ahmed

Subject: [PATCH 10/10] KVM/X86: hyperv: Use kvm_vcpu_map in synic_deliver_msg

... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
memory that has a "struct page".

Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/hyperv.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 909b498..870fd24 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -559,7 +559,7 @@ static int synic_deliver_msg(struct kvm_vcpu_hv_synic *synic, u32 sint,
struct hv_message *src_msg)
{
struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
- struct page *page;
+ struct kvm_host_map map;
gpa_t gpa;
struct hv_message *dst_msg;
int r;
@@ -569,11 +569,11 @@ static int synic_deliver_msg(struct kvm_vcpu_hv_synic *synic, u32 sint,
return -ENOENT;

gpa = synic->msg_page & PAGE_MASK;
- page = kvm_vcpu_gfn_to_page(vcpu, gpa >> PAGE_SHIFT);
- if (is_error_page(page))
+
+ if (!kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
return -EFAULT;

- msg_page = kmap_atomic(page);
+ msg_page = map.kaddr;
dst_msg = &msg_page->sint_message[sint];
if (sync_cmpxchg(&dst_msg->header.message_type, HVMSG_NONE,
src_msg->header.message_type) != HVMSG_NONE) {
@@ -590,8 +590,8 @@ static int synic_deliver_msg(struct kvm_vcpu_hv_synic *synic, u32 sint,
else if (r == 0)
r = -EFAULT;
}
- kunmap_atomic(msg_page);
- kvm_release_page_dirty(page);
+
+ kvm_vcpu_unmap(&map);
kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
return r;
}
--
2.7.4


2018-02-21 19:12:04

by KarimAllah Ahmed

Subject: [PATCH 09/10] KVM/X86: hyperv: Use kvm_vcpu_map in synic_clear_sint_msg_pending

... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
memory that has a "struct page".

Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/hyperv.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index dc97f25..909b498 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -137,26 +137,22 @@ static void synic_clear_sint_msg_pending(struct kvm_vcpu_hv_synic *synic,
u32 sint)
{
struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
- struct page *page;
- gpa_t gpa;
+ struct kvm_host_map map;
struct hv_message *msg;
struct hv_message_page *msg_page;

- gpa = synic->msg_page & PAGE_MASK;
- page = kvm_vcpu_gfn_to_page(vcpu, gpa >> PAGE_SHIFT);
- if (is_error_page(page)) {
+ if (!kvm_vcpu_map(vcpu, gpa_to_gfn(synic->msg_page), &map)) {
vcpu_err(vcpu, "Hyper-V SynIC can't get msg page, gpa 0x%llx\n",
- gpa);
+ synic->msg_page);
return;
}
- msg_page = kmap_atomic(page);

+ msg_page = map.kaddr;
msg = &msg_page->sint_message[sint];
msg->header.message_flags.msg_pending = 0;

- kunmap_atomic(msg_page);
- kvm_release_page_dirty(page);
- kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
+ kvm_vcpu_unmap(&map);
+ kvm_vcpu_mark_page_dirty(vcpu, gpa_to_gfn(synic->msg_page));
}

static void kvm_hv_notify_acked_sint(struct kvm_vcpu *vcpu, u32 sint)
--
2.7.4


2018-02-21 19:12:15

by KarimAllah Ahmed

Subject: [PATCH 08/10] KVM/X86: Use kvm_vcpu_map in emulator_cmpxchg_emulated

... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
memory that has a "struct page".

Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/x86.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 37f5df9..197a395 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5013,9 +5013,9 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
unsigned int bytes,
struct x86_exception *exception)
{
+ struct kvm_host_map map;
struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
gpa_t gpa;
- struct page *page;
char *kaddr;
bool exchanged;

@@ -5032,12 +5032,11 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
if (((gpa + bytes - 1) & PAGE_MASK) != (gpa & PAGE_MASK))
goto emul_write;

- page = kvm_vcpu_gfn_to_page(vcpu, gpa >> PAGE_SHIFT);
- if (is_error_page(page))
+ if (!kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
goto emul_write;

- kaddr = kmap_atomic(page);
- kaddr += offset_in_page(gpa);
+ kaddr = map.kaddr + offset_in_page(gpa);
+
switch (bytes) {
case 1:
exchanged = CMPXCHG_TYPE(u8, kaddr, old, new);
@@ -5054,8 +5053,8 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
default:
BUG();
}
- kunmap_atomic(kaddr);
- kvm_release_page_dirty(page);
+
+ kvm_vcpu_unmap(&map);

if (!exchanged)
return X86EMUL_CMPXCHG_FAILED;
--
2.7.4


2018-02-21 19:12:58

by KarimAllah Ahmed

Subject: [PATCH 02/10] X86/nVMX: handle_vmptrld: Copy the VMCS12 directly from guest memory instead of map->copy->unmap sequence.

... which also avoids using kvm_vcpu_gpa_to_page() and kmap(), which assume
that there is a "struct page" for guest memory.

Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/vmx.c | 26 +++++++-------------------
1 file changed, 7 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index cf696e2..e5653d2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8046,30 +8046,18 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
}

if (vmx->nested.current_vmptr != vmptr) {
- struct vmcs12 *new_vmcs12;
- struct page *page;
- page = kvm_vcpu_gpa_to_page(vcpu, vmptr);
- if (is_error_page(page)) {
- nested_vmx_failInvalid(vcpu);
- return kvm_skip_emulated_instruction(vcpu);
- }
- new_vmcs12 = kmap(page);
- if (new_vmcs12->revision_id != VMCS12_REVISION) {
- kunmap(page);
- kvm_release_page_clean(page);
- nested_vmx_failValid(vcpu,
- VMXERR_VMPTRLD_INCORRECT_VMCS_REVISION_ID);
- return kvm_skip_emulated_instruction(vcpu);
- }
-
nested_release_vmcs12(vmx);
+
/*
* Load VMCS12 from guest memory since it is not already
* cached.
*/
- memcpy(vmx->nested.cached_vmcs12, new_vmcs12, VMCS12_SIZE);
- kunmap(page);
- kvm_release_page_clean(page);
+ if (kvm_read_guest(vcpu->kvm, vmptr, vmx->nested.cached_vmcs12,
+ sizeof(*vmx->nested.cached_vmcs12)) ||
+ vmx->nested.cached_vmcs12->revision_id != VMCS12_REVISION) {
+ nested_vmx_failInvalid(vcpu);
+ return kvm_skip_emulated_instruction(vcpu);
+ }

set_current_vmptr(vmx, vmptr);
}
--
2.7.4


2018-02-21 19:13:37

by KarimAllah Ahmed

Subject: [PATCH 03/10] X86/nVMX: Update the PML table without mapping and unmapping the page

... which also avoids using kvm_vcpu_gpa_to_page(..) which assumes that
there is a "struct page" for guest memory.

Signed-off-by: KarimAllah Ahmed <[email protected]>
---
arch/x86/kvm/vmx.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e5653d2..0a98d1a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -12111,9 +12111,7 @@ static int vmx_write_pml_buffer(struct kvm_vcpu *vcpu)
{
struct vmcs12 *vmcs12;
struct vcpu_vmx *vmx = to_vmx(vcpu);
- gpa_t gpa;
- struct page *page = NULL;
- u64 *pml_address;
+ gpa_t gpa, dst;

if (is_guest_mode(vcpu)) {
WARN_ON_ONCE(vmx->nested.pml_full);
@@ -12133,15 +12131,12 @@ static int vmx_write_pml_buffer(struct kvm_vcpu *vcpu)
}

gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS) & ~0xFFFull;
+ dst = (gpa_t)(((u64 *)vmcs12->pml_address) + vmcs12->guest_pml_index);

- page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->pml_address);
- if (is_error_page(page))
+ if (kvm_write_guest(vcpu->kvm, dst, &gpa, sizeof(gpa)))
return 0;

- pml_address = kmap(page);
- pml_address[vmcs12->guest_pml_index--] = gpa;
- kunmap(page);
- kvm_release_page_clean(page);
+ vmcs12->guest_pml_index--;
}

return 0;
--
2.7.4


2018-02-22 02:57:38

by KarimAllah Ahmed

Subject: Re: [PATCH 08/10] KVM/X86: Use kvm_vcpu_map in emulator_cmpxchg_emulated

On Wed, 2018-02-21 at 18:47 +0100, KarimAllah Ahmed wrote:
> ... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
> memory that has a "struct page".
>
> Signed-off-by: KarimAllah Ahmed <[email protected]>
> ---
> arch/x86/kvm/x86.c | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 37f5df9..197a395 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5013,9 +5013,9 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
> unsigned int bytes,
> struct x86_exception *exception)
> {
> + struct kvm_host_map map;

"map" here needs to be memset to '0'. Will fix in v2

> struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> gpa_t gpa;
> - struct page *page;
> char *kaddr;
> bool exchanged;
>
> @@ -5032,12 +5032,11 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
> if (((gpa + bytes - 1) & PAGE_MASK) != (gpa & PAGE_MASK))
> goto emul_write;
>
> - page = kvm_vcpu_gfn_to_page(vcpu, gpa >> PAGE_SHIFT);
> - if (is_error_page(page))
> + if (!kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
> goto emul_write;
>
> - kaddr = kmap_atomic(page);
> - kaddr += offset_in_page(gpa);
> + kaddr = map.kaddr + offset_in_page(gpa);
> +
> switch (bytes) {
> case 1:
> exchanged = CMPXCHG_TYPE(u8, kaddr, old, new);
> @@ -5054,8 +5053,8 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
> default:
> BUG();
> }
> - kunmap_atomic(kaddr);
> - kvm_release_page_dirty(page);
> +
> + kvm_vcpu_unmap(&map);
>
> if (!exchanged)
> return X86EMUL_CMPXCHG_FAILED;
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

2018-02-23 01:30:25

by kernel test robot

Subject: Re: [PATCH 04/10] KVM: Introduce a new guest mapping API

Hi KarimAllah,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/auto-latest]
[also build test ERROR on v4.16-rc2 next-20180222]
[cannot apply to kvm/linux-next]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/KarimAllah-Ahmed/KVM-X86-Handle-guest-memory-that-does-not-have-a-struct-page/20180223-064826
config: x86_64-randconfig-x012-201807 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64

All error/warnings (new ones prefixed by >>):

In file included from include/linux/linkage.h:7:0,
from include/linux/preempt.h:10,
from include/linux/hardirq.h:5,
from include/linux/kvm_host.h:10,
from arch/x86/kvm/../../../virt/kvm/kvm_main.c:21:
>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1669:19: error: 'kvm_vcpu_gfn_to_kaddr' undeclared here (not in a function); did you mean 'kvm_vcpu_gfn_to_page'?
EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_kaddr);
^
include/linux/export.h:65:16: note: in definition of macro '___EXPORT_SYMBOL'
extern typeof(sym) sym; \
^~~
>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1669:1: note: in expansion of macro 'EXPORT_SYMBOL_GPL'
EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_kaddr);
^~~~~~~~~~~~~~~~~

vim +1669 arch/x86/kvm/../../../virt/kvm/kvm_main.c

1634
1635 bool kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map)
1636 {
1637 kvm_pfn_t pfn;
1638 void *kaddr = NULL;
1639 struct page *page = NULL;
1640
1641 if (map->kaddr && map->gfn == gfn)
1642 /* If the mapping is valid and guest memory is already mapped */
1643 return true;
1644 else if (map->kaddr)
1645 /* If the mapping is valid but trying to map a different guest pfn */
1646 kvm_vcpu_unmap(map);
1647
1648 pfn = kvm_vcpu_gfn_to_pfn(vcpu, gfn);
1649 if (is_error_pfn(pfn))
1650 return false;
1651
1652 if (pfn_valid(pfn)) {
1653 page = pfn_to_page(pfn);
1654 kaddr = vmap(&page, 1, VM_MAP, PAGE_KERNEL);
1655 } else {
1656 kaddr = memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
1657 }
1658
1659 if (!kaddr)
1660 return false;
1661
1662 map->page = page;
1663 map->kaddr = kaddr;
1664 map->pfn = pfn;
1665 map->gfn = gfn;
1666
1667 return true;
1668 }
> 1669 EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_kaddr);
1670

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation



2018-02-23 01:40:26

by kernel test robot

Subject: Re: [PATCH 04/10] KVM: Introduce a new guest mapping API

Hi KarimAllah,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/auto-latest]
[also build test ERROR on v4.16-rc2 next-20180222]
[cannot apply to kvm/linux-next]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/KarimAllah-Ahmed/KVM-X86-Handle-guest-memory-that-does-not-have-a-struct-page/20180223-064826
config: mips-malta_kvm_defconfig (attached as .config)
compiler: mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=mips

All error/warnings (new ones prefixed by >>):

In file included from include/linux/linkage.h:7:0,
from include/linux/preempt.h:10,
from include/linux/hardirq.h:5,
from include/linux/kvm_host.h:10,
from arch/mips/kvm/../../../virt/kvm/kvm_main.c:21:
>> arch/mips/kvm/../../../virt/kvm/kvm_main.c:1669:19: error: 'kvm_vcpu_gfn_to_kaddr' undeclared here (not in a function); did you mean 'kvm_vcpu_gfn_to_page'?
EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_kaddr);
^
include/linux/export.h:65:16: note: in definition of macro '___EXPORT_SYMBOL'
extern typeof(sym) sym; \
^~~
>> arch/mips/kvm/../../../virt/kvm/kvm_main.c:1669:1: note: in expansion of macro 'EXPORT_SYMBOL_GPL'
EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_kaddr);
^~~~~~~~~~~~~~~~~

vim +1669 arch/mips/kvm/../../../virt/kvm/kvm_main.c

1634
1635 bool kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map)
1636 {
1637 kvm_pfn_t pfn;
1638 void *kaddr = NULL;
1639 struct page *page = NULL;
1640
1641 if (map->kaddr && map->gfn == gfn)
1642 /* If the mapping is valid and guest memory is already mapped */
1643 return true;
1644 else if (map->kaddr)
1645 /* If the mapping is valid but trying to map a different guest pfn */
1646 kvm_vcpu_unmap(map);
1647
1648 pfn = kvm_vcpu_gfn_to_pfn(vcpu, gfn);
1649 if (is_error_pfn(pfn))
1650 return false;
1651
1652 if (pfn_valid(pfn)) {
1653 page = pfn_to_page(pfn);
1654 kaddr = vmap(&page, 1, VM_MAP, PAGE_KERNEL);
1655 } else {
1656 kaddr = memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
1657 }
1658
1659 if (!kaddr)
1660 return false;
1661
1662 map->page = page;
1663 map->kaddr = kaddr;
1664 map->pfn = pfn;
1665 map->gfn = gfn;
1666
1667 return true;
1668 }
> 1669 EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_kaddr);
1670

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation



2018-02-23 02:03:44

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 03/10] X86/nVMX: Update the PML table without mapping and unmapping the page

Hi KarimAllah,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tip/auto-latest]
[also build test WARNING on v4.16-rc2 next-20180222]
[cannot apply to kvm/linux-next]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/KarimAllah-Ahmed/KVM-X86-Handle-guest-memory-that-does-not-have-a-struct-page/20180223-064826
config: i386-allmodconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

All warnings (new ones prefixed by >>):

arch/x86/kvm/vmx.c: In function 'vmx_write_pml_buffer':
>> arch/x86/kvm/vmx.c:11951:18: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
dst = (gpa_t)(((u64 *)vmcs12->pml_address) + vmcs12->guest_pml_index);
^
>> arch/x86/kvm/vmx.c:11951:9: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
dst = (gpa_t)(((u64 *)vmcs12->pml_address) + vmcs12->guest_pml_index);
^

vim +11951 arch/x86/kvm/vmx.c

11926
11927 static int vmx_write_pml_buffer(struct kvm_vcpu *vcpu)
11928 {
11929 struct vmcs12 *vmcs12;
11930 struct vcpu_vmx *vmx = to_vmx(vcpu);
11931 gpa_t gpa, dst;
11932
11933 if (is_guest_mode(vcpu)) {
11934 WARN_ON_ONCE(vmx->nested.pml_full);
11935
11936 /*
11937 * Check if PML is enabled for the nested guest.
11938 * Whether eptp bit 6 is set is already checked
11939 * as part of A/D emulation.
11940 */
11941 vmcs12 = get_vmcs12(vcpu);
11942 if (!nested_cpu_has_pml(vmcs12))
11943 return 0;
11944
11945 if (vmcs12->guest_pml_index >= PML_ENTITY_NUM) {
11946 vmx->nested.pml_full = true;
11947 return 1;
11948 }
11949
11950 gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS) & ~0xFFFull;
11951 dst = (gpa_t)(((u64 *)vmcs12->pml_address) + vmcs12->guest_pml_index);
11952
11953 if (kvm_write_guest(vcpu->kvm, dst, &gpa, sizeof(gpa)))
11954 return 0;
11955
11956 vmcs12->guest_pml_index--;
11957 }
11958
11959 return 0;
11960 }
11961

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation



2018-02-23 21:45:24

by Konrad Rzeszutek Wilk

[permalink] [raw]
Subject: Re: [PATCH 05/10] KVM/nVMX: Use kvm_vcpu_map when mapping the L1 MSR bitmap

On Wed, Feb 21, 2018 at 06:47:16PM +0100, KarimAllah Ahmed wrote:
> ... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
> memory that has a "struct page".
>
> The life-cycle of the mapping also changes to avoid doing map and unmap on

s/also changes/has been changed/ ?

2018-02-23 23:46:52

by KarimAllah Ahmed

[permalink] [raw]
Subject: Re: [PATCH 05/10] KVM/nVMX: Use kvm_vcpu_map when mapping the L1 MSR bitmap

On Fri, 2018-02-23 at 16:36 -0500, Konrad Rzeszutek Wilk wrote:
> On Wed, Feb 21, 2018 at 06:47:16PM +0100, KarimAllah Ahmed wrote:
> >
> > ... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
> > memory that has a "struct page".
> >
> > The life-cycle of the mapping also changes to avoid doing map and unmap on
>
> s/also changes/has been changed/ ?

Right, will fix in v2.

Thanks.
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

2018-02-23 23:49:45

by KarimAllah Ahmed

[permalink] [raw]
Subject: Re: [PATCH 04/10] KVM: Introduce a new guest mapping API

On Fri, 2018-02-23 at 09:37 +0800, kbuild test robot wrote:
> Hi KarimAllah,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on tip/auto-latest]
> [also build test ERROR on v4.16-rc2 next-20180222]
> [cannot apply to kvm/linux-next]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
>
> url: https://github.com/0day-ci/linux/commits/KarimAllah-Ahmed/KVM-X86-Handle-guest-memory-that-does-not-have-a-struct-page/20180223-064826
> config: mips-malta_kvm_defconfig (attached as .config)
> compiler: mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
> reproduce:
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> make.cross ARCH=mips
>
> All error/warnings (new ones prefixed by >>):
>
> In file included from include/linux/linkage.h:7:0,
> from include/linux/preempt.h:10,
> from include/linux/hardirq.h:5,
> from include/linux/kvm_host.h:10,
> from arch/mips/kvm/../../../virt/kvm/kvm_main.c:21:
> >> arch/mips/kvm/../../../virt/kvm/kvm_main.c:1669:19: error: 'kvm_vcpu_gfn_to_kaddr' undeclared here (not in a function); did you mean 'kvm_vcpu_gfn_to_page'?
> EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_kaddr);
> ^
> include/linux/export.h:65:16: note: in definition of macro '___EXPORT_SYMBOL'
> extern typeof(sym) sym; \
> ^~~
> >> arch/mips/kvm/../../../virt/kvm/kvm_main.c:1669:1: note: in expansion of macro 'EXPORT_SYMBOL_GPL'
> EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_kaddr);
> ^~~~~~~~~~~~~~~~~

Ooops! I will make sure I build KVM as a module as well before posting 
v2.

I will also drop "kvm_vcpu_map_valid" since it is no longer used.

>
> vim +1669 arch/mips/kvm/../../../virt/kvm/kvm_main.c
>
> 1634
> 1635 bool kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map)
> 1636 {
> 1637 kvm_pfn_t pfn;
> 1638 void *kaddr = NULL;
> 1639 struct page *page = NULL;
> 1640
> 1641 if (map->kaddr && map->gfn == gfn)
> 1642 /* If the mapping is valid and guest memory is already mapped */
> 1643 return true;
> 1644 else if (map->kaddr)
> 1645 /* If the mapping is valid but trying to map a different guest pfn */
> 1646 kvm_vcpu_unmap(map);
> 1647
> 1648 pfn = kvm_vcpu_gfn_to_pfn(vcpu, gfn);
> 1649 if (is_error_pfn(pfn))
> 1650 return false;
> 1651
> 1652 if (pfn_valid(pfn)) {
> 1653 page = pfn_to_page(pfn);
> 1654 kaddr = vmap(&page, 1, VM_MAP, PAGE_KERNEL);
> 1655 } else {
> 1656 kaddr = memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
> 1657 }
> 1658
> 1659 if (!kaddr)
> 1660 return false;
> 1661
> 1662 map->page = page;
> 1663 map->kaddr = kaddr;
> 1664 map->pfn = pfn;
> 1665 map->gfn = gfn;
> 1666
> 1667 return true;
> 1668 }
> > 1669 EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_kaddr);
> 1670
>
> ---
> 0-DAY kernel test infrastructure Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all Intel Corporation

2018-03-01 15:26:31

by KarimAllah Ahmed

[permalink] [raw]
Subject: Re: [PATCH 00/10] KVM/X86: Handle guest memory that does not have a struct page

Jim/Paolo/Radim,

Any complaints about the current API? (introduced in 4/10)

I have more patches on top and I would like to ensure that this is 
agreed upon at least before sending more revisions/patches.

Also, patches 1, 2, and 3 should be fairly straightforward and do not use 
this API.

Thanks.

On Wed, 2018-02-21 at 18:47 +0100, KarimAllah Ahmed wrote:
> For the most part, KVM can handle guest memory that does not have a struct
> page (i.e. not directly managed by the kernel). However, There are a few places
> in the code, specially in the nested code, that does not support that.
>
> Patch 1, 2, and 3 avoid the mapping and unmapping all together and just
> directly use kvm_guest_read and kvm_guest_write.
>
> Patch 4 introduces a new guest mapping interface that encapsulate all the
> bioler plate code that is needed to map and unmap guest memory. It also
> supports guest memory without "struct page".
>
> Patch 5, 6, 7, 8, 9, and 10 switch most of the offending code in VMX and hyperv
> to use the new guest mapping API.
>
> This patch series is the first set of fixes. Handling SVM and APIC-access page
> will be handled in a different patch series.
>
> KarimAllah Ahmed (10):
> X86/nVMX: handle_vmon: Read 4 bytes from guest memory instead of
> map->read->unmap sequence
> X86/nVMX: handle_vmptrld: Copy the VMCS12 directly from guest memory
> instead of map->copy->unmap sequence.
> X86/nVMX: Update the PML table without mapping and unmapping the page
> KVM: Introduce a new guest mapping API
> KVM/nVMX: Use kvm_vcpu_map when mapping the L1 MSR bitmap
> KVM/nVMX: Use kvm_vcpu_map when mapping the virtual APIC page
> KVM/nVMX: Use kvm_vcpu_map when mapping the posted interrupt
> descriptor table
> KVM/X86: Use kvm_vcpu_map in emulator_cmpxchg_emulated
> KVM/X86: hyperv: Use kvm_vcpu_map in synic_clear_sint_msg_pending
> KVM/X86: hyperv: Use kvm_vcpu_map in synic_deliver_msg
>
> arch/x86/kvm/hyperv.c | 28 ++++-----
> arch/x86/kvm/vmx.c | 144 +++++++++++++++--------------------------------
> arch/x86/kvm/x86.c | 13 ++---
> include/linux/kvm_host.h | 15 +++++
> virt/kvm/kvm_main.c | 50 ++++++++++++++++
> 5 files changed, 129 insertions(+), 121 deletions(-)
>

2018-03-01 17:53:10

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH 00/10] KVM/X86: Handle guest memory that does not have a struct page

No complaints here!

On Thu, Mar 1, 2018 at 7:24 AM, Raslan, KarimAllah <[email protected]> wrote:
> Jim/Paolo/Radim,
>
> Any complains about the current API? (introduced in 4/10)
>
> I have more patches on top and I would like to ensure that this is
> agreed upon at least before sending more revisions/patches.
>
> Also 1, 2, and 3 should be a bit straight forward and does not use
> this API.
>
> Thanks.
>
> On Wed, 2018-02-21 at 18:47 +0100, KarimAllah Ahmed wrote:
>> For the most part, KVM can handle guest memory that does not have a struct
>> page (i.e. not directly managed by the kernel). However, There are a few places
>> in the code, specially in the nested code, that does not support that.
>>
>> Patch 1, 2, and 3 avoid the mapping and unmapping all together and just
>> directly use kvm_guest_read and kvm_guest_write.
>>
>> Patch 4 introduces a new guest mapping interface that encapsulate all the
>> bioler plate code that is needed to map and unmap guest memory. It also
>> supports guest memory without "struct page".
>>
>> Patch 5, 6, 7, 8, 9, and 10 switch most of the offending code in VMX and hyperv
>> to use the new guest mapping API.
>>
>> This patch series is the first set of fixes. Handling SVM and APIC-access page
>> will be handled in a different patch series.
>>
>> KarimAllah Ahmed (10):
>> X86/nVMX: handle_vmon: Read 4 bytes from guest memory instead of
>> map->read->unmap sequence
>> X86/nVMX: handle_vmptrld: Copy the VMCS12 directly from guest memory
>> instead of map->copy->unmap sequence.
>> X86/nVMX: Update the PML table without mapping and unmapping the page
>> KVM: Introduce a new guest mapping API
>> KVM/nVMX: Use kvm_vcpu_map when mapping the L1 MSR bitmap
>> KVM/nVMX: Use kvm_vcpu_map when mapping the virtual APIC page
>> KVM/nVMX: Use kvm_vcpu_map when mapping the posted interrupt
>> descriptor table
>> KVM/X86: Use kvm_vcpu_map in emulator_cmpxchg_emulated
>> KVM/X86: hyperv: Use kvm_vcpu_map in synic_clear_sint_msg_pending
>> KVM/X86: hyperv: Use kvm_vcpu_map in synic_deliver_msg
>>
>> arch/x86/kvm/hyperv.c | 28 ++++-----
>> arch/x86/kvm/vmx.c | 144 +++++++++++++++--------------------------------
>> arch/x86/kvm/x86.c | 13 ++---
>> include/linux/kvm_host.h | 15 +++++
>> virt/kvm/kvm_main.c | 50 ++++++++++++++++
>> 5 files changed, 129 insertions(+), 121 deletions(-)
>>

2018-03-02 20:26:45

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 00/10] KVM/X86: Handle guest memory that does not have a struct page

On 01/03/2018 18:51, Jim Mattson wrote:
> No complaints here!

It seems sane here too, I'll review the individual patches as soon as
possible.

Thanks,

Paolo

> On Thu, Mar 1, 2018 at 7:24 AM, Raslan, KarimAllah <[email protected]> wrote:
>> Jim/Paolo/Radim,
>>
>> Any complains about the current API? (introduced in 4/10)
>>
>> I have more patches on top and I would like to ensure that this is
>> agreed upon at least before sending more revisions/patches.
>>
>> Also 1, 2, and 3 should be a bit straight forward and does not use
>> this API.
>>
>> Thanks.
>>
>> On Wed, 2018-02-21 at 18:47 +0100, KarimAllah Ahmed wrote:
>>> For the most part, KVM can handle guest memory that does not have a struct
>>> page (i.e. not directly managed by the kernel). However, There are a few places
>>> in the code, specially in the nested code, that does not support that.
>>>
>>> Patch 1, 2, and 3 avoid the mapping and unmapping all together and just
>>> directly use kvm_guest_read and kvm_guest_write.
>>>
>>> Patch 4 introduces a new guest mapping interface that encapsulate all the
>>> bioler plate code that is needed to map and unmap guest memory. It also
>>> supports guest memory without "struct page".
>>>
>>> Patch 5, 6, 7, 8, 9, and 10 switch most of the offending code in VMX and hyperv
>>> to use the new guest mapping API.
>>>
>>> This patch series is the first set of fixes. Handling SVM and APIC-access page
>>> will be handled in a different patch series.
>>>
>>> KarimAllah Ahmed (10):
>>> X86/nVMX: handle_vmon: Read 4 bytes from guest memory instead of
>>> map->read->unmap sequence
>>> X86/nVMX: handle_vmptrld: Copy the VMCS12 directly from guest memory
>>> instead of map->copy->unmap sequence.
>>> X86/nVMX: Update the PML table without mapping and unmapping the page
>>> KVM: Introduce a new guest mapping API
>>> KVM/nVMX: Use kvm_vcpu_map when mapping the L1 MSR bitmap
>>> KVM/nVMX: Use kvm_vcpu_map when mapping the virtual APIC page
>>> KVM/nVMX: Use kvm_vcpu_map when mapping the posted interrupt
>>> descriptor table
>>> KVM/X86: Use kvm_vcpu_map in emulator_cmpxchg_emulated
>>> KVM/X86: hyperv: Use kvm_vcpu_map in synic_clear_sint_msg_pending
>>> KVM/X86: hyperv: Use kvm_vcpu_map in synic_deliver_msg
>>>
>>> arch/x86/kvm/hyperv.c | 28 ++++-----
>>> arch/x86/kvm/vmx.c | 144 +++++++++++++++--------------------------------
>>> arch/x86/kvm/x86.c | 13 ++---
>>> include/linux/kvm_host.h | 15 +++++
>>> virt/kvm/kvm_main.c | 50 ++++++++++++++++
>>> 5 files changed, 129 insertions(+), 121 deletions(-)
>>>


2018-04-12 14:36:59

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 04/10] KVM: Introduce a new guest mapping API

Coming back to this (nice) series.

On 21/02/2018 18:47, KarimAllah Ahmed wrote:
> +bool kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map)
> +{
> + kvm_pfn_t pfn;
> + void *kaddr = NULL;

Can we s/kaddr/hva/ since that's the standard nomenclature?

> + struct page *page = NULL;
> +
> + if (map->kaddr && map->gfn == gfn)
> + /* If the mapping is valid and guest memory is already mapped */
> + return true;

Please return 0/-EINVAL instead. More important, this optimization is
problematic because:

1) the underlying memslots array could have changed. You'd also need to
store the generation count (see kvm_read_guest_cached for an example)

2) worse, the memslots array could have switched between the SMM and
non-SMM address spaces. This is by the way the reason why there is no
kvm_vcpu_read_guest_cached API.

However, all the places you're changing in patches 4-10 are doing
already kvm_vcpu_gpa_to_page, so I suggest just dropping this optimization.

> + else if (map->kaddr)
> + /* If the mapping is valid but trying to map a different guest pfn */
> + kvm_vcpu_unmap(map);
> +
> + pfn = kvm_vcpu_gfn_to_pfn(vcpu, gfn);

Please change the API to:

static int __kvm_map_gfn(struct kvm_memslot *memslots, gfn_t gfn,
struct kvm_host_map *map)

calling gfn_to_pfn_memslot(memslots, gfn)

int kvm_vcpu_map_gfn(struct kvm_vcpu *vcpu gfn_t gfn,
struct kvm_host_map *map)

calling kvm_vcpu_gfn_to_memslot + __kvm_map

void kvm_unmap_gfn(struct kvm_host_map *map)


> + if (is_error_pfn(pfn))
> + return false;

Should this be is_error_noslot_pfn?

> + if (pfn_valid(pfn)) {
> + page = pfn_to_page(pfn);
> + kaddr = vmap(&page, 1, VM_MAP, PAGE_KERNEL);
> + } else {
> + kaddr = memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
> + }
> +
> + if (!kaddr)
> + return false;
> +
> + map->page = page;
> + map->kaddr = kaddr;
> + map->pfn = pfn;
> + map->gfn = gfn;
> +
> + return true;
> +}

>
> +void kvm_vcpu_unmap(struct kvm_host_map *map)
> +{
> + if (!map->kaddr)
> + return;
> +
> + if (map->page)
> + kunmap(map->page);
> + else
> + memunmap(map->kaddr);
> +
> + kvm_release_pfn_dirty(map->pfn);
> + memset(map, 0, sizeof(*map));

This can clear just map->kaddr (map->hva after the above review).

Thanks,

Paolo

> +}
> +


2018-04-12 14:40:19

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 05/10] KVM/nVMX: Use kvm_vcpu_map when mapping the L1 MSR bitmap

On 21/02/2018 18:47, KarimAllah Ahmed wrote:
> ... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
> memory that has a "struct page".
>
> The life-cycle of the mapping also changes to avoid doing map and unmap on
> every single exit (which becomes very expensive once we use memremap). Now
> the memory is mapped and only unmapped when a new VMCS12 is loaded into the
> vCPU (or when the vCPU is freed!).

In this particular case SMM is not an issue because it cannot use VMX.
Therefore it's safe to ignore non-SMM address spaces. You can then
introduce

int kvm_map_gfn(struct kvm_vcpu *vcpu gfn_t gfn,
struct kvm_host_map *map)

calling kvm_gfn_to_memslot + __kvm_map_gfn

which could also handle the caching aspect. But please let's look at it
later, making the lifecycle change separate from the new API.

Paolo

> Signed-off-by: KarimAllah Ahmed <[email protected]>
> ---
> arch/x86/kvm/vmx.c | 19 ++++++++++++-------
> 1 file changed, 12 insertions(+), 7 deletions(-)


2018-04-12 14:43:21

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 06/10] KVM/nVMX: Use kvm_vcpu_map when mapping the virtual APIC page

On 21/02/2018 18:47, KarimAllah Ahmed wrote:
> +
> + if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->virtual_apic_page_addr), map))
> + vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, gfn_to_gpa(map->pfn));

This should in principle be pfn_to_hpa. However, pfn_to_hpa is unused;
let's just remove it and do "<< PAGE_SHIFT". Unlike gfn_to_gpa, where
in principle the shift could be different, pfn_to_hpa is *by definition*
a shift left by PAGE_SHIFT.

Paolo

2018-04-12 14:49:21

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 07/10] KVM/nVMX: Use kvm_vcpu_map when mapping the posted interrupt descriptor table

On 21/02/2018 18:47, KarimAllah Ahmed wrote:
> ... since using kvm_vcpu_gpa_to_page() and kmap() will only work for guest
> memory that has a "struct page".
>
> The life-cycle of the mapping also changes to avoid doing map and unmap on
> every single exit (which becomes very expensive once we use memremap). Now
> the memory is mapped and only unmapped when a new VMCS12 is loaded into the
> vCPU (or when the vCPU is freed!).
>
> Signed-off-by: KarimAllah Ahmed <[email protected]>

Same here, let's change the lifecycle separately.

Paolo

> ---
> arch/x86/kvm/vmx.c | 45 +++++++++++++--------------------------------
> 1 file changed, 13 insertions(+), 32 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index a700338..7b29419 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -461,7 +461,7 @@ struct nested_vmx {
> */
> struct page *apic_access_page;
> struct kvm_host_map virtual_apic_map;
> - struct page *pi_desc_page;
> + struct kvm_host_map pi_desc_map;
> struct kvm_host_map msr_bitmap_map;
>
> struct pi_desc *pi_desc;
> @@ -7666,6 +7666,7 @@ static inline void nested_release_vmcs12(struct vcpu_vmx *vmx)
> vmx->nested.cached_vmcs12, 0, VMCS12_SIZE);
>
> kvm_vcpu_unmap(&vmx->nested.virtual_apic_map);
> + kvm_vcpu_unmap(&vmx->nested.pi_desc_map);
> kvm_vcpu_unmap(&vmx->nested.msr_bitmap_map);
>
> vmx->nested.current_vmptr = -1ull;
> @@ -7698,14 +7699,9 @@ static void free_nested(struct vcpu_vmx *vmx)
> vmx->nested.apic_access_page = NULL;
> }
> kvm_vcpu_unmap(&vmx->nested.virtual_apic_map);
> - if (vmx->nested.pi_desc_page) {
> - kunmap(vmx->nested.pi_desc_page);
> - kvm_release_page_dirty(vmx->nested.pi_desc_page);
> - vmx->nested.pi_desc_page = NULL;
> - vmx->nested.pi_desc = NULL;
> - }
> -
> + kvm_vcpu_unmap(&vmx->nested.pi_desc_map);
> kvm_vcpu_unmap(&vmx->nested.msr_bitmap_map);
> + vmx->nested.pi_desc = NULL;
>
> free_loaded_vmcs(&vmx->nested.vmcs02);
> }
> @@ -10278,24 +10274,16 @@ static void nested_get_vmcs12_pages(struct kvm_vcpu *vcpu,
> }
>
> if (nested_cpu_has_posted_intr(vmcs12)) {
> - if (vmx->nested.pi_desc_page) { /* shouldn't happen */
> - kunmap(vmx->nested.pi_desc_page);
> - kvm_release_page_dirty(vmx->nested.pi_desc_page);
> - vmx->nested.pi_desc_page = NULL;
> + map = &vmx->nested.pi_desc_map;
> +
> + if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->posted_intr_desc_addr), map)) {
> + vmx->nested.pi_desc =
> + (struct pi_desc *)(((void *)map->kaddr) +
> + offset_in_page(vmcs12->posted_intr_desc_addr));
> + vmcs_write64(POSTED_INTR_DESC_ADDR, pfn_to_hpa(map->pfn) +
> + offset_in_page(vmcs12->posted_intr_desc_addr));
> }
> - page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->posted_intr_desc_addr);
> - if (is_error_page(page))
> - return;
> - vmx->nested.pi_desc_page = page;
> - vmx->nested.pi_desc = kmap(vmx->nested.pi_desc_page);
> - vmx->nested.pi_desc =
> - (struct pi_desc *)((void *)vmx->nested.pi_desc +
> - (unsigned long)(vmcs12->posted_intr_desc_addr &
> - (PAGE_SIZE - 1)));
> - vmcs_write64(POSTED_INTR_DESC_ADDR,
> - page_to_phys(vmx->nested.pi_desc_page) +
> - (unsigned long)(vmcs12->posted_intr_desc_addr &
> - (PAGE_SIZE - 1)));
> +
> }
> if (nested_vmx_prepare_msr_bitmap(vcpu, vmcs12))
> vmcs_set_bits(CPU_BASED_VM_EXEC_CONTROL,
> @@ -11893,13 +11881,6 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
> kvm_release_page_dirty(vmx->nested.apic_access_page);
> vmx->nested.apic_access_page = NULL;
> }
> - if (vmx->nested.pi_desc_page) {
> - kunmap(vmx->nested.pi_desc_page);
> - kvm_release_page_dirty(vmx->nested.pi_desc_page);
> - vmx->nested.pi_desc_page = NULL;
> - vmx->nested.pi_desc = NULL;
> - }
> -
> /*
> * We are now running in L2, mmu_notifier will force to reload the
> * page's hpa for L2 vmcs. Need to reload it for L1 before entering L1.
>


2018-04-12 15:04:39

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 00/10] KVM/X86: Handle guest memory that does not have a struct page

On 21/02/2018 18:47, KarimAllah Ahmed wrote:
> For the most part, KVM can handle guest memory that does not have a struct
> page (i.e. not directly managed by the kernel). However, There are a few places
> in the code, specially in the nested code, that does not support that.
>
> Patch 1, 2, and 3 avoid the mapping and unmapping all together and just
> directly use kvm_guest_read and kvm_guest_write.
>
> Patch 4 introduces a new guest mapping interface that encapsulate all the
> bioler plate code that is needed to map and unmap guest memory. It also
> supports guest memory without "struct page".
>
> Patch 5, 6, 7, 8, 9, and 10 switch most of the offending code in VMX and hyperv
> to use the new guest mapping API.
>
> This patch series is the first set of fixes. Handling SVM and APIC-access page
> will be handled in a different patch series.

I like the patches and the new API. However, I'm a bit less convinced
about the caching aspect; keeping a page pinned is not the nicest thing
with respect (for example) to memory hot-unplug.

Since you're basically reinventing kmap_high, or alternatively
(depending on your background) xc_map_foreign_pages, it's not surprising
that memremap is slow. How slow is it really (as seen e.g. with
vmexit.flat running in L1, on EC2 compared to vanilla KVM)?

Perhaps you can keep some kind of per-CPU cache of the last N remapped
pfns? This cache would sit between memremap and __kvm_map_gfn and it
would be completely transparent to the layer below since it takes raw
pfns. This removes the need to store the memslots generation etc. (If
you go this way please place it in virt/kvm/pfncache.[ch], since
kvm_main.c is already way too big).

Thanks,

Paolo

> KarimAllah Ahmed (10):
> X86/nVMX: handle_vmon: Read 4 bytes from guest memory instead of
> map->read->unmap sequence
> X86/nVMX: handle_vmptrld: Copy the VMCS12 directly from guest memory
> instead of map->copy->unmap sequence.
> X86/nVMX: Update the PML table without mapping and unmapping the page
> KVM: Introduce a new guest mapping API
> KVM/nVMX: Use kvm_vcpu_map when mapping the L1 MSR bitmap
> KVM/nVMX: Use kvm_vcpu_map when mapping the virtual APIC page
> KVM/nVMX: Use kvm_vcpu_map when mapping the posted interrupt
> descriptor table
> KVM/X86: Use kvm_vcpu_map in emulator_cmpxchg_emulated
> KVM/X86: hyperv: Use kvm_vcpu_map in synic_clear_sint_msg_pending
> KVM/X86: hyperv: Use kvm_vcpu_map in synic_deliver_msg
>
> arch/x86/kvm/hyperv.c | 28 ++++-----
> arch/x86/kvm/vmx.c | 144 +++++++++++++++--------------------------------
> arch/x86/kvm/x86.c | 13 ++---
> include/linux/kvm_host.h | 15 +++++
> virt/kvm/kvm_main.c | 50 ++++++++++++++++
> 5 files changed, 129 insertions(+), 121 deletions(-)
>


2018-04-12 15:09:14

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 03/10] X86/nVMX: Update the PML table without mapping and unmapping the page

On 21/02/2018 18:47, KarimAllah Ahmed wrote:
> + dst = (gpa_t)(((u64 *)vmcs12->pml_address) + vmcs12->guest_pml_index);

This is not a pointer, since it's in the guest. Please use

dst = vmcs12->pml_address + sizeof(u64) * vmcs12->guest_pml_index;

(It may also make sense to use kvm_write_guest_page if you prefer).

Thanks,

Paolo

> - page = kvm_vcpu_gpa_to_page(vcpu, vmcs12->pml_address);
> - if (is_error_page(page))
> + if (kvm_write_guest(vcpu->kvm, dst, &gpa, sizeof(gpa)))
> return 0;


2018-04-12 17:59:38

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 06/10] KVM/nVMX: Use kvm_vcpu_map when mapping the virtual APIC page

On Thu, Apr 12, 2018 at 04:38:39PM +0200, Paolo Bonzini wrote:
> On 21/02/2018 18:47, KarimAllah Ahmed wrote:
> > +
> > + if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->virtual_apic_page_addr), map))
> > + vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, gfn_to_gpa(map->pfn));
>
> This should in principle be pfn_to_hpa. However, pfn_to_hpa is unused;
> let's just remove it and do "<< PAGE_SHIFT". Unlike gfn_to_gpa, where
> in principle the shift could be different, pfn_to_hpa is *by definition*
> a shift left by PAGE_SHIFT.

Any reason not to use PFN_PHYS instead of the handcoded shift?

> Paolo

2018-04-12 20:25:16

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 06/10] KVM/nVMX: Use kvm_vcpu_map when mapping the virtual APIC page

On 12/04/2018 19:57, Sean Christopherson wrote:
> On Thu, Apr 12, 2018 at 04:38:39PM +0200, Paolo Bonzini wrote:
>> On 21/02/2018 18:47, KarimAllah Ahmed wrote:
>>> +
>>> + if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->virtual_apic_page_addr), map))
>>> + vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, gfn_to_gpa(map->pfn));
>>
>> This should in principle be pfn_to_hpa. However, pfn_to_hpa is unused;
>> let's just remove it and do "<< PAGE_SHIFT". Unlike gfn_to_gpa, where
>> in principle the shift could be different, pfn_to_hpa is *by definition*
>> a shift left by PAGE_SHIFT.
>
> Any reason not to use PFN_PHYS instead of the handcoded shift?

That works too of course. It's just not used much (or at all?) in KVM.

Paolo

2018-04-12 22:40:47

by KarimAllah Ahmed

[permalink] [raw]
Subject: Re: [PATCH 00/10] KVM/X86: Handle guest memory that does not have a struct page

On Thu, 2018-04-12 at 16:59 +0200, Paolo Bonzini wrote:
> On 21/02/2018 18:47, KarimAllah Ahmed wrote:
> >
> > For the most part, KVM can handle guest memory that does not have a struct
> > page (i.e. not directly managed by the kernel). However, There are a few places
> > in the code, specially in the nested code, that does not support that.
> >
> > Patch 1, 2, and 3 avoid the mapping and unmapping all together and just
> > directly use kvm_guest_read and kvm_guest_write.
> >
> > Patch 4 introduces a new guest mapping interface that encapsulate all the
> > bioler plate code that is needed to map and unmap guest memory. It also
> > supports guest memory without "struct page".
> >
> > Patch 5, 6, 7, 8, 9, and 10 switch most of the offending code in VMX and hyperv
> > to use the new guest mapping API.
> >
> > This patch series is the first set of fixes. Handling SVM and APIC-access page
> > will be handled in a different patch series.
>
> I like the patches and the new API. However, I'm a bit less convinced
> about the caching aspect; keeping a page pinned is not the nicest thing
> with respect (for example) to memory hot-unplug.
>
> Since you're basically reinventing kmap_high, or alternatively
> (depending on your background) xc_map_foreign_pages, it's not surprising
> that memremap is slow. How slow is it really (as seen e.g. with
> vmexit.flat running in L1, on EC2 compared to vanilla KVM)?

I have not actually compared EC2 vs vanilla but I compared between the 
version with cached vs no-cache (both in EC2 setup). The one that 
cached the mappings was an order of magnitude better. For example, 
booting an Ubuntu L2 guest with QEMU took around 10-13 seconds with 
this caching and it took over 5 minutes without the caching.

I will test with vanilla KVM and post the results.

>
> Perhaps you can keep some kind of per-CPU cache of the last N remapped
> pfns? This cache would sit between memremap and __kvm_map_gfn and it
> would be completely transparent to the layer below since it takes raw
> pfns. This removes the need to store the memslots generation etc. (If
> you go this way please place it in virt/kvm/pfncache.[ch], since
> kvm_main.c is already way too big).

Yup, that sounds like a good idea. I actually already implemented some 
sort of a per-CPU mapping pool in order to reduce the overhead when
the vCPU is over-committed. I will clean this up and post it as you
suggested.

>
> Thanks,
>
> Paolo
>
> >
> > KarimAllah Ahmed (10):
> > X86/nVMX: handle_vmon: Read 4 bytes from guest memory instead of
> > map->read->unmap sequence
> > X86/nVMX: handle_vmptrld: Copy the VMCS12 directly from guest memory
> > instead of map->copy->unmap sequence.
> > X86/nVMX: Update the PML table without mapping and unmapping the page
> > KVM: Introduce a new guest mapping API
> > KVM/nVMX: Use kvm_vcpu_map when mapping the L1 MSR bitmap
> > KVM/nVMX: Use kvm_vcpu_map when mapping the virtual APIC page
> > KVM/nVMX: Use kvm_vcpu_map when mapping the posted interrupt
> > descriptor table
> > KVM/X86: Use kvm_vcpu_map in emulator_cmpxchg_emulated
> > KVM/X86: hyperv: Use kvm_vcpu_map in synic_clear_sint_msg_pending
> > KVM/X86: hyperv: Use kvm_vcpu_map in synic_deliver_msg
> >
> > arch/x86/kvm/hyperv.c | 28 ++++-----
> > arch/x86/kvm/vmx.c | 144 +++++++++++++++--------------------------------
> > arch/x86/kvm/x86.c | 13 ++---
> > include/linux/kvm_host.h | 15 +++++
> > virt/kvm/kvm_main.c | 50 ++++++++++++++++
> > 5 files changed, 129 insertions(+), 121 deletions(-)
> >
>
>