2019-07-12 08:25:07

by KarimAllah Ahmed

[permalink] [raw]
Subject: [PATCH] KVM: arm/arm64: Properly check for MMIO regions

Valid RAM can live outside kernel control (e.g. using "mem=" command-line
parameter). This memory can still be used as valid guest memory for KVM. So
ensure that we validate that this memory is definitely not "RAM" before
assuming that it is an MMIO region.

One way to use memory outside kernel control is:

1- Pass 'mem=' in the kernel command-line to limit the amount of memory managed
by the kernel.
2- Map this physical memory you want to give to the guest with:
mmap("/dev/mem", physical_address_offset, ..)
3- Use the user-space virtual address as the "userspace_addr" field in
KVM_SET_USER_MEMORY_REGION ioctl.

One of the limitations of the current /dev/mem for ARM is that it would map
this memory as uncached without this patch:
https://lkml.org/lkml/2019/7/11/684

This work is similar to the work done on x86 here:
https://lkml.org/lkml/2019/1/31/933

Cc: Marc Zyngier <[email protected]>
Cc: James Morse <[email protected]>
Cc: Julien Thierry <[email protected]>
Cc: Suzuki K Pouloze <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: KarimAllah Ahmed <[email protected]>
---
virt/kvm/arm/mmu.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 06180c9..2105134 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -8,6 +8,7 @@
#include <linux/kvm_host.h>
#include <linux/io.h>
#include <linux/hugetlb.h>
+#include <linux/memblock.h>
#include <linux/sched/signal.h>
#include <trace/events/kvm.h>
#include <asm/pgalloc.h>
@@ -89,7 +90,7 @@ static void kvm_flush_dcache_pud(struct kvm *kvm,

static bool kvm_is_device_pfn(unsigned long pfn)
{
- return !pfn_valid(pfn);
+ return !memblock_is_memory(__pfn_to_phys(pfn));
}

/**
@@ -949,6 +950,7 @@ static void stage2_unmap_memslot(struct kvm *kvm,
do {
struct vm_area_struct *vma = find_vma(current->mm, hva);
hva_t vm_start, vm_end;
+ gpa_t gpa;

if (!vma || vma->vm_start >= reg_end)
break;
@@ -959,11 +961,14 @@ static void stage2_unmap_memslot(struct kvm *kvm,
vm_start = max(hva, vma->vm_start);
vm_end = min(reg_end, vma->vm_end);

- if (!(vma->vm_flags & VM_PFNMAP)) {
- gpa_t gpa = addr + (vm_start - memslot->userspace_addr);
- unmap_stage2_range(kvm, gpa, vm_end - vm_start);
- }
hva = vm_end;
+
+ if ((vma->vm_flags & VM_PFNMAP) &&
+ !memblock_is_memory(__pfn_to_phys(vma->vm_pgoff)))
+ continue;
+
+ gpa = addr + (vm_start - memslot->userspace_addr);
+ unmap_stage2_range(kvm, gpa, vm_end - vm_start);
} while (hva < reg_end);
}

@@ -2329,7 +2334,8 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
vm_start = max(hva, vma->vm_start);
vm_end = min(reg_end, vma->vm_end);

- if (vma->vm_flags & VM_PFNMAP) {
+ if ((vma->vm_flags & VM_PFNMAP) &&
+ !memblock_is_memory(__pfn_to_phys(vma->vm_pgoff))) {
gpa_t gpa = mem->guest_phys_addr +
(vm_start - mem->userspace_addr);
phys_addr_t pa;
--
2.7.4


2019-07-12 16:00:43

by KarimAllah Ahmed

[permalink] [raw]
Subject: Re: [PATCH] KVM: arm/arm64: Properly check for MMIO regions

On Fri, 2019-07-12 at 10:22 +0200, KarimAllah Ahmed wrote:
> Valid RAM can live outside kernel control (e.g. using "mem=" command-line
> parameter). This memory can still be used as valid guest memory for KVM. So
> ensure that we validate that this memory is definitely not "RAM" before
> assuming that it is an MMIO region.

This patch actually suffers from the same problem pointed out here:
https://lkml.org/lkml/2019/7/12/760

.. so I will need to rework them together.

>
> One way to use memory outside kernel control is:
>
> 1- Pass 'mem=' in the kernel command-line to limit the amount of memory managed
> by the kernel.
> 2- Map this physical memory you want to give to the guest with:
> mmap("/dev/mem", physical_address_offset, ..)
> 3- Use the user-space virtual address as the "userspace_addr" field in
> KVM_SET_USER_MEMORY_REGION ioctl.
>
> One of the limitations of the current /dev/mem for ARM is that it would map
> this memory as uncached without this patch:
> https://lkml.org/lkml/2019/7/11/684
>
> This work is similar to the work done on x86 here:
> https://lkml.org/lkml/2019/1/31/933
>
> Cc: Marc Zyngier <[email protected]>
> Cc: James Morse <[email protected]>
> Cc: Julien Thierry <[email protected]>
> Cc: Suzuki K Pouloze <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: KarimAllah Ahmed <[email protected]>
> ---
> virt/kvm/arm/mmu.c | 18 ++++++++++++------
> 1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 06180c9..2105134 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -8,6 +8,7 @@
> #include <linux/kvm_host.h>
> #include <linux/io.h>
> #include <linux/hugetlb.h>
> +#include <linux/memblock.h>
> #include <linux/sched/signal.h>
> #include <trace/events/kvm.h>
> #include <asm/pgalloc.h>
> @@ -89,7 +90,7 @@ static void kvm_flush_dcache_pud(struct kvm *kvm,
>
> static bool kvm_is_device_pfn(unsigned long pfn)
> {
> - return !pfn_valid(pfn);
> + return !memblock_is_memory(__pfn_to_phys(pfn));
> }
>
> /**
> @@ -949,6 +950,7 @@ static void stage2_unmap_memslot(struct kvm *kvm,
> do {
> struct vm_area_struct *vma = find_vma(current->mm, hva);
> hva_t vm_start, vm_end;
> + gpa_t gpa;
>
> if (!vma || vma->vm_start >= reg_end)
> break;
> @@ -959,11 +961,14 @@ static void stage2_unmap_memslot(struct kvm *kvm,
> vm_start = max(hva, vma->vm_start);
> vm_end = min(reg_end, vma->vm_end);
>
> - if (!(vma->vm_flags & VM_PFNMAP)) {
> - gpa_t gpa = addr + (vm_start - memslot->userspace_addr);
> - unmap_stage2_range(kvm, gpa, vm_end - vm_start);
> - }
> hva = vm_end;
> +
> + if ((vma->vm_flags & VM_PFNMAP) &&
> + !memblock_is_memory(__pfn_to_phys(vma->vm_pgoff)))
> + continue;
> +
> + gpa = addr + (vm_start - memslot->userspace_addr);
> + unmap_stage2_range(kvm, gpa, vm_end - vm_start);
> } while (hva < reg_end);
> }
>
> @@ -2329,7 +2334,8 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
> vm_start = max(hva, vma->vm_start);
> vm_end = min(reg_end, vma->vm_end);
>
> - if (vma->vm_flags & VM_PFNMAP) {
> + if ((vma->vm_flags & VM_PFNMAP) &&
> + !memblock_is_memory(__pfn_to_phys(vma->vm_pgoff))) {
> gpa_t gpa = mem->guest_phys_addr +
> (vm_start - mem->userspace_addr);
> phys_addr_t pa;



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


2019-07-24 11:09:29

by James Morse

[permalink] [raw]
Subject: Re: [PATCH] KVM: arm/arm64: Properly check for MMIO regions

Hi KarimAllah,

On 12/07/2019 09:22, KarimAllah Ahmed wrote:
> Valid RAM can live outside kernel control (e.g. using "mem=" command-line
> parameter). This memory can still be used as valid guest memory for KVM. So
> ensure that we validate that this memory is definitely not "RAM" before
> assuming that it is an MMIO region.
>
> One way to use memory outside kernel control is:
>
> 1- Pass 'mem=' in the kernel command-line to limit the amount of memory managed
> by the kernel.

"mem=" is a debug option, we ignore it if we need something located outside the 'mem=' region.


> 2- Map this physical memory you want to give to the guest with:
> mmap("/dev/mem", physical_address_offset, ..)

/dev/mem is an egregious hack! If you need to use it, you probably didn't want an
operating-system in the first place.


> 3- Use the user-space virtual address as the "userspace_addr" field in
> KVM_SET_USER_MEMORY_REGION ioctl.


... What do you want to do this for?

At a guess: this is to save all that annoying 'memory allocation' overhead at guest
startup. If you get your VMM to use hugetlbfs, you can reserve the memory during boot. I
do this with "hugepagesz=2M hugepages=512" on the kernel command-line.

(if you get a RAS error affecting memory that the kernel doesn't know about, it will
ignore it. Using hugetlbfs instead gives you all the good things: hugepage-splitting,
signals to your VMM, stage2 unmapping etc.)


Thanks,

James