Currently, kvm only sets new.userspace_addr in slots
that were just allocated. This is not the intended behaviour,
and actually breaks when we try to use the slots to implement
aliases, for example.
Cirrus VGA aliases maps and address to a userspace address, and
then keep mapping this same address to different locations
until the whole screen is filled.
The solution is to assign new.userspace_addr no matter what,
so we can be sure that whenever the guest changes this field,
it sees the change being reflected in the code.
Signed-off-by: Glauber Costa <[email protected]>
---
virt/kvm/kvm_main.c | 18 +++++++++---------
1 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a87f45e..fc3abf0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -762,15 +762,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
memset(new.rmap, 0, npages * sizeof(*new.rmap));
new.user_alloc = user_alloc;
- /*
- * hva_to_rmmap() serialzies with the mmu_lock and to be
- * safe it has to ignore memslots with !user_alloc &&
- * !userspace_addr.
- */
- if (user_alloc)
- new.userspace_addr = mem->userspace_addr;
- else
- new.userspace_addr = 0;
}
if (npages && !new.lpage_info) {
int largepages = npages / KVM_PAGES_PER_HPAGE;
@@ -791,6 +782,15 @@ int __kvm_set_memory_region(struct kvm *kvm,
if ((base_gfn+npages) % KVM_PAGES_PER_HPAGE)
new.lpage_info[largepages-1].write_count = 1;
}
+ /*
+ * hva_to_rmmap() serialzies with the mmu_lock and to be
+ * safe it has to ignore memslots with !user_alloc &&
+ * !userspace_addr.
+ */
+ if (npages && user_alloc)
+ new.userspace_addr = mem->userspace_addr;
+ else
+ new.userspace_addr = 0;
/* Allocate page dirty bitmap if needed */
if ((new.flags & KVM_MEM_LOG_DIRTY_PAGES) && !new.dirty_bitmap) {
--
1.5.6.5
Glauber Costa wrote:
> Currently, kvm only sets new.userspace_addr in slots
> that were just allocated. This is not the intended behaviour,
> and actually breaks when we try to use the slots to implement
> aliases, for example.
>
> Cirrus VGA aliases maps and address to a userspace address, and
> then keep mapping this same address to different locations
> until the whole screen is filled.
>
> The solution is to assign new.userspace_addr no matter what,
> so we can be sure that whenever the guest changes this field,
> it sees the change being reflected in the code.
>
> Signed-off-by: Glauber Costa <[email protected]>
>
I think this is masking a much bigger problem.
> ---
> virt/kvm/kvm_main.c | 18 +++++++++---------
> 1 files changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index a87f45e..fc3abf0 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -762,15 +762,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
> memset(new.rmap, 0, npages * sizeof(*new.rmap));
>
> new.user_alloc = user_alloc;
> - /*
> - * hva_to_rmmap() serialzies with the mmu_lock and to be
> - * safe it has to ignore memslots with !user_alloc &&
> - * !userspace_addr.
> - */
> - if (user_alloc)
> - new.userspace_addr = mem->userspace_addr;
> - else
> - new.userspace_addr = 0;
>
This is guarded in:
> if (npages && !new.rmap) {
In this case, npages > 0 but !new.rmap is already allocated. But this
is a new slot? The problem is that when we delete the slot, the rmap
never gets freed. This means that if we delete a slot, then create a
new slot which happens to be a different size, we use the old rmap and
potentially overrun that buffer.
So I think we need a fix that properly frees the rmap when the slot is
destroyed.
Regards,
Anthony Liguori
On Wed, Nov 19, 2008 at 09:55:10AM -0600, Anthony Liguori wrote:
> Glauber Costa wrote:
>> Currently, kvm only sets new.userspace_addr in slots
>> that were just allocated. This is not the intended behaviour,
>> and actually breaks when we try to use the slots to implement
>> aliases, for example.
>>
>> Cirrus VGA aliases maps and address to a userspace address, and
>> then keep mapping this same address to different locations
>> until the whole screen is filled.
>>
>> The solution is to assign new.userspace_addr no matter what,
>> so we can be sure that whenever the guest changes this field,
>> it sees the change being reflected in the code.
>>
>> Signed-off-by: Glauber Costa <[email protected]>
>>
>
> I think this is masking a much bigger problem.
>
>
>> ---
>> virt/kvm/kvm_main.c | 18 +++++++++---------
>> 1 files changed, 9 insertions(+), 9 deletions(-)
>>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index a87f45e..fc3abf0 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -762,15 +762,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
>> memset(new.rmap, 0, npages * sizeof(*new.rmap));
>> new.user_alloc = user_alloc;
>> - /*
>> - * hva_to_rmmap() serialzies with the mmu_lock and to be
>> - * safe it has to ignore memslots with !user_alloc &&
>> - * !userspace_addr.
>> - */
>> - if (user_alloc)
>> - new.userspace_addr = mem->userspace_addr;
>> - else
>> - new.userspace_addr = 0;
>>
>
> This is guarded in:
>
>> if (npages && !new.rmap) {
>
> In this case, npages > 0 but !new.rmap is already allocated. But this
> is a new slot? The problem is that when we delete the slot, the rmap
> never gets freed. This means that if we delete a slot, then create a
> new slot which happens to be a different size, we use the old rmap and
> potentially overrun that buffer.
Oh yeah, it does get freed.
The delete path ends up in a kvm_free_physmem_slot, which will effectively
vfree() the rmap structure. In fact, my userspace use case worked totally
properly when I deleted the slot prior to re-registering it.
The problem here is when there is an already existant slot, and we are
trying to change some information about it. The problem you are concerned
basically does not exist, because it would raise only if we are changing
the slot size. The code says:
/* Disallow changing a memory slot's size. */
r = -EINVAL;
if (npages && old.npages && npages != old.npages)
goto out_free;
And this seem pretty much as the expected behaviour to me. (At least, it
is a reasonable behaviour, although one could argue that it could easily
be different)
Glauber Costa wrote:
> On Wed, Nov 19, 2008 at 09:55:10AM -0600, Anthony Liguori wrote:
>
>> Glauber Costa wrote:
>>
>>> Currently, kvm only sets new.userspace_addr in slots
>>> that were just allocated. This is not the intended behaviour,
>>> and actually breaks when we try to use the slots to implement
>>> aliases, for example.
>>>
>>> Cirrus VGA aliases maps and address to a userspace address, and
>>> then keep mapping this same address to different locations
>>> until the whole screen is filled.
>>>
>>> The solution is to assign new.userspace_addr no matter what,
>>> so we can be sure that whenever the guest changes this field,
>>> it sees the change being reflected in the code.
>>>
>>> Signed-off-by: Glauber Costa <[email protected]>
>>>
>>>
>> I think this is masking a much bigger problem.
>>
>>
>>
>>> ---
>>> virt/kvm/kvm_main.c | 18 +++++++++---------
>>> 1 files changed, 9 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index a87f45e..fc3abf0 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -762,15 +762,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
>>> memset(new.rmap, 0, npages * sizeof(*new.rmap));
>>> new.user_alloc = user_alloc;
>>> - /*
>>> - * hva_to_rmmap() serialzies with the mmu_lock and to be
>>> - * safe it has to ignore memslots with !user_alloc &&
>>> - * !userspace_addr.
>>> - */
>>> - if (user_alloc)
>>> - new.userspace_addr = mem->userspace_addr;
>>> - else
>>> - new.userspace_addr = 0;
>>>
>>>
>> This is guarded in:
>>
>>
>>> if (npages && !new.rmap) {
>>>
>> In this case, npages > 0 but !new.rmap is already allocated. But this
>> is a new slot? The problem is that when we delete the slot, the rmap
>> never gets freed. This means that if we delete a slot, then create a
>> new slot which happens to be a different size, we use the old rmap and
>> potentially overrun that buffer.
>>
>
> Oh yeah, it does get freed.
>
> The delete path ends up in a kvm_free_physmem_slot, which will effectively
> vfree() the rmap structure. In fact, my userspace use case worked totally
> properly when I deleted the slot prior to re-registering it.
>
That's not how I read the code. I see:
>
> static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
> struct kvm_memory_slot *dont)
> {
> if (!dont || free->rmap != dont->rmap)
> vfree(free->rmap);
And it's called as kvm_free_physmem_slot(&old, &new);
new is assigned to old to start out with so old.rmap will equal new.rmap.
> The problem here is when there is an already existant slot, and we are
> trying to change some information about it. The problem you are concerned
> basically does not exist, because it would raise only if we are changing
> the slot size. The code says:
>
If a memory slot exists, the current code always deletes it and creates
a new one so this problem shouldn't exist. See
>
> /* unregister whole slot */
> memcpy(&slot, mem, sizeof(slot));
> mem->memory_size = 0;
> kvm_vm_ioctl(s, KVM_SET_USER_MEMORY_REGION, mem);
But the problem still exists even with this code. I checked.
So if you have something working without modifying the kernel, can you
post it?
Regards,
Anthony Liguori
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index f18b86f..66ee286 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -39,6 +39,7 @@ struct kvm_userspace_memory_region {
/* for kvm_memory_region::flags */
#define KVM_MEM_LOG_DIRTY_PAGES 1UL
+#define KVM_MEM_FREE_BEFORE_ALLOC 2UL
/* for KVM_IRQ_LINE */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4c39d4f..41f5666 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -735,24 +735,31 @@ int __kvm_set_memory_region(struct kvm *kvm,
base_gfn = mem->guest_phys_addr >> PAGE_SHIFT;
npages = mem->memory_size >> PAGE_SHIFT;
- if (!npages)
- mem->flags &= ~KVM_MEM_LOG_DIRTY_PAGES;
+
+ r = kvm_check_overlap(kvm, base_gfn, npages, memslot);
+ if (r)
+ goto out_free;
+
new = old = *memslot;
- new.base_gfn = base_gfn;
- new.npages = npages;
- new.flags = mem->flags;
+ if (!npages)
+ mem->flags &= ~KVM_MEM_LOG_DIRTY_PAGES;
- /* Disallow changing a memory slot's size. */
- r = -EINVAL;
- if (npages && old.npages && npages != old.npages)
- goto out_free;
- r = kvm_check_overlap(kvm, base_gfn, npages, memslot);
- if (r)
+ if (mem->flags & KVM_MEM_FREE_BEFORE_ALLOC)
+ kvm_free_physmem_slot(&new, NULL);
+
+ else {
+ /* Disallow changing a memory slot's size. */
+ r = -EINVAL;
+ if (npages && old.npages && npages != old.npages)
goto out_free;
+ }
+ new.base_gfn = base_gfn;
+ new.flags = mem->flags;
+ new.npages = npages;
/* Free page dirty bitmap if unneeded */
if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES))
@@ -771,6 +778,15 @@ int __kvm_set_memory_region(struct kvm *kvm,
memset(new.rmap, 0, npages * sizeof(*new.rmap));
new.user_alloc = user_alloc;
+ /*
+ * hva_to_rmmap() serialzies with the mmu_lock and to be
+ * safe it has to ignore memslots with !user_alloc &&
+ * !userspace_addr.
+ */
+ if (user_alloc)
+ new.userspace_addr = mem->userspace_addr;
+ else
+ new.userspace_addr = 0;
}
if (npages && !new.lpage_info) {
int largepages = npages / KVM_PAGES_PER_HPAGE;
@@ -791,15 +807,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
if ((base_gfn+npages) % KVM_PAGES_PER_HPAGE)
new.lpage_info[largepages-1].write_count = 1;
}
- /*
- * hva_to_rmmap() serialzies with the mmu_lock and to be
- * safe it has to ignore memslots with !user_alloc &&
- * !userspace_addr.
- */
- if (npages && user_alloc)
- new.userspace_addr = mem->userspace_addr;
- else
- new.userspace_addr = 0;
/* Allocate page dirty bitmap if needed */
if ((new.flags & KVM_MEM_LOG_DIRTY_PAGES) && !new.dirty_bitmap) {
@@ -830,7 +837,9 @@ int __kvm_set_memory_region(struct kvm *kvm,
goto out_free;
}
- kvm_free_physmem_slot(&old, &new);
+
+ if (!(mem->flags & KVM_MEM_FREE_BEFORE_ALLOC))
+ kvm_free_physmem_slot(&old, &new);
#ifdef CONFIG_DMAR
/* map the pages in iommu page table */
r = kvm_iommu_map_pages(kvm, base_gfn, npages);
@@ -840,7 +849,9 @@ int __kvm_set_memory_region(struct kvm *kvm,
return 0;
out_free:
- kvm_free_physmem_slot(&new, &old);
+
+ if (!(mem->flags & KVM_MEM_FREE_BEFORE_ALLOC))
+ kvm_free_physmem_slot(&new, &old);
out:
return r;
Glauber Costa wrote:
> On Wed, Nov 19, 2008 at 12:51:00PM -0600, Anthony Liguori wrote:
>
>> But the problem still exists even with this code. I checked.
>>
>> So if you have something working without modifying the kernel, can you
>> post it?
>>
>> Regards,
>>
>> Anthony Liguori
>>
>
> Ok, how do you feel about this one?
>
> My proposal is to always delete a memslot before reusing the space,
> but controlling this behaviour by a flag, so we can maintain backwards
> compatibility with people using older versions of the interface.
>
Is the old behavior ever correct? I think it's always wrong.
Regards,
Anthony Liguori
Anthony Liguori wrote:
>
> That's not how I read the code. I see:
>
>>
>> static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
>> struct kvm_memory_slot *dont)
>> {
>> if (!dont || free->rmap != dont->rmap)
>> vfree(free->rmap);
>
> And it's called as kvm_free_physmem_slot(&old, &new);
>
> new is assigned to old to start out with so old.rmap will equal new.rmap.
>
Hm, if !npages we should just kvm_free_physmem_slot(&old, NULL).
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
Anthony Liguori wrote:
>
> That's not how I read the code. I see:
>
>>
>> static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
>> struct kvm_memory_slot *dont)
>> {
>> if (!dont || free->rmap != dont->rmap)
>> vfree(free->rmap);
>
> And it's called as kvm_free_physmem_slot(&old, &new);
>
> new is assigned to old to start out with so old.rmap will equal new.rmap.
>
Hm, if !npages we should just kvm_free_physmem_slot(&old, NULL).
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b1953ee..f605bba 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -735,11 +735,17 @@ int __kvm_set_memory_region(struct kvm *kvm,
base_gfn = mem->guest_phys_addr >> PAGE_SHIFT;
npages = mem->memory_size >> PAGE_SHIFT;
- if (!npages)
- mem->flags &= ~KVM_MEM_LOG_DIRTY_PAGES;
-
new = old = *memslot;
+ if (!npages) {
+ mem->flags &= ~KVM_MEM_LOG_DIRTY_PAGES;
+ kvm_arch_flush_shadow(kvm);
+ kvm_free_physmem_slot(memslot, NULL);
+ kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
+ goto out;
+ }
+
+
new.base_gfn = base_gfn;
new.npages = npages;
new.flags = mem->flags;
@@ -812,9 +818,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
}
#endif /* not defined CONFIG_S390 */
- if (!npages)
- kvm_arch_flush_shadow(kvm);
-
spin_lock(&kvm->mmu_lock);
if (mem->slot >= kvm->nmemslots)
kvm->nmemslots = mem->slot + 1;
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index b1953ee..f605bba 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -735,11 +735,17 @@ int __kvm_set_memory_region(struct kvm *kvm,
> base_gfn = mem->guest_phys_addr >> PAGE_SHIFT;
> npages = mem->memory_size >> PAGE_SHIFT;
>
> - if (!npages)
> - mem->flags &= ~KVM_MEM_LOG_DIRTY_PAGES;
> -
> new = old = *memslot;
>
> + if (!npages) {
> + mem->flags &= ~KVM_MEM_LOG_DIRTY_PAGES;
> + kvm_arch_flush_shadow(kvm);
> + kvm_free_physmem_slot(memslot, NULL);
> + kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
> + goto out;
> + }
> +
> +
> new.base_gfn = base_gfn;
Any comments about this version? In the absense of it, I'll submit a
version with a SoB for inclusion.
> new.npages = npages;
> new.flags = mem->flags;
> @@ -812,9 +818,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
> }
> #endif /* not defined CONFIG_S390 */
>
> - if (!npages)
> - kvm_arch_flush_shadow(kvm);
> -
> spin_lock(&kvm->mmu_lock);
> if (mem->slot >= kvm->nmemslots)
> kvm->nmemslots = mem->slot + 1;
>
>
--
Glauber Costa.
"Free as in Freedom"
http://glommer.net
"The less confident you are, the more serious you have to act."
Glauber Costa wrote:
>> Hm, if !npages we should just kvm_free_physmem_slot(&old, NULL).
>>
> Actually, I believe we need a little bit more than that, because we can
> have valid rmaps in flight.
>
> Tell me what you think about this.
>
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index b1953ee..f605bba 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -735,11 +735,17 @@ int __kvm_set_memory_region(struct kvm *kvm,
> base_gfn = mem->guest_phys_addr >> PAGE_SHIFT;
> npages = mem->memory_size >> PAGE_SHIFT;
>
> - if (!npages)
> - mem->flags &= ~KVM_MEM_LOG_DIRTY_PAGES;
> -
> new = old = *memslot;
>
> + if (!npages) {
> + mem->flags &= ~KVM_MEM_LOG_DIRTY_PAGES;
> + kvm_arch_flush_shadow(kvm);
> + kvm_free_physmem_slot(memslot, NULL);
> + kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
> + goto out;
> + }
>
Why doesn't the kvm_free_physmem_slot() at the end of the normal path
work? I don't like adding a new path, for example you missed the iommu
call.
--
error compiling committee.c: too many arguments to function