With the 3 private slots, this gives us 512 slots total.
The motivation is to support more memory hotplug slots,
in addition to assigned devices, where each hotplugged
memory stick uses 1 slot.
This allows up to 256 hotplug memory slots and leaves
253 slots for assigned devices and other devices that
use them.
Signed-off-by: Igor Mammedov <[email protected]>
---
previously increased to 125 slots for assigned devices
by 0f888f5acd
---
arch/x86/include/asm/kvm_host.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6ed0c30..cfd60e3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -33,7 +33,7 @@
#define KVM_MAX_VCPUS 255
#define KVM_SOFT_MAX_VCPUS 160
-#define KVM_USER_MEM_SLOTS 125
+#define KVM_USER_MEM_SLOTS 509
/* memory slots that are not exposed to userspace */
#define KVM_PRIVATE_MEM_SLOTS 3
#define KVM_MEM_SLOTS_NUM (KVM_USER_MEM_SLOTS + KVM_PRIVATE_MEM_SLOTS)
--
1.8.3.1
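The slot budget in the commit message can be checked with a small standalone sketch. HOTPLUG_MEM_SLOTS and slots_left_for_devices() are hypothetical names used only for illustration; they do not appear in the patch or in the kernel.

```c
#include <assert.h>

#define KVM_USER_MEM_SLOTS 509
/* memory slots that are not exposed to userspace */
#define KVM_PRIVATE_MEM_SLOTS 3
#define KVM_MEM_SLOTS_NUM (KVM_USER_MEM_SLOTS + KVM_PRIVATE_MEM_SLOTS)

/* Hypothetical: the 256 hotplug slots the commit message budgets for. */
#define HOTPLUG_MEM_SLOTS 256

/* 509 user slots + 3 private slots = 512 slots total. */
static_assert(KVM_MEM_SLOTS_NUM == 512, "user + private slots must total 512");

/* User slots left for assigned and other devices once hotplug takes its share. */
static int slots_left_for_devices(void)
{
	return KVM_USER_MEM_SLOTS - HOTPLUG_MEM_SLOTS;
}
```

With these values slots_left_for_devices() evaluates to 253, the figure quoted in the commit message.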
On 06/11/2014 16:52, Igor Mammedov wrote:
> With the 3 private slots, this gives us 512 slots total.
> Motivation for this is in addition to assigned devices
> support more memory hotplug slots, where 1 slot is
> used by a hotplugged memory stick.
> It will allow to support up to 256 hotplug memory
> slots and leave 253 slots for assigned devices and
> other devices that use them.
>
> Signed-off-by: Igor Mammedov <[email protected]>
It would use more memory, and some loops are now becoming more
expensive. In general adding a memory slot to a VM is not cheap, and I
question the wisdom of having 256 hotplug memory slots. But the
slowdown mostly would only happen if you actually _use_ those memory
slots, so it is not a blocker for this patch.
We probably should change the kmemdup + heap sort of
__kvm_set_memory_region + update_memslots to copy the array and insert
the new item at the right place, at the same time. Using a heap sort is
overkill and unnecessarily goes from O(n^2) to O(n^2 log n). With a
bigger constant in front as well.
If you want to do it, I'd be grateful. Otherwise I can look at it as
time permits.
Paolo
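The "copy the array and insert the new item at the right place, at the same time" idea could look roughly like the sketch below. It is a simplified userspace model, not kernel code: struct slot and copy_and_insert() are assumptions made for illustration, and the array is assumed to hold exactly one entry per slot id, sorted by npages with the largest first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for struct kvm_memory_slot. */
struct slot {
	int id;
	unsigned long npages;
};

/*
 * Copy src[0..n) into dst[0..n) in a single pass, dropping the stale
 * entry for new_slot->id and placing *new_slot so that dst stays
 * sorted by npages, largest first.
 */
static void copy_and_insert(struct slot *dst, const struct slot *src,
			    size_t n, const struct slot *new_slot)
{
	size_t out = 0;
	int placed = 0;

	for (size_t in = 0; in < n; in++) {
		if (src[in].id == new_slot->id)
			continue;	/* skip the old copy of this slot */
		if (!placed && new_slot->npages > src[in].npages) {
			dst[out++] = *new_slot;
			placed = 1;
		}
		dst[out++] = src[in];
	}
	if (!placed)
		dst[out++] = *new_slot;	/* smallest so far: goes last */

	/* Holds only if src had exactly one entry with new_slot->id. */
	assert(out == n);
}
```

The copy and the re-sort are both O(n) here, so no separate sorting pass is needed after the kmemdup-style copy.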
memslots is a sorted array. When a slot changes, the
current heapsort takes O(n log n) time to update the
array, while an insertion-sort-like algorithm on an
array with 1 item out of order takes only O(n) time.
Replace the current heapsort with a custom sort that
takes advantage of the memslots usage pattern and the
known position of the changed slot.
Performance change for 128 memslot insertions with
gradually increasing size (the worst case):
      heap sort   custom sort
max:  249747      15654 cycles
min:  52536       5562 cycles
with the custom sort algorithm taking 90% less than the
original update time.
Signed-off-by: Igor Mammedov <[email protected]>
---
virt/kvm/kvm_main.c | 54 +++++++++++++++++++++++++++++------------------------
1 file changed, 30 insertions(+), 24 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 25ffac9..5fcbc45 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -668,31 +668,43 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
return 0;
}
-static int cmp_memslot(const void *slot1, const void *slot2)
+static void swap_memslot(struct kvm_memory_slot *s1, struct kvm_memory_slot *s2)
{
- struct kvm_memory_slot *s1, *s2;
+ struct kvm_memory_slot tmp;
- s1 = (struct kvm_memory_slot *)slot1;
- s2 = (struct kvm_memory_slot *)slot2;
-
- if (s1->npages < s2->npages)
- return 1;
- if (s1->npages > s2->npages)
- return -1;
-
- return 0;
+ tmp = *s2;
+ *s2 = *s1;
+ *s1 = tmp;
}
/*
- * Sort the memslots base on its size, so the larger slots
- * will get better fit.
+ * Insert memslot and re-sort memslots based on their size,
+ * so the larger slots will get better fit. Sorting algorithm
+ * takes advantage of having initially sorted array and
+ * known changed memslot position.
*/
-static void sort_memslots(struct kvm_memslots *slots)
+static void insert_memslot(struct kvm_memslots *slots,
+ struct kvm_memory_slot *new)
{
- int i;
+ int i = slots->id_to_index[new->id];
+ struct kvm_memory_slot *old = id_to_memslot(slots, new->id);
+ struct kvm_memory_slot *mslots = slots->memslots;
- sort(slots->memslots, KVM_MEM_SLOTS_NUM,
- sizeof(struct kvm_memory_slot), cmp_memslot, NULL);
+ if (new->npages == old->npages)
+ return;
+
+ *old = *new;
+ while (1) {
+ if (i < (KVM_MEM_SLOTS_NUM - 1) &&
+ mslots[i].npages < mslots[i + 1].npages) {
+ swap_memslot(&mslots[i], &mslots[i + 1]);
+ i++;
+ } else if (i > 0 && mslots[i].npages > mslots[i - 1].npages) {
+ swap_memslot(&mslots[i], &mslots[i - 1]);
+ i--;
+ } else
+ break;
+ }
for (i = 0; i < KVM_MEM_SLOTS_NUM; i++)
slots->id_to_index[slots->memslots[i].id] = i;
@@ -702,13 +714,7 @@ static void update_memslots(struct kvm_memslots *slots,
struct kvm_memory_slot *new)
{
if (new) {
- int id = new->id;
- struct kvm_memory_slot *old = id_to_memslot(slots, id);
- unsigned long npages = old->npages;
-
- *old = *new;
- if (new->npages != npages)
- sort_memslots(slots);
+ insert_memslot(slots, new);
}
}
--
1.8.3.1
On 13/11/2014 17:31, Igor Mammedov wrote:
> memslots is a sorted array, when slot changes in it
> with current heapsort it would take O(n log n) time
> to update array, while using insertion sort like
> algorithm on array with 1 item out of order will
> take only O(n) time.
>
> Replace current heapsort with custom sort that
> takes advantage of memslots usage pattern and known
> position of changed slot.
>
> performance change of 128 memslots insertions with
> gradually increasing size (the worst case):
> heap sort custom sort
> max: 249747 15654 cycles
> min: 52536 5562 cycles
> with custom sort alg taking 90% less than original
> update time.
>
> Signed-off-by: Igor Mammedov <[email protected]>
> ---
> virt/kvm/kvm_main.c | 54 +++++++++++++++++++++++++++++------------------------
> 1 file changed, 30 insertions(+), 24 deletions(-)
Nice! I think strictly speaking it's not an insertion sort because
insertion sort doesn't use swaps; it's more similar to a bubble sort.
But the code is very readable and we do not need ultimate
performance---we are just trying to avoid doing something stupid.
Reviewed-by: Paolo Bonzini <[email protected]>
Paolo
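To illustrate the distinction between a swap-based (bubble-like) and a shift-based (insertion-like) reposition of a single out-of-place element, here is a small model on a plain array sorted largest first. Both function names are hypothetical, and the returned value counts element assignments rather than measured cycles.

```c
#include <assert.h>
#include <stddef.h>

/* Bubble-style: swap the element at index i with a neighbour until it fits. */
static size_t reposition_swap(unsigned long *a, size_t n, size_t i)
{
	size_t copies = 0;

	assert(i < n);
	for (;;) {
		size_t j;

		if (i + 1 < n && a[i] < a[i + 1])
			j = i + 1;
		else if (i > 0 && a[i] > a[i - 1])
			j = i - 1;
		else
			break;

		unsigned long tmp = a[j];
		a[j] = a[i];
		a[i] = tmp;
		copies += 3;	/* one swap costs three assignments */
		i = j;
	}
	return copies;
}

/* Insertion-style: hold the value, shift neighbours over, store once. */
static size_t reposition_shift(unsigned long *a, size_t n, size_t i)
{
	unsigned long v;
	size_t copies = 1;	/* the initial read into v */

	assert(i < n);
	v = a[i];
	for (;;) {
		if (i + 1 < n && v < a[i + 1]) {
			a[i] = a[i + 1];
			i++;
		} else if (i > 0 && v > a[i - 1]) {
			a[i] = a[i - 1];
			i--;
		} else {
			a[i] = v;
			copies++;
			break;
		}
		copies++;	/* one shift costs one assignment */
	}
	return copies;
}
```

For an element that moves k positions, the swap variant performs 3k assignments and the shift variant k + 2, which matters when each element is a whole memslot structure rather than a single word.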
memslots is a sorted array. When a slot changes, the
current heapsort takes O(n log n) time to update the
array, while an insertion-sort-like algorithm on an
array with 1 item out of order takes only O(n) time.
Replace the current heapsort with a custom sort that
takes advantage of the memslots usage pattern and the
known position of the changed slot.
Performance change for 128 memslot insertions with
gradually increasing size (the worst case):
      heap sort   custom sort
max:  249747      2500 cycles
with the custom sort algorithm taking ~98% less than the
original update time.
Signed-off-by: Igor Mammedov <[email protected]>
---
v2:
- replace swap with slot shift, which improves the result 2x
- reprofile the original, swap-based and swapless versions 15 times;
discarding spikes, swap-based takes ~5900 cycles max
and swapless ~2500 cycles.
---
virt/kvm/kvm_main.c | 54 ++++++++++++++++++++++++++---------------------------
1 file changed, 26 insertions(+), 28 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 25ffac9..49f896a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -668,31 +668,35 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
return 0;
}
-static int cmp_memslot(const void *slot1, const void *slot2)
-{
- struct kvm_memory_slot *s1, *s2;
-
- s1 = (struct kvm_memory_slot *)slot1;
- s2 = (struct kvm_memory_slot *)slot2;
-
- if (s1->npages < s2->npages)
- return 1;
- if (s1->npages > s2->npages)
- return -1;
-
- return 0;
-}
-
/*
- * Sort the memslots base on its size, so the larger slots
- * will get better fit.
+ * Insert memslot and re-sort memslots based on their size,
+ * so the larger slots will get better fit. Sorting algorithm
+ * takes advantage of having initially sorted array and
+ * known changed memslot position.
*/
-static void sort_memslots(struct kvm_memslots *slots)
+static void insert_memslot(struct kvm_memslots *slots,
+ struct kvm_memory_slot *new)
{
- int i;
+ int i = slots->id_to_index[new->id];
+ struct kvm_memory_slot *old = id_to_memslot(slots, new->id);
+ struct kvm_memory_slot *mslots = slots->memslots;
+
+ if (new->npages == old->npages)
+ return;
- sort(slots->memslots, KVM_MEM_SLOTS_NUM,
- sizeof(struct kvm_memory_slot), cmp_memslot, NULL);
+ while (1) {
+ if (i < (KVM_MEM_SLOTS_NUM - 1) &&
+ new->npages < mslots[i + 1].npages) {
+ mslots[i] = mslots[i + 1];
+ i++;
+ } else if (i > 0 && new->npages > mslots[i - 1].npages) {
+ mslots[i] = mslots[i - 1];
+ i--;
+ } else {
+ mslots[i] = *new;
+ break;
+ }
+ }
for (i = 0; i < KVM_MEM_SLOTS_NUM; i++)
slots->id_to_index[slots->memslots[i].id] = i;
@@ -702,13 +706,7 @@ static void update_memslots(struct kvm_memslots *slots,
struct kvm_memory_slot *new)
{
if (new) {
- int id = new->id;
- struct kvm_memory_slot *old = id_to_memslot(slots, id);
- unsigned long npages = old->npages;
-
- *old = *new;
- if (new->npages != npages)
- sort_memslots(slots);
+ insert_memslot(slots, new);
}
}
--
1.8.3.1
On 14/11/2014 00:00, Igor Mammedov wrote:
> memslots is a sorted array, when slot changes in it
> with current heapsort it would take O(n log n) time
> to update array, while using insertion sort like
> algorithm on array with 1 item out of order will
> take only O(n) time.
>
> Replace current heapsort with custom sort that
> takes advantage of memslots usage pattern and known
> position of changed slot.
>
> performance change of 128 memslots insertions with
> gradually increasing size (the worst case):
> heap sort custom sort
> max: 249747 2500 cycles
> with custom sort alg taking ~98% less than original
> update time.
>
> Signed-off-by: Igor Mammedov <[email protected]>
> ---
> v2:
> - replace swap with slot shift, improves result 2x
> - reprofile original/swap based and swapless 15 times
> discarding spikes swap based takes ~5900 cycles max
> and swapless ~2500 cycles.
> ---
> virt/kvm/kvm_main.c | 54 ++++++++++++++++++++++++++---------------------------
> 1 file changed, 26 insertions(+), 28 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 25ffac9..49f896a 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -668,31 +668,35 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
> return 0;
> }
>
> -static int cmp_memslot(const void *slot1, const void *slot2)
> -{
> - struct kvm_memory_slot *s1, *s2;
> -
> - s1 = (struct kvm_memory_slot *)slot1;
> - s2 = (struct kvm_memory_slot *)slot2;
> -
> - if (s1->npages < s2->npages)
> - return 1;
> - if (s1->npages > s2->npages)
> - return -1;
> -
> - return 0;
> -}
> -
> /*
> - * Sort the memslots base on its size, so the larger slots
> - * will get better fit.
> + * Insert memslot and re-sort memslots based on their size,
> + * so the larger slots will get better fit. Sorting algorithm
> + * takes advantage of having initially sorted array and
> + * known changed memslot position.
> */
> -static void sort_memslots(struct kvm_memslots *slots)
> +static void insert_memslot(struct kvm_memslots *slots,
> + struct kvm_memory_slot *new)
> {
> - int i;
> + int i = slots->id_to_index[new->id];
> + struct kvm_memory_slot *old = id_to_memslot(slots, new->id);
> + struct kvm_memory_slot *mslots = slots->memslots;
> +
It is important to leave the "*old = *new" assignment here, see the
comment in __kvm_set_memory_region:
/*
* We can re-use the old_memslots from above, the only difference
* from the currently installed memslots is the invalid flag. This
* will get overwritten by update_memslots anyway.
*/
This small problem was already present in v1, but I didn't notice it
yesterday. With the new code, we can add it inside this if:
> + if (new->npages == old->npages)
> + return;
Do you agree? (No need to send v3).
Paolo
> - sort(slots->memslots, KVM_MEM_SLOTS_NUM,
> - sizeof(struct kvm_memory_slot), cmp_memslot, NULL);
> + while (1) {
> + if (i < (KVM_MEM_SLOTS_NUM - 1) &&
> + new->npages < mslots[i + 1].npages) {
> + mslots[i] = mslots[i + 1];
> + i++;
> + } else if (i > 0 && new->npages > mslots[i - 1].npages) {
> + mslots[i] = mslots[i - 1];
> + i--;
> + } else {
> + mslots[i] = *new;
> + break;
> + }
> + }
>
> for (i = 0; i < KVM_MEM_SLOTS_NUM; i++)
> slots->id_to_index[slots->memslots[i].id] = i;
> @@ -702,13 +706,7 @@ static void update_memslots(struct kvm_memslots *slots,
> struct kvm_memory_slot *new)
> {
> if (new) {
> - int id = new->id;
> - struct kvm_memory_slot *old = id_to_memslot(slots, id);
> - unsigned long npages = old->npages;
> -
> - *old = *new;
> - if (new->npages != npages)
> - sort_memslots(slots);
> + insert_memslot(slots, new);
> }
> }
>
>
On Fri, 14 Nov 2014 10:57:10 +0100
Paolo Bonzini <[email protected]> wrote:
>
>
> On 14/11/2014 00:00, Igor Mammedov wrote:
> > memslots is a sorted array, when slot changes in it
> > with current heapsort it would take O(n log n) time
> > to update array, while using insertion sort like
> > algorithm on array with 1 item out of order will
> > take only O(n) time.
> >
> > Replace current heapsort with custom sort that
> > takes advantage of memslots usage pattern and known
> > position of changed slot.
> >
> > performance change of 128 memslots insertions with
> > gradually increasing size (the worst case):
> > heap sort custom sort
> > max: 249747 2500 cycles
> > with custom sort alg taking ~98% less than original
> > update time.
> >
> > Signed-off-by: Igor Mammedov <[email protected]>
> > ---
> > v2:
> > - replace swap with slot shift, improves result 2x
> > - reprofile original/swap based and swapless 15 times
> > discarding spikes swap based takes ~5900 cycles max
> > and swapless ~2500 cycles.
> > ---
> > virt/kvm/kvm_main.c | 54 ++++++++++++++++++++++++++---------------------------
> > 1 file changed, 26 insertions(+), 28 deletions(-)
> >
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 25ffac9..49f896a 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -668,31 +668,35 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
> > return 0;
> > }
> >
> > -static int cmp_memslot(const void *slot1, const void *slot2)
> > -{
> > - struct kvm_memory_slot *s1, *s2;
> > -
> > - s1 = (struct kvm_memory_slot *)slot1;
> > - s2 = (struct kvm_memory_slot *)slot2;
> > -
> > - if (s1->npages < s2->npages)
> > - return 1;
> > - if (s1->npages > s2->npages)
> > - return -1;
> > -
> > - return 0;
> > -}
> > -
> > /*
> > - * Sort the memslots base on its size, so the larger slots
> > - * will get better fit.
> > + * Insert memslot and re-sort memslots based on their size,
> > + * so the larger slots will get better fit. Sorting algorithm
> > + * takes advantage of having initially sorted array and
> > + * known changed memslot position.
> > */
> > -static void sort_memslots(struct kvm_memslots *slots)
> > +static void insert_memslot(struct kvm_memslots *slots,
> > + struct kvm_memory_slot *new)
> > {
> > - int i;
> > + int i = slots->id_to_index[new->id];
> > + struct kvm_memory_slot *old = id_to_memslot(slots, new->id);
> > + struct kvm_memory_slot *mslots = slots->memslots;
> > +
>
> It is important to leave the "*old = *new" assignment here, see the
> comment in __kvm_set_memory_region:
>
> /*
> * We can re-use the old_memslots from above, the only difference
> * from the currently installed memslots is the invalid flag. This
> * will get overwritten by update_memslots anyway.
> */
>
> This small problem was already present in v1, but I didn't notice it
> yesterday. With the new code, we can add it inside this if:
>
> > + if (new->npages == old->npages)
> > + return;
>
> Do you agree? (No need to send v3).
Yes, the slot must be updated even if the number of pages hasn't changed,
since other fields could hold updated values.
>
> Paolo
>
> > - sort(slots->memslots, KVM_MEM_SLOTS_NUM,
> > - sizeof(struct kvm_memory_slot), cmp_memslot, NULL);
> > + while (1) {
> > + if (i < (KVM_MEM_SLOTS_NUM - 1) &&
> > + new->npages < mslots[i + 1].npages) {
> > + mslots[i] = mslots[i + 1];
> > + i++;
> > + } else if (i > 0 && new->npages > mslots[i - 1].npages) {
> > + mslots[i] = mslots[i - 1];
> > + i--;
> > + } else {
> > + mslots[i] = *new;
> > + break;
> > + }
> > + }
> >
> > for (i = 0; i < KVM_MEM_SLOTS_NUM; i++)
> > slots->id_to_index[slots->memslots[i].id] = i;
> > @@ -702,13 +706,7 @@ static void update_memslots(struct kvm_memslots *slots,
> > struct kvm_memory_slot *new)
> > {
> > if (new) {
> > - int id = new->id;
> > - struct kvm_memory_slot *old = id_to_memslot(slots, id);
> > - unsigned long npages = old->npages;
> > -
> > - *old = *new;
> > - if (new->npages != npages)
> > - sort_memslots(slots);
> > + insert_memslot(slots, new);
> > }
> > }
> >
> >
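For reference, the change agreed on in this exchange (store the new slot contents even when npages is unchanged) can be modelled in userspace as follows. This is a sketch under simplifying assumptions, not the code that was applied: NSLOTS, struct slot, struct memslots, memslots_init() and insert_slot() are hypothetical stand-ins for the kernel definitions.

```c
#include <assert.h>

#define NSLOTS 4	/* stand-in for KVM_MEM_SLOTS_NUM */

struct slot {
	int id;
	unsigned long npages;
	unsigned int flags;
};

struct memslots {
	struct slot memslots[NSLOTS];
	int id_to_index[NSLOTS];
};

/* Start with every slot empty and slot id i stored at index i. */
static void memslots_init(struct memslots *s)
{
	for (int i = 0; i < NSLOTS; i++) {
		s->memslots[i].id = i;
		s->memslots[i].npages = 0;
		s->memslots[i].flags = 0;
		s->id_to_index[i] = i;
	}
}

/* Update one slot and keep the array sorted by npages, largest first. */
static void insert_slot(struct memslots *s, const struct slot *changed)
{
	struct slot *ms = s->memslots;
	int i;

	assert(changed->id >= 0 && changed->id < NSLOTS);
	i = s->id_to_index[changed->id];

	if (changed->npages == ms[i].npages) {
		/* Size unchanged: no re-sort, but other fields may differ. */
		ms[i] = *changed;
		return;
	}

	for (;;) {
		if (i < NSLOTS - 1 && changed->npages < ms[i + 1].npages) {
			ms[i] = ms[i + 1];	/* shift the larger slot up */
			i++;
		} else if (i > 0 && changed->npages > ms[i - 1].npages) {
			ms[i] = ms[i - 1];	/* shift the smaller slot down */
			i--;
		} else {
			ms[i] = *changed;
			break;
		}
	}

	for (i = 0; i < NSLOTS; i++)
		s->id_to_index[ms[i].id] = i;
}
```

The early-return branch is the part under discussion: without the assignment there, a change that only touches flags would be silently dropped.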
On Thu, 06 Nov 2014 17:23:58 +0100
Paolo Bonzini <[email protected]> wrote:
>
>
> On 06/11/2014 16:52, Igor Mammedov wrote:
> > With the 3 private slots, this gives us 512 slots total.
> > Motivation for this is in addition to assigned devices
> > support more memory hotplug slots, where 1 slot is
> > used by a hotplugged memory stick.
> > It will allow to support up to 256 hotplug memory
> > slots and leave 253 slots for assigned devices and
> > other devices that use them.
> >
> > Signed-off-by: Igor Mammedov <[email protected]>
>
> It would use more memory, and some loops are now becoming more
> expensive. In general adding a memory slot to a VM is not cheap, and
> I question the wisdom of having 256 hotplug memory slots. But the
> slowdown mostly would only happen if you actually _use_ those memory
> slots, so it is not a blocker for this patch.
It might be useful to have a large number of slots for big guests.
Although Linux works with a minimum section size of 128MB, Windows memory
hotplug works just fine even with page-sized slots, so once unplug is
implemented in QEMU it would be possible to drop the ballooning driver,
at least there.
And given that memslots can be allocated at runtime when the guest
programs devices or maps ROMs (i.e. there is no fail path), I don't see a
way to fix it in QEMU (i.e. avoid an abort when the limit is reached).
Hence the attempt to bump the memslots limit to 512, where the current 125
are reserved for initial memory mappings and passthrough devices,
256 go to hotplug memory slots, and 128 slots are left free for
future expansion.
To see what would be affected by a large number of slots I played with
perf a bit, and the biggest hotspot with a large number of
memslots was:
gfn_to_memslot() -> ... -> search_memslots()
I'll try to make it faster for this case so that 512 memslots wouldn't
affect guest performance.
So please consider applying this patch.
>
> Paolo
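As a rough model of why the lookup path is sensitive to the slot count: assuming the lookup is a plain linear scan over the size-sorted array, its worst case grows with the number of populated slots. The sketch below is a simplification made for illustration; struct slot and find_slot() are hypothetical and do not reproduce the kernel's search_memslots().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for struct kvm_memory_slot. */
struct slot {
	unsigned long base_gfn;
	unsigned long npages;
};

/*
 * Linear scan over slots sorted by size, largest first. Large slots are
 * hit early; a gfn in a small slot, or in no slot at all, walks most of
 * the array. Empty slots (npages == 0) never match.
 */
static const struct slot *find_slot(const struct slot *slots, size_t n,
				    unsigned long gfn)
{
	assert(slots != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (gfn >= slots[i].base_gfn &&
		    gfn < slots[i].base_gfn + slots[i].npages)
			return &slots[i];
	}
	return NULL;
}
```

With hundreds of similarly sized hotplug slots, the size ordering helps less and the scan length becomes the dominant cost per lookup.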
On 14/11/2014 15:10, Igor Mammedov wrote:
> On Thu, 06 Nov 2014 17:23:58 +0100 Paolo Bonzini <[email protected]> wrote:
>> It would use more memory, and some loops are now becoming more
>> expensive. In general adding a memory slot to a VM is not cheap, and
>> I question the wisdom of having 256 hotplug memory slots. But the
>> slowdown mostly would only happen if you actually _use_ those memory
>> slots, so it is not a blocker for this patch.
> It might be useful to have a big amount of slots for big guests
> and although linux works with minimum section 128Mb but Windows memory
> hotplug works just fine even with page-sized slots so when unplug in
> QEMU is implemented it would be possible to drop ballooning driver at
> least there.
I think for a big (64G?) guest it doesn't make much sense anyway to
balloon at a granularity that is less than 1G or even more. So I like
the idea of dropping ballooning in favor of memory hotplug for big guests.
> And providing that memslots could be allocated during runtime when guest
> programs devices or maps roms (i.e. no fail path), I don't see a way
> to fix it in QEMU (i.e. avoid abort when limit is reached).
> Hence an attempt to bump memslots limit to 512, where current 125
> are reserved for initial memory mappings and passthrough devices
> 256 goes to hotplug memory slots and leaves us 128 free slots for
> future expansion.
>
> To see what would be affected by large amount of slots I played with
> perf a bit and the biggest hotspot offender with large amount of
> memslots was:
>
> gfn_to_memslot() -> ... -> search_memslots()
>
> I'll try to make it faster for this case so 512 memslots wouldn't
> affect guest performance.
>
> So please consider applying this patch.
Yes, sorry for the delay---I am definitely going to apply it.
Paolo