2017-07-29 06:23:45

by Longpeng(Mike)

Subject: [RFC] KVM: optimize the kvm_vcpu_on_spin

We discussed the idea here:
https://www.spinics.net/lists/kvm/msg140593.html

I think it's also suitable for other architectures.

If the vcpu(me) exits because it is spinning on a usermode
spinlock, then the spinlock holder may have been preempted in
either usermode or kernel mode. But if the vcpu(me) is in kernel
mode, then the holder must have been preempted in kernel mode, so
we should choose a vcpu in kernel mode as the most eligible
candidate.

PS: I have only implemented x86 for now because I'm not familiar
with the other architectures.

Signed-off-by: Longpeng(Mike) <[email protected]>
---
arch/mips/kvm/mips.c | 5 +++++
arch/powerpc/kvm/powerpc.c | 5 +++++
arch/s390/kvm/kvm-s390.c | 5 +++++
arch/x86/kvm/x86.c | 5 +++++
include/linux/kvm_host.h | 4 ++++
virt/kvm/arm/arm.c | 5 +++++
virt/kvm/kvm_main.c | 9 ++++++++-
7 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index d4b2ad1..2e2701d 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -98,6 +98,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return !!(vcpu->arch.pending_exceptions);
}

+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
{
return 1;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 1a75c0b..2489f64 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -58,6 +58,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
return !!(v->arch.pending_exceptions) || kvm_request_pending(v);
}

+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
{
return 1;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 3f2884e..9d7c42e 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2443,6 +2443,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return kvm_s390_vcpu_has_irq(vcpu, 0);
}

+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
{
atomic_or(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 82a63c5..b5a2e53 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8435,6 +8435,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
}

+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
+{
+ return kvm_x86_ops->get_cpl(vcpu) == 0;
+}
+
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
{
return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 648b34c..f8f0d74 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -272,6 +272,9 @@ struct kvm_vcpu {
} spin_loop;
#endif
bool preempted;
+ /* If vcpu is in kernel-mode when preempted */
+ bool in_kernmode;
+
struct kvm_vcpu_arch arch;
struct dentry *debugfs_dentry;
};
@@ -797,6 +800,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
void kvm_arch_hardware_unsetup(void);
void kvm_arch_check_processor_compat(void *rtn);
int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu);
int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);

#ifndef __KVM_HAVE_ARCH_VM_ALLOC
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index a39a1e1..ca6a394 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -416,6 +416,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
&& !v->arch.power_off && !v->arch.pause);
}

+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
+{
+ return false;
+}
+
/* Just ensure a guest exit from a particular CPU */
static void exit_vm_noop(void *info)
{
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 82987d4..8d83caa 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -290,6 +290,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
kvm_vcpu_set_in_spin_loop(vcpu, false);
kvm_vcpu_set_dy_eligible(vcpu, false);
vcpu->preempted = false;
+ vcpu->in_kernmode = false;

r = kvm_arch_vcpu_init(vcpu);
if (r < 0)
@@ -2330,6 +2331,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
int pass;
int i;

+ me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
kvm_vcpu_set_in_spin_loop(me, true);
/*
* We boost the priority of a VCPU that is runnable but not
@@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
continue;
if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
continue;
+ if (me->in_kernmode && !vcpu->in_kernmode)
+ continue;
if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
continue;

@@ -4009,8 +4013,11 @@ static void kvm_sched_out(struct preempt_notifier *pn,
{
struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);

- if (current->state == TASK_RUNNING)
+ if (current->state == TASK_RUNNING) {
vcpu->preempted = true;
+ vcpu->in_kernmode = kvm_arch_vcpu_spin_kernmode(vcpu);
+ }
+
kvm_arch_vcpu_put(vcpu);
}

--
1.8.3.1



2017-07-31 11:31:52

by David Hildenbrand

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

[no idea if this change makes sense (and especially if it has any bad
side effects), do you have performance numbers? I'll just have a look at
the general structure of the patch in the meantime]

> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)

kvm_arch_vcpu_in_kernel() ?

> +{
> + return kvm_x86_ops->get_cpl(vcpu) == 0;
> +}
> +
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
> {
> return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 648b34c..f8f0d74 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -272,6 +272,9 @@ struct kvm_vcpu {
> } spin_loop;
> #endif
> bool preempted;
> + /* If vcpu is in kernel-mode when preempted */
> + bool in_kernmode;
> +

Why do you have to store that ...

[...]> + me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
> kvm_vcpu_set_in_spin_loop(me, true);
> /*
> * We boost the priority of a VCPU that is runnable but not
> @@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> continue;
> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> continue;
> + if (me->in_kernmode && !vcpu->in_kernmode)

Wouldn't it be easier to simply have

in_kernel = kvm_arch_vcpu_in_kernel(me);
...
if (in_kernel && !kvm_arch_vcpu_in_kernel(vcpu))
...

> + continue;
> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
> continue;
>
> @@ -4009,8 +4013,11 @@ static void kvm_sched_out(struct preempt_notifier *pn,
> {
> struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
>
> - if (current->state == TASK_RUNNING)
> + if (current->state == TASK_RUNNING) {
> vcpu->preempted = true;
> + vcpu->in_kernmode = kvm_arch_vcpu_spin_kernmode(vcpu);
> + }
> +

so you wouldn't have to make this change, too.

> kvm_arch_vcpu_put(vcpu);
> }
>
>
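Spelled out against your RFC, the loop might then simply read (an
untested sketch; kvm_arch_vcpu_in_kernel() is only the name suggested
above, it does not exist in the tree yet):

	bool in_kernel = kvm_arch_vcpu_in_kernel(me);

	kvm_vcpu_set_in_spin_loop(me, true);
	kvm_for_each_vcpu(i, vcpu, kvm) {
		...
		if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
			continue;
		/* a kernel-mode spinner only boosts kernel-mode vcpus */
		if (in_kernel && !kvm_arch_vcpu_in_kernel(vcpu))
			continue;
		if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
			continue;
		...
	}

No new field in struct kvm_vcpu and no change to kvm_sched_out()
would be needed then.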


--

Thanks,

David

2017-07-31 12:11:00

by Longpeng(Mike)

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

Hi David,

On 2017/7/31 19:31, David Hildenbrand wrote:

> [no idea if this change makes sense (and especially if it has any bad
> side effects), do you have performance numbers? I'll just have a look at
> the general structure of the patch in the meantime]
>

I don't have any test results yet. Could you suggest some suitable
benchmarks?

>> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
>
> kvm_arch_vcpu_in_kernel() ?
>

Um...yes, I'll correct this.

>> +{
>> + return kvm_x86_ops->get_cpl(vcpu) == 0;
>> +}
>> +
>> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>> {
>> return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 648b34c..f8f0d74 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -272,6 +272,9 @@ struct kvm_vcpu {
>> } spin_loop;
>> #endif
>> bool preempted;
>> + /* If vcpu is in kernel-mode when preempted */
>> + bool in_kernmode;
>> +
>
> Why do you have to store that ...
>

> [...]> + me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
>> kvm_vcpu_set_in_spin_loop(me, true);
>> /*
>> * We boost the priority of a VCPU that is runnable but not
>> @@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>> continue;
>> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
>> continue;
>> + if (me->in_kernmode && !vcpu->in_kernmode)
>
> Wouldn't it be easier to simply have
>
> in_kernel = kvm_arch_vcpu_in_kernel(me);
> ...
> if (in_kernel && !kvm_arch_vcpu_in_kernel(vcpu))
> ...
>

I'm not sure whether fetching the vcpu's privilege level is cheap
on all architectures, so I record it in kvm_sched_out() to minimize
the extra cycles spent in kvm_vcpu_on_spin().

>> + continue;
>> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
>> continue;
>>
>> @@ -4009,8 +4013,11 @@ static void kvm_sched_out(struct preempt_notifier *pn,
>> {
>> struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
>>
>> - if (current->state == TASK_RUNNING)
>> + if (current->state == TASK_RUNNING) {
>> vcpu->preempted = true;
>> + vcpu->in_kernmode = kvm_arch_vcpu_spin_kernmode(vcpu);
>> + }
>> +
>
> so you wouldn't have to make this change, too.
>
>> kvm_arch_vcpu_put(vcpu);
>> }
>>
>>
>
>


--
Regards,
Longpeng(Mike)

2017-07-31 12:27:24

by David Hildenbrand

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

> I'm not sure whether fetching the vcpu's privilege level is cheap
> on all architectures, so I record it in kvm_sched_out() to minimize
> the extra cycles spent in kvm_vcpu_on_spin().
>

as you only care for x86 right now either way, you can directly optimize
here for the good (here: x86) case (keeping changes and therefore
possible bugs minimal).

--

Thanks,

David

2017-07-31 12:32:03

by Cornelia Huck

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

On Mon, 31 Jul 2017 20:08:14 +0800
"Longpeng (Mike)" <[email protected]> wrote:

> Hi David,
>
> On 2017/7/31 19:31, David Hildenbrand wrote:

> >> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> >> index 648b34c..f8f0d74 100644
> >> --- a/include/linux/kvm_host.h
> >> +++ b/include/linux/kvm_host.h
> >> @@ -272,6 +272,9 @@ struct kvm_vcpu {
> >> } spin_loop;
> >> #endif
> >> bool preempted;
> >> + /* If vcpu is in kernel-mode when preempted */
> >> + bool in_kernmode;
> >> +
> >
> > Why do you have to store that ...
> >
>
> > [...]> + me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
> >> kvm_vcpu_set_in_spin_loop(me, true);
> >> /*
> >> * We boost the priority of a VCPU that is runnable but not
> >> @@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> >> continue;
> >> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> >> continue;
> >> + if (me->in_kernmode && !vcpu->in_kernmode)
> >
> > Wouldn't it be easier to simply have
> >
> > in_kernel = kvm_arch_vcpu_in_kernel(me);
> > ...
> > if (in_kernel && !kvm_arch_vcpu_in_kernel(vcpu))
> > ...
> >
>
> I'm not sure whether fetching the vcpu's privilege level is cheap
> on all architectures, so I record it in kvm_sched_out() to minimize
> the extra cycles spent in kvm_vcpu_on_spin().

As it is now, this handling looks a bit inconsistent. You only update
the field on sched-out via preemption _or_ if kvm_vcpu_on_spin is
called for the vcpu. In most contexts, this field will have stale
content.

Also, would checking for kernel mode be more expensive than the various
other checks already done in this function?

[I like David's suggestion.]

>
> >> + continue;
> >> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
> >> continue;
> >>

2017-07-31 13:04:23

by Longpeng(Mike)

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin



On 2017/7/31 20:31, Cornelia Huck wrote:

> On Mon, 31 Jul 2017 20:08:14 +0800
> "Longpeng (Mike)" <[email protected]> wrote:
>
>> Hi David,
>>
>> On 2017/7/31 19:31, David Hildenbrand wrote:
>
>>>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>>>> index 648b34c..f8f0d74 100644
>>>> --- a/include/linux/kvm_host.h
>>>> +++ b/include/linux/kvm_host.h
>>>> @@ -272,6 +272,9 @@ struct kvm_vcpu {
>>>> } spin_loop;
>>>> #endif
>>>> bool preempted;
>>>> + /* If vcpu is in kernel-mode when preempted */
>>>> + bool in_kernmode;
>>>> +
>>>
>>> Why do you have to store that ...
>>>
>>
>>> [...]> + me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
>>>> kvm_vcpu_set_in_spin_loop(me, true);
>>>> /*
>>>> * We boost the priority of a VCPU that is runnable but not
>>>> @@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>>>> continue;
>>>> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
>>>> continue;
>>>> + if (me->in_kernmode && !vcpu->in_kernmode)
>>>
>>> Wouldn't it be easier to simply have
>>>
>>> in_kernel = kvm_arch_vcpu_in_kernel(me);
>>> ...
>>> if (in_kernel && !kvm_arch_vcpu_in_kernel(vcpu))
>>> ...
>>>
>>
>> I'm not sure whether fetching the vcpu's privilege level is cheap
>> on all architectures, so I record it in kvm_sched_out() to minimize
>> the extra cycles spent in kvm_vcpu_on_spin().
>
> As it is now, this handling looks a bit inconsistent. You only update
> the field on sched-out via preemption _or_ if kvm_vcpu_on_spin is
> called for the vcpu. In most contexts, this field will have stale
> content.
>
> Also, would checking for kernel mode be more expensive than the various
> other checks already done in this function?
>

> [I like David's suggestion.]
>


Hi Cornelia & David,

I'll take your suggestion, thanks :)

>>
>>>> + continue;
>>>> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
>>>> continue;
>>>>
>
>


--
Regards,
Longpeng(Mike)

2017-07-31 13:20:20

by Paolo Bonzini

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

On 31/07/2017 14:27, David Hildenbrand wrote:
>> I'm not sure whether fetching the vcpu's privilege level is cheap
>> on all architectures, so I record it in kvm_sched_out() to minimize
>> the extra cycles spent in kvm_vcpu_on_spin().
>>
> as you only care for x86 right now either way, you can directly optimize
> here for the good (here: x86) case (keeping changes and therefore
> possible bugs minimal).

I agree with Cornelia that this is inconsistent, so you shouldn't update
me->in_kernmode in kvm_vcpu_on_spin. However, get_cpl requires
vcpu_load on Intel x86, so Mike's patch is necessary (vmx_get_cpl ->
vmx_read_guest_seg_ar -> vmcs_read32).

Alternatively, we can add a new callback kvm_x86_ops->sched_out to x86
KVM, and call vmx_get_cpl from the Intel implementation (vmx_sched_out).
This will cache the result until the next sched_in, so that
kvm_vcpu_on_spin can use it.
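Roughly like this (an untested sketch; vmx_sched_out and the
arch.preempted_in_kernel field are illustrative names, not existing
code):

	/* arch/x86/kvm/vmx.c: called on sched-out, while this vcpu's VMCS
	 * is still loaded, so the vmcs_read32() behind vmx_get_cpl() is
	 * legal and cheap here. */
	static void vmx_sched_out(struct kvm_vcpu *vcpu)
	{
		vcpu->arch.preempted_in_kernel = vmx_get_cpl(vcpu) == 0;
	}

	/* arch/x86/kvm/x86.c: kvm_vcpu_on_spin() then reads the cached
	 * value instead of touching a VMCS that is not currently loaded. */
	bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
	{
		return vcpu->arch.preempted_in_kernel;
	}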

Paolo

2017-07-31 13:23:01

by Christoffer Dall

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

On Sat, Jul 29, 2017 at 02:22:57PM +0800, Longpeng(Mike) wrote:
> We discussed the idea here:
> https://www.spinics.net/lists/kvm/msg140593.html

This is not a very nice way to start a commit description.

Please provide the necessary background to understand your change
directly in the commit message.

>
> I think it's also suitable for other architectures.
>

I think this sentence can go in the end of the commit message together
with your explanation of only doing this for x86.

By the way, the ARM solution should be pretty simple:

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index a39a1e1..b9f68e4 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -416,6 +416,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
&& !v->arch.power_off && !v->arch.pause);
}

+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return vcpu_mode_priv(vcpu);
+}
+
/* Just ensure a guest exit from a particular CPU */
static void exit_vm_noop(void *info)
{


I am also curious about the workload you used to measure this and
how I can evaluate the benefit on ARM.

Thanks,
-Christoffer

> If the vcpu(me) exits because it is spinning on a usermode
> spinlock, then the spinlock holder may have been preempted in
> either usermode or kernel mode. But if the vcpu(me) is in kernel
> mode, then the holder must have been preempted in kernel mode, so
> we should choose a vcpu in kernel mode as the most eligible
> candidate.
>
> PS: I have only implemented x86 for now because I'm not familiar
> with the other architectures.
>
> Signed-off-by: Longpeng(Mike) <[email protected]>
> ---
> arch/mips/kvm/mips.c | 5 +++++
> arch/powerpc/kvm/powerpc.c | 5 +++++
> arch/s390/kvm/kvm-s390.c | 5 +++++
> arch/x86/kvm/x86.c | 5 +++++
> include/linux/kvm_host.h | 4 ++++
> virt/kvm/arm/arm.c | 5 +++++
> virt/kvm/kvm_main.c | 9 ++++++++-
> 7 files changed, 37 insertions(+), 1 deletion(-)
>
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index d4b2ad1..2e2701d 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -98,6 +98,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> return !!(vcpu->arch.pending_exceptions);
> }
>
> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
> +{
> + return false;
> +}
> +
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
> {
> return 1;
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 1a75c0b..2489f64 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -58,6 +58,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
> return !!(v->arch.pending_exceptions) || kvm_request_pending(v);
> }
>
> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
> +{
> + return false;
> +}
> +
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
> {
> return 1;
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 3f2884e..9d7c42e 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2443,6 +2443,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> return kvm_s390_vcpu_has_irq(vcpu, 0);
> }
>
> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
> +{
> + return false;
> +}
> +
> void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
> {
> atomic_or(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 82a63c5..b5a2e53 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8435,6 +8435,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
> }
>
> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
> +{
> + return kvm_x86_ops->get_cpl(vcpu) == 0;
> +}
> +
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
> {
> return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 648b34c..f8f0d74 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -272,6 +272,9 @@ struct kvm_vcpu {
> } spin_loop;
> #endif
> bool preempted;
> + /* If vcpu is in kernel-mode when preempted */
> + bool in_kernmode;
> +
> struct kvm_vcpu_arch arch;
> struct dentry *debugfs_dentry;
> };
> @@ -797,6 +800,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> void kvm_arch_hardware_unsetup(void);
> void kvm_arch_check_processor_compat(void *rtn);
> int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu);
> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
>
> #ifndef __KVM_HAVE_ARCH_VM_ALLOC
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index a39a1e1..ca6a394 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -416,6 +416,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
> && !v->arch.power_off && !v->arch.pause);
> }
>
> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
> +{
> + return false;
> +}
> +
> /* Just ensure a guest exit from a particular CPU */
> static void exit_vm_noop(void *info)
> {
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 82987d4..8d83caa 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -290,6 +290,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
> kvm_vcpu_set_in_spin_loop(vcpu, false);
> kvm_vcpu_set_dy_eligible(vcpu, false);
> vcpu->preempted = false;
> + vcpu->in_kernmode = false;
>
> r = kvm_arch_vcpu_init(vcpu);
> if (r < 0)
> @@ -2330,6 +2331,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> int pass;
> int i;
>
> + me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
> kvm_vcpu_set_in_spin_loop(me, true);
> /*
> * We boost the priority of a VCPU that is runnable but not
> @@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> continue;
> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> continue;
> + if (me->in_kernmode && !vcpu->in_kernmode)
> + continue;
> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
> continue;
>
> @@ -4009,8 +4013,11 @@ static void kvm_sched_out(struct preempt_notifier *pn,
> {
> struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
>
> - if (current->state == TASK_RUNNING)
> + if (current->state == TASK_RUNNING) {
> vcpu->preempted = true;
> + vcpu->in_kernmode = kvm_arch_vcpu_spin_kernmode(vcpu);
> + }
> +
> kvm_arch_vcpu_put(vcpu);
> }
>
> --
> 1.8.3.1
>
>

2017-07-31 13:42:43

by Marc Zyngier

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

On 29/07/17 07:22, Longpeng(Mike) wrote:
> We discussed the idea here:
> https://www.spinics.net/lists/kvm/msg140593.html
>
> I think it's also suitable for other architectures.
>
> If the vcpu(me) exits because it is spinning on a usermode
> spinlock, then the spinlock holder may have been preempted in
> either usermode or kernel mode. But if the vcpu(me) is in kernel
> mode, then the holder must have been preempted in kernel mode, so
> we should choose a vcpu in kernel mode as the most eligible
> candidate.

That seems to preclude any form of locking between userspace and kernel
(which probably wouldn't be Linux). Are you sure that this form of
construct is not used anywhere? I have the feeling this patch could
break this scenario...

Thanks,

M.
--
Jazz is not dead. It just smells funny...

2017-07-31 14:24:02

by Paolo Bonzini

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

On 31/07/2017 15:42, Marc Zyngier wrote:
>> If the vcpu(me) exits because it is spinning on a usermode
>> spinlock, then the spinlock holder may have been preempted in
>> either usermode or kernel mode. But if the vcpu(me) is in kernel
>> mode, then the holder must have been preempted in kernel mode, so
>> we should choose a vcpu in kernel mode as the most eligible
>> candidate.
>
> That seems to preclude any form of locking between userspace and kernel
> (which probably wouldn't be Linux). Are you sure that this form of
> construct is not used anywhere? I have the feeling this patch could
> break this scenario...

It's just a heuristic; it would only be broken if you overcommit, and it
would be just as broken as if KVM didn't implement directed yield at all.

Paolo

2017-07-31 17:32:51

by David Hildenbrand

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

On 31.07.2017 15:22, Christoffer Dall wrote:
> On Sat, Jul 29, 2017 at 02:22:57PM +0800, Longpeng(Mike) wrote:
>> We discussed the idea here:
>> https://www.spinics.net/lists/kvm/msg140593.html
>
> This is not a very nice way to start a commit description.
>
> Please provide the necessary background to understand your change
> directly in the commit message.
>
>>
>> I think it's also suitable for other architectures.
>>
>
> I think this sentence can go in the end of the commit message together
> with your explanation of only doing this for x86.
>
> By the way, the ARM solution should be pretty simple:
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index a39a1e1..b9f68e4 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -416,6 +416,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
> && !v->arch.power_off && !v->arch.pause);
> }
>
> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> +{
> + return vcpu_mode_priv(vcpu);
> +}
> +
> /* Just ensure a guest exit from a particular CPU */
> static void exit_vm_noop(void *info)
> {
>
>


This one should work for s390x, no caching (or special access patterns
like on x86) needed:

+++ b/arch/s390/kvm/kvm-s390.c
@@ -2447,6 +2447,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
return kvm_s390_vcpu_has_irq(vcpu, 0);
}

+bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
+{
+ return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE);
+}
+
void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
{
atomic_or(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);



--

Thanks,

David

2017-08-01 02:27:30

by Longpeng(Mike)

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin



On 2017/7/31 21:22, Christoffer Dall wrote:

> On Sat, Jul 29, 2017 at 02:22:57PM +0800, Longpeng(Mike) wrote:
>> We discussed the idea here:
>> https://www.spinics.net/lists/kvm/msg140593.html
>
> This is not a very nice way to start a commit description.
>
> Please provide the necessary background to understand your change
> directly in the commit message.
>
>>
>> I think it's also suitable for other architectures.
>>
>
> I think this sentence can go in the end of the commit message together
> with your explanation of only doing this for x86.
>


OK :)

> By the way, the ARM solution should be pretty simple:
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index a39a1e1..b9f68e4 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -416,6 +416,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
> && !v->arch.power_off && !v->arch.pause);
> }
>
> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> +{
> + return vcpu_mode_priv(vcpu);
> +}
> +
> /* Just ensure a guest exit from a particular CPU */
> static void exit_vm_noop(void *info)
> {
>
>
> I am also curious about the workload you used to measure this and
> how I can evaluate the benefit on ARM.
>


We tested this with the SpecVirt test suite and saw no improvement
(but no regression either), because spinlock contention is not a major
factor in that suite.

Currently I don't have any performance numbers to show that the patch
makes sense, but I'll run some tests later.

> Thanks,
> -Christoffer
>
>> If the vcpu(me) exits because it is spinning on a usermode
>> spinlock, then the spinlock holder may have been preempted in
>> either usermode or kernel mode. But if the vcpu(me) is in kernel
>> mode, then the holder must have been preempted in kernel mode, so
>> we should choose a vcpu in kernel mode as the most eligible
>> candidate.
>>
>> PS: I have only implemented x86 for now because I'm not familiar
>> with the other architectures.
>>
>> Signed-off-by: Longpeng(Mike) <[email protected]>
>> ---
>> arch/mips/kvm/mips.c | 5 +++++
>> arch/powerpc/kvm/powerpc.c | 5 +++++
>> arch/s390/kvm/kvm-s390.c | 5 +++++
>> arch/x86/kvm/x86.c | 5 +++++
>> include/linux/kvm_host.h | 4 ++++
>> virt/kvm/arm/arm.c | 5 +++++
>> virt/kvm/kvm_main.c | 9 ++++++++-
>> 7 files changed, 37 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
>> index d4b2ad1..2e2701d 100644
>> --- a/arch/mips/kvm/mips.c
>> +++ b/arch/mips/kvm/mips.c
>> @@ -98,6 +98,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
>> return !!(vcpu->arch.pending_exceptions);
>> }
>>
>> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
>> +{
>> + return false;
>> +}
>> +
>> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>> {
>> return 1;
>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>> index 1a75c0b..2489f64 100644
>> --- a/arch/powerpc/kvm/powerpc.c
>> +++ b/arch/powerpc/kvm/powerpc.c
>> @@ -58,6 +58,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
>> return !!(v->arch.pending_exceptions) || kvm_request_pending(v);
>> }
>>
>> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
>> +{
>> + return false;
>> +}
>> +
>> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>> {
>> return 1;
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 3f2884e..9d7c42e 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -2443,6 +2443,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
>> return kvm_s390_vcpu_has_irq(vcpu, 0);
>> }
>>
>> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
>> +{
>> + return false;
>> +}
>> +
>> void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
>> {
>> atomic_or(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 82a63c5..b5a2e53 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -8435,6 +8435,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
>> return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
>> }
>>
>> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
>> +{
>> + return kvm_x86_ops->get_cpl(vcpu) == 0;
>> +}
>> +
>> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>> {
>> return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 648b34c..f8f0d74 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -272,6 +272,9 @@ struct kvm_vcpu {
>> } spin_loop;
>> #endif
>> bool preempted;
>> + /* If vcpu is in kernel-mode when preempted */
>> + bool in_kernmode;
>> +
>> struct kvm_vcpu_arch arch;
>> struct dentry *debugfs_dentry;
>> };
>> @@ -797,6 +800,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>> void kvm_arch_hardware_unsetup(void);
>> void kvm_arch_check_processor_compat(void *rtn);
>> int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
>> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu);
>> int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
>>
>> #ifndef __KVM_HAVE_ARCH_VM_ALLOC
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index a39a1e1..ca6a394 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -416,6 +416,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
>> && !v->arch.power_off && !v->arch.pause);
>> }
>>
>> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
>> +{
>> + return false;
>> +}
>> +
>> /* Just ensure a guest exit from a particular CPU */
>> static void exit_vm_noop(void *info)
>> {
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 82987d4..8d83caa 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -290,6 +290,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
>> kvm_vcpu_set_in_spin_loop(vcpu, false);
>> kvm_vcpu_set_dy_eligible(vcpu, false);
>> vcpu->preempted = false;
>> + vcpu->in_kernmode = false;
>>
>> r = kvm_arch_vcpu_init(vcpu);
>> if (r < 0)
>> @@ -2330,6 +2331,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>> int pass;
>> int i;
>>
>> + me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
>> kvm_vcpu_set_in_spin_loop(me, true);
>> /*
>> * We boost the priority of a VCPU that is runnable but not
>> @@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>> continue;
>> if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
>> continue;
>> + if (me->in_kernmode && !vcpu->in_kernmode)
>> + continue;
>> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
>> continue;
>>
>> @@ -4009,8 +4013,11 @@ static void kvm_sched_out(struct preempt_notifier *pn,
>> {
>> struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
>>
>> - if (current->state == TASK_RUNNING)
>> + if (current->state == TASK_RUNNING) {
>> vcpu->preempted = true;
>> + vcpu->in_kernmode = kvm_arch_vcpu_spin_kernmode(vcpu);
>> + }
>> +
>> kvm_arch_vcpu_put(vcpu);
>> }
>>
>> --
>> 1.8.3.1
>>
>>
>
>


--
Regards,
Longpeng(Mike)

2017-08-01 03:29:06

by Longpeng(Mike)

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin



On 2017/7/31 21:20, Paolo Bonzini wrote:

> On 31/07/2017 14:27, David Hildenbrand wrote:
>>> I'm not sure whether fetching the vcpu's privilege level is cheap
>>> on all architectures, so I record it in kvm_sched_out() to minimize
>>> the extra cycles spent in kvm_vcpu_on_spin().
>>>
>> as you only care for x86 right now either way, you can directly optimize
>> here for the good (here: x86) case (keeping changes and therefore
>> possible bugs minimal).
>
> I agree with Cornelia that this is inconsistent, so you shouldn't update
> me->in_kernmode in kvm_vcpu_on_spin. However, get_cpl requires
> vcpu_load on Intel x86, so Mike's patch is necessary (vmx_get_cpl ->
> vmx_read_guest_seg_ar -> vmcs_read32).
>

Hi Paolo,

It seems that the other architectures (e.g. arm/s390) don't need to
cache the result, but x86 does, so I should move 'in_kernmode' into
kvm_vcpu_arch and add the field only for x86, right?
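
Something like this, I mean (a sketch only; the field name is
illustrative):

	/* arch/x86/include/asm/kvm_host.h */
	struct kvm_vcpu_arch {
		...
		/* cached at sched-out: was the guest at CPL0? */
		bool preempted_in_kernel;
		...
	};

Then arm/s390 could compute the result on the fly in their
kvm_arch_vcpu_in_kernel() and the common struct kvm_vcpu would stay
untouched.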

> Alternatively, we can add a new callback kvm_x86_ops->sched_out to x86
> KVM, and call vmx_get_cpl from the Intel implementation (vmx_sched_out).


In this approach, vmx_sched_out would only call vmx_get_cpl; isn't
that a bit redundant, since we could just call kvm_x86_ops->get_cpl at
the right place instead?

> This will cache the result until the next sched_in, so that


'until the next sched_in' --> Do we need to clear the result at sched-in?

> kvm_vcpu_on_spin can use it.
>
> Paolo
>
>


--
Regards,
Longpeng(Mike)

2017-08-01 06:51:14

by Cornelia Huck

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

On Mon, 31 Jul 2017 19:32:26 +0200
David Hildenbrand <[email protected]> wrote:

> This one should work for s390x, no caching (or special access patterns
> like on x86) needed:
>
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2447,6 +2447,11 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
> return kvm_s390_vcpu_has_irq(vcpu, 0);
> }
>
> +bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
> +{
> + return !(vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE);
> +}
> +
> void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
> {
> atomic_or(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);

Yes, that should work.

2017-08-01 08:32:41

by Paolo Bonzini

Subject: Re: [RFC] KVM: optimize the kvm_vcpu_on_spin

On 01/08/2017 05:26, Longpeng (Mike) wrote:
>
>
> On 2017/7/31 21:20, Paolo Bonzini wrote:
>
>> On 31/07/2017 14:27, David Hildenbrand wrote:
>>>> I'm not sure whether fetching the vcpu's privilege level is cheap
>>>> on all architectures, so I record it in kvm_sched_out() to minimize
>>>> the extra cycles spent in kvm_vcpu_on_spin().
>>>>
>>> as you only care for x86 right now either way, you can directly optimize
>>> here for the good (here: x86) case (keeping changes and therefore
>>> possible bugs minimal).
>>
>> I agree with Cornelia that this is inconsistent, so you shouldn't update
>> me->in_kernmode in kvm_vcpu_on_spin. However, get_cpl requires
>> vcpu_load on Intel x86, so Mike's patch is necessary (vmx_get_cpl ->
>> vmx_read_guest_seg_ar -> vmcs_read32).
>>
>
> Hi Paolo,
>
> It seems that the other architectures (e.g. arm/s390) don't need to
> cache the result, but x86 does, so I should move 'in_kernmode' into
> kvm_vcpu_arch and add the field only for x86, right?

That's another way to do it, and I like it.

>> This will cache the result until the next sched_in, so that
>
> 'until the next sched_in' --> Do we need to clear the result at sched-in?

No, thanks.

Paolo