2021-10-09 09:11:26

by Wanpeng Li

Subject: [PATCH v2 1/3] KVM: emulate: Don't inject #GP when emulating RDPMC if CR0.PE=0

From: Wanpeng Li <[email protected]>

DM mentioned that, RDPMC:

IF (((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0)) and (ECX indicates a supported counter))
    THEN
        EAX := counter[31:0];
        EDX := ZeroExtend(counter[MSCB:32]);
    ELSE (* ECX is not valid or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *)
        #GP(0);
FI;

Let's add the CR0.PE=1 check to the RDPMC emulation, though this isn't
strictly necessary since it's impossible for CPL to be >0 if CR0.PE=0.

Signed-off-by: Wanpeng Li <[email protected]>
---
v1 -> v2:
* update patch description

arch/x86/kvm/emulate.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 9a144ca8e146..ab7ec569e8c9 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4213,6 +4213,7 @@ static int check_rdtsc(struct x86_emulate_ctxt *ctxt)
 static int check_rdpmc(struct x86_emulate_ctxt *ctxt)
 {
 	u64 cr4 = ctxt->ops->get_cr(ctxt, 4);
+	u64 cr0 = ctxt->ops->get_cr(ctxt, 0);
 	u64 rcx = reg_read(ctxt, VCPU_REGS_RCX);

 	/*
@@ -4222,7 +4223,7 @@ static int check_rdpmc(struct x86_emulate_ctxt *ctxt)
 	if (enable_vmware_backdoor && is_vmware_backdoor_pmc(rcx))
 		return X86EMUL_CONTINUE;

-	if ((!(cr4 & X86_CR4_PCE) && ctxt->ops->cpl(ctxt)) ||
+	if ((!(cr4 & X86_CR4_PCE) && ctxt->ops->cpl(ctxt) && (cr0 & X86_CR0_PE)) ||
 	    ctxt->ops->check_pmc(ctxt, rcx))
 		return emulate_gp(ctxt, 0);

--
2.25.1


2021-10-09 09:12:34

by Wanpeng Li

Subject: [PATCH v2 3/3] KVM: vCPU kick tax cut for running vCPU

From: Wanpeng Li <[email protected]>

Sometimes a vCPU kick follows a pending request even when @vcpu is
the running vCPU. The kick then pays for both rcuwait_wake_up(), which has
RCU/memory barrier operations, and cmpxchg(). Let's check vcpu->wait
before rcuwait_wake_up(), and check whether @vcpu is the running vCPU before
cmpxchg(), to cut this overhead.

We evaluated kvm-unit-test/vmexit.flat on an Intel ICX box; most of the
scores improve by ~600 CPU cycles, especially when APICv is disabled.

tscdeadline_immed
tscdeadline
self_ipi_sti_nop
..............
x2apic_self_ipi_tpr_sti_hlt

Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
v1 -> v2:
* move checking running vCPU logic to kvm_vcpu_kick
* check rcuwait_active(&vcpu->wait) etc

virt/kvm/kvm_main.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7851f3a1b5f7..18209d7b3711 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3314,8 +3314,15 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 {
 	int me, cpu;

-	if (kvm_vcpu_wake_up(vcpu))
-		return;
+	me = get_cpu();
+
+	if (rcuwait_active(&vcpu->wait) && kvm_vcpu_wake_up(vcpu))
+		goto out;
+
+	if (vcpu == __this_cpu_read(kvm_running_vcpu)) {
+		WARN_ON_ONCE(vcpu->mode == IN_GUEST_MODE);
+		goto out;
+	}

 	/*
 	 * Note, the vCPU could get migrated to a different pCPU at any point
@@ -3324,12 +3331,12 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 	 * IPI is to force the vCPU to leave IN_GUEST_MODE, and migrating the
 	 * vCPU also requires it to leave IN_GUEST_MODE.
 	 */
-	me = get_cpu();
 	if (kvm_arch_vcpu_should_kick(vcpu)) {
 		cpu = READ_ONCE(vcpu->cpu);
 		if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
 			smp_send_reschedule(cpu);
 	}
+out:
 	put_cpu();
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
--
2.25.1

2021-10-18 03:22:58

by Sean Christopherson

Subject: Re: [PATCH v2 1/3] KVM: emulate: Don't inject #GP when emulating RDPMC if CR0.PE=0

On Sat, Oct 09, 2021, Wanpeng Li wrote:
> From: Wanpeng Li <[email protected]>
>
> DM mentioned that, RDPMC:

Heh, missing 'S' in "SDM".

>
> IF (((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0)) and (ECX indicates a supported counter))
> THEN
> EAX := counter[31:0];
> EDX := ZeroExtend(counter[MSCB:32]);
> ELSE (* ECX is not valid or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *)
> #GP(0);
> FI;
>
> Let's add the CR0.PE is 1 checking to rdpmc emulate, though this isn't
> strictly necessary since it's impossible for CPL to be >0 if CR0.PE=0.
>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---

Reviewed-by: Sean Christopherson <[email protected]>
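
The commit message's claim that the CR0.PE check isn't strictly necessary can be sanity-checked with a minimal standalone C sketch. It models the SDM pseudocode and the pre-patch condition as two predicates and compares them over every state in which CPL is 0 whenever CR0.PE is 0. The function names (rdpmc_faults_sdm, rdpmc_faults_old, checks_agree) and the counter_valid flag are hypothetical stand-ins for illustration and are not KVM code; only the bit positions of CR0.PE (bit 0) and CR4.PCE (bit 8) are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions. */
#define CR0_PE  (1ULL << 0)	/* protection enable */
#define CR4_PCE (1ULL << 8)	/* performance-monitoring counter enable */

/* Hypothetical model of the SDM pseudocode: true if RDPMC raises #GP(0). */
static bool rdpmc_faults_sdm(uint64_t cr0, uint64_t cr4, unsigned int cpl,
			     bool counter_valid)
{
	bool allowed = (cr4 & CR4_PCE) || cpl == 0 || !(cr0 & CR0_PE);

	return !(allowed && counter_valid);
}

/* Hypothetical model of the pre-patch condition, which ignores CR0.PE. */
static bool rdpmc_faults_old(uint64_t cr4, unsigned int cpl,
			     bool counter_valid)
{
	return (!(cr4 & CR4_PCE) && cpl != 0) || !counter_valid;
}

/*
 * Compare both predicates over all reachable states. States with
 * CR0.PE = 0 and CPL > 0 are skipped because they cannot occur.
 */
static bool checks_agree(void)
{
	for (int pe = 0; pe < 2; pe++) {
		for (int pce = 0; pce < 2; pce++) {
			for (unsigned int cpl = 0; cpl < 4; cpl++) {
				for (int valid = 0; valid < 2; valid++) {
					uint64_t cr0 = pe ? CR0_PE : 0;
					uint64_t cr4 = pce ? CR4_PCE : 0;

					if (!pe && cpl != 0)
						continue;
					if (rdpmc_faults_sdm(cr0, cr4, cpl, valid) !=
					    rdpmc_faults_old(cr4, cpl, valid))
						return false;
				}
			}
		}
	}
	return true;
}
```

Under these assumptions checks_agree() returns true, so the added term only changes the result for the unreachable CR0.PE=0, CPL>0 combination and mainly serves to mirror the SDM text.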

2021-10-18 03:23:21

by Sean Christopherson

Subject: Re: [PATCH v2 3/3] KVM: vCPU kick tax cut for running vCPU

On Sat, Oct 09, 2021, Wanpeng Li wrote:
> From: Wanpeng Li <[email protected]>
>
> Sometimes a vCPU kick is following a pending request, even if @vcpu is
> the running vCPU. It suffers from both rcuwait_wake_up() which has
> rcu/memory barrier operations and cmpxchg(). Let's check vcpu->wait
> before rcu_wait_wake_up() and whether @vcpu is the running vCPU before
> cmpxchg() to tax cut this overhead.
>
> We evaluate the kvm-unit-test/vmexit.flat on an Intel ICX box, most of the
> scores can improve ~600 cpu cycles especially when APICv is disabled.
>
> tscdeadline_immed
> tscdeadline
> self_ipi_sti_nop
> ..............
> x2apic_self_ipi_tpr_sti_hlt
>
> Suggested-by: Sean Christopherson <[email protected]>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> v1 -> v2:
> * move checking running vCPU logic to kvm_vcpu_kick
> * check rcuwait_active(&vcpu->wait) etc
>
> virt/kvm/kvm_main.c | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 7851f3a1b5f7..18209d7b3711 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3314,8 +3314,15 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
> {
> int me, cpu;
>
> - if (kvm_vcpu_wake_up(vcpu))
> - return;
> + me = get_cpu();
> +
> + if (rcuwait_active(&vcpu->wait) && kvm_vcpu_wake_up(vcpu))

This needs to use kvm_arch_vcpu_get_wait(), not vcpu->wait, because PPC has some
funky wait stuff.

One potential issue I didn't think of before: rcuwait_active() comes with the
below warning, which means we might be at risk of a false negative that could
result in a missed wakeup. I'm not positive on that though.

/*
 * Note: this provides no serialization and, just as with waitqueues,
 * requires care to estimate as to whether or not the wait is active.
 */

> + goto out;
> +
> + if (vcpu == __this_cpu_read(kvm_running_vcpu)) {
> + WARN_ON_ONCE(vcpu->mode == IN_GUEST_MODE);
> + goto out;
> + }
>
> /*
> * Note, the vCPU could get migrated to a different pCPU at any point
> @@ -3324,12 +3331,12 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
> * IPI is to force the vCPU to leave IN_GUEST_MODE, and migrating the
> * vCPU also requires it to leave IN_GUEST_MODE.
> */
> - me = get_cpu();
> if (kvm_arch_vcpu_should_kick(vcpu)) {
> cpu = READ_ONCE(vcpu->cpu);
> if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
> smp_send_reschedule(cpu);
> }
> +out:
> put_cpu();
> }
> EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
> --
> 2.25.1
>

2021-10-18 03:25:59

by Wanpeng Li

Subject: Re: [PATCH v2 3/3] KVM: vCPU kick tax cut for running vCPU

On Sat, 16 Oct 2021 at 07:26, Sean Christopherson <[email protected]> wrote:
>
> On Sat, Oct 09, 2021, Wanpeng Li wrote:
> > From: Wanpeng Li <[email protected]>
> >
> > Sometimes a vCPU kick is following a pending request, even if @vcpu is
> > the running vCPU. It suffers from both rcuwait_wake_up() which has
> > rcu/memory barrier operations and cmpxchg(). Let's check vcpu->wait
> > before rcu_wait_wake_up() and whether @vcpu is the running vCPU before
> > cmpxchg() to tax cut this overhead.
> >
> > We evaluate the kvm-unit-test/vmexit.flat on an Intel ICX box, most of the
> > scores can improve ~600 cpu cycles especially when APICv is disabled.
> >
> > tscdeadline_immed
> > tscdeadline
> > self_ipi_sti_nop
> > ..............
> > x2apic_self_ipi_tpr_sti_hlt
> >
> > Suggested-by: Sean Christopherson <[email protected]>
> > Signed-off-by: Wanpeng Li <[email protected]>
> > ---
> > v1 -> v2:
> > * move checking running vCPU logic to kvm_vcpu_kick
> > * check rcuwait_active(&vcpu->wait) etc
> >
> > virt/kvm/kvm_main.c | 13 ++++++++++---
> > 1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 7851f3a1b5f7..18209d7b3711 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -3314,8 +3314,15 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
> > {
> > int me, cpu;
> >
> > - if (kvm_vcpu_wake_up(vcpu))
> > - return;
> > + me = get_cpu();
> > +
> > + if (rcuwait_active(&vcpu->wait) && kvm_vcpu_wake_up(vcpu))
>
> This needs to use kvm_arch_vcpu_get_wait(), not vcpu->wait, because PPC has some
> funky wait stuff.
>
> One potential issue I didn't think of before. rcuwait_active() comes with the
> below warning, which means we might be at risk of a false negative that could
> result in a missed wakeup. I'm not positive on that though.

There is only ever a single waiting vCPU. An event is requested
before the sleeping vCPU is kicked, and it is checked after
vcpu->wait is set to the task. I can't find a scenario that could
result in a missed wakeup.

Wanpeng

>
> /*
> * Note: this provides no serialization and, just as with waitqueues,
> * requires care to estimate as to whether or not the wait is active.
> */
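
The ordering argument in the reply above (request the event before checking the wait, publish the wait before checking the event) can be illustrated with a small model. The following is a minimal sketch, assuming sequentially consistent execution, that enumerates every interleaving of a two-step waiter and a two-step kicker and counts the cases where the waiter would sleep without being woken. All names here (struct model, waiter_step, kicker_step, count_missed_wakeups) are hypothetical and do not correspond to kernel functions. The sketch does not model memory reordering, which is what the "provides no serialization" warning is about.

```c
#include <stdbool.h>

/* Hypothetical shared state between one waiter and one kicker. */
struct model {
	bool request;		/* event requested by the kicker */
	bool wait_active;	/* waiter has published itself */
	bool will_sleep;	/* waiter saw no request and goes to sleep */
	bool woken;		/* kicker saw an active wait and woke it */
};

/* Waiter: publish the wait, then check the request (or the reverse). */
static void waiter_step(struct model *m, int step, bool publish_first)
{
	bool publish = publish_first ? (step == 0) : (step == 1);

	if (publish)
		m->wait_active = true;
	else
		m->will_sleep = !m->request;
}

/* Kicker: set the request, then wake the waiter only if it is active. */
static void kicker_step(struct model *m, int step)
{
	if (step == 0)
		m->request = true;
	else if (m->wait_active)
		m->woken = true;
}

static int bits_set(unsigned int v)
{
	int n = 0;

	for (; v; v >>= 1)
		n += v & 1;
	return n;
}

/*
 * Each 4-bit value with exactly two bits set is one interleaving:
 * a set bit means the kicker runs at that slot, a clear bit the waiter.
 */
static int count_missed_wakeups(bool publish_first)
{
	int missed = 0;

	for (unsigned int order = 0; order < 16; order++) {
		struct model m = { false, false, false, false };
		int w = 0, k = 0;

		if (bits_set(order) != 2)
			continue;
		for (int slot = 0; slot < 4; slot++) {
			if (order & (1u << slot))
				kicker_step(&m, k++);
			else
				waiter_step(&m, w++, publish_first);
		}
		if (m.will_sleep && !m.woken)
			missed++;
	}
	return missed;
}
```

In this model, count_missed_wakeups(true) finds no missed wakeup, while reversing the waiter's order with count_missed_wakeups(false) exposes one losing interleaving. Whether the same guarantee holds in the real code depends on the barriers around the two checks, which this sketch deliberately leaves out.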