Date:   Wed, 8 Dec 2021 11:36:09 +0800
From:   Aili Yao <yaoaili126@gmail.com>
To:     Sean Christopherson <seanjc@google.com>
Cc:     pbonzini@redhat.com, vkuznets@redhat.com, wanpengli@tencent.com,
        jmattson@google.com, joro@8bytes.org, tglx@linutronix.de,
        mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
        x86@kernel.org, hpa@zytor.com, kvm@vger.kernel.org,
        linux-kernel@vger.kernel.org, yaoaili@kingsoft.com
Subject: Re: [PATCH v2] KVM: LAPIC: Per vCPU control over
 kvm_can_post_timer_interrupt
Message-ID: <20211208113609.483d5f1a@gmail.com>
In-Reply-To: <Ya/s17QDlGZi9COR@google.com>
References: <20211124125409.6eec3938@gmail.com>
        <Ya/s17QDlGZi9COR@google.com>
Organization: ksyun
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Precedence: bulk

On Tue, 7 Dec 2021 23:23:03 +0000
Sean Christopherson <seanjc@google.com> wrote:

> On Wed, Nov 24, 2021, Aili Yao wrote:
> > When cpu-pm is successfully enabled, and hlt_in_guest is true and
> > mwait_in_guest is false, the guest can't use Monitor/Mwait instruction
> > for idle operation, instead, the guest may use halt for that purpose, as
> > we have enabled the cpu-pm feature and hlt_in_guest is true, we will also
> > minimize the guest exit; For such a scenario, Monitor/Mwait instruction
> > support is totally disabled, the guest has no way to use Mwait to exit from
> > non-root mode;
> > 
> > For cpu-pm feature, hlt_in_guest and others except mwait_in_guest will
> > be a good hint for it. So replace it with hlt_in_guest.  
> 
> This should be a separate patch from the housekeeping_cpu() check, if we add
> the housekeeping check.
> 
> > Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
> > ---
> >  arch/x86/kvm/lapic.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 759952dd1222..42aef1accd6b 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -34,6 +34,7 @@
> >  #include <asm/delay.h>
> >  #include <linux/atomic.h>
> >  #include <linux/jump_label.h>
> > +#include <linux/sched/isolation.h>
> >  #include "kvm_cache_regs.h"
> >  #include "irq.h"
> >  #include "ioapic.h"
> > @@ -113,13 +114,14 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
> >  
> >  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
> >  {
> > -	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> > +	return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
> > +		!housekeeping_cpu(vcpu->cpu, HK_FLAG_TIMER);  
> 
> Why not check kvm_{hlt,mwait}_in_guest()?  IIUC, non-housekeeping CPUs don't _have_
> to be associated 1:1 with a vCPU, in which case posting the timer is unlikely
> to be a performance win even though the target isn't a housekeeping CPU.

Yes, a non-housekeeping CPU can be shared by multiple vCPUs. I don't think that's a common configuration,
but it can happen.

> And wouldn't exposing HLT/MWAIT to a vCPU that's on a housekeeping CPU be a bogus
> configuration?

Agreed, it's a bogus configuration and things aren't supposed to be set up like this, but it can happen.

It seems we can't cover all the abnormal cases in a single line, so I think it is enough to check for
the most common correct configurations.

> >  }
> >  
> >  bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
> >  {
> >  	return kvm_x86_ops.set_hv_timer
> > -	       && !(kvm_mwait_in_guest(vcpu->kvm) ||
> > +	       && !(kvm_hlt_in_guest(vcpu->kvm) ||  
> 
> This is incorrect, the HLT vs. MWAIT isn't purely a posting interrupts thing.  The
> VMX preemption timer counts down in C0, C1, and C2, but not deeper sleep states.
> HLT is always C1, thus it's safe to use the VMX preemption timer even if the guest
> can execute HLT without exiting.
> The timer isn't compatible with MWAIT because it stops counting in C3 (or lower),
> i.e. the guest can cause the timer to stop counting.

Thanks for the pointer, I understand this now.
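
To restate the C-state argument above, here is a minimal user-space sketch.
It is only a model, not kernel code: the function names are hypothetical, and
the "MWAIT may reach C3" mapping is a simplification of the explanation quoted
above (the real depth depends on the MWAIT hint and the platform).

```c
#include <stdbool.h>

/*
 * Model only: per the explanation above, the VMX preemption timer counts
 * in C0, C1 and C2, and stops in C3 or deeper.
 */
static bool preemption_timer_counts_model(int cstate)
{
	return cstate >= 0 && cstate <= 2;
}

/*
 * Model only: deepest C-state the guest could enter without a VM-Exit.
 * Assumption: guest MWAIT may request C3 or deeper, guest HLT is C1.
 */
static int deepest_guest_cstate_model(bool hlt_in_guest, bool mwait_in_guest)
{
	if (mwait_in_guest)
		return 3;
	if (hlt_in_guest)
		return 1;
	return 0;
}

/* The timer is safe only if it keeps counting in the deepest reachable state. */
static bool hv_timer_safe_model(bool hlt_in_guest, bool mwait_in_guest)
{
	return preemption_timer_counts_model(
		deepest_guest_cstate_model(hlt_in_guest, mwait_in_guest));
}
```

Under this model only mwait_in_guest makes the timer unsafe, which is why
replacing the MWAIT check with a HLT check in kvm_can_use_hv_timer() was wrong.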

> 
> >  		    kvm_can_post_timer_interrupt(vcpu));
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);
> > --   
> 
> Splicing in Wanpeng's version to try and merge the two threads:
> 
> On Tue, Nov 23, 2021 at 10:00 PM Wanpeng Li <kernellwp@gmail.com> wrote:
> > ---
> >  arch/x86/kvm/lapic.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 759952dd1222..8257566d44c7 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -113,14 +113,13 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
> >
> >  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
> >  {
> > -       return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> > +       return pi_inject_timer && kvm_mwait_in_guest(vcpu->kvm) && kvm_vcpu_apicv_active(vcpu);  
> 
> As Aili's changelog pointed out, MWAIT may not be advertised to the guest. 
> 
> So I think we want this?  With a non-functional, opinionated refactoring of
> kvm_can_use_hv_timer() because I'm terrible at reading !(a || b).
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 40270d7bc597..c77cb386d03d 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -113,14 +113,25 @@ static inline u32 kvm_x2apic_id(struct kvm_lapic *apic)
> 
>  static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
>  {
> -       return pi_inject_timer && kvm_vcpu_apicv_active(vcpu);
> +       return pi_inject_timer && kvm_vcpu_apicv_active(vcpu) &&
> +              (kvm_mwait_in_guest(vcpu->kvm) || kvm_hlt_in_guest(vcpu->kvm));
>  }

I think checking only kvm_hlt_in_guest() is enough here: with the current code, if kvm_mwait_in_guest() is true,
kvm_hlt_in_guest() must also be true, and if kvm_mwait_in_guest() is false, kvm_hlt_in_guest() can still
be true.
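
A small user-space sketch of the comparison, for illustration only. The struct
and function names are hypothetical stand-ins, not the kernel's struct kvm or
its helpers:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two per-VM exit controls discussed here. */
struct vm_exits_model {
	bool hlt_in_guest;
	bool mwait_in_guest;
};

/* The proposed check: either idle instruction runs without exiting. */
static bool idle_check_either(const struct vm_exits_model *vm)
{
	return vm->mwait_in_guest || vm->hlt_in_guest;
}

/* The HLT-only alternative suggested in this reply. */
static bool idle_check_hlt_only(const struct vm_exits_model *vm)
{
	return vm->hlt_in_guest;
}

/* True when both checks give the same answer for a flag combination. */
static bool idle_checks_agree(bool hlt_in_guest, bool mwait_in_guest)
{
	struct vm_exits_model vm = {
		.hlt_in_guest = hlt_in_guest,
		.mwait_in_guest = mwait_in_guest,
	};

	return idle_check_either(&vm) == idle_check_hlt_only(&vm);
}
```

The two checks differ only for mwait_in_guest = true with hlt_in_guest = false,
so they are equivalent as long as that combination cannot occur.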

>  bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
>  {
> -       return kvm_x86_ops.set_hv_timer
> -              && !(kvm_mwait_in_guest(vcpu->kvm) ||
> -                   kvm_can_post_timer_interrupt(vcpu));
> +       /*
> +        * Don't use the hypervisor timer, a.k.a. VMX Preemption Timer, if the
> +        * guest can execute MWAIT without exiting as the timer will stop
> +        * counting if the core enters C3 or lower.  HLT in the guest is ok as
> +        * HLT is effectively C1 and the timer counts in C0, C1, and C2.
> +        *
> +        * Don't use the hypervisor timer if KVM can post a timer interrupt to
> +        * the guest since posting the timer avoids taking an extra VM-Exit
> +        * when the timer expires.
> +        */
> +       return kvm_x86_ops.set_hv_timer &&
> +              !kvm_mwait_in_guest(vcpu->kvm) &&
> +              !kvm_can_post_timer_interrupt(vcpu);
>  }
>  EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);
> 

I think this modification covers the most commonly used configurations, and it looks right to me.
Thanks!

--Aili Yao