From: Wanpeng Li
Date: Thu, 30 Nov 2017 15:53:01 +0800
Subject: Re: [PATCH v7 2/4] KVM: X86: Add Paravirt TLB Shootdown
To: "linux-kernel@vger.kernel.org", kvm
Cc: Paolo Bonzini, Radim Krčmář, Peter Zijlstra, Wanpeng Li

2017-11-30 14:01 GMT+08:00 Wanpeng Li:
> From: Wanpeng Li
>
> Remote flushing APIs do a busy wait, which is fine in a bare-metal
> scenario. But within the guest, the vCPUs might have been preempted
> or blocked. In this scenario, the initiator vCPU would end up
> busy-waiting for a long time.
>
> This patch set implements paravirtualized TLB flushing, making sure
> that it does not wait for vCPUs that are sleeping. All the sleeping
> vCPUs flush the TLB on guest entry.
>
> The best result is achieved when we're overcommitting the host by running
> multiple vCPUs on each pCPU. In this case PV TLB flush avoids touching
> vCPUs which are not scheduled and avoids the wait on the main CPU.
>
> Testing on a Xeon Gold 6142 2.6GHz, 2 sockets, 32 cores, 64 threads,
> so 64 pCPUs, and each VM is 64 vCPUs.
>
> ebizzy -M
>           vanilla    optimized    boost
> 1VM        46799       48670         4%
> 2VM        23962       42691        78%
> 3VM        16152       37539       132%
>
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Cc: Peter Zijlstra
> Signed-off-by: Wanpeng Li
> ---
>  Documentation/virtual/kvm/cpuid.txt  |  4 +++
>  arch/x86/include/uapi/asm/kvm_para.h |  2 ++
>  arch/x86/kernel/kvm.c                | 47 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 53 insertions(+)
>
> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> index 3c65feb..dcab6dc 100644
> --- a/Documentation/virtual/kvm/cpuid.txt
> +++ b/Documentation/virtual/kvm/cpuid.txt
> @@ -54,6 +54,10 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
>                                     ||       || before enabling paravirtualized
>                                     ||       || spinlock support.
>  ------------------------------------------------------------------------------
> +KVM_FEATURE_PV_TLB_FLUSH           ||     9 || guest checks this feature bit
> +                                   ||       || before enabling paravirtualized
> +                                   ||       || tlb flush.
> +------------------------------------------------------------------------------
>  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
>                                     ||       || per-cpu warps are expected in
>                                     ||       || kvmclock.
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index 763b692..8fbcc16 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -25,6 +25,7 @@
>  #define KVM_FEATURE_STEAL_TIME          5
>  #define KVM_FEATURE_PV_EOI              6
>  #define KVM_FEATURE_PV_UNHALT           7
> +#define KVM_FEATURE_PV_TLB_FLUSH        9
>
>  /* The last 8 bits are used to indicate how to interpret the flags field
>   * in pvclock structure. If no bits are set, all flags are ignored.
> @@ -53,6 +54,7 @@ struct kvm_steal_time {
>
>  #define KVM_VCPU_NOT_PREEMPTED  (0 << 0)
>  #define KVM_VCPU_PREEMPTED      (1 << 0)
> +#define KVM_VCPU_SHOULD_FLUSH   (1 << 1)
>
>  #define KVM_CLOCK_PAIRING_WALLCLOCK     0
>  struct kvm_clock_pairing {
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 6610b92..64fb9a4 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -498,6 +498,34 @@ static void __init kvm_apf_trap_init(void)
>         update_intr_gate(X86_TRAP_PF, async_page_fault);
>  }
>
> +static DEFINE_PER_CPU(cpumask_var_t, __pv_tlb_mask);
> +
> +static void kvm_flush_tlb_others(const struct cpumask *cpumask,
> +                       const struct flush_tlb_info *info)
> +{
> +       u8 state;
> +       int cpu;
> +       struct kvm_steal_time *src;
> +       struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_tlb_mask);
> +
> +       cpumask_copy(flushmask, cpumask);
> +       /*
> +        * We have to call flush only on online vCPUs. And
> +        * queue flush_on_enter for pre-empted vCPUs
> +        */
> +       for_each_cpu(cpu, flushmask) {
> +               src = &per_cpu(steal_time, cpu);
> +               state = READ_ONCE(src->preempted);
> +               if ((state & KVM_VCPU_PREEMPTED)) {
> +                       if (try_cmpxchg(&src->preempted, &state,
> +                                       state | KVM_VCPU_SHOULD_FLUSH))
> +                               __cpumask_clear_cpu(cpu, flushmask);
> +               }
> +       }
> +
> +       native_flush_tlb_others(flushmask, info);
> +}
> +
>  static void __init kvm_guest_init(void)
>  {
>         int i;
> @@ -517,6 +545,9 @@ static void __init kvm_guest_init(void)
>                 pv_time_ops.steal_clock = kvm_steal_clock;
>         }
>
> +       if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH))
> +               pv_mmu_ops.flush_tlb_others = kvm_flush_tlb_others;
> +
>         if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
>                 apic_set_eoi_write(kvm_guest_apic_eoi_write);
>
> @@ -598,6 +629,22 @@ static __init int activate_jump_labels(void)
>  }
>  arch_initcall(activate_jump_labels);
>
> +static __init int kvm_setup_pv_tlb_flush(void)
> +{
> +       int cpu;
> +
> +       if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH)) {
> +               for_each_possible_cpu(cpu) {
> +                       zalloc_cpumask_var_node(per_cpu_ptr(&__pv_tlb_mask, cpu),
> +                               GFP_KERNEL, cpu_to_node(cpu));
> +               }
> +               pr_info("KVM setup remote TLB flush\n");

Please change it to "KVM setup pv remote TLB flush\n" if it is the
last version before applying. :)

Regards,
Wanpeng Li

> +       }
> +
> +       return 0;
> +}
> +arch_initcall(kvm_setup_pv_tlb_flush);
> +
>  #ifdef CONFIG_PARAVIRT_SPINLOCKS
>
>  /* Kick a cpu by its apicid. Used to wake up a halted vcpu */
> --
> 2.7.4
>
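
For anyone following the thread without a kernel tree at hand, the filtering step in kvm_flush_tlb_others() can be approximated in user space. This is only a rough sketch under stated assumptions: C11 atomics stand in for the kernel's READ_ONCE()/try_cmpxchg(), a plain bitmask stands in for struct cpumask, and the names VCPU_PREEMPTED, VCPU_SHOULD_FLUSH and filter_flush_targets() are hypothetical, chosen for illustration. It is not the code from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical mirrors of KVM_VCPU_PREEMPTED / KVM_VCPU_SHOULD_FLUSH. */
#define VCPU_PREEMPTED    (1u << 0)
#define VCPU_SHOULD_FLUSH (1u << 1)

/*
 * Assumed stand-in for the per-vCPU steal_time.preempted bytes:
 * 'preempted' holds one state byte per vCPU, 'targets' has one bit per
 * vCPU that was asked to flush. Returns the vCPUs that still need an
 * immediate flush; preempted vCPUs are marked SHOULD_FLUSH and dropped
 * from the returned mask instead.
 */
static unsigned filter_flush_targets(_Atomic uint8_t *preempted, int nr,
                                     unsigned targets)
{
    unsigned flushmask = targets;

    assert(nr >= 0 && nr <= 32); /* one bit per vCPU in an unsigned mask */

    for (int cpu = 0; cpu < nr; cpu++) {
        if (!(targets & (1u << cpu)))
            continue;

        uint8_t state = atomic_load(&preempted[cpu]);

        if (state & VCPU_PREEMPTED) {
            /*
             * Drop the vCPU only if the flag was recorded before its
             * state changed; if the exchange fails, the vCPU stays in
             * the mask and is flushed right away.
             */
            if (atomic_compare_exchange_strong(&preempted[cpu], &state,
                    (uint8_t)(state | VCPU_SHOULD_FLUSH)))
                flushmask &= ~(1u << cpu);
        }
    }
    return flushmask;
}
```

With four vCPUs of which 1 and 3 are marked preempted, calling filter_flush_targets(state, 4, 0xF) leaves only vCPUs 0 and 2 in the returned mask, and vCPUs 1 and 3 carry the deferred-flush flag for whoever resumes them.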