From: Wanpeng Li
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Radim Krčmář, Peter Zijlstra, Wanpeng Li
Subject: [PATCH v7 2/4] KVM: X86: Add Paravirt TLB Shootdown
Date: Wed, 29 Nov 2017 22:01:12 -0800
Message-Id: <1512021674-9880-3-git-send-email-wanpeng.li@hotmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1512021674-9880-1-git-send-email-wanpeng.li@hotmail.com>
References: <1512021674-9880-1-git-send-email-wanpeng.li@hotmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Wanpeng Li

The remote TLB flush APIs busy-wait, which is fine on bare metal. Within
a guest, however, the target vCPUs might have been preempted or blocked,
in which case the initiator vCPU would end up busy-waiting for a long
time.

This patch set implements a paravirtualized TLB flush that does not wait
for sleeping vCPUs. Instead, each sleeping vCPU flushes its TLB on guest
entry.
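
For readers who want the idea without the kernel plumbing, here is a
rough user-space sketch of the filtering step described above. It is
only an illustration under stated assumptions: the preempted[] array
stands in for the per-vCPU steal_time.preempted byte, a 64-bit mask
stands in for struct cpumask, C11 atomics stand in for
READ_ONCE()/try_cmpxchg(), and the name pv_filter_flush_targets is made
up for this note. It is not the code in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define VCPU_PREEMPTED    (1u << 0)  /* mirrors KVM_VCPU_PREEMPTED */
#define VCPU_SHOULD_FLUSH (1u << 1)  /* mirrors KVM_VCPU_SHOULD_FLUSH */
#define MAX_VCPUS         64

/* Hypothetical stand-in for each vCPU's steal_time.preempted byte. */
_Atomic uint8_t preempted[MAX_VCPUS];

/*
 * Return the subset of 'targets' that still needs a synchronous flush.
 * A preempted vCPU is dropped from the set only if the SHOULD_FLUSH
 * flag was set atomically against the state we observed; if the state
 * changed under us (for example, the vCPU started running again), the
 * vCPU stays in the set and is flushed the ordinary way.
 */
uint64_t pv_filter_flush_targets(uint64_t targets, unsigned int nr_vcpus)
{
	uint64_t need_flush = targets;

	assert(nr_vcpus <= MAX_VCPUS);

	for (unsigned int cpu = 0; cpu < nr_vcpus; cpu++) {
		uint8_t state;

		if (!(targets & (UINT64_C(1) << cpu)))
			continue;

		state = atomic_load(&preempted[cpu]);
		if ((state & VCPU_PREEMPTED) &&
		    atomic_compare_exchange_strong(&preempted[cpu], &state,
				(uint8_t)(state | VCPU_SHOULD_FLUSH)))
			need_flush &= ~(UINT64_C(1) << cpu);
	}

	return need_flush;
}
```

With three vCPUs of which only vCPU 1 is marked preempted,
pv_filter_flush_targets(0x7, 3) leaves vCPUs 0 and 2 in the returned
mask and sets the should-flush flag on vCPU 1. The compare-and-swap is
what closes the race with a vCPU that is scheduled back in between the
read and the update.
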

The best result is achieved when we're overcommitting the host by running
multiple vCPUs on each pCPU. In this case the PV TLB flush avoids touching
vCPUs which are not scheduled and avoids the wait on the main CPU.

Testing on a Xeon Gold 6142 2.6GHz, 2 sockets, 32 cores, 64 threads, so
64 pCPUs, and each VM is 64 vCPUs.

ebizzy -M

              vanilla    optimized    boost
1VM            46799       48670         4%
2VM            23962       42691        78%
3VM            16152       37539       132%

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Peter Zijlstra
Signed-off-by: Wanpeng Li
---
 Documentation/virtual/kvm/cpuid.txt  |  4 +++
 arch/x86/include/uapi/asm/kvm_para.h |  2 ++
 arch/x86/kernel/kvm.c                | 47 ++++++++++++++++++++++++++++++++++++
 3 files changed, 53 insertions(+)

diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index 3c65feb..dcab6dc 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -54,6 +54,10 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
                                    ||       || before enabling paravirtualized
                                    ||       || spinlock support.
 ------------------------------------------------------------------------------
+KVM_FEATURE_PV_TLB_FLUSH           ||     9 || guest checks this feature bit
+                                   ||       || before enabling paravirtualized
+                                   ||       || tlb flush.
+------------------------------------------------------------------------------
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
                                    ||       || per-cpu warps are expected in
                                    ||       || kvmclock.
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 763b692..8fbcc16 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -25,6 +25,7 @@
 #define KVM_FEATURE_STEAL_TIME		5
 #define KVM_FEATURE_PV_EOI		6
 #define KVM_FEATURE_PV_UNHALT		7
+#define KVM_FEATURE_PV_TLB_FLUSH	9
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
@@ -53,6 +54,7 @@ struct kvm_steal_time {
 
 #define KVM_VCPU_NOT_PREEMPTED      (0 << 0)
 #define KVM_VCPU_PREEMPTED          (1 << 0)
+#define KVM_VCPU_SHOULD_FLUSH       (1 << 1)
 
 #define KVM_CLOCK_PAIRING_WALLCLOCK 0
 struct kvm_clock_pairing {
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 6610b92..64fb9a4 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -498,6 +498,34 @@ static void __init kvm_apf_trap_init(void)
 	update_intr_gate(X86_TRAP_PF, async_page_fault);
 }
 
+static DEFINE_PER_CPU(cpumask_var_t, __pv_tlb_mask);
+
+static void kvm_flush_tlb_others(const struct cpumask *cpumask,
+			const struct flush_tlb_info *info)
+{
+	u8 state;
+	int cpu;
+	struct kvm_steal_time *src;
+	struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_tlb_mask);
+
+	cpumask_copy(flushmask, cpumask);
+	/*
+	 * We have to call flush only on online vCPUs. And
+	 * queue flush_on_enter for pre-empted vCPUs
+	 */
+	for_each_cpu(cpu, flushmask) {
+		src = &per_cpu(steal_time, cpu);
+		state = READ_ONCE(src->preempted);
+		if ((state & KVM_VCPU_PREEMPTED)) {
+			if (try_cmpxchg(&src->preempted, &state,
+					state | KVM_VCPU_SHOULD_FLUSH))
+				__cpumask_clear_cpu(cpu, flushmask);
+		}
+	}
+
+	native_flush_tlb_others(flushmask, info);
+}
+
 static void __init kvm_guest_init(void)
 {
 	int i;
@@ -517,6 +545,9 @@ static void __init kvm_guest_init(void)
 		pv_time_ops.steal_clock = kvm_steal_clock;
 	}
 
+	if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH))
+		pv_mmu_ops.flush_tlb_others = kvm_flush_tlb_others;
+
 	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
 		apic_set_eoi_write(kvm_guest_apic_eoi_write);
 
@@ -598,6 +629,22 @@ static __init int activate_jump_labels(void)
 }
 arch_initcall(activate_jump_labels);
 
+static __init int kvm_setup_pv_tlb_flush(void)
+{
+	int cpu;
+
+	if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH)) {
+		for_each_possible_cpu(cpu) {
+			zalloc_cpumask_var_node(per_cpu_ptr(&__pv_tlb_mask, cpu),
+				GFP_KERNEL, cpu_to_node(cpu));
+		}
+		pr_info("KVM setup remote TLB flush\n");
+	}
+
+	return 0;
+}
+arch_initcall(kvm_setup_pv_tlb_flush);
+
 #ifdef CONFIG_PARAVIRT_SPINLOCKS
 
 /* Kick a cpu by its apicid. Used to wake up a halted vcpu */
-- 
2.7.4

From 1585950650494388325@xxx Tue Dec 05 13:26:42 +0000 2017
X-GM-THRID: 1585949845089879659
X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread