Subject: Re: [PATCH v2 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently
To: Nadav Amit, Andy Lutomirski, Dave Hansen
Cc: Borislav Petkov, Peter Zijlstra, Sasha Levin, x86@kernel.org,
 Thomas Gleixner, virtualization@lists.linux-foundation.org,
 xen-devel@lists.xenproject.org, Haiyang Zhang, "K. Y. Srinivasan",
 Stephen Hemminger, Boris Ostrovsky, Ingo Molnar, Paolo Bonzini,
 kvm@vger.kernel.org, linux-hyperv@vger.kernel.org,
 linux-kernel@vger.kernel.org
References: <20190702235151.4377-1-namit@vmware.com>
 <20190702235151.4377-5-namit@vmware.com>
From: Juergen Gross
Date: Wed, 3 Jul 2019 16:04:21 +0200
In-Reply-To: <20190702235151.4377-5-namit@vmware.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03.07.19 01:51, Nadav Amit wrote:
> To improve TLB shootdown performance, flush the remote and local TLBs
> concurrently. Introduce flush_tlb_multi() that does so. Introduce
> paravirtual versions of flush_tlb_multi() for KVM, Xen and hyper-v (Xen
> and hyper-v are only compile-tested).
>
> While the updated smp infrastructure is capable of running a function on
> a single local core, it is not optimized for this case. The multiple
> function calls and the indirect branch introduce some overhead, and
> might make local TLB flushes slower than they were before the recent
> changes.
>
> Before calling the SMP infrastructure, check if only a local TLB flush
> is needed to restore the lost performance in this common case. This
> requires to check mm_cpumask() one more time, but unless this mask is
> updated very frequently, this should impact performance negatively.
>
> Cc: "K. Y. Srinivasan"
> Cc: Haiyang Zhang
> Cc: Stephen Hemminger
> Cc: Sasha Levin
> Cc: Thomas Gleixner
> Cc: Ingo Molnar
> Cc: Borislav Petkov
> Cc: x86@kernel.org
> Cc: Juergen Gross
> Cc: Paolo Bonzini
> Cc: Dave Hansen
> Cc: Andy Lutomirski
> Cc: Peter Zijlstra
> Cc: Boris Ostrovsky
> Cc: linux-hyperv@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: virtualization@lists.linux-foundation.org
> Cc: kvm@vger.kernel.org
> Cc: xen-devel@lists.xenproject.org
> Signed-off-by: Nadav Amit
> ---
>  arch/x86/hyperv/mmu.c                 | 13 +++---
>  arch/x86/include/asm/paravirt.h       |  6 +--
>  arch/x86/include/asm/paravirt_types.h |  4 +-
>  arch/x86/include/asm/tlbflush.h       |  9 ++--
>  arch/x86/include/asm/trace/hyperv.h   |  2 +-
>  arch/x86/kernel/kvm.c                 | 11 +++--
>  arch/x86/kernel/paravirt.c            |  2 +-
>  arch/x86/mm/tlb.c                     | 65 ++++++++++++++++++++-------
>  arch/x86/xen/mmu_pv.c                 | 20 ++++++---
>  include/trace/events/xen.h            |  2 +-
>  10 files changed, 91 insertions(+), 43 deletions(-)

...

> diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
> index beb44e22afdf..19e481e6e904 100644
> --- a/arch/x86/xen/mmu_pv.c
> +++ b/arch/x86/xen/mmu_pv.c
> @@ -1355,8 +1355,8 @@ static void xen_flush_tlb_one_user(unsigned long addr)
>          preempt_enable();
>  }
>
> -static void xen_flush_tlb_others(const struct cpumask *cpus,
> -                                 const struct flush_tlb_info *info)
> +static void xen_flush_tlb_multi(const struct cpumask *cpus,
> +                                const struct flush_tlb_info *info)
>  {
>          struct {
>                  struct mmuext_op op;
> @@ -1366,7 +1366,7 @@ static void xen_flush_tlb_others(const struct cpumask *cpus,
>          const size_t mc_entry_size = sizeof(args->op) +
>                  sizeof(args->mask[0]) * BITS_TO_LONGS(num_possible_cpus());
>
> -        trace_xen_mmu_flush_tlb_others(cpus, info->mm, info->start, info->end);
> +        trace_xen_mmu_flush_tlb_multi(cpus, info->mm, info->start, info->end);
>
>          if (cpumask_empty(cpus))
>                  return;         /* nothing to do */
> @@ -1375,9 +1375,17 @@ static void xen_flush_tlb_others(const struct cpumask *cpus,
>          args = mcs.args;
>          args->op.arg2.vcpumask = to_cpumask(args->mask);
>
> -        /* Remove us, and any offline CPUS. */
> +        /* Flush locally if needed and remove us */
> +        if (cpumask_test_cpu(smp_processor_id(), to_cpumask(args->mask))) {
> +                local_irq_disable();
> +                flush_tlb_func_local(info);

I think this isn't the correct function for PV guests.

In fact it should be much easier: just don't clear the current CPU from
the mask; that's all that's needed. The hypervisor is fine with having
the current CPU in the mask and will do the right thing.


Juergen
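
For illustration, here is a minimal userspace sketch of the suggestion above. It assumes a hypervisor that flushes every vCPU named in the mask, the caller included, and that offline CPUs are still filtered out as in the existing code. All names in it (model_flush_tlb_multi, model_hypervisor_flush, vcpu_mask_t, NR_VCPUS, flush_count) are hypothetical stand-ins, not kernel or Xen interfaces; it models only the mask handling, not a real TLB flush.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, NOT kernel or Xen code. */
#define NR_VCPUS 8

typedef uint32_t vcpu_mask_t;   /* bit n set => vCPU n is a flush target */

/* Number of flushes each modelled vCPU has seen. */
static unsigned int flush_count[NR_VCPUS];

/*
 * Models the hypervisor side of a multi-vCPU flush request (assumption):
 * every vCPU named in the mask is flushed, the calling vCPU included.
 */
static void model_hypervisor_flush(vcpu_mask_t mask)
{
    for (unsigned int cpu = 0; cpu < NR_VCPUS; cpu++)
        if (mask & (1u << cpu))
            flush_count[cpu]++;
}

/*
 * Models the guest side: drop offline vCPUs only and leave the current
 * vCPU in the mask, so no separate local flush path is needed.
 * Returns the mask that was handed to the modelled hypervisor.
 */
static vcpu_mask_t model_flush_tlb_multi(vcpu_mask_t targets,
                                         vcpu_mask_t online)
{
    vcpu_mask_t mask;

    /* Only the low NR_VCPUS bits are meaningful in this model. */
    assert((targets >> NR_VCPUS) == 0);

    mask = targets & online;
    if (mask == 0)
        return 0;               /* nothing to do */

    model_hypervisor_flush(mask);
    return mask;
}
```

Calling model_flush_tlb_multi() with a target mask that includes the caller's own bit results in exactly one flush per online target, the caller among them, without any explicit local flush in the guest-side function.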