Subject: Re: [PATCH 03/10] smp,cpumask: introduce on_each_cpu_cond_mask
From: Andy Lutomirski
Date: Sun, 29 Jul 2018 08:36:47 -0700
To: Rik van Riel
Cc: Andy Lutomirski, LKML, kernel-team, Peter Zijlstra, X86 ML, Vitaly Kuznetsov,
    Ingo Molnar, Mike Galbraith, Dave Hansen, will.daecon@arm.com,
    Catalin Marinas, Benjamin Herrenschmidt
In-Reply-To: <1532865634.28585.2.camel@surriel.com>
References: <20180728215357.3249-1-riel@surriel.com> <20180728215357.3249-4-riel@surriel.com>
    <1532865634.28585.2.camel@surriel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Jul 29, 2018, at 5:00 AM, Rik van Riel wrote:
>
> On Sat, 2018-07-28 at 19:57 -0700, Andy Lutomirski wrote:
>> On Sat, Jul 28, 2018 at 2:53 PM, Rik van Riel
>> wrote:
>>> Introduce a variant of on_each_cpu_cond that iterates only over the
>>> CPUs in a cpumask, in order to avoid making callbacks for every
>>> single CPU in the system when we only need to test a subset.
>>
>> Nice.
>>
>> Although, if you want to be really fancy, you could optimize this (or
>> add a variant) that does the callback on the local CPU in parallel
>> with the remote ones.  That would give a small boost to TLB flushes.
>
> The test_func callbacks are not run remotely, but on
> the local CPU, before deciding who to send callbacks
> to.
>
> The actual IPIs are sent in parallel, if the cpumask
> allocation succeeds (it always should in many kernel
> configurations, and almost always in the rest).

What I meant is that on_each_cpu_mask does:

	smp_call_function_many(mask, func, info, wait);

	if (cpumask_test_cpu(cpu, mask)) {
		unsigned long flags;

		local_irq_save(flags);
		func(info);
		local_irq_restore(flags);
	}

So it IPIs all the remote CPUs in parallel, then waits, then does the
local work.  In principle, the local flush could be done after
triggering the IPIs but before they all finish.

> --
> All Rights Reversed.
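For the wait == false case, a minimal sketch of that overlap might look
like the following.  The helper name on_each_cpu_mask_overlap is made up
for illustration and is not an existing kernel API; the wait == true case
would additionally need a way to wait for the remote handlers to finish
after running the local callback, which smp_call_function_many() does not
expose as a separate step.

	#include <linux/smp.h>
	#include <linux/cpumask.h>
	#include <linux/irqflags.h>

	/*
	 * Hypothetical helper (illustration only): kick the remote CPUs
	 * in @mask without waiting, then run @func on the local CPU while
	 * the remote CPUs are still processing their IPIs.
	 */
	static void on_each_cpu_mask_overlap(const struct cpumask *mask,
					     smp_call_func_t func, void *info)
	{
		int cpu = get_cpu();	/* disable preemption, get this CPU's id */

		/* Send the IPIs but do not wait for the remote handlers. */
		smp_call_function_many(mask, func, info, false);

		/* Do the local work while the remote CPUs are busy. */
		if (cpumask_test_cpu(cpu, mask)) {
			unsigned long flags;

			local_irq_save(flags);
			func(info);
			local_irq_restore(flags);
		}

		put_cpu();
	}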