Date: Fri, 28 Jul 2017 10:31:23 -0700
From: "Paul E. McKenney"
To: Andrew Hunter
Cc: Avi Kivity, Maged Michael, Geoffrey Romer, lkml, Mathieu Desnoyers
Subject: Re: Udpated sys_membarrier() speedup patch, FYI
Reply-To: paulmck@linux.vnet.ibm.com
References: <20170727181250.GA20183@linux.vnet.ibm.com> <5c8c6946-ce3a-6183-76a2-027823a9948a@scylladb.com> <20170727194322.GL3730@linux.vnet.ibm.com>
Message-Id: <20170728173123.GH3730@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 28, 2017 at
10:15:49AM -0700, Andrew Hunter wrote:
> On Thu, Jul 27, 2017 at 12:43 PM, Paul E. McKenney wrote:
> > On Thu, Jul 27, 2017 at 10:20:14PM +0300, Avi Kivity wrote:
> >> IPIing only running threads of my process would be perfect. In fact
> >> I might even be able to make use of "membarrier these threads
> >> please" to reduce IPIs, when I change the topology from fully
> >> connected to something more sparse, on larger machines.
> >
>
> We do this as well--sometimes we only need RSEQ fences against
> specific CPU(s), and thus pass a subset.

Sounds like a good future enhancement, probably requiring a new syscall
to accommodate the cpumask.

> > +static void membarrier_private_expedited_ipi_each(void)
> > +{
> > +	int cpu;
> > +
> > +	for_each_online_cpu(cpu) {
> > +		struct task_struct *p;
> > +
> > +		rcu_read_lock();
> > +		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
> > +		if (p && p->mm == current->mm)
> > +			smp_call_function_single(cpu, ipi_mb, NULL, 1);
> > +		rcu_read_unlock();
> > +	}
> > +}
> > +
>
> We have the (simpler imho)
>
> const struct cpumask *mask = mm_cpumask(mm);
> /* possibly AND it with a user requested mask */
> smp_call_function_many(mask, ipi_func, ....);
>
> which I think will be faster on some archs (that support broadcast)
> and have fewer problems with out of sync values (though we do have to
> check in our IPI function that we haven't context switched out.
>
> Am I missing why this won't work?

My impression is that some architectures don't provide the needed
ordering in this case, and also that some architectures support ASIDs
and would thus IPI CPUs that weren't actually running threads in the
process at the current time.

Mathieu, anything I am missing?

							Thanx, Paul
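
For readers following the thread, the difference between the two targeting
strategies discussed above can be sketched as a small userspace model in C.
This is not kernel code: toy_mm, toy_cpu, toy_switch_mm() and the lazy_clear
flag are hypothetical stand-ins invented for illustration, and "lazy clearing"
is only an assumed model of how an ASID-style architecture might leave a CPU's
bit set in mm_cpumask() after the task has switched away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPUS 8	/* hypothetical machine size for this model */

/* Toy stand-in for an address space; cpumask models mm_cpumask(mm). */
struct toy_mm {
	unsigned long cpumask;	/* one bit per CPU */
};

/* Toy stand-in for a runqueue; curr_mm models cpu_rq(cpu)->curr->mm. */
struct toy_cpu {
	struct toy_mm *curr_mm;	/* NULL models an idle CPU or kernel thread */
};

/*
 * Model of a context switch on one CPU.  With lazy_clear set (the assumed
 * ASID-style behavior), the outgoing mm keeps this CPU's bit in its mask.
 */
static void toy_switch_mm(struct toy_cpu *cpus, int cpu,
			  struct toy_mm *next, bool lazy_clear)
{
	struct toy_mm *prev;

	assert(cpu >= 0 && cpu < NCPUS);
	prev = cpus[cpu].curr_mm;
	if (prev && prev != next && !lazy_clear)
		prev->cpumask &= ~(1UL << cpu);
	if (next)
		next->cpumask |= 1UL << cpu;
	cpus[cpu].curr_mm = next;
}

/* Strategy from the quoted patch: scan CPUs, pick those now running mm. */
static unsigned long targets_by_curr_scan(const struct toy_cpu *cpus,
					  const struct toy_mm *mm)
{
	unsigned long targets = 0;
	int cpu;

	for (cpu = 0; cpu < NCPUS; cpu++)
		if (cpus[cpu].curr_mm == mm)
			targets |= 1UL << cpu;
	return targets;
}

/* Strategy from Andrew's reply: use the per-mm mask, ANDed with a request. */
static unsigned long targets_by_mm_cpumask(const struct toy_mm *mm,
					   unsigned long requested)
{
	return mm->cpumask & requested;
}

/* Handler-side check: has this CPU already switched away from mm? */
static bool toy_ipi_still_relevant(const struct toy_cpu *cpus, int cpu,
				   const struct toy_mm *mm)
{
	assert(cpu >= 0 && cpu < NCPUS);
	return cpus[cpu].curr_mm == mm;
}
```

In this model, a CPU that ran the process and then switched to another one
with lazy_clear set remains a target of the mask-based strategy but not of the
scan, which is the extra-IPI concern raised above. The handler-side check lets
such a CPU recognize that the IPI no longer applies to it, but the interrupt
has still been delivered.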