Date: Fri, 28 Jul 2017 11:14:47 -0700
From: "Paul E. McKenney"
To: Andrew Hunter
Cc: Avi Kivity, Maged Michael, Geoffrey Romer, lkml
Subject: Re: Udpated sys_membarrier() speedup patch, FYI
Message-Id: <20170728181447.GI3730@linux.vnet.ibm.com>

On Fri, Jul 28, 2017 at 10:37:25AM -0700, Andrew Hunter wrote:
> On Thu, Jul 27, 2017 at 12:06 PM,
> Paul E. McKenney wrote:
> > IPIing only those CPUs running threads in the same process as the
> > thread invoking membarrier() would be very nice!  There is some LKML
> > discussion on this topic, which is currently circling around making this
> > determination reliable on all CPU families.  ARM and x86 are thought
> > to be OK, PowerPC is thought to require a smallish patch, MIPS is
> > a big question mark, and so on.
>
> I'm not sure what you mean by the determination or how this is arch specific?

It looks like Peter and Mathieu are well on the way to solving this,
see his latest patch.

> > But I am surprised when you say that the downgrade would not work, at
> > least if you are not running with nohz_full CPUs.  The rcu_sched_qs()
> > function simply sets a per-CPU quiescent-state flag.  The needed strong
> > ordering is instead supplied by the combination of the code starting
> > the grace period, reporting the setting of the quiescent-state flag
> > to core RCU, and the code completing the grace period.  Each non-idle
> > CPU will execute full memory barriers either in RCU_SOFTIRQ context,
> > on entry to idle, on exit from idle, or within the grace-period kthread.
> > In particular, a CPU running the same usermode thread for the entire
> > grace period will execute the needed memory barriers in RCU_SOFTIRQ
> > context shortly after taking a scheduling-clock interrupt.
>
> Recall that I need more than just a memory barrier--also to interrupt
> RSEQ critical sections in progress on those CPUs.  I know this isn't
> general purpose, I'm just saying a trivial downgrade wouldn't work for
> me. :)  It would probably be sufficient to set NOTIFY_RESUME on all
> cpus running my code (which is what my IPI function does anyway...)

OK, yes, one major goal of the slowboat sys_membarrier is to -avoid-
IPIing other CPUs, and if you need the CPUs to be IPIed, then a
non-expedited grace period isn't going to do it for you.

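For concreteness, the slowboat path as seen from userspace might look
something like the sketch below.  It assumes only the existing
MEMBARRIER_CMD_QUERY and MEMBARRIER_CMD_SHARED commands from
<linux/membarrier.h>.  The membarrier_call() and slow_membarrier() names
are made up for illustration, and none of this is taken from the patch
under discussion.

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: invoke the membarrier system call directly. */
static int membarrier_call(int cmd, int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags);
}

/*
 * Hypothetical helper: issue the non-expedited barrier if the kernel
 * advertises it.  Returns 0 on success, -1 if unavailable or on error.
 */
static int slow_membarrier(void)
{
	/* QUERY returns a bitmask of supported commands, or -1 on error. */
	int mask = membarrier_call(MEMBARRIER_CMD_QUERY, 0);

	if (mask < 0 || !(mask & MEMBARRIER_CMD_SHARED))
		return -1;

	/* Slow path: waits on a grace period instead of IPIing CPUs. */
	return membarrier_call(MEMBARRIER_CMD_SHARED, 0);
}
```

A caller that also needs in-progress RSEQ critical sections interrupted
gets nothing of the sort from slow_membarrier(), which is the point above.
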
And yes, once sys_membarrier() settles a bit, hopefully early next week,
it would be good to work out some way for RSEQ to share the sys_membarrier()
code.  Maybe RSEQ adds a bit to the flags argument or some such?

							Thanx, Paul