Date: Sat, 29 Jul 2017 11:23:33 +0200
From: Peter Zijlstra
To: Nicholas Piggin
Cc: Mathieu Desnoyers, "Paul E. McKenney", linux-kernel, Boqun Feng,
    Andrew Hunter, maged michael, gromer, Avi Kivity, Michael Ellerman,
    Benjamin Herrenschmidt, Palmer Dabbelt
Subject: Re: [RFC PATCH v2] membarrier: expedited private command
Message-ID: <20170729092333.GB6524@worktop.programming.kicks-ass.net>
In-Reply-To: <20170729115840.7dff4ea5@roar.ozlabs.ibm.com>

On Sat, Jul 29, 2017 at 11:58:40AM +1000, Nicholas Piggin wrote:
> I haven't had time to read the thread and understand exactly why you
> need this extra barrier, I'll do it next week. Thanks for cc'ing us
> on it.

Bottom of here:

  https://lkml.kernel.org/r/20170727135610.jwjfvyuacqzj5e4u@hirez.programming.kicks-ass.net

is probably the fastest way towards understanding the need for a
barrier after the rq->curr assignment.
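To make the ordering requirement concrete, here is a non-runnable, kernel-style sketch of the race the thread is discussing (comments only; the actual scheduler and membarrier code differ in detail). The point is that without a full barrier between the rq->curr store and the switched-to task's subsequent memory accesses, a sys_membarrier() caller that reads rq->curr and decides to skip the IPI can miss ordering against that task:

```c
/*
 * Illustrative sketch, NOT actual kernel code.
 *
 * CPU0 -- scheduler, switching from task A to task B:
 *
 *     rq->curr = B;
 *     // <-- a smp_mb() is needed somewhere here; on PPC neither
 *     //     switch_to() nor switch_mm() currently provides one
 *     ...return to user-space of B, which then does loads/stores...
 *
 * CPU1 -- task calling sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED):
 *
 *     smp_mb();
 *     p = READ_ONCE(rq->curr);      // observes B
 *     if (p->mm != current->mm)
 *         // skip the IPI to CPU0 -- only safe if the rq->curr
 *         // store above is fully ordered against B's user-space
 *         // accesses, i.e. if the missing barrier exists
 *         ;
 *     smp_mb();
 */
```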
Any barrier after that assignment is good for us, but so far it looks
like PPC (and only PPC, afaict) doesn't provide any smp_mb() after
that point.

> A smp_mb is pretty expensive on powerpc CPUs. Removing the sync from
> switch_to increased thread switch performance by 2-3%. Putting it in
> switch_mm may be a little less painful, but still we have to weigh it
> against the benefit of this new functionality. Would that be a net
> win for the average end-user? Seems unlikely.
>
> But we also don't want to lose sys_membarrier completely. Would it be
> too painful to make MEMBARRIER_CMD_PRIVATE_EXPEDITED return error, or
> make it fall back to a slower case if we decide not to implement it?

One ugly thing we've thought of is tagging each mm that has used
sys_membarrier() and only issue the smp_mb() for those. That way only
those tasks that actually rely on the syscall get to pay the price.
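The mm-tagging idea above could look roughly like the following non-runnable sketch (the field name membarrier_expedited and the hook placement are hypothetical, purely for illustration; they are not what any actual patch uses):

```c
/*
 * Illustrative sketch, NOT actual kernel code.
 *
 * In the sys_membarrier() path, mark the mm as a membarrier user:
 *
 *     // hypothetical flag in struct mm_struct
 *     WRITE_ONCE(current->mm->membarrier_expedited, 1);
 *
 * In the context-switch path, after the rq->curr assignment, pay for
 * the full barrier only when the incoming task's mm is tagged:
 *
 *     if (next->mm && READ_ONCE(next->mm->membarrier_expedited))
 *         smp_mb();
 *
 * Tasks that never call sys_membarrier() keep the cheap switch path;
 * only membarrier users see the extra sync on powerpc.
 */
```

The cost of this approach is an extra load and conditional in a hot scheduler path for everyone, which is part of why the thread calls it ugly.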