From: Andrew Hunter
Date: Thu, 27 Jul 2017 11:36:38 -0700
Subject: Re: Udpated sys_membarrier() speedup patch, FYI
To: "Paul E. McKenney"
Cc: avi@scylladb.com, Maged Michael, Geoffrey Romer, lkml
In-Reply-To: <20170727181250.GA20183@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 27, 2017 at 11:12 AM, Paul E. McKenney wrote:
> Hello!
> But my main question is whether the throttling shown below is acceptable
> for your use cases, namely only one expedited sys_membarrier() permitted
> per scheduling-clock period (1 millisecond on many platforms), with any
> excess being silently converted to non-expedited form.

Google doesn't use sys_membarrier (that I know of...), but we do use RSEQ fences, which implement membarrier + a little extra to interrupt RSEQ critical sections (via IPI--smp_call_function_many.) One important optimization here is that we only throw IPIs to CPUs running the same mm as current (or a subset if requested by userspace), as this is sufficient for the API guarantees we provide. I suspect a similar optimization would largely mitigate DoS concerns, no? I don't know if there are use cases not covered.

To answer your question: throttling these (or our equivalents) would be fine in terms of userspace throughput. We haven't noticed performance problems requiring such an intervention, however.

Furthermore: I wince a bit at the silent downgrade; I'd almost prefer -EAGAIN or -EBUSY. In particular, again for RSEQ fence, the downgrade simply wouldn't work; rcu_sched_qs() gets called at many points that aren't sufficiently quiescent for RSEQ (in particular, when userspace code is running!). This is solvable, but worth thinking about.
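
To make the -EAGAIN alternative concrete, here is a rough userspace model of "one expedited call per scheduling-clock period, fail the rest" instead of silently downgrading. This is only a sketch: the struct, the function name throttle_try_expedited(), and the fixed 1 ms PERIOD_NS are all made up for illustration and are not taken from the patch under discussion or from any kernel API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model only; assumes a 1 ms scheduling-clock period. */
#define PERIOD_NS 1000000ULL

struct expedited_throttle {
    uint64_t last_period; /* period index of the last expedited call */
    bool used;            /* false until the first expedited call */
};

/*
 * Returns 0 if an expedited barrier may run at time now_ns, or -EAGAIN
 * if one was already granted in the same period. The caller sees the
 * refusal explicitly and can retry or fall back on its own terms.
 */
int throttle_try_expedited(struct expedited_throttle *t, uint64_t now_ns)
{
    uint64_t period = now_ns / PERIOD_NS;

    if (t->used && t->last_period == period)
        return -EAGAIN; /* budget for this period already spent */

    t->used = true;
    t->last_period = period;
    return 0;
}
```

The point of surfacing the error is that a caller like an RSEQ fence, for which the non-expedited path would not give the needed guarantee, gets to decide what to do rather than having the semantics change underneath it.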