Date: Sun, 17 Feb 2019 16:52:35 -0500
From: Rich Felker
To: Mathieu Desnoyers
Cc: linux-kernel, "Paul E. McKenney", Peter Zijlstra, Ingo Molnar, Alexander Viro
Subject: Re: Regression in SYS_membarrier expedited
Message-ID: <20190217215235.GH23599@brightrain.aerifal.cx>
References: <20190217184800.GA16118@brightrain.aerifal.cx> <53623603.9626.1550439285362.JavaMail.zimbra@efficios.com>
In-Reply-To: <53623603.9626.1550439285362.JavaMail.zimbra@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Feb 17, 2019 at 04:34:45PM -0500, Mathieu Desnoyers wrote:
> ----- On Feb 17, 2019, at 1:48 PM, Rich Felker dalias@libc.org wrote:
>
> > commit a961e40917fb14614d368d8bc9782ca4d6a8cd11 made it so that the
> > MEMBARRIER_CMD_PRIVATE_EXPEDITED command cannot be used without first
> > registering intent to use it. However, registration is an expensive
> > operation since commit 3ccfebedd8cf54e291c809c838d8ad5cc00f5688, which
> > added synchronize_sched() to it; this means it's no longer possible to
> > lazily register intent at first use, and it's unreasonably expensive
> > to preemptively register intent for possibly extremely-short-lived
> > processes that will never use it. (My usage case is in libc (musl),
> > where I can't know if the process will be short- or long-lived;
> > unnecessary and potentially expensive syscalls can't be made
> > preemptively, only lazily at first use.)
> >
> > Can we restore the functionality of MEMBARRIER_CMD_PRIVATE_EXPEDITED
> > to work even without registration? The motivation of requiring
> > registration seems to be:
> >
> >     "Registering at this time removes the need to interrupt each and
> >     every thread in that process at the first expedited
> >     sys_membarrier() system call."
> >
> > but interrupting every thread in the process is exactly what I expect,
> > and is not a problem. What does seem like a big problem is waiting for
> > synchronize_sched() to synchronize with an unboundedly large number of
> > cores (vs only a few threads in the process), especially in the
> > presence of full_nohz, where it seems like latency would be at least a
> > few ms and possibly unbounded.
> >
> > Short of a working SYS_membarrier that doesn't require expensive
> > pre-registration, I'm stuck just implementing it in userspace with
> > signals...
>
> Hi Rich,
>
> Let me try to understand the scenario first.
>
> musl libc support for using membarrier private expedited
> would require to first register membarrier private expedited for
> the process at musl library init (typically after exec). At that stage, the
> process is still single-threaded, right ? So there is no reason
> to issue a synchronize_sched() (or now synchronize_rcu() in newer
> kernels):
>
> membarrier_register_private_expedited()
>
>         if (!(atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1)) {
>                 /*
>                  * Ensure all future scheduler executions will observe the
>                  * new thread flag state for this process.
>                  */
>                 synchronize_rcu();
>         }
>
> So considering that pre-registration carefully done before the process
> becomes multi-threaded just costs a system call (and not a synchronize_sched()),
> does it make the pre-registration approach more acceptable ?

It does get rid of the extreme cost, but I don't think it would be
well-received by users who don't like random unnecessary syscalls at
init time (each adding a few us of startup time cost). If it's so
cheap, why isn't it just the default at kernel-side process creation?
Why is there any requirement of registration to begin with? Reading
the code, it looks like all it does is set a flag, and all this flag
is used for is erroring-out if it's not set.

Rich
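
For readers following the thread, the "lazily register intent at first
use" pattern discussed above might look roughly like the sketch below.
It is an illustration only, not musl's code: the helper names
do_membarrier() and lazy_private_expedited_barrier() and the
`registered` flag are hypothetical. Only the two MEMBARRIER_CMD_*
commands and SYS_membarrier come from the kernel's userspace headers.
Note that if the process is already multi-threaded when the first call
happens, the registration step is exactly where the
synchronize_sched()/synchronize_rcu() cost described above would be paid.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper around the raw system call. */
static int do_membarrier(int cmd, unsigned int flags)
{
    /* Third argument (cpu_id) is ignored for these commands. */
    return (int)syscall(SYS_membarrier, cmd, flags, 0);
}

/* Hypothetical flag: 0 = not yet registered, 1 = registered. */
static atomic_int registered;

/*
 * Issue a private expedited barrier, registering intent on first use.
 * Returns 0 on success, -1 with errno set on failure (for example
 * ENOSYS or EINVAL on kernels without support).
 *
 * Two threads racing here may both register; this sketch assumes
 * registration is idempotent, so the duplicate call is harmless.
 */
int lazy_private_expedited_barrier(void)
{
    if (!atomic_load_explicit(&registered, memory_order_acquire)) {
        if (do_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0) != 0)
            return -1;
        atomic_store_explicit(&registered, 1, memory_order_release);
    }
    return do_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0);
}
```

A caller would invoke lazy_private_expedited_barrier() wherever it needs
the barrier and fall back to another mechanism (such as the signal-based
approach mentioned above) when it returns -1.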