Date: Sun, 17 Feb 2019 17:08:05 -0500
From: Rich Felker
To: Mathieu Desnoyers
Cc: linux-kernel, "Paul E. McKenney", Peter Zijlstra, Ingo Molnar, Alexander Viro
Subject: Re: Regression in SYS_membarrier expedited
Message-ID: <20190217220805.GI23599@brightrain.aerifal.cx>
References: <20190217184800.GA16118@brightrain.aerifal.cx> <53623603.9626.1550439285362.JavaMail.zimbra@efficios.com> <20190217215235.GH23599@brightrain.aerifal.cx>
In-Reply-To: <20190217215235.GH23599@brightrain.aerifal.cx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Feb 17, 2019 at 04:52:35PM -0500, Rich Felker wrote:
> On Sun, Feb 17, 2019 at 04:34:45PM -0500, Mathieu Desnoyers wrote:
> > ----- On Feb 17, 2019, at 1:48 PM, Rich Felker dalias@libc.org wrote:
> >
> > > commit a961e40917fb14614d368d8bc9782ca4d6a8cd11 made it so that the
> > > MEMBARRIER_CMD_PRIVATE_EXPEDITED command cannot be used without first
> > > registering intent to use it. However, registration is an expensive
> > > operation since commit 3ccfebedd8cf54e291c809c838d8ad5cc00f5688, which
> > > added synchronize_sched() to it; this means it's no longer possible to
> > > lazily register intent at first use, and it's unreasonably expensive
> > > to preemptively register intent for possibly extremely-short-lived
> > > processes that will never use it. (My usage case is in libc (musl),
> > > where I can't know if the process will be short- or long-lived;
> > > unnecessary and potentially expensive syscalls can't be made
> > > preemptively, only lazily at first use.)
> > >
> > > Can we restore the functionality of MEMBARRIER_CMD_PRIVATE_EXPEDITED
> > > to work even without registration? The motivation of requiring
> > > registration seems to be:
> > >
> > >     "Registering at this time removes the need to interrupt each and
> > >     every thread in that process at the first expedited
> > >     sys_membarrier() system call."
> > >
> > > but interrupting every thread in the process is exactly what I expect,
> > > and is not a problem. What does seem like a big problem is waiting for
> > > synchronize_sched() to synchronize with an unboundedly large number of
> > > cores (vs only a few threads in the process), especially in the
> > > presence of full_nohz, where it seems like latency would be at least a
> > > few ms and possibly unbounded.
> > >
> > > Short of a working SYS_membarrier that doesn't require expensive
> > > pre-registration, I'm stuck just implementing it in userspace with
> > > signals...
> >
> > Hi Rich,
> >
> > Let me try to understand the scenario first.
> >
> > musl libc support for using membarrier private expedited
> > would require to first register membarrier private expedited for
> > the process at musl library init (typically after exec). At that stage, the
> > process is still single-threaded, right ? So there is no reason
> > to issue a synchronize_sched() (or now synchronize_rcu() in newer
> > kernels):
> >
> > membarrier_register_private_expedited()
> >
> >         if (!(atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1)) {
> >                 /*
> >                  * Ensure all future scheduler executions will observe the
> >                  * new thread flag state for this process.
> >                  */
> >                 synchronize_rcu();
> >         }
> >
> > So considering that pre-registration carefully done before the process
> > becomes multi-threaded just costs a system call (and not a synchronize_sched()),
> > does it make the pre-registration approach more acceptable ?
>
> It does get rid of the extreme cost, but I don't think it would be
> well-received by users who don't like random unnecessary syscalls at
> init time (each adding a few us of startup time cost). If it's so
> cheap, why isn't it just the default at kernel-side process creation?
> Why is there any requirement of registration to begin with? Reading
> the code, it looks like all it does is set a flag, and all this flag
> is used for is erroring-out if it's not set.

On further thought, pre-registration could be done at first
pthread_create rather than process entry, which would probably be
acceptable. But the question remains why it's needed at all, and
neither of these approaches is available to code that doesn't have the
privilege of being part of libc. For example, library code that might
be loaded via dlopen can't safely use SYS_membarrier without
introducing unbounded latency before the first use.

Rich
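
For concreteness, the lazy register-at-first-use pattern discussed in
this thread might look roughly like the following from userspace. This
is only a minimal sketch, not musl's code: the names membarrier_call,
expedited_barrier_lazy, barrier_is_registered and the registered flag
are hypothetical, and the raw syscall() is used on the assumption that
no libc wrapper is available. It needs a kernel and headers that define
the two MEMBARRIER_CMD_* constants.

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical per-process flag: set once registration has succeeded. */
static atomic_int registered;

/* Thin wrapper around the raw syscall (flags and cpu_id left at 0). */
static int membarrier_call(int cmd)
{
	return (int)syscall(SYS_membarrier, cmd, 0, 0);
}

/* Lets callers (and tests) observe whether registration happened. */
int barrier_is_registered(void)
{
	return atomic_load_explicit(&registered, memory_order_acquire);
}

/*
 * Issue a private expedited barrier, registering intent on first use.
 * Returns 0 on success, -1 with errno set on failure.
 *
 * If the process is already multi-threaded when this first runs, the
 * registration call is where the kernel-side synchronize_rcu() quoted
 * above would be paid. Two threads racing here may both register,
 * which is harmless: registering twice has no further effect.
 */
int expedited_barrier_lazy(void)
{
	if (!barrier_is_registered()) {
		if (membarrier_call(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED) != 0)
			return -1;
		atomic_store_explicit(&registered, 1, memory_order_release);
	}
	return membarrier_call(MEMBARRIER_CMD_PRIVATE_EXPEDITED);
}
```

A caller that gets -1 back (for example ENOSYS on an old kernel) would
still need some fallback, such as the signal-based approach mentioned
earlier in the thread.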