From: Daniel Colascione
Date: Wed, 02 May 2018 16:07:48 +0000
Subject: Re: [RFC PATCH for 4.18 00/14] Restartable Sequences
To: Mathieu Desnoyers
Cc: Peter Zijlstra, Paul McKenney, boqun.feng@gmail.com, luto@amacapital.net, davejwatson@fb.com, linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, Paul Turner, Andrew Morton, linux@arm.linux.org.uk, tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, Andrew Hunter, andi@firstfloor.org, cl@linux.com, bmaurer@fb.com, rostedt@goodmis.org, josh@joshtriplett.org, torvalds@linux-foundation.org, catalin.marinas@arm.com, will.deacon@arm.com, Michael Kerrisk-manpages, Joel Fernandes
References: <20180430224433.17407-1-mathieu.desnoyers@efficios.com> <660904075.9201.1525276988842.JavaMail.zimbra@efficios.com>
In-Reply-To: <660904075.9201.1525276988842.JavaMail.zimbra@efficios.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 2, 2018 at 9:03 AM Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> ----- On May 1, 2018, at 11:53 PM, Daniel Colascione dancol@google.com wrote:
> [...]
> >
> > I think a small enhancement to rseq would let us build a perfect userspace
> > mutex, one that spins on lock-acquire only when the lock owner is running
> > and that sleeps otherwise, freeing userspace from both specifying ad-hoc
> > spin counts and from trying to detect situations in which spinning is
> > generally pointless.
> >
> > It'd work like this: in the per-thread rseq data structure, we'd include a
> > description of a futex operation that the kernel would perform (in the
> > context of the preempted thread) upon preemption, immediately before
> > schedule(). If the futex operation itself sleeps, that's no problem: we
> > will have still accomplished our goal of running some other thread instead
> > of the preempted thread.
>
> Hi Daniel,
>
> I agree that the problem you are aiming to solve is important. Let's see
> what prevents the proposed rseq implementation from doing what you envision.
>
> The main issue here is touching userspace immediately before schedule().
> At that specific point, it's not possible to take a page fault. In the proposed
> rseq implementation, we get away with it by raising a task struct flag, and using
> it in a return to userspace notifier (where we can actually take a fault), where
> we touch the userspace TLS area.
>
> If we can find a way to solve this limitation, then the rest of your design
> makes sense to me.

Thanks for taking a look!

Why couldn't we take a page fault just before schedule? The reason we can't
take a page fault in atomic context is that doing so might call schedule.
Here, we're about to call schedule _anyway_, so what harm does it do to call
something that might call schedule? If we schedule via that call, we can
skip the manual schedule we were going to perform.
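
To make the quoted mutex idea concrete, here is a minimal userspace sketch of the decision a waiter would make. It is only an illustration under stated assumptions, not the rseq ABI and not a complete lock: the lock-word values, `adaptive_lock_step()`, and `simulate_owner_preempted()` are all hypothetical names. The last of these stands in for the futex operation that, in the proposal, the kernel would perform on the owner's behalf at preemption time. How the word is restored when the owner resumes is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock-word states; none of these come from a real ABI. */
#define LOCK_FREE          0u
#define LOCK_HELD_RUNNING  1u  /* owner is assumed to be on a CPU */
#define LOCK_HELD_SLEEPING 2u  /* owner was preempted (set by the kernel in the proposal) */

enum lock_action { ACT_ACQUIRED, ACT_SPIN, ACT_SLEEP };

struct adaptive_lock {
    _Atomic uint32_t word;
};

/*
 * One acquisition attempt. Returns what the caller should do next:
 * it already owns the lock, it should spin and retry because the owner
 * is running, or it should sleep (e.g. via futex wait) because the
 * owner is not running.
 */
static enum lock_action adaptive_lock_step(struct adaptive_lock *l)
{
    uint32_t expected = LOCK_FREE;

    if (atomic_compare_exchange_strong(&l->word, &expected, LOCK_HELD_RUNNING))
        return ACT_ACQUIRED;

    /* On failure, 'expected' holds the value that was observed. */
    return expected == LOCK_HELD_RUNNING ? ACT_SPIN : ACT_SLEEP;
}

/*
 * Stand-in for the kernel-side operation from the proposal: when the
 * owner is preempted, mark the lock so that waiters stop spinning.
 */
static void simulate_owner_preempted(struct adaptive_lock *l)
{
    uint32_t expected = LOCK_HELD_RUNNING;

    atomic_compare_exchange_strong(&l->word, &expected, LOCK_HELD_SLEEPING);
}

/* Release the lock. Waking sleeping waiters is omitted from this sketch. */
static void adaptive_unlock(struct adaptive_lock *l)
{
    assert(atomic_load(&l->word) != LOCK_FREE);
    atomic_store(&l->word, LOCK_FREE);
}
```

A waiter would call `adaptive_lock_step()` in a loop, spinning on `ACT_SPIN` and blocking on `ACT_SLEEP`. The point of the sketch is that no spin count appears anywhere: the choice between spinning and sleeping depends only on whether the owner is marked as running.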