From: Peter Oskolkov
Date: Tue, 14 Sep 2021 09:29:00 -0700
Subject: Re: [PATCH 2/4 v0.5] sched/umcg: RFC: add userspace atomic helpers
To: Peter Zijlstra
Cc: Jann Horn, Peter Oskolkov, Ingo Molnar, Thomas Gleixner,
    linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, Paul Turner,
    Ben Segall, Andrei Vagin, Thierry Delisle
References: <20210908184905.163787-1-posk@google.com>
    <20210908184905.163787-3-posk@google.com>

On Tue, Sep 14, 2021 at 1:09 AM Peter Zijlstra wrote:
>
> On Thu, Sep 09, 2021 at 12:06:58PM -0700, Peter Oskolkov wrote:
> > On Wed, Sep 8, 2021 at 4:39 PM Jann Horn wrote:

[...]

>
> Durr.. so yeah this is a bit of a chicken and egg problem here. We need
> a userspace page to notify we're blocked, but at the same time,
> accessing said page can get us blocked.
>
> And then worse, as Jann said, we cannot do this in the appropriate spot
> because we could be blocking on mmap_sem, so we must not require
> mmap_sem to make progress etc.. :/
>
> Now, in reality actually taking a fault for these pages is extremely
> unlikely, but if we do, there's really no option but to block and wait
> for it without notification. Tough luck there.

In the version of the patchset that I'm preparing to send, I've decided
to punt on the issue and just ask userspace to deal with locking the
memory as it sees fit: mlock() is available and, as far as I can tell,
RLIMIT_MEMLOCK is decently sized by default (6MB on Ubuntu, so locked
memory can hold more than 100k struct umcg_task instances if nothing
else uses it); if that is not enough for some special case, the limit
can be adjusted at a higher level in userspace. If we get a page fault
when we access struct umcg_task in the kernel, we just kill the task.

Does the approach seem reasonable for the initial version of the
patchset?

>
> So what we can do, is use get_user_page() on the appropriate pages
> (alignment ensure the whole umcg struct must be in a single page etc..)
> the moment a umcg task enters the kernel. For this we need some
> SYSCALL_WORK_ENTER flag.
>
> So normally a task would have ->umcg_page and ->umcg_server_page be
> NULL, the above SYSCALL_WORK_SYSCALL_UMCG flag would get_user_page() the
> self and server pages. If get_user_page() blocks, these fields would
> still be NULL and sched_submit_work() would not do anything, c'est la
> vie.
>
> Once we have the pages, any actual blocking hitting sched_submit_work()
> can do the updates without further blocking. It can then also put_page()
> and clear the ->umcg_{,server_}page pointers, because the task_work that
> will set RUNNABLE *can* suffer mmap_sem (again, unlikely, again tough
> luck if it does).
>
> The reason for put'ing the pages on blocking, is that this guarantees
> the pages are only pinned for a short amount of time, and 'never' by a
> blocked task. IOW, it's a proper transient pin and doesn't require extra
> special care or accounting.

I'd prefer to defer this smart/transient pinning of pages until later,
if mlock() solves the issue for now.

> Also, can you *please* convert that RST crud to a text file, it's
> absolutely unreadable gunk. Those documentation files should be readable
> as plain text first and foremost. That whole rendering to html crap is
> nonsense. Using a browser to read a text file is insane.

Will do. Maybe we can have both an RST and a TXT version of the
document? I think most files in /Documentation are RST...

Thanks,
Peter
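
For illustration, here is a rough userspace sketch of the mlock() approach
proposed in the reply above. It is not code from the patchset: struct
umcg_task_stub is a hypothetical 64-byte stand-in for struct umcg_task, and
the helper names umcg_memlock_capacity() and umcg_alloc_locked() are made up
for this example. The 64-byte alignment is only there so that no element can
straddle a page boundary, which is the property the quoted text asks for.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical stand-in for struct umcg_task; the real layout is defined
 * by the patchset. Size and alignment of 64 bytes are assumptions. */
struct umcg_task_stub {
	uint64_t state;
	uint64_t next_tid;
	uint64_t reserved[6];
} __attribute__((aligned(64)));

/* Rough upper bound on how many stubs fit under the soft RLIMIT_MEMLOCK.
 * The kernel accounts locked memory per page, so this is an estimate.
 * Returns 0 if the limit cannot be read. */
size_t umcg_memlock_capacity(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return 0;
	if (rl.rlim_cur == RLIM_INFINITY)
		return SIZE_MAX;
	return (size_t)(rl.rlim_cur / sizeof(struct umcg_task_stub));
}

/* Map a zeroed, page-aligned array of `count` stubs and mlock() it so the
 * pages stay resident. On success returns the array and stores the mapped
 * length in *len_out; on failure returns NULL with errno set. */
struct umcg_task_stub *umcg_alloc_locked(size_t count, size_t *len_out)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t bytes, len;
	void *p;

	assert(len_out != NULL);
	if (page <= 0 || count == 0 ||
	    count > (SIZE_MAX - (size_t)page) / sizeof(struct umcg_task_stub)) {
		errno = EINVAL;
		return NULL;
	}

	/* Round the request up to a whole number of pages. */
	bytes = count * sizeof(struct umcg_task_stub);
	len = (bytes + (size_t)page - 1) / (size_t)page * (size_t)page;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	if (mlock(p, len) != 0) {
		int saved = errno;	/* keep the mlock() error across munmap() */

		munmap(p, len);
		errno = saved;
		return NULL;
	}

	*len_out = len;
	return p;
}
```

A caller would typically check umcg_memlock_capacity() once at startup, call
umcg_alloc_locked() for the whole pool of task structs, and release the pool
with munmap() on the returned pointer and length. If mlock() fails with
ENOMEM or EPERM, the limit has to be raised outside the process, as the reply
suggests.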