From: Peter Oskolkov
Date: Thu, 9 Sep 2021 12:06:58 -0700
Subject: Re: [PATCH 2/4 v0.5] sched/umcg: RFC: add userspace atomic helpers
To: Jann Horn
Cc: Peter Oskolkov, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
    linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, Paul Turner,
    Ben Segall, Andrei Vagin, Thierry Delisle
References: <20210908184905.163787-1-posk@google.com>
    <20210908184905.163787-3-posk@google.com>

On Wed, Sep 8, 2021 at 4:39 PM Jann Horn wrote:

Thanks a lot for the reviews, Jann!

I understand how to address most of your comments. However, one issue
I'm not sure what to do about:

[...]

> If this function is not allowed to sleep, as the comment says...

[...]

> ... then I'm pretty sure you can't call fix_pagefault() here, which
> acquires the mmap semaphore (which may involve sleeping) and then goes
> through the pagefault handling path (which can also sleep for various
> reasons, like allocating memory for pagetables, loading pages from
> disk / NFS / FUSE, and so on).

: So a PF_UMCG_WORKER would be added to sched_submit_work()'s PF_*_WORKER
path to capture these tasks blocking.
The umcg_sleeping() hook added there would:

    put_user(BLOCKED, umcg_task->umcg_status);
    ...

Which is basically what I am doing here: in sched_submit_work() I need
to read/write to userspace; and we cannot sleep in sched_submit_work(),
I believe.

If you are right that it is impossible to deal with pagefaults from
within non-sleepable contexts, I see two options:

Option 1: as you suggest, pin pages holding struct umcg_task in
sys_umcg_ctl; or

Option 2: add more umcg-related kernel state to task_struct so that
reading/writing to userspace is not necessary in sched_submit_work().

The first option sounds much better from the code simplicity point of
view, but I'm not sure if it is a viable approach, i.e. I'm afraid
we'll get a hard NACK here, as a non-privileged process will be able
to force the kernel to pin a page per task/thread.

We may get around it by first pinning a limited number of pages, then
having the userspace allocate structs umcg_task on those pages, so
that a pinned page would cover more than a single task/thread. And
have a sysctl that limits the number of pinned pages per MM.

Peter Z., could you, please, comment here? Do you think pinning pages
to hold structs umcg_task is acceptable?

[...]
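
To make the accounting behind the "limited number of pinned pages" idea
concrete, here is a rough userspace sketch in C. It is not kernel code and
not part of the patchset. Every name in it (pinned_pool, pool_register_task,
the SKETCH_* constants) is hypothetical, and the 4096-byte page and 64-byte
struct umcg_task sizes are assumptions chosen only to make the arithmetic
visible. It only models the bookkeeping: several tasks share one pinned
page, a new page is pinned only when the current ones are full, and
registration fails once a per-MM page limit (standing in for the proposed
sysctl) is reached.

```c
#include <assert.h>
#include <stddef.h>

/* All names and sizes here are hypothetical, for illustration only. */
#define SKETCH_PAGE_SIZE       4096u  /* assumed page size */
#define SKETCH_UMCG_TASK_SIZE  64u    /* assumed sizeof(struct umcg_task) */
#define SKETCH_SLOTS_PER_PAGE  (SKETCH_PAGE_SIZE / SKETCH_UMCG_TASK_SIZE)

/* Per-MM bookkeeping for pages pinned to hold struct umcg_task slots. */
struct pinned_pool {
	unsigned int max_pinned_pages; /* stand-in for the per-MM sysctl limit */
	unsigned int pinned_pages;     /* pages counted as pinned so far */
	unsigned int used_slots;       /* slots handed out to tasks */
};

static void pool_init(struct pinned_pool *pool, unsigned int max_pages)
{
	assert(pool != NULL);
	pool->max_pinned_pages = max_pages;
	pool->pinned_pages = 0;
	pool->used_slots = 0;
}

/*
 * Hand out the next free slot. A new page is counted as pinned only when
 * all slots on the already pinned pages are taken. Returns the slot index,
 * or -1 if pinning another page would exceed the limit.
 */
static long pool_register_task(struct pinned_pool *pool)
{
	unsigned int capacity;

	assert(pool != NULL);
	capacity = pool->pinned_pages * SKETCH_SLOTS_PER_PAGE;
	if (pool->used_slots == capacity) {
		if (pool->pinned_pages == pool->max_pinned_pages)
			return -1;
		pool->pinned_pages++;
	}
	return (long)pool->used_slots++;
}

/* Index of the pinned page that holds a given slot. */
static unsigned int slot_page(long slot)
{
	return (unsigned int)slot / SKETCH_SLOTS_PER_PAGE;
}

/* Byte offset of a given slot within its page. */
static size_t slot_offset(long slot)
{
	return ((size_t)slot % SKETCH_SLOTS_PER_PAGE) * SKETCH_UMCG_TASK_SIZE;
}
```

Under these assumed sizes, one pinned page covers 64 tasks, so a limit of
N pages per MM bounds the pinned memory at N pages while still allowing
64 * N registered tasks. The sketch never frees slots; a real design would
also need to handle task exit and slot reuse.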