Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B8D5C61DA4 for ; Thu, 23 Feb 2023 00:55:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232898AbjBWAzW (ORCPT ); Wed, 22 Feb 2023 19:55:22 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39396 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232484AbjBWAzU (ORCPT ); Wed, 22 Feb 2023 19:55:20 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3FC3537717 for ; Wed, 22 Feb 2023 16:55:18 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-536c6ce8d74so92610357b3.9 for ; Wed, 22 Feb 2023 16:55:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:mime-version:in-reply-to:date:from:to :cc:subject:date:message-id:reply-to; bh=3azLqXR2vSKV5L2PaJh+3T6s1vBztgHEFOD2woEyPYw=; b=ll/kgw4EhN1QBjkzhFZB0Ail1ORJy+AyUBj3Zon2zA6YZR6bvv6MRVw4xL88tq8Az8 UNfJYuIlxPRqkMZxma4EfhQVLDz+WB3n7N3GgnS6/N71B9VGL9L1wJxicWIzPEsnhNbY jwqQtYqGUwQCJ7O/Ie7nc1bEP4GmGEJpB8DCFusadZrYSFRLnQZARIT1I0plRhGyCTgj klqga5wY1X/qhy0ohB1y16mctfilFHsYR3l8CX6UQd0Ok2HUyZRCt5JjEmEyFz3N4ySp AUxoMU/xS8Z5QEE7WS4WTYYb6USgkUVSpxiH8hOE9if2xvRinxF+p0VrjEGBhDPd5C7e 03hg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:mime-version:in-reply-to:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=3azLqXR2vSKV5L2PaJh+3T6s1vBztgHEFOD2woEyPYw=; b=iDNbsnQlmEUnn6LDWwS5ziI9BjoN9XQMZSMLT0GLdaOnnvyncC8ohmIMxCCL/1aNhx JkbUyI7gHQAV6S7Rq368YBtPz1gE4JCBxBGiucBy0ZR3bu0hrVlMOsnStt1U0opz5/+q iKaVN4r4e9CZzwJL1jOaYCkYgSMDX8rPY2vEgcSKp1XCdg8qjdOSn7Lo5IP9WBxnfcwz RfE5gSzerpky299f6DZnCqVk30xQ5nxijBGhY9ihS46xm3bXExDXtjuNxVb3Bn96KQem mlG6VBxnRz8Il7HDDEdGT6ylA+TzSJ6K5H8/2bcBWrXWg98/ZwuU3I7mFrIaXpy5cTS1 sbZQ== X-Gm-Message-State: AO0yUKUNgnN2+h5NZo04hUcSizaFHfQGWODV4din/wpLTgkmdiK5m2n/ bXhQX4CUi7wLiovdNFoFYuSyxFG5s1ktC8Qc8Q== X-Google-Smtp-Source: AK7set/bRW8kyMySNfR/6zy0z0kR68DNc8Wiwl2R9ueq8c35MSB8zjbIqdtnOK6awXZtxDJCM5SBJnFOM9ODRuSA8Q== X-Received: from ackerleytng-cloudtop.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:1f5f]) (user=ackerleytng job=sendgmr) by 2002:a05:6902:1028:b0:a27:3ecd:6 with SMTP id x8-20020a056902102800b00a273ecd0006mr761629ybt.1.1677113717462; Wed, 22 Feb 2023 16:55:17 -0800 (PST) Date: Thu, 23 Feb 2023 00:55:16 +0000 In-Reply-To: <20230216100150.yv2ehwrdcfzbdhcq@box.shutemov.name> (kirill@shutemov.name) Mime-Version: 1.0 Message-ID: Subject: Re: [RFC PATCH 1/2] mm: restrictedmem: Allow userspace to specify mount_path for memfd_restricted From: Ackerley Tng To: "Kirill A. Shutemov" Cc: kvm@vger.kernel.org, linux-api@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, qemu-devel@nongnu.org, chao.p.peng@linux.intel.com, aarcange@redhat.com, ak@linux.intel.com, akpm@linux-foundation.org, arnd@arndb.de, bfields@fieldses.org, bp@alien8.de, corbet@lwn.net, dave.hansen@intel.com, david@redhat.com, ddutile@redhat.com, dhildenb@redhat.com, hpa@zytor.com, hughd@google.com, jlayton@kernel.org, jmattson@google.com, joro@8bytes.org, jun.nakajima@intel.com, kirill.shutemov@linux.intel.com, linmiaohe@huawei.com, luto@kernel.org, mail@maciej.szmigiero.name, mhocko@suse.com, michael.roth@amd.com, mingo@redhat.com, naoya.horiguchi@nec.com, pbonzini@redhat.com, qperret@google.com, rppt@kernel.org, seanjc@google.com, shuah@kernel.org, steven.price@arm.com, tabba@google.com, tglx@linutronix.de, vannapurve@google.com, vbabka@suse.cz, vkuznets@redhat.com, wanpengli@tencent.com, wei.w.wang@intel.com, x86@kernel.org, yu.c.zhang@linux.intel.com Content-Type: text/plain; charset="UTF-8"; format=flowed; delsp=yes Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org "Kirill A. Shutemov" writes: > On Thu, Feb 16, 2023 at 12:41:16AM +0000, Ackerley Tng wrote: >> By default, the backing shmem file for a restrictedmem fd is created >> on shmem's kernel space mount. >> With this patch, an optional tmpfs mount can be specified, which will >> be used as the mountpoint for backing the shmem file associated with a >> restrictedmem fd. >> This change is modeled after how sys_open() can create an unnamed >> temporary file in a given directory with O_TMPFILE. >> This will help restrictedmem fds inherit the properties of the >> provided tmpfs mounts, for example, hugepage allocation hints, NUMA >> binding hints, etc. >> Signed-off-by: Ackerley Tng >> --- >> include/linux/syscalls.h | 2 +- >> include/uapi/linux/restrictedmem.h | 8 ++++ >> mm/restrictedmem.c | 63 +++++++++++++++++++++++++++--- >> 3 files changed, 66 insertions(+), 7 deletions(-) >> create mode 100644 include/uapi/linux/restrictedmem.h >> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h >> index f9e9e0c820c5..4b8efe9a8680 100644 >> --- a/include/linux/syscalls.h >> +++ b/include/linux/syscalls.h >> @@ -1056,7 +1056,7 @@ asmlinkage long sys_memfd_secret(unsigned int >> flags); >> asmlinkage long sys_set_mempolicy_home_node(unsigned long start, >> unsigned long len, >> unsigned long home_node, >> unsigned long flags); >> -asmlinkage long sys_memfd_restricted(unsigned int flags); >> +asmlinkage long sys_memfd_restricted(unsigned int flags, const char >> __user *mount_path); >> /* >> * Architecture-specific system calls > I'm not sure what the right practice now: do we provide string that > contains mount path or fd that represents the filesystem (returned from > fsmount(2) or open_tree(2)). > fd seems more flexible: it allows to specify unbind mounts. I tried out the suggestion of passing fds to memfd_restricted() instead of strings. One benefit I see of using fds is interface uniformity: it feels more aligned with other syscalls like fsopen(), fsconfig(), and fsmount() in terms of using and passing around fds. Other than being able to use a mount without a path attached to the mount, are there any other benefits of using fds over using the path string? Should I post the patches that allows specifying a mount using fds? Should I post them as a separate RFC, or as a new revision to this RFC?