Date: Fri, 19 Aug 2022 13:51:56 -0700
Message-Id: <20220819205201.658693-1-axelrasmussen@google.com>
Subject: [PATCH v7 0/5] userfaultfd: add /dev/userfaultfd for fine grained access control
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Dave Hansen, "Dmitry V. Levin",
    Gleb Fotengauer-Malinovskiy, Hugh Dickins, Jan Kara, Jonathan Corbet,
    Mel Gorman, Mike Kravetz, Mike Rapoport, Nadav Amit, Peter Xu,
    Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, zhangyi
Cc: Axel Rasmussen, linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
    linux-mm@kvack.org, linux-security-module@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

This series is based on torvalds/master.

The series is split up like so:
- Patch 1 is a simple fixup which we should take in any case (even by itself).
- Patches 2-5 add the feature, configurable selftest support, and docs.

Why not ...?
============

- Why not /proc/[pid]/userfaultfd? Two main points (additional discussion [1]):

    - /proc/[pid]/* files are all owned by the user/group of the process, and
      they don't really support chmod/chown. So, without extending procfs it
      doesn't solve the problem this series is trying to solve.

    - The main argument *for* this was to support creating UFFDs for remote
      processes. But, that use case clearly calls for CAP_SYS_PTRACE, so to
      support this we could just use the UFFD syscall as-is.

- Why not use a syscall? Access to syscalls is generally controlled by
  capabilities. We don't have a capability which is used for userfaultfd access
  without also granting more / other permissions as well, and adding a new
  capability was rejected [2].

- It's possible a LSM could be used to control access instead, but I have some
  concerns.
  I don't think this approach would be as easy to use, particularly if we were
  to try to solve this with something heavyweight like SELinux. Maybe we could
  pursue adding a new LSM specifically for this use case, but it may be too
  narrow a case to justify that.

Changelog
=========

v6->v7:
  - Handle misc_register() failure properly by propagating the error instead of
    just WARN_ON-ing. [Greg]
  - Remove no-op open function from file_operations, since its caller detects
    the lack of an open implementation and proceeds normally anyway. [Greg]

v5->v6:
  - Modified selftest to exit with KSFT_SKIP *only* when features are
    unsupported, exiting with 1 in other error cases. [Mike]
  - Improved wording in two spots in the documentation. [Mike]
  - Picked up some Acked-by's.

v4->v5:
  - Call userfaultfd_syscall_allowed() directly in the syscall, so we don't
    have to plumb a flag into new_userfaultfd(). [Nadav]
  - Refactored run_vmtests.sh to loop over UFFD test mods. [Nadav]
  - Reworded cover letter.
  - Picked up some Acked-by's.

v3->v4:
  - Picked up an Acked-by on 5/5.
  - Updated cover letter to cover "why not ...".
  - Refactored userfaultfd_allowed() into userfaultfd_syscall_allowed(). [Peter]
  - Removed obsolete comment from a previous version. [Peter]
  - Refactored userfaultfd_open() in selftest. [Peter]
  - Reworded admin-guide documentation. [Mike, Peter]
  - Squashed 2 commits adding /dev/userfaultfd to selftest and making selftest
    configurable. [Peter]
  - Added "syscall" test modifier (the default behavior) to selftest. [Peter]

v2->v3:
  - Rebased onto linux-next/akpm-base, in order to be based on top of the
    run_vmtests.sh refactor which was merged previously.
  - Picked up some Reviewed-by's.
  - Fixed ioctl definition (_IO instead of _IOWR), and stopped using
    compat_ptr_ioctl since it is unneeded for ioctls which don't take a pointer.
  - Removed the "handle_kernel_faults" bool, simplifying the code. The result is
    logically equivalent, but simpler.
  - Fixed userfaultfd selftest so it returns KSFT_SKIP appropriately.
  - Reworded documentation per Shuah's feedback on v2.
  - Improved example usage for userfaultfd selftest.

v1->v2:
  - Add documentation update.
  - Test *both* userfaultfd(2) and /dev/userfaultfd via the selftest.

[1]: https://patchwork.kernel.org/project/linux-mm/cover/20220719195628.3415852-1-axelrasmussen@google.com/
[2]: https://lore.kernel.org/lkml/686276b9-4530-2045-6bd8-170e5943abe4@schaufler-ca.com/T/

Axel Rasmussen (5):
  selftests: vm: add hugetlb_shared userfaultfd test to run_vmtests.sh
  userfaultfd: add /dev/userfaultfd for fine grained access control
  userfaultfd: selftests: modify selftest to use /dev/userfaultfd
  userfaultfd: update documentation to describe /dev/userfaultfd
  selftests: vm: add /dev/userfaultfd test cases to run_vmtests.sh

 Documentation/admin-guide/mm/userfaultfd.rst | 41 ++++++++++-
 Documentation/admin-guide/sysctl/vm.rst      |  3 +
 fs/userfaultfd.c                             | 71 +++++++++++++-----
 include/uapi/linux/userfaultfd.h             |  4 ++
 tools/testing/selftests/vm/run_vmtests.sh    | 15 ++--
 tools/testing/selftests/vm/userfaultfd.c     | 76 +++++++++++++++++---
 6 files changed, 176 insertions(+), 34 deletions(-)

--
2.37.1.595.g718a3a8f04-goog