From: Axel Rasmussen
Date: Mon, 13 Jun 2022 15:38:02 -0700
Subject: Re: [PATCH v3 2/6] userfaultfd: add /dev/userfaultfd for fine grained access control
To: Peter Xu
Cc: Andrew Morton, Alexander Viro, Charan Teja Reddy, Dave Hansen,
    "Dmitry V . Levin", Gleb Fotengauer-Malinovskiy, Hugh Dickins,
    Jan Kara, Jonathan Corbet, Mel Gorman, Mike Kravetz, Mike Rapoport,
    Nadav Amit, Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, zhangyi,
    linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org, LKML,
    Linux MM, Linuxkselftest
References: <20220601210951.3916598-1-axelrasmussen@google.com>
    <20220601210951.3916598-3-axelrasmussen@google.com>
    <20220613145540.1c9f7750092911bae1332b92@linux-foundation.org>

On Mon, Jun 13, 2022 at 3:29 PM Peter Xu wrote:
>
> On Mon, Jun 13, 2022 at 02:55:40PM -0700, Andrew Morton wrote:
> > On Wed, 1 Jun 2022 14:09:47 -0700 Axel Rasmussen wrote:
> >
> > > To achieve this, add a /dev/userfaultfd misc device. This device
> > > provides an alternative to the userfaultfd(2) syscall for the creation
> > > of new userfaultfds. The idea is, any userfaultfds created this way will
> > > be able to handle kernel faults, without the caller having any special
> > > capabilities. Access to this mechanism is instead restricted using e.g.
> > > standard filesystem permissions.
> >
> > The use of a /dev node isn't pretty. Why can't this be done by
> > tweaking sys_userfaultfd() or by adding a sys_userfaultfd2()?

I think for any approach involving syscalls, we need to be able to
control access to who can call a syscall. Maybe there's another way
I'm not aware of, but I think today the only mechanism to do this is
capabilities. I proposed adding a CAP_USERFAULTFD for this purpose,
but that approach was rejected [1]. So, I'm not sure of another way
besides using a device node.

One thing that could potentially make this cleaner is, as one LWN
commenter pointed out, we could have open() on /dev/userfaultfd just
return a new userfaultfd directly, instead of this multi-step process
of open /dev/userfaultfd, NEW ioctl, then you get a userfaultfd. When
I wrote this originally it wasn't clear to me how to get that to
happen - open() doesn't directly return the result of our custom open
function pointer, as far as I can tell - but it could be investigated.

[1]: https://lore.kernel.org/lkml/686276b9-4530-2045-6bd8-170e5943abe4@schaufler-ca.com/T/

> >
> > Peter, will you be completing review of this patchset?
>
> Sorry to not have reviewed it proactively..
>
> I think it's because I never had a good picture/understanding of what
> should be the best security model for uffd, meanwhile I am (it seems) just
> seeing more and more ways to "provide a safer uffd" by different people
> using different ways.. and I never had time (and probably capability too..)
> to figure out the correct approach if not to accept all options provided.

Agreed, what we have right now is a bit of a mess of different
approaches. I think the reason for this is, there is no "perfect" way
to control access to features like this, so what we now have is
several different approaches with different tradeoffs. From my
perspective, the existing controls were simpler to implement, but are
not ideal because they require us to grant access to UFFD *plus more
stuff too*.

The approach I've proposed is the most granular, so it doesn't require
adding any extra permissions. But, I agree the interface is sort of
overcomplicated. :/ But, from my perspective, security in shared Cloud
computing environments where UFFD is used for live migration is
critical, so I prefer this tradeoff - I'll put up with a slightly
messier interface, if the gain is a very minimal set of privileges.

>
> I think I'll just assume the whole thing is acked already from you
> generally, then I'll read at least the implementation before the end of
> tomorrow.
>
> Thanks,
>
> --
> Peter Xu
>