Date: Wed, 20 May 2020 11:03:39 -0700
From: Kees Cook
To: Andrea Arcangeli
Cc: "Michael S. Tsirkin", Daniel Colascione, Jonathan Corbet,
    Alexander Viro, Luis Chamberlain, Iurii Zaikin,
    Mauro Carvalho Chehab, Andrew Morton, Andy Shevchenko,
    Vlastimil Babka, Mel Gorman, Sebastian Andrzej Siewior, Peter Xu,
    Mike Rapoport, Jerome Glisse, Shaohua Li,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, timmurray@google.com,
    minchan@google.com, sspatil@google.com, lokeshgidra@google.com
Subject: Re: [PATCH 2/2] Add a new sysctl knob: unprivileged_userfaultfd_user_mode_only
Message-ID: <202005200921.2BD5A0ADD@keescook>
References: <20200423002632.224776-1-dancol@google.com>
    <20200423002632.224776-3-dancol@google.com>
    <20200508125054-mutt-send-email-mst@kernel.org>
    <20200508125314-mutt-send-email-mst@kernel.org>
    <20200520045938.GC26186@redhat.com>
In-Reply-To: <20200520045938.GC26186@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 20, 2020 at
12:59:38AM -0400, Andrea Arcangeli wrote:
> Hello everyone,
>
> On Fri, May 08, 2020 at 12:54:03PM -0400, Michael S. Tsirkin wrote:
> > On Fri, May 08, 2020 at 12:52:34PM -0400, Michael S. Tsirkin wrote:
> > > On Wed, Apr 22, 2020 at 05:26:32PM -0700, Daniel Colascione wrote:
> > > > This sysctl can be set to either zero or one. When zero (the default)
> > > > the system lets all users call userfaultfd with or without
> > > > UFFD_USER_MODE_ONLY, modulo other access controls. When
> > > > unprivileged_userfaultfd_user_mode_only is set to one, users without
> > > > CAP_SYS_PTRACE must pass UFFD_USER_MODE_ONLY to userfaultfd or the API
> > > > will fail with EPERM. This facility allows administrators to reduce
> > > > the likelihood that an attacker with access to userfaultfd can delay
> > > > faulting kernel code to widen timing windows for other exploits.
> > > >
> > > > Signed-off-by: Daniel Colascione
> > >
> > > The approach taken looks like a hard-coded security policy.
> > > For example, it won't be possible to set the sysctl knob
> > > in question on any system running kvm. So this is
> > > no good for any general purpose system.

Not all systems run unprivileged KVM. :)

> > > What's wrong with using a security policy for this instead?
> >
> > In fact I see the original thread already mentions selinux,
> > so it's just a question of making this controllable by
> > selinux.
>
> I agree it'd be preferable if it was not hardcoded, but then this
> patchset is also much simpler than the previous controlling it through
> selinux..
>
> I was thinking, an alternative policy that could control it without
> hard-coding it, is a seccomp-bpf filter, then you can drop 2/2 as
> well, not just 1/6-4/6.

Err, did I miss a separate 6-patch series? I can't find anything on lore.

>
> If you keep only 1/2, can't seccomp-bpf enforce userfaultfd to be
> always called with flags==0x1 without requiring extra modifications in
> the kernel?

Please no.
This is way too much overhead for something that a system owner
wants to enforce globally. A sysctl is the correct option here, IMO. If
it needs to be a per-userns sysctl, that would be fine too.

> Can't you get the feature parity with the CAP_SYS_PTRACE capability
> too, if you don't wrap those tasks with the ptrace capability under
> that seccomp filter?
>
> As far as I can tell, it's unprecedented to create a flag for a
> syscall API, with the only purpose of implementing a seccomp-bpf
> filter verifying such flag is set, but then if you want to control it
> with LSM it's even more complex than doing it with seccomp-bpf, and it
> requires more kernel code too. We could always add 2/2 later, such
> possibility won't disappear, in fact we could also add 1/6-4/6 later
> too if that is not enough.
>
> If we could begin by merging only 1/2 from this new series and be done
> with the kernel changes, because we offload the rest of the work to
> the kernel eBPF JIT, I think it'd be ideal.

I'd agree that patch 1 should land, as it appears to be required for any
further policy considerations. I'm still a big fan of a sysctl since
this is the kind of thing I would absolutely turn on globally for all my
systems.

-- 
Kees Cook
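
For readers following the thread, the check described in the quoted
commit message can be modelled in a few lines. This is only a toy
sketch in Python, not kernel code and not the patch itself: the
function name and its parameters are hypothetical, and the value 0x1
for UFFD_USER_MODE_ONLY is an assumption taken from the "flags==0x1"
remark quoted above.

```python
import errno

# Assumption: UFFD_USER_MODE_ONLY is the 0x1 flag bit mentioned in the thread.
UFFD_USER_MODE_ONLY = 0x1


def userfaultfd_allowed(sysctl_user_mode_only, has_cap_sys_ptrace, flags):
    """Return 0 if the call passes this check, or -EPERM if it is refused.

    Hypothetical model of the policy in the quoted commit message only;
    other access controls are deliberately left out.
    """
    if sysctl_user_mode_only == 0:
        return 0  # default: the flag is optional for every caller
    if has_cap_sys_ptrace:
        return 0  # callers with CAP_SYS_PTRACE are exempt
    if flags & UFFD_USER_MODE_ONLY:
        return 0  # unprivileged caller asked for user-mode faults only
    return -errno.EPERM  # unprivileged caller omitted the flag
```

With the knob set to one, an unprivileged caller passing flags of zero
is the only combination this model refuses. A seccomp-bpf filter would
have to encode the same comparison per process tree rather than once
for the whole system.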