From: Lokesh Gidra
Date: Fri, 24 Jul 2020 07:46:02 -0700
Subject: Re: [PATCH 1/2] Add UFFD_USER_MODE_ONLY
To: "Michael S. Tsirkin"
Cc: Jonathan Corbet, Alexander Viro, Luis Chamberlain, Kees Cook,
    Iurii Zaikin, Mauro Carvalho Chehab, Andrew Morton, Andy Shevchenko,
    Vlastimil Babka, Mel Gorman, Sebastian Andrzej Siewior, Peter Xu,
    Andrea Arcangeli, Mike Rapoport, Jerome Glisse, Shaohua Li,
    linux-doc@vger.kernel.org, linux-kernel, Linux FS Devel, Tim Murray,
    Minchan Kim, Sandeep Patil, Daniel Colascione, Jeffrey Vander Stoep,
    Nick Kralevich, kernel@android.com, Kalesh Singh
References: <20200423002632.224776-1-dancol@google.com>
    <20200423002632.224776-2-dancol@google.com>
    <20200724100153-mutt-send-email-mst@kernel.org>
In-Reply-To: <20200724100153-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 24, 2020 at 7:28 AM Michael S. Tsirkin wrote:
>
> On Wed, Apr 22, 2020 at 05:26:31PM -0700, Daniel Colascione wrote:
> > userfaultfd handles page faults from both user and kernel code.
> > Add a new UFFD_USER_MODE_ONLY flag for userfaultfd(2) that makes the
> > resulting userfaultfd object refuse to handle faults from kernel mode,
> > treating these faults as if SIGBUS were always raised, causing the
> > kernel code to fail with EFAULT.
> >
> > A future patch adds a knob allowing administrators to give some
> > processes the ability to create userfaultfd file objects only if they
> > pass UFFD_USER_MODE_ONLY, reducing the likelihood that these processes
> > will exploit userfaultfd's ability to delay kernel page faults to open
> > timing windows for future exploits.
> >
> > Signed-off-by: Daniel Colascione
>
> Something to add here is that there is separate work on selinux to
> support limiting specific userspace programs to only this type of
> userfaultfd.
>
> I also think Kees' comment about documenting what is the threat being solved
> including some links to external sources still applies.
>
> Finally, a question:
>
> Is there any way at all to increase security without breaking
> the assumption that copy_from_user is the same as userspace read?
>
>
> As an example of a drastical approach that might solve some issues, how
> about allocating some special memory and setting some VMA flag, then
> limiting copy from/to user to just this subset of virtual addresses?
> We can then do things like pin these pages in RAM, forbid
> madvise/userfaultfd for these addresses, etc.
>
> Affected userspace then needs to use a kind of a bounce buffer for any
> calls into kernel. This needs much more support from userspace and adds
> much more overhead, but on the flip side, affects more ways userspace
> can slow down the kernel.
>
> Was this discussed in the past? Links would be appreciated.
>

Adding Nick and Jeff to the discussion.
>
> > ---
> >  fs/userfaultfd.c                 | 7 ++++++-
> >  include/uapi/linux/userfaultfd.h | 9 +++++++++
> >  2 files changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > index e39fdec8a0b0..21378abe8f7b 100644
> > --- a/fs/userfaultfd.c
> > +++ b/fs/userfaultfd.c
> > @@ -418,6 +418,9 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
> >
> >  	if (ctx->features & UFFD_FEATURE_SIGBUS)
> >  		goto out;
> > +	if ((vmf->flags & FAULT_FLAG_USER) == 0 &&
> > +	    ctx->flags & UFFD_USER_MODE_ONLY)
> > +		goto out;
> >
> >  	/*
> >  	 * If it's already released don't get it. This avoids to loop
> > @@ -2003,6 +2006,7 @@ static void init_once_userfaultfd_ctx(void *mem)
> >
> >  SYSCALL_DEFINE1(userfaultfd, int, flags)
> >  {
> > +	static const int uffd_flags = UFFD_USER_MODE_ONLY;
> >  	struct userfaultfd_ctx *ctx;
> >  	int fd;
> >
> > @@ -2012,10 +2016,11 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
> >  	BUG_ON(!current->mm);
> >
> >  	/* Check the UFFD_* constants for consistency. */
> > +	BUILD_BUG_ON(uffd_flags & UFFD_SHARED_FCNTL_FLAGS);
> >  	BUILD_BUG_ON(UFFD_CLOEXEC != O_CLOEXEC);
> >  	BUILD_BUG_ON(UFFD_NONBLOCK != O_NONBLOCK);
> >
> > -	if (flags & ~UFFD_SHARED_FCNTL_FLAGS)
> > +	if (flags & ~(UFFD_SHARED_FCNTL_FLAGS | uffd_flags))
> >  		return -EINVAL;
> >
> >  	ctx = kmem_cache_alloc(userfaultfd_ctx_cachep, GFP_KERNEL);
> > diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
> > index e7e98bde221f..5f2d88212f7c 100644
> > --- a/include/uapi/linux/userfaultfd.h
> > +++ b/include/uapi/linux/userfaultfd.h
> > @@ -257,4 +257,13 @@ struct uffdio_writeprotect {
> >  	__u64 mode;
> >  };
> >
> > +/*
> > + * Flags for the userfaultfd(2) system call itself.
> > + */
> > +
> > +/*
> > + * Create a userfaultfd that can handle page faults only in user mode.
> > + */
> > +#define UFFD_USER_MODE_ONLY 1
> > +
> >  #endif /* _LINUX_USERFAULTFD_H */
> > --
> > 2.26.2.303.gf8c07b1a785-goog
> >
>