From: Peter Xu
To: linux-kernel@vger.kernel.org
Cc: Paolo Bonzini, Hugh Dickins, Luis Chamberlain, Maxime Coquelin,
    Maya Gokhale, Jerome Glisse, Pavel Emelyanov, Johannes Weiner,
    peterx@redhat.com, Martin Cracauer, Denis Plotnikov, linux-mm@kvack.org,
    Marty McFadden, Mike Kravetz, Andrea Arcangeli, Mike Rapoport,
    Kees Cook, Mel Gorman, "Kirill A . Shutemov", linux-api@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, "Dr . David Alan Gilbert", Andrew Morton
Subject: [PATCH v2 1/1] userfaultfd/sysctl: add vm.unprivileged_userfaultfd
Date: Tue, 19 Mar 2019 11:07:22 +0800
Message-Id: <20190319030722.12441-2-peterx@redhat.com>
In-Reply-To: <20190319030722.12441-1-peterx@redhat.com>
References: <20190319030722.12441-1-peterx@redhat.com>

Add a global sysctl knob "vm.unprivileged_userfaultfd" to control
whether userfaultfd is allowed for unprivileged users.  When this is
set to zero, only privileged users (root user, or users with the
CAP_SYS_PTRACE capability) will be able to use the userfaultfd
syscalls.

Suggested-by: Andrea Arcangeli
Suggested-by: Mike Rapoport
Signed-off-by: Peter Xu
---
 Documentation/sysctl/vm.txt   | 12 ++++++++++++
 fs/userfaultfd.c              |  5 +++++
 include/linux/userfaultfd_k.h |  2 ++
 kernel/sysctl.c               | 12 ++++++++++++
 4 files changed, 31 insertions(+)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 187ce4f599a2..f146712f67bb 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -61,6 +61,7 @@ Currently, these files are in /proc/sys/vm:
 - stat_refresh
 - numa_stat
 - swappiness
+- unprivileged_userfaultfd
 - user_reserve_kbytes
 - vfs_cache_pressure
 - watermark_boost_factor
@@ -818,6 +819,17 @@ The default value is 60.
 
 ==============================================================
 
+unprivileged_userfaultfd
+
+This flag controls whether unprivileged users can use the userfaultfd
+syscalls.  Set this to 1 to allow unprivileged users to use the
+userfaultfd syscalls, or set this to 0 to restrict userfaultfd to only
+privileged users (with CAP_SYS_PTRACE capability).
+
+The default value is 1.
+
+==============================================================
+
 - user_reserve_kbytes
 
 When overcommit_memory is set to 2, "never overcommit" mode, reserve
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 89800fc7dc9d..7e856a25cc2f 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -30,6 +30,8 @@
 #include
 #include
 
+int sysctl_unprivileged_userfaultfd __read_mostly = 1;
+
 static struct kmem_cache *userfaultfd_ctx_cachep __read_mostly;
 
 enum userfaultfd_state {
@@ -1921,6 +1923,9 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
 	struct userfaultfd_ctx *ctx;
 	int fd;
 
+	if (!sysctl_unprivileged_userfaultfd && !capable(CAP_SYS_PTRACE))
+		return -EPERM;
+
 	BUG_ON(!current->mm);
 
 	/* Check the UFFD_* constants for consistency. */
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 37c9eba75c98..ac9d71e24b81 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -28,6 +28,8 @@
 #define UFFD_SHARED_FCNTL_FLAGS (O_CLOEXEC | O_NONBLOCK)
 #define UFFD_FLAGS_SET (EFD_SHARED_FCNTL_FLAGS)
 
+extern int sysctl_unprivileged_userfaultfd;
+
 extern vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason);
 
 extern ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start,
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 7578e21a711b..9b8ff1881df9 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -66,6 +66,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -1704,6 +1705,17 @@ static struct ctl_table vm_table[] = {
 		.extra1		= (void *)&mmap_rnd_compat_bits_min,
 		.extra2		= (void *)&mmap_rnd_compat_bits_max,
 	},
+#endif
+#ifdef CONFIG_USERFAULTFD
+	{
+		.procname	= "unprivileged_userfaultfd",
+		.data		= &sysctl_unprivileged_userfaultfd,
+		.maxlen		= sizeof(sysctl_unprivileged_userfaultfd),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
 #endif
 	{ }
 };
-- 
2.17.1
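
For anyone who wants to see the effect of the knob from userspace, here
is a minimal sketch; it is not part of the patch.  uffd_gate() mirrors
the check added to SYSCALL_DEFINE1(userfaultfd, ...), with plain
arguments standing in for sysctl_unprivileged_userfaultfd and
capable(CAP_SYS_PTRACE).  probe_userfaultfd() issues the real syscall
and reports 0 or a negative errno.  Both function names are
hypothetical, and the sketch assumes a Linux system whose headers
define __NR_userfaultfd.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Userspace model of the gate in the patch (hypothetical helper).
 * knob:               value of vm.unprivileged_userfaultfd (0 or 1)
 * has_cap_sys_ptrace: whether the caller holds CAP_SYS_PTRACE
 * Returns 0 if the syscall would proceed, -EPERM if it would be refused.
 */
static int uffd_gate(int knob, bool has_cap_sys_ptrace)
{
	if (!knob && !has_cap_sys_ptrace)
		return -EPERM;
	return 0;
}

/*
 * Try to create a userfaultfd (hypothetical helper).
 * Returns 0 on success, or a negative errno value on failure.
 */
static int probe_userfaultfd(void)
{
#ifdef __NR_userfaultfd
	long fd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);

	if (fd < 0)
		return -errno;
	close((int)fd);
	return 0;
#else
	/* Headers too old to know the syscall number. */
	return -ENOSYS;
#endif
}
```

With vm.unprivileged_userfaultfd set to 0, an unprivileged caller of
probe_userfaultfd() would be expected to get -EPERM.  A result of
-ENOSYS more likely means the kernel was built without
CONFIG_USERFAULTFD than that the knob refused the call.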