Subject: Re: [PATCH] mm/mincore: allow for making sys_mincore() privileged
To: Jiri Kosina, Linus Torvalds, Andrew Morton, Greg KH, Peter Zijlstra, Michal Hocko
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-api@vger.kernel.org
From: Vlastimil Babka
Date: Sat, 5 Jan 2019 20:14:57 +0100

On 5.1.2019 18:27, Jiri Kosina wrote:
> From: Jiri Kosina
>
> There are possibilities [1] how mincore() could be used as a conveyor of
> a sidechannel information about pagecache metadata.
>
> Provide vm.mincore_privileged sysctl, which makes it possible for mincore()
> to start returning -EPERM in case it's invoked by a process lacking
> CAP_SYS_ADMIN.

Haven't checked the details yet, but wouldn't it be safe if mincore() kept
working for anonymous private mappings, and the restriction were applied only
to page cache?

> The default behavior stays "mincore() can be used by anybody" in order to
> be conservative with respect to userspace behavior.

What if we lied instead of returning -EPERM, so as not to break userspace so
obviously? I guess a false positive would be the safer lie?

> [1] https://www.theregister.co.uk/2019/01/05/boffins_beat_page_cache/
>
> Signed-off-by: Jiri Kosina
> ---
>  Documentation/sysctl/vm.txt | 9 +++++++++
>  kernel/sysctl.c             | 8 ++++++++
>  mm/mincore.c                | 5 +++++
>  3 files changed, 22 insertions(+)
>
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index 187ce4f599a2..afb8635e925e 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -41,6 +41,7 @@ Currently, these files are in /proc/sys/vm:
>  - min_free_kbytes
>  - min_slab_ratio
>  - min_unmapped_ratio
> +- mincore_privileged
>  - mmap_min_addr
>  - mmap_rnd_bits
>  - mmap_rnd_compat_bits
> @@ -485,6 +486,14 @@ files and similar are considered.
>  The default is 1 percent.
>
>  ==============================================================
> +mincore_privileged:
> +
> +mincore() could be potentially used to mount a side-channel attack against
> +pagecache metadata. This sysctl provides system administrators means to
> +make it available only to processess that own CAP_SYS_ADMIN capability.
> +
> +The default is 0, which means mincore() can be used without restrictions.
> +==============================================================
>
>  mmap_min_addr
>
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 1825f712e73b..f03cb07c8dd4 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -114,6 +114,7 @@ extern unsigned int sysctl_nr_open_min, sysctl_nr_open_max;
>  #ifndef CONFIG_MMU
>  extern int sysctl_nr_trim_pages;
>  #endif
> +extern int sysctl_mincore_privileged;
>
>  /* Constants used for minimum and maximum */
>  #ifdef CONFIG_LOCKUP_DETECTOR
> @@ -1684,6 +1685,13 @@ static struct ctl_table vm_table[] = {
>  		.extra2		= (void *)&mmap_rnd_compat_bits_max,
>  	},
>  #endif
> +	{
> +		.procname	= "mincore_privileged",
> +		.data		= &sysctl_mincore_privileged,
> +		.maxlen		= sizeof(sysctl_mincore_privileged),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec,
> +	},
>  	{ }
>  };
>
> diff --git a/mm/mincore.c b/mm/mincore.c
> index 218099b5ed31..77d4928cdfaa 100644
> --- a/mm/mincore.c
> +++ b/mm/mincore.c
> @@ -21,6 +21,8 @@
>  #include
>  #include
>
> +int sysctl_mincore_privileged;
> +
>  static int mincore_hugetlb(pte_t *pte, unsigned long hmask, unsigned long addr,
>  			unsigned long end, struct mm_walk *walk)
>  {
> @@ -228,6 +230,9 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len,
>  	unsigned long pages;
>  	unsigned char *tmp;
>
> +	if (sysctl_mincore_privileged && !capable(CAP_SYS_ADMIN))
> +		return -EPERM;
> +
>  	/* Check the start address: needs to be page-aligned.. */
>  	if (start & ~PAGE_MASK)
>  		return -EINVAL;
>
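
To illustrate the anonymous private case mentioned above, here is a minimal
userspace sketch. It is only an illustration: count_resident_anon() is a
made-up helper name, nothing from the patch. It maps some anonymous private
memory, touches it, and asks mincore() which pages are resident. With the
check as posted and vm.mincore_privileged set to 1, a caller without
CAP_SYS_ADMIN would get -EPERM here even though no page cache is involved.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: map npages of anonymous
 * private memory, touch them, and ask mincore() how many are resident.
 * Returns 0 on success or a negative errno value. -EPERM is what an
 * unprivileged caller would see with the posted check enabled.
 */
int count_resident_anon(size_t npages, size_t *resident)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char *vec;
	size_t len, i;
	void *map;
	int ret = 0;

	assert(resident != NULL);
	if (page_size <= 0 || npages == 0)
		return -EINVAL;
	len = npages * (size_t)page_size;

	/* mincore() fills in one status byte per page. */
	vec = malloc(npages);
	if (!vec)
		return -ENOMEM;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED) {
		ret = -errno;
		goto out_free;
	}

	/* Touch every page so the kernel has to back it with memory. */
	memset(map, 0xaa, len);

	if (mincore(map, len, vec) != 0) {
		ret = -errno;
		goto out_unmap;
	}

	/* Only the least significant bit of each byte is defined. */
	*resident = 0;
	for (i = 0; i < npages; i++)
		*resident += vec[i] & 1;

out_unmap:
	munmap(map, len);
out_free:
	free(vec);
	return ret;
}
```

A caller would do something like "ret = count_resident_anon(16, &n);" and
treat ret == -EPERM as "mincore() is restricted on this system". Nothing in
this path reveals the page cache state of files shared with other processes,
which is why it looks like a candidate for staying unrestricted.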