Subject: Re: [PATCH v5 01/10] capabilities: introduce CAP_PERFMON to kernel and user space
From: Alexey Budankov
Organization: Intel Corp.
Date: Wed, 22 Jan 2020 17:25:09 +0300
To: Stephen Smalley, Alexei Starovoitov
Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Ingo Molnar,
    "jani.nikula@linux.intel.com", "joonas.lahtinen@linux.intel.com",
    "rodrigo.vivi@intel.com", "benh@kernel.crashing.org", Paul Mackerras,
    Michael Ellerman, "james.bottomley@hansenpartnership.com", Serge Hallyn,
    James Morris, Will Deacon, Mark Rutland, Robert Richter,
    Alexei Starovoitov, Jiri Olsa, Andi Kleen, Stephane Eranian,
    Igor Lubashev, Alexander Shishkin, Namhyung Kim, Song Liu,
    Lionel Landwerlin, Thomas Gleixner, linux-kernel,
    "linux-security-module@vger.kernel.org", "selinux@vger.kernel.org",
    "intel-gfx@lists.freedesktop.org", "linux-parisc@vger.kernel.org",
    "linuxppc-dev@lists.ozlabs.org", linux-arm-kernel,
    "linux-perf-users@vger.kernel.org", oprofile-list@lists.sf.net,
    Andy Lutomirski
X-Mailing-List: linux-kernel@vger.kernel.org

On 22.01.2020 17:07, Stephen Smalley wrote:
> On 1/22/20 5:45 AM, Alexey Budankov wrote:
>>
>> On 21.01.2020 21:27, Alexey Budankov wrote:
>>>
>>> On 21.01.2020 20:55, Alexei Starovoitov wrote:
>>>> On Tue, Jan 21, 2020 at 9:31 AM Alexey Budankov wrote:
>>>>>
>>>>>
>>>>> On 21.01.2020 17:43, Stephen Smalley wrote:
>>>>>> On 1/20/20 6:23 AM, Alexey Budankov wrote:
>>>>>>>
>>>>>>> Introduce CAP_PERFMON capability designed to secure system performance
>>>>>>> monitoring and observability operations so that CAP_PERFMON would assist
>>>>>>> CAP_SYS_ADMIN capability in its governing role for perf_events, i915_perf
>>>>>>> and other performance monitoring and observability subsystems.
>>>>>>>
>>>>>>> CAP_PERFMON intends to harden system security and integrity during system
>>>>>>> performance monitoring and observability operations by decreasing attack
>>>>>>> surface that is available to a CAP_SYS_ADMIN privileged process [1].
>>>>>>> Providing access to system performance monitoring and observability
>>>>>>> operations under CAP_PERFMON capability singly, without the rest of
>>>>>>> CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials and
>>>>>>> makes operation more secure.
>>>>>>>
>>>>>>> CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to
>>>>>>> system performance monitoring and observability operations and balance
>>>>>>> amount of CAP_SYS_ADMIN credentials following the recommendations in the
>>>>>>> capabilities man page [1] for CAP_SYS_ADMIN: "Note: this capability is
>>>>>>> overloaded; see Notes to kernel developers, below."
>>>>>>>
>>>>>>> Although the software running under CAP_PERFMON can not ensure avoidance
>>>>>>> of related hardware issues, the software can still mitigate these issues
>>>>>>> following the official embargoed hardware issues mitigation procedure [2].
>>>>>>> The bugs in the software itself could be fixed following the standard
>>>>>>> kernel development process [3] to maintain and harden security of system
>>>>>>> performance monitoring and observability operations.
>>>>>>>
>>>>>>> [1] http://man7.org/linux/man-pages/man7/capabilities.7.html
>>>>>>> [2] https://www.kernel.org/doc/html/latest/process/embargoed-hardware-issues.html
>>>>>>> [3] https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html
>>>>>>>
>>>>>>> Signed-off-by: Alexey Budankov
>>>>>>> ---
>>>>>>>  include/linux/capability.h          | 12 ++++++++++++
>>>>>>>  include/uapi/linux/capability.h     |  8 +++++++-
>>>>>>>  security/selinux/include/classmap.h |  4 ++--
>>>>>>>  3 files changed, 21 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/linux/capability.h b/include/linux/capability.h
>>>>>>> index ecce0f43c73a..8784969d91e1 100644
>>>>>>> --- a/include/linux/capability.h
>>>>>>> +++ b/include/linux/capability.h
>>>>>>> @@ -251,6 +251,18 @@ extern bool privileged_wrt_inode_uidgid(struct user_namespace *ns, const struct
>>>>>>>  extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
>>>>>>>  extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
>>>>>>>  extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
>>>>>>> +static inline bool perfmon_capable(void)
>>>>>>> +{
>>>>>>> +    struct user_namespace *ns = &init_user_ns;
>>>>>>> +
>>>>>>> +    if (ns_capable_noaudit(ns, CAP_PERFMON))
>>>>>>> +        return ns_capable(ns, CAP_PERFMON);
>>>>>>> +
>>>>>>> +    if (ns_capable_noaudit(ns, CAP_SYS_ADMIN))
>>>>>>> +        return ns_capable(ns, CAP_SYS_ADMIN);
>>>>>>> +
>>>>>>> +    return false;
>>>>>>> +}
>>>>>>
>>>>>> Why _noaudit()?  Normally only used when a permission failure is non-fatal to the operation.  Otherwise, we want the audit message.
>>
>> So far so good, I suggest using the simplest version for v6:
>>
>> static inline bool perfmon_capable(void)
>> {
>>     return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
>> }
>>
>> It keeps the implementation simple and readable.
>> The implementation is more
>> performant in the sense of calling the API - one capable() call for CAP_PERFMON
>> privileged process.
>>
>> Yes, it bloats audit log for CAP_SYS_ADMIN privileged and unprivileged processes,
>> but this bloating also advertises and leverages using more secure CAP_PERFMON
>> based approach to use perf_event_open system call.
>
> I can live with that.  We just need to document that when you see both a CAP_PERFMON and a CAP_SYS_ADMIN audit message for a process, try only allowing CAP_PERFMON first and see if that resolves the issue.  We have a similar issue with CAP_DAC_READ_SEARCH versus CAP_DAC_OVERRIDE.

The perf security document [1] can be updated, at least, to describe
these audit logging specifics.

~Alexey

[1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
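
To illustrate the audit-logging trade-off discussed above, here is a minimal user-space sketch. It is not kernel code: mock_capable(), mock_perfmon_capable(), struct mock_task and the MOCK_CAP_* values are hypothetical names introduced only for this illustration, and the model assumes that every denied capability check produces exactly one audit record (real audit behaviour depends on the active LSM and its policy). It shows the short-circuit `||` form proposed for v6 and which denial records each kind of process would leave behind.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel capability numbers. */
enum mock_cap { MOCK_CAP_SYS_ADMIN, MOCK_CAP_PERFMON, MOCK_CAP_COUNT };

struct mock_task {
    bool held[MOCK_CAP_COUNT];        /* capabilities the task holds */
    unsigned denials[MOCK_CAP_COUNT]; /* denial records logged per capability */
};

/* Mock of capable(). Assumption: each denied check logs exactly one record. */
static bool mock_capable(struct mock_task *t, enum mock_cap cap)
{
    if (t->held[cap])
        return true;
    t->denials[cap]++;
    return false;
}

/* Same shape as the simple version suggested for v6. */
static bool mock_perfmon_capable(struct mock_task *t)
{
    return mock_capable(t, MOCK_CAP_PERFMON) ||
           mock_capable(t, MOCK_CAP_SYS_ADMIN);
}

/* Walks through the three cases from the discussion. */
static void perfmon_sketch_selftest(void)
{
    struct mock_task perfmon = { .held = { [MOCK_CAP_PERFMON] = true } };
    struct mock_task admin = { .held = { [MOCK_CAP_SYS_ADMIN] = true } };
    struct mock_task unpriv = { 0 };

    /* CAP_PERFMON holder: a single check succeeds, nothing is logged. */
    assert(mock_perfmon_capable(&perfmon));
    assert(perfmon.denials[MOCK_CAP_PERFMON] == 0);
    assert(perfmon.denials[MOCK_CAP_SYS_ADMIN] == 0);

    /* CAP_SYS_ADMIN only: access is granted, but the CAP_PERFMON
     * check is denied first and leaves one record behind. */
    assert(mock_perfmon_capable(&admin));
    assert(admin.denials[MOCK_CAP_PERFMON] == 1);
    assert(admin.denials[MOCK_CAP_SYS_ADMIN] == 0);

    /* Unprivileged: access is refused and both checks are logged. */
    assert(!mock_perfmon_capable(&unpriv));
    assert(unpriv.denials[MOCK_CAP_PERFMON] == 1);
    assert(unpriv.denials[MOCK_CAP_SYS_ADMIN] == 1);
}
```

Calling perfmon_sketch_selftest() from a main() exercises all three cases. The unprivileged case is the one Stephen describes: both a CAP_PERFMON and a CAP_SYS_ADMIN record show up for the same process, and granting only CAP_PERFMON would be the first thing to try.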