Received: by 2002:ac0:a581:0:0:0:0:0 with SMTP id m1-v6csp6152696imm; Wed, 27 Jun 2018 03:10:05 -0700 (PDT) X-Google-Smtp-Source: AAOMgpd7cvpXprYVQqxtjQoMA3Jvvi4igNR5yumZhk/ABeHWKj5QOHIyrELUrqtoOD5MLkgkmIFW X-Received: by 2002:a62:6941:: with SMTP id e62-v6mr5279243pfc.56.1530094205224; Wed, 27 Jun 2018 03:10:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1530094205; cv=none; d=google.com; s=arc-20160816; b=WAISV6ew/vBNE5RqJnlWfCu40tAs9axE0YMTEAax58G5Ax+UGMCwbX+KHF5TYfN8xC M+Z+Q6po3SpTekHsW8gsVxClUjMfytiIphnsf4PiDcenqK/DlCungEFK4oVynDzynQqm 1dai0k5vDeXX9RrHo/OVNFif/Gdmx0OzPkmsSpsPR8ROfeS0Bve8ilMeyQQRA8hZsTNy a6oos7vVMwf74TYaoFDTC9HRQCa2lUxfj3vizjNHqgH+SHVKCwZV/AdAxSHcqqp9gK8H E/cS3RaU+5rx/TPcVKNXW8Vkje7MDlHBt1u6qLEtpr6MxCzRwMtQVjuMJsKyHyisJt/j pkDA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=I34CbNlNTG1+Gk5M9JPAFi+PoeyjEb7vc3iIlsMpeiw=; b=wCvfYMMj66ab+QpJ34cJTOq/NkKZge5GCjLdwIfugogiav9S8m2Qsg5YNQQ/8d7H32 WxVN1vnzdZgk8dMgGXIE99hrX+O+6C0EgvuUibwkAEA7Iv+rz/BPd+LOyj7e7h9r0KPw SPX8z/fuj0+dYq85E+vTeD4VhTVeyv5jG+2zi1d0qplrRnDdiCa3+ZvPsqybdXZgyW8t PB9qwN9rR4445169o03vPCvERHT9BF1mWuqt7nRPXhPGxfZ+M7fFdGRR2TOfX/dPU+8F 6Lk/Yp1REDMKueosjFPLQQ2X8v+uUYVY/X3An70BNFppbhsMdI3CGvkjAcecwLDs1luX U16Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id s65-v6si3779087pfe.290.2018.06.27.03.09.48; Wed, 27 Jun 2018 03:10:05 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934133AbeF0JrT (ORCPT + 99 others); Wed, 27 Jun 2018 05:47:19 -0400 Received: from mga06.intel.com ([134.134.136.31]:27374 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932135AbeF0JrQ (ORCPT ); Wed, 27 Jun 2018 05:47:16 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 27 Jun 2018 02:47:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,278,1526367600"; d="scan'208";a="62571561" Received: from linux.intel.com ([10.54.29.200]) by orsmga003.jf.intel.com with ESMTP; 27 Jun 2018 02:47:15 -0700 Received: from [10.125.252.135] (abudanko-mobl.ccr.corp.intel.com [10.125.252.135]) by linux.intel.com (Postfix) with ESMTP id AE44B58011E; Wed, 27 Jun 2018 02:47:12 -0700 (PDT) Subject: Re: [RFC 3/4] perf: Allow per PMU access control To: Tvrtko Ursulin , linux-kernel@vger.kernel.org Cc: Tvrtko Ursulin , Thomas Gleixner , Peter Zijlstra , Ingo Molnar , "H. Peter Anvin" , Arnaldo Carvalho de Melo , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Madhavan Srinivasan , Andi Kleen , x86@kernel.org References: <20180626153642.5587-1-tvrtko.ursulin@linux.intel.com> <20180626153642.5587-4-tvrtko.ursulin@linux.intel.com> <59478e35-4ac9-7be5-172a-c8301d9e11fd@ursulin.net> From: Alexey Budankov Message-ID: <2d389cf9-e284-0557-159a-e70a758d357b@linux.intel.com> Date: Wed, 27 Jun 2018 12:47:10 +0300 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0 MIME-Version: 1.0 In-Reply-To: <59478e35-4ac9-7be5-172a-c8301d9e11fd@ursulin.net> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 27.06.2018 12:15, Tvrtko Ursulin wrote: > > On 26/06/18 18:25, Alexey Budankov wrote: >> Hi, >> >> On 26.06.2018 18:36, Tvrtko Ursulin wrote: >>> From: Tvrtko Ursulin >>> >>> For situations where sysadmins might want to allow different level of >>> access control for different PMUs, we start creating per-PMU >>> perf_event_paranoid controls in sysfs. >>> >>> These work in equivalent fashion as the existing perf_event_paranoid >>> sysctl, which now becomes the parent control for each PMU. >>> >>> On PMU registration the global/parent value will be inherited by each PMU, >>> as it will be propagated to all registered PMUs when the sysctl is >>> updated. >>> >>> At any later point individual PMU access controls, located in >>> /device//perf_event_paranoid, can be adjusted to achieve >>> fine grained access control. >>> >>> Signed-off-by: Tvrtko Ursulin >>> Cc: Thomas Gleixner >>> Cc: Peter Zijlstra >>> Cc: Ingo Molnar >>> Cc: "H. Peter Anvin" >>> Cc: Arnaldo Carvalho de Melo >>> Cc: Alexander Shishkin >>> Cc: Jiri Olsa >>> Cc: Namhyung Kim >>> Cc: Madhavan Srinivasan >>> Cc: Andi Kleen >>> Cc: Alexey Budankov >>> Cc: linux-kernel@vger.kernel.org >>> Cc: x86@kernel.org >>> --- >>>   include/linux/perf_event.h | 12 ++++++-- >>>   kernel/events/core.c       | 59 ++++++++++++++++++++++++++++++++++++++ >>>   kernel/sysctl.c            |  4 ++- >>>   3 files changed, 71 insertions(+), 4 deletions(-) >>> >>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h >>> index d7938d88c028..22e91cc2d9e1 100644 >>> --- a/include/linux/perf_event.h >>> +++ b/include/linux/perf_event.h >>> @@ -271,6 +271,9 @@ struct pmu { >>>       /* number of address filters this PMU can do */ >>>       unsigned int            nr_addr_filters; >>>   +    /* per PMU access control */ >>> +    int                perf_event_paranoid; >> >> It looks like it needs to be declared as atomic and atomic_read/atomic_write >> operations need to be explicitly used below in the patch as far this >> variable may be manipulated by different threads at the same time >> without explicit locking. > > It is just a write of an integer from either sysfs access or sysctl. As such I don't think going atomic would change anything. There is no RMW or increment or anything on it. > > Unless there are architectures where int stores are not atomic? But then the existing sysctl would have the same issue. So I suspect we can indeed rely on int store being atomic. Yep, aligned word read/write is atomic on Intel and there is no runtime issue currently, but the implementation itself is multithread and implicitly relies on that atomicity so my suggestion is just explicitly code that assumption :). Also, as you mentioned, that makes the arch independent part of code more portable. > > Regards, > > Tvrtko > >> >>> + >>>       /* >>>        * Fully disable/enable this PMU, can be used to protect from the PMI >>>        * as well as for lazy/batch writing of the MSRs. >>> @@ -1168,6 +1171,9 @@ extern int sysctl_perf_cpu_time_max_percent; >>>     extern void perf_sample_event_took(u64 sample_len_ns); >>>   +extern int perf_proc_paranoid_handler(struct ctl_table *table, int write, >>> +        void __user *buffer, size_t *lenp, >>> +        loff_t *ppos); >>>   extern int perf_proc_update_handler(struct ctl_table *table, int write, >>>           void __user *buffer, size_t *lenp, >>>           loff_t *ppos); >>> @@ -1180,17 +1186,17 @@ int perf_event_max_stack_handler(struct ctl_table *table, int write, >>>     static inline bool perf_paranoid_tracepoint_raw(const struct pmu *pmu) >>>   { >>> -    return sysctl_perf_event_paranoid > -1; >>> +    return pmu->perf_event_paranoid > -1; >>>   } >>>     static inline bool perf_paranoid_cpu(const struct pmu *pmu) >>>   { >>> -    return sysctl_perf_event_paranoid > 0; >>> +    return pmu->perf_event_paranoid > 0; >>>   } >>>     static inline bool perf_paranoid_kernel(const struct pmu *pmu) >>>   { >>> -    return sysctl_perf_event_paranoid > 1; >>> +    return pmu->perf_event_paranoid > 1; >>>   } >>>     extern void perf_event_init(void); >>> diff --git a/kernel/events/core.c b/kernel/events/core.c >>> index 370c89e81722..da36317dc8dc 100644 >>> --- a/kernel/events/core.c >>> +++ b/kernel/events/core.c >>> @@ -432,6 +432,24 @@ static void update_perf_cpu_limits(void) >>>     static bool perf_rotate_context(struct perf_cpu_context *cpuctx); >>>   +int perf_proc_paranoid_handler(struct ctl_table *table, int write, >>> +        void __user *buffer, size_t *lenp, >>> +        loff_t *ppos) >>> +{ >>> +    int ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos); >>> +    struct pmu *pmu; >>> + >>> +    if (ret || !write) >>> +        return ret; >>> + >>> +    mutex_lock(&pmus_lock); >>> +    list_for_each_entry(pmu, &pmus, entry) >>> +        pmu->perf_event_paranoid = sysctl_perf_event_paranoid; >>> +    mutex_unlock(&pmus_lock); >>> + >>> +    return 0; >>> +} >>> + >>>   int perf_proc_update_handler(struct ctl_table *table, int write, >>>           void __user *buffer, size_t *lenp, >>>           loff_t *ppos) >>> @@ -9425,6 +9443,41 @@ static void free_pmu_context(struct pmu *pmu) >>>       mutex_unlock(&pmus_lock); >>>   } >>>   +/* >>> + * Fine-grained access control: >>> + */ >>> +static ssize_t >>> +perf_event_paranoid_show(struct device *dev, >>> +             struct device_attribute *attr, >>> +             char *page) >>> +{ >>> +    struct pmu *pmu = dev_get_drvdata(dev); >>> + >>> +    return snprintf(page, PAGE_SIZE - 1, "%d\n", pmu->perf_event_paranoid); >>> +} >>> + >>> +static ssize_t >>> +perf_event_paranoid_store(struct device *dev, >>> +              struct device_attribute *attr, >>> +              const char *buf, size_t count) >>> +{ >>> +    struct pmu *pmu = dev_get_drvdata(dev); >>> +    int ret, val; >>> + >>> +    ret = kstrtoint(buf, 0, &val); >>> +    if (ret) >>> +        return ret; >>> + >>> +    if (val < -1 || val > 2) >>> +        return -EINVAL; >>> + >>> +    pmu->perf_event_paranoid = val; >>> + >>> +    return count; >>> +} >>> + >>> +static DEVICE_ATTR_RW(perf_event_paranoid); >>> + >>>   /* >>>    * Let userspace know that this PMU supports address range filtering: >>>    */ >>> @@ -9539,6 +9592,11 @@ static int pmu_dev_alloc(struct pmu *pmu) >>>       if (ret) >>>           goto free_dev; >>>   +    /* Add fine-grained access control attribute. */ >>> +    ret = device_create_file(pmu->dev, &dev_attr_perf_event_paranoid); >>> +    if (ret) >>> +        goto del_dev; >>> + >>>       /* For PMUs with address filters, throw in an extra attribute: */ >>>       if (pmu->nr_addr_filters) >>>           ret = device_create_file(pmu->dev, &dev_attr_nr_addr_filters); >>> @@ -9570,6 +9628,7 @@ int perf_pmu_register(struct pmu *pmu, const char *name, int type) >>>       if (!pmu->pmu_disable_count) >>>           goto unlock; >>>   +    pmu->perf_event_paranoid = sysctl_perf_event_paranoid; >>>       pmu->type = -1; >>>       if (!name) >>>           goto skip_type; >>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c >>> index 2d9837c0aff4..7f6fccb64a30 100644 >>> --- a/kernel/sysctl.c >>> +++ b/kernel/sysctl.c >>> @@ -1142,7 +1142,9 @@ static struct ctl_table kern_table[] = { >>>           .data        = &sysctl_perf_event_paranoid, >>>           .maxlen        = sizeof(sysctl_perf_event_paranoid), >>>           .mode        = 0644, >>> -        .proc_handler    = proc_dointvec, >>> +        .proc_handler   = perf_proc_paranoid_handler, >>> +        .extra1         = &neg_one, >>> +        .extra2         = &two, >>>       }, >>>       { >>>           .procname    = "perf_event_mlock_kb", >>> >> >