Date: Tue, 14 Jan 2020 21:04:56 +0900
From: Masami Hiramatsu
To: Alexei Starovoitov
Cc: Alexey Budankov, Arnaldo Carvalho de Melo, Song Liu, Peter Zijlstra,
    Ingo Molnar, "jani.nikula@linux.intel.com",
    "joonas.lahtinen@linux.intel.com", "rodrigo.vivi@intel.com",
    Alexei Starovoitov, Benjamin Herrenschmidt, Paul Mackerras,
    Michael Ellerman,
    "james.bottomley@hansenpartnership.com", Serge Hallyn, James Morris,
    Will Deacon, Mark Rutland, Casey Schaufler, Robert Richter, Jiri Olsa,
    Andi Kleen, Stephane Eranian, Igor Lubashev, Alexander Shishkin,
    Namhyung Kim, linux-kernel
Subject: Re: [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process
Message-Id: <20200114210456.c9e098d18ccb77cdf6b6c633@kernel.org>
References: <20200108160713.GI2844@hirez.programming.kicks-ass.net>
    <20200110140234.GO2844@hirez.programming.kicks-ass.net>
    <20200111005213.6dfd98fb36ace098004bde0e@kernel.org>
    <20200110164531.GA2598@kernel.org>
    <20200111084735.0ff01c758bfbfd0ae2e1f24e@kernel.org>
    <2B79131A-3F76-47F5-AAB4-08BCA820473F@fb.com>
    <5e191833.1c69fb81.8bc25.a88c@mx.google.com>
    <158a4033-f8d6-8af7-77b0-20e62ec913b0@linux.intel.com>
    <20200114122506.3cf442dc189a649d4736f86e@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 13 Jan 2020 21:17:49 -0800
Alexei Starovoitov wrote:

> On Mon, Jan 13, 2020 at 7:25 PM Masami Hiramatsu wrote:
> >
> > On Sat, 11 Jan 2020 12:57:18 +0300
> > Alexey Budankov wrote:
> >
> > >
> > > On 11.01.2020 3:35, arnaldo.melo@gmail.com wrote:
> > > > Message-ID:
> > > >
> > > > On January 10, 2020 9:23:27 PM GMT-03:00, Song Liu wrote:
> > > >>
> > > >>
> > > >>> On Jan 10, 2020, at 3:47 PM, Masami Hiramatsu wrote:
> > > >>>
> > > >>> On Fri, 10 Jan 2020 13:45:31 -0300
> > > >>> Arnaldo Carvalho de Melo wrote:
> > > >>>
> > > >>>> Em Sat, Jan 11, 2020 at 12:52:13AM +0900, Masami Hiramatsu escreveu:
> > > >>>>> On Fri, 10 Jan 2020 15:02:34 +0100 Peter Zijlstra wrote:
> > > >>>>>> Again, this only allows attaching to previously created kprobes, it does
> > > >>>>>> not allow creating kprobes, right?
> > > >>>>
> > > >>>>>> That is; I don't think CAP_SYS_PERFMON should be allowed to create
> > > >>>>>> kprobes.
> > > >>>>
> > > >>>>>> As might be clear; I don't actually know what the user-ABI is for
> > > >>>>>> creating kprobes.
> > > >>>>
> > > >>>>> There are 2 ABIs nowadays, ftrace and ebpf. perf-probe uses ftrace
> > > >>>>> interface to define new kprobe events, and those events are treated
> > > >>>>> as completely same as tracepoint events. On the other hand, ebpf
> > > >>>>> tries to define new probe event via perf_event interface. Above one
> > > >>>>> is that interface. IOW, it creates new kprobe.
> > > >>>>
> > > >>>> Masami, any plans to make 'perf probe' use the perf_event_open()
> > > >>>> interface for creating kprobes/uprobes?
> > > >>>
> > > >>> Would you mean perf probe to switch to perf_event_open()?
> > > >>> No, perf probe is for setting up the ftrace probe events. I think we
> > > >>> can add an option to use perf_event_open(). But current kprobe
> > > >>> creation from perf_event_open() is separated from ftrace by design.
> > > >>
> > > >> I guess we can extend event parser to understand kprobe directly.
> > > >> Instead of
> > > >>
> > > >> perf probe kernel_func
> > > >> perf stat/record -e probe:kernel_func ...
> > > >>
> > > >> We can just do
> > > >>
> > > >> perf stat/record -e kprobe:kernel_func ...
> > > >
> > > >
> > > > You took the words from my mouth, exactly, that is a perfect use case,
> > > > an alternative to the 'perf probe' one of making a disabled event that
> > > > then gets activated via record/stat/trace, in many cases it's better,
> > > > removes the explicit probe setup case.
> > >
> > > Arnaldo, Masami, Song,
> > >
> > > What do you think about making this also open to CAP_SYS_PERFMON privileged processes?
> > > Could you please also review and comment on patch 5/9 for bpf_trace.c?
> >
> > As we talked at RFC series of CAP_SYS_TRACING last year, I just expected
> > to open it for enabling/disabling kprobes, not for creation.
> >
> > If we can accept user who has no admin privilege but the CAP_SYS_PERFMON,
> > to shoot their foot by their own risk, I'm OK to allow it. (Even though,
> > it should check the max number of probes to be created by something like
> > ulimit)
> > I think nowadays we have fixed all such kernel crash problems on x86,
> > but not sure for other archs, especially on the devices I can not reach.
> > I need more help to stabilize it.
>
> I don't see how enable/disable is any safer than creation.
> If there are kernel bugs in kprobes the kernel will crash anyway.

Why? An admin can test the probes before they are used via bpf.
My point was that only the admin can make the decision to allow (or
delegate) the privilege to a user, and if that is OK, I don't mind it.
(Maybe it is better to provide a knob, for admin only, to allow this CAP.)

> I think such partial CAP_SYS_PERFMON would be very confusing to the users.
> CAP_* is about delegation of root privileges to non-root.
> Delegating some of it is ok, but disallowing creation makes it useless
> for bpf tracing, so we would need to add another CAP later.
> Hence I suggest to do it right away instead of breaking
> sys_perf_event_open() access into two CAPs.

I understand that a single strong CAP will be useful anyway (even if it
is CAP_SYS_ADMIN). My only concern is that if this causes an issue and
someone wants to mitigate it, it would be sad if the only way were to
disable all tracing facilities.

What about providing a sysctl to control the power of the CAP? That may
also be good from the viewpoint of system security.

Thank you,

-- 
Masami Hiramatsu
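
The sysctl idea above could be modeled roughly as follows. This is only a sketch of the policy under discussion, not kernel code and not anything taken from the patch series: the knob levels (`PerfmonPower`) and the helper `kprobe_permitted()` are hypothetical names invented for illustration. It assumes three levels: the CAP grants nothing, the CAP allows attaching to probes an admin already created, or the CAP also allows creating new kprobes.

```python
from enum import IntEnum


class PerfmonPower(IntEnum):
    """Hypothetical sysctl levels; no such knob is defined in this thread."""
    OFF = 0          # CAP_SYS_PERFMON grants no kprobe access at all
    ATTACH_ONLY = 1  # may enable/disable probes an admin already created
    CREATE = 2       # may also create new kprobes, as bpf tracing needs


def kprobe_permitted(action, is_admin, has_perfmon, power):
    """Model the access decision for an 'attach' or 'create' request."""
    if action not in ("attach", "create"):
        raise ValueError(f"unknown action: {action!r}")
    if is_admin:
        return True  # full admin privilege is not limited by the knob
    if not has_perfmon:
        return False  # neither admin nor the delegated capability
    needed = PerfmonPower.ATTACH_ONLY if action == "attach" else PerfmonPower.CREATE
    return power >= needed
```

With a knob like this, an admin who hits a kprobe problem could lower the level from `CREATE` to `ATTACH_ONLY` instead of disabling all tracing facilities, while the default could still be the single strong CAP that bpf tracing wants.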