References: <1529057003-2212-1-git-send-email-yao.jin@linux.intel.com> <1529057003-2212-2-git-send-email-yao.jin@linux.intel.com>
In-Reply-To: <1529057003-2212-2-git-send-email-yao.jin@linux.intel.com>
From: Stephane Eranian
Date: Thu, 14 Jun 2018 23:02:53 -0700
Subject: Re: [PATCH v1 1/2] perf/core: Use sysctl to turn on/off dropping leaked kernel samples
To: yao.jin@linux.intel.com
Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar, Alexander Shishkin, me@kylehuey.com, LKML, Vince Weaver, Will Deacon, Namhyung Kim, Andi Kleen, "Liang, Kan", "Jin, Yao"
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 14, 2018 at 7:10 PM Jin Yao wrote:
>
> When doing sampling, for example:
>
> perf record -e cycles:u ...
>
> On workloads that do a lot of kernel entry/exits we see kernel
> samples, even though :u is specified. This is due to skid existing.
>
> This might be a security issue because it can leak kernel addresses even
> though kernel sampling support is disabled.
>
> One patch "perf/core: Drop kernel samples even though :u is specified"
> was posted in last year but it was reverted because it introduced a
> regression issue that broke the rr-project, which used sampling
> events to receive a signal on overflow. These signals were critical
> to the correct operation of rr.
>
> See '6a8a75f32357 ("Revert "perf/core: Drop kernel samples even
> though :u is specified"")' for detail.
>
> Now the idea is to use sysctl to control the dropping of leaked
> kernel samples.
>
> /sys/devices/cpu/perf_allow_sample_leakage:
>
> 0 - default, drop the leaked kernel samples.
> 1 - don't drop the leaked kernel samples.
>
> For rr it can write 1 to /sys/devices/cpu/perf_allow_sample_leakage.
>
> For example,
>
> root@skl:/tmp# cat /sys/devices/cpu/perf_allow_sample_leakage
> 0
> root@skl:/tmp# perf record -e cycles:u ./div
> root@skl:/tmp# perf report --stdio
>
>     ........  .......  .............  ................
>
>       47.01%  div      div            [.] main
>       20.74%  div      libc-2.23.so   [.] __random_r
>       15.59%  div      libc-2.23.so   [.] __random
>        8.68%  div      div            [.] compute_flag
>        4.48%  div      libc-2.23.so   [.] rand
>        3.50%  div      div            [.] rand@plt
>        0.00%  div      ld-2.23.so     [.] do_lookup_x
>        0.00%  div      ld-2.23.so     [.] memcmp
>        0.00%  div      ld-2.23.so     [.] _dl_start
>        0.00%  div      ld-2.23.so     [.] _start
>
> There is no kernel symbol reported.
>
> root@skl:/tmp# echo 1 > /sys/devices/cpu/perf_allow_sample_leakage
> root@skl:/tmp# cat /sys/devices/cpu/perf_allow_sample_leakage
> 1
> root@skl:/tmp# perf record -e cycles:u ./div
> root@skl:/tmp# perf report --stdio
>
>     ........  .......  ................  .............
>
>       47.53%  div      div               [.] main
>       20.62%  div      libc-2.23.so      [.] __random_r
>       15.32%  div      libc-2.23.so      [.] __random
>        8.66%  div      div               [.] compute_flag
>        4.53%  div      libc-2.23.so      [.] rand
>        3.34%  div      div               [.] rand@plt
>        0.00%  div      [kernel.vmlinux]  [k] apic_timer_interrupt
>        0.00%  div      libc-2.23.so      [.] intel_check_word
>        0.00%  div      ld-2.23.so        [.] brk
>        0.00%  div      [kernel.vmlinux]  [k] page_fault
>        0.00%  div      ld-2.23.so        [.] _start
>
> We can see the kernel symbols apic_timer_interrupt and page_fault.
>
> Signed-off-by: Jin Yao
> ---
>  kernel/events/core.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 58 insertions(+)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 80cca2b..7867541 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7721,6 +7721,28 @@ int perf_event_account_interrupt(struct perf_event *event)
>  	return __perf_event_account_interrupt(event, 1);
>  }
>
> +static int perf_allow_sample_leakage __read_mostly;
> +
> +static bool sample_is_allowed(struct perf_event *event, struct pt_regs *regs)
> +{
> +	int allow_leakage = READ_ONCE(perf_allow_sample_leakage);
> +
> +	if (allow_leakage)
> +		return true;
> +
> +	/*
> +	 * Due to interrupt latency (AKA "skid"), we may enter the
> +	 * kernel before taking an overflow, even if the PMU is only
> +	 * counting user events.
> +	 * To avoid leaking information to userspace, we must always
> +	 * reject kernel samples when exclude_kernel is set.
> +	 */
> +	if (event->attr.exclude_kernel && !user_mode(regs))
> +		return false;
> +

And how does that filter PEBS or LBR records?

> +	return true;
> +}
> +
>  /*
>   * Generic event overflow handling, sampling.
>   */
> @@ -7742,6 +7764,12 @@ static int __perf_event_overflow(struct perf_event *event,
>  	ret = __perf_event_account_interrupt(event, throttle);
>
>  	/*
> +	 * For security, drop the skid kernel samples if necessary.
> +	 */
> +	if (!sample_is_allowed(event, regs))
> +		return ret;
> +
> +	/*
>  	 * XXX event_limit might not quite work as expected on inherited
>  	 * events
>  	 */
> @@ -9500,9 +9528,39 @@ perf_event_mux_interval_ms_store(struct device *dev,
>  }
>  static DEVICE_ATTR_RW(perf_event_mux_interval_ms);
>
> +static ssize_t
> +perf_allow_sample_leakage_show(struct device *dev,
> +			       struct device_attribute *attr, char *page)
> +{
> +	int allow_leakage = READ_ONCE(perf_allow_sample_leakage);
> +
> +	return snprintf(page, PAGE_SIZE-1, "%d\n", allow_leakage);
> +}
> +
> +static ssize_t
> +perf_allow_sample_leakage_store(struct device *dev,
> +				struct device_attribute *attr,
> +				const char *buf, size_t count)
> +{
> +	int allow_leakage, ret;
> +
> +	ret = kstrtoint(buf, 0, &allow_leakage);
> +	if (ret)
> +		return ret;
> +
> +	if (allow_leakage != 0 && allow_leakage != 1)
> +		return -EINVAL;
> +
> +	WRITE_ONCE(perf_allow_sample_leakage, allow_leakage);
> +
> +	return count;
> +}
> +static DEVICE_ATTR_RW(perf_allow_sample_leakage);
> +
>  static struct attribute *pmu_dev_attrs[] = {
>  	&dev_attr_type.attr,
>  	&dev_attr_perf_event_mux_interval_ms.attr,
> +	&dev_attr_perf_allow_sample_leakage.attr,
>  	NULL,
>  };
>  ATTRIBUTE_GROUPS(pmu_dev);
> --
> 2.7.4
>
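
For reference, the decision that sample_is_allowed() makes in the quoted
hunk can be modeled in plain C outside the kernel. This is only a sketch
under stated assumptions: struct sample_ctx and sample_is_allowed_model()
are hypothetical names, and the two booleans stand in for
event->attr.exclude_kernel and user_mode(regs). The model looks only at
those two inputs, so it does not cover the PEBS/LBR question asked above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state the quoted patch consults. */
struct sample_ctx {
	bool exclude_kernel;	/* models event->attr.exclude_kernel (:u) */
	bool in_user_mode;	/* models user_mode(regs) at overflow time */
};

/*
 * Userspace model of the decision order in the quoted sample_is_allowed():
 * the knob short-circuits everything; otherwise a sample is dropped only
 * when the event excludes the kernel and the interrupt landed in the kernel.
 */
static bool sample_is_allowed_model(int allow_leakage,
				    const struct sample_ctx *s)
{
	if (allow_leakage)
		return true;

	if (s->exclude_kernel && !s->in_user_mode)
		return false;

	return true;
}
```

With the knob at 0, a sample with exclude_kernel set that was taken in
kernel mode is rejected; setting the knob to 1 lets the same sample
through, which matches the two perf report runs in the changelog.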