Received: by 2002:a05:6a10:a0d1:0:0:0:0 with SMTP id j17csp1961049pxa; Thu, 6 Aug 2020 22:26:27 -0700 (PDT) X-Google-Smtp-Source: ABdhPJy/O+8yx1CNak3BWnC7Hs73X+K4IQSJ9JRHeLGoEHgU+/UVr0Sur3Rjypew6BoHiaoHhcqg X-Received: by 2002:a17:906:98c1:: with SMTP id zd1mr8048112ejb.410.1596777987254; Thu, 06 Aug 2020 22:26:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1596777987; cv=none; d=google.com; s=arc-20160816; b=Mk2Khnahvl4GaMKkntdT8NCi4a60TSJ/Za4PeTtTMlHiV/lqkzQmfI1okSaX1ChQtu jaflucgz5mltEHzOVqp0eA4RFMUQyHpiDMEaw0zvwbZJ5KwJ6y0DXd5aS7AOlEY2zuYN c+WWTC7CEUQVppWA883j33N5DZiFgJ/syYTeYAPXIBxeG4eXwJWqANPZG7+H9sh1JZuH H72muvbi7kW+gkaBye9jilZndCLJiHlt/0rbSWAUJTcIdZfh5p1O2+t34vqLULd/8xqW RLd3dXByqt1SM/Ayyuer1obN2iHd2nBz5oRVE4NeTsbd5igmK8QRQ0TMzKXnhBFsJdhG A2lw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:ironport-sdr:ironport-sdr; bh=ZZGVNrTNeriD1ywnKrnEsZOB7rx5I5LomPn30xgXqsw=; b=svAfwwJJmnnWJouuARIngX10LQRsXLiTDxYktSPReEJWwGq3X6F6tRqgq0NnyWQhx5 HdCZpXSJi4Af3BV7leNnv6AIVKRKTckJMcEh25dE3hvtJ1obMxL9xnmfez3xtwUahW1b /qVLZV/+m5ydgK0dJBX+LkHXB2Wo9nqYiW1zyUr7Wj/QQa4XPfqlH3W1xSDCedpEwWQb NYxyvgPGegQB320Y83782F9uM+vkXylP9IdCD3wA8572yrfp6gcnCthfFOk4Vou/VQdV mNkCj/Ly0KlXRKZQ3refheKiawOL1UNdPu+qjX6KXWrUaLpaGfpvUebWPq7i3xOThKkX tjyA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id od24si4816787ejb.630.2020.08.06.22.26.04; Thu, 06 Aug 2020 22:26:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726150AbgHGFXO (ORCPT + 99 others); Fri, 7 Aug 2020 01:23:14 -0400 Received: from mga18.intel.com ([134.134.136.126]:62442 "EHLO mga18.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726126AbgHGFXN (ORCPT ); Fri, 7 Aug 2020 01:23:13 -0400 IronPort-SDR: +M9k1FLPEn48nFZr6Mk36CAHjbkuAFUA3+irpNgmwb+YrTjWXQAO2GbvkaniEPjHxQPmgHZpU5 crG8VpSabGYw== X-IronPort-AV: E=McAfee;i="6000,8403,9705"; a="140593998" X-IronPort-AV: E=Sophos;i="5.75,444,1589266800"; d="scan'208";a="140593998" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Aug 2020 22:23:13 -0700 IronPort-SDR: srx7j9xD8L3kwHnBSk3uQkkVaHdxhwMp8fpz/GUdYeYhNEai4rBYz8SxZyVvvfVLc1/HNMUjSU drcvr9wNMvTA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,444,1589266800"; d="scan'208";a="323658062" Received: from yjin15-mobl1.ccr.corp.intel.com (HELO [10.238.5.239]) ([10.238.5.239]) by orsmga008.jf.intel.com with ESMTP; 06 Aug 2020 22:23:09 -0700 Subject: Re: [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples To: peterz@infradead.org Cc: mingo@redhat.com, oleg@redhat.com, acme@kernel.org, jolsa@kernel.org, Linux-kernel@vger.kernel.org, ak@linux.intel.com, kan.liang@intel.com, yao.jin@intel.com, alexander.shishkin@linux.intel.com, mark.rutland@arm.com References: <20200731025617.16243-1-yao.jin@linux.intel.com> <20200731025617.16243-2-yao.jin@linux.intel.com> <20200804114900.GI2657@hirez.programming.kicks-ass.net> <4c958d61-11a7-9f3e-9e7d-d733270144a1@linux.intel.com> <20200805124454.GP2657@hirez.programming.kicks-ass.net> <797aa4de-c618-f340-ad7b-cef38c96b035@linux.intel.com> <20200806091827.GY2674@hirez.programming.kicks-ass.net> From: "Jin, Yao" Message-ID: Date: Fri, 7 Aug 2020 13:23:09 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0 MIME-Version: 1.0 In-Reply-To: <20200806091827.GY2674@hirez.programming.kicks-ass.net> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Peter, On 8/6/2020 5:18 PM, peterz@infradead.org wrote: > On Thu, Aug 06, 2020 at 10:26:29AM +0800, Jin, Yao wrote: > >>> +static struct pt_regs *sanitize_sample_regs(struct perf_event *event, struct pt_regs *regs) >>> +{ >>> + struct pt_regs *sample_regs = regs; >>> + >>> + /* user only */ >>> + if (!event->attr.exclude_kernel || !event->attr.exclude_hv || >>> + !event->attr.exclude_host || !event->attr.exclude_guest) >>> + return sample_regs; >>> + >> >> Is this condition correct? >> >> Say counting user event on host, exclude_kernel = 1 and exclude_host = 0. It >> will go "return sample_regs" path. > > I'm not sure, I'm terminally confused on virt stuff. > > Suppose we have nested virt: > > L0-hv > | > G0/L1-hv > | > G1 > > And we're running in G0, then: > > - 'exclude_hv' would exclude L0 events > - 'exclude_host' would ... exclude L1-hv events? I think the exclude_host is generally set by guest (/arch/x86/kvm/pmu.c, pmc_reprogram_counter). If G0 is a host, if we set exclude_host in G0, I think we will not be able to count the events on G0. The appropriate usage is, G1 sets the exclude_host, then the events on G0 will not be collected by guest G1. That's my understanding for the usage of exclude_host. > - 'exclude_guest' would ... exclude G1 events? > Similarly, the appropriate usage is, the host (G0) sets the exclude_guest, then the events on G1 will not be collected by host G0. If G1 sets exclude_guest, since no guest is under G1, that's ineffective. > Then the next question is, if G0 is a host, does the L1-hv run in > G0 userspace or G0 kernel space? > I'm not very sure. Maybe some in kernel, some in userspace(qemu)? Maybe some KVM experts can help to answer this question. > I was assuming G0 userspace would not include anything L1 (kvm is a > kernel module after all), but what do I know. > I have tested following conditions in native environment (not in KVM guests), the result is not expected. /* user only */ if (!event->attr.exclude_kernel || !event->attr.exclude_hv || !event->attr.exclude_host || !event->attr.exclude_guest) return sample_regs; perf record -e cycles:u ./div perf report --stdio # Overhead Command Shared Object Symbol # ........ ....... ................ ....................... # 49.51% div libc-2.27.so [.] __random_r 33.93% div libc-2.27.so [.] __random 8.13% div libc-2.27.so [.] rand 4.29% div div [.] main 4.14% div div [.] rand@plt 0.00% div [unknown] [k] 0xffffffffbd600cb0 0.00% div [unknown] [k] 0xffffffffbd600df0 0.00% div ld-2.27.so [.] _dl_relocate_object 0.00% div ld-2.27.so [.] _dl_start 0.00% div ld-2.27.so [.] _start 0xffffffffbd600cb0 and 0xffffffffbd600df0 are leaked kernel addresses. From debug, I can see: [ 6272.320258] jinyao: sanitize_sample_regs: event->attr.exclude_kernel = 1, event->attr.exclude_hv = 1, event->attr.exclude_host = 0, event->attr.exclude_guest = 0 So it goes "return sample_regs;" path. >>> @@ -11609,7 +11636,8 @@ SYSCALL_DEFINE5(perf_event_open, >>> if (err) >>> return err; >>> - if (!attr.exclude_kernel) { >>> + if (!attr.exclude_kernel || !attr.exclude_callchain_kernel || >>> + !attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest) { >>> err = perf_allow_kernel(&attr); >>> if (err) >>> return err; >>> >> >> I can understand the conditions "!attr.exclude_kernel || !attr.exclude_callchain_kernel". >> >> But I'm not very sure about the "!attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest". > > Well, I'm very sure G0 userspace should never see L0 or G1 state, so > exclude_hv and exclude_guest had better be true. > >> On host, exclude_hv = 1, exclude_guest = 1 and exclude_host = 0, right? > > Same as above, is G0 host state G0 userspace? > >> So even exclude_kernel = 1 but exclude_host = 0, we will still go >> perf_allow_kernel path. Please correct me if my understanding is wrong. > > Yes, because with those permission checks in place it means you have > permission to see kernel bits. > At the syscall entry, I also added some printk. Aug 7 03:37:40 kbl-ppc kernel: [ 854.688045] syscall: attr.exclude_kernel = 1, attr.exclude_callchain_kernel = 0, attr.exclude_hv = 0, attr.exclude_host = 0, attr.exclude_guest = 0 For my test case ("perf record -e cycles:u ./div"), the perf_allow_kernel() is also executed. Thanks Jin Yao