Received: by 10.223.176.5 with SMTP id f5csp4393745wra; Tue, 30 Jan 2018 06:44:17 -0800 (PST) X-Google-Smtp-Source: AH8x227aDzMmwgiNDXLO4aHfgkQtEz5WnW0mjK/Q4fmAcwbAqbsnWHv7j5TLCeS9kRWzOFRVjo8L X-Received: by 2002:a17:902:6908:: with SMTP id j8-v6mr25208733plk.211.1517323457711; Tue, 30 Jan 2018 06:44:17 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1517323457; cv=none; d=google.com; s=arc-20160816; b=afpdRh2r+lJ6kGkTmWqHUX6qftMTY4U9iR8Sf9KBy5sOLYIpxqnhdrkHQsuEOhGpcj tVAjN7gc+i3Pa0NxVuLA4xw+EboJ6MHjrZeAzUYAnjgQCrhE1P/XFmDB5b0hELPdQQvi tTMJXIBDzVqyBVrX27JV5l2dPgtESk+GejndCphGDD+YWR78hPtlz6rssPifuNfp4r6y VgbOuEspWcyxC43YVTZPkz3voFtqNikXTa7qmt9/Xqw/1JKQwEbGSwAXXX8i3qcTPLeC FN/979/O7vHUdkeElHtzIUFx46DQs+GTqbn1pqtx/Ch9zqZ+XyB6C2POFNzIYs8jw7iD TT8A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=de1sJZrCXbcpxWjoT9jtQ01Yq75zCOQffK+nvEUBCj8=; b=gcmb6l3CVbZSYO+0U7xWYfE4/ITb8/3dZjE7z3pYABU8oZFH+3KgYGGlI1GlgBTfso mMcZiiBz9W5+ENc1NtxEfjvlt3jg5JLwpHDSEQ6A37WMd9m36liZB63TbFPP7uwN6Qre 24vwsguC1nYENqpnukiDpLExbhOTfBUJklyiDAVgp5lX/UgQwORollsJLYl3RewHXeUY y0fvwPa2X/8lF6SZhirE4dx223JqYelXIrN9vT8wKq6DRzn+IezVpElDsv6FXiOay2XK u2f81XVVpJ+h9h3Lx7vvVC6yvxwEH6TFy9p9RBvF/++YkC7578QuBDqGcyhZU0byuyNH /YkA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 3-v6si12036029ple.769.2018.01.30.06.44.03; Tue, 30 Jan 2018 06:44:17 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752857AbeA3Olo (ORCPT + 99 others); Tue, 30 Jan 2018 09:41:44 -0500 Received: from mga18.intel.com ([134.134.136.126]:30827 "EHLO mga18.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751433AbeA3Olm (ORCPT ); Tue, 30 Jan 2018 09:41:42 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 30 Jan 2018 06:41:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.46,435,1511856000"; d="scan'208";a="23403487" Received: from linux.intel.com ([10.54.29.200]) by orsmga003.jf.intel.com with ESMTP; 30 Jan 2018 06:41:42 -0800 Received: from [10.254.64.174] (kliang2-mobl1.ccr.corp.intel.com [10.254.64.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by linux.intel.com (Postfix) with ESMTPS id 5BF6558033D; Tue, 30 Jan 2018 06:41:41 -0800 (PST) Subject: Re: [PATCH V3 0/5] bugs fix for large PEBS mmap read and rdpmc read To: Stephane Eranian Cc: Peter Zijlstra , Ingo Molnar , LKML , Arnaldo Carvalho de Melo , Thomas Gleixner , Jiri Olsa , Andi Kleen References: <1517243373-355481-1-git-send-email-kan.liang@linux.intel.com> From: "Liang, Kan" Message-ID: <99cea6a4-c0f1-b263-ed85-ef2464505ba3@linux.intel.com> Date: Tue, 30 Jan 2018 09:41:40 -0500 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 1/30/2018 4:16 AM, Stephane Eranian wrote: > Hi, > > On Mon, Jan 29, 2018 at 8:29 AM, wrote: >> >> From: Kan Liang >> >> ------ >> >> Changes since V2: >> - Refined the changelog >> - Introduced specific read function for large PEBS. >> The previous generic PEBS read function is confusing. >> Disabled PMU in pmu::read() path for large PEBS. >> Handled the corner case when reload_times == 0. >> - Modified the parameter of intel_pmu_save_and_restart_reload() >> Discarded local64_cmpxchg >> - Added fixes tag >> - Added WARN to handle reload_times == 0 || reload_val == 0 >> >> Changes since V1: >> - Check PERF_X86_EVENT_AUTO_RELOAD before call >> intel_pmu_save_and_restore() > > It is not yet clear to me why PERF_SAMPLE_PERIOD is not allowed > with large PEBS. Large PEBS requires fixed period. So the kernel could > make up the period from the event and store it in the sampling buffer. The PERF_SAMPLE_PERIOD will be implicitly set in freq mode. By now, large PEBS doesn't support freq mode yet. > > I tried using large PEBS recently, and despite trying different option > combination of perf record, I was not able to get it to work. > > $ perf record -c 1 -e cpu/event=0xd1,umask=0x10,period=19936/pp > --no-timestamp --no-period -a -C 0 > > But I was able to make this work with a much older kernel. Is there any error message? > > Another annoyance I ran into is with perf record requiring -c period > in order not to set > PERF_SAMPLE_PERIOD in the event. > > If I do: > perf record -c 1 -e cpu/event=0xd1,umask=0x10,period=19936/pp > --no-timestamp --no-period -a -C 0 > > I get > > perf_event_attr: > type 4 > size 112 > config 0x10d1 > { sample_period, sample_freq } 199936 > sample_type IP|TID|CPU > > But if I do: > perf record -e cpu/event=0xd1,umask=0x10,period=19936/pp > --no-timestamp --no-period -a -C 0 > > I get > > perf_event_attr: > type 4 > size 112 > config 0x10d1 > { sample_period, sample_freq } 199936 > sample_type IP|TID|CPU|PERIOD > > Perf should check if all events have a period=, then it should not > pass PERF_SAMPLE_PERIOD, even > more so when only one event is defined. As my understanding, the "period=" is per-event term. It should not change the global setting. I think it's better still use "--no-period" to control the PERIOD bit. > > Also it does not seem to honor --no-period. > Right, perf tool doesn't clear the PERIOD for --no-period. if (opts->period) perf_evsel__set_sample_bit(evsel, PERIOD); I will submit a patch to fix it. > All of this makes it hard to use large PEBS in general. > > So I think your series should also address this part. The issue looks like perf tool problem. I will take a look at it, but probably address it in a separate patch set. Thanks, Kan > >> - Introduce a special purpose intel_pmu_save_and_restart() >> just for AUTO_RELOAD. >> - New patch to disable userspace RDPMC usage if large PEBS is enabled. >> >> ------ >> >> There is a bug when mmap read event->count with large PEBS enabled. >> Here is an example. >> #./read_count >> 0x71f0 >> 0x122c0 >> 0x1000000001c54 >> 0x100000001257d >> 0x200000000bdc5 >> >> The bug is caused by two issues. >> - In x86_perf_event_update, the calculation of event->count does not >> take the auto-reload values into account. >> - In x86_pmu_read, it doesn't count the undrained values in large PEBS >> buffers. >> >> The first issue was introduced with the auto-reload mechanism enabled >> since commit 851559e35fd5 ("perf/x86/intel: Use the PEBS auto reload >> mechanism when possible") >> >> Patch 1 fixed the issue in x86_perf_event_update. >> >> The second issue was introduced since commit b8241d20699e >> ("perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS >> interrupt threshold)") >> >> Patch 2-4 fixed the issue in x86_pmu_read. >> >> Besides the two issues, the userspace RDPMC usage is broken for large >> PEBS as well. >> The RDPMC issue was also introduced since commit b8241d20699e >> ("perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS >> interrupt threshold)") >> >> Patch 5 fixed the RDPMC issue. >> >> The source code of read_count is as below. >> >> struct cpu { >> int fd; >> struct perf_event_mmap_page *buf; >> }; >> >> int perf_open(struct cpu *ctx, int cpu) >> { >> struct perf_event_attr attr = { >> .type = PERF_TYPE_HARDWARE, >> .size = sizeof(struct perf_event_attr), >> .sample_period = 100000, >> .config = 0, >> .sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID | >> PERF_SAMPLE_TIME | PERF_SAMPLE_CPU, >> .precise_ip = 3, >> .mmap = 1, >> .comm = 1, >> .task = 1, >> .mmap2 = 1, >> .sample_id_all = 1, >> .comm_exec = 1, >> }; >> ctx->buf = NULL; >> ctx->fd = syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0); >> if (ctx->fd < 0) { >> perror("perf_event_open"); >> return -1; >> } >> return 0; >> } >> >> void perf_close(struct cpu *ctx) >> { >> close(ctx->fd); >> if (ctx->buf) >> munmap(ctx->buf, pagesize); >> } >> >> int main(int ac, char **av) >> { >> struct cpu ctx; >> u64 count; >> >> perf_open(&ctx, 0); >> >> while (1) { >> sleep(5); >> >> if (read(ctx.fd, &count, 8) != 8) { >> perror("counter read"); >> break; >> } >> printf("0x%llx\n", count); >> >> } >> perf_close(&ctx); >> } >> >> Kan Liang (5): >> perf/x86/intel: fix event update for auto-reload >> perf/x86: introduce read function for x86_pmu >> perf/x86/intel/ds: introduce read function for large pebs >> perf/x86/intel: introduce read function for intel_pmu >> perf/x86: fix: disable userspace RDPMC usage for large PEBS >> >> arch/x86/events/core.c | 5 ++- >> arch/x86/events/intel/core.c | 9 +++++ >> arch/x86/events/intel/ds.c | 85 ++++++++++++++++++++++++++++++++++++++++++-- >> arch/x86/events/perf_event.h | 3 ++ >> 4 files changed, 99 insertions(+), 3 deletions(-) >> >> -- >> 2.7.4 >>