Received: by 10.223.176.5 with SMTP id f5csp4060213wra; Tue, 30 Jan 2018 01:17:34 -0800 (PST) X-Google-Smtp-Source: AH8x2249JKVncPj0hPlGcL3DurgvBLQ6livSJa1u0DDUJDQR+kiwNuvxlO61SbfiB8NEodGOrQAm X-Received: by 10.98.65.193 with SMTP id g62mr29510207pfd.60.1517303854622; Tue, 30 Jan 2018 01:17:34 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1517303854; cv=none; d=google.com; s=arc-20160816; b=JlloyPkZwy7vyu5c12PV4bpWObOeNoWAJ6q94BkVzR2BK3+Y0ZdEAXuMXURtIq6jhx jXX4QC1vDsOmd9Va9dn773yKHay7H/mA/9kGpOpu+2XAwjkYYs7LsI7G124sJxKD3q1d Mx/QgaqrcYf8MJErkgheO+SqUz6qKnbkTfZanz1UPto8Ei9XK3j1158ud5edCkaTWjV4 l+gvn7p+HP59Of8LUt5T8/Rm8CxjsKFsB3mnnIp4DymdZ4aDkd7VTJEkgRY8tzOMJesD yxKpuDPyAaHp5WBl5Y6mVS2jmKkz/QTZx8zZfqzMqjKX2ktjpyVjv094WdyIth3TGJJB RZJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=gct//Z0ESYcIeyWQn02L1n5HfnPt2sPoYA4X68OveDQ=; b=gC8TqcnA9BdcPFbiYefoJU9KJk0xa+nSXbfYWS0HYcufvkfF80uNl/RR69LY0nUKkT KGydXGJvz5amEy1um7eMPZFJqyJ47uzA4jVzISFJfzHp8U6e1tE+HwBuznrtR6IGP25Y IrbgFBAxKUXpxr9INo9h85ptTGebqDFsgmZoBTvo0mKHoshu/KfdWG35vNHThO+WNRzf 6rXBMa9s2gtZevbwxN4TvLXoNpTEDNNxeHsgdo3g7v28HH/2v7/V1Ts+yTG6oV3KE4LI rC0X8vflpDK3w7Nag18yjhAvoPSlqTx243PS4hEPqvuOJAndgFbhi3zzbAt9LfHSbwXA pcpg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=SzVPZeoB; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o7-v6si10240119pll.114.2018.01.30.01.17.20; Tue, 30 Jan 2018 01:17:34 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=SzVPZeoB; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751556AbeA3JQo (ORCPT + 99 others); Tue, 30 Jan 2018 04:16:44 -0500 Received: from mail-it0-f65.google.com ([209.85.214.65]:38318 "EHLO mail-it0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750959AbeA3JQk (ORCPT ); Tue, 30 Jan 2018 04:16:40 -0500 Received: by mail-it0-f65.google.com with SMTP id w14so12598264itc.3 for ; Tue, 30 Jan 2018 01:16:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=gct//Z0ESYcIeyWQn02L1n5HfnPt2sPoYA4X68OveDQ=; b=SzVPZeoBbyLTrUBCMJ/J248MpPwJrFK9U+5H3lWxAA44rHnA/M000Rct/c0PlrW8Du CY0JVyI8aBnZgGyyA5LGAl5yVKNbpkjlefLxVvOWTDRuoxozoPi6EQKSjlGp/hnkgQur lOyNE3BmNYWftjEyakTqvl+/im6dBYytQlqZAmI3QKQ4NDCmo/JKMpdn3vy0wFMwC3e1 zDry6i4ggCqBkv868GrfhuqC8QpNNSfgF6dj8ELZW0IdXvV8lqfeh77bZcQzpJCLi0Nm Gk0RT47BXNOfYtTHsaZjVJAWUF5paI6pZXeqsz30Bb7kfyFA3DN66BSe7MzIP+sxhsM6 ieUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=gct//Z0ESYcIeyWQn02L1n5HfnPt2sPoYA4X68OveDQ=; b=ggrtK8Dk+6r33fFkY7ChOUMJn7VdsLYryhfgL2+wIvENxcmSbAaUALVeO8csSEdIXS VW2ixHqLLMhUH+7gDq6ycE+0Z3bJ7Y3AIMnnes/8D0NGooUONoDReMH4HxMUqrPO2z3g tbEY1UQWwtKI4Adbp+bVQUeXwH4vRwxuimENKYofs8ho3fnPhzlQyqWBBn03QdSE1lDM y9War+n6hMx2kJhKnMw2AF9DVqWbnSjuNBce2FILVelf7FSbx0IXldChwd6Utkqgk+qT nK6jGGIwriT5lvZKb+oJzOy9/9Cql3YCxw8iHUmE+AJwhZ1o9+GYcISDhLn0jE7YMgwU hcFw== X-Gm-Message-State: AKwxytcbZwBnswIc7aGyCutTuFTKK2OPyEqXpl0zs6eyMob8KVA+8vaJ Gza3czRyZy1MpJdlTjASTEi8yWURuF+5UuUVf66zXw== X-Received: by 10.36.73.102 with SMTP id z99mr32683539ita.72.1517303799942; Tue, 30 Jan 2018 01:16:39 -0800 (PST) MIME-Version: 1.0 Received: by 10.107.170.39 with HTTP; Tue, 30 Jan 2018 01:16:39 -0800 (PST) In-Reply-To: <1517243373-355481-1-git-send-email-kan.liang@linux.intel.com> References: <1517243373-355481-1-git-send-email-kan.liang@linux.intel.com> From: Stephane Eranian Date: Tue, 30 Jan 2018 01:16:39 -0800 Message-ID: Subject: Re: [PATCH V3 0/5] bugs fix for large PEBS mmap read and rdpmc read To: kan.liang@linux.intel.com Cc: Peter Zijlstra , Ingo Molnar , LKML , Arnaldo Carvalho de Melo , Thomas Gleixner , Jiri Olsa , Andi Kleen Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On Mon, Jan 29, 2018 at 8:29 AM, wrote: > > From: Kan Liang > > ------ > > Changes since V2: > - Refined the changelog > - Introduced specific read function for large PEBS. > The previous generic PEBS read function is confusing. > Disabled PMU in pmu::read() path for large PEBS. > Handled the corner case when reload_times == 0. > - Modified the parameter of intel_pmu_save_and_restart_reload() > Discarded local64_cmpxchg > - Added fixes tag > - Added WARN to handle reload_times == 0 || reload_val == 0 > > Changes since V1: > - Check PERF_X86_EVENT_AUTO_RELOAD before call > intel_pmu_save_and_restore() It is not yet clear to me why PERF_SAMPLE_PERIOD is not allowed with large PEBS. Large PEBS requires fixed period. So the kernel could make up the period from the event and store it in the sampling buffer. I tried using large PEBS recently, and despite trying different option combination of perf record, I was not able to get it to work. $ perf record -c 1 -e cpu/event=0xd1,umask=0x10,period=19936/pp --no-timestamp --no-period -a -C 0 But I was able to make this work with a much older kernel. Another annoyance I ran into is with perf record requiring -c period in order not to set PERF_SAMPLE_PERIOD in the event. If I do: perf record -c 1 -e cpu/event=0xd1,umask=0x10,period=19936/pp --no-timestamp --no-period -a -C 0 I get perf_event_attr: type 4 size 112 config 0x10d1 { sample_period, sample_freq } 199936 sample_type IP|TID|CPU But if I do: perf record -e cpu/event=0xd1,umask=0x10,period=19936/pp --no-timestamp --no-period -a -C 0 I get perf_event_attr: type 4 size 112 config 0x10d1 { sample_period, sample_freq } 199936 sample_type IP|TID|CPU|PERIOD Perf should check if all events have a period=, then it should not pass PERF_SAMPLE_PERIOD, even more so when only one event is defined. Also it does not seem to honor --no-period. All of this makes it hard to use large PEBS in general. So I think your series should also address this part. > - Introduce a special purpose intel_pmu_save_and_restart() > just for AUTO_RELOAD. > - New patch to disable userspace RDPMC usage if large PEBS is enabled. > > ------ > > There is a bug when mmap read event->count with large PEBS enabled. > Here is an example. > #./read_count > 0x71f0 > 0x122c0 > 0x1000000001c54 > 0x100000001257d > 0x200000000bdc5 > > The bug is caused by two issues. > - In x86_perf_event_update, the calculation of event->count does not > take the auto-reload values into account. > - In x86_pmu_read, it doesn't count the undrained values in large PEBS > buffers. > > The first issue was introduced with the auto-reload mechanism enabled > since commit 851559e35fd5 ("perf/x86/intel: Use the PEBS auto reload > mechanism when possible") > > Patch 1 fixed the issue in x86_perf_event_update. > > The second issue was introduced since commit b8241d20699e > ("perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS > interrupt threshold)") > > Patch 2-4 fixed the issue in x86_pmu_read. > > Besides the two issues, the userspace RDPMC usage is broken for large > PEBS as well. > The RDPMC issue was also introduced since commit b8241d20699e > ("perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS > interrupt threshold)") > > Patch 5 fixed the RDPMC issue. > > The source code of read_count is as below. > > struct cpu { > int fd; > struct perf_event_mmap_page *buf; > }; > > int perf_open(struct cpu *ctx, int cpu) > { > struct perf_event_attr attr = { > .type = PERF_TYPE_HARDWARE, > .size = sizeof(struct perf_event_attr), > .sample_period = 100000, > .config = 0, > .sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID | > PERF_SAMPLE_TIME | PERF_SAMPLE_CPU, > .precise_ip = 3, > .mmap = 1, > .comm = 1, > .task = 1, > .mmap2 = 1, > .sample_id_all = 1, > .comm_exec = 1, > }; > ctx->buf = NULL; > ctx->fd = syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0); > if (ctx->fd < 0) { > perror("perf_event_open"); > return -1; > } > return 0; > } > > void perf_close(struct cpu *ctx) > { > close(ctx->fd); > if (ctx->buf) > munmap(ctx->buf, pagesize); > } > > int main(int ac, char **av) > { > struct cpu ctx; > u64 count; > > perf_open(&ctx, 0); > > while (1) { > sleep(5); > > if (read(ctx.fd, &count, 8) != 8) { > perror("counter read"); > break; > } > printf("0x%llx\n", count); > > } > perf_close(&ctx); > } > > Kan Liang (5): > perf/x86/intel: fix event update for auto-reload > perf/x86: introduce read function for x86_pmu > perf/x86/intel/ds: introduce read function for large pebs > perf/x86/intel: introduce read function for intel_pmu > perf/x86: fix: disable userspace RDPMC usage for large PEBS > > arch/x86/events/core.c | 5 ++- > arch/x86/events/intel/core.c | 9 +++++ > arch/x86/events/intel/ds.c | 85 ++++++++++++++++++++++++++++++++++++++++++-- > arch/x86/events/perf_event.h | 3 ++ > 4 files changed, 99 insertions(+), 3 deletions(-) > > -- > 2.7.4 >