From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org
Cc: acme@kernel.org, tglx@linutronix.de, jolsa@redhat.com, eranian@google.com, ak@linux.intel.com, Kan Liang
Subject: [PATCH V3 1/5] perf/x86/intel: fix event update for auto-reload
Date: Mon, 29 Jan 2018 08:29:29 -0800
Message-Id: <1517243373-355481-2-git-send-email-kan.liang@linux.intel.com>
In-Reply-To: <1517243373-355481-1-git-send-email-kan.liang@linux.intel.com>
References: <1517243373-355481-1-git-send-email-kan.liang@linux.intel.com>

From: Kan Liang <kan.liang@linux.intel.com>

There is a bug when reading event->count via mmap with large PEBS
enabled. Here is an example:

 #./read_count
 0x71f0
 0x122c0
 0x1000000001c54
 0x100000001257d
 0x200000000bdc5

In fixed period mode, the auto-reload mechanism could be enabled for
PEBS events, but the calculation of event->count does not take the
auto-reload values into account. Anyone who reads event->count will get
the wrong result, e.g. x86_pmu_read(). The calculation of
hwc->period_left is wrong as well, which impacts the calculation of the
period for the first record among multiple PEBS records.

The issue was introduced when the auto-reload mechanism was enabled by
commit 851559e35fd5 ("perf/x86/intel: Use the PEBS auto reload mechanism
when possible").

Introduce intel_pmu_save_and_restart_reload() to calculate event->count
for the auto-reload case only.

There is a small gap between 'PEBS hardware is armed' and 'the NMI is
handled'. Because of the gap, the first record also needs to be
specially handled. The formula to calculate the increment of
event->count is as below:

  increment = the period for the first record +
              (reload_times - 1) * reload_val +
              the gap

- The 'period for the first record' is the period left from the last
  PMI, which can be obtained from the previous event value.
- For the second and later records, the period is exactly the reload
  value; simply add (reload_times - 1) * reload_val.
- Because of the auto-reload, the start point of counting is always
  (-reload_val), so the calculation of 'the gap' needs to be corrected
  by adding reload_val.

The period_left needs the same adjustment as well.

Nothing needs to be done in x86_perf_event_set_period(), because this
is fixed period mode and period_left is already adjusted.

Fixes: 851559e35fd5 ("perf/x86/intel: Use the PEBS auto reload mechanism when possible")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/ds.c | 69 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 67 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 8156e47..6533426 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1303,17 +1303,82 @@ get_next_pebs_record_by_bit(void *base, void *top, int bit)
 	return NULL;
 }
 
+/*
+ * Specific intel_pmu_save_and_restart() for auto-reload.
+ * It is only called from drain_pebs().
+ */
+static int intel_pmu_save_and_restart_reload(struct perf_event *event,
+					     int reload_times)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	int shift = 64 - x86_pmu.cntval_bits;
+	u64 reload_val = hwc->sample_period;
+	u64 prev_raw_count, new_raw_count;
+	u64 delta;
+
+	WARN_ON((reload_times == 0) || (reload_val == 0));
+
+	/*
+	 * drain_pebs() only happens when the PMU is disabled.
+	 * It does not need to specially handle the previous event value.
+	 * The hwc->prev_count will be updated in x86_perf_event_set_period().
+	 */
+	prev_raw_count = local64_read(&hwc->prev_count);
+	rdpmcl(hwc->event_base_rdpmc, new_raw_count);
+
+	/*
+	 * Now we have the new raw value and have updated the prev
+	 * timestamp already. We can now calculate the elapsed delta
+	 * (event-)time and add that to the generic event.
+	 *
+	 * Careful, not all hw sign-extends above the physical width
+	 * of the count.
+	 *
+	 * There is a small gap between 'PEBS hardware is armed' and 'the NMI
+	 * is handled'. Because of the gap, the first record also needs to be
+	 * specially handled.
+	 * The formula to calculate the increment of event->count is as below:
+	 * increment = the period for the first record +
+	 *	       (reload_times - 1) * reload_val +
+	 *	       the gap
+	 * 'The period for the first record' can be obtained from -prev_raw_count.
+	 *
+	 * 'The gap' = new_raw_count + reload_val, because the start point of
+	 * counting is always -reload_val for auto-reload.
+	 *
+	 * The period_left needs the same adjustment by adding
+	 * reload_val.
+	 */
+	delta = (reload_val << shift) + (new_raw_count << shift) -
+		(prev_raw_count << shift);
+	delta >>= shift;
+
+	local64_add(reload_val * (reload_times - 1), &event->count);
+	local64_add(delta, &event->count);
+	local64_sub(delta, &hwc->period_left);
+
+	return x86_perf_event_set_period(event);
+}
+
 static void __intel_pmu_pebs_event(struct perf_event *event,
 				   struct pt_regs *iregs,
 				   void *base, void *top,
 				   int bit, int count)
 {
+	struct hw_perf_event *hwc = &event->hw;
 	struct perf_sample_data data;
 	struct pt_regs regs;
 	void *at = get_next_pebs_record_by_bit(base, top, bit);
 
-	if (!intel_pmu_save_and_restart(event) &&
-	    !(event->hw.flags & PERF_X86_EVENT_AUTO_RELOAD))
+	if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) {
+		/*
+		 * Now, auto-reload is only enabled in fixed period mode.
+		 * The reload value is always hwc->sample_period.
+		 * May need to change it, if auto-reload is enabled in
+		 * freq mode later.
+		 */
+		intel_pmu_save_and_restart_reload(event, count);
+	} else if (!intel_pmu_save_and_restart(event))
 		return;
 
 	while (count > 1) {
-- 
2.7.4
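
For illustration only (not part of the patch), here is a minimal user-space
sketch of the event->count arithmetic described in the comment above. All
values (reload_val, the raw counter reads and reload_times) are made up, and a
48-bit counter width is assumed; the real code operates on live PMU state
inside the kernel.

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* assume a 48-bit counter, as on many Intel PMUs */
	const int shift = 64 - 48;
	const uint64_t mask = (1ULL << 48) - 1;

	uint64_t reload_val     = 100000;			/* hwc->sample_period */
	uint64_t prev_raw_count = (uint64_t)-30000 & mask;	/* 30000 left before the first record */
	uint64_t new_raw_count  = (uint64_t)-40000 & mask;	/* 40000 left until the next reload */
	int reload_times        = 3;				/* records handled in this drain_pebs() */

	/*
	 * delta = period for the first record (-prev_raw_count)
	 *       + the gap (new_raw_count + reload_val)
	 */
	uint64_t delta = (reload_val << shift) + (new_raw_count << shift) -
			 (prev_raw_count << shift);
	delta >>= shift;

	/* the second and later records each contribute exactly reload_val */
	uint64_t increment = delta + reload_val * (uint64_t)(reload_times - 1);

	/* prints 290000: 30000 (first record) + 2 * 100000 + 60000 (gap) */
	printf("event->count increases by %llu\n", (unsigned long long)increment);
	return 0;
}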