Received: by 2002:a25:e7d8:0:0:0:0:0 with SMTP id e207csp108624ybh; Tue, 17 Mar 2020 19:10:27 -0700 (PDT) X-Google-Smtp-Source: ADFU+vtes+XSnCZBRI9wycRWa7S/cTu1CLVKVlzxRqSJWm7PyZ0TXoMqEOdHQCTDJf7u5UOLbjNn X-Received: by 2002:a9d:850:: with SMTP id 74mr1869935oty.279.1584497427357; Tue, 17 Mar 2020 19:10:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1584497427; cv=none; d=google.com; s=arc-20160816; b=zBURjH19UW2D8HrKsvvDArrQUJma+XjPXQzoI1xrhy65He3eKkwDAhkVh7oKErXKyU v7yBd7Fw0PfYb4YBUr97ZuUNpvtbhw5t7MKEV61E4VCyBgpcXvRhS8WzNZi1823ZXZuS +4S7T+FHz06WHshTJmRQIAtMjqfS+7o8Yv2xcX17J/Mv7WMteoiN+LSkaHB7MwRaZbYr TDWnpWyf4hRi7hzC/D/a+N7IlqRHHSck8JS0KYBkI91RcqJ6OJQoopfYPKnZPE44rf2S YWCwlWZ7aPu3jvguGPWX8b2EBID57OU14+PqIQBa6/eYSHmeUcVS1ENYFgxS3Oc/2GFV 0J4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=ZwkKxZxz8ZvxgUWRtQYqaELKYeJh+ujm7OIHIKvAQ24=; b=ArQtLumy6jq5tKyHbmMLhtXp37aj6CMrS2yCe1bVLis82F+2SjJWiap8ArFIhzQ4OR ETXTxUsXd10W7eAKvRxkC33OU6UzXvUVSvSZ/cXZP+uCtCPLzCsvQFrvXCUmHKNA9Vj5 OuupLTmFIe8za+yMbSNEZk55RYfJhZI2XPN1HRKGXh96v5YpDaCrHzNrg/yo/c9Z+B3e XunVCgLcgl+jWjUi+qBcL7BZ7HHPW2fG/GwailLffkcnaLVRCsnd55nEplAe/ku/UYwd LQ7uZgLKfECsZ5I31B6fnj0y3wcyFTtuAK0+xobN8b9JoGHQvW9gNXfNFRUf5CZEtMc8 y1Uw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=m3+JC7u+; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id a4si2690010otf.90.2020.03.17.19.10.14; Tue, 17 Mar 2020 19:10:27 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=m3+JC7u+; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727285AbgCRCJu (ORCPT + 99 others); Tue, 17 Mar 2020 22:09:50 -0400 Received: from mail-ua1-f66.google.com ([209.85.222.66]:40774 "EHLO mail-ua1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727197AbgCRCJu (ORCPT ); Tue, 17 Mar 2020 22:09:50 -0400 Received: by mail-ua1-f66.google.com with SMTP id t20so8869120uao.7 for ; Tue, 17 Mar 2020 19:09:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=ZwkKxZxz8ZvxgUWRtQYqaELKYeJh+ujm7OIHIKvAQ24=; b=m3+JC7u+yGHBnU2SLiHuAd5FssLpnXd7UNNlz4eMMZl1UkqnuIQuL5O40R1w9mJ54S v4zUhBVMr4czXz1CHURsJPXct+NL8ccSkth/07C/5IAYiN5hhE9r4jOSPZf1RkJGPq9j zg7F9h63MbrZcedrw5JQVJW0jrjGvTA+wiVczCLT1aHmXOafA4JH/Q6NRkzXtzjIv2qX 1uujsd+ajQEJz7UaN07mdRSr/PVHk5kWa6z64rSQpA8XiPAOrittTeEIyhD0OLVJR+rB bpv34H7kFvIgwAnvsKO2gv+MXvL8NFshzDBsgVUYfEKTYhUciyceBVyBwNt/XWDQUo1T Qj9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=ZwkKxZxz8ZvxgUWRtQYqaELKYeJh+ujm7OIHIKvAQ24=; b=CCW4FMHcyKBuXZ0uOOkjseBtxZ+rpC8QlW40tDUPbIbPiB+iqFjN/Mhxw2Zuywwu1W 9XlbFRNpJJZdhpV4s+TrVpYB5Qz1tms+wIzeVaNVVI4B/NRwBJ5ve7YrT2S0ZPvJZSF7 ZS37a5WzNq+ginsrEoO03EpSC69MvLT20piR+lrVWEArM2f27z8wa6g3zwZ+p1OAHBrW szTvmoTNY7IMSm+xZZM+Uae9T7+duL+s3vg+1BBmTjpnOhgjghh/KhfC/w4qMnRqmvZD EemFOw7Y9SueETcbnMwE/JOAO9FrDhcJD4BWo+ZEzknvd0Fyxz00hnwfVU27YLV9J/OY Ye4A== X-Gm-Message-State: ANhLgQ3H0Evvl/JKoabd9oxiKRbeAexSLpQzmjWrF0k6vu0X4h0yF47V Zgk+y0jMDyRR5YF7X2gFi4XbaaeYA4v8dJNva/K52Q== X-Received: by 2002:ab0:911:: with SMTP id w17mr1424994uag.60.1584497388970; Tue, 17 Mar 2020 19:09:48 -0700 (PDT) MIME-Version: 1.0 References: <20200313231024.17601-1-kim.phillips@amd.com> In-Reply-To: <20200313231024.17601-1-kim.phillips@amd.com> From: Stephane Eranian Date: Tue, 17 Mar 2020 19:09:37 -0700 Message-ID: Subject: Re: [PATCH 1/3 v2] perf/amd/uncore: Prepare L3 thread mask code for Family 19h support To: Kim Phillips Cc: Peter Zijlstra , Ingo Molnar , Ingo Molnar , Thomas Gleixner , Borislav Petkov , Alexander Shishkin , Arnaldo Carvalho de Melo , "H. Peter Anvin" , Jiri Olsa , Mark Rutland , Michael Petlan , Namhyung Kim , LKML , x86 Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Mar 13, 2020 at 4:10 PM Kim Phillips wrote: > > In order to better accommodate the upcoming Family 19h support, > given the 80-char line limit, move the existing code into a new > l3_thread_slice_mask function. > > Signed-off-by: Kim Phillips > Cc: Alexander Shishkin > Cc: Arnaldo Carvalho de Melo > Cc: Borislav Petkov > Cc: "H. Peter Anvin" > Cc: Ingo Molnar > Cc: Ingo Molnar > Cc: Jiri Olsa > Cc: Mark Rutland > Cc: Michael Petlan > Cc: Namhyung Kim > Cc: Peter Zijlstra > Cc: Stephane Eranian > Cc: Thomas Gleixner > Cc: linux-kernel@vger.kernel.org > Cc: x86@kernel.org > --- > v2: split into two parts, this one being the mechanical > move based on Boris' review comments: > > https://lkml.org/lkml/2020/3/12/581 > > arch/x86/events/amd/uncore.c | 25 ++++++++++++++++--------- > 1 file changed, 16 insertions(+), 9 deletions(-) > > diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c > index a6ea07f2aa84..dcdac001431e 100644 > --- a/arch/x86/events/amd/uncore.c > +++ b/arch/x86/events/amd/uncore.c > @@ -180,6 +180,20 @@ static void amd_uncore_del(struct perf_event *event, int flags) > hwc->idx = -1; > } > > +/* > + * Convert logical cpu number to L3 PMC Config ThreadMask format > + */ > +static u64 l3_thread_slice_mask(int cpu) > +{ > + int thread = 2 * (cpu_data(cpu).cpu_core_id % 4); > + > + if (smp_num_siblings > 1) > + thread += cpu_data(cpu).apicid & 1; > + > + return (1ULL << (AMD64_L3_THREAD_SHIFT + thread) & > + AMD64_L3_THREAD_MASK) | AMD64_L3_SLICE_MASK; > +} > + > static int amd_uncore_event_init(struct perf_event *event) > { > struct amd_uncore *uncore; > @@ -209,15 +223,8 @@ static int amd_uncore_event_init(struct perf_event *event) > * SliceMask and ThreadMask need to be set for certain L3 events in > * Family 17h. For other events, the two fields do not affect the count. > */ > - if (l3_mask && is_llc_event(event)) { > - int thread = 2 * (cpu_data(event->cpu).cpu_core_id % 4); > - > - if (smp_num_siblings > 1) > - thread += cpu_data(event->cpu).apicid & 1; > - > - hwc->config |= (1ULL << (AMD64_L3_THREAD_SHIFT + thread) & > - AMD64_L3_THREAD_MASK) | AMD64_L3_SLICE_MASK; > - } > + if (l3_mask && is_llc_event(event)) > + hwc->config |= l3_thread_slice_mask(event->cpu); > By looking at this code, I realized that even on Zen2 this is wrong. It does not work well. You are basically saying that the L3 event is tied to the CPU the event is programmed to. But this does not work with the cpumask programmed for the amd_l3 PMU. This mask shows, as it should, one CPU/CCX. So that means that when I do: $ perf stat -a amd_l3/event=llc_event/ This only collects on the CPUs listed in the cpumask: 0,4,8,12 .... That means that L3 events generated by the other CPUs on the CCX are not monitored. I can easily see the problem by pinning a memory bound program to CPU64, for instance. I think the thread mask should be exposed to the user. If not specified, then set the mask to cover all CPUs of the CCX. That way you can pick and choose what you want. And with one event/CCX you can monitor for all CPUs. I can send a patch that does that. With what you have now, you have to force the list of CPUs with -C to work around the cpumask. And forcing the cpumask to 0-255 does not make sense because not all L3 events necessarily need the L3 mask, so you don't want to program them on all CPUs especially with 8 cpus/CCX and only 6 counters. On Fri, Mar 13, 2020 at 4:10 PM Kim Phillips wrote: > > In order to better accommodate the upcoming Family 19h support, > given the 80-char line limit, move the existing code into a new > l3_thread_slice_mask function. > > Signed-off-by: Kim Phillips > Cc: Alexander Shishkin > Cc: Arnaldo Carvalho de Melo > Cc: Borislav Petkov > Cc: "H. Peter Anvin" > Cc: Ingo Molnar > Cc: Ingo Molnar > Cc: Jiri Olsa > Cc: Mark Rutland > Cc: Michael Petlan > Cc: Namhyung Kim > Cc: Peter Zijlstra > Cc: Stephane Eranian > Cc: Thomas Gleixner > Cc: linux-kernel@vger.kernel.org > Cc: x86@kernel.org > --- > v2: split into two parts, this one being the mechanical > move based on Boris' review comments: > > https://lkml.org/lkml/2020/3/12/581 > > arch/x86/events/amd/uncore.c | 25 ++++++++++++++++--------- > 1 file changed, 16 insertions(+), 9 deletions(-) > > diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c > index a6ea07f2aa84..dcdac001431e 100644 > --- a/arch/x86/events/amd/uncore.c > +++ b/arch/x86/events/amd/uncore.c > @@ -180,6 +180,20 @@ static void amd_uncore_del(struct perf_event *event, int flags) > hwc->idx = -1; > } > > +/* > + * Convert logical cpu number to L3 PMC Config ThreadMask format > + */ > +static u64 l3_thread_slice_mask(int cpu) > +{ > + int thread = 2 * (cpu_data(cpu).cpu_core_id % 4); > + > + if (smp_num_siblings > 1) > + thread += cpu_data(cpu).apicid & 1; > + > + return (1ULL << (AMD64_L3_THREAD_SHIFT + thread) & > + AMD64_L3_THREAD_MASK) | AMD64_L3_SLICE_MASK; > +} > + > static int amd_uncore_event_init(struct perf_event *event) > { > struct amd_uncore *uncore; > @@ -209,15 +223,8 @@ static int amd_uncore_event_init(struct perf_event *event) > * SliceMask and ThreadMask need to be set for certain L3 events in > * Family 17h. For other events, the two fields do not affect the count. > */ > - if (l3_mask && is_llc_event(event)) { > - int thread = 2 * (cpu_data(event->cpu).cpu_core_id % 4); > - > - if (smp_num_siblings > 1) > - thread += cpu_data(event->cpu).apicid & 1; > - > - hwc->config |= (1ULL << (AMD64_L3_THREAD_SHIFT + thread) & > - AMD64_L3_THREAD_MASK) | AMD64_L3_SLICE_MASK; > - } > + if (l3_mask && is_llc_event(event)) > + hwc->config |= l3_thread_slice_mask(event->cpu); > > uncore = event_to_amd_uncore(event); > if (!uncore) > -- > 2.25.1 >