Received: by 2002:ab2:710b:0:b0:1ef:a325:1205 with SMTP id z11csp505994lql; Mon, 11 Mar 2024 08:58:44 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCXom5wXOWc4W5yNBBYlJdT5YCbjnHsFl9PX+0k7Tf/8y+WLItK/WLssRLSA+qeW4mXqhlv5lqsSyu6scPZ4TLgw1Ft5ooNH67ogPHIcLA== X-Google-Smtp-Source: AGHT+IHPV5uEAJR9AhzY+JFM9o9RjTwkVcaAz4o0X5+rlPYXUajdCPqkd2hcqb/HP+1tqxASEknb X-Received: by 2002:a17:906:6a09:b0:a46:5df:ad98 with SMTP id qw9-20020a1709066a0900b00a4605dfad98mr7545035ejc.21.1710172724200; Mon, 11 Mar 2024 08:58:44 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1710172724; cv=pass; d=google.com; s=arc-20160816; b=s7JQ2ukTMMbnTTbxADOZ+TwWU+RFy8zMJ6ID8rkU7RnLYsjxSBiYJk1tTFHCU81bC6 xFlyP0L9dXMOmHFKNiUcaqMj3JGEGyoHuSo0+S/LF420kKlhZjLcV0tSfcbjeBS2YICR m1eZLO+sG8kdHABptSHHhipLkzhtCVU+e2/iu+BOil2FdlXW72bTtEbrZjAQUIkBFWlr NBzI9euEmhC7e0A7P4SMHvLZolmVKe+0QRBsPu4la7ABAF+EghnqXcxe9BotTdlpZBF4 Te666rSF/7LcU59ypFQnNpvPKY+FpXAKu6PSLLwc5sivZxnzQTqHuyfT4GopmRpG+cTb YTPQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:date:message-id; bh=WunYlrrgH8Pqhah2AuK1LnOVLc3QngywqLDoA1RDFFA=; fh=laA/nffNGD85uHGG54iEtZc84M3j34/c+k6UEgqA300=; b=UxexlrxsGat8Z98vvxkmubcBZLqYDfUZXxnK/YNGy0NA14Ym5/TYDARgJO69As17sO 6rA/ggml/XDrpFx3Qs65AjH/5zRzJLYPCQX+elP6gBDERZTczggphvMWlm5JsADWtP8a VsWUdklKFEaqA/KJWUcyviiqPKz9NQcMNo7GS1hTraR1H/0oL7fHIoetCJjoNrggzugq 3Ut3hWbZT2V/owZyBEqFsiXj/xsxpp2dMYJNlMc0m/maPKDb59KURCJHQHxrbiZif+DP 4VGWhlMoYVGdnx0AZucKO7W2yRawvV7CL2prc4yYG36y+msIaoJdowqbUPjMM1iQB3eA VkOA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=arm.com dmarc=pass fromdomain=arm.com); spf=pass (google.com: domain of linux-kernel+bounces-99160-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-99160-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [147.75.80.249]) by mx.google.com with ESMTPS id r8-20020a170906548800b00a453c6d51d1si2571874ejo.362.2024.03.11.08.58.44 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 Mar 2024 08:58:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-99160-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) client-ip=147.75.80.249; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=arm.com dmarc=pass fromdomain=arm.com); spf=pass (google.com: domain of linux-kernel+bounces-99160-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-99160-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id BA78D1F21EF1 for ; Mon, 11 Mar 2024 15:58:43 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id A226C481A0; Mon, 11 Mar 2024 15:58:26 +0000 (UTC) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id B639944C9B; Mon, 11 Mar 2024 15:58:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710172706; cv=none; b=OUV03hwlRrlq0pV7USG3yEWpqNyt8gP3ZNZjMw5QvDJZKNrKL5nFn7WcXvxQYR6XJXO3u18ITDRZAdr0qzb9qCfjYjFrRNE2uFN8DMUpXCPJ9921LXt0csYrPWXWx8pnDtyX1tgehBWgNidXjGSzidG/8Rf0cmhSnjxiwKgd//w= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710172706; c=relaxed/simple; bh=OrzdstX60YGyWu/xA75nxftBoTukeg0HRk4dLur9uJA=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=SSeGlDYJXeXvoR5zMkjw4Eq4FjNLemF5x5mVwzq2EtXlp02XC7tF8bmr0QjHCGrt7XzNWldI3/TcUswDdbK2J9aGk+SgcQfvylZ05n9YzqneZYpOLdkj9m8BvCMLMldrYcGBnn0mrYZ6wVeK5wWOpdG9fV1P8Zr9FLbTJyq8Xow= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9FD91FEC; Mon, 11 Mar 2024 08:58:59 -0700 (PDT) Received: from [192.168.1.100] (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 948703F762; Mon, 11 Mar 2024 08:58:20 -0700 (PDT) Message-ID: <3461a0dc-b163-31ed-a96e-f503ade74260@arm.com> Date: Mon, 11 Mar 2024 15:58:18 +0000 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.15.1 Subject: Re: [PATCH v2 1/4] perf: Support PERF_SAMPLE_READ with inherit_stat Content-Language: en-US To: Ben Gainey , linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org Cc: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, namhyung@kernel.org, irogers@google.com, adrian.hunter@intel.com References: <20240208131050.2406183-1-ben.gainey@arm.com> <20240208131050.2406183-2-ben.gainey@arm.com> From: James Clark In-Reply-To: <20240208131050.2406183-2-ben.gainey@arm.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit On 08/02/2024 13:10, Ben Gainey wrote: > This change allows events to use PERF_SAMPLE READ with inherit > so long as both inherit_stat and PERF_SAMPLE_TID are set. I saw there was a discussion on V1 about adding a new flag because this is an ABI break. Personally I'm not sure if it is required given that it wasn't allowed before, so there wouldn't be any users to experience it. I suppose there _could_ be a new user if they stumbled across this now, but it's not like they would see that as a regression because it wasn't allowed before anyway. Maybe it's cleaner to just use the existing flags rather than adding a new one. Perf even guarded the condition in userspace as far as I can see, so nobody using Perf would hit this either. > > In this configuration, and event will be inherited into any and -> any/an? > child processes / threads, allowing convenient profiling of a > multiprocess or multithreaded application, whilst allowing > profiling tools to collect per-thread samples, in particular > of groups of counters. > > Signed-off-by: Ben Gainey > --- > include/linux/perf_event.h | 1 + > kernel/events/core.c | 53 ++++++++++++++++++++++++++------------ > 2 files changed, 37 insertions(+), 17 deletions(-) > > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h > index d2a15c0c6f8a..7d405dff6694 100644 > --- a/include/linux/perf_event.h > +++ b/include/linux/perf_event.h > @@ -932,6 +932,7 @@ struct perf_event_context { > > int nr_task_data; > int nr_stat; > + int nr_stat_read; > int nr_freq; > int rotate_disable; > > diff --git a/kernel/events/core.c b/kernel/events/core.c > index f0f0f71213a1..dac7093b3608 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -1795,8 +1795,11 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx) > ctx->nr_events++; > if (event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT) > ctx->nr_user++; > - if (event->attr.inherit_stat) > + if (event->attr.inherit_stat) { > ctx->nr_stat++; > + if (event->attr.inherit && (event->attr.sample_type & PERF_SAMPLE_READ)) > + ctx->nr_stat_read++; > + } > > if (event->state > PERF_EVENT_STATE_OFF) > perf_cgroup_event_enable(event, ctx); > @@ -2019,8 +2022,11 @@ list_del_event(struct perf_event *event, struct perf_event_context *ctx) > ctx->nr_events--; > if (event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT) > ctx->nr_user--; > - if (event->attr.inherit_stat) > + if (event->attr.inherit_stat) { > ctx->nr_stat--; > + if (event->attr.inherit && (event->attr.sample_type & PERF_SAMPLE_READ)) This condition is repeated a few times, maybe we could add a macro like "has_sample_read_thread()" or something or whatever we're calling it exactly. It might have prevented the mistake in copying the condition below in perf_event_count()... > + ctx->nr_stat_read--; > + } > > list_del_rcu(&event->event_entry); > > @@ -3529,11 +3535,17 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next) > perf_ctx_disable(ctx, false); > > /* PMIs are disabled; ctx->nr_pending is stable. */ > - if (local_read(&ctx->nr_pending) || > + if (ctx->nr_stat_read || > + next_ctx->nr_stat_read || > + local_read(&ctx->nr_pending) || > local_read(&next_ctx->nr_pending)) { > /* > * Must not swap out ctx when there's pending > * events that rely on the ctx->task relation. > + * > + * Likewise, when a context contains inherit+inherit_stat+SAMPLE_READ > + * events they should be switched out using the slow path > + * so that they are treated as if they were distinct contexts. > */ > raw_spin_unlock(&next_ctx->lock); > rcu_read_unlock(); > @@ -3545,6 +3557,7 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next) > > perf_ctx_sched_task_cb(ctx, false); > perf_event_swap_task_ctx_data(ctx, next_ctx); > + perf_event_sync_stat(ctx, next_ctx); > > perf_ctx_enable(ctx, false); > > @@ -3559,8 +3572,6 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next) > RCU_INIT_POINTER(next->perf_event_ctxp, ctx); > > do_switch = 0; > - > - perf_event_sync_stat(ctx, next_ctx); The commit message gives a very high level summary of the user visible changes, but it doesn't say what changes have been made to the code to accomplish it. I wasn't sure what this move of perf_even_sync_stat() is for, because it's actually skipped over when nr_stat_read != 0, which is in this new case? > } > raw_spin_unlock(&next_ctx->lock); > raw_spin_unlock(&ctx->lock); > @@ -4533,8 +4544,13 @@ static void __perf_event_read(void *info) > raw_spin_unlock(&ctx->lock); > } > > -static inline u64 perf_event_count(struct perf_event *event) > +static inline u64 perf_event_count(struct perf_event *event, bool from_sample) There are only two trues, and 7 existing falses, so you could just add a new function like perf_event_count_sample(). But I suppose that's a style thing. > { > + if (from_sample && event->attr.inherit && > + event->attr.inherit && > + (event->attr.sample_type & PERF_SAMPLE_TID)) { There's something suspicious about this condition with the event->attr.inherit twice. Should it be && inherit_stat? I don't know if there's any test that could have caught this, if it's affecting the existing inherit but not inherit_stat case? And it's also probably not very helpful at this stage, but if you run checkpatch.pl on your patches it will point out some of the style issues.