From: "Rafael J. Wysocki"
Date: Fri, 26 Apr 2024 19:55:40 +0200
Subject: Re: [PATCH V3 1/2] powercap: intel_rapl: Introduce APIs for PMU support
To: Zhang Rui
Cc: rafael.j.wysocki@intel.com, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, srinivas.pandruvada@intel.com
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240422162040.1502626-1-rui.zhang@intel.com> <20240422162040.1502626-2-rui.zhang@intel.com>
In-Reply-To: <20240422162040.1502626-2-rui.zhang@intel.com>

On Mon, Apr 22, 2024 at 6:21 PM Zhang Rui wrote:
>
> Introduce two new APIs rapl_package_add_pmu()/rapl_package_remove_pmu().
>
> RAPL driver can invoke these APIs to expose its supported energy
> counters via perf PMU. The new RAPL PMU is fully compatible with current
> MSR RAPL PMU, including using the same PMU name and events
> name/id/unit/scale, etc.
>
> For example, use below command
>  perf stat -e power/energy-pkg/ -e power/energy-ram/ FOO
> to get the energy consumption if power/energy-pkg/ and power/energy-ram/
> events are available in the "perf list" output.
>
> This does not introduce any conflict because TPMI RAPL is the only user
> of these APIs currently, and it never co-exists with MSR RAPL.
>
> Note that RAPL Packages can be probed/removed dynamically, and the
> events supported by each TPMI RAPL device can be different. Thus the
> RAPL PMU support is done on demand, which means
> 1. PMU is registered only if it is needed by a RAPL Package. PMU events
>    for unsupported counters are not exposed.
> 2. PMU is unregistered and registered when a new RAPL Package is probed
>    and supports new counters that are not supported by current PMU.
>    For example, on a dual-package system using TPMI RAPL, it is possible
>    that Package 1 behaves as TPMI domain root and supports Psys domain.
>    In this case, register PMU without Psys event when probing Package 0,
>    and re-register the PMU with Psys event when probing Package 1.
> 3. PMU is unregistered when all registered RAPL Packages don't need PMU.
>
> Signed-off-by: Zhang Rui
> ---
>  drivers/powercap/intel_rapl_common.c | 578 +++++++++++++++++++++++++++
>  include/linux/intel_rapl.h           |  32 ++
>  2 files changed, 610 insertions(+)
>
> diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
> index c4302caeb631..1fa45ed8ba0b 100644
> --- a/drivers/powercap/intel_rapl_common.c
> +++ b/drivers/powercap/intel_rapl_common.c
> @@ -15,6 +15,8 @@
>  #include
>  #include
>  #include
> +#include
> +#include
>  #include
>  #include
>  #include
> @@ -1507,6 +1509,582 @@ static int rapl_detect_domains(struct rapl_package *rp)
>  	return 0;
>  }
>
> +#ifdef CONFIG_PERF_EVENTS
> +
> +/*
> + * Support for RAPL PMU
> + *
> + * Register a PMU if any of the registered RAPL Packages have the requirement
> + * of exposing its energy counters via Perf PMU.
> + *
> + * PMU Name:
> + *     power
> + *
> + * Events:
> + *     Name            Event id        RAPL Domain
> + *     energy_cores    0x01            RAPL_DOMAIN_PP0
> + *     energy_pkg      0x02            RAPL_DOMAIN_PACKAGE
> + *     energy_ram      0x03            RAPL_DOMAIN_DRAM
> + *     energy_gpu      0x04            RAPL_DOMAIN_PP1
> + *     energy_psys     0x05            RAPL_DOMAIN_PLATFORM
> + *
> + * Unit:
> + *     Joules
> + *
> + * Scale:
> + *     2.3283064365386962890625e-10
> + *     The same RAPL domain in different RAPL Packages may have different
> + *     energy units. Use 2.3283064365386962890625e-10 (2^-32) Joules as
> + *     the fixed unit for all energy counters, and covert each hardware
> + *     counter increase to N times of PMU event counter increases.
> + *
> + * This is fully compatible with the current MSR RAPL PMU. This means that
> + * userspace programs like turbostat can use the same code to handle RAPL Perf
> + * PMU, no matter what RAPL Interface driver (MSR/TPMI, etc) is running
> + * underlying on the platform.
> + *
> + * Note that RAPL Packages can be probed/removed dynamically, and the events
> + * supported by each TPMI RAPL device can be different. Thus the RAPL PMU
> + * support is done on demand, which means
> + * 1. PMU is registered only if it is needed by a RAPL Package. PMU events for
> + *    unsupported counters are not exposed.
> + * 2. PMU is unregistered and registered when a new RAPL Package is probed and
> + *    supports new counters that are not supported by current PMU.
> + * 3. PMU is unregistered when all registered RAPL Packages don't need PMU.
> + */
> +
> +struct rapl_pmu {
> +	struct pmu pmu;			/* Perf PMU structure */
> +	u64 timer_ms;			/* Maximum expiration time to avoid counter overflow */
> +	unsigned long domain_map;	/* Events supported by current registered PMU */
> +	bool registered;		/* Whether the PMU has been registered or not */
> +};
> +
> +static struct rapl_pmu rapl_pmu;
> +
> +/* PMU helpers */
> +
> +static int get_pmu_cpu(struct rapl_package *rp)
> +{
> +	int cpu;
> +
> +	if (!rp->has_pmu)
> +		return nr_cpu_ids;
> +
> +	/* Only TPMI RAPL is supported for now */
> +	if (rp->priv->type != RAPL_IF_TPMI)
> +		return nr_cpu_ids;
> +
> +	/* TPMI RAPL uses any CPU in the package for PMU */
> +	for_each_online_cpu(cpu)
> +		if (topology_physical_package_id(cpu) == rp->id)
> +			return cpu;
> +
> +	return nr_cpu_ids;
> +}
> +
> +static bool is_rp_pmu_cpu(struct rapl_package *rp, int cpu)
> +{
> +	if (!rp->has_pmu)
> +		return false;
> +
> +	/* Only TPMI RAPL is supported for now */
> +	if (rp->priv->type != RAPL_IF_TPMI)
> +		return nr_cpu_ids;

As per the comment, this should be false, shouldn't it?
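
To illustrate the point: in C, a non-zero integer returned from a function declared bool converts to true, so the non-TPMI branch as posted would report a match for any CPU. Below is a minimal userspace sketch under stated assumptions. struct rapl_package_sketch, package_of_cpu() and the fixed nr_cpu_ids value of 8 are hypothetical stand-ins, not the kernel definitions, and the interface type is stored directly in the struct rather than behind rp->priv.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel definitions; not the real types. */
enum sketch_if_type { SKETCH_IF_MSR, SKETCH_IF_TPMI };

struct rapl_package_sketch {
	int id;
	bool has_pmu;
	enum sketch_if_type type;	/* the patch reads rp->priv->type instead */
};

/* Stand-in for the kernel global; the value 8 is arbitrary. */
static const int nr_cpu_ids = 8;

/* Stand-in for topology_physical_package_id(): assume 4 CPUs per package. */
int package_of_cpu(int cpu)
{
	return cpu / 4;
}

/* As posted: the non-TPMI branch returns nr_cpu_ids, which converts to true. */
bool is_rp_pmu_cpu_posted(const struct rapl_package_sketch *rp, int cpu)
{
	if (!rp->has_pmu)
		return false;

	if (rp->type != SKETCH_IF_TPMI)
		return nr_cpu_ids;

	return package_of_cpu(cpu) == rp->id;
}

/* What the review seems to ask for: an unsupported interface never matches. */
bool is_rp_pmu_cpu_fixed(const struct rapl_package_sketch *rp, int cpu)
{
	if (!rp->has_pmu)
		return false;

	if (rp->type != SKETCH_IF_TPMI)
		return false;

	return package_of_cpu(cpu) == rp->id;
}
```

With a non-TPMI package that has has_pmu set, the posted variant returns true for any CPU number, while the fixed variant returns false.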

> +
> +	/* TPMI RAPL uses any CPU in the package for PMU */
> +	return topology_physical_package_id(cpu) == rp->id;
> +}
> +
> +static struct rapl_package_pmu_data *event_to_pmu_data(struct perf_event *event)
> +{
> +	struct rapl_package *rp = event->pmu_private;
> +
> +	return &rp->pmu_data;
> +}
> +
> +/* PMU event callbacks */
> +
> +static u64 event_read_counter(struct perf_event *event)
> +{
> +	struct rapl_package *rp = event->pmu_private;
> +	u64 val;
> +	int ret;
> +
> +	/* Return 0 for unsupported events */
> +	if (event->hw.idx < 0)
> +		return 0;
> +
> +	ret = rapl_read_data_raw(&rp->domains[event->hw.idx], ENERGY_COUNTER, false, &val);
> +
> +	/* Return 0 for failed read */
> +	if (ret)
> +		return 0;
> +
> +	return val;
> +}
> +
> +static void __rapl_pmu_event_start(struct perf_event *event)
> +{
> +	struct rapl_package_pmu_data *data = event_to_pmu_data(event);
> +
> +	if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
> +		return;
> +
> +	event->hw.state = 0;
> +
> +	list_add_tail(&event->active_entry, &data->active_list);
> +
> +	local64_set(&event->hw.prev_count, event_read_counter(event));
> +	if (++data->n_active == 1)
> +		hrtimer_start(&data->hrtimer, data->timer_interval,
> +			      HRTIMER_MODE_REL_PINNED);
> +}
> +
> +static void rapl_pmu_event_start(struct perf_event *event, int mode)
> +{
> +	struct rapl_package_pmu_data *data = event_to_pmu_data(event);
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&data->lock, flags);
> +	__rapl_pmu_event_start(event);
> +	raw_spin_unlock_irqrestore(&data->lock, flags);
> +}
> +
> +static u64 rapl_event_update(struct perf_event *event)
> +{
> +	struct hw_perf_event *hwc = &event->hw;
> +	struct rapl_package_pmu_data *data = event_to_pmu_data(event);
> +	u64 prev_raw_count, new_raw_count;
> +	s64 delta, sdelta;
> +	s64 tmp;
> +
> +	do {
> +		prev_raw_count = local64_read(&hwc->prev_count);
> +		new_raw_count = event_read_counter(event);
> +		tmp = local64_cmpxchg(&hwc->prev_count, prev_raw_count, new_raw_count);
> +	} while (tmp != prev_raw_count);

I think that it is only safe to call this function for draining an
event going away, because otherwise the above may turn into an endless
loop, and the function is called under a spinlock.

I would add a comment (above the loop) explaining that this is about
draining, so the counter is expected to stop incrementing shortly.

The rest of the patch LGTM.

Thanks!
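
For illustration only, the retry loop in rapl_event_update() and the kind of comment suggested for it could be sketched in userspace C11 as follows. Everything here is an assumption made for the sketch: fake_hw_counter, read_fake_counter(), struct fake_event and fake_event_update() are hypothetical names, C11 atomics stand in for local64_read()/local64_cmpxchg(), and the comment wording is just one possible phrasing. The remainder of the quoted function is not shown in the review, so returning the plain difference is also an assumption.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware energy counter. */
uint64_t fake_hw_counter;

/* Hypothetical stand-in for event_read_counter(). */
uint64_t read_fake_counter(void)
{
	return fake_hw_counter;
}

/* Hypothetical stand-in for the hw.prev_count field of a perf event. */
struct fake_event {
	_Atomic uint64_t prev_count;
};

/*
 * Publish a fresh counter reading and return the raw increase since the
 * previously published one.
 */
uint64_t fake_event_update(struct fake_event *ev)
{
	uint64_t prev, now;

	/*
	 * Illustrative comment wording: this is meant for draining an event
	 * that is going away, so the counter is expected to stop
	 * incrementing shortly and the loop to finish after a few retries.
	 */
	do {
		prev = atomic_load(&ev->prev_count);
		now = read_fake_counter();
		/* Retry if prev_count changed between the load and the exchange. */
	} while (!atomic_compare_exchange_strong(&ev->prev_count, &prev, now));

	return now - prev;
}
```

In this sketch, a single-threaded caller sees the loop body run exactly once: the stored value is replaced by the new reading and the difference between the two is returned.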