2024-01-31 11:45:08

by Zhang, Rui

Subject: [PATCH 0/6] powercap: intel_rapl: Fixes and new platform enabling

Patch 1/6 fixes a real bug that occurs when the MMIO RAPL driver is
probed on platforms that are not in the current CPU model list. IMO, it
should be considered stable material.

Patch 2/6 fixes a potential race, but I have not reproduced it yet.

Patches 3/6 and 4/6 fix a problem where the TPMI RAPL driver probes
disabled System (Psys) RAPL domains.

Patches 5/6 and 6/6 are two simple CPU model updates to support MSR RAPL
on Lunar Lake and Arrow Lake platforms.

thanks,
rui

----------------------------------------------------------------
Sumeet Pawnikar (1):
powercap: intel_rapl: add support for Arrow Lake

Zhang Rui (5):
powercap: intel_rapl: Fix a NULL pointer reference bug
powercap: intel_rapl: Fix locking for TPMI RAPL
powercap: intel_rapl_tpmi: Fix a register bug
powercap: intel_rapl_tpmi: Fix System Domain probing
powercap: intel_rapl: Add support for LNL-M platform

drivers/powercap/intel_rapl_common.c | 36 ++++++++++++++++++++--
drivers/powercap/intel_rapl_msr.c | 8 ++---
drivers/powercap/intel_rapl_tpmi.c | 15 +++++++++
.../intel/int340x_thermal/processor_thermal_rapl.c | 8 ++---
include/linux/intel_rapl.h | 6 ++++
5 files changed, 62 insertions(+), 11 deletions(-)



2024-01-31 11:45:42

by Zhang, Rui

Subject: [PATCH 3/6] powercap: intel_rapl_tpmi: Fix a register bug

Add the missing Domain Info register. This also fixes the bogus
definition of the Interrupt register, whose enum value was off by one
because of the missing entry.

Neither of these two registers was used previously.

Fixes: 9eef7f9da928 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver")
Signed-off-by: Zhang Rui <[email protected]>
---
drivers/powercap/intel_rapl_tpmi.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/powercap/intel_rapl_tpmi.c b/drivers/powercap/intel_rapl_tpmi.c
index 891c90fefd8b..f1c734ac3c34 100644
--- a/drivers/powercap/intel_rapl_tpmi.c
+++ b/drivers/powercap/intel_rapl_tpmi.c
@@ -40,6 +40,7 @@ enum tpmi_rapl_register {
TPMI_RAPL_REG_ENERGY_STATUS,
TPMI_RAPL_REG_PERF_STATUS,
TPMI_RAPL_REG_POWER_INFO,
+ TPMI_RAPL_REG_DOMAIN_INFO,
TPMI_RAPL_REG_INTERRUPT,
TPMI_RAPL_REG_MAX = 15,
};
--
2.34.1
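
For readers following along, here is a minimal sketch of why one missing
enumerator also breaks the entry after it. The enums below are shortened,
hypothetical stand-ins (the real enum in intel_rapl_tpmi.c has more entries
before ENERGY_STATUS, and the OLD_/NEW_ names are made up for this
illustration). It only shows that inserting an enumerator shifts every later
implicit value by one.

```c
#include <assert.h>

/*
 * Hypothetical, shortened register enums for illustration only.
 * They do not reproduce the real register layout.
 */
enum regs_before_fix {
	OLD_REG_ENERGY_STATUS,
	OLD_REG_PERF_STATUS,
	OLD_REG_POWER_INFO,
	OLD_REG_INTERRUPT,	/* sits in the slot that belongs to Domain Info */
};

enum regs_after_fix {
	NEW_REG_ENERGY_STATUS,
	NEW_REG_PERF_STATUS,
	NEW_REG_POWER_INFO,
	NEW_REG_DOMAIN_INFO,	/* newly added entry */
	NEW_REG_INTERRUPT,	/* now one slot later */
};

/* Inserting one enumerator shifts every later implicit value by one. */
static_assert(NEW_REG_DOMAIN_INFO == OLD_REG_INTERRUPT,
	      "Domain Info takes the index Interrupt used to have");
static_assert(NEW_REG_INTERRUPT == OLD_REG_INTERRUPT + 1,
	      "Interrupt moves up by one");
```

Since the checks are compile-time assertions, the sketch documents the index
shift without needing any hardware access.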


2024-01-31 11:45:50

by Zhang, Rui

Subject: [PATCH 4/6] powercap: intel_rapl_tpmi: Fix System Domain probing

Only domain root packages can enumerate the System (Psys) domain.
Whether a package is a domain root is indicated by bit 0 of the
Domain Info register.

Add support for Domain Info register and fix the System domain probing
accordingly.

Fixes: 9eef7f9da928 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver")
Signed-off-by: Zhang Rui <[email protected]>
---
drivers/powercap/intel_rapl_tpmi.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/powercap/intel_rapl_tpmi.c b/drivers/powercap/intel_rapl_tpmi.c
index f1c734ac3c34..f6b7f085977c 100644
--- a/drivers/powercap/intel_rapl_tpmi.c
+++ b/drivers/powercap/intel_rapl_tpmi.c
@@ -131,6 +131,12 @@ static void trp_release(struct tpmi_rapl_package *trp)
mutex_unlock(&tpmi_rapl_lock);
}

+/*
+ * Bit 0 of TPMI_RAPL_REG_DOMAIN_INFO indicates if the current package is a domain
+ * root or not. Only domain root packages can enumerate System (Psys) Domain.
+ */
+#define TPMI_RAPL_DOMAIN_ROOT BIT(0)
+
static int parse_one_domain(struct tpmi_rapl_package *trp, u32 offset)
{
u8 tpmi_domain_version;
@@ -140,6 +146,7 @@ static int parse_one_domain(struct tpmi_rapl_package *trp, u32 offset)
enum rapl_domain_reg_id reg_id;
int tpmi_domain_size, tpmi_domain_flags;
u64 tpmi_domain_header = readq(trp->base + offset);
+ u64 tpmi_domain_info;

/* Domain Parent bits are ignored for now */
tpmi_domain_version = tpmi_domain_header & 0xff;
@@ -170,6 +177,13 @@ static int parse_one_domain(struct tpmi_rapl_package *trp, u32 offset)
domain_type = RAPL_DOMAIN_PACKAGE;
break;
case TPMI_RAPL_DOMAIN_SYSTEM:
+ if (!(tpmi_domain_flags & BIT(TPMI_RAPL_REG_DOMAIN_INFO))) {
+ pr_warn(FW_BUG "System domain must support Domain Info register\n");
+ return -ENODEV;
+ }
+ tpmi_domain_info = readq(trp->base + offset + TPMI_RAPL_REG_DOMAIN_INFO);
+ if (!(tpmi_domain_info & TPMI_RAPL_DOMAIN_ROOT))
+ return 0;
domain_type = RAPL_DOMAIN_PLATFORM;
break;
case TPMI_RAPL_DOMAIN_MEMORY:
--
2.34.1
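
The decision made in the System domain case can be sketched in plain
userspace C as below. This is only an illustration of the two checks in the
patch (is the Domain Info register advertised in the domain flags, and is
bit 0 of its value set). The enum, the function name psys_decide() and its
parameters are hypothetical and not part of the driver; the register value is
passed in instead of being read from MMIO.

```c
#include <stdint.h>

#define BIT_U64(n)	(UINT64_C(1) << (n))

/* Bit 0 of the Domain Info value: set when the package is a domain root. */
#define DOMAIN_ROOT	BIT_U64(0)

/* Hypothetical result codes for this sketch. */
enum psys_decision {
	PSYS_FW_BUG,	/* Domain Info register not advertised in the flags */
	PSYS_SKIP,	/* not a domain root: do not enumerate Psys */
	PSYS_ENUMERATE,	/* domain root: enumerate Psys */
};

/*
 * domain_flags:      bitmap of registers the domain supports
 * domain_info_index: bit position of the Domain Info register in that bitmap
 * domain_info:       value read from the Domain Info register
 */
static enum psys_decision psys_decide(uint64_t domain_flags,
				      unsigned int domain_info_index,
				      uint64_t domain_info)
{
	/* A System domain without a Domain Info register is a firmware bug. */
	if (!(domain_flags & BIT_U64(domain_info_index)))
		return PSYS_FW_BUG;

	/* Only domain root packages may enumerate the System domain. */
	if (!(domain_info & DOMAIN_ROOT))
		return PSYS_SKIP;

	return PSYS_ENUMERATE;
}
```

Note that a non-root package is skipped rather than treated as an error,
which matches the "return 0" in the patch.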


2024-01-31 11:46:15

by Zhang, Rui

Subject: [PATCH 6/6] powercap: intel_rapl: add support for Arrow Lake

From: Sumeet Pawnikar <[email protected]>

Add support for Arrow Lake platform.

Signed-off-by: Sumeet Pawnikar <[email protected]>
---
drivers/powercap/intel_rapl_common.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index a3094cb9f296..aa627a6b12a4 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -1263,6 +1263,7 @@ static const struct x86_cpu_id rapl_ids[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &rapl_defaults_spr_server),
X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, &rapl_defaults_spr_server),
X86_MATCH_INTEL_FAM6_MODEL(LUNARLAKE_M, &rapl_defaults_core),
+ X86_MATCH_INTEL_FAM6_MODEL(ARROWLAKE, &rapl_defaults_core),
X86_MATCH_INTEL_FAM6_MODEL(LAKEFIELD, &rapl_defaults_core),

X86_MATCH_INTEL_FAM6_MODEL(ATOM_SILVERMONT, &rapl_defaults_byt),
--
2.34.1


2024-01-31 12:08:21

by Zhang, Rui

Subject: [PATCH 2/6] powercap: intel_rapl: Fix locking for TPMI RAPL

The RAPL framework uses the CPU hotplug lock to protect the rapl_packages
list and rp->lead_cpu, which guarantees that
1. the RAPL package device is not unprobed and freed
2. the cached rp->lead_cpu is always valid
for operations like powercap sysfs accesses.

The current RAPL APIs assume they are called from CPU hotplug callbacks,
which hold the CPU hotplug lock. But the TPMI RAPL driver invokes the APIs
in the driver's .probe() function without the CPU hotplug lock held.

Fix the problem by providing two versions of each RAPL API: a _cpuslocked
variant for callers that already hold the CPU hotplug lock, and a plain
variant that takes the lock itself.

Fixes: 9eef7f9da928 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver")
Signed-off-by: Zhang Rui <[email protected]>
---
drivers/powercap/intel_rapl_common.c | 29 +++++++++++++++++--
drivers/powercap/intel_rapl_msr.c | 8 ++---
.../int340x_thermal/processor_thermal_rapl.c | 8 ++---
include/linux/intel_rapl.h | 6 ++++
4 files changed, 40 insertions(+), 11 deletions(-)

diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 1a739afd47d9..9d3e102f1a76 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -5,6 +5,7 @@
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

+#include <linux/cleanup.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/list.h>
@@ -1504,7 +1505,7 @@ static int rapl_detect_domains(struct rapl_package *rp)
}

/* called from CPU hotplug notifier, hotplug lock held */
-void rapl_remove_package(struct rapl_package *rp)
+void rapl_remove_package_cpuslocked(struct rapl_package *rp)
{
struct rapl_domain *rd, *rd_package = NULL;

@@ -1533,10 +1534,18 @@ void rapl_remove_package(struct rapl_package *rp)
list_del(&rp->plist);
kfree(rp);
}
+EXPORT_SYMBOL_GPL(rapl_remove_package_cpuslocked);
+
+void rapl_remove_package(struct rapl_package *rp)
+{
+ guard(cpus_read_lock)();
+ rapl_remove_package_cpuslocked(rp);
+}
EXPORT_SYMBOL_GPL(rapl_remove_package);

/* caller to ensure CPU hotplug lock is held */
-struct rapl_package *rapl_find_package_domain(int id, struct rapl_if_priv *priv, bool id_is_cpu)
+struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv,
+ bool id_is_cpu)
{
struct rapl_package *rp;
int uid;
@@ -1554,10 +1563,17 @@ struct rapl_package *rapl_find_package_domain(int id, struct rapl_if_priv *priv,

return NULL;
}
+EXPORT_SYMBOL_GPL(rapl_find_package_domain_cpuslocked);
+
+struct rapl_package *rapl_find_package_domain(int id, struct rapl_if_priv *priv, bool id_is_cpu)
+{
+ guard(cpus_read_lock)();
+ return rapl_find_package_domain_cpuslocked(id, priv, id_is_cpu);
+}
EXPORT_SYMBOL_GPL(rapl_find_package_domain);

/* called from CPU hotplug notifier, hotplug lock held */
-struct rapl_package *rapl_add_package(int id, struct rapl_if_priv *priv, bool id_is_cpu)
+struct rapl_package *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *priv, bool id_is_cpu)
{
struct rapl_package *rp;
int ret;
@@ -1603,6 +1619,13 @@ struct rapl_package *rapl_add_package(int id, struct rapl_if_priv *priv, bool id
kfree(rp);
return ERR_PTR(ret);
}
+EXPORT_SYMBOL_GPL(rapl_add_package_cpuslocked);
+
+struct rapl_package *rapl_add_package(int id, struct rapl_if_priv *priv, bool id_is_cpu)
+{
+ guard(cpus_read_lock)();
+ return rapl_add_package_cpuslocked(id, priv, id_is_cpu);
+}
EXPORT_SYMBOL_GPL(rapl_add_package);

static void power_limit_state_save(void)
diff --git a/drivers/powercap/intel_rapl_msr.c b/drivers/powercap/intel_rapl_msr.c
index 250bd41a588c..b4b6930cacb0 100644
--- a/drivers/powercap/intel_rapl_msr.c
+++ b/drivers/powercap/intel_rapl_msr.c
@@ -73,9 +73,9 @@ static int rapl_cpu_online(unsigned int cpu)
{
struct rapl_package *rp;

- rp = rapl_find_package_domain(cpu, rapl_msr_priv, true);
+ rp = rapl_find_package_domain_cpuslocked(cpu, rapl_msr_priv, true);
if (!rp) {
- rp = rapl_add_package(cpu, rapl_msr_priv, true);
+ rp = rapl_add_package_cpuslocked(cpu, rapl_msr_priv, true);
if (IS_ERR(rp))
return PTR_ERR(rp);
}
@@ -88,14 +88,14 @@ static int rapl_cpu_down_prep(unsigned int cpu)
struct rapl_package *rp;
int lead_cpu;

- rp = rapl_find_package_domain(cpu, rapl_msr_priv, true);
+ rp = rapl_find_package_domain_cpuslocked(cpu, rapl_msr_priv, true);
if (!rp)
return 0;

cpumask_clear_cpu(cpu, &rp->cpumask);
lead_cpu = cpumask_first(&rp->cpumask);
if (lead_cpu >= nr_cpu_ids)
- rapl_remove_package(rp);
+ rapl_remove_package_cpuslocked(rp);
else if (rp->lead_cpu == cpu)
rp->lead_cpu = lead_cpu;
return 0;
diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_rapl.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_rapl.c
index 2f00fc3bf274..e964a9375722 100644
--- a/drivers/thermal/intel/int340x_thermal/processor_thermal_rapl.c
+++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_rapl.c
@@ -27,9 +27,9 @@ static int rapl_mmio_cpu_online(unsigned int cpu)
if (topology_physical_package_id(cpu))
return 0;

- rp = rapl_find_package_domain(cpu, &rapl_mmio_priv, true);
+ rp = rapl_find_package_domain_cpuslocked(cpu, &rapl_mmio_priv, true);
if (!rp) {
- rp = rapl_add_package(cpu, &rapl_mmio_priv, true);
+ rp = rapl_add_package_cpuslocked(cpu, &rapl_mmio_priv, true);
if (IS_ERR(rp))
return PTR_ERR(rp);
}
@@ -42,14 +42,14 @@ static int rapl_mmio_cpu_down_prep(unsigned int cpu)
struct rapl_package *rp;
int lead_cpu;

- rp = rapl_find_package_domain(cpu, &rapl_mmio_priv, true);
+ rp = rapl_find_package_domain_cpuslocked(cpu, &rapl_mmio_priv, true);
if (!rp)
return 0;

cpumask_clear_cpu(cpu, &rp->cpumask);
lead_cpu = cpumask_first(&rp->cpumask);
if (lead_cpu >= nr_cpu_ids)
- rapl_remove_package(rp);
+ rapl_remove_package_cpuslocked(rp);
else if (rp->lead_cpu == cpu)
rp->lead_cpu = lead_cpu;
return 0;
diff --git a/include/linux/intel_rapl.h b/include/linux/intel_rapl.h
index 33f21bd85dbf..f3196f82fd8a 100644
--- a/include/linux/intel_rapl.h
+++ b/include/linux/intel_rapl.h
@@ -178,6 +178,12 @@ struct rapl_package {
struct rapl_if_priv *priv;
};

+struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv,
+ bool id_is_cpu);
+struct rapl_package *rapl_add_package_cpuslocked(int id, struct rapl_if_priv *priv,
+ bool id_is_cpu);
+void rapl_remove_package_cpuslocked(struct rapl_package *rp);
+
struct rapl_package *rapl_find_package_domain(int id, struct rapl_if_priv *priv, bool id_is_cpu);
struct rapl_package *rapl_add_package(int id, struct rapl_if_priv *priv, bool id_is_cpu);
void rapl_remove_package(struct rapl_package *rp);
--
2.34.1
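
The wrapper pattern used in this patch can be approximated in userspace as
shown below. This is a sketch under several assumptions, not the kernel
implementation: a pthread mutex stands in for the CPU hotplug lock (which is
a read/write style lock in the kernel), the scope-based unlock relies on the
GCC/Clang cleanup attribute in the same spirit as guard(), and all names
(hotplug_lock, add_package_locked, add_package, package_count) are
hypothetical.

```c
#include <pthread.h>

/* Stand-in for the CPU hotplug lock (simplified to a plain mutex). */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the state the lock protects. */
static int package_count;

/* "_locked" variant: the caller must already hold hotplug_lock. */
static int add_package_locked(void)
{
	return ++package_count;
}

/* Helpers for a scope-based guard built on the cleanup attribute. */
static pthread_mutex_t *lock_and_return(pthread_mutex_t *m)
{
	pthread_mutex_lock(m);
	return m;
}

static void unlock_cleanup(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Plain variant: takes the lock itself, then calls the locked variant. */
static int add_package(void)
{
	/* Unlocked automatically when 'guard' goes out of scope. */
	pthread_mutex_t *guard __attribute__((cleanup(unlock_cleanup))) =
		lock_and_return(&hotplug_lock);

	return add_package_locked();
}
```

A caller that already holds the lock (the hotplug callbacks in the patch)
would call add_package_locked() directly, while a caller such as a .probe()
function would call add_package(). Calling add_package() with the lock
already held would deadlock in this mutex-based sketch, which is why the two
entry points are kept separate.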


2024-02-13 19:00:32

by Rafael J. Wysocki

Subject: Re: [PATCH 0/6] powercap: intel_rapl: Fixes and new platform enabling

On Wed, Jan 31, 2024 at 12:37 PM Zhang Rui <[email protected]> wrote:
>
> Patch 1/6 fixes a real bug that occurs when the MMIO RAPL driver is
> probed on platforms that are not in the current CPU model list. IMO, it
> should be considered stable material.
>
> Patch 2/6 fixes a potential race, but I have not reproduced it yet.
>
> Patches 3/6 and 4/6 fix a problem where the TPMI RAPL driver probes
> disabled System (Psys) RAPL domains.
>
> Patches 5/6 and 6/6 are two simple CPU model updates to support MSR RAPL
> on Lunar Lake and Arrow Lake platforms.
>
> thanks,
> rui
>
> ----------------------------------------------------------------
> Sumeet Pawnikar (1):
> powercap: intel_rapl: add support for Arrow Lake
>
> Zhang Rui (5):
> powercap: intel_rapl: Fix a NULL pointer reference bug
> powercap: intel_rapl: Fix locking for TPMI RAPL
> powercap: intel_rapl_tpmi: Fix a register bug
> powercap: intel_rapl_tpmi: Fix System Domain probing
> powercap: intel_rapl: Add support for LNL-M platform
>
> drivers/powercap/intel_rapl_common.c | 36 ++++++++++++++++++++--
> drivers/powercap/intel_rapl_msr.c | 8 ++---
> drivers/powercap/intel_rapl_tpmi.c | 15 +++++++++
> .../intel/int340x_thermal/processor_thermal_rapl.c | 8 ++---
> include/linux/intel_rapl.h | 6 ++++
> 5 files changed, 62 insertions(+), 11 deletions(-)

All applied as 6.9 material with some minor changes in the subject and
changelogs of the last two patches.

Thanks!