2023-03-08 07:07:03

by srinivas pandruvada

Subject: [PATCH v2 0/8] platform/x86: ISST: Use TPMI interface

This series implements TPMI as the hardware interface for Intel Speed
Select Technology (Intel SST). TPMI has several advantages for Intel SST.
It replaces the legacy mailbox and MMIO interfaces with an architectural
interface over TPMI.

This improves performance for HPC-type applications: a single IOCTL command
replaces tens of mailbox IOCTLs. It also makes it possible to offer many more
performance levels and SST configurations.

This series depends on a previously posted series:
- platform/x86/intel: Intel TPMI enumeration driver

Change History
v2
- Rebased on top of review-hans branch of platform-drivers-x86
- Removed patches that are already present in this branch from the last review,
  so the number of patches is reduced from 12 to 8.
- Rework patch for MSR 0x54 support
- Use suggestion from Hans for suspend/resume callbacks
- Add Reviewed-by and Tested-by tags


Srinivas Pandruvada (8):
platform/x86: ISST: Add support for MSR 0x54
platform/x86: ISST: Enumerate TPMI SST and create framework
platform/x86: ISST: Parse SST MMIO and update instance
platform/x86: ISST: Add SST-CP support via TPMI
platform/x86: ISST: Add SST-PP support via TPMI
platform/x86: ISST: Add SST-BF support via TPMI
platform/x86: ISST: Add SST-TF support via TPMI
platform/x86: ISST: Add suspend/resume callbacks

.../x86/intel/speed_select_if/Kconfig | 4 +
.../x86/intel/speed_select_if/Makefile | 2 +
.../intel/speed_select_if/isst_if_common.c | 28 +
.../x86/intel/speed_select_if/isst_tpmi.c | 72 +
.../intel/speed_select_if/isst_tpmi_core.c | 1438 +++++++++++++++++
.../intel/speed_select_if/isst_tpmi_core.h | 18 +
include/uapi/linux/isst_if.h | 303 ++++
7 files changed, 1865 insertions(+)
create mode 100644 drivers/platform/x86/intel/speed_select_if/isst_tpmi.c
create mode 100644 drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
create mode 100644 drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.h

--
2.34.1



2023-03-08 07:07:07

by srinivas pandruvada

Subject: [PATCH v2 1/8] platform/x86: ISST: Add support for MSR 0x54

MSR 0x53 is used to map the Linux CPU numbering scheme to the hardware
CPU numbering scheme. On the new generation of CPUs this MSR is not
valid, which is possible because it is a model-specific MSR.

A new MSR 0x54 is defined for this purpose. User space can use the
API version to distinguish its format from that of MSR 0x53.

The Intel Speed Select utility is updated to use the new format based
on the API version.
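
The read order used by the diff below (MSR 0x54 first on supported
models, MSR 0x53 as the fallback) can be sketched in user space as
follows. This is only an illustration: msr_read_fn, read_punit_cpu_id()
and the three fake readers are hypothetical names standing in for
rdmsrl_safe() and real hardware, and the returned values are made up.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_THREAD_ID_INFO 0x53 /* legacy CPU mapping MSR */
#define MSR_PM_LOGICAL_ID  0x54 /* CPU mapping MSR on newer models */

/* Hypothetical reader: returns 0 and fills *val on success, non-zero on error. */
typedef int (*msr_read_fn)(uint32_t msr, uint64_t *val);

/*
 * Prefer MSR 0x54 when the model supports it; fall back to MSR 0x53 if
 * 0x54 is not enabled or its read fails. On failure *punit_id is -1 and
 * the reader's error code is returned.
 */
static int read_punit_cpu_id(int hpm_support, msr_read_fn rd, int64_t *punit_id)
{
	uint64_t data = 0;
	int ret;

	assert(rd && punit_id);

	if (hpm_support && rd(MSR_PM_LOGICAL_ID, &data) == 0) {
		*punit_id = (int64_t)data;
		return 0;
	}

	ret = rd(MSR_THREAD_ID_INFO, &data);
	if (ret) {
		*punit_id = -1;
		return ret;
	}

	*punit_id = (int64_t)data;
	return 0;
}

/* Simulated CPU where only MSR 0x54 is readable (made-up value). */
static int fake_new_cpu(uint32_t msr, uint64_t *val)
{
	if (msr != MSR_PM_LOGICAL_ID)
		return -1;
	*val = 0x21;
	return 0;
}

/* Simulated CPU where only MSR 0x53 is readable (made-up value). */
static int fake_old_cpu(uint32_t msr, uint64_t *val)
{
	if (msr != MSR_THREAD_ID_INFO)
		return -1;
	*val = 0x7;
	return 0;
}

/* Simulated CPU where every MSR read fails. */
static int fake_broken_cpu(uint32_t msr, uint64_t *val)
{
	(void)msr;
	(void)val;
	return -5;
}
```

The raw MSR value is passed through unchanged, matching the v2 note
that format conversion is left to user space.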

Signed-off-by: Srinivas Pandruvada <[email protected]>
Reviewed-by: Zhang Rui <[email protected]>
Tested-by: Pragya Tanwar <[email protected]>
---
v2
- Don't do any format conversion, let user space do this

.../intel/speed_select_if/isst_if_common.c | 28 +++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
index 19e671500f00..e0572a29212e 100644
--- a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
+++ b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c
@@ -19,9 +19,13 @@
#include <linux/uaccess.h>
#include <uapi/linux/isst_if.h>

+#include <asm/cpu_device_id.h>
+#include <asm/intel-family.h>
+
#include "isst_if_common.h"

#define MSR_THREAD_ID_INFO 0x53
+#define MSR_PM_LOGICAL_ID 0x54
#define MSR_CPU_BUS_NUMBER 0x128

static struct isst_if_cmd_cb punit_callbacks[ISST_IF_DEV_MAX];
@@ -31,6 +35,7 @@ static int punit_msr_white_list[] = {
MSR_CONFIG_TDP_CONTROL,
MSR_TURBO_RATIO_LIMIT1,
MSR_TURBO_RATIO_LIMIT2,
+ MSR_PM_LOGICAL_ID,
};

struct isst_valid_cmd_ranges {
@@ -73,6 +78,8 @@ struct isst_cmd {
u32 param;
};

+static bool isst_hpm_support;
+
static DECLARE_HASHTABLE(isst_hash, 8);
static DEFINE_MUTEX(isst_hash_lock);

@@ -411,11 +418,20 @@ static int isst_if_cpu_online(unsigned int cpu)
isst_cpu_info[cpu].pci_dev[1] = _isst_if_get_pci_dev(cpu, 1, 30, 1);
}

+ if (isst_hpm_support) {
+
+ ret = rdmsrl_safe(MSR_PM_LOGICAL_ID, &data);
+ if (!ret)
+ goto set_punit_id;
+ }
+
ret = rdmsrl_safe(MSR_THREAD_ID_INFO, &data);
if (ret) {
isst_cpu_info[cpu].punit_cpu_id = -1;
return ret;
}
+
+set_punit_id:
isst_cpu_info[cpu].punit_cpu_id = data;

isst_restore_msr_local(cpu);
@@ -704,6 +720,12 @@ static struct miscdevice isst_if_char_driver = {
.fops = &isst_if_char_driver_ops,
};

+static const struct x86_cpu_id hpm_cpu_ids[] = {
+ X86_MATCH_INTEL_FAM6_MODEL(GRANITERAPIDS_X, NULL),
+ X86_MATCH_INTEL_FAM6_MODEL(SIERRAFOREST_X, NULL),
+ {}
+};
+
static int isst_misc_reg(void)
{
mutex_lock(&punit_misc_dev_reg_lock);
@@ -711,6 +733,12 @@ static int isst_misc_reg(void)
goto unlock_exit;

if (!misc_usage_count) {
+ const struct x86_cpu_id *id;
+
+ id = x86_match_cpu(hpm_cpu_ids);
+ if (id)
+ isst_hpm_support = true;
+
misc_device_ret = isst_if_cpu_info_init();
if (misc_device_ret)
goto unlock_exit;
--
2.34.1


2023-03-08 07:07:10

by srinivas pandruvada

Subject: [PATCH v2 4/8] platform/x86: ISST: Add SST-CP support via TPMI

Intel Speed Select Technology Core Power (SST-CP) is an interface that
allows users to define per-core priority. It provides a mechanism to
distribute power among cores in a power-constrained scenario, based on
a class of service (CLOS) configuration.

Three new IOCTLs are added:
ISST_IF_CORE_POWER_STATE : Enable/Disable SST-CP
ISST_IF_CLOS_PARAM : Configure CLOS parameters
ISST_IF_CLOS_ASSOC : Associate CPUs to a CLOS

To associate CPUs to a CLOS, either the Linux CPU numbering or the PUNIT
numbering scheme can be used, selected by the parameter punit_cpu_map
(1: PUNIT numbering, 0: Linux CPU numbering).
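
As a rough illustration of how a PUNIT CPU number selects its CLOS
association field, the sketch below reuses the SST_CLOS_ASSOC_* and
SST_REG_SIZE constants from the diff in this patch. The struct and the
helper names (clos_assoc_loc, clos_assoc_locate, clos_assoc_set,
clos_assoc_get) are hypothetical, and the register is modelled as a
plain 64-bit value instead of MMIO.

```c
#include <assert.h>
#include <stdint.h>

/* Constants as defined in the diff of this patch. */
#define SST_CLOS_ASSOC_0_OFFSET     56 /* bytes from the SST-CP bank base */
#define SST_CLOS_ASSOC_CPUS_PER_REG 16
#define SST_CLOS_ASSOC_BITS_PER_CPU 4
#define SST_REG_SIZE                8  /* all SST registers are 64 bits */

/* Hypothetical helper type: where one CPU's CLOS field lives. */
struct clos_assoc_loc {
	unsigned int offset; /* byte offset of the 64-bit register */
	unsigned int shift;  /* bit position of the 4-bit field */
};

static struct clos_assoc_loc clos_assoc_locate(unsigned int punit_cpu)
{
	struct clos_assoc_loc loc;

	loc.offset = SST_CLOS_ASSOC_0_OFFSET +
		     (punit_cpu / SST_CLOS_ASSOC_CPUS_PER_REG) * SST_REG_SIZE;
	loc.shift = (punit_cpu % SST_CLOS_ASSOC_CPUS_PER_REG) *
		    SST_CLOS_ASSOC_BITS_PER_CPU;
	return loc;
}

/* Return reg with the CLOS field of punit_cpu replaced by clos. */
static uint64_t clos_assoc_set(uint64_t reg, unsigned int punit_cpu, unsigned int clos)
{
	unsigned int shift = clos_assoc_locate(punit_cpu).shift;
	uint64_t mask = 0xfULL << shift;

	assert(clos <= 0xf);
	return (reg & ~mask) | ((uint64_t)clos << shift);
}

/* Extract the CLOS field of punit_cpu from reg. */
static unsigned int clos_assoc_get(uint64_t reg, unsigned int punit_cpu)
{
	return (unsigned int)((reg >> clos_assoc_locate(punit_cpu).shift) & 0xf);
}
```

For example, PUNIT CPU 17 falls in the second association register
(byte offset 64) and uses bits 7:4 of that register.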

There is no change to IOCTL to get PUNIT CPU number for a CPU.

Introduce the get_instance() function, which is used by the majority of
the IOCTL handlers to convert a socket and power domain to a
tpmi_per_power_domain_info * instance. This instance stores all the MMIO
offsets needed to read a particular field.

Once an instance is identified, read from or write to the correct MMIO
offset for a given field as defined in the specification.
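
The field access pattern behind the _read_cp_info()/_write_cp_info()
macros in this patch can be sketched in user space as below. This is a
minimal sketch, not the driver code: field_mask(), field_get() and
field_set() are hypothetical names, the register is a plain 64-bit
value in place of readq()/writeq(), and, unlike the macro, field_set()
also masks the new value so it cannot spill into neighbouring fields.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for GENMASK_ULL(start + width - 1, start). */
static uint64_t field_mask(unsigned int start, unsigned int width)
{
	assert(width >= 1 && width <= 64 && start + width <= 64);
	return (~0ULL >> (64 - width)) << start;
}

/* Extract a field and scale it to a standard unit (for example ratio -> MHz). */
static uint64_t field_get(uint64_t reg, unsigned int start, unsigned int width,
			  uint64_t mult_factor)
{
	return ((reg & field_mask(start, width)) >> start) * mult_factor;
}

/* Return reg with the field replaced by value / div_factor. */
static uint64_t field_set(uint64_t reg, unsigned int start, unsigned int width,
			  uint64_t value, uint64_t div_factor)
{
	uint64_t mask = field_mask(start, width);

	assert(div_factor != 0);
	reg &= ~mask;
	reg |= ((value / div_factor) << start) & mask;
	return reg;
}
```

With the CLOS minimum frequency field from the diff (start 8, width 8,
factor 100), a request for 2000 MHz is stored as ratio 20 in bits 15:8.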

For details on SST-CP operations using the intel-speed-select utility,
refer to:
Documentation/admin-guide/pm/intel-speed-select.rst
under the kernel documentation

Signed-off-by: Srinivas Pandruvada <[email protected]>
Reviewed-by: Zhang Rui <[email protected]>
Tested-by: Pragya Tanwar <[email protected]>
---
v2:
No change

.../intel/speed_select_if/isst_tpmi_core.c | 264 ++++++++++++++++++
include/uapi/linux/isst_if.h | 79 ++++++
2 files changed, 343 insertions(+)

diff --git a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
index 3453708c2dd0..bc1c1f26fbf9 100644
--- a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
+++ b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
@@ -31,6 +31,18 @@
/* Supported SST hardware version by this driver */
#define ISST_HEADER_VERSION 1

+/*
+ * Used to indicate if value read from MMIO needs to get multiplied
+ * to get to a standard unit or not.
+ */
+#define SST_MUL_FACTOR_NONE 1
+
+/* Define 100 as the scaling factor for frequency ratio to frequency conversion */
+#define SST_MUL_FACTOR_FREQ 100
+
+/* All SST regs are 64 bit size */
+#define SST_REG_SIZE 8
+
/**
* struct sst_header - SST main header
* @interface_version: Version number for this interface
@@ -359,6 +371,249 @@ static int sst_main(struct auxiliary_device *auxdev, struct tpmi_per_power_domai
return 0;
}

+/*
+ * Map a package and power_domain id to SST information structure unique for a power_domain.
+ * The caller should call under isst_tpmi_dev_lock.
+ */
+static struct tpmi_per_power_domain_info *get_instance(int pkg_id, int power_domain_id)
+{
+ struct tpmi_per_power_domain_info *power_domain_info;
+ struct tpmi_sst_struct *sst_inst;
+
+ if (pkg_id < 0 || pkg_id > isst_common.max_index ||
+ pkg_id >= topology_max_packages())
+ return NULL;
+
+ sst_inst = isst_common.sst_inst[pkg_id];
+ if (!sst_inst)
+ return NULL;
+
+ if (power_domain_id < 0 || power_domain_id >= sst_inst->number_of_power_domains)
+ return NULL;
+
+ power_domain_info = &sst_inst->power_domain_info[power_domain_id];
+
+ if (power_domain_info && !power_domain_info->sst_base)
+ return NULL;
+
+ return power_domain_info;
+}
+
+static bool disable_dynamic_sst_features(void)
+{
+ u64 value;
+
+ rdmsrl(MSR_PM_ENABLE, value);
+ return !(value & 0x1);
+}
+
+#define _read_cp_info(name_str, name, offset, start, width, mult_factor)\
+{\
+ u64 val, mask;\
+ \
+ val = readq(power_domain_info->sst_base + power_domain_info->sst_header.cp_offset +\
+ (offset));\
+ mask = GENMASK_ULL((start + width - 1), start);\
+ val &= mask; \
+ val >>= start;\
+ name = (val * mult_factor);\
+}
+
+#define _write_cp_info(name_str, name, offset, start, width, div_factor)\
+{\
+ u64 val, mask;\
+ \
+ val = readq(power_domain_info->sst_base +\
+ power_domain_info->sst_header.cp_offset + (offset));\
+ mask = GENMASK_ULL((start + width - 1), start);\
+ val &= ~mask;\
+ val |= (name / div_factor) << start;\
+ writeq(val, power_domain_info->sst_base + power_domain_info->sst_header.cp_offset +\
+ (offset));\
+}
+
+#define SST_CP_CONTROL_OFFSET 8
+#define SST_CP_STATUS_OFFSET 16
+
+#define SST_CP_ENABLE_START 0
+#define SST_CP_ENABLE_WIDTH 1
+
+#define SST_CP_PRIORITY_TYPE_START 1
+#define SST_CP_PRIORITY_TYPE_WIDTH 1
+
+static long isst_if_core_power_state(void __user *argp)
+{
+ struct tpmi_per_power_domain_info *power_domain_info;
+ struct isst_core_power core_power;
+
+ if (disable_dynamic_sst_features())
+ return -EFAULT;
+
+ if (copy_from_user(&core_power, argp, sizeof(core_power)))
+ return -EFAULT;
+
+ power_domain_info = get_instance(core_power.socket_id, core_power.power_domain_id);
+ if (!power_domain_info)
+ return -EINVAL;
+
+ if (core_power.get_set) {
+ _write_cp_info("cp_enable", core_power.enable, SST_CP_CONTROL_OFFSET,
+ SST_CP_ENABLE_START, SST_CP_ENABLE_WIDTH, SST_MUL_FACTOR_NONE)
+ _write_cp_info("cp_prio_type", core_power.priority_type, SST_CP_CONTROL_OFFSET,
+ SST_CP_PRIORITY_TYPE_START, SST_CP_PRIORITY_TYPE_WIDTH,
+ SST_MUL_FACTOR_NONE)
+ } else {
+ /* get */
+ _read_cp_info("cp_enable", core_power.enable, SST_CP_STATUS_OFFSET,
+ SST_CP_ENABLE_START, SST_CP_ENABLE_WIDTH, SST_MUL_FACTOR_NONE)
+ _read_cp_info("cp_prio_type", core_power.priority_type, SST_CP_STATUS_OFFSET,
+ SST_CP_PRIORITY_TYPE_START, SST_CP_PRIORITY_TYPE_WIDTH,
+ SST_MUL_FACTOR_NONE)
+ core_power.supported = !!(power_domain_info->sst_header.cap_mask & BIT(0));
+ if (copy_to_user(argp, &core_power, sizeof(core_power)))
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+#define SST_CLOS_CONFIG_0_OFFSET 24
+
+#define SST_CLOS_CONFIG_PRIO_START 4
+#define SST_CLOS_CONFIG_PRIO_WIDTH 4
+
+#define SST_CLOS_CONFIG_MIN_START 8
+#define SST_CLOS_CONFIG_MIN_WIDTH 8
+
+#define SST_CLOS_CONFIG_MAX_START 16
+#define SST_CLOS_CONFIG_MAX_WIDTH 8
+
+static long isst_if_clos_param(void __user *argp)
+{
+ struct tpmi_per_power_domain_info *power_domain_info;
+ struct isst_clos_param clos_param;
+
+ if (copy_from_user(&clos_param, argp, sizeof(clos_param)))
+ return -EFAULT;
+
+ power_domain_info = get_instance(clos_param.socket_id, clos_param.power_domain_id);
+ if (!power_domain_info)
+ return -EINVAL;
+
+ if (clos_param.get_set) {
+ _write_cp_info("clos.min_freq", clos_param.min_freq_mhz,
+ (SST_CLOS_CONFIG_0_OFFSET + clos_param.clos * SST_REG_SIZE),
+ SST_CLOS_CONFIG_MIN_START, SST_CLOS_CONFIG_MIN_WIDTH,
+ SST_MUL_FACTOR_FREQ);
+ _write_cp_info("clos.max_freq", clos_param.max_freq_mhz,
+ (SST_CLOS_CONFIG_0_OFFSET + clos_param.clos * SST_REG_SIZE),
+ SST_CLOS_CONFIG_MAX_START, SST_CLOS_CONFIG_MAX_WIDTH,
+ SST_MUL_FACTOR_FREQ);
+ _write_cp_info("clos.prio", clos_param.prop_prio,
+ (SST_CLOS_CONFIG_0_OFFSET + clos_param.clos * SST_REG_SIZE),
+ SST_CLOS_CONFIG_PRIO_START, SST_CLOS_CONFIG_PRIO_WIDTH,
+ SST_MUL_FACTOR_NONE);
+ } else {
+ /* get */
+ _read_cp_info("clos.min_freq", clos_param.min_freq_mhz,
+ (SST_CLOS_CONFIG_0_OFFSET + clos_param.clos * SST_REG_SIZE),
+ SST_CLOS_CONFIG_MIN_START, SST_CLOS_CONFIG_MIN_WIDTH,
+ SST_MUL_FACTOR_FREQ)
+ _read_cp_info("clos.max_freq", clos_param.max_freq_mhz,
+ (SST_CLOS_CONFIG_0_OFFSET + clos_param.clos * SST_REG_SIZE),
+ SST_CLOS_CONFIG_MAX_START, SST_CLOS_CONFIG_MAX_WIDTH,
+ SST_MUL_FACTOR_FREQ)
+ _read_cp_info("clos.prio", clos_param.prop_prio,
+ (SST_CLOS_CONFIG_0_OFFSET + clos_param.clos * SST_REG_SIZE),
+ SST_CLOS_CONFIG_PRIO_START, SST_CLOS_CONFIG_PRIO_WIDTH,
+ SST_MUL_FACTOR_NONE)
+
+ if (copy_to_user(argp, &clos_param, sizeof(clos_param)))
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+#define SST_CLOS_ASSOC_0_OFFSET 56
+#define SST_CLOS_ASSOC_CPUS_PER_REG 16
+#define SST_CLOS_ASSOC_BITS_PER_CPU 4
+
+static long isst_if_clos_assoc(void __user *argp)
+{
+ struct isst_if_clos_assoc_cmds assoc_cmds;
+ unsigned char __user *ptr;
+ int i;
+
+ /* Each multi command has u16 command count as the first field */
+ if (copy_from_user(&assoc_cmds, argp, sizeof(assoc_cmds)))
+ return -EFAULT;
+
+ if (!assoc_cmds.cmd_count || assoc_cmds.cmd_count > ISST_IF_CMD_LIMIT)
+ return -EINVAL;
+
+ ptr = argp + offsetof(struct isst_if_clos_assoc_cmds, assoc_info);
+ for (i = 0; i < assoc_cmds.cmd_count; ++i) {
+ struct tpmi_per_power_domain_info *power_domain_info;
+ struct isst_if_clos_assoc clos_assoc;
+ int punit_id, punit_cpu_no, pkg_id;
+ struct tpmi_sst_struct *sst_inst;
+ int offset, shift, cpu;
+ u64 val, mask, clos;
+
+ if (copy_from_user(&clos_assoc, ptr, sizeof(clos_assoc)))
+ return -EFAULT;
+
+ if (clos_assoc.socket_id >= topology_max_packages())
+ return -EINVAL;
+
+ cpu = clos_assoc.logical_cpu;
+ clos = clos_assoc.clos;
+
+ if (assoc_cmds.punit_cpu_map)
+ punit_cpu_no = cpu;
+ else
+ return -EOPNOTSUPP;
+
+ if (punit_cpu_no < 0)
+ return -EINVAL;
+
+ punit_id = clos_assoc.power_domain_id;
+ pkg_id = clos_assoc.socket_id;
+
+ sst_inst = isst_common.sst_inst[pkg_id];
+
+ if (clos_assoc.power_domain_id >= sst_inst->number_of_power_domains)
+ return -EINVAL;
+
+ power_domain_info = &sst_inst->power_domain_info[punit_id];
+
+ offset = SST_CLOS_ASSOC_0_OFFSET +
+ (punit_cpu_no / SST_CLOS_ASSOC_CPUS_PER_REG) * SST_REG_SIZE;
+ shift = punit_cpu_no % SST_CLOS_ASSOC_CPUS_PER_REG;
+ shift *= SST_CLOS_ASSOC_BITS_PER_CPU;
+
+ val = readq(power_domain_info->sst_base +
+ power_domain_info->sst_header.cp_offset + offset);
+ if (assoc_cmds.get_set) {
+ mask = GENMASK_ULL((shift + SST_CLOS_ASSOC_BITS_PER_CPU - 1), shift);
+ val &= ~mask;
+ val |= (clos << shift);
+ writeq(val, power_domain_info->sst_base +
+ power_domain_info->sst_header.cp_offset + offset);
+ } else {
+ val >>= shift;
+ clos_assoc.clos = val & GENMASK(SST_CLOS_ASSOC_BITS_PER_CPU - 1, 0);
+ if (copy_to_user(ptr, &clos_assoc, sizeof(clos_assoc)))
+ return -EFAULT;
+ }
+
+ ptr += sizeof(clos_assoc);
+ }
+
+ return 0;
+}
+
static int isst_if_get_tpmi_instance_count(void __user *argp)
{
struct isst_tpmi_instance_count tpmi_inst;
@@ -400,6 +655,15 @@ static long isst_if_def_ioctl(struct file *file, unsigned int cmd,
case ISST_IF_COUNT_TPMI_INSTANCES:
ret = isst_if_get_tpmi_instance_count(argp);
break;
+ case ISST_IF_CORE_POWER_STATE:
+ ret = isst_if_core_power_state(argp);
+ break;
+ case ISST_IF_CLOS_PARAM:
+ ret = isst_if_clos_param(argp);
+ break;
+ case ISST_IF_CLOS_ASSOC:
+ ret = isst_if_clos_assoc(argp);
+ break;
default:
break;
}
diff --git a/include/uapi/linux/isst_if.h b/include/uapi/linux/isst_if.h
index bf32d959f6e8..32687d8023ef 100644
--- a/include/uapi/linux/isst_if.h
+++ b/include/uapi/linux/isst_if.h
@@ -163,6 +163,82 @@ struct isst_if_msr_cmds {
struct isst_if_msr_cmd msr_cmd[1];
};

+/**
+ * struct isst_core_power - Structure to get/set core_power feature
+ * @get_set: 0: Get, 1: Set
+ * @socket_id: Socket/package id
+ * @power_domain_id: Power Domain id
+ * @enable: Feature enable status
+ * @priority_type: Priority type for the feature (ordered/proportional)
+ *
+ * Structure to get/set core_power feature state using IOCTL
+ * ISST_IF_CORE_POWER_STATE.
+ */
+struct isst_core_power {
+ __u8 get_set;
+ __u8 socket_id;
+ __u8 power_domain_id;
+ __u8 enable;
+ __u8 supported;
+ __u8 priority_type;
+};
+
+/**
+ * struct isst_clos_param - Structure to get/set clos param
+ * @get_set: 0: Get, 1: Set
+ * @socket_id: Socket/package id
+ * @power_domain_id: Power Domain id
+ * @clos: Clos ID for the parameters
+ * @min_freq_mhz: Minimum frequency in MHz
+ * @max_freq_mhz: Maximum frequency in MHz
+ * @prop_prio: Proportional priority from 0-15
+ *
+ * Structure to get/set per clos property using IOCTL
+ * ISST_IF_CLOS_PARAM.
+ */
+struct isst_clos_param {
+ __u8 get_set;
+ __u8 socket_id;
+ __u8 power_domain_id;
+ __u8 clos;
+ __u16 min_freq_mhz;
+ __u16 max_freq_mhz;
+ __u8 prop_prio;
+};
+
+/**
+ * struct isst_if_clos_assoc - Structure to assign clos to a CPU
+ * @socket_id: Socket/package id
+ * @power_domain_id: Power Domain id
+ * @logical_cpu: CPU number
+ * @clos: Clos ID to assign to the logical CPU
+ *
+ * Structure to get/set core_power feature.
+ */
+struct isst_if_clos_assoc {
+ __u8 socket_id;
+ __u8 power_domain_id;
+ __u16 logical_cpu;
+ __u16 clos;
+};
+
+/**
+ * struct isst_if_clos_assoc_cmds - Structure to assign clos to CPUs
+ * @cmd_count: Number of cmds (cpus) in this request
+ * @get_set: Request is for get or set
+ * @punit_cpu_map: Set to 1 if the CPU number is punit numbering not
+ * Linux CPU number
+ *
+ * Structure used to get/set associate CPUs to clos using IOCTL
+ * ISST_IF_CLOS_ASSOC.
+ */
+struct isst_if_clos_assoc_cmds {
+ __u16 cmd_count;
+ __u16 get_set;
+ __u16 punit_cpu_map;
+ struct isst_if_clos_assoc assoc_info[1];
+};
+
/**
* struct isst_tpmi_instance_count - Get number of TPMI instances per socket
* @socket_id: Socket/package id
@@ -186,5 +262,8 @@ struct isst_tpmi_instance_count {
#define ISST_IF_MSR_COMMAND _IOWR(ISST_IF_MAGIC, 4, struct isst_if_msr_cmds *)

#define ISST_IF_COUNT_TPMI_INSTANCES _IOR(ISST_IF_MAGIC, 5, struct isst_tpmi_instance_count *)
+#define ISST_IF_CORE_POWER_STATE _IOWR(ISST_IF_MAGIC, 6, struct isst_core_power *)
+#define ISST_IF_CLOS_PARAM _IOWR(ISST_IF_MAGIC, 7, struct isst_clos_param *)
+#define ISST_IF_CLOS_ASSOC _IOWR(ISST_IF_MAGIC, 8, struct isst_if_clos_assoc_cmds *)

#endif
--
2.34.1


2023-03-08 07:07:15

by srinivas pandruvada

Subject: [PATCH v2 3/8] platform/x86: ISST: Parse SST MMIO and update instance

SST registers are presented to the OS as multi-layer structures, starting
with an SST header that carries version information, freezing the current
definition.

For details on SST terminology refer to
Documentation/admin-guide/pm/intel-speed-select.rst
under the kernel documentation

SST TPMI details are published in the following document:
https://github.com/intel/tpmi_power_management/blob/main/SST_TPMI_public_disclosure_FINAL.docx

SST MMIO structure layout follows:
SST-HEADER
    SST-CP Header
        SST-CP CONTROL
        SST-CP STATUS
        SST-CP CONFIG0
        SST-CP CONFIG1
        ...
        ...
    SST-PP Header
        SST-PP OFFSET_0
        SST-PP OFFSET_1
        SST_PP_0_INFO
        SST_PP_1_INFO
        SST_PP_2_INFO
        SST_PP_3_INFO
        SST-PP CONTROL
        SST-PP STATUS

Each register bank contains the information needed to reach the next
lower level. This information is parsed and stored in the struct
tpmi_per_power_domain_info for each domain, and is then used to
process each SST request.
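
A minimal user-space sketch of this two-step parse is shown below,
assuming the little-endian byte layout implied by struct sst_header in
the diff (version in bits 7:0, cap_mask in 15:8, cp_offset in 23:16,
pp_offset in 31:24, offsets in Qword units). The names sst_header_info,
sst_header_decode() and pp_level_mmio_offset() are hypothetical, and
raw 64-bit values stand in for readq() results.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded view of the SST main header. */
struct sst_header_info {
	unsigned int interface_version;
	unsigned int cap_mask;
	unsigned int cp_offset; /* byte offset of the SST-CP bank */
	unsigned int pp_offset; /* byte offset of the SST-PP bank */
};

/* Split the 64-bit header; bank offsets are converted from Qwords to bytes. */
static struct sst_header_info sst_header_decode(uint64_t raw)
{
	struct sst_header_info h;

	h.interface_version = (unsigned int)(raw & 0xff);
	h.cap_mask = (unsigned int)((raw >> 8) & 0xff);
	h.cp_offset = (unsigned int)((raw >> 16) & 0xff) * 8;
	h.pp_offset = (unsigned int)((raw >> 24) & 0xff) * 8;
	return h;
}

/*
 * Byte offset of one PP level block from the SST base: each level has an
 * 8-bit Qword offset packed into the SST-PP OFFSET_1 register value.
 */
static unsigned int pp_level_mmio_offset(uint64_t level_offsets, unsigned int level,
					 unsigned int pp_offset)
{
	assert(level < 8);
	return pp_offset + (unsigned int)((level_offsets >> (level * 8)) & 0xff) * 8;
}
```

The sketch keeps the byte offsets in unsigned int rather than in the
u8 header fields, so a Qword offset of 32 or more does not wrap when
multiplied by 8.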

Signed-off-by: Srinivas Pandruvada <[email protected]>
Reviewed-by: Zhang Rui <[email protected]>
Tested-by: Pragya Tanwar <[email protected]>
---
v2:
No change

.../intel/speed_select_if/isst_tpmi_core.c | 291 +++++++++++++++++-
1 file changed, 287 insertions(+), 4 deletions(-)

diff --git a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
index 6b37016c0417..3453708c2dd0 100644
--- a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
+++ b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
@@ -20,6 +20,7 @@
#include <linux/auxiliary_bus.h>
#include <linux/intel_tpmi.h>
#include <linux/fs.h>
+#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <uapi/linux/isst_if.h>
@@ -27,10 +28,192 @@
#include "isst_tpmi_core.h"
#include "isst_if_common.h"

+/* Supported SST hardware version by this driver */
+#define ISST_HEADER_VERSION 1
+
+/**
+ * struct sst_header - SST main header
+ * @interface_version: Version number for this interface
+ * @cap_mask: Bitmask of the supported sub features. 1=the sub feature is enabled.
+ * 0=disabled.
+ * Bit[8]= SST_CP enable (1), disable (0)
+ * bit[9]= SST_PP enable (1), disable (0)
+ * other bits are reserved for future use
+ * @cp_offset: Qword (8 bytes) offset to the SST_CP register bank
+ * @pp_offset: Qword (8 bytes) offset to the SST_PP register bank
+ * @reserved: Reserved for future use
+ *
+ * This register allows SW to discover SST capability and the offsets to SST-CP
+ * and SST-PP register banks.
+ */
+struct sst_header {
+ u8 interface_version;
+ u8 cap_mask;
+ u8 cp_offset;
+ u8 pp_offset;
+ u32 reserved;
+} __packed;
+
+/**
+ * struct cp_header - SST-CP (core-power) header
+ * @feature_id: 0=SST-CP, 1=SST-PP, 2=SST-BF, 3=SST-TF
+ * @feature_rev: Interface Version number for this SST feature
+ * @ratio_unit: Frequency ratio unit. 00: 100MHz. All others are reserved
+ * @reserved: Reserved for future use
+ *
+ * This structure is used to store SST-CP header. This is packed to the same
+ * format as defined in the specifications.
+ */
+struct cp_header {
+ u64 feature_id :4;
+ u64 feature_rev :8;
+ u64 ratio_unit :2;
+ u64 reserved :50;
+} __packed;
+
+/**
+ * struct pp_header - SST-PP (Perf profile) header
+ * @feature_id: 0=SST-CP, 1=SST-PP, 2=SST-BF, 3=SST-TF
+ * @feature_rev: Interface Version number for this SST feature
+ * @level_en_mask: SST-PP level enable/disable fuse mask
+ * @allowed_level_mask: Allowed level mask used for dynamic config level switching
+ * @reserved0: Reserved for future use
+ * @ratio_unit: Frequency ratio unit. 00: 100MHz. All others are reserved
+ * @block_size: Size of PP block in Qword unit (8 bytes)
+ * @dynamic_switch: If set (1), dynamic switching of SST PP is supported
+ * @memory_ratio_unit: Memory Controller frequency ratio unit. 00: 100MHz, others reserved
+ * @reserved1: Reserved for future use
+ *
+ * This structure is used to store SST-PP header. This is packed to the same
+ * format as defined in the specifications.
+ */
+struct pp_header {
+ u64 feature_id :4;
+ u64 feature_rev :8;
+ u64 level_en_mask :8;
+ u64 allowed_level_mask :8;
+ u64 reserved0 :4;
+ u64 ratio_unit :2;
+ u64 block_size :8;
+ u64 dynamic_switch :1;
+ u64 memory_ratio_unit :2;
+ u64 reserved1 :19;
+} __packed;
+
+/**
+ * struct feature_offset - Offsets to SST-PP features
+ * @pp_offset: Qword offset within PP level for the SST_PP register bank
+ * @bf_offset: Qword offset within PP level for the SST_BF register bank
+ * @tf_offset: Qword offset within PP level for the SST_TF register bank
+ * @reserved: Reserved for future use
+ *
+ * This structure is used to store offsets for SST features in the register bank.
+ * This is packed to the same format as defined in the specifications.
+ */
+struct feature_offset {
+ u64 pp_offset :8;
+ u64 bf_offset :8;
+ u64 tf_offset :8;
+ u64 reserved :40;
+} __packed;
+
+/**
+ * struct levels_offset - Offsets to each SST PP level
+ * @sst_pp_level0_offset: Qword offset to the register block of PP level 0
+ * @sst_pp_level1_offset: Qword offset to the register block of PP level 1
+ * @sst_pp_level2_offset: Qword offset to the register block of PP level 2
+ * @sst_pp_level3_offset: Qword offset to the register block of PP level 3
+ * @sst_pp_level4_offset: Qword offset to the register block of PP level 4
+ * @reserved: Reserved for future use
+ *
+ * This structure is used to store offsets of SST PP levels in the register bank.
+ * This is packed to the same format as defined in the specifications.
+ */
+struct levels_offset {
+ u64 sst_pp_level0_offset :8;
+ u64 sst_pp_level1_offset :8;
+ u64 sst_pp_level2_offset :8;
+ u64 sst_pp_level3_offset :8;
+ u64 sst_pp_level4_offset :8;
+ u64 reserved :24;
+} __packed;
+
+/**
+ * struct pp_control_offset - Offsets for SST PP controls
+ * @perf_level: A SST-PP level that SW intends to switch to
+ * @perf_level_lock: SST-PP level select lock. 0 - unlocked. 1 - locked till next reset
+ * @resvd0: Reserved for future use
+ * @current_state: Bit mask to control the enable(1)/disable(0) state of each feature
+ * of the current PP level, bit 0 = BF, bit 1 = TF, bit 2-7 = reserved
+ * @reserved: Reserved for future use
+ *
+ * This structure is used to store offsets of SST PP controls in the register bank.
+ * This is packed to the same format as defined in the specifications.
+ */
+struct pp_control_offset {
+ u64 perf_level :3;
+ u64 perf_level_lock :1;
+ u64 resvd0 :4;
+ u64 current_state :8;
+ u64 reserved :48;
+} __packed;
+
+/**
+ * struct pp_status_offset - Offsets for SST PP status fields
+ * @sst_pp_level: Returns the current SST-PP level
+ * @sst_pp_lock: Returns the lock bit setting of perf_level_lock in pp_control_offset
+ * @error_type: Returns last error of SST-PP level change request. 0: no error,
+ * 1: level change not allowed, others: reserved
+ * @feature_state: Bit mask to indicate the enable(1)/disable(0) state of each feature of the
+ * current PP level. bit 0 = BF, bit 1 = TF, bit 2-7 reserved
+ * @reserved0: Reserved for future use
+ * @feature_error_type: Returns last error of the specific feature. Three error_type bits per
+ * feature. i.e. ERROR_TYPE[2:0] for BF, ERROR_TYPE[5:3] for TF, etc.
+ * 0x0: no error, 0x1: The specific feature is not supported by the hardware.
+ * 0x2-0x6: Reserved. 0x7: feature state change is not allowed.
+ * @reserved1: Reserved for future use
+ *
+ * This structure is used to store offsets of SST PP status in the register bank.
+ * This is packed to the same format as defined in the specifications.
+ */
+struct pp_status_offset {
+ u64 sst_pp_level :3;
+ u64 sst_pp_lock :1;
+ u64 error_type :4;
+ u64 feature_state :8;
+ u64 reserved0 :16;
+ u64 feature_error_type : 24;
+ u64 reserved1 :8;
+} __packed;
+
+/**
+ * struct perf_level - Used to store perf level and mmio offset
+ * @mmio_offset: mmio offset for a perf level
+ * @level: perf level for this offset
+ *
+ * This structure is used to store final mmio offset of each perf level from the
+ * SST base mmio offset.
+ */
+struct perf_level {
+ int mmio_offset;
+ int level;
+};
+
/**
* struct tpmi_per_power_domain_info - Store per power_domain SST info
* @package_id: Package id for this power_domain
* @power_domain_id: Power domain id, Each entry from the SST-TPMI instance is a power_domain.
+ * @max_level: Max possible PP level for this power_domain
+ * @ratio_unit: Ratio unit for converting to MHz
+ * @avx_levels: Number of AVX levels
+ * @pp_block_size: Block size from PP header
+ * @sst_header: Store SST header for this power_domain
+ * @cp_header: Store SST-CP header for this power_domain
+ * @pp_header: Store SST-PP header for this power_domain
+ * @perf_levels: Pointer to each perf level to map level to mmio offset
+ * @feature_offsets: Store feature offsets for each PP-level
+ * @control_offset: Store the control offset for each PP-level
+ * @status_offset: Store the status offset for each PP-level
* @sst_base: Mapped SST base IO memory
* @auxdev: Auxiliary device instance enumerated this instance
*
@@ -41,6 +224,17 @@
struct tpmi_per_power_domain_info {
int package_id;
int power_domain_id;
+ int max_level;
+ int ratio_unit;
+ int avx_levels;
+ int pp_block_size;
+ struct sst_header sst_header;
+ struct cp_header cp_header;
+ struct pp_header pp_header;
+ struct perf_level *perf_levels;
+ struct feature_offset feature_offsets;
+ struct pp_control_offset control_offset;
+ struct pp_status_offset status_offset;
void __iomem *sst_base;
struct auxiliary_device *auxdev;
};
@@ -85,6 +279,86 @@ static int isst_core_usage_count;
/* Stores complete SST information for every package and power_domain */
static struct tpmi_sst_common_struct isst_common;

+#define SST_MAX_AVX_LEVELS 3
+
+#define SST_PP_OFFSET_0 8
+#define SST_PP_OFFSET_1 16
+#define SST_PP_OFFSET_SIZE 8
+
+static int sst_add_perf_profiles(struct auxiliary_device *auxdev,
+ struct tpmi_per_power_domain_info *pd_info,
+ int levels)
+{
+ u64 perf_level_offsets;
+ int i;
+
+ pd_info->perf_levels = devm_kcalloc(&auxdev->dev, levels,
+ sizeof(struct perf_level),
+ GFP_KERNEL);
+ if (!pd_info->perf_levels)
+ return 0;
+
+ pd_info->ratio_unit = pd_info->pp_header.ratio_unit;
+ pd_info->avx_levels = SST_MAX_AVX_LEVELS;
+ pd_info->pp_block_size = pd_info->pp_header.block_size;
+
+ /* Read PP Offset 0: Get feature offset with PP level */
+ *((u64 *)&pd_info->feature_offsets) = readq(pd_info->sst_base +
+ pd_info->sst_header.pp_offset +
+ SST_PP_OFFSET_0);
+
+ perf_level_offsets = readq(pd_info->sst_base + pd_info->sst_header.pp_offset +
+ SST_PP_OFFSET_1);
+
+ for (i = 0; i < levels; ++i) {
+ u64 offset;
+
+ offset = perf_level_offsets & (0xffULL << (i * SST_PP_OFFSET_SIZE));
+ offset >>= (i * 8);
+ offset &= 0xff;
+ offset *= 8; /* Convert to byte from QWORD offset */
+ pd_info->perf_levels[i].mmio_offset = pd_info->sst_header.pp_offset + offset;
+ }
+
+ return 0;
+}
+
+static int sst_main(struct auxiliary_device *auxdev, struct tpmi_per_power_domain_info *pd_info)
+{
+ int i, mask, levels;
+
+ *((u64 *)&pd_info->sst_header) = readq(pd_info->sst_base);
+ pd_info->sst_header.cp_offset *= 8;
+ pd_info->sst_header.pp_offset *= 8;
+
+ if (pd_info->sst_header.interface_version != ISST_HEADER_VERSION) {
+ dev_err(&auxdev->dev, "SST: Unsupported version:%x\n",
+ pd_info->sst_header.interface_version);
+ return -ENODEV;
+ }
+
+ /* Read SST CP Header */
+ *((u64 *)&pd_info->cp_header) = readq(pd_info->sst_base + pd_info->sst_header.cp_offset);
+
+ /* Read PP header */
+ *((u64 *)&pd_info->pp_header) = readq(pd_info->sst_base + pd_info->sst_header.pp_offset);
+
+ /* Force level_en_mask level 0 */
+ pd_info->pp_header.level_en_mask |= 0x01;
+
+ mask = 0x01;
+ levels = 0;
+ for (i = 0; i < 8; ++i) {
+ if (pd_info->pp_header.level_en_mask & mask)
+ levels = i;
+ mask <<= 1;
+ }
+ pd_info->max_level = levels;
+ sst_add_perf_profiles(auxdev, pd_info, levels + 1);
+
+ return 0;
+}
+
static int isst_if_get_tpmi_instance_count(void __user *argp)
{
struct isst_tpmi_instance_count tpmi_inst;
@@ -102,10 +376,10 @@ static int isst_if_get_tpmi_instance_count(void __user *argp)
sst_inst = isst_common.sst_inst[tpmi_inst.socket_id];
tpmi_inst.valid_mask = 0;
for (i = 0; i < sst_inst->number_of_power_domains; ++i) {
- struct tpmi_per_power_domain_info *power_domain_info;
+ struct tpmi_per_power_domain_info *pd_info;

- power_domain_info = &sst_inst->power_domain_info[i];
- if (power_domain_info->sst_base)
+ pd_info = &sst_inst->power_domain_info[i];
+ if (pd_info->sst_base)
tpmi_inst.valid_mask |= BIT(i);
}

@@ -134,11 +408,13 @@ static long isst_if_def_ioctl(struct file *file, unsigned int cmd,
return ret;
}

+#define TPMI_SST_AUTO_SUSPEND_DELAY_MS 2000
+
int tpmi_sst_dev_add(struct auxiliary_device *auxdev)
{
struct intel_tpmi_plat_info *plat_info;
struct tpmi_sst_struct *tpmi_sst;
- int i, pkg = 0, inst = 0;
+ int i, ret, pkg = 0, inst = 0;
int num_resources;

plat_info = tpmi_get_platform_data(auxdev);
@@ -189,6 +465,13 @@ int tpmi_sst_dev_add(struct auxiliary_device *auxdev)
if (IS_ERR(tpmi_sst->power_domain_info[i].sst_base))
return PTR_ERR(tpmi_sst->power_domain_info[i].sst_base);

+ ret = sst_main(auxdev, &tpmi_sst->power_domain_info[i]);
+ if (ret) {
+ devm_iounmap(&auxdev->dev, tpmi_sst->power_domain_info[i].sst_base);
+ tpmi_sst->power_domain_info[i].sst_base = NULL;
+ continue;
+ }
+
++inst;
}

--
2.34.1


2023-03-08 07:07:19

by srinivas pandruvada

Subject: [PATCH v2 6/8] platform/x86: ISST: Add SST-BF support via TPMI

The Intel Speed Select Technology - Base Frequency (SST-BF) feature lets
the user control base frequency. If some critical workload threads demand
constant high guaranteed performance, this feature can be used to
execute those threads at a higher base frequency on specific sets of CPUs
(high priority CPUs) at the cost of a lower base frequency on the other
CPUs (low priority CPUs).

Two new IOCTLs are added:
ISST_IF_GET_BASE_FREQ_INFO : Get frequency information for high and
low priority CPUs
ISST_IF_GET_BASE_FREQ_CPU_MASK : CPUs capable of higher frequency

Once an instance is identified, read or write from the correct MMIO
offset for a given field as defined in the specification.

For details on SST-BF operations using the intel-speed-select utility,
refer to:
Documentation/admin-guide/pm/intel-speed-select.rst
under the kernel documentation.

Signed-off-by: Srinivas Pandruvada <[email protected]>
Reviewed-by: Zhang Rui <[email protected]>
Tested-by: Pragya Tanwar <[email protected]>
---
v2:
No change
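
As an illustration of the field layout used by ISST_IF_GET_BASE_FREQ_INFO, here is a minimal user-space sketch that decodes one 64-bit SST-BF info value. The bit positions and the divide-by-8 for TDP are copied from the diff below; the macro names, struct bf_info, extract_field() and decode_bf_info() are hypothetical, and the 100 MHz ratio unit is an assumption because SST_MUL_FACTOR_FREQ is not defined in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions within SST_BF_INFO_0, copied from the patch. */
#define BF_P1_HIGH_START	13
#define BF_P1_HIGH_WIDTH	8
#define BF_P1_LOW_START		21
#define BF_P1_LOW_WIDTH		8
#define BF_T_PROCHOT_START	38
#define BF_T_PROCHOT_WIDTH	8
#define BF_TDP_START		46
#define BF_TDP_WIDTH		15

/* Assumption: one ratio unit is 100 MHz (SST_MUL_FACTOR_FREQ is not shown). */
#define RATIO_TO_MHZ		100

struct bf_info {
	unsigned int high_base_freq_mhz;
	unsigned int low_base_freq_mhz;
	unsigned int tjunction_max_c;
	unsigned int thermal_design_power_w;
};

/* Extract "width" bits starting at bit "start" from a 64-bit value. */
uint64_t extract_field(uint64_t raw, unsigned int start, unsigned int width)
{
	uint64_t mask;

	assert(width >= 1 && start + width <= 64);

	/* A shift by 64 is undefined in C, so handle the full-width case. */
	mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);

	return (raw >> start) & mask;
}

/* Decode the four fields the ioctl handler reports for one level. */
struct bf_info decode_bf_info(uint64_t raw)
{
	struct bf_info info;

	info.high_base_freq_mhz =
		extract_field(raw, BF_P1_HIGH_START, BF_P1_HIGH_WIDTH) * RATIO_TO_MHZ;
	info.low_base_freq_mhz =
		extract_field(raw, BF_P1_LOW_START, BF_P1_LOW_WIDTH) * RATIO_TO_MHZ;
	info.tjunction_max_c =
		extract_field(raw, BF_T_PROCHOT_START, BF_T_PROCHOT_WIDTH);
	/* The raw TDP field is in units of 1/8th watt. */
	info.thermal_design_power_w =
		extract_field(raw, BF_TDP_START, BF_TDP_WIDTH) / 8;

	return info;
}
```

The full-width case matters for the 64-bit core mask read by ISST_IF_GET_BASE_FREQ_CPU_MASK, where start is 0 and width is 64.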

.../intel/speed_select_if/isst_tpmi_core.c | 87 +++++++++++++++++++
1 file changed, 87 insertions(+)

diff --git a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
index c9b1321dfd1b..0566c8d30181 100644
--- a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
+++ b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
@@ -1014,6 +1014,87 @@ static int isst_if_get_perf_level_mask(void __user *argp)
return 0;
}

+#define SST_BF_INFO_0_OFFSET 0
+#define SST_BF_INFO_1_OFFSET 8
+
+#define SST_BF_P1_HIGH_START 13
+#define SST_BF_P1_HIGH_WIDTH 8
+
+#define SST_BF_P1_LOW_START 21
+#define SST_BF_P1_LOW_WIDTH 8
+
+#define SST_BF_T_PROHOT_START 38
+#define SST_BF_T_PROHOT_WIDTH 8
+
+#define SST_BF_TDP_START 46
+#define SST_BF_TDP_WIDTH 15
+
+static int isst_if_get_base_freq_info(void __user *argp)
+{
+ static struct isst_base_freq_info base_freq;
+ struct tpmi_per_power_domain_info *power_domain_info;
+
+ if (copy_from_user(&base_freq, argp, sizeof(base_freq)))
+ return -EFAULT;
+
+ power_domain_info = get_instance(base_freq.socket_id, base_freq.power_domain_id);
+ if (!power_domain_info)
+ return -EINVAL;
+
+ if (base_freq.level > power_domain_info->max_level)
+ return -EINVAL;
+
+ _read_bf_level_info("p1_high", base_freq.high_base_freq_mhz, base_freq.level,
+ SST_BF_INFO_0_OFFSET, SST_BF_P1_HIGH_START, SST_BF_P1_HIGH_WIDTH,
+ SST_MUL_FACTOR_FREQ)
+ _read_bf_level_info("p1_low", base_freq.low_base_freq_mhz, base_freq.level,
+ SST_BF_INFO_0_OFFSET, SST_BF_P1_LOW_START, SST_BF_P1_LOW_WIDTH,
+ SST_MUL_FACTOR_FREQ)
+ _read_bf_level_info("BF-TJ", base_freq.tjunction_max_c, base_freq.level,
+ SST_BF_INFO_0_OFFSET, SST_BF_T_PROHOT_START, SST_BF_T_PROHOT_WIDTH,
+ SST_MUL_FACTOR_NONE)
+ _read_bf_level_info("BF-tdp", base_freq.thermal_design_power_w, base_freq.level,
+ SST_BF_INFO_0_OFFSET, SST_BF_TDP_START, SST_BF_TDP_WIDTH,
+ SST_MUL_FACTOR_NONE)
+ base_freq.thermal_design_power_w /= 8; /* units are in 1/8th watt */
+
+ if (copy_to_user(argp, &base_freq, sizeof(base_freq)))
+ return -EFAULT;
+
+ return 0;
+}
+
+#define P1_HI_CORE_MASK_START 0
+#define P1_HI_CORE_MASK_WIDTH 64
+
+static int isst_if_get_base_freq_mask(void __user *argp)
+{
+ static struct isst_perf_level_cpu_mask cpumask;
+ struct tpmi_per_power_domain_info *power_domain_info;
+ u64 mask;
+
+ if (copy_from_user(&cpumask, argp, sizeof(cpumask)))
+ return -EFAULT;
+
+ power_domain_info = get_instance(cpumask.socket_id, cpumask.power_domain_id);
+ if (!power_domain_info)
+ return -EINVAL;
+
+ _read_bf_level_info("BF-cpumask", mask, cpumask.level, SST_BF_INFO_1_OFFSET,
+ P1_HI_CORE_MASK_START, P1_HI_CORE_MASK_WIDTH,
+ SST_MUL_FACTOR_NONE)
+
+ cpumask.mask = mask;
+
+ if (!cpumask.punit_cpu_map)
+ return -EOPNOTSUPP;
+
+ if (copy_to_user(argp, &cpumask, sizeof(cpumask)))
+ return -EFAULT;
+
+ return 0;
+}
+
static int isst_if_get_tpmi_instance_count(void __user *argp)
{
struct isst_tpmi_instance_count tpmi_inst;
@@ -1079,6 +1160,12 @@ static long isst_if_def_ioctl(struct file *file, unsigned int cmd,
case ISST_IF_GET_PERF_LEVEL_CPU_MASK:
ret = isst_if_get_perf_level_mask(argp);
break;
+ case ISST_IF_GET_BASE_FREQ_INFO:
+ ret = isst_if_get_base_freq_info(argp);
+ break;
+ case ISST_IF_GET_BASE_FREQ_CPU_MASK:
+ ret = isst_if_get_base_freq_mask(argp);
+ break;
default:
break;
}
--
2.34.1


2023-03-08 07:07:23

by srinivas pandruvada

Subject: [PATCH v2 5/8] platform/x86: ISST: Add SST-PP support via TPMI

The Intel Speed Select Technology - Performance Profile (SST-PP) feature
introduces a mechanism that allows multiple optimized performance profiles
per system. Each profile defines a set of CPUs that need to be online, with
the rest offline, to sustain a guaranteed base frequency.

Five new IOCTLs are added:
ISST_IF_PERF_LEVELS : Get number of performance levels
ISST_IF_PERF_SET_LEVEL : Set to a new performance level
ISST_IF_PERF_SET_FEATURE : Activate SST-BF/SST-TF for a performance level
ISST_IF_GET_PERF_LEVEL_INFO : Get parameters for a performance level
ISST_IF_GET_PERF_LEVEL_CPU_MASK : Get CPU mask for a performance level

Once an instance is identified, read or write from the correct MMIO
offset for a given field as defined in the specification.

For details on SST-PP operations using the intel-speed-select utility,
refer to:
Documentation/admin-guide/pm/intel-speed-select.rst
under the kernel documentation.

Signed-off-by: Srinivas Pandruvada <[email protected]>
Reviewed-by: Zhang Rui <[email protected]>
Tested-by: Pragya Tanwar <[email protected]>
---
v2:
No change
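
For reference, a user-space caller of ISST_IF_PERF_LEVELS might look like the sketch below. The structure layout and the ioctl number are copied from the uapi hunk in this patch; on a kernel carrying this series one would include <linux/isst_if.h> instead of the local copy. The helper name query_perf_levels() is hypothetical, and the device node path is left to the caller because it is not named in this patch.

```c
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Local copy of the structure added by this patch (include/uapi/linux/isst_if.h). */
struct isst_perf_level_info {
	uint8_t socket_id;
	uint8_t power_domain_id;
	uint8_t max_level;
	uint8_t feature_rev;
	uint8_t level_mask;
	uint8_t current_level;
	uint8_t feature_state;
	uint8_t locked;
	uint8_t enabled;
	uint8_t sst_tf_support;
	uint8_t sst_bf_support;
};

#define ISST_IF_MAGIC		0xFE
#define ISST_IF_PERF_LEVELS	_IOWR(ISST_IF_MAGIC, 9, struct isst_perf_level_info *)

/*
 * Query SST-PP level information for one socket/power domain.
 * Returns 0 on success and -1 on failure (errno is set by open()/ioctl()).
 * The helper name is hypothetical; dev_path is supplied by the caller.
 */
int query_perf_levels(const char *dev_path, uint8_t socket_id,
		      uint8_t power_domain_id,
		      struct isst_perf_level_info *info)
{
	int fd, ret;

	fd = open(dev_path, O_RDWR);
	if (fd < 0)
		return -1;

	/* Inputs go in through the same structure the kernel fills in. */
	memset(info, 0, sizeof(*info));
	info->socket_id = socket_id;
	info->power_domain_id = power_domain_id;

	ret = ioctl(fd, ISST_IF_PERF_LEVELS, info);
	close(fd);

	return ret < 0 ? -1 : 0;
}
```

Assuming the interface is exposed as /dev/isst_interface (an assumption, as with the existing Intel SST interface), a call would be query_perf_levels("/dev/isst_interface", 0, 0, &info), after which info.current_level and info.level_mask hold the reported state.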

.../intel/speed_select_if/isst_tpmi_core.c | 417 +++++++++++++++++-
include/uapi/linux/isst_if.h | 180 ++++++++
2 files changed, 596 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
index bc1c1f26fbf9..c9b1321dfd1b 100644
--- a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
+++ b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
@@ -18,6 +18,7 @@
*/

#include <linux/auxiliary_bus.h>
+#include <linux/delay.h>
#include <linux/intel_tpmi.h>
#include <linux/fs.h>
#include <linux/io.h>
@@ -325,7 +326,7 @@ static int sst_add_perf_profiles(struct auxiliary_device *auxdev,
for (i = 0; i < levels; ++i) {
u64 offset;

- offset = perf_level_offsets & (0xff << (i * SST_PP_OFFSET_SIZE));
+ offset = perf_level_offsets & (0xffULL << (i * SST_PP_OFFSET_SIZE));
offset >>= (i * 8);
offset &= 0xff;
offset *= 8; /* Convert to byte from QWORD offset */
@@ -614,6 +615,405 @@ static long isst_if_clos_assoc(void __user *argp)
return 0;
}

+#define _read_pp_info(name_str, name, offset, start, width, mult_factor)\
+{\
+ u64 val, _mask;\
+ \
+ val = readq(power_domain_info->sst_base + power_domain_info->sst_header.pp_offset +\
+ (offset));\
+ _mask = GENMASK_ULL((start + width - 1), start);\
+ val &= _mask;\
+ val >>= start;\
+ name = (val * mult_factor);\
+}
+
+#define _write_pp_info(name_str, name, offset, start, width, div_factor)\
+{\
+ u64 val, _mask;\
+ \
+ val = readq(power_domain_info->sst_base + power_domain_info->sst_header.pp_offset +\
+ (offset));\
+ _mask = GENMASK_ULL((start + width - 1), start);\
+ val &= ~_mask;\
+ val |= (name / div_factor) << start;\
+ writeq(val, power_domain_info->sst_base + power_domain_info->sst_header.pp_offset +\
+ (offset));\
+}
+
+#define _read_bf_level_info(name_str, name, level, offset, start, width, mult_factor)\
+{\
+ u64 val, _mask;\
+ \
+ val = readq(power_domain_info->sst_base +\
+ power_domain_info->perf_levels[level].mmio_offset +\
+ (power_domain_info->feature_offsets.bf_offset * 8) + (offset));\
+ _mask = GENMASK_ULL((start + width - 1), start);\
+ val &= _mask; \
+ val >>= start;\
+ name = (val * mult_factor);\
+}
+
+#define _read_tf_level_info(name_str, name, level, offset, start, width, mult_factor)\
+{\
+ u64 val, _mask;\
+ \
+ val = readq(power_domain_info->sst_base +\
+ power_domain_info->perf_levels[level].mmio_offset +\
+ (power_domain_info->feature_offsets.tf_offset * 8) + (offset));\
+ _mask = GENMASK_ULL((start + width - 1), start);\
+ val &= _mask; \
+ val >>= start;\
+ name = (val * mult_factor);\
+}
+
+#define SST_PP_STATUS_OFFSET 32
+
+#define SST_PP_LEVEL_START 0
+#define SST_PP_LEVEL_WIDTH 3
+
+#define SST_PP_LOCK_START 3
+#define SST_PP_LOCK_WIDTH 1
+
+#define SST_PP_FEATURE_STATE_START 8
+#define SST_PP_FEATURE_STATE_WIDTH 8
+
+#define SST_BF_FEATURE_SUPPORTED_START 12
+#define SST_BF_FEATURE_SUPPORTED_WIDTH 1
+
+#define SST_TF_FEATURE_SUPPORTED_START 12
+#define SST_TF_FEATURE_SUPPORTED_WIDTH 1
+
+static int isst_if_get_perf_level(void __user *argp)
+{
+ struct isst_perf_level_info perf_level;
+ struct tpmi_per_power_domain_info *power_domain_info;
+
+ if (copy_from_user(&perf_level, argp, sizeof(perf_level)))
+ return -EFAULT;
+
+ power_domain_info = get_instance(perf_level.socket_id, perf_level.power_domain_id);
+ if (!power_domain_info)
+ return -EINVAL;
+
+ perf_level.max_level = power_domain_info->max_level;
+ perf_level.level_mask = power_domain_info->pp_header.allowed_level_mask;
+ perf_level.feature_rev = power_domain_info->pp_header.feature_rev;
+ _read_pp_info("current_level", perf_level.current_level, SST_PP_STATUS_OFFSET,
+ SST_PP_LEVEL_START, SST_PP_LEVEL_WIDTH, SST_MUL_FACTOR_NONE)
+ _read_pp_info("locked", perf_level.locked, SST_PP_STATUS_OFFSET,
+ SST_PP_LOCK_START, SST_PP_LOCK_WIDTH, SST_MUL_FACTOR_NONE)
+ _read_pp_info("feature_state", perf_level.feature_state, SST_PP_STATUS_OFFSET,
+ SST_PP_FEATURE_STATE_START, SST_PP_FEATURE_STATE_WIDTH, SST_MUL_FACTOR_NONE)
+ perf_level.enabled = !!(power_domain_info->sst_header.cap_mask & BIT(1));
+
+ _read_bf_level_info("bf_support", perf_level.sst_bf_support, 0, 0,
+ SST_BF_FEATURE_SUPPORTED_START, SST_BF_FEATURE_SUPPORTED_WIDTH,
+ SST_MUL_FACTOR_NONE);
+ _read_tf_level_info("tf_support", perf_level.sst_tf_support, 0, 0,
+ SST_TF_FEATURE_SUPPORTED_START, SST_TF_FEATURE_SUPPORTED_WIDTH,
+ SST_MUL_FACTOR_NONE);
+
+ if (copy_to_user(argp, &perf_level, sizeof(perf_level)))
+ return -EFAULT;
+
+ return 0;
+}
+
+#define SST_PP_CONTROL_OFFSET 24
+#define SST_PP_LEVEL_CHANGE_TIME_MS 5
+#define SST_PP_LEVEL_CHANGE_RETRY_COUNT 3
+
+static int isst_if_set_perf_level(void __user *argp)
+{
+ struct isst_perf_level_control perf_level;
+ struct tpmi_per_power_domain_info *power_domain_info;
+ int level, retry = 0;
+
+ if (disable_dynamic_sst_features())
+ return -EFAULT;
+
+ if (copy_from_user(&perf_level, argp, sizeof(perf_level)))
+ return -EFAULT;
+
+ power_domain_info = get_instance(perf_level.socket_id, perf_level.power_domain_id);
+ if (!power_domain_info)
+ return -EINVAL;
+
+ if (!(power_domain_info->pp_header.allowed_level_mask & BIT(perf_level.level)))
+ return -EINVAL;
+
+ _read_pp_info("current_level", level, SST_PP_STATUS_OFFSET,
+ SST_PP_LEVEL_START, SST_PP_LEVEL_WIDTH, SST_MUL_FACTOR_NONE)
+
+ /* If the requested new level is same as the current level, reject */
+ if (perf_level.level == level)
+ return -EINVAL;
+
+ _write_pp_info("perf_level", perf_level.level, SST_PP_CONTROL_OFFSET,
+ SST_PP_LEVEL_START, SST_PP_LEVEL_WIDTH, SST_MUL_FACTOR_NONE)
+
+ /* It is possible that firmware is busy (although unlikely), so retry */
+ do {
+ /* Give time to FW to process */
+ msleep(SST_PP_LEVEL_CHANGE_TIME_MS);
+
+ _read_pp_info("current_level", level, SST_PP_STATUS_OFFSET,
+ SST_PP_LEVEL_START, SST_PP_LEVEL_WIDTH, SST_MUL_FACTOR_NONE)
+
+ /* Check if the new level is active */
+ if (perf_level.level == level)
+ break;
+
+ } while (retry++ < SST_PP_LEVEL_CHANGE_RETRY_COUNT);
+
+ /* If the level change didn't happen, return fault */
+ if (perf_level.level != level)
+ return -EFAULT;
+
+ /* Reset the feature state on level change */
+ _write_pp_info("perf_feature", 0, SST_PP_CONTROL_OFFSET,
+ SST_PP_FEATURE_STATE_START, SST_PP_FEATURE_STATE_WIDTH,
+ SST_MUL_FACTOR_NONE)
+
+ /* Give time to FW to process */
+ msleep(SST_PP_LEVEL_CHANGE_TIME_MS);
+
+ return 0;
+}
+
+static int isst_if_set_perf_feature(void __user *argp)
+{
+ struct isst_perf_feature_control perf_feature;
+ struct tpmi_per_power_domain_info *power_domain_info;
+
+ if (disable_dynamic_sst_features())
+ return -EFAULT;
+
+ if (copy_from_user(&perf_feature, argp, sizeof(perf_feature)))
+ return -EFAULT;
+
+ power_domain_info = get_instance(perf_feature.socket_id, perf_feature.power_domain_id);
+ if (!power_domain_info)
+ return -EINVAL;
+
+ _write_pp_info("perf_feature", perf_feature.feature, SST_PP_CONTROL_OFFSET,
+ SST_PP_FEATURE_STATE_START, SST_PP_FEATURE_STATE_WIDTH,
+ SST_MUL_FACTOR_NONE)
+
+ return 0;
+}
+
+#define _read_pp_level_info(name_str, name, level, offset, start, width, mult_factor)\
+{\
+ u64 val, _mask;\
+ \
+ val = readq(power_domain_info->sst_base +\
+ power_domain_info->perf_levels[level].mmio_offset +\
+ (power_domain_info->feature_offsets.pp_offset * 8) + (offset));\
+ _mask = GENMASK_ULL((start + width - 1), start);\
+ val &= _mask; \
+ val >>= start;\
+ name = (val * mult_factor);\
+}
+
+#define SST_PP_INFO_0_OFFSET 0
+#define SST_PP_INFO_1_OFFSET 8
+#define SST_PP_INFO_2_OFFSET 16
+#define SST_PP_INFO_3_OFFSET 24
+
+/* SST_PP_INFO_4_OFFSET to SST_PP_INFO_9_OFFSET are trl levels */
+#define SST_PP_INFO_4_OFFSET 32
+
+#define SST_PP_INFO_10_OFFSET 80
+#define SST_PP_INFO_11_OFFSET 88
+
+#define SST_PP_P1_SSE_START 0
+#define SST_PP_P1_SSE_WIDTH 8
+
+#define SST_PP_P1_AVX2_START 8
+#define SST_PP_P1_AVX2_WIDTH 8
+
+#define SST_PP_P1_AVX512_START 16
+#define SST_PP_P1_AVX512_WIDTH 8
+
+#define SST_PP_P1_AMX_START 24
+#define SST_PP_P1_AMX_WIDTH 8
+
+#define SST_PP_TDP_START 32
+#define SST_PP_TDP_WIDTH 15
+
+#define SST_PP_T_PROCHOT_START 47
+#define SST_PP_T_PROCHOT_WIDTH 8
+
+#define SST_PP_MAX_MEMORY_FREQ_START 55
+#define SST_PP_MAX_MEMORY_FREQ_WIDTH 7
+
+#define SST_PP_COOLING_TYPE_START 62
+#define SST_PP_COOLING_TYPE_WIDTH 2
+
+#define SST_PP_TRL_0_RATIO_0_START 0
+#define SST_PP_TRL_0_RATIO_0_WIDTH 8
+
+#define SST_PP_TRL_CORES_BUCKET_0_START 0
+#define SST_PP_TRL_CORES_BUCKET_0_WIDTH 8
+
+#define SST_PP_CORE_RATIO_P0_START 0
+#define SST_PP_CORE_RATIO_P0_WIDTH 8
+
+#define SST_PP_CORE_RATIO_P1_START 8
+#define SST_PP_CORE_RATIO_P1_WIDTH 8
+
+#define SST_PP_CORE_RATIO_PN_START 16
+#define SST_PP_CORE_RATIO_PN_WIDTH 8
+
+#define SST_PP_CORE_RATIO_PM_START 24
+#define SST_PP_CORE_RATIO_PM_WIDTH 8
+
+#define SST_PP_CORE_RATIO_P0_FABRIC_START 32
+#define SST_PP_CORE_RATIO_P0_FABRIC_WIDTH 8
+
+#define SST_PP_CORE_RATIO_P1_FABRIC_START 40
+#define SST_PP_CORE_RATIO_P1_FABRIC_WIDTH 8
+
+#define SST_PP_CORE_RATIO_PM_FABRIC_START 48
+#define SST_PP_CORE_RATIO_PM_FABRIC_WIDTH 8
+
+static int isst_if_get_perf_level_info(void __user *argp)
+{
+ struct isst_perf_level_data_info perf_level;
+ struct tpmi_per_power_domain_info *power_domain_info;
+ int i, j;
+
+ if (copy_from_user(&perf_level, argp, sizeof(perf_level)))
+ return -EFAULT;
+
+ power_domain_info = get_instance(perf_level.socket_id, perf_level.power_domain_id);
+ if (!power_domain_info)
+ return -EINVAL;
+
+ if (perf_level.level > power_domain_info->max_level)
+ return -EINVAL;
+
+ if (!(power_domain_info->pp_header.level_en_mask & BIT(perf_level.level)))
+ return -EINVAL;
+
+ _read_pp_level_info("tdp_ratio", perf_level.tdp_ratio, perf_level.level,
+ SST_PP_INFO_0_OFFSET, SST_PP_P1_SSE_START, SST_PP_P1_SSE_WIDTH,
+ SST_MUL_FACTOR_NONE)
+ _read_pp_level_info("base_freq_mhz", perf_level.base_freq_mhz, perf_level.level,
+ SST_PP_INFO_0_OFFSET, SST_PP_P1_SSE_START, SST_PP_P1_SSE_WIDTH,
+ SST_MUL_FACTOR_FREQ)
+ _read_pp_level_info("base_freq_avx2_mhz", perf_level.base_freq_avx2_mhz, perf_level.level,
+ SST_PP_INFO_0_OFFSET, SST_PP_P1_AVX2_START, SST_PP_P1_AVX2_WIDTH,
+ SST_MUL_FACTOR_FREQ)
+ _read_pp_level_info("base_freq_avx512_mhz", perf_level.base_freq_avx512_mhz,
+ perf_level.level, SST_PP_INFO_0_OFFSET, SST_PP_P1_AVX512_START,
+ SST_PP_P1_AVX512_WIDTH, SST_MUL_FACTOR_FREQ)
+ _read_pp_level_info("base_freq_amx_mhz", perf_level.base_freq_amx_mhz, perf_level.level,
+ SST_PP_INFO_0_OFFSET, SST_PP_P1_AMX_START, SST_PP_P1_AMX_WIDTH,
+ SST_MUL_FACTOR_FREQ)
+
+ _read_pp_level_info("thermal_design_power_w", perf_level.thermal_design_power_w,
+ perf_level.level, SST_PP_INFO_1_OFFSET, SST_PP_TDP_START,
+ SST_PP_TDP_WIDTH, SST_MUL_FACTOR_NONE)
+ perf_level.thermal_design_power_w /= 8; /* units are in 1/8th watt */
+ _read_pp_level_info("tjunction_max_c", perf_level.tjunction_max_c, perf_level.level,
+ SST_PP_INFO_1_OFFSET, SST_PP_T_PROCHOT_START, SST_PP_T_PROCHOT_WIDTH,
+ SST_MUL_FACTOR_NONE)
+ _read_pp_level_info("max_memory_freq_mhz", perf_level.max_memory_freq_mhz,
+ perf_level.level, SST_PP_INFO_1_OFFSET, SST_PP_MAX_MEMORY_FREQ_START,
+ SST_PP_MAX_MEMORY_FREQ_WIDTH, SST_MUL_FACTOR_FREQ)
+ _read_pp_level_info("cooling_type", perf_level.cooling_type, perf_level.level,
+ SST_PP_INFO_1_OFFSET, SST_PP_COOLING_TYPE_START,
+ SST_PP_COOLING_TYPE_WIDTH, SST_MUL_FACTOR_NONE)
+
+ for (i = 0; i < TRL_MAX_LEVELS; ++i) {
+ for (j = 0; j < TRL_MAX_BUCKETS; ++j)
+ _read_pp_level_info("trl*_bucket*_freq_mhz",
+ perf_level.trl_freq_mhz[i][j], perf_level.level,
+ SST_PP_INFO_4_OFFSET + (i * SST_PP_TRL_0_RATIO_0_WIDTH),
+ j * SST_PP_TRL_0_RATIO_0_WIDTH,
+ SST_PP_TRL_0_RATIO_0_WIDTH,
+ SST_MUL_FACTOR_FREQ);
+ }
+
+ for (i = 0; i < TRL_MAX_BUCKETS; ++i)
+ _read_pp_level_info("bucket*_core_count", perf_level.bucket_core_counts[i],
+ perf_level.level, SST_PP_INFO_10_OFFSET,
+ SST_PP_TRL_CORES_BUCKET_0_WIDTH * i,
+ SST_PP_TRL_CORES_BUCKET_0_WIDTH, SST_MUL_FACTOR_NONE)
+
+ perf_level.max_buckets = TRL_MAX_BUCKETS;
+ perf_level.max_trl_levels = TRL_MAX_LEVELS;
+
+ _read_pp_level_info("p0_freq_mhz", perf_level.p0_freq_mhz, perf_level.level,
+ SST_PP_INFO_11_OFFSET, SST_PP_CORE_RATIO_P0_START,
+ SST_PP_CORE_RATIO_P0_WIDTH, SST_MUL_FACTOR_FREQ)
+ _read_pp_level_info("p1_freq_mhz", perf_level.p1_freq_mhz, perf_level.level,
+ SST_PP_INFO_11_OFFSET, SST_PP_CORE_RATIO_P1_START,
+ SST_PP_CORE_RATIO_P1_WIDTH, SST_MUL_FACTOR_FREQ)
+ _read_pp_level_info("pn_freq_mhz", perf_level.pn_freq_mhz, perf_level.level,
+ SST_PP_INFO_11_OFFSET, SST_PP_CORE_RATIO_PN_START,
+ SST_PP_CORE_RATIO_PN_WIDTH, SST_MUL_FACTOR_FREQ)
+ _read_pp_level_info("pm_freq_mhz", perf_level.pm_freq_mhz, perf_level.level,
+ SST_PP_INFO_11_OFFSET, SST_PP_CORE_RATIO_PM_START,
+ SST_PP_CORE_RATIO_PM_WIDTH, SST_MUL_FACTOR_FREQ)
+ _read_pp_level_info("p0_fabric_freq_mhz", perf_level.p0_fabric_freq_mhz,
+ perf_level.level, SST_PP_INFO_11_OFFSET,
+ SST_PP_CORE_RATIO_P0_FABRIC_START,
+ SST_PP_CORE_RATIO_P0_FABRIC_WIDTH, SST_MUL_FACTOR_FREQ)
+ _read_pp_level_info("p1_fabric_freq_mhz", perf_level.p1_fabric_freq_mhz,
+ perf_level.level, SST_PP_INFO_11_OFFSET,
+ SST_PP_CORE_RATIO_P1_FABRIC_START,
+ SST_PP_CORE_RATIO_P1_FABRIC_WIDTH, SST_MUL_FACTOR_FREQ)
+ _read_pp_level_info("pm_fabric_freq_mhz", perf_level.pm_fabric_freq_mhz,
+ perf_level.level, SST_PP_INFO_11_OFFSET,
+ SST_PP_CORE_RATIO_PM_FABRIC_START,
+ SST_PP_CORE_RATIO_PM_FABRIC_WIDTH, SST_MUL_FACTOR_FREQ)
+
+ if (copy_to_user(argp, &perf_level, sizeof(perf_level)))
+ return -EFAULT;
+
+ return 0;
+}
+
+#define SST_PP_FUSED_CORE_COUNT_START 0
+#define SST_PP_FUSED_CORE_COUNT_WIDTH 8
+
+#define SST_PP_RSLVD_CORE_COUNT_START 8
+#define SST_PP_RSLVD_CORE_COUNT_WIDTH 8
+
+#define SST_PP_RSLVD_CORE_MASK_START 0
+#define SST_PP_RSLVD_CORE_MASK_WIDTH 64
+
+static int isst_if_get_perf_level_mask(void __user *argp)
+{
+ static struct isst_perf_level_cpu_mask cpumask;
+ struct tpmi_per_power_domain_info *power_domain_info;
+ u64 mask;
+
+ if (copy_from_user(&cpumask, argp, sizeof(cpumask)))
+ return -EFAULT;
+
+ power_domain_info = get_instance(cpumask.socket_id, cpumask.power_domain_id);
+ if (!power_domain_info)
+ return -EINVAL;
+
+ _read_pp_level_info("mask", mask, cpumask.level, SST_PP_INFO_2_OFFSET,
+ SST_PP_RSLVD_CORE_MASK_START, SST_PP_RSLVD_CORE_MASK_WIDTH,
+ SST_MUL_FACTOR_NONE)
+
+ cpumask.mask = mask;
+
+ if (!cpumask.punit_cpu_map)
+ return -EOPNOTSUPP;
+
+ if (copy_to_user(argp, &cpumask, sizeof(cpumask)))
+ return -EFAULT;
+
+ return 0;
+}
+
static int isst_if_get_tpmi_instance_count(void __user *argp)
{
struct isst_tpmi_instance_count tpmi_inst;
@@ -664,6 +1064,21 @@ static long isst_if_def_ioctl(struct file *file, unsigned int cmd,
case ISST_IF_CLOS_ASSOC:
ret = isst_if_clos_assoc(argp);
break;
+ case ISST_IF_PERF_LEVELS:
+ ret = isst_if_get_perf_level(argp);
+ break;
+ case ISST_IF_PERF_SET_LEVEL:
+ ret = isst_if_set_perf_level(argp);
+ break;
+ case ISST_IF_PERF_SET_FEATURE:
+ ret = isst_if_set_perf_feature(argp);
+ break;
+ case ISST_IF_GET_PERF_LEVEL_INFO:
+ ret = isst_if_get_perf_level_info(argp);
+ break;
+ case ISST_IF_GET_PERF_LEVEL_CPU_MASK:
+ ret = isst_if_get_perf_level_mask(argp);
+ break;
default:
break;
}
diff --git a/include/uapi/linux/isst_if.h b/include/uapi/linux/isst_if.h
index 32687d8023ef..c4b350ea5cbe 100644
--- a/include/uapi/linux/isst_if.h
+++ b/include/uapi/linux/isst_if.h
@@ -254,6 +254,178 @@ struct isst_tpmi_instance_count {
__u16 valid_mask;
};

+/**
+ * struct isst_perf_level_info - Structure to get information on SST-PP levels
+ * @socket_id: Socket/package id
+ * @power_domain_id: Power Domain id
+ * @logical_cpu: CPU number
+ * @clos: Clos ID to assign to the logical CPU
+ * @max_level: Maximum performance level supported by the platform
+ * @feature_rev: The feature revision for SST-PP supported by the platform
+ * @level_mask: Mask of supported performance levels
+ * @current_level: Current performance level
+ * @feature_state: SST-BF and SST-TF (enabled/disabled) status at current level
+ * @locked: SST-PP performance level change is locked/unlocked
+ * @enabled: SST-PP feature is enabled or not
+ * @sst_tf_support: SST-TF support status at this level
+ * @sst_bf_support: SST-BF support status at this level
+ *
+ * Structure to get SST-PP details using IOCTL ISST_IF_PERF_LEVELS.
+ */
+struct isst_perf_level_info {
+ __u8 socket_id;
+ __u8 power_domain_id;
+ __u8 max_level;
+ __u8 feature_rev;
+ __u8 level_mask;
+ __u8 current_level;
+ __u8 feature_state;
+ __u8 locked;
+ __u8 enabled;
+ __u8 sst_tf_support;
+ __u8 sst_bf_support;
+};
+
+/**
+ * struct isst_perf_level_control - Structure to set SST-PP level
+ * @socket_id: Socket/package id
+ * @power_domain_id: Power Domain id
+ * @level: level to set
+ *
+ * Structure used to change the SST-PP level using IOCTL ISST_IF_PERF_SET_LEVEL.
+ */
+struct isst_perf_level_control {
+ __u8 socket_id;
+ __u8 power_domain_id;
+ __u8 level;
+};
+
+/**
+ * struct isst_perf_feature_control - Structure to activate SST-BF/SST-TF
+ * @socket_id: Socket/package id
+ * @power_domain_id: Power Domain id
+ * @feature: bit 0 = SST-BF state, bit 1 = SST-TF state
+ *
+ * Structure used to enable SST-BF/SST-TF using IOCTL ISST_IF_PERF_SET_FEATURE.
+ */
+struct isst_perf_feature_control {
+ __u8 socket_id;
+ __u8 power_domain_id;
+ __u8 feature;
+};
+
+#define TRL_MAX_BUCKETS 8
+#define TRL_MAX_LEVELS 6
+
+/**
+ * struct isst_perf_level_data_info - Structure to get SST-PP level details
+ * @socket_id: Socket/package id
+ * @power_domain_id: Power Domain id
+ * @level: SST-PP level for which caller wants to get information
+ * @tdp_ratio: TDP Ratio
+ * @base_freq_mhz: Base frequency in MHz
+ * @base_freq_avx2_mhz: AVX2 Base frequency in MHz
+ * @base_freq_avx512_mhz: AVX512 base frequency in MHz
+ * @base_freq_amx_mhz: AMX base frequency in MHz
+ * @thermal_design_power_w: Thermal design (TDP) power
+ * @tjunction_max_c: Max junction temperature
+ * @max_memory_freq_mhz: Max memory frequency in MHz
+ * @cooling_type: Type of cooling used
+ * @p0_freq_mhz: Core maximum frequency
+ * @p1_freq_mhz: Core TDP frequency
+ * @pn_freq_mhz: Core maximum efficiency frequency
+ * @pm_freq_mhz: Core minimum frequency
+ * @p0_fabric_freq_mhz: Fabric (Uncore) maximum frequency
+ * @p1_fabric_freq_mhz: Fabric (Uncore) TDP frequency
+ * @pn_fabric_freq_mhz: Fabric (Uncore) minimum efficiency frequency
+ * @pm_fabric_freq_mhz: Fabric (Uncore) minimum frequency
+ * @max_buckets: Maximum trl buckets
+ * @max_trl_levels: Maximum trl levels
+ * @bucket_core_counts[TRL_MAX_BUCKETS]: Number of cores per bucket
+ * @trl_freq_mhz[TRL_MAX_LEVELS][TRL_MAX_BUCKETS]: maximum frequency
+ * for a bucket and trl level
+ *
+ * Structure used to get information on frequencies and TDP for a SST-PP
+ * level using ISST_IF_GET_PERF_LEVEL_INFO.
+ */
+struct isst_perf_level_data_info {
+ __u8 socket_id;
+ __u8 power_domain_id;
+ __u16 level;
+ __u16 tdp_ratio;
+ __u16 base_freq_mhz;
+ __u16 base_freq_avx2_mhz;
+ __u16 base_freq_avx512_mhz;
+ __u16 base_freq_amx_mhz;
+ __u16 thermal_design_power_w;
+ __u16 tjunction_max_c;
+ __u16 max_memory_freq_mhz;
+ __u16 cooling_type;
+ __u16 p0_freq_mhz;
+ __u16 p1_freq_mhz;
+ __u16 pn_freq_mhz;
+ __u16 pm_freq_mhz;
+ __u16 p0_fabric_freq_mhz;
+ __u16 p1_fabric_freq_mhz;
+ __u16 pn_fabric_freq_mhz;
+ __u16 pm_fabric_freq_mhz;
+ __u16 max_buckets;
+ __u16 max_trl_levels;
+ __u16 bucket_core_counts[TRL_MAX_BUCKETS];
+ __u16 trl_freq_mhz[TRL_MAX_LEVELS][TRL_MAX_BUCKETS];
+};
+
+/**
+ * struct isst_perf_level_cpu_mask - Structure to get SST-PP level CPU mask
+ * @socket_id: Socket/package id
+ * @power_domain_id: Power Domain id
+ * @level: SST-PP level for which caller wants to get information
+ * @punit_cpu_map: Set to 1 if CPU numbers use punit numbering rather than
+ * Linux CPU numbering. If 0, the CPU buffer is copied to the
+ * user space supplied cpu_buffer of size cpu_buffer_size. The
+ * punit CPU mask is copied to the "mask" field.
+ * @mask: CPU mask for this PP level (punit CPU numbering)
+ * @cpu_buffer_size: Size of cpu_buffer; also used to return the size of the
+ * copied CPU buffer.
+ * @cpu_buffer: Buffer to copy CPU mask when punit_cpu_map is 0
+ *
+ * Structure used to get cpumask for a SST-PP level using
+ * IOCTL ISST_IF_GET_PERF_LEVEL_CPU_MASK. Also used to get CPU mask for
+ * IOCTL ISST_IF_GET_BASE_FREQ_CPU_MASK for SST-BF.
+ */
+struct isst_perf_level_cpu_mask {
+ __u8 socket_id;
+ __u8 power_domain_id;
+ __u8 level;
+ __u8 punit_cpu_map;
+ __u64 mask;
+ __u16 cpu_buffer_size;
+ __s8 cpu_buffer[1];
+};
+
+/**
+ * struct isst_base_freq_info - Structure to get SST-BF frequencies
+ * @socket_id: Socket/package id
+ * @power_domain_id: Power Domain id
+ * @level: SST-PP level for which caller wants to get information
+ * @high_base_freq_mhz: High priority CPU base frequency
+ * @low_base_freq_mhz: Low priority CPU base frequency
+ * @tjunction_max_c: Max junction temperature
+ * @thermal_design_power_w: Thermal design power in watts
+ *
+ * Structure used to get SST-BF information using
+ * IOCTL ISST_IF_GET_BASE_FREQ_INFO.
+ */
+struct isst_base_freq_info {
+ __u8 socket_id;
+ __u8 power_domain_id;
+ __u16 level;
+ __u16 high_base_freq_mhz;
+ __u16 low_base_freq_mhz;
+ __u16 tjunction_max_c;
+ __u16 thermal_design_power_w;
+};
+
#define ISST_IF_MAGIC 0xFE
#define ISST_IF_GET_PLATFORM_INFO _IOR(ISST_IF_MAGIC, 0, struct isst_if_platform_info *)
#define ISST_IF_GET_PHY_ID _IOWR(ISST_IF_MAGIC, 1, struct isst_if_cpu_map *)
@@ -266,4 +438,12 @@ struct isst_tpmi_instance_count {
#define ISST_IF_CLOS_PARAM _IOWR(ISST_IF_MAGIC, 7, struct isst_clos_param *)
#define ISST_IF_CLOS_ASSOC _IOWR(ISST_IF_MAGIC, 8, struct isst_if_clos_assoc_cmds *)

+#define ISST_IF_PERF_LEVELS _IOWR(ISST_IF_MAGIC, 9, struct isst_perf_level_info *)
+#define ISST_IF_PERF_SET_LEVEL _IOW(ISST_IF_MAGIC, 10, struct isst_perf_level_control *)
+#define ISST_IF_PERF_SET_FEATURE _IOW(ISST_IF_MAGIC, 11, struct isst_perf_feature_control *)
+#define ISST_IF_GET_PERF_LEVEL_INFO _IOR(ISST_IF_MAGIC, 12, struct isst_perf_level_data_info *)
+#define ISST_IF_GET_PERF_LEVEL_CPU_MASK _IOR(ISST_IF_MAGIC, 13, struct isst_perf_level_cpu_mask *)
+#define ISST_IF_GET_BASE_FREQ_INFO _IOR(ISST_IF_MAGIC, 14, struct isst_base_freq_info *)
+#define ISST_IF_GET_BASE_FREQ_CPU_MASK _IOR(ISST_IF_MAGIC, 15, struct isst_perf_level_cpu_mask *)
+
#endif
--
2.34.1
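
The 0xffULL change in sst_add_perf_profiles() above is easier to see in isolation. With a plain 0xff the shift is done in int, so for levels 4 and above the shift count reaches the width of a 32-bit int, which is undefined behaviour. The sketch below decodes one per-level offset with 64-bit arithmetic throughout. It assumes each level occupies one byte of the packed value (the patch uses SST_PP_OFFSET_SIZE, whose definition is not shown here), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: each level's offset occupies 8 bits of the packed value. */
#define PP_OFFSET_BITS	8

/*
 * Return the byte offset of performance level "level" (0..7) from the
 * packed per-level offsets. The stored value is a QWORD offset, so it is
 * multiplied by 8 to get bytes, as in the patch.
 */
uint64_t perf_level_byte_offset(uint64_t perf_level_offsets, unsigned int level)
{
	uint64_t qword_offset;

	assert(level < 8);

	/* Shifting the 64-bit operand keeps every level within range. */
	qword_offset = (perf_level_offsets >> (level * PP_OFFSET_BITS)) & 0xff;

	return qword_offset * 8;
}
```

Shifting the value right and then masking avoids building a wide mask at all, which is an alternative to the mask-then-shift form used in the driver.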


2023-03-08 07:07:30

by srinivas pandruvada

Subject: [PATCH v2 7/8] platform/x86: ISST: Add SST-TF support via TPMI

The Intel Speed Select Technology - Turbo Frequency (SST-TF) feature
allows setting different "All core turbo ratio limits" for cores based
on their priority. With this feature, some cores can be configured to
get a higher turbo frequency by designating them as high priority, at
the cost of a lower or no turbo frequency on the low priority cores.

One new IOCTL is added:
ISST_IF_GET_TURBO_FREQ_INFO : Get information about turbo frequency
buckets

Once an instance is identified, read or write from the correct MMIO
offset for a given field as defined in the specification.

For details on SST-TF operations using the intel-speed-select utility,
refer to:
Documentation/admin-guide/pm/intel-speed-select.rst
under the kernel documentation.

Signed-off-by: Srinivas Pandruvada <[email protected]>
Reviewed-by: Zhang Rui <[email protected]>
Tested-by: Pragya Tanwar <[email protected]>
---
v2:
No change
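
To illustrate how one row of the turbo ratio table is unpacked, the sketch below splits a 64-bit value into eight 8-bit bucket ratios and converts each to MHz. The bucket count and ratio width follow TRL_MAX_BUCKETS and SST_TF_RATIO_0_WIDTH from this series; the function name is hypothetical, and the 100 MHz ratio unit is an assumption because SST_MUL_FACTOR_FREQ is not defined in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BUCKETS	8	/* TRL_MAX_BUCKETS in the patch */
#define RATIO_WIDTH	8	/* SST_TF_RATIO_0_WIDTH in the patch */
#define RATIO_TO_MHZ	100	/* assumption: one ratio unit is 100 MHz */

/*
 * Decode one 64-bit TRL row: bucket j occupies bits [8j+7:8j].
 * The function name is hypothetical and used for illustration only.
 */
void decode_trl_buckets(uint64_t raw, uint16_t freq_mhz[MAX_BUCKETS])
{
	unsigned int j;

	assert(freq_mhz != NULL);

	for (j = 0; j < MAX_BUCKETS; ++j) {
		uint64_t ratio = (raw >> (j * RATIO_WIDTH)) & 0xff;

		/* The largest value is 255 * 100 = 25500, which fits in 16 bits. */
		freq_mhz[j] = (uint16_t)(ratio * RATIO_TO_MHZ);
	}
}
```

The handler in the diff below applies the same per-bucket extraction once per TRL level, stepping the register offset by 8 bytes for each level.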

.../intel/speed_select_if/isst_tpmi_core.c | 66 +++++++++++++++++++
include/uapi/linux/isst_if.h | 26 ++++++++
2 files changed, 92 insertions(+)

diff --git a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
index 0566c8d30181..5104717afe0e 100644
--- a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
+++ b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
@@ -1125,6 +1125,69 @@ static int isst_if_get_tpmi_instance_count(void __user *argp)
return 0;
}

+#define SST_TF_INFO_0_OFFSET 0
+#define SST_TF_INFO_1_OFFSET 8
+#define SST_TF_INFO_2_OFFSET 16
+
+#define SST_TF_MAX_LP_CLIP_RATIOS TRL_MAX_LEVELS
+
+#define SST_TF_LP_CLIP_RATIO_0_START 16
+#define SST_TF_LP_CLIP_RATIO_0_WIDTH 8
+
+#define SST_TF_RATIO_0_START 0
+#define SST_TF_RATIO_0_WIDTH 8
+
+#define SST_TF_NUM_CORE_0_START 0
+#define SST_TF_NUM_CORE_0_WIDTH 8
+
+static int isst_if_get_turbo_freq_info(void __user *argp)
+{
+ static struct isst_turbo_freq_info turbo_freq;
+ struct tpmi_per_power_domain_info *power_domain_info;
+ int i, j;
+
+ if (copy_from_user(&turbo_freq, argp, sizeof(turbo_freq)))
+ return -EFAULT;
+
+ power_domain_info = get_instance(turbo_freq.socket_id, turbo_freq.power_domain_id);
+ if (!power_domain_info)
+ return -EINVAL;
+
+ if (turbo_freq.level > power_domain_info->max_level)
+ return -EINVAL;
+
+ turbo_freq.max_buckets = TRL_MAX_BUCKETS;
+ turbo_freq.max_trl_levels = TRL_MAX_LEVELS;
+ turbo_freq.max_clip_freqs = SST_TF_MAX_LP_CLIP_RATIOS;
+
+ for (i = 0; i < turbo_freq.max_clip_freqs; ++i)
+ _read_tf_level_info("lp_clip*", turbo_freq.lp_clip_freq_mhz[i],
+ turbo_freq.level, SST_TF_INFO_0_OFFSET,
+ SST_TF_LP_CLIP_RATIO_0_START +
+ (i * SST_TF_LP_CLIP_RATIO_0_WIDTH),
+ SST_TF_LP_CLIP_RATIO_0_WIDTH, SST_MUL_FACTOR_FREQ)
+
+ for (i = 0; i < TRL_MAX_LEVELS; ++i) {
+ for (j = 0; j < TRL_MAX_BUCKETS; ++j)
+ _read_tf_level_info("cydn*_bucket_*_trl",
+ turbo_freq.trl_freq_mhz[i][j], turbo_freq.level,
+ SST_TF_INFO_2_OFFSET + (i * SST_TF_RATIO_0_WIDTH),
+ j * SST_TF_RATIO_0_WIDTH, SST_TF_RATIO_0_WIDTH,
+ SST_MUL_FACTOR_FREQ)
+ }
+
+ for (i = 0; i < TRL_MAX_BUCKETS; ++i)
+ _read_tf_level_info("bucket_*_core_count", turbo_freq.bucket_core_counts[i],
+ turbo_freq.level, SST_TF_INFO_1_OFFSET,
+ SST_TF_NUM_CORE_0_WIDTH * i, SST_TF_NUM_CORE_0_WIDTH,
+ SST_MUL_FACTOR_NONE)
+
+ if (copy_to_user(argp, &turbo_freq, sizeof(turbo_freq)))
+ return -EFAULT;
+
+ return 0;
+}
+
static long isst_if_def_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
@@ -1166,6 +1229,9 @@ static long isst_if_def_ioctl(struct file *file, unsigned int cmd,
case ISST_IF_GET_BASE_FREQ_CPU_MASK:
ret = isst_if_get_base_freq_mask(argp);
break;
+ case ISST_IF_GET_TURBO_FREQ_INFO:
+ ret = isst_if_get_turbo_freq_info(argp);
+ break;
default:
break;
}
diff --git a/include/uapi/linux/isst_if.h b/include/uapi/linux/isst_if.h
index c4b350ea5cbe..0df1a1c3caf4 100644
--- a/include/uapi/linux/isst_if.h
+++ b/include/uapi/linux/isst_if.h
@@ -426,6 +426,31 @@ struct isst_base_freq_info {
__u16 thermal_design_power_w;
};

+/**
+ * struct isst_turbo_freq_info - Structure to get SST-TF frequencies
+ * @socket_id: Socket/package id
+ * @power_domain_id: Power Domain id
+ * @level: SST-PP level for which caller wants to get information
+ * @max_clip_freqs: Maximum number of low priority core clipping frequencies
+ * @lp_clip_freq_mhz: Clip frequencies per trl level
+ * @bucket_core_counts: Maximum number of cores for a bucket
+ * @trl_freq_mhz: Frequencies per trl level for each bucket
+ *
+ * Structure used to get SST-TF information using
+ * IOCTL ISST_IF_GET_TURBO_FREQ_INFO.
+ */
+struct isst_turbo_freq_info {
+ __u8 socket_id;
+ __u8 power_domain_id;
+ __u16 level;
+ __u16 max_clip_freqs;
+ __u16 max_buckets;
+ __u16 max_trl_levels;
+ __u16 lp_clip_freq_mhz[TRL_MAX_LEVELS];
+ __u16 bucket_core_counts[TRL_MAX_BUCKETS];
+ __u16 trl_freq_mhz[TRL_MAX_LEVELS][TRL_MAX_BUCKETS];
+};
+
#define ISST_IF_MAGIC 0xFE
#define ISST_IF_GET_PLATFORM_INFO _IOR(ISST_IF_MAGIC, 0, struct isst_if_platform_info *)
#define ISST_IF_GET_PHY_ID _IOWR(ISST_IF_MAGIC, 1, struct isst_if_cpu_map *)
@@ -445,5 +470,6 @@ struct isst_base_freq_info {
#define ISST_IF_GET_PERF_LEVEL_CPU_MASK _IOR(ISST_IF_MAGIC, 13, struct isst_perf_level_cpu_mask *)
#define ISST_IF_GET_BASE_FREQ_INFO _IOR(ISST_IF_MAGIC, 14, struct isst_base_freq_info *)
#define ISST_IF_GET_BASE_FREQ_CPU_MASK _IOR(ISST_IF_MAGIC, 15, struct isst_perf_level_cpu_mask *)
+#define ISST_IF_GET_TURBO_FREQ_INFO _IOR(ISST_IF_MAGIC, 16, struct isst_turbo_freq_info *)

#endif
--
2.34.1
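
[Editor's sketch, not part of the patch] The three loops in isst_if_get_turbo_freq_info() above each pull a fixed-width field out of a 64-bit register and scale it before storing it in the reply structure. The fragment below is a minimal user-space illustration of that decode step only. The names ex_extract_field() and ex_ratio_mhz(), the field width of 8 bits and the multiplier of 100 are assumptions made for the example; they are not taken from the driver, whose real constants (SST_TF_LP_CLIP_RATIO_0_WIDTH, SST_MUL_FACTOR_FREQ and so on) are defined in isst_tpmi_core.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the driver's constants. */
#define EX_FIELD_WIDTH	8	/* assumed width of one packed ratio field */
#define EX_MUL_FACTOR	100	/* assumed ratio-to-MHz multiplier */

/* Extract 'width' bits starting at bit 'start' from a 64-bit register value. */
static uint64_t ex_extract_field(uint64_t reg, unsigned int start, unsigned int width)
{
	assert(width > 0 && width < 64 && start + width <= 64);
	return (reg >> start) & ((UINT64_C(1) << width) - 1);
}

/* Decode the i-th packed ratio after 'first_bit' and scale it to MHz. */
static uint16_t ex_ratio_mhz(uint64_t reg, unsigned int first_bit, unsigned int i)
{
	uint64_t ratio = ex_extract_field(reg, first_bit + i * EX_FIELD_WIDTH,
					  EX_FIELD_WIDTH);

	return (uint16_t)(ratio * EX_MUL_FACTOR);
}
```

Under these assumptions, a register value of 0x221E18 decodes to ratios 0x18, 0x1E and 0x22 for i = 0, 1, 2 when first_bit is 0. The start bit grows by one field width per index, which is the same pattern the patch uses for SST_TF_LP_CLIP_RATIO_0_START + (i * SST_TF_LP_CLIP_RATIO_0_WIDTH).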


2023-03-08 07:07:38

by srinivas pandruvada

[permalink] [raw]
Subject: [PATCH v2 8/8] platform/x86: ISST: Add suspend/resume callbacks

To support S3/S4 with the TPMI interface, add suspend/resume callbacks.
The HW state is saved in the suspend callback and restored in the
resume callback.

The hardware state which needs to be stored/restored:
- CLOS configuration
- CLOS Association
- SST-CP enable/disable status
- SST-PP perf level setting

Signed-off-by: Srinivas Pandruvada <[email protected]>
Suggested-by: Hans de Goede <[email protected]>
Reviewed-by: Zhang Rui <[email protected]>
Tested-by: Pragya Tanwar <[email protected]>
---
v2:
- As suggested by Hans, modified suspend/resume callbacks

.../x86/intel/speed_select_if/isst_tpmi.c | 19 +++++++
.../intel/speed_select_if/isst_tpmi_core.c | 49 +++++++++++++++++++
.../intel/speed_select_if/isst_tpmi_core.h | 2 +
3 files changed, 70 insertions(+)

diff --git a/drivers/platform/x86/intel/speed_select_if/isst_tpmi.c b/drivers/platform/x86/intel/speed_select_if/isst_tpmi.c
index 7b4bdeefb8bc..17972191538a 100644
--- a/drivers/platform/x86/intel/speed_select_if/isst_tpmi.c
+++ b/drivers/platform/x86/intel/speed_select_if/isst_tpmi.c
@@ -34,6 +34,22 @@ static void intel_sst_remove(struct auxiliary_device *auxdev)
tpmi_sst_exit();
}

+static int intel_sst_suspend(struct device *dev)
+{
+ tpmi_sst_dev_suspend(to_auxiliary_dev(dev));
+
+ return 0;
+}
+
+static int intel_sst_resume(struct device *dev)
+{
+ tpmi_sst_dev_resume(to_auxiliary_dev(dev));
+
+ return 0;
+}
+
+static DEFINE_SIMPLE_DEV_PM_OPS(intel_sst_pm, intel_sst_suspend, intel_sst_resume);
+
static const struct auxiliary_device_id intel_sst_id_table[] = {
{ .name = "intel_vsec.tpmi-sst" },
{}
@@ -44,6 +60,9 @@ static struct auxiliary_driver intel_sst_aux_driver = {
.id_table = intel_sst_id_table,
.remove = intel_sst_remove,
.probe = intel_sst_probe,
+ .driver = {
+ .pm = pm_sleep_ptr(&intel_sst_pm),
+ },
};

module_auxiliary_driver(intel_sst_aux_driver);
diff --git a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
index 5104717afe0e..cdb56a18ea17 100644
--- a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
+++ b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
@@ -229,6 +229,10 @@ struct perf_level {
* @status_offset: Store the status offset for each PP-level
* @sst_base: Mapped SST base IO memory
* @auxdev: Auxiliary device instance enumerated this instance
+ * @saved_sst_cp_control: SST-CP control configuration saved on suspend and restored on resume
+ * @saved_clos_configs: SST-CP CLOS configuration saved on suspend and restored on resume
+ * @saved_clos_assocs: SST-CP CLOS association saved on suspend and restored on resume
+ * @saved_pp_control: SST-PP control information saved on suspend and restored on resume
*
* This structure is used store complete SST information for a power_domain. This information
* is used to read/write request for any SST IOCTL. Each physical CPU package can have multiple
@@ -250,6 +254,10 @@ struct tpmi_per_power_domain_info {
struct pp_status_offset status_offset;
void __iomem *sst_base;
struct auxiliary_device *auxdev;
+ u64 saved_sst_cp_control;
+ u64 saved_clos_configs[4];
+ u64 saved_clos_assocs[4];
+ u64 saved_pp_control;
};

/**
@@ -1333,6 +1341,47 @@ void tpmi_sst_dev_remove(struct auxiliary_device *auxdev)
}
EXPORT_SYMBOL_NS_GPL(tpmi_sst_dev_remove, INTEL_TPMI_SST);

+void tpmi_sst_dev_suspend(struct auxiliary_device *auxdev)
+{
+ struct tpmi_sst_struct *tpmi_sst = auxiliary_get_drvdata(auxdev);
+ struct tpmi_per_power_domain_info *power_domain_info = tpmi_sst->power_domain_info;
+ void __iomem *cp_base;
+
+ cp_base = power_domain_info->sst_base + power_domain_info->sst_header.cp_offset;
+ power_domain_info->saved_sst_cp_control = readq(cp_base + SST_CP_CONTROL_OFFSET);
+
+ memcpy_fromio(power_domain_info->saved_clos_configs, cp_base + SST_CLOS_CONFIG_0_OFFSET,
+ sizeof(power_domain_info->saved_clos_configs));
+
+ memcpy_fromio(power_domain_info->saved_clos_assocs, cp_base + SST_CLOS_ASSOC_0_OFFSET,
+ sizeof(power_domain_info->saved_clos_assocs));
+
+ power_domain_info->saved_pp_control = readq(power_domain_info->sst_base +
+ power_domain_info->sst_header.pp_offset +
+ SST_PP_CONTROL_OFFSET);
+}
+EXPORT_SYMBOL_NS_GPL(tpmi_sst_dev_suspend, INTEL_TPMI_SST);
+
+void tpmi_sst_dev_resume(struct auxiliary_device *auxdev)
+{
+ struct tpmi_sst_struct *tpmi_sst = auxiliary_get_drvdata(auxdev);
+ struct tpmi_per_power_domain_info *power_domain_info = tpmi_sst->power_domain_info;
+ void __iomem *cp_base;
+
+ cp_base = power_domain_info->sst_base + power_domain_info->sst_header.cp_offset;
+ writeq(power_domain_info->saved_sst_cp_control, cp_base + SST_CP_CONTROL_OFFSET);
+
+ memcpy_toio(cp_base + SST_CLOS_CONFIG_0_OFFSET, power_domain_info->saved_clos_configs,
+ sizeof(power_domain_info->saved_clos_configs));
+
+ memcpy_toio(cp_base + SST_CLOS_ASSOC_0_OFFSET, power_domain_info->saved_clos_assocs,
+ sizeof(power_domain_info->saved_clos_assocs));
+
+ writeq(power_domain_info->saved_pp_control, power_domain_info->sst_base +
+ power_domain_info->sst_header.pp_offset + SST_PP_CONTROL_OFFSET);
+}
+EXPORT_SYMBOL_NS_GPL(tpmi_sst_dev_resume, INTEL_TPMI_SST);
+
#define ISST_TPMI_API_VERSION 0x02

int tpmi_sst_init(void)
diff --git a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.h b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.h
index 356cb02273b1..900b483703f9 100644
--- a/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.h
+++ b/drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.h
@@ -13,4 +13,6 @@ int tpmi_sst_init(void);
void tpmi_sst_exit(void);
int tpmi_sst_dev_add(struct auxiliary_device *auxdev);
void tpmi_sst_dev_remove(struct auxiliary_device *auxdev);
+void tpmi_sst_dev_suspend(struct auxiliary_device *auxdev);
+void tpmi_sst_dev_resume(struct auxiliary_device *auxdev);
#endif
--
2.34.1
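
[Editor's sketch, not part of the patch] The suspend/resume pair above follows a plain snapshot-and-replay pattern: copy four groups of registers into the per-power-domain structure on suspend, and write the same values back on resume. The fragment below is a minimal user-space model of that pattern. An ordinary struct stands in for the memory-mapped register block, so memcpy() is used where the driver uses readq()/writeq() and memcpy_fromio()/memcpy_toio(). The names ex_regs, ex_saved_state, ex_suspend() and ex_resume() are hypothetical; the array size of 4 mirrors saved_clos_configs[4] and saved_clos_assocs[4] in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EX_NUM_CLOS 4	/* mirrors the [4] arrays added by the patch */

/* Stand-in for the memory-mapped SST register block (hypothetical layout). */
struct ex_regs {
	uint64_t cp_control;
	uint64_t clos_configs[EX_NUM_CLOS];
	uint64_t clos_assocs[EX_NUM_CLOS];
	uint64_t pp_control;
};

/* Snapshot kept across suspend, analogous to the saved_* fields. */
struct ex_saved_state {
	uint64_t cp_control;
	uint64_t clos_configs[EX_NUM_CLOS];
	uint64_t clos_assocs[EX_NUM_CLOS];
	uint64_t pp_control;
};

/* Save every register that the hardware loses across S3/S4. */
static void ex_suspend(const struct ex_regs *regs, struct ex_saved_state *saved)
{
	assert(regs && saved);
	saved->cp_control = regs->cp_control;
	memcpy(saved->clos_configs, regs->clos_configs, sizeof(saved->clos_configs));
	memcpy(saved->clos_assocs, regs->clos_assocs, sizeof(saved->clos_assocs));
	saved->pp_control = regs->pp_control;
}

/* Write the snapshot back in the same order it was taken. */
static void ex_resume(struct ex_regs *regs, const struct ex_saved_state *saved)
{
	assert(regs && saved);
	regs->cp_control = saved->cp_control;
	memcpy(regs->clos_configs, saved->clos_configs, sizeof(regs->clos_configs));
	memcpy(regs->clos_assocs, saved->clos_assocs, sizeof(regs->clos_assocs));
	regs->pp_control = saved->pp_control;
}
```

Sizing each copy with sizeof() on the saved array, as the patch does, keeps the copy length tied to the storage declared for it, so changing the array size in one place cannot silently overrun the snapshot.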


2023-03-16 14:21:58

by Hans de Goede

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] platform/x86: ISST: Use TPMI interface

Hi,

On 3/8/23 08:06, Srinivas Pandruvada wrote:
> This series implements TPMI as Intel Speed Select Technology (Intel SST)
> HW interface. TPMI has several advantages for Intel SST. This replaces
> legacy mailbox and MMIO with architectural interface over TPMI.
>
> This improves performance for HPC type applications. One single IOCTL command
> replaces 10s of IOCTLs for mailboxes. This allowed to offer many more
> performance levels and SST configurations.
>
> This series depends on previously posted series:
> - platform/x86/intel: Intel TPMI enumeration driver
>
> Change History
> v2
> - Rebased on top of review-hans branch of platform-drivers-x86
> - Removed patches which are already present in this branch from the last review
> So number of patches are reduced from 12 to 8.
> - Rework patch for MSR 0x54 support
> - Use suggestion from Hans for suspend/resume callbacks
> - Add Reviewed-by and Test-by tags

Thank you for your patch-series, I've applied the series to my
review-hans branch:
https://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git/log/?h=review-hans

Once I've run some tests on this branch the patches there will be
added to the platform-drivers-x86/for-next branch and eventually
will be included in the pdx86 pull-request to Linus for the next
merge-window.

Regards,

Hans

> Srinivas Pandruvada (8):
> platform/x86: ISST: Add support for MSR 0x54
> platform/x86: ISST: Enumerate TPMI SST and create framework
> platform/x86: ISST: Parse SST MMIO and update instance
> platform/x86: ISST: Add SST-CP support via TPMI
> platform/x86: ISST: Add SST-PP support via TPMI
> platform/x86: ISST: Add SST-BF support via TPMI
> platform/x86: ISST: Add SST-TF support via TPMI
> platform/x86: ISST: Add suspend/resume callbacks
>
> .../x86/intel/speed_select_if/Kconfig | 4 +
> .../x86/intel/speed_select_if/Makefile | 2 +
> .../intel/speed_select_if/isst_if_common.c | 28 +
> .../x86/intel/speed_select_if/isst_tpmi.c | 72 +
> .../intel/speed_select_if/isst_tpmi_core.c | 1438 +++++++++++++++++
> .../intel/speed_select_if/isst_tpmi_core.h | 18 +
> include/uapi/linux/isst_if.h | 303 ++++
> 7 files changed, 1865 insertions(+)
> create mode 100644 drivers/platform/x86/intel/speed_select_if/isst_tpmi.c
> create mode 100644 drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
> create mode 100644 drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.h
>