This submission contains 4 patches relating to mlxbf-pmc.
Patch 1 adds documentation for the sysfs files created by the driver.
Patches 2 and 3 add specific functionality to the driver for supporting
64-bit ocunters, cycle count and clock_measure performance block.
Patch 4 adds documentation for the newly added sysfs entries.
v2 -> v3
Updated documentation as suggested to provide more details on the
sysfs and driver functionality.
v1 -> v2
Added patch 4 to document sysfs entries added in patches 2 and 3.
Shravan Kumar Ramani (4):
Documentation/ABI: Add document for Mellanox PMC driver
platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and
cycle count
platform/mellanox: mlxbf-pmc: Add support for clock_measure
performance block
Documentation/ABI: Add new sysfs fields to sysfs-platform-mellanox-pmc
.../ABI/testing/sysfs-platform-mellanox-pmc | 77 ++++++++
drivers/platform/mellanox/mlxbf-pmc.c | 180 +++++++++++++++++-
2 files changed, 253 insertions(+), 4 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-platform-mellanox-pmc
--
2.30.1
Document the sysfs interface for programming and monitoring the
performance counters in various HW blocks of Mellanox BlueField-1,
BlueField-2 and BlueField-3.
Signed-off-by: Shravan Kumar Ramani <[email protected]>
Reviewed-by: David Thompson <[email protected]>
---
.../ABI/testing/sysfs-platform-mellanox-pmc | 54 +++++++++++++++++++
1 file changed, 54 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-platform-mellanox-pmc
diff --git a/Documentation/ABI/testing/sysfs-platform-mellanox-pmc b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
new file mode 100644
index 000000000000..9f987c6410da
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
@@ -0,0 +1,54 @@
+HID Driver Description
+MLNXBFD0 mlxbf-pmc Performance counters (BlueField-1)
+MLNXBFD1 mlxbf-pmc Performance counters (BlueField-2)
+MLNXBFD2 mlxbf-pmc Performance counters (BlueField-3)
+
+What: /sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/event_list
+Date: Dec 2020
+KernelVersion: 5.10
+Contact: "Shravan Kumar Ramani <[email protected]>"
+Description:
+ List of events supported by the counters in the specific block.
+ It is used to extract the event number or ID associated with
+ each event.
+
+What: /sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/event<N>
+Date: Dec 2020
+KernelVersion: 5.10
+Contact: "Shravan Kumar Ramani <[email protected]>"
+Description:
+ Event monitored by corresponding counter. This is used to
+ program or read back the event that should be or is currently
+ being monitored by counter<N>.
+
+What: /sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/counter<N>
+Date: Dec 2020
+KernelVersion: 5.10
+Contact: "Shravan Kumar Ramani <[email protected]>"
+Description:
+ Counter value of the event being monitored. This is used to
+ read the counter value of the event which was programmed using
+ event<N>. This is also used to clear or reset the counter value
+ by writing 0 to the counter sysfs.
+
+What: /sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/enable
+Date: Dec 2020
+KernelVersion: 5.10
+Contact: "Shravan Kumar Ramani <[email protected]>"
+Description:
+ Start or stop counters. This is used to start the counters
+ for monitoring the programmed events and also to stop the
+ counters after the desired duration. Writing value 1 will
+ start all the counters in the block, and writing 0 will
+ stop all the counters together.
+
+What: /sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/<reg>
+Date: Dec 2020
+KernelVersion: 5.10
+Contact: "Shravan Kumar Ramani <[email protected]>"
+Description:
+ Value of register. This is used to read or reset the registers
+ where various performance statistics are counted for each block.
+ Writing 0 to the sysfs will clear the counter, writing any other
+ value is not allowed.
+
--
2.30.1
Add support for programming any counter to monitor the cycle count.
This will allow the user to repurpose and dedicate any of the counters
in the block to counting cycles.
Since counting of cycles using 32-bit counters would result in frequent
wraparounds, add the ability to combine 2 adjacent 32-bit counters to
form 1 64-bit counter. The 64-bit value can be read via the same 2
counter sysfs which will each hold the lower and upper 32 bits.
Both these features are supported by BlueField-3 PMC hardware, hence
the required bit-fields are exposed by the driver via sysfs to allow
the user to configure as needed.
Signed-off-by: Shravan Kumar Ramani <[email protected]>
Reviewed-by: David Thompson <[email protected]>
Reviewed-by: Vadim Pasternak <[email protected]>
---
drivers/platform/mellanox/mlxbf-pmc.c | 134 ++++++++++++++++++++++++++
1 file changed, 134 insertions(+)
diff --git a/drivers/platform/mellanox/mlxbf-pmc.c b/drivers/platform/mellanox/mlxbf-pmc.c
index 4ed9c7fd2b62..635ecc3b3845 100644
--- a/drivers/platform/mellanox/mlxbf-pmc.c
+++ b/drivers/platform/mellanox/mlxbf-pmc.c
@@ -88,6 +88,8 @@
#define MLXBF_PMC_CRSPACE_PERFMON_CTL(n) (n * MLXBF_PMC_CRSPACE_PERFMON_REG0_SZ)
#define MLXBF_PMC_CRSPACE_PERFMON_EN BIT(30)
#define MLXBF_PMC_CRSPACE_PERFMON_CLR BIT(28)
+#define MLXBF_PMC_CRSPACE_PERFMON_UOC GENMASK(15, 0)
+#define MLXBF_PMC_CRSPACE_PERFMON_COUNT_CLOCK(n) (MLXBF_PMC_CRSPACE_PERFMON_CTL(n) + 0x4)
#define MLXBF_PMC_CRSPACE_PERFMON_VAL0(n) (MLXBF_PMC_CRSPACE_PERFMON_CTL(n) + 0xc)
/**
@@ -114,6 +116,8 @@ struct mlxbf_pmc_attribute {
* @attr_event: Attributes for "event" sysfs files
* @attr_event_list: Attributes for "event_list" sysfs files
* @attr_enable: Attributes for "enable" sysfs files
+ * @attr_use_odd_counter: Attributes for "use_odd_counter" sysfs files
+ * @attr_count_clock: Attributes for "count_clock" sysfs files
* @block_attr: All attributes needed for the block
* @block_attr_grp: Attribute group for the block
*/
@@ -126,6 +130,8 @@ struct mlxbf_pmc_block_info {
struct mlxbf_pmc_attribute *attr_event;
struct mlxbf_pmc_attribute attr_event_list;
struct mlxbf_pmc_attribute attr_enable;
+ struct mlxbf_pmc_attribute attr_use_odd_counter;
+ struct mlxbf_pmc_attribute attr_count_clock;
struct attribute *block_attr[MLXBF_PMC_MAX_ATTRS];
struct attribute_group block_attr_grp;
};
@@ -1763,6 +1769,103 @@ static ssize_t mlxbf_pmc_enable_store(struct device *dev,
return count;
}
+/* Show function for "use_odd_counter" sysfs files - only for crspace */
+static ssize_t mlxbf_pmc_use_odd_counter_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct mlxbf_pmc_attribute *attr_use_odd_counter = container_of(
+ attr, struct mlxbf_pmc_attribute, dev_attr);
+ unsigned int blk_num;
+ u32 value, reg;
+
+ blk_num = attr_use_odd_counter->nr;
+
+ if (mlxbf_pmc_readl(pmc->block[blk_num].mmio_base +
+ MLXBF_PMC_CRSPACE_PERFMON_CTL(pmc->block[blk_num].counters),
+ ®))
+ return -EINVAL;
+
+ value = FIELD_GET(MLXBF_PMC_CRSPACE_PERFMON_UOC, reg);
+
+ return sysfs_emit(buf, "%u\n", value);
+}
+
+/* Store function for "use_odd_counter" sysfs files - only for crspace */
+static ssize_t mlxbf_pmc_use_odd_counter_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct mlxbf_pmc_attribute *attr_use_odd_counter = container_of(
+ attr, struct mlxbf_pmc_attribute, dev_attr);
+ unsigned int blk_num;
+ u32 uoc, reg;
+ int err;
+
+ blk_num = attr_use_odd_counter->nr;
+
+ err = kstrtouint(buf, 0, &uoc);
+ if (err < 0)
+ return err;
+
+ err = mlxbf_pmc_readl(pmc->block[blk_num].mmio_base +
+ MLXBF_PMC_CRSPACE_PERFMON_CTL(pmc->block[blk_num].counters),
+ ®);
+ if (err)
+ return -EINVAL;
+
+ reg &= ~MLXBF_PMC_CRSPACE_PERFMON_UOC;
+ reg |= FIELD_PREP(MLXBF_PMC_CRSPACE_PERFMON_UOC, uoc);
+
+ mlxbf_pmc_write(pmc->block[blk_num].mmio_base +
+ MLXBF_PMC_CRSPACE_PERFMON_CTL(pmc->block[blk_num].counters),
+ MLXBF_PMC_WRITE_REG_32, reg);
+
+ return count;
+}
+
+/* Show function for "count_clock" sysfs files - only for crspace */
+static ssize_t mlxbf_pmc_count_clock_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct mlxbf_pmc_attribute *attr_count_clock = container_of(
+ attr, struct mlxbf_pmc_attribute, dev_attr);
+ unsigned int blk_num;
+ u32 reg;
+
+ blk_num = attr_count_clock->nr;
+
+ if (mlxbf_pmc_readl(pmc->block[blk_num].mmio_base +
+ MLXBF_PMC_CRSPACE_PERFMON_COUNT_CLOCK(pmc->block[blk_num].counters),
+ ®))
+ return -EINVAL;
+
+ return sysfs_emit(buf, "%u\n", reg);
+}
+
+/* Store function for "count_clock" sysfs files - only for crspace */
+static ssize_t mlxbf_pmc_count_clock_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct mlxbf_pmc_attribute *attr_count_clock = container_of(
+ attr, struct mlxbf_pmc_attribute, dev_attr);
+ unsigned int blk_num;
+ u32 reg;
+ int err;
+
+ blk_num = attr_count_clock->nr;
+
+ err = kstrtouint(buf, 0, ®);
+ if (err < 0)
+ return err;
+
+ mlxbf_pmc_write(pmc->block[blk_num].mmio_base +
+ MLXBF_PMC_CRSPACE_PERFMON_COUNT_CLOCK(pmc->block[blk_num].counters),
+ MLXBF_PMC_WRITE_REG_32, reg);
+
+ return count;
+}
+
/* Populate attributes for blocks with counters to monitor performance */
static int mlxbf_pmc_init_perftype_counter(struct device *dev, unsigned int blk_num)
{
@@ -1799,6 +1902,37 @@ static int mlxbf_pmc_init_perftype_counter(struct device *dev, unsigned int blk_
attr = NULL;
}
+ if (pmc->block[blk_num].type == MLXBF_PMC_TYPE_CRSPACE) {
+ /*
+ * Couple adjacent odd and even 32-bit counters to form 64-bit counters
+ * using "use_odd_counter" sysfs which has one bit per even counter.
+ */
+ attr = &pmc->block[blk_num].attr_use_odd_counter;
+ attr->dev_attr.attr.mode = 0644;
+ attr->dev_attr.show = mlxbf_pmc_use_odd_counter_show;
+ attr->dev_attr.store = mlxbf_pmc_use_odd_counter_store;
+ attr->nr = blk_num;
+ attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL,
+ "use_odd_counter");
+ if (!attr->dev_attr.attr.name)
+ return -ENOMEM;
+ pmc->block[blk_num].block_attr[++i] = &attr->dev_attr.attr;
+ attr = NULL;
+
+ /* Program crspace counters to count clock cycles using "count_clock" sysfs */
+ attr = &pmc->block[blk_num].attr_count_clock;
+ attr->dev_attr.attr.mode = 0644;
+ attr->dev_attr.show = mlxbf_pmc_count_clock_show;
+ attr->dev_attr.store = mlxbf_pmc_count_clock_store;
+ attr->nr = blk_num;
+ attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL,
+ "count_clock");
+ if (!attr->dev_attr.attr.name)
+ return -ENOMEM;
+ pmc->block[blk_num].block_attr[++i] = &attr->dev_attr.attr;
+ attr = NULL;
+ }
+
pmc->block[blk_num].attr_counter = devm_kcalloc(
dev, pmc->block[blk_num].counters,
sizeof(struct mlxbf_pmc_attribute), GFP_KERNEL);
--
2.30.1
The HW clock_measure counter info is passed to the driver from ACPI.
Create a new sub-directory for clock_measure events and provide
read access to the user. Writes are blocked since the fields are RO.
Signed-off-by: Shravan Kumar Ramani <[email protected]>
Reviewed-by: David Thompson <[email protected]>
Reviewed-by: Vadim Pasternak <[email protected]>
---
drivers/platform/mellanox/mlxbf-pmc.c | 46 ++++++++++++++++++++++++---
1 file changed, 42 insertions(+), 4 deletions(-)
diff --git a/drivers/platform/mellanox/mlxbf-pmc.c b/drivers/platform/mellanox/mlxbf-pmc.c
index 635ecc3b3845..1212a96fb3eb 100644
--- a/drivers/platform/mellanox/mlxbf-pmc.c
+++ b/drivers/platform/mellanox/mlxbf-pmc.c
@@ -865,6 +865,37 @@ static const struct mlxbf_pmc_events mlxbf_pmc_llt_miss_events[] = {
{75, "HISTOGRAM_HISTOGRAM_BIN9"},
};
+static const struct mlxbf_pmc_events mlxbf_pmc_clock_events[] = {
+ { 0x0, "FMON_CLK_LAST_COUNT_PLL_D1_INST0" },
+ { 0x4, "REFERENCE_WINDOW_WIDTH_PLL_D1_INST0" },
+ { 0x8, "FMON_CLK_LAST_COUNT_PLL_D1_INST1" },
+ { 0xc, "REFERENCE_WINDOW_WIDTH_PLL_D1_INST1" },
+ { 0x10, "FMON_CLK_LAST_COUNT_PLL_G1" },
+ { 0x14, "REFERENCE_WINDOW_WIDTH_PLL_G1" },
+ { 0x18, "FMON_CLK_LAST_COUNT_PLL_W1" },
+ { 0x1c, "REFERENCE_WINDOW_WIDTH_PLL_W1" },
+ { 0x20, "FMON_CLK_LAST_COUNT_PLL_T1" },
+ { 0x24, "REFERENCE_WINDOW_WIDTH_PLL_T1" },
+ { 0x28, "FMON_CLK_LAST_COUNT_PLL_A0" },
+ { 0x2c, "REFERENCE_WINDOW_WIDTH_PLL_A0" },
+ { 0x30, "FMON_CLK_LAST_COUNT_PLL_C0" },
+ { 0x34, "REFERENCE_WINDOW_WIDTH_PLL_C0" },
+ { 0x38, "FMON_CLK_LAST_COUNT_PLL_N1" },
+ { 0x3c, "REFERENCE_WINDOW_WIDTH_PLL_N1" },
+ { 0x40, "FMON_CLK_LAST_COUNT_PLL_I1" },
+ { 0x44, "REFERENCE_WINDOW_WIDTH_PLL_I1" },
+ { 0x48, "FMON_CLK_LAST_COUNT_PLL_R1" },
+ { 0x4c, "REFERENCE_WINDOW_WIDTH_PLL_R1" },
+ { 0x50, "FMON_CLK_LAST_COUNT_PLL_P1" },
+ { 0x54, "REFERENCE_WINDOW_WIDTH_PLL_P1" },
+ { 0x58, "FMON_CLK_LAST_COUNT_REF_100_INST0" },
+ { 0x5c, "REFERENCE_WINDOW_WIDTH_REF_100_INST0" },
+ { 0x60, "FMON_CLK_LAST_COUNT_REF_100_INST1" },
+ { 0x64, "REFERENCE_WINDOW_WIDTH_REF_100_INST1" },
+ { 0x68, "FMON_CLK_LAST_COUNT_REF_156" },
+ { 0x6c, "REFERENCE_WINDOW_WIDTH_REF_156" },
+};
+
static struct mlxbf_pmc_context *pmc;
/* UUID used to probe ATF service. */
@@ -1038,6 +1069,9 @@ static const struct mlxbf_pmc_events *mlxbf_pmc_event_list(const char *blk, size
} else if (strstr(blk, "llt")) {
events = mlxbf_pmc_llt_events;
size = ARRAY_SIZE(mlxbf_pmc_llt_events);
+ } else if (strstr(blk, "clock_measure")) {
+ events = mlxbf_pmc_clock_events;
+ size = ARRAY_SIZE(mlxbf_pmc_clock_events);
} else {
events = NULL;
size = 0;
@@ -1472,14 +1506,15 @@ static int mlxbf_pmc_read_event(unsigned int blk_num, u32 cnt_num, bool is_l3, u
/* Method to read a register */
static int mlxbf_pmc_read_reg(unsigned int blk_num, u32 offset, u64 *result)
{
- u32 ecc_out;
+ u32 reg;
- if (strstr(pmc->block_name[blk_num], "ecc")) {
+ if ((strstr(pmc->block_name[blk_num], "ecc")) ||
+ (strstr(pmc->block_name[blk_num], "clock_measure"))) {
if (mlxbf_pmc_readl(pmc->block[blk_num].mmio_base + offset,
- &ecc_out))
+ ®))
return -EFAULT;
- *result = ecc_out;
+ *result = reg;
return 0;
}
@@ -1493,6 +1528,9 @@ static int mlxbf_pmc_read_reg(unsigned int blk_num, u32 offset, u64 *result)
/* Method to write to a register */
static int mlxbf_pmc_write_reg(unsigned int blk_num, u32 offset, u64 data)
{
+ if (strstr(pmc->block_name[blk_num], "clock_measure"))
+ return -EINVAL;
+
if (strstr(pmc->block_name[blk_num], "ecc")) {
return mlxbf_pmc_write(pmc->block[blk_num].mmio_base + offset,
MLXBF_PMC_WRITE_REG_32, data);
--
2.30.1
Document newly added "count_clock" and "use_odd_counter" sysfs entries
for the Mellanox BlueField PMC driver.
Signed-off-by: Shravan Kumar Ramani <[email protected]>
Reviewed-by: David Thompson <[email protected]>
---
.../ABI/testing/sysfs-platform-mellanox-pmc | 23 +++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-platform-mellanox-pmc b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
index 9f987c6410da..ee03d066c0d9 100644
--- a/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
+++ b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
@@ -52,3 +52,26 @@ Description:
Writing 0 to the sysfs will clear the counter, writing any other
value is not allowed.
+What: /sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/count_clock
+Date: May 2024
+KernelVersion: 6.10
+Contact: "Shravan Kumar Ramani <[email protected]>"
+Description:
+ Use a counter for counting cycles. This is used to repurpose/dedicate
+ any of the counters in the block to counting cycles. Each counter is
+ represented by a bit (bit 0 for counter0, bit1 for counter1 and so on)
+ and setting the corresponding bit will reserve that specific counter
+ for counting cycles and override the event<N> setting.
+
+What: /sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/use_odd_counter
+Date: May 2024
+KernelVersion: 6.10
+Contact: "Shravan Kumar Ramani <[email protected]>"
+Description:
+ Form 64-bit counter using 2 32-bit counters. This is used to combine
+ 2 adjacent counters to form a single 64-bit counter. Each even counter
+ is represented by a bit and setting the bit will join the corresponding
+ even counter with the next (odd) counter. The full 64-bit value can be
+ accessed using the same 2 counter<N> and counter<N+1> sysfs, with each
+ of them holding the lower and upper 32 bits respectively.
+
--
2.30.1
On Fri, 7 Jun 2024, Shravan Kumar Ramani wrote:
> Document newly added "count_clock" and "use_odd_counter" sysfs entries
> for the Mellanox BlueField PMC driver.
>
> Signed-off-by: Shravan Kumar Ramani <[email protected]>
> Reviewed-by: David Thompson <[email protected]>
> ---
> .../ABI/testing/sysfs-platform-mellanox-pmc | 23 +++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> diff --git a/Documentation/ABI/testing/sysfs-platform-mellanox-pmc b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
> index 9f987c6410da..ee03d066c0d9 100644
> --- a/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
> +++ b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
> @@ -52,3 +52,26 @@ Description:
> Writing 0 to the sysfs will clear the counter, writing any other
> value is not allowed.
>
> +What: /sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/count_clock
> +Date: May 2024
> +KernelVersion: 6.10
> +Contact: "Shravan Kumar Ramani <[email protected]>"
> +Description:
> + Use a counter for counting cycles. This is used to repurpose/dedicate
> + any of the counters in the block to counting cycles. Each counter is
> + represented by a bit (bit 0 for counter0, bit1 for counter1 and so on)
> + and setting the corresponding bit will reserve that specific counter
> + for counting cycles and override the event<N> setting.
> +
> +What: /sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/use_odd_counter
> +Date: May 2024
> +KernelVersion: 6.10
6.10 ship has already sailed.
> +Contact: "Shravan Kumar Ramani <[email protected]>"
> +Description:
> + Form 64-bit counter using 2 32-bit counters. This is used to combine
> + 2 adjacent counters to form a single 64-bit counter. Each even counter
> + is represented by a bit and setting the bit will join the corresponding
> + even counter with the next (odd) counter. The full 64-bit value can be
> + accessed using the same 2 counter<N> and counter<N+1> sysfs, with each
> + of them holding the lower and upper 32 bits respectively.
Okay, thanks for updating this, it's much clearer now what the intent of
each file is. One thing that is still unclear whether both bits are
expected to be set in count_clock when use_odd_counter is used to combine
two counters?
--
i.
> > + Form 64-bit counter using 2 32-bit counters. This is used to combine
> > + 2 adjacent counters to form a single 64-bit counter. Each even counter
> > + is represented by a bit and setting the bit will join the corresponding
> > + even counter with the next (odd) counter. The full 64-bit value can be
> > + accessed using the same 2 counter<N> and counter<N+1> sysfs, with each
> > + of them holding the lower and upper 32 bits respectively.
>
> Okay, thanks for updating this, it's much clearer now what the intent of
> each file is. One thing that is still unclear whether both bits are
> expected to be set in count_clock when use_odd_counter is used to combine
> two counters?
When use_odd_counter is set to combine counter0 and counter1, for example, the counter
value will increment only based on what event is being monitored by counter0. So for
counting cycles with 64-bits, count_clock needs to be set for counter0. Bit1 in count_clock
as well as event1 will not matter once counter1 is joined to counter0. I will capture this as
well in the doc in the next patch when I fix the kernel version.
Thanks,
Shravan