2020-10-01 01:45:32

by David E. Box

[permalink] [raw]
Subject: [PATCH V7 0/5] Intel Platform Monitoring Technology

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring capabilities on a device.
With customers increasingly asking for hardware telemetry, engineers not
only have to figure out how to measure and collect data, but also how to
deliver it and make it discoverable. The latter may be through some device
specific method requiring device specific tools to collect the data. This
in turn requires customers to manage a suite of different tools in order to
collect the differing assortment of monitoring data on their systems. Even
when such information can be provided in kernel drivers, they may require
constant maintenance to update register mappings as they change with
firmware updates and new versions of hardware. PMT provides a solution for
discovering and reading telemetry from a device through a hardware agnostic
framework that allows for updates to systems without requiring patches to
the kernel or software tools.

PMT defines several capabilities to support collecting monitoring data from
hardware. All are discoverable as separate instances of the PCIE Designated
Vendor extended capability (DVSEC) with the Intel vendor code. The DVSEC ID
field uniquely identifies the capability. Each DVSEC also provides a BAR
offset to a header that defines capability-specific attributes, including
GUID, feature type, offset and length, as well as configuration settings
where applicable. The GUID uniquely identifies the register space of any
monitor data exposed by the capability. The GUID is associated with an XML
file from the vendor that describes the mapping of the register space along
with properties of the monitor data. This allows vendors to perform
firmware updates that can change the mapping (e.g. add new metrics) without
requiring any changes to drivers or software tools. The new mapping is
confirmed by an updated GUID, read from the hardware, which software uses
with a new XML.

The current capabilities defined by PMT are Telemetry, Watcher, and
Crashlog. The Telemetry capability provides access to a continuous block
of read only data. The Watcher capability provides access to hardware
sampling and tracing features. Crashlog provides access to device crash
dumps. While there is some relationship between capabilities (Watcher can
be configured to sample from the Telemetry data set) each exists as stand
alone features with no dependency on any other. The design therefore splits
them into individual, capability specific drivers. MFD is used to create
platform devices for each capability so that they may be managed by their
own driver. The PMT architecture is (for the most part) agnostic to the
type of device it can collect from. Software can determine which devices
support a PMT feature by searching through each device node entry in the
sysfs class folder. It can additionally determine if a particular device
supports a PMT feature by checking for a PMT class folder in the device
folder.

This patch set provides support for the PMT framework, along with support
for Telemetry on Tiger Lake.

Changes from V6:
- Use NULL for OOBMSM driver data instead of an empty struct.
Rewrite the code to check for NULL driver_data.
- Fix spelling and formatting in Kconfig.
- Use MKDEV(0,0) to prevent unneeded device node from being
created.

Changes from V5:
- Add Alder Lake and the "Out of Band Management Services
Module (OOBMSM)" ids to the MFD driver. Transferred to this
patch set.
- Use a single class for all PMT capabilities as suggested by
Hans.
- Add binary attribute for telemetry driver to allow read
syscall as suggested by Hans.
- Use the class file to hold attributes and other common code
used by all PMT drivers.
- Add the crashlog driver to the patch set and add a mutex to
protect access to the enable control and trigger files as
suggested by Hans.

Changes from V4:
- Replace MFD with PMT in driver title
- Fix commit tags in chronological order
- Fix includes in alphabetical order
- Use 'raw' string instead of defines for device names
- Add an error message when returning an error code for
unrecognized capability id
- Use dev_err instead of dev_warn for messages when returning
an error
- Change while loop to call pci_find_next_ext_capability once
- Add missing continue in while loop
- Keep PCI platform defines using PCI_DEVICE_DATA magic tied to
the pci_device_id table
- Comment and kernel message cleanup

Changes from V3:
- Write out full acronym for DVSEC in PCI patch commit message and
add 'Designated' to comments
- remove unused variable caught by kernel test robot <[email protected]>
- Add required Co-developed-by signoffs, noted by Andy
- Allow access using new CAP_PERFMON capability as suggested by
Alexey Bundankov
- Fix spacing in Kconfig, noted by Randy
- Other style changes and fixups suggested by Andy

Changes from V2:
- In order to handle certain HW bugs from the telemetry capability
driver, create a single platform device per capability instead of
a device per entry. Add the entry data as device resources and
let the capability driver manage them as a set allowing for
cleaner HW bug resolution.
- Handle discovery table offset bug in intel_pmt.c
- Handle overlapping regions in intel_pmt_telemetry.c
- Add description of sysfs class to testing ABI.
- Don't check size and count until confirming support for the PMT
capability to avoid bailing out when we need to skip it.
- Remove unneeded header file. Move code to the intel_pmt.c, the
only place where it's needed.
- Remove now unused platform data.
- Add missing header files types.h, bits.h.
- Rename file name and build options from telem to telemetry.
- Code cleanup suggested by Andy S.
- x86 mailing list added.

Changes from V1:
- In the telemetry driver, set the device in device_create() to
the parent PCI device (the monitoring device) for clear
association in sysfs. Was set before to the platform device
created by the PCI parent.
- Move telem struct into driver and delete unneeded header file.
- Start telem device numbering from 0 instead of 1. 1 was used
due to anticipated changes, no longer needed.
- Use helper macros suggested by Andy S.
- Rename class to pmt_telemetry, spelling out full name
- Move monitor device name defines to common header
- Coding style, spelling, and Makefile/MAINTAINERS ordering fixes

Alexander Duyck (3):
platform/x86: Intel PMT class driver
platform/x86: Intel PMT Telemetry capability driver
platform/x86: Intel PMT Crashlog capability driver

David E. Box (2):
PCI: Add defines for Designated Vendor-Specific Extended Capability
mfd: Intel Platform Monitoring Technology support

.../ABI/testing/sysfs-class-intel_pmt | 119 ++++++
MAINTAINERS | 6 +
drivers/mfd/Kconfig | 10 +
drivers/mfd/Makefile | 1 +
drivers/mfd/intel_pmt.c | 225 ++++++++++++
drivers/platform/x86/Kconfig | 25 ++
drivers/platform/x86/Makefile | 3 +
drivers/platform/x86/intel_pmt_class.c | 286 +++++++++++++++
drivers/platform/x86/intel_pmt_class.h | 57 +++
drivers/platform/x86/intel_pmt_crashlog.c | 339 ++++++++++++++++++
drivers/platform/x86/intel_pmt_telemetry.c | 158 ++++++++
include/uapi/linux/pci_regs.h | 5 +
12 files changed, 1234 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-class-intel_pmt
create mode 100644 drivers/mfd/intel_pmt.c
create mode 100644 drivers/platform/x86/intel_pmt_class.c
create mode 100644 drivers/platform/x86/intel_pmt_class.h
create mode 100644 drivers/platform/x86/intel_pmt_crashlog.c
create mode 100644 drivers/platform/x86/intel_pmt_telemetry.c

--
2.20.1


2020-10-01 01:47:34

by David E. Box

[permalink] [raw]
Subject: [PATCH V7 4/5] platform/x86: Intel PMT Telemetry capability driver

From: Alexander Duyck <[email protected]>

PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
The Telemetry capability provides access to device telemetry metrics that
provide hardware performance data to users from read-only register spaces.

With this driver present the intel_pmt directory can be populated with
telem<x> devices. These devices will contain the standard intel_pmt sysfs
data and a "telem" binary sysfs attribute which can be used to access the
telemetry data.

Co-developed-by: David E. Box <[email protected]>
Signed-off-by: David E. Box <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
drivers/platform/x86/Kconfig | 8 ++
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_telemetry.c | 158 +++++++++++++++++++++
3 files changed, 167 insertions(+)
create mode 100644 drivers/platform/x86/intel_pmt_telemetry.c

diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 82465d0e8fd3..8eae17a57a5b 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -1369,6 +1369,14 @@ config INTEL_PMT_CLASS
For more information, see:
<file:Documentation/ABI/testing/sysfs-class-intel_pmt>

+config INTEL_PMT_TELEMETRY
+ tristate "Intel Platform Monitoring Technology (PMT) Telemetry driver"
+ select INTEL_PMT_CLASS
+ help
+ The Intel Platform Monitory Technology (PMT) Telemetry driver provides
+ access to hardware telemetry metrics on devices that support the
+ feature.
+
config INTEL_PUNIT_IPC
tristate "Intel P-Unit IPC Driver"
help
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index f4b1f87f2401..6a7b61f59ea8 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -141,6 +141,7 @@ obj-$(CONFIG_INTEL_MID_POWER_BUTTON) += intel_mid_powerbtn.o
obj-$(CONFIG_INTEL_MRFLD_PWRBTN) += intel_mrfld_pwrbtn.o
obj-$(CONFIG_INTEL_PMC_CORE) += intel_pmc_core.o intel_pmc_core_pltdrv.o
obj-$(CONFIG_INTEL_PMT_CLASS) += intel_pmt_class.o
+obj-$(CONFIG_INTEL_PMT_TELEMETRY) += intel_pmt_telemetry.o
obj-$(CONFIG_INTEL_PUNIT_IPC) += intel_punit_ipc.o
obj-$(CONFIG_INTEL_SCU_IPC) += intel_scu_ipc.o
obj-$(CONFIG_INTEL_SCU_PCI) += intel_scu_pcidrv.o
diff --git a/drivers/platform/x86/intel_pmt_telemetry.c b/drivers/platform/x86/intel_pmt_telemetry.c
new file mode 100644
index 000000000000..d4819aefdd65
--- /dev/null
+++ b/drivers/platform/x86/intel_pmt_telemetry.c
@@ -0,0 +1,158 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitory Technology Telemetry driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Author: "David E. Box" <[email protected]>
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+
+#include "intel_pmt_class.h"
+
+#define TELEM_DEV_NAME "pmt_telemetry"
+
+#define TELEM_SIZE_OFFSET 0x0
+#define TELEM_GUID_OFFSET 0x4
+#define TELEM_BASE_OFFSET 0x8
+#define TELEM_ACCESS(v) ((v) & GENMASK(3, 0))
+/* size is in bytes */
+#define TELEM_SIZE(v) (((v) & GENMASK(27, 12)) >> 10)
+
+/* Used by client hardware to identify a fixed telemetry entry*/
+#define TELEM_CLIENT_FIXED_BLOCK_GUID 0x10000000
+
+struct pmt_telem_priv {
+ int num_entries;
+ struct intel_pmt_entry entry[];
+};
+
+static DEFINE_XARRAY_ALLOC(telem_array);
+static struct intel_pmt_namespace pmt_telem_ns = {
+ .name = "telem",
+ .xa = &telem_array
+};
+
+/*
+ * driver initialization
+ */
+static int pmt_telem_add_entry(struct intel_pmt_entry *entry,
+ struct device *parent,
+ struct resource *disc_res)
+{
+ void __iomem *disc_table = entry->disc_table;
+ struct intel_pmt_header header;
+ int ret;
+
+ header.access_type = TELEM_ACCESS(readl(disc_table));
+ header.guid = readl(disc_table + TELEM_GUID_OFFSET);
+ header.base_offset = readl(disc_table + TELEM_BASE_OFFSET);
+
+ /* Size is measured in DWORDS, but accessor returns bytes */
+ header.size = TELEM_SIZE(readl(disc_table));
+
+ ret = intel_pmt_populate_entry(entry, &header, parent, disc_res);
+ if (ret)
+ return ret;
+
+ return intel_pmt_dev_create(entry, &pmt_telem_ns, parent);
+}
+
+static bool pmt_telem_region_overlaps(struct intel_pmt_entry *entry)
+{
+ u32 guid;
+
+ guid = readl(entry->disc_table + TELEM_GUID_OFFSET);
+
+ return guid == TELEM_CLIENT_FIXED_BLOCK_GUID;
+}
+
+static int pmt_telem_remove(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv = platform_get_drvdata(pdev);
+ int i;
+
+ for (i = 0; i < priv->num_entries; i++)
+ intel_pmt_dev_destroy(&priv->entry[i], &pmt_telem_ns);
+
+ return 0;
+}
+
+static int pmt_telem_probe(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv;
+ bool early_hw;
+ size_t size;
+ int i, ret;
+
+ size = offsetof(struct pmt_telem_priv, entry[pdev->num_resources]);
+ priv = devm_kzalloc(&pdev->dev, size, GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, priv);
+
+ if (intel_pmt_is_early_client_hw(&pdev->dev))
+ early_hw = true;
+
+ for (i = 0; i < pdev->num_resources; i++) {
+ struct intel_pmt_entry *entry = &priv->entry[i];
+ struct resource *disc_res;
+
+ ret = -ENODEV;
+
+ disc_res = platform_get_resource(pdev, IORESOURCE_MEM, i);
+ if (!disc_res)
+ goto abort_probe;
+
+ ret = intel_pmt_ioremap_discovery_table(entry, pdev, i);
+ if (ret)
+ goto abort_probe;
+
+ if (pmt_telem_region_overlaps(entry) && early_hw)
+ continue;
+
+ ret = pmt_telem_add_entry(entry, &pdev->dev, disc_res);
+ if (ret)
+ goto abort_probe;
+
+ priv->num_entries++;
+ }
+
+ return 0;
+abort_probe:
+ pmt_telem_remove(pdev);
+ return ret;
+}
+
+static struct platform_driver pmt_telem_driver = {
+ .driver = {
+ .name = TELEM_DEV_NAME,
+ },
+ .remove = pmt_telem_remove,
+ .probe = pmt_telem_probe,
+};
+
+static int __init pmt_telem_init(void)
+{
+ return platform_driver_register(&pmt_telem_driver);
+}
+module_init(pmt_telem_init);
+
+static void __exit pmt_telem_exit(void)
+{
+ platform_driver_unregister(&pmt_telem_driver);
+ xa_destroy(&telem_array);
+}
+module_exit(pmt_telem_exit);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel PMT Telemetry driver");
+MODULE_ALIAS("platform:" TELEM_DEV_NAME);
+MODULE_LICENSE("GPL v2");
--
2.20.1

2020-10-01 16:05:07

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH V7 4/5] platform/x86: Intel PMT Telemetry capability driver

On Thu, Oct 1, 2020 at 4:43 AM David E. Box <[email protected]> wrote:
>
> From: Alexander Duyck <[email protected]>
>
> PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
> The Telemetry capability provides access to device telemetry metrics that
> provide hardware performance data to users from read-only register spaces.
>
> With this driver present the intel_pmt directory can be populated with
> telem<x> devices. These devices will contain the standard intel_pmt sysfs
> data and a "telem" binary sysfs attribute which can be used to access the
> telemetry data.

...

> +static DEFINE_XARRAY_ALLOC(telem_array);
> +static struct intel_pmt_namespace pmt_telem_ns = {
> + .name = "telem",
> + .xa = &telem_array

Leave comma at the end.

> +};
> +
> +/*
> + * driver initialization
> + */

This is a useless comment.

> + size = offsetof(struct pmt_telem_priv, entry[pdev->num_resources]);
> + priv = devm_kzalloc(&pdev->dev, size, GFP_KERNEL);
> + if (!priv)
> + return -ENOMEM;

Please, use struct_size() from overflow.h instead of custom approach.

...

> +static struct platform_driver pmt_telem_driver = {
> + .driver = {
> + .name = TELEM_DEV_NAME,

I'm not sure I have interpreted this:
- Use 'raw' string instead of defines for device names
correctly. Can you elaborate?

> + },
> + .remove = pmt_telem_remove,
> + .probe = pmt_telem_probe,
> +};

...

> +MODULE_ALIAS("platform:" TELEM_DEV_NAME);

Ditto.

--
With Best Regards,
Andy Shevchenko

2020-10-01 16:48:10

by Alexander Duyck

[permalink] [raw]
Subject: Re: [PATCH V7 4/5] platform/x86: Intel PMT Telemetry capability driver

On Thu, Oct 1, 2020 at 9:03 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Thu, Oct 1, 2020 at 4:43 AM David E. Box <[email protected]> wrote:
> >
> > From: Alexander Duyck <[email protected]>
> >
> > PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
> > The Telemetry capability provides access to device telemetry metrics that
> > provide hardware performance data to users from read-only register spaces.
> >
> > With this driver present the intel_pmt directory can be populated with
> > telem<x> devices. These devices will contain the standard intel_pmt sysfs
> > data and a "telem" binary sysfs attribute which can be used to access the
> > telemetry data.
>
> ...
>
> > +static DEFINE_XARRAY_ALLOC(telem_array);
> > +static struct intel_pmt_namespace pmt_telem_ns = {
> > + .name = "telem",
> > + .xa = &telem_array
>
> Leave comma at the end.
>
> > +};
> > +
> > +/*
> > + * driver initialization
> > + */
>
> This is a useless comment.
>
> > + size = offsetof(struct pmt_telem_priv, entry[pdev->num_resources]);
> > + priv = devm_kzalloc(&pdev->dev, size, GFP_KERNEL);
> > + if (!priv)
> > + return -ENOMEM;
>
> Please, use struct_size() from overflow.h instead of custom approach.
>
> ...

So all of the above make sense and can be fixed shortly and pushed as
a v8 for both the telemetry and crashlog drivers.

> > +static struct platform_driver pmt_telem_driver = {
> > + .driver = {
> > + .name = TELEM_DEV_NAME,
>
> I'm not sure I have interpreted this:
> - Use 'raw' string instead of defines for device names
> correctly. Can you elaborate?

Can you point me to a reference of that? I'm not sure what you are referring to.

> > + },
> > + .remove = pmt_telem_remove,
> > + .probe = pmt_telem_probe,
> > +};
>
> ...
>
> > +MODULE_ALIAS("platform:" TELEM_DEV_NAME);
>
> Ditto.

This doesn't make sense to me. Are you saying we are expected to use
"pmt_telemetry" everywhere instead of the define? It seems like that
would be much more error prone. It seems like common practice to use
DRV_NAME throughout a driver for these sort of things so if you are
wanting us to rename it to that I am fine with that, but I am not sure
getting rid of the use of a define makes sense.

2020-10-01 17:56:43

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH V7 4/5] platform/x86: Intel PMT Telemetry capability driver

On Thu, Oct 1, 2020 at 7:46 PM Alexander Duyck
<[email protected]> wrote:
> On Thu, Oct 1, 2020 at 9:03 AM Andy Shevchenko
> <[email protected]> wrote:

> > > +static struct platform_driver pmt_telem_driver = {
> > > + .driver = {
> > > + .name = TELEM_DEV_NAME,
> >
> > I'm not sure I have interpreted this:
> > - Use 'raw' string instead of defines for device names
> > correctly. Can you elaborate?
>
> Can you point me to a reference of that? I'm not sure what you are referring to.

It's a changelog of this very series.

> > > + },
> > > + .remove = pmt_telem_remove,
> > > + .probe = pmt_telem_probe,
> > > +};
> >
> > ...
> >
> > > +MODULE_ALIAS("platform:" TELEM_DEV_NAME);
> >
> > Ditto.
>
> This doesn't make sense to me. Are you saying we are expected to use
> "pmt_telemetry" everywhere instead of the define? It seems like that
> would be much more error prone. It seems like common practice to use
> DRV_NAME throughout a driver for these sort of things so if you are
> wanting us to rename it to that I am fine with that, but I am not sure
> getting rid of the use of a define makes sense.

I'm just wondering why changelog mentioned that and what it meant.
If it's indeed conversion to explicit naming, then this has to be
followed (somebody seems asked for that, right?) or commented why not.
Or maybe I understood it wrong?

--
With Best Regards,
Andy Shevchenko