2020-05-05 01:34:47

by David E. Box

Subject: [PATCH 0/3] Intel Platform Monitoring Technology

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring capabilities on a device.
With customers increasingly asking for hardware telemetry, engineers not
only have to figure out how to measure and collect data, but also how to
deliver it and make it discoverable. The latter may be through some device
specific method requiring device specific tools to collect the data. This
in turn requires customers to manage a suite of different tools in order to
collect the differing assortment of monitoring data on their systems. Even
when such information can be provided in kernel drivers, they may require
constant maintenance to update register mappings as they change with
firmware updates and new versions of hardware. PMT provides a solution for
discovering and reading telemetry from a device through a hardware agnostic
framework that allows for updates to systems without requiring patches to
the kernel or software tools.

PMT defines several capabilities to support collecting monitoring data from
hardware. All are discoverable as separate instances of the PCIe Designated
Vendor extended capability (DVSEC) with the Intel vendor code. The DVSEC ID
field uniquely identifies the capability. Each DVSEC also provides a BAR
offset to a header that defines capability-specific attributes, including
GUID, feature type, offset and length, as well as configuration settings
where applicable. The GUID uniquely identifies the register space of any
monitor data exposed by the capability. The GUID is associated with an XML
file from the vendor that describes the mapping of the register space along
with properties of the monitor data. This allows vendors to perform
firmware updates that can change the mapping (e.g. add new metrics) without
requiring any changes to drivers or software tools. The new mapping is
signaled by an updated GUID, read from the hardware, which software pairs
with a new XML file.
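
As an illustration of the discovery step described above, here is a minimal
userspace sketch that walks the extended capability list in a snapshot of a
device's config space and collects the DVSEC IDs of all Intel DVSEC
instances. It is not code from this series. The helper names
(cfg_read32, find_intel_dvsec_ids) and the idea of working on a byte buffer
are assumptions made for the example; the register layout follows the PCIe
extended capability and DVSEC header formats.

```c
#include <stddef.h>
#include <stdint.h>

#define EXT_CAP_START     0x100   /* extended capabilities begin here */
#define EXT_CAP_ID_DVSEC  0x23    /* Designated Vendor-Specific capability */
#define VENDOR_ID_INTEL   0x8086

/* Read a little-endian dword from a config-space snapshot. */
static uint32_t cfg_read32(const uint8_t *cfg, size_t off)
{
	return (uint32_t)cfg[off] | (uint32_t)cfg[off + 1] << 8 |
	       (uint32_t)cfg[off + 2] << 16 | (uint32_t)cfg[off + 3] << 24;
}

/*
 * Collect the DVSEC IDs of every Intel DVSEC instance in the snapshot.
 * Returns the number of IDs stored in ids[] (at most max_ids).
 */
static size_t find_intel_dvsec_ids(const uint8_t *cfg, size_t cfg_len,
				   uint16_t *ids, size_t max_ids)
{
	size_t pos = EXT_CAP_START, found = 0;
	int guard = 0;	/* bound the walk in case the list loops */

	while (pos >= EXT_CAP_START && pos + 12 <= cfg_len && guard++ < 480) {
		uint32_t hdr = cfg_read32(cfg, pos);

		if (hdr == 0 || hdr == 0xffffffff)
			break;

		if ((hdr & 0xffff) == EXT_CAP_ID_DVSEC) {
			uint32_t h1 = cfg_read32(cfg, pos + 4); /* vendor ID */
			uint32_t h2 = cfg_read32(cfg, pos + 8); /* DVSEC ID */

			if ((h1 & 0xffff) == VENDOR_ID_INTEL && found < max_ids)
				ids[found++] = h2 & 0xffff;
		}

		/* Bits 31:20 hold the next capability offset; 0 ends the list. */
		pos = (hdr >> 20) & 0xffc;
	}

	return found;
}
```

With the IDs used in patch 2 of this series, a returned ID of 2 would
correspond to Telemetry, 3 to Watcher, and 4 to Crashlog.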

The current capabilities defined by PMT are Telemetry, Watcher, and
Crashlog. The Telemetry capability provides access to a continuous block
of read only data. The Watcher capability provides access to hardware
sampling and tracing features. Crashlog provides access to device crash
dumps. While there is some relationship between capabilities (Watcher can
be configured to sample from the Telemetry data set), each exists as a
standalone feature with no dependency on any other. The design therefore
splits them into individual, capability-specific drivers. MFD is used to create
platform devices for each capability so that they may be managed by their
own driver. The PMT architecture is (for the most part) agnostic to the
type of device it can collect from. Device nodes are consequently generic
in naming, e.g. /dev/telem<n> and /dev/smplr<n>. Each capability driver
creates a class to manage the list of devices supporting it. Software can
see which devices support a PMT feature by perusing each device file
underneath the class in sysfs. It can additionally see if a particular
device supports a PMT feature by seeing if that device contains a pointer
to a PMT class in its device folder.
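
A rough userspace sketch of the sysfs side of that discovery follows. It
reads a single attribute (such as guid or size) for one device under the
class directory. The function names and the class_dir parameter are
hypothetical; the path layout /sys/class/intel_pmt_telem/telemX/<attr> is
taken from the ABI document in patch 3, and the attribute text formats
("0x..." for guid, decimal for size) are assumed from the show functions
there.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the text of an attribute such as "0x12345678\n" or "4096\n".
 * Base 0 lets strtoul() accept both hex and decimal forms.
 */
static int parse_attr(const char *text, unsigned long *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(text, &end, 0);
	if (errno || end == text)
		return -1;

	while (*end == '\n' || *end == ' ')
		end++;
	if (*end)
		return -1;	/* trailing garbage */

	*out = v;
	return 0;
}

/* Read one attribute file, e.g. <class_dir>/telem1/guid. */
static int read_telem_attr(const char *class_dir, const char *dev,
			   const char *attr, unsigned long *out)
{
	char path[512], buf[64];
	FILE *f;

	snprintf(path, sizeof(path), "%s/%s/%s", class_dir, dev, attr);
	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	return parse_attr(buf, out);
}
```

A tool would typically call read_telem_attr("/sys/class/intel_pmt_telem",
"telem1", "guid", &guid) and then use the GUID to pick the matching XML
file. The device name "telem1" is only an example.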

This patch set provides support for the PMT framework, along with support
for Telemetry on Tiger Lake.

Patch 1 - adds the Designated Vendor PCI Extended Capability. The PMT
feature is discoverable as an Intel DVSEC capability.

Patch 2 - an MFD driver that creates cells for each PMT capability found on
a PCI device. This supports SoC platforms that expose PMT
capabilities under a PMT dedicated PCI device id.

Patch 3 - adds support for the PMT Telemetry feature.

To: [email protected],
[email protected],
[email protected]
Cc: [email protected],
[email protected]


David E. Box (3):
pci: Add Designated Vendor Specific Capability
mfd: Intel Platform Monitoring Technology support
platform/x86: Intel PMT Telemetry capability driver

.../ABI/testing/sysfs-class-intel_pmt_telem | 46 +++
MAINTAINERS | 6 +
drivers/mfd/Kconfig | 10 +
drivers/mfd/Makefile | 1 +
drivers/mfd/intel_pmt.c | 174 +++++++++
drivers/platform/x86/Kconfig | 10 +
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_telem.c | 356 ++++++++++++++++++
drivers/platform/x86/intel_pmt_telem.h | 20 +
include/linux/intel-dvsec.h | 44 +++
include/uapi/linux/pci_regs.h | 5 +
11 files changed, 673 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-class-intel_pmt_telem
create mode 100644 drivers/mfd/intel_pmt.c
create mode 100644 drivers/platform/x86/intel_pmt_telem.c
create mode 100644 drivers/platform/x86/intel_pmt_telem.h
create mode 100644 include/linux/intel-dvsec.h

--
2.20.1


2020-05-05 02:34:39

by David E. Box

Subject: [PATCH 2/3] mfd: Intel Platform Monitoring Technology support

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring facilities. PMT supports
multiple types of monitoring capabilities. Capabilities are discovered
using PCIe DVSEC with the Intel VID. Each capability is discovered as a
separate DVSEC instance in a device's config space. This driver uses MFD to
manage the creation of platform devices for each type so that they may be
controlled by their own drivers (to be introduced). Support is included
for the 3 current capability types, Telemetry, Watcher, and Crashlog. The
features are available on new Intel platforms starting from Tiger Lake for
which support is added. Tiger Lake however will not support Watcher and
Crashlog even though the capabilities appear on the device. So add a quirk
facility and use it to disable them.
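
To make the quirk handling concrete, here is a small standalone sketch of
how a DVSEC ID and a quirk mask could be turned into a platform device name.
The function pmt_cell_name and the shortened constant names are made up for
this example; the ID values, quirk bits, and device names mirror the ones
defined in this patch. Note that each quirk must be tested with a bitwise
AND: a logical && would be true whenever any quirk bit is set.

```c
#include <stddef.h>
#include <stdint.h>

/* DVSEC IDs and quirk bits, mirroring the values in this patch. */
enum { ID_TELEM = 2, ID_WATCHER = 3, ID_CRASHLOG = 4 };
enum { QUIRK_NO_WATCHER = 1u << 0, QUIRK_NO_CRASHLOG = 1u << 1 };

/*
 * Return the platform device name for a capability, or NULL when the
 * capability is unknown or disabled by a quirk on this platform.
 */
static const char *pmt_cell_name(uint16_t dvsec_id, unsigned long quirks)
{
	switch (dvsec_id) {
	case ID_TELEM:
		return "pmt_telemetry";
	case ID_WATCHER:
		/* bitwise test of one specific quirk bit */
		return (quirks & QUIRK_NO_WATCHER) ? NULL : "pmt_watcher";
	case ID_CRASHLOG:
		return (quirks & QUIRK_NO_CRASHLOG) ? NULL : "pmt_crashlog";
	default:
		return NULL;
	}
}
```

On Tiger Lake, where both quirk bits are set, only the Telemetry ID yields
a name in this sketch.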

Signed-off-by: David E. Box <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
MAINTAINERS | 5 ++
drivers/mfd/Kconfig | 10 +++
drivers/mfd/Makefile | 1 +
drivers/mfd/intel_pmt.c | 174 ++++++++++++++++++++++++++++++++++++
include/linux/intel-dvsec.h | 44 +++++++++
5 files changed, 234 insertions(+)
create mode 100644 drivers/mfd/intel_pmt.c
create mode 100644 include/linux/intel-dvsec.h

diff --git a/MAINTAINERS b/MAINTAINERS
index e64e5db31497..bacf7ecd4d21 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8783,6 +8783,11 @@ S: Maintained
F: arch/x86/include/asm/intel_telemetry.h
F: drivers/platform/x86/intel_telemetry*

+INTEL PMT DRIVER
+M: "David E. Box" <[email protected]>
+S: Maintained
+F: drivers/mfd/intel_pmt.c
+
INTEL UNCORE FREQUENCY CONTROL
M: Srinivas Pandruvada <[email protected]>
L: [email protected]
diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 0a59249198d3..c673031acdf1 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -632,6 +632,16 @@ config MFD_INTEL_MSIC
Passage) chip. This chip embeds audio, battery, GPIO, etc.
devices used in Intel Medfield platforms.

+config MFD_INTEL_PMT
+ tristate "Intel Platform Monitoring Technology support"
+ depends on PCI
+ select MFD_CORE
+ help
+ The Intel Platform Monitoring Technology (PMT) is an interface that
+ provides access to hardware monitor registers. This driver supports
+ Telemetry, Watcher, and Crashlog PTM capabilities/devices for
+ platforms starting from Tiger Lake.
+
config MFD_IPAQ_MICRO
bool "Atmel Micro ASIC (iPAQ h3100/h3600/h3700) Support"
depends on SA1100_H3100 || SA1100_H3600
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index f935d10cbf0f..0041f673faa1 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -212,6 +212,7 @@ obj-$(CONFIG_MFD_INTEL_LPSS) += intel-lpss.o
obj-$(CONFIG_MFD_INTEL_LPSS_PCI) += intel-lpss-pci.o
obj-$(CONFIG_MFD_INTEL_LPSS_ACPI) += intel-lpss-acpi.o
obj-$(CONFIG_MFD_INTEL_MSIC) += intel_msic.o
+obj-$(CONFIG_MFD_INTEL_PMT) += intel_pmt.o
obj-$(CONFIG_MFD_PALMAS) += palmas.o
obj-$(CONFIG_MFD_VIPERBOARD) += viperboard.o
obj-$(CONFIG_MFD_RC5T583) += rc5t583.o rc5t583-irq.o
diff --git a/drivers/mfd/intel_pmt.c b/drivers/mfd/intel_pmt.c
new file mode 100644
index 000000000000..c48a2b82ca99
--- /dev/null
+++ b/drivers/mfd/intel_pmt.c
@@ -0,0 +1,174 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitoring Technology MFD driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Authors: David E. Box <[email protected]>
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/mfd/core.h>
+#include <linux/intel-dvsec.h>
+
+#define TELEM_DEV_NAME "pmt_telemetry"
+#define WATCHER_DEV_NAME "pmt_watcher"
+#define CRASHLOG_DEV_NAME "pmt_crashlog"
+
+static const struct pmt_platform_info tgl_info = {
+ .quirks = PMT_QUIRK_NO_WATCHER | PMT_QUIRK_NO_CRASHLOG,
+};
+
+static int
+pmt_add_dev(struct pci_dev *pdev, struct intel_dvsec_header *header,
+ struct pmt_platform_info *info)
+{
+ struct mfd_cell *cell, *tmp;
+ const char *name;
+ int i;
+
+ switch (header->id) {
+ case DVSEC_INTEL_ID_TELEM:
+ name = TELEM_DEV_NAME;
+ break;
+ case DVSEC_INTEL_ID_WATCHER:
+ if (info->quirks & PMT_QUIRK_NO_WATCHER) {
+ dev_info(&pdev->dev, "Watcher not supported\n");
+ return 0;
+ }
+ name = WATCHER_DEV_NAME;
+ break;
+ case DVSEC_INTEL_ID_CRASHLOG:
+ if (info->quirks & PMT_QUIRK_NO_CRASHLOG) {
+ dev_info(&pdev->dev, "Crashlog not supported\n");
+ return 0;
+ }
+ name = CRASHLOG_DEV_NAME;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ cell = devm_kcalloc(&pdev->dev, header->num_entries,
+ sizeof(*cell), GFP_KERNEL);
+ if (!cell)
+ return -ENOMEM;
+
+ /* Create a platform device for each entry. */
+ for (i = 0, tmp = cell; i < header->num_entries; i++, tmp++) {
+ struct resource *res;
+
+ res = devm_kzalloc(&pdev->dev, sizeof(*res), GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ tmp->name = name;
+
+ res->start = pdev->resource[header->tbir].start +
+ header->offset +
+ (i * (INTEL_DVSEC_ENTRY_SIZE << 2));
+ res->end = res->start + (header->entry_size << 2) - 1;
+ res->flags = IORESOURCE_MEM;
+
+ tmp->resources = res;
+ tmp->num_resources = 1;
+ tmp->platform_data = header;
+ tmp->pdata_size = sizeof(*header);
+
+ }
+
+ return devm_mfd_add_devices(&pdev->dev, PLATFORM_DEVID_AUTO, cell,
+ header->num_entries, NULL, 0, NULL);
+}
+
+static int
+pmt_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ u16 vid;
+ u32 table;
+ int ret, pos = 0, last_pos = 0;
+ struct pmt_platform_info *info;
+ struct intel_dvsec_header header;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ info = devm_kmemdup(&pdev->dev, (void *)id->driver_data, sizeof(*info),
+ GFP_KERNEL);
+
+ if (!info)
+ return -ENOMEM;
+
+ while ((pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC))) {
+ pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vid);
+ if (vid != PCI_VENDOR_ID_INTEL)
+ continue;
+
+ pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER2,
+ &header.id);
+
+ pci_read_config_byte(pdev, pos + INTEL_DVSEC_ENTRIES,
+ &header.num_entries);
+
+ pci_read_config_byte(pdev, pos + INTEL_DVSEC_SIZE,
+ &header.entry_size);
+
+ if (!header.num_entries || !header.entry_size)
+ return -EINVAL;
+
+ pci_read_config_dword(pdev, pos + INTEL_DVSEC_TABLE,
+ &table);
+
+ header.tbir = INTEL_DVSEC_TABLE_BAR(table);
+ header.offset = INTEL_DVSEC_TABLE_OFFSET(table);
+ ret = pmt_add_dev(pdev, &header, info);
+ if (ret)
+ dev_warn(&pdev->dev,
+ "Failed to add devices for DVSEC id %d\n",
+ header.id);
+ last_pos = pos;
+ }
+
+ if (!last_pos) {
+ dev_err(&pdev->dev, "No supported PMT capabilities found.\n");
+ return -ENODEV;
+ }
+
+ pm_runtime_put(&pdev->dev);
+ pm_runtime_allow(&pdev->dev);
+
+ return 0;
+}
+
+static void pmt_pci_remove(struct pci_dev *pdev)
+{
+ pm_runtime_forbid(&pdev->dev);
+ pm_runtime_get_sync(&pdev->dev);
+}
+
+static const struct pci_device_id pmt_pci_ids[] = {
+ /* TGL */
+ { PCI_VDEVICE(INTEL, 0x9a0d), (kernel_ulong_t)&tgl_info },
+ { }
+};
+MODULE_DEVICE_TABLE(pci, pmt_pci_ids);
+
+static struct pci_driver pmt_pci_driver = {
+ .name = "intel-pmt",
+ .id_table = pmt_pci_ids,
+ .probe = pmt_pci_probe,
+ .remove = pmt_pci_remove,
+};
+
+module_pci_driver(pmt_pci_driver);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel Platform Monitoring Technology MFD driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/intel-dvsec.h b/include/linux/intel-dvsec.h
new file mode 100644
index 000000000000..94f606bf8eae
--- /dev/null
+++ b/include/linux/intel-dvsec.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef INTEL_DVSEC_H
+#define INTEL_DVSEC_H
+
+#include <linux/types.h>
+
+#define DVSEC_INTEL_ID_TELEM 2
+#define DVSEC_INTEL_ID_WATCHER 3
+#define DVSEC_INTEL_ID_CRASHLOG 4
+
+/* Intel DVSEC capability vendor space offsets */
+#define INTEL_DVSEC_ENTRIES 0xA
+#define INTEL_DVSEC_SIZE 0xB
+#define INTEL_DVSEC_TABLE 0xC
+#define INTEL_DVSEC_TABLE_BAR(x) ((x) & GENMASK(2, 0))
+#define INTEL_DVSEC_TABLE_OFFSET(x) ((x) >> 3)
+
+#define INTEL_DVSEC_ENTRY_SIZE 4
+
+/* DVSEC header */
+struct intel_dvsec_header {
+ u16 length;
+ u16 id;
+ u8 num_entries;
+ u8 entry_size;
+ u8 entry_max;
+ u8 tbir;
+ u32 offset;
+};
+
+enum pmt_quirks {
+ /* Watcher capability not supported */
+ PMT_QUIRK_NO_WATCHER = (1 << 0),
+
+ /* Crashlog capability not supported */
+ PMT_QUIRK_NO_CRASHLOG = (1 << 1),
+};
+
+struct pmt_platform_info {
+ unsigned long quirks;
+ struct intel_dvsec_header **capabilities;
+};
+
+#endif
--
2.20.1

2020-05-05 02:38:11

by David E. Box

Subject: [PATCH 3/3] platform/x86: Intel PMT Telemetry capability driver

PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
The Telemetry capability provides access to device telemetry metrics that
provide hardware performance data to users from continuous, memory mapped,
read-only register spaces.

Register mappings are not provided by the driver. Instead, a GUID is read
from a header for each endpoint. The GUID identifies the device and is to
be used with an XML, provided by the vendor, to discover the available set
of metrics and their register mapping. This allows firmware updates to
modify the register space without needing to update the driver every time
with new mappings. Firmware writes a new GUID in this case to specify the
new mapping. Software tools with access to the associated XML file can
then interpret the changes.
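
For readers following along, the header fields mentioned above can be
decoded as in the sketch below. It operates on three dwords already read
from the discovery table rather than on MMIO, and the struct and function
names (telem_info, decode_discovery) are invented for the example. The bit
positions follow the TELEM_* macros in this patch: access type in bits 3:0,
telemetry type in bits 7:4, size in dwords in bits 27:12, GUID at offset
0x4, and base offset at offset 0x8 with the BAR index in its low 3 bits.

```c
#include <stdint.h>

struct telem_info {
	uint8_t  access_type;	/* bits 3:0 of dword 0 */
	uint8_t  telem_type;	/* bits 7:4 of dword 0 */
	uint8_t  tbir;		/* BAR index, low 3 bits of dword 2 */
	uint32_t size_bytes;	/* bits 27:12 of dword 0 count dwords */
	uint32_t guid;		/* dword 1 */
	uint32_t base_offset;	/* dword 2 with the BAR index cleared */
};

/* Decode the first three dwords of a telemetry discovery entry. */
static struct telem_info decode_discovery(const uint32_t dw[3])
{
	struct telem_info t;

	t.access_type = dw[0] & 0xf;
	t.telem_type  = (dw[0] >> 4) & 0xf;
	t.size_bytes  = ((dw[0] >> 12) & 0xffff) * 4;
	t.guid        = dw[1];
	t.tbir        = dw[2] & 0x7;
	t.base_offset = dw[2] & ~UINT32_C(0x7);

	return t;
}
```

The GUID is passed through untouched; its only use is as a key for
selecting the vendor's XML description.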

This module manages access to all PMT Telemetry endpoints on a system,
regardless of the device exporting them. It creates an intel_pmt_telem
class to manage the list. For each endpoint, sysfs files provide GUID and
size information as well as a pointer to the parent device the telemetry
comes from. Software may discover the association between endpoints and
devices by iterating through the list in sysfs, or by looking for the
existence of the class folder under the device of interest. A device node
of the same name allows software to then map the telemetry space for direct
access.
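
A hedged sketch of how a userspace tool might map the telemetry space
follows. The helper names are hypothetical. The offset and size arguments
are meant to come from the sysfs attributes of the same names. The sketch
assumes the mapping starts at the page containing the telemetry region,
which is how the mmap handler in this patch computes its pfn, so the data
begins "offset" bytes into the mapping. It opens the node read-only and
uses MAP_SHARED because the handler rejects mappings that could become
writable; as far as I can tell, a MAP_PRIVATE mapping keeps VM_MAYWRITE set
and would be refused. Opening the node also requires CAP_SYS_ADMIN.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Length to map so that bytes [offset, offset + size) are covered. */
static size_t telem_map_len(size_t offset, size_t size, size_t page)
{
	return (offset + size + page - 1) / page * page;
}

/*
 * Map a telemetry node read-only. On success returns a pointer to the
 * first telemetry byte and stores the mapping in *base and *len so the
 * caller can munmap() it later. Returns NULL on any failure.
 */
static const volatile uint8_t *telem_map(const char *node, size_t offset,
					 size_t size, void **base, size_t *len)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;
	int fd;

	if (page <= 0)
		return NULL;

	fd = open(node, O_RDONLY);
	if (fd < 0)
		return NULL;

	*len = telem_map_len(offset, size, (size_t)page);
	p = mmap(NULL, *len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after close */
	if (p == MAP_FAILED)
		return NULL;

	*base = p;
	return (const volatile uint8_t *)p + offset;
}
```

A caller would pass something like "/dev/telem1" as the node, read metrics
through the returned pointer at offsets given by the XML file, and call
munmap(base, len) when done.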

Signed-off-by: David E. Box <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
.../ABI/testing/sysfs-class-intel_pmt_telem | 46 +++
MAINTAINERS | 1 +
drivers/platform/x86/Kconfig | 10 +
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_telem.c | 356 ++++++++++++++++++
drivers/platform/x86/intel_pmt_telem.h | 20 +
6 files changed, 434 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-class-intel_pmt_telem
create mode 100644 drivers/platform/x86/intel_pmt_telem.c
create mode 100644 drivers/platform/x86/intel_pmt_telem.h

diff --git a/Documentation/ABI/testing/sysfs-class-intel_pmt_telem b/Documentation/ABI/testing/sysfs-class-intel_pmt_telem
new file mode 100644
index 000000000000..cdd9a16b31f3
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-intel_pmt_telem
@@ -0,0 +1,46 @@
+What: /sys/class/intel_pmt_telem/
+Date: April 2020
+KernelVersion: 5.8
+Contact: David Box <[email protected]>
+Description:
+ The intel_pmt_telem/ class directory contains information for
+ devices that expose hardware telemetry using Intel Platform
+ Monitoring Technology (PMT).
+
+What: /sys/class/intel_pmt_telem/telemX
+Date: April 2020
+KernelVersion: 5.8
+Contact: David Box <[email protected]>
+Description:
+ The telemX directory contains files describing an instance of a
+ PMT telemetry device that exposes hardware telemetry. Each
+ telemX device has an associated /dev/telemX node. This node can
+ be opened and mapped to access the telemetry space of the
+ device. The register layout of the telemetry space is
+ determined from an XML file, identified by guid, for the
+ corresponding parent device.
+
+What: /sys/class/intel_pmt_telem/telemX/guid
+Date: April 2020
+KernelVersion: 5.8
+Contact: David Box <[email protected]>
+Description:
+ (RO) The guid for this telemetry device. The guid identifies
+ the version of the XML file for the parent device that should
+ be used to determine the register layout.
+
+What: /sys/class/intel_pmt_telem/telemX/size
+Date: April 2020
+KernelVersion: 5.8
+Contact: David Box <[email protected]>
+Description:
+ (RO) The size of the telemetry region, in bytes, that corresponds
+ to the mapping size for the /dev/telemX device node.
+
+What: /sys/class/intel_pmt_telem/telemX/offset
+Date: April 2020
+KernelVersion: 5.8
+Contact: David Box <[email protected]>
+Description:
+ (RO) The offset of the telemetry region, in bytes, within the
+ mapping of the /dev/telemX device node.
diff --git a/MAINTAINERS b/MAINTAINERS
index bacf7ecd4d21..c49a9d3a28d2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8787,6 +8787,7 @@ INTEL PMT DRIVER
M: "David E. Box" <[email protected]>
S: Maintained
F: drivers/mfd/intel_pmt.c
+F: drivers/platform/x86/intel_pmt_*

INTEL UNCORE FREQUENCY CONTROL
M: Srinivas Pandruvada <[email protected]>
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 0ad7ad8cf8e1..dd734eb66e74 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -1368,6 +1368,16 @@ config INTEL_TELEMETRY
directly via debugfs files. Various tools may use
this interface for SoC state monitoring.

+config INTEL_PMT_TELEM
+ tristate "Intel PMT telemetry driver"
+ help
+ The Intel Platform Monitoring Technology (PMT) Telemetry driver provides
+ access to hardware telemetry metrics on devices that support the
+ feature.
+
+ For more information, see
+ <file:Documentation/ABI/testing/sysfs-class-intel_pmt_telem>
+
endif # X86_PLATFORM_DEVICES

config PMC_ATOM
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 53408d965874..f37e000ef8cb 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -146,3 +146,4 @@ obj-$(CONFIG_INTEL_TELEMETRY) += intel_telemetry_core.o \
intel_telemetry_pltdrv.o \
intel_telemetry_debugfs.o
obj-$(CONFIG_PMC_ATOM) += pmc_atom.o
+obj-$(CONFIG_INTEL_PMT_TELEM) += intel_pmt_telem.o
diff --git a/drivers/platform/x86/intel_pmt_telem.c b/drivers/platform/x86/intel_pmt_telem.c
new file mode 100644
index 000000000000..ae6f867f53fa
--- /dev/null
+++ b/drivers/platform/x86/intel_pmt_telem.c
@@ -0,0 +1,356 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitoring Technology Telemetry driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Author: "David E. Box" <[email protected]>
+ */
+
+#include <linux/cdev.h>
+#include <linux/intel-dvsec.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/xarray.h>
+
+#include "intel_pmt_telem.h"
+
+/* platform device name to bind to driver */
+#define TELEM_DRV_NAME "pmt_telemetry"
+
+/* Telemetry access types */
+#define TELEM_ACCESS_FUTURE 1
+#define TELEM_ACCESS_BARID 2
+#define TELEM_ACCESS_LOCAL 3
+
+#define TELEM_GUID_OFFSET 0x4
+#define TELEM_BASE_OFFSET 0x8
+#define TELEM_TBIR_MASK 0x7
+#define TELEM_ACCESS(v) ((v) & GENMASK(3, 0))
+#define TELEM_TYPE(v) (((v) & GENMASK(7, 4)) >> 4)
+/* size is in bytes */
+#define TELEM_SIZE(v) (((v) & GENMASK(27, 12)) >> 10)
+
+#define TELEM_XA_START 1
+#define TELEM_XA_MAX INT_MAX
+#define TELEM_XA_LIMIT XA_LIMIT(TELEM_XA_START, TELEM_XA_MAX)
+
+static DEFINE_XARRAY_ALLOC(telem_array);
+
+struct pmt_telem_priv {
+ struct device *dev;
+ struct intel_dvsec_header *dvsec;
+ struct telem_header header;
+ unsigned long base_addr;
+ void __iomem *disc_table;
+ struct cdev cdev;
+ dev_t devt;
+ int devid;
+};
+
+/*
+ * devfs
+ */
+static int pmt_telem_open(struct inode *inode, struct file *filp)
+{
+ struct pmt_telem_priv *priv;
+ struct pci_driver *pci_drv;
+ struct pci_dev *pci_dev;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ priv = container_of(inode->i_cdev, struct pmt_telem_priv, cdev);
+ pci_dev = to_pci_dev(priv->dev->parent);
+
+ pci_drv = pci_dev_driver(pci_dev);
+ if (!pci_drv)
+ return -ENODEV;
+
+ filp->private_data = priv;
+ get_device(&pci_dev->dev);
+
+ if (!try_module_get(pci_drv->driver.owner)) {
+ put_device(&pci_dev->dev);
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+static int pmt_telem_release(struct inode *inode, struct file *filp)
+{
+ struct pmt_telem_priv *priv = filp->private_data;
+ struct pci_dev *pci_dev = to_pci_dev(priv->dev->parent);
+ struct pci_driver *pci_drv = pci_dev_driver(pci_dev);
+
+ put_device(&pci_dev->dev);
+ module_put(pci_drv->driver.owner);
+
+ return 0;
+}
+
+static int pmt_telem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+ struct pmt_telem_priv *priv = filp->private_data;
+ unsigned long vsize = vma->vm_end - vma->vm_start;
+ unsigned long phys = priv->base_addr;
+ unsigned long pfn = PFN_DOWN(phys);
+ unsigned long psize;
+
+ psize = (PFN_UP(priv->base_addr + priv->header.size) - pfn) * PAGE_SIZE;
+ if (vsize > psize) {
+ dev_err(priv->dev, "Requested mmap size is too large\n");
+ return -EINVAL;
+ }
+
+ if ((vma->vm_flags & VM_WRITE) || (vma->vm_flags & VM_MAYWRITE))
+ return -EPERM;
+
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+ if (io_remap_pfn_range(vma, vma->vm_start, pfn, vsize,
+ vma->vm_page_prot))
+ return -EINVAL;
+
+ return 0;
+}
+
+static const struct file_operations pmt_telem_fops = {
+ .owner = THIS_MODULE,
+ .open = pmt_telem_open,
+ .mmap = pmt_telem_mmap,
+ .release = pmt_telem_release,
+};
+
+/*
+ * sysfs
+ */
+static ssize_t guid_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_priv *priv = dev_get_drvdata(dev);
+
+ return sprintf(buf, "0x%x\n", priv->header.guid);
+}
+static DEVICE_ATTR_RO(guid);
+
+static ssize_t size_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_priv *priv = dev_get_drvdata(dev);
+
+ /* Display buffer size in bytes */
+ return sprintf(buf, "%u\n", priv->header.size);
+}
+static DEVICE_ATTR_RO(size);
+
+static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_priv *priv = dev_get_drvdata(dev);
+
+ /* Display buffer offset in bytes */
+ return sprintf(buf, "%lu\n", offset_in_page(priv->base_addr));
+}
+static DEVICE_ATTR_RO(offset);
+
+static struct attribute *pmt_telem_attrs[] = {
+ &dev_attr_guid.attr,
+ &dev_attr_size.attr,
+ &dev_attr_offset.attr,
+ NULL
+};
+ATTRIBUTE_GROUPS(pmt_telem);
+
+struct class pmt_telem_class = {
+ .owner = THIS_MODULE,
+ .name = "intel_pmt_telem",
+ .dev_groups = pmt_telem_groups,
+};
+
+/*
+ * driver initialization
+ */
+static int pmt_telem_create_dev(struct pmt_telem_priv *priv)
+{
+ struct device *dev;
+ int ret;
+
+ cdev_init(&priv->cdev, &pmt_telem_fops);
+ ret = cdev_add(&priv->cdev, priv->devt, 1);
+ if (ret) {
+ dev_err(priv->dev, "Could not add char dev\n");
+ return ret;
+ }
+
+ dev = device_create(&pmt_telem_class, priv->dev, priv->devt,
+ priv, "telem%d", priv->devid);
+ if (IS_ERR(dev)) {
+ dev_err(priv->dev, "Could not create device node\n");
+ cdev_del(&priv->cdev);
+ }
+
+ return PTR_ERR_OR_ZERO(dev);
+}
+
+static void pmt_telem_populate_header(void __iomem *disc_offset,
+ struct telem_header *header)
+{
+ header->access_type = TELEM_ACCESS(readb(disc_offset));
+ header->telem_type = TELEM_TYPE(readb(disc_offset));
+ header->size = TELEM_SIZE(readl(disc_offset));
+ header->guid = readl(disc_offset + TELEM_GUID_OFFSET);
+ header->base_offset = readl(disc_offset + TELEM_BASE_OFFSET);
+
+ /*
+ * For non-local access types the lower 3 bits of base offset
+ * contains the index of the base address register where the
+ * telemetry can be found.
+ */
+ header->tbir = header->base_offset & TELEM_TBIR_MASK;
+ header->base_offset ^= header->tbir;
+}
+
+static int pmt_telem_probe(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv;
+ struct pci_dev *parent;
+ int err;
+
+ priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, priv);
+ priv->dev = &pdev->dev;
+ parent = to_pci_dev(priv->dev->parent);
+
+ /* TODO: replace with device properties??? */
+ priv->dvsec = dev_get_platdata(&pdev->dev);
+ if (!priv->dvsec) {
+ dev_err(&pdev->dev, "Platform data not found\n");
+ return -ENODEV;
+ }
+
+ /* Remap and access the discovery table header */
+ priv->disc_table = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(priv->disc_table))
+ return PTR_ERR(priv->disc_table);
+
+ pmt_telem_populate_header(priv->disc_table, &priv->header);
+
+ /* Local access and BARID only for now */
+ switch (priv->header.access_type) {
+ case TELEM_ACCESS_LOCAL:
+ if (priv->header.tbir) {
+ dev_err(&pdev->dev,
+ "Unsupported BAR index %d for access type %d\n",
+ priv->header.tbir, priv->header.access_type);
+ return -EINVAL;
+ }
+
+ fallthrough;
+
+ case TELEM_ACCESS_BARID:
+ break;
+ default:
+ dev_err(&pdev->dev, "Unsupported access type %d\n",
+ priv->header.access_type);
+ return -EINVAL;
+ }
+
+ priv->base_addr = pci_resource_start(parent, priv->header.tbir) +
+ priv->header.base_offset;
+
+ err = alloc_chrdev_region(&priv->devt, 0, 1, TELEM_DRV_NAME);
+ if (err < 0) {
+ dev_err(&pdev->dev,
+ "PMT telemetry chrdev_region err: %d\n", err);
+ return err;
+ }
+
+ err = xa_alloc(&telem_array, &priv->devid, priv, TELEM_XA_LIMIT,
+ GFP_KERNEL);
+ if (err < 0)
+ goto fail_xa_alloc;
+
+ err = pmt_telem_create_dev(priv);
+ if (err < 0)
+ goto fail_create_dev;
+
+ return 0;
+
+fail_create_dev:
+ xa_erase(&telem_array, priv->devid);
+fail_xa_alloc:
+ unregister_chrdev_region(priv->devt, 1);
+
+ return err;
+}
+
+static int pmt_telem_remove(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv = platform_get_drvdata(pdev);
+
+ device_destroy(&pmt_telem_class, priv->devt);
+ cdev_del(&priv->cdev);
+
+ xa_erase(&telem_array, priv->devid);
+ unregister_chrdev_region(priv->devt, 1);
+
+ return 0;
+}
+
+static const struct platform_device_id pmt_telem_table[] = {
+ {
+ .name = "pmt_telemetry",
+ }, {
+ /* sentinel */
+ }
+};
+MODULE_DEVICE_TABLE(platform, pmt_telem_table);
+
+static struct platform_driver pmt_telem_driver = {
+ .driver = {
+ .name = TELEM_DRV_NAME,
+ },
+ .probe = pmt_telem_probe,
+ .remove = pmt_telem_remove,
+ .id_table = pmt_telem_table,
+};
+
+static int __init pmt_telem_init(void)
+{
+ int ret = class_register(&pmt_telem_class);
+
+ if (ret)
+ return ret;
+
+ ret = platform_driver_register(&pmt_telem_driver);
+ if (ret)
+ class_unregister(&pmt_telem_class);
+
+ return ret;
+}
+
+static void __exit pmt_telem_exit(void)
+{
+ platform_driver_unregister(&pmt_telem_driver);
+ class_unregister(&pmt_telem_class);
+ xa_destroy(&telem_array);
+}
+
+module_init(pmt_telem_init);
+module_exit(pmt_telem_exit);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel PMT Telemetry driver");
+MODULE_ALIAS("platform:" TELEM_DRV_NAME);
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/platform/x86/intel_pmt_telem.h b/drivers/platform/x86/intel_pmt_telem.h
new file mode 100644
index 000000000000..3c6d1da3dc48
--- /dev/null
+++ b/drivers/platform/x86/intel_pmt_telem.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _INTEL_PMT_TELEM_H
+#define _INTEL_PMT_TELEM_H
+
+#include <linux/intel-dvsec.h>
+
+/* Telemetry types */
+#define PMT_TELEM_TELEMETRY 0
+#define PMT_TELEM_CRASHLOG 1
+
+struct telem_header {
+ u8 access_type;
+ u8 telem_type;
+ u16 size;
+ u32 guid;
+ u32 base_offset;
+ u8 tbir;
+};
+
+#endif
--
2.20.1

2020-05-05 02:57:43

by Randy Dunlap

Subject: Re: [PATCH 2/3] mfd: Intel Platform Monitoring Technology support

On 5/4/20 7:31 PM, David E. Box wrote:
> diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
> index 0a59249198d3..c673031acdf1 100644
> --- a/drivers/mfd/Kconfig
> +++ b/drivers/mfd/Kconfig
> @@ -632,6 +632,16 @@ config MFD_INTEL_MSIC
> Passage) chip. This chip embeds audio, battery, GPIO, etc.
> devices used in Intel Medfield platforms.
>
> +config MFD_INTEL_PMT
> + tristate "Intel Platform Monitoring Technology support"
> + depends on PCI
> + select MFD_CORE
> + help
> + The Intel Platform Monitoring Technology (PMT) is an interface that
> + provides access to hardware monitor registers. This driver supports
> + Telemetry, Watcher, and Crashlog PTM capabilities/devices for

What is PTM?


> + platforms starting from Tiger Lake.
> +
> config MFD_IPAQ_MICRO
> bool "Atmel Micro ASIC (iPAQ h3100/h3600/h3700) Support"
> depends on SA1100_H3100 || SA1100_H3600


--
~Randy

2020-05-05 09:05:02

by Andy Shevchenko

Subject: Re: [PATCH 2/3] mfd: Intel Platform Monitoring Technology support

On Tue, May 5, 2020 at 5:32 AM David E. Box <[email protected]> wrote:
>
> Intel Platform Monitoring Technology (PMT) is an architecture for
> enumerating and accessing hardware monitoring facilities. PMT supports
> multiple types of monitoring capabilities. Capabilities are discovered
> using PCIe DVSEC with the Intel VID. Each capability is discovered as a
> separate DVSEC instance in a device's config space. This driver uses MFD to
> manage the creation of platform devices for each type so that they may be
> controlled by their own drivers (to be introduced). Support is included
> for the 3 current capability types, Telemetry, Watcher, and Crashlog. The
> features are available on new Intel platforms starting from Tiger Lake for
> which support is added. Tiger Lake however will not support Watcher and
> Crashlog even though the capabilities appear on the device. So add a quirk
> facility and use it to disable them.

...

> include/linux/intel-dvsec.h | 44 +++++++++

I guess such a header is a no-go, since we may end up with tons of
them. Perhaps simply pcie-dvsec.h?

...

> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8783,6 +8783,11 @@ S: Maintained
> F: arch/x86/include/asm/intel_telemetry.h
> F: drivers/platform/x86/intel_telemetry*
>
> +INTEL PMT DRIVER
> +M: "David E. Box" <[email protected]>
> +S: Maintained
> +F: drivers/mfd/intel_pmt.c

I believe you forgot to run parse-maintainers.pl --order
--input=MAINTAINERS --output=MAINTAINERS

...

> + info = devm_kmemdup(&pdev->dev, (void *)id->driver_data, sizeof(*info),
> + GFP_KERNEL);

> +

Extra blank line.

> + if (!info)
> + return -ENOMEM;
> +
> + while ((pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC))) {
> + pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vid);
> + if (vid != PCI_VENDOR_ID_INTEL)
> + continue;

Perhaps a candidate for for_each_vendor_cap() macro in pcie-dvsec.h.
Or how is it done for the rest of capabilities?

> + }

...

> +static const struct pci_device_id pmt_pci_ids[] = {
> + /* TGL */

> + { PCI_VDEVICE(INTEL, 0x9a0d), (kernel_ulong_t)&tgl_info },

PCI_DEVICE_DATA()?

> + { }
> +};

--
With Best Regards,
Andy Shevchenko

2020-05-05 13:54:01

by Andy Shevchenko

Subject: Re: [PATCH 3/3] platform/x86: Intel PMT Telemetry capability driver

On Tue, May 5, 2020 at 5:32 AM David E. Box <[email protected]> wrote:

...

> Register mappings are not provided by the driver. Instead, a GUID is read
> from a header for each endpoint. The GUID identifies the device and is to
> be used with an XML, provided by the vendor, to discover the available set
> of metrics and their register mapping. This allows firmware updates to
> modify the register space without needing to update the driver every time
> with new mappings. Firmware writes a new GUID in this case to specify the
> new mapping. Software tools with access to the associated XML file can
> then interpret the changes.

Is old hardware going to support this in the future?
(I have in mind Apollo Lake / Broxton)

> This module manages access to all PMT Telemetry endpoints on a system,
> regardless of the device exporting them. It creates an intel_pmt_telem

Name is not the best we can come up with. Would anyone else use PMT?
Would it be vendor-agnostic ABI?
(For example, I know that MIPI standardizes tracing protocols, like
STM, do we have any plans to standardize this one?)

telem -> telemetry.

> class to manage the list. For each endpoint, sysfs files provide GUID and
> size information as well as a pointer to the parent device the telemetry
> comes from. Software may discover the association between endpoints and
> devices by iterating through the list in sysfs, or by looking for the
> existence of the class folder under the device of interest. A device node
> of the same name allows software to then map the telemetry space for direct
> access.

...

> + tristate "Intel PMT telemetry driver"

I think the user should understand what it is from the title (hint: spell
PMT out fully).

...

> obj-$(CONFIG_PMC_ATOM) += pmc_atom.o
> +obj-$(CONFIG_INTEL_PMT_TELEM) += intel_pmt_telem.o

Keep this and Kconfig section in order with the other stuff.

...

bits.h?

> +#include <linux/cdev.h>
> +#include <linux/intel-dvsec.h>
> +#include <linux/io-64-nonatomic-lo-hi.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/uaccess.h>
> +#include <linux/xarray.h>

...

> +/* platform device name to bind to driver */
> +#define TELEM_DRV_NAME "pmt_telemetry"

Shouldn't be part of MFD header?

...

> +#define TELEM_TBIR_MASK 0x7

GENMASK() ?

> +struct pmt_telem_priv {
> + struct device *dev;
> + struct intel_dvsec_header *dvsec;
> + struct telem_header header;
> + unsigned long base_addr;
> + void __iomem *disc_table;
> + struct cdev cdev;
> + dev_t devt;
> + int devid;
> +};

...

> + unsigned long phys = priv->base_addr;
> + unsigned long pfn = PFN_DOWN(phys);
> + unsigned long psize;
> +
> + psize = (PFN_UP(priv->base_addr + priv->header.size) - pfn) * PAGE_SIZE;
> + if (vsize > psize) {
> + dev_err(priv->dev, "Requested mmap size is too large\n");
> + return -EINVAL;
> + }

...


> +static ssize_t guid_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct pmt_telem_priv *priv = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "0x%x\n", priv->header.guid);
> +}

So, it's not a GUID but rather some custom number? Can we actually do
a real GUID / UUID here?
Because of TODO below I suppose it's not carved in stone (yet) and
basically a protocol defined by firmware (which can be amended).

...

> + /* TODO: replace with device properties??? */

So, please, fulfill. swnode I guess is what you are looking for.

> + priv->dvsec = dev_get_platdata(&pdev->dev);
> + if (!priv->dvsec) {
> + dev_err(&pdev->dev, "Platform data not found\n");
> + return -ENODEV;
> + }

...

> + /* Local access and BARID only for now */
> + switch (priv->header.access_type) {
> + case TELEM_ACCESS_LOCAL:
> + if (priv->header.tbir) {
> + dev_err(&pdev->dev,
> + "Unsupported BAR index %d for access type %d\n",
> + priv->header.tbir, priv->header.access_type);
> + return -EINVAL;
> + }

> + fallthrough;

What's the point?

> +
> + case TELEM_ACCESS_BARID:
> + break;
> + default:
> + dev_err(&pdev->dev, "Unsupported access type %d\n",
> + priv->header.access_type);
> + return -EINVAL;
> + }

> + err = alloc_chrdev_region(&priv->devt, 0, 1, TELEM_DRV_NAME);

err or ret? Be consistent in the module.

> + if (err < 0) {

' < 0' Do we need it?

> + dev_err(&pdev->dev,
> + "PMT telemetry chrdev_region err: %d\n", err);
> + return err;
> + }

...

> + err = pmt_telem_create_dev(priv);
> + if (err < 0)

' < 0' Do we need it?

> + goto fail_create_dev;
> +
> + return 0;

> +}

...

> +static const struct platform_device_id pmt_telem_table[] = {
> + {
> + .name = "pmt_telemetry",
> + }, {
> + /* sentinel */
> + }

{ .name = ... },
{}

is enough.

> +};

...

> +static int __init pmt_telem_init(void)
> +{

> + int ret = class_register(&pmt_telem_class);
> +
> + if (ret)

int ret;

ret = ...
if (ret)

> + return ret;
> +
> + ret = platform_driver_register(&pmt_telem_driver);
> + if (ret)
> + class_unregister(&pmt_telem_class);
> +
> + return ret;
> +}

...

> +{

> +}

> +

Extra blank line.

> +module_init(pmt_telem_init);
> +module_exit(pmt_telem_exit);

Better to attach to the respective functions.

...

> +#include <linux/intel-dvsec.h>

There is no user of this below, but types.h has users here.

> +/* Telemetry types */
> +#define PMT_TELEM_TELEMETRY 0
> +#define PMT_TELEM_CRASHLOG 1
> +
> +struct telem_header {

> + u8 access_type;

If it's part of hardware communication, shouldn't these rather be __uXX
types, to show that this is part of the protocol between software and
hardware?

> + u8 telem_type;
> + u16 size;
> + u32 guid;
> + u32 base_offset;
> + u8 tbir;
> +};


--
With Best Regards,
Andy Shevchenko

2020-05-05 14:59:10

by David E. Box

Subject: Re: [PATCH 2/3] mfd: Intel Platform Monitoring Technology support

On Mon, 2020-05-04 at 19:53 -0700, Randy Dunlap wrote:
> On 5/4/20 7:31 PM, David E. Box wrote:
> > diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
> > index 0a59249198d3..c673031acdf1 100644
> > --- a/drivers/mfd/Kconfig
> > +++ b/drivers/mfd/Kconfig
> > @@ -632,6 +632,16 @@ config MFD_INTEL_MSIC
> > Passage) chip. This chip embeds audio, battery, GPIO, etc.
> > devices used in Intel Medfield platforms.
> >
> > +config MFD_INTEL_PMT
> > + tristate "Intel Platform Monitoring Technology support"
> > + depends on PCI
> > + select MFD_CORE
> > + help
> > + The Intel Platform Monitoring Technology (PMT) is an
> > interface that
> > + provides access to hardware monitor registers. This driver
> > supports
> > + Telemetry, Watcher, and Crashlog PTM capabilities/devices for
>
> What is PTM?

s/PTM/PMT

I have the fortune of working on another project involving PCI
Precision Time Measurement.

2020-05-05 15:18:18

by David E. Box

Subject: Re: [PATCH 2/3] mfd: Intel Platform Monitoring Technology support

On Tue, 2020-05-05 at 12:02 +0300, Andy Shevchenko wrote:
> On Tue, May 5, 2020 at 5:32 AM David E. Box <
> [email protected]> wrote:
> > Intel Platform Monitoring Technology (PMT) is an architecture for
> > enumerating and accessing hardware monitoring facilities. PMT
> > supports
> > multiple types of monitoring capabilities. Capabilities are
> > discovered
> > using PCIe DVSEC with the Intel VID. Each capability is discovered
> > as a
> > separate DVSEC instance in a device's config space. This driver
> > uses MFD to
> > manage the creation of platform devices for each type so that they
> > may be
> > controlled by their own drivers (to be introduced). Support is
> > included
> > for the 3 current capability types, Telemetry, Watcher, and
> > Crashlog. The
> > features are available on new Intel platforms starting from Tiger
> > Lake for
> > which support is added. Tiger Lake however will not support Watcher
> > and
> > Crashlog even though the capabilities appear on the device. So add
> > a quirk
> > facility and use it to disable them.
>
> ...
>
> > include/linux/intel-dvsec.h | 44 +++++++++
>
> I guess such a header is a no-go, since we may end up with tons of
> them. Perhaps simply pcie-dvsec.h?

Too general. Nothing in here applies to all PCIE DVSEC capabilities.
The file describes only the vendor defined space in a DVSEC region.

>
> ...
>
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -8783,6 +8783,11 @@ S: Maintained
> > F: arch/x86/include/asm/intel_telemetry.h
> > F: drivers/platform/x86/intel_telemetry*
> >
> > +INTEL PMT DRIVER
> > +M: "David E. Box" <[email protected]>
> > +S: Maintained
> > +F: drivers/mfd/intel_pmt.c
>
> I believe you forgot to run parse-maintainers.pl --order
> --input=MAINTAINERS --output=MAINTAINERS
>
> ...
>
> > + info = devm_kmemdup(&pdev->dev, (void *)id->driver_data,
> > sizeof(*info),
> > + GFP_KERNEL);
> > +
>
> Extra blank line.
>
> > + if (!info)
> > + return -ENOMEM;
> > +
> > + while ((pos = pci_find_next_ext_capability(pdev, pos,
> > PCI_EXT_CAP_ID_DVSEC))) {
> > + pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1,
> > &vid);
> > + if (vid != PCI_VENDOR_ID_INTEL)
> > + continue;
>
> Perhaps a candidate for for_each_vendor_cap() macro in pcie-dvsec.h.
> Or how is it done for the rest of capabilities?
>
> > + }
>
> ...
>
> > +static const struct pci_device_id pmt_pci_ids[] = {
> > + /* TGL */
> > + { PCI_VDEVICE(INTEL, 0x9a0d), (kernel_ulong_t)&tgl_info },
>
> PCI_DEVICE_DATA()?

Ack on the rest of the changes.

2020-05-05 21:11:26

by David E. Box

Subject: Re: [PATCH 3/3] platform/x86: Intel PMT Telemetry capability driver

On Tue, 2020-05-05 at 16:49 +0300, Andy Shevchenko wrote:
> On Tue, May 5, 2020 at 5:32 AM David E. Box <
> [email protected]> wrote:
>
> ...
>
> > Register mappings are not provided by the driver. Instead, a GUID
> > is read
> > from a header for each endpoint. The GUID identifies the device and
> > is to
> > be used with an XML, provided by the vendor, to discover the
> > available set
> > of metrics and their register mapping. This allows firmware
> > updates to
> > modify the register space without needing to update the driver
> > every time
> > with new mappings. Firmware writes a new GUID in this case to
> > specify the
> > new mapping. Software tools with access to the associated XML file
> > can
> > then interpret the changes.
>
> Is old hardware going to support this in the future?
> (I have in mind Apollo Lake / Broxton)

I don't know of any plans for this.

>
> > This module manages access to all PMT Telemetry endpoints on a
> > system,
> > regardless of the device exporting them. It creates an
> > intel_pmt_telem
>
> Name is not the best we can come up with. Would anyone else use PMT?
> Would it be vendor-agnostic ABI?
> (For example, I know that MIPI standardizes tracing protocols, like
> STM, do we have any plans to standardize this one?)

Not at this time. The technology may be used as a feature on non-Intel
devices, but it is Intel owned. Hence the use of DVSEC which allows
hardware to enumerate and get driver support for IP from other vendors.

>
> telem -> telemetry.
>
> > class to manage the list. For each endpoint, sysfs files provide
> > GUID and
> > size information as well as a pointer to the parent device the
> > telemetry
> > comes from. Software may discover the association between endpoints
> > and
> > devices by iterating through the list in sysfs, or by looking for
> > the
> > existence of the class folder under the device of interest. A
> > device node
> > of the same name allows software to then map the telemetry space
> > for direct
> > access.
>
> ...
>
> > + tristate "Intel PMT telemetry driver"
>
> I think the user should understand what it is from the title (hint: spell
> PMT out fully).
>
> ...
>
> > obj-$(CONFIG_PMC_ATOM) += pmc_atom.o
> > +obj-$(CONFIG_INTEL_PMT_TELEM) += intel_pmt_telem.o
>
> Keep this and Kconfig section in order with the other stuff.
>
> ...
>
> bits.h?
>
> > +#include <linux/cdev.h>
> > +#include <linux/intel-dvsec.h>
> > +#include <linux/io-64-nonatomic-lo-hi.h>
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <linux/pci.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/slab.h>
> > +#include <linux/uaccess.h>
> > +#include <linux/xarray.h>
>
> ...
>
> > +/* platform device name to bind to driver */
> > +#define TELEM_DRV_NAME "pmt_telemetry"
>
> Shouldn't be part of MFD header?

Can place in the dvsec header shared by MFD and drivers.

>
> ...
>
> > +#define TELEM_TBIR_MASK 0x7
>
> GENMASK() ?
>
> > +struct pmt_telem_priv {
> > + struct device *dev;
> > + struct intel_dvsec_header *dvsec;
> > + struct telem_header header;
> > + unsigned long base_addr;
> > + void __iomem *disc_table;
> > + struct cdev cdev;
> > + dev_t devt;
> > + int devid;
> > +};
>
> ...
>
> > + unsigned long phys = priv->base_addr;
> > + unsigned long pfn = PFN_DOWN(phys);
> > + unsigned long psize;
> > +
> > + psize = (PFN_UP(priv->base_addr + priv->header.size) - pfn)
> > * PAGE_SIZE;
> > + if (vsize > psize) {
> > + dev_err(priv->dev, "Requested mmap size is too
> > large\n");
> > + return -EINVAL;
> > + }
>
> ...
>
>
> > +static ssize_t guid_show(struct device *dev, struct
> > device_attribute *attr,
> > + char *buf)
> > +{
> > + struct pmt_telem_priv *priv = dev_get_drvdata(dev);
> > +
> > + return sprintf(buf, "0x%x\n", priv->header.guid);
> > +}
>
> So, it's not a GUID but rather some custom number? Can we actually do
> a real GUID / UUID here?

I wish we could, but this is the name it was given. We should have pushed
back more on this. My concern now in calling the attribute something
different is that it will not align with public documentation.

...

>
> > + /* Local access and BARID only for now */
> > + switch (priv->header.access_type) {
> > + case TELEM_ACCESS_LOCAL:
> > + if (priv->header.tbir) {
> > + dev_err(&pdev->dev,
> > + "Unsupported BAR index %d for
> > access type %d\n",
> > + priv->header.tbir, priv-
> > >header.access_type);
> > + return -EINVAL;
> > + }
> > + fallthrough;
>
> What's the point?

The next case has the break. That case is only there to validate that
it's not the default which would be an error. Will switch this to break
though to make it explicit.

Ack on everything else. Thanks.

2020-05-08 02:21:30

by David E. Box

Subject: [PATCH v2 0/3] Intel Platform Monitoring Technology

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring capabilities on a device.
With customers increasingly asking for hardware telemetry, engineers not
only have to figure out how to measure and collect data, but also how to
deliver it and make it discoverable. The latter may be through some device
specific method requiring device specific tools to collect the data. This
in turn requires customers to manage a suite of different tools in order to
collect the differing assortment of monitoring data on their systems. Even
when such information can be provided in kernel drivers, they may require
constant maintenance to update register mappings as they change with
firmware updates and new versions of hardware. PMT provides a solution for
discovering and reading telemetry from a device through a hardware agnostic
framework that allows for updates to systems without requiring patches to
the kernel or software tools.

PMT defines several capabilities to support collecting monitoring data from
hardware. All are discoverable as separate instances of the PCIE Designated
Vendor extended capability (DVSEC) with the Intel vendor code. The DVSEC ID
field uniquely identifies the capability. Each DVSEC also provides a BAR
offset to a header that defines capability-specific attributes, including
GUID, feature type, offset and length, as well as configuration settings
where applicable. The GUID uniquely identifies the register space of any
monitor data exposed by the capability. The GUID is associated with an XML
file from the vendor that describes the mapping of the register space along
with properties of the monitor data. This allows vendors to perform
firmware updates that can change the mapping (e.g. add new metrics) without
requiring any changes to drivers or software tools. The new mapping is
confirmed by an updated GUID, read from the hardware, which software uses
with a new XML.
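
As an illustration of the discovery step described above, the following
userspace sketch decodes a DVSEC table dword into a BAR index and an offset.
It mirrors the INTEL_DVSEC_TABLE_BAR() and INTEL_DVSEC_TABLE_OFFSET() macros
as posted in patch 2/3. The struct, the function name and any sample value
are hypothetical, and the field layout is an assumption taken from those
macros rather than from the PMT specification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded discovery-table location. */
struct pmt_table_loc {
	uint8_t  bar;     /* BAR index, bits 2:0 of the table dword */
	uint32_t offset;  /* remaining bits, as INTEL_DVSEC_TABLE_OFFSET() yields */
};

static struct pmt_table_loc pmt_decode_table(uint32_t table)
{
	struct pmt_table_loc loc;

	loc.bar = table & 0x7u;   /* same as (x) & GENMASK(2, 0) */
	loc.offset = table >> 3;  /* same as (x) >> 3 in the posted header */

	/* A BAR index extracted from three bits can never exceed 7. */
	assert(loc.bar <= 7);
	return loc;
}
```

In the driver, the BAR index selects the PCI resource and the offset locates
the first discovery entry inside it.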

The current capabilities defined by PMT are Telemetry, Watcher, and
Crashlog. The Telemetry capability provides access to a continuous block
of read only data. The Watcher capability provides access to hardware
sampling and tracing features. Crashlog provides access to device crash
dumps. While there is some relationship between capabilities (Watcher can
be configured to sample from the Telemetry data set), each exists as a
standalone feature with no dependency on any other. The design therefore splits
them into individual, capability specific drivers. MFD is used to create
platform devices for each capability so that they may be managed by their
own driver. The PMT architecture is (for the most part) agnostic to the
type of device it can collect from. Device nodes are consequently generic
in naming, e.g. /dev/telem<n> and /dev/smplr<n>. Each capability driver
creates a class to manage the list of devices supporting it. Software can
determine which devices support a PMT feature by searching through each
device node entry in the sysfs class folder. It can additionally determine
if a particular device supports a PMT feature by checking for a PMT class
folder in the device folder.

This patch set provides support for the PMT framework, along with support
for Telemetry on Tiger Lake.

Changes from V1:

- In the telemetry driver, set the device in device_create() to
the parent pci device (the monitoring device) for clear
association in sysfs. Was set before to the platform device
created by the pci parent.
- Move telem struct into driver and delete unneeded header file.
- Start telem device numbering from 0 instead of 1. 1 was used
due to anticipated changes, no longer needed.
- Use helper macros suggested by Andy S.
- Rename class to pmt_telemetry, spelling out full name
- Move monitor device name defines to common header
- Coding style, spelling, and Makefile/MAINTAINERS ordering fixes

David E. Box (3):
PCI: Add #defines for Designated Vendor-Specific Capability
mfd: Intel Platform Monitoring Technology support
platform/x86: Intel PMT Telemetry capability driver

MAINTAINERS | 6 +
drivers/mfd/Kconfig | 10 +
drivers/mfd/Makefile | 1 +
drivers/mfd/intel_pmt.c | 170 ++++++++++++
drivers/platform/x86/Kconfig | 10 +
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_telem.c | 362 +++++++++++++++++++++++++
include/linux/intel-dvsec.h | 48 ++++
include/uapi/linux/pci_regs.h | 5 +
9 files changed, 613 insertions(+)
create mode 100644 drivers/mfd/intel_pmt.c
create mode 100644 drivers/platform/x86/intel_pmt_telem.c
create mode 100644 include/linux/intel-dvsec.h

--
2.20.1

2020-05-08 02:21:52

by David E. Box

Subject: [PATCH v2 1/3] PCI: Add defines for Designated Vendor-Specific Capability

Add PCIe DVSEC extended capability ID and defines for the header offsets.
Defined in PCIe r5.0, sec 7.9.6.

Signed-off-by: David E. Box <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
include/uapi/linux/pci_regs.h | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index f9701410d3b5..09daa9f07b6b 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -720,6 +720,7 @@
#define PCI_EXT_CAP_ID_DPC 0x1D /* Downstream Port Containment */
#define PCI_EXT_CAP_ID_L1SS 0x1E /* L1 PM Substates */
#define PCI_EXT_CAP_ID_PTM 0x1F /* Precision Time Measurement */
+#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
#define PCI_EXT_CAP_ID_DLF 0x25 /* Data Link Feature */
#define PCI_EXT_CAP_ID_PL_16GT 0x26 /* Physical Layer 16.0 GT/s */
#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT
@@ -1062,6 +1063,10 @@
#define PCI_L1SS_CTL1_LTR_L12_TH_SCALE 0xe0000000 /* LTR_L1.2_THRESHOLD_Scale */
#define PCI_L1SS_CTL2 0x0c /* Control 2 Register */

+/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
+#define PCI_DVSEC_HEADER1 0x4 /* Vendor-Specific Header1 */
+#define PCI_DVSEC_HEADER2 0x8 /* Vendor-Specific Header2 */
+
/* Data Link Feature */
#define PCI_DLF_CAP 0x04 /* Capabilities Register */
#define PCI_DLF_EXCHANGE_ENABLE 0x80000000 /* Data Link Feature Exchange Enable */
--
2.20.1

2020-05-08 02:22:52

by David E. Box

Subject: [PATCH v2 2/3] mfd: Intel Platform Monitoring Technology support

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring facilities. PMT supports
multiple types of monitoring capabilities. This driver creates platform
devices for each type so that they may be managed by capability specific
drivers (to be introduced). Capabilities are discovered using PCIe DVSEC
ids. Support is included for the 3 current capability types, Telemetry,
Watcher, and Crashlog. The features are available on new Intel platforms
starting from Tiger Lake for which support is added. Tiger Lake however
will not support Watcher and Crashlog even though the capabilities appear
on the device. So add a quirk facility and use it to disable them.
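
The quirk facility boils down to a bitmask test. A minimal userspace sketch,
assuming the flag values from the intel-dvsec.h header in this patch; the
helper name pmt_cap_disabled() is hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined by enum pmt_quirks in this patch. */
enum pmt_quirks {
	PMT_QUIRK_NO_WATCHER  = (1 << 0),
	PMT_QUIRK_NO_CRASHLOG = (1 << 1),
};

/* Hypothetical helper: true when the given quirk bit is set for a platform. */
static bool pmt_cap_disabled(unsigned long quirks, enum pmt_quirks quirk)
{
	/* A zero mask would never match, so treat it as a caller error. */
	assert(quirk != 0);

	/* Bitwise AND tests one flag; a logical AND would only test non-zero. */
	return (quirks & quirk) != 0;
}
```

With the Tiger Lake quirks (both flags set), the helper reports Watcher and
Crashlog as disabled, while a platform with no quirks keeps every capability.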

Signed-off-by: David E. Box <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
MAINTAINERS | 5 ++
drivers/mfd/Kconfig | 10 +++
drivers/mfd/Makefile | 1 +
drivers/mfd/intel_pmt.c | 170 ++++++++++++++++++++++++++++++++++++
include/linux/intel-dvsec.h | 48 ++++++++++
5 files changed, 234 insertions(+)
create mode 100644 drivers/mfd/intel_pmt.c
create mode 100644 include/linux/intel-dvsec.h

diff --git a/MAINTAINERS b/MAINTAINERS
index e64e5db31497..367e49d27960 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8733,6 +8733,11 @@ F: drivers/mfd/intel_soc_pmic*
F: include/linux/mfd/intel_msic.h
F: include/linux/mfd/intel_soc_pmic*

+INTEL PMT DRIVER
+M: "David E. Box" <[email protected]>
+S: Maintained
+F: drivers/mfd/intel_pmt.c
+
INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
M: Stanislav Yakovlev <[email protected]>
L: [email protected]
diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 0a59249198d3..8777ff99e633 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -632,6 +632,16 @@ config MFD_INTEL_MSIC
Passage) chip. This chip embeds audio, battery, GPIO, etc.
devices used in Intel Medfield platforms.

+config MFD_INTEL_PMT
+ tristate "Intel Platform Monitoring Technology support"
+ depends on PCI
+ select MFD_CORE
+ help
+ The Intel Platform Monitoring Technology (PMT) is an interface that
+ provides access to hardware monitor registers. This driver supports
+ Telemetry, Watcher, and Crashlog PMT capabilities/devices for
+ platforms starting from Tiger Lake.
+
config MFD_IPAQ_MICRO
bool "Atmel Micro ASIC (iPAQ h3100/h3600/h3700) Support"
depends on SA1100_H3100 || SA1100_H3600
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index f935d10cbf0f..0041f673faa1 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -212,6 +212,7 @@ obj-$(CONFIG_MFD_INTEL_LPSS) += intel-lpss.o
obj-$(CONFIG_MFD_INTEL_LPSS_PCI) += intel-lpss-pci.o
obj-$(CONFIG_MFD_INTEL_LPSS_ACPI) += intel-lpss-acpi.o
obj-$(CONFIG_MFD_INTEL_MSIC) += intel_msic.o
+obj-$(CONFIG_MFD_INTEL_PMT) += intel_pmt.o
obj-$(CONFIG_MFD_PALMAS) += palmas.o
obj-$(CONFIG_MFD_VIPERBOARD) += viperboard.o
obj-$(CONFIG_MFD_RC5T583) += rc5t583.o rc5t583-irq.o
diff --git a/drivers/mfd/intel_pmt.c b/drivers/mfd/intel_pmt.c
new file mode 100644
index 000000000000..951128ec2afa
--- /dev/null
+++ b/drivers/mfd/intel_pmt.c
@@ -0,0 +1,170 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitoring Technology MFD driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Authors: David E. Box <[email protected]>
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/mfd/core.h>
+#include <linux/intel-dvsec.h>
+
+static const struct pmt_platform_info tgl_info = {
+ .quirks = PMT_QUIRK_NO_WATCHER | PMT_QUIRK_NO_CRASHLOG,
+};
+
+static int
+pmt_add_dev(struct pci_dev *pdev, struct intel_dvsec_header *header,
+ struct pmt_platform_info *info)
+{
+ struct mfd_cell *cell, *tmp;
+ const char *name;
+ int i;
+
+ switch (header->id) {
+ case DVSEC_INTEL_ID_TELEM:
+ name = TELEM_DEV_NAME;
+ break;
+ case DVSEC_INTEL_ID_WATCHER:
+ if (info->quirks & PMT_QUIRK_NO_WATCHER) {
+ dev_info(&pdev->dev, "Watcher not supported\n");
+ return 0;
+ }
+ name = WATCHER_DEV_NAME;
+ break;
+ case DVSEC_INTEL_ID_CRASHLOG:
+ if (info->quirks & PMT_QUIRK_NO_CRASHLOG) {
+ dev_info(&pdev->dev, "Crashlog not supported\n");
+ return 0;
+ }
+ name = CRASHLOG_DEV_NAME;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ cell = devm_kcalloc(&pdev->dev, header->num_entries,
+ sizeof(*cell), GFP_KERNEL);
+ if (!cell)
+ return -ENOMEM;
+
+ /* Create a platform device for each entry. */
+ for (i = 0, tmp = cell; i < header->num_entries; i++, tmp++) {
+ struct resource *res;
+
+ res = devm_kzalloc(&pdev->dev, sizeof(*res), GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ tmp->name = name;
+
+ res->start = pdev->resource[header->tbir].start +
+ header->offset +
+ (i * (INTEL_DVSEC_ENTRY_SIZE << 2));
+ res->end = res->start + (header->entry_size << 2) - 1;
+ res->flags = IORESOURCE_MEM;
+
+ tmp->resources = res;
+ tmp->num_resources = 1;
+ tmp->platform_data = header;
+ tmp->pdata_size = sizeof(*header);
+
+ }
+
+ return devm_mfd_add_devices(&pdev->dev, PLATFORM_DEVID_AUTO, cell,
+ header->num_entries, NULL, 0, NULL);
+}
+
+static int
+pmt_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ u16 vid;
+ u32 table;
+ int ret, pos = 0, last_pos = 0;
+ struct pmt_platform_info *info;
+ struct intel_dvsec_header header;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ info = devm_kmemdup(&pdev->dev, (void *)id->driver_data, sizeof(*info),
+ GFP_KERNEL);
+ if (!info)
+ return -ENOMEM;
+
+ while ((pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC))) {
+ pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vid);
+ if (vid != PCI_VENDOR_ID_INTEL)
+ continue;
+
+ pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER2,
+ &header.id);
+
+ pci_read_config_byte(pdev, pos + INTEL_DVSEC_ENTRIES,
+ &header.num_entries);
+
+ pci_read_config_byte(pdev, pos + INTEL_DVSEC_SIZE,
+ &header.entry_size);
+
+ if (!header.num_entries || !header.entry_size)
+ return -EINVAL;
+
+ pci_read_config_dword(pdev, pos + INTEL_DVSEC_TABLE,
+ &table);
+
+ header.tbir = INTEL_DVSEC_TABLE_BAR(table);
+ header.offset = INTEL_DVSEC_TABLE_OFFSET(table);
+ ret = pmt_add_dev(pdev, &header, info);
+ if (ret)
+ dev_warn(&pdev->dev,
+ "Failed to add devices for DVSEC id %d\n",
+ header.id);
+ last_pos = pos;
+ }
+
+ if (!last_pos) {
+ dev_err(&pdev->dev, "No supported PMT capabilities found.\n");
+ return -ENODEV;
+ }
+
+ pm_runtime_put(&pdev->dev);
+ pm_runtime_allow(&pdev->dev);
+
+ return 0;
+}
+
+static void pmt_pci_remove(struct pci_dev *pdev)
+{
+ pm_runtime_forbid(&pdev->dev);
+ pm_runtime_get_sync(&pdev->dev);
+}
+
+#define PCI_DEVICE_ID_INTEL_PMT_TGL 0x9a0d
+
+static const struct pci_device_id pmt_pci_ids[] = {
+ { PCI_DEVICE_DATA(INTEL, PMT_TGL, &tgl_info) },
+ { }
+};
+MODULE_DEVICE_TABLE(pci, pmt_pci_ids);
+
+static struct pci_driver pmt_pci_driver = {
+ .name = "intel-pmt",
+ .id_table = pmt_pci_ids,
+ .probe = pmt_pci_probe,
+ .remove = pmt_pci_remove,
+};
+
+module_pci_driver(pmt_pci_driver);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel Platform Monitoring Technology MFD driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/intel-dvsec.h b/include/linux/intel-dvsec.h
new file mode 100644
index 000000000000..87bb67fd62f7
--- /dev/null
+++ b/include/linux/intel-dvsec.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef INTEL_DVSEC_H
+#define INTEL_DVSEC_H
+
+#include <linux/types.h>
+
+#define DVSEC_INTEL_ID_TELEM 2
+#define DVSEC_INTEL_ID_WATCHER 3
+#define DVSEC_INTEL_ID_CRASHLOG 4
+
+#define TELEM_DEV_NAME "pmt_telemetry"
+#define WATCHER_DEV_NAME "pmt_watcher"
+#define CRASHLOG_DEV_NAME "pmt_crashlog"
+
+/* Intel DVSEC capability vendor space offsets */
+#define INTEL_DVSEC_ENTRIES 0xA
+#define INTEL_DVSEC_SIZE 0xB
+#define INTEL_DVSEC_TABLE 0xC
+#define INTEL_DVSEC_TABLE_BAR(x) ((x) & GENMASK(2, 0))
+#define INTEL_DVSEC_TABLE_OFFSET(x) ((x) >> 3)
+
+#define INTEL_DVSEC_ENTRY_SIZE 4
+
+/* DVSEC header */
+struct intel_dvsec_header {
+ u16 length;
+ u16 id;
+ u8 num_entries;
+ u8 entry_size;
+ u8 entry_max;
+ u8 tbir;
+ u32 offset;
+};
+
+enum pmt_quirks {
+ /* Watcher capability not supported */
+ PMT_QUIRK_NO_WATCHER = (1 << 0),
+
+ /* Crashlog capability not supported */
+ PMT_QUIRK_NO_CRASHLOG = (1 << 1),
+};
+
+struct pmt_platform_info {
+ unsigned long quirks;
+ struct intel_dvsec_header **capabilities;
+};
+
+#endif
--
2.20.1

2020-05-08 02:23:09

by David E. Box

Subject: [PATCH v2 3/3] platform/x86: Intel PMT Telemetry capability driver

PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
The Telemetry capability provides access to device telemetry metrics that
provide hardware performance data to users from continuous, memory mapped,
read-only register spaces.

Register mappings are not provided by the driver. Instead, a GUID is read
from a header for each endpoint. The GUID identifies the device and is to
be used with an XML, provided by the vendor, to discover the available set
of metrics and their register mapping. This allows firmware updates to
modify the register space without needing to update the driver every time
with new mappings. Firmware writes a new GUID in this case to specify the
new mapping. Software tools with access to the associated XML file can
then interpret the changes.
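
To make the header layout concrete, here is a userspace sketch that decodes
the first header dword with the same bit fields as the TELEM_ACCESS(),
TELEM_TYPE() and TELEM_SIZE() macros in this patch. The struct and function
names are hypothetical. Treating the size field as a dword count is an
assumption drawn from the "size is in bytes" comment and the shift by 10 in
the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded view of the first telemetry header dword. */
struct telem_hdr_fields {
	uint8_t  access_type;  /* bits 3:0 */
	uint8_t  telem_type;   /* bits 7:4 */
	uint32_t size_bytes;   /* bits 27:12, scaled to bytes */
};

static struct telem_hdr_fields telem_decode(uint32_t v)
{
	struct telem_hdr_fields f;

	f.access_type = v & 0xFu;
	f.telem_type = (v >> 4) & 0xFu;
	/* Mask bits 27:12, then shift by 10: equivalent to (field >> 12) * 4. */
	f.size_bytes = (v & 0x0FFFF000u) >> 10;

	/* The result is always a whole number of dwords. */
	assert(f.size_bytes % 4 == 0);
	return f;
}
```

The GUID itself sits in the next dword (TELEM_GUID_OFFSET, 0x4) and is passed
through unchanged; only tools holding the matching XML can interpret it.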

This module manages access to all PMT Telemetry endpoints on a system,
regardless of the device exporting them. It creates a pmt_telemetry class
to manage the list. For each endpoint, sysfs files provide GUID and size
information as well as a pointer to the parent device the telemetry comes
from. Software may discover the association between endpoints and devices
by iterating through the list in sysfs, or by looking for the existence of
the class folder under the device of interest. A device node of the same
name allows software to then map the telemetry space for direct access.
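
The mapping step has one subtlety: mmap works in whole pages, while the
telemetry region need not start or end on a page boundary. The sketch below
reproduces the page-span arithmetic of the mmap size check quoted in the v1
review (PFN_DOWN/PFN_UP). The 4 KiB page size, the function name and any
sample addresses are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12                        /* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1ull << SKETCH_PAGE_SHIFT)

/* Bytes of whole pages needed to cover [base, base + size). */
static uint64_t telem_map_span(uint64_t base, uint64_t size)
{
	/* PFN_DOWN: page frame containing the first byte. */
	uint64_t first_pfn = base >> SKETCH_PAGE_SHIFT;
	/* PFN_UP: first page frame at or past the end of the region. */
	uint64_t end_pfn = (base + size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;

	assert(size > 0);
	return (end_pfn - first_pfn) * SKETCH_PAGE_SIZE;
}
```

A request larger than this span is what the driver rejects with -EINVAL; a
region that straddles a page boundary needs two pages even if it is small.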

Signed-off-by: David E. Box <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
MAINTAINERS | 1 +
drivers/platform/x86/Kconfig | 10 +
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_telem.c | 362 +++++++++++++++++++++++++
4 files changed, 374 insertions(+)
create mode 100644 drivers/platform/x86/intel_pmt_telem.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 367e49d27960..a2a12c1196c4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8737,6 +8737,7 @@ INTEL PMT DRIVER
M: "David E. Box" <[email protected]>
S: Maintained
F: drivers/mfd/intel_pmt.c
+F: drivers/platform/x86/intel_pmt_*

INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
M: Stanislav Yakovlev <[email protected]>
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 0ad7ad8cf8e1..41f66da0e3f9 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -1368,6 +1368,16 @@ config INTEL_TELEMETRY
directly via debugfs files. Various tools may use
this interface for SoC state monitoring.

+config INTEL_PMT_TELEM
+ tristate "Intel Platform Monitoring Technology (PMT) Telemetry driver"
+ help
+ The Intel Platform Monitoring Technology (PMT) Telemetry driver provides
+ access to hardware telemetry metrics on devices that support the
+ feature.
+
+ For more information, see
+ <file:Documentation/ABI/testing/sysfs-class-intel_pmt_telem>
+
endif # X86_PLATFORM_DEVICES

config PMC_ATOM
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 53408d965874..e5cd49e54745 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -139,6 +139,7 @@ obj-$(CONFIG_INTEL_MID_POWER_BUTTON) += intel_mid_powerbtn.o
obj-$(CONFIG_INTEL_MRFLD_PWRBTN) += intel_mrfld_pwrbtn.o
obj-$(CONFIG_INTEL_PMC_CORE) += intel_pmc_core.o intel_pmc_core_pltdrv.o
obj-$(CONFIG_INTEL_PMC_IPC) += intel_pmc_ipc.o
+obj-$(CONFIG_INTEL_PMT_TELEM) += intel_pmt_telem.o
obj-$(CONFIG_INTEL_PUNIT_IPC) += intel_punit_ipc.o
obj-$(CONFIG_INTEL_SCU_IPC) += intel_scu_ipc.o
obj-$(CONFIG_INTEL_SCU_IPC_UTIL) += intel_scu_ipcutil.o
diff --git a/drivers/platform/x86/intel_pmt_telem.c b/drivers/platform/x86/intel_pmt_telem.c
new file mode 100644
index 000000000000..d5aac239bb35
--- /dev/null
+++ b/drivers/platform/x86/intel_pmt_telem.c
@@ -0,0 +1,362 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitoring Technology Telemetry driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Author: "David E. Box" <[email protected]>
+ */
+
+#include <linux/bits.h>
+#include <linux/cdev.h>
+#include <linux/intel-dvsec.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/xarray.h>
+
+/* Telemetry access types */
+#define TELEM_ACCESS_FUTURE 1
+#define TELEM_ACCESS_BARID 2
+#define TELEM_ACCESS_LOCAL 3
+
+#define TELEM_GUID_OFFSET 0x4
+#define TELEM_BASE_OFFSET 0x8
+#define TELEM_TBIR_MASK GENMASK(2, 0)
+#define TELEM_ACCESS(v) ((v) & GENMASK(3, 0))
+#define TELEM_TYPE(v) (((v) & GENMASK(7, 4)) >> 4)
+/* size is in bytes */
+#define TELEM_SIZE(v) (((v) & GENMASK(27, 12)) >> 10)
+
+#define TELEM_XA_START 0
+#define TELEM_XA_MAX INT_MAX
+#define TELEM_XA_LIMIT XA_LIMIT(TELEM_XA_START, TELEM_XA_MAX)
+
+static DEFINE_XARRAY_ALLOC(telem_array);
+
+struct telem_header {
+ u8 access_type;
+ u8 telem_type;
+ u16 size;
+ u32 guid;
+ u32 base_offset;
+ u8 tbir;
+};
+
+struct pmt_telem_priv {
+ struct device *dev;
+ struct intel_dvsec_header *dvsec;
+ struct telem_header header;
+ unsigned long base_addr;
+ void __iomem *disc_table;
+ struct cdev cdev;
+ dev_t devt;
+ int devid;
+};
+
+/*
+ * devfs
+ */
+static int pmt_telem_open(struct inode *inode, struct file *filp)
+{
+ struct pmt_telem_priv *priv;
+ struct pci_driver *pci_drv;
+ struct pci_dev *pci_dev;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ priv = container_of(inode->i_cdev, struct pmt_telem_priv, cdev);
+ pci_dev = to_pci_dev(priv->dev->parent);
+
+ pci_drv = pci_dev_driver(pci_dev);
+ if (!pci_drv)
+ return -ENODEV;
+
+ filp->private_data = priv;
+ get_device(&pci_dev->dev);
+
+ if (!try_module_get(pci_drv->driver.owner)) {
+ put_device(&pci_dev->dev);
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+static int pmt_telem_release(struct inode *inode, struct file *filp)
+{
+ struct pmt_telem_priv *priv = filp->private_data;
+ struct pci_dev *pci_dev = to_pci_dev(priv->dev->parent);
+ struct pci_driver *pci_drv = pci_dev_driver(pci_dev);
+
+ put_device(&pci_dev->dev);
+ module_put(pci_drv->driver.owner);
+
+ return 0;
+}
+
+static int pmt_telem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+ struct pmt_telem_priv *priv = filp->private_data;
+ unsigned long vsize = vma->vm_end - vma->vm_start;
+ unsigned long phys = priv->base_addr;
+ unsigned long pfn = PFN_DOWN(phys);
+ unsigned long psize;
+
+ psize = (PFN_UP(priv->base_addr + priv->header.size) - pfn) * PAGE_SIZE;
+ if (vsize > psize) {
+ dev_err(priv->dev, "Requested mmap size is too large\n");
+ return -EINVAL;
+ }
+
+ if ((vma->vm_flags & VM_WRITE) || (vma->vm_flags & VM_MAYWRITE))
+ return -EPERM;
+
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+ if (io_remap_pfn_range(vma, vma->vm_start, pfn, vsize,
+ vma->vm_page_prot))
+ return -EINVAL;
+
+ return 0;
+}
+
+static const struct file_operations pmt_telem_fops = {
+ .owner = THIS_MODULE,
+ .open = pmt_telem_open,
+ .mmap = pmt_telem_mmap,
+ .release = pmt_telem_release,
+};
+
+/*
+ * sysfs
+ */
+static ssize_t guid_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_priv *priv = dev_get_drvdata(dev);
+
+ return sprintf(buf, "0x%x\n", priv->header.guid);
+}
+static DEVICE_ATTR_RO(guid);
+
+static ssize_t size_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_priv *priv = dev_get_drvdata(dev);
+
+ /* Display buffer size in bytes */
+ return sprintf(buf, "%u\n", priv->header.size);
+}
+static DEVICE_ATTR_RO(size);
+
+static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_priv *priv = dev_get_drvdata(dev);
+
+ /* Display buffer offset in bytes */
+ return sprintf(buf, "%lu\n", offset_in_page(priv->base_addr));
+}
+static DEVICE_ATTR_RO(offset);
+
+static struct attribute *pmt_telem_attrs[] = {
+ &dev_attr_guid.attr,
+ &dev_attr_size.attr,
+ &dev_attr_offset.attr,
+ NULL
+};
+ATTRIBUTE_GROUPS(pmt_telem);
+
+struct class pmt_telem_class = {
+ .owner = THIS_MODULE,
+ .name = "pmt_telemetry",
+ .dev_groups = pmt_telem_groups,
+};
+
+/*
+ * driver initialization
+ */
+static int pmt_telem_create_dev(struct pmt_telem_priv *priv)
+{
+ struct pci_dev *pci_dev;
+ struct device *dev;
+ int ret;
+
+ cdev_init(&priv->cdev, &pmt_telem_fops);
+ ret = cdev_add(&priv->cdev, priv->devt, 1);
+ if (ret) {
+ dev_err(priv->dev, "Could not add char dev\n");
+ return ret;
+ }
+
+ pci_dev = to_pci_dev(priv->dev->parent);
+ dev = device_create(&pmt_telem_class, &pci_dev->dev, priv->devt,
+ priv, "telem%d", priv->devid);
+ if (IS_ERR(dev)) {
+ dev_err(priv->dev, "Could not create device node\n");
+ cdev_del(&priv->cdev);
+ }
+
+ return PTR_ERR_OR_ZERO(dev);
+}
+
+static void pmt_telem_populate_header(void __iomem *disc_offset,
+ struct telem_header *header)
+{
+ header->access_type = TELEM_ACCESS(readb(disc_offset));
+ header->telem_type = TELEM_TYPE(readb(disc_offset));
+ header->size = TELEM_SIZE(readl(disc_offset));
+ header->guid = readl(disc_offset + TELEM_GUID_OFFSET);
+ header->base_offset = readl(disc_offset + TELEM_BASE_OFFSET);
+
+ /*
+ * For non-local access types the lower 3 bits of base offset
+ * contains the index of the base address register where the
+ * telemetry can be found.
+ */
+ header->tbir = header->base_offset & TELEM_TBIR_MASK;
+ header->base_offset ^= header->tbir;
+}
+
+static int pmt_telem_probe(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv;
+ struct pci_dev *parent;
+ int ret;
+
+ priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, priv);
+ priv->dev = &pdev->dev;
+ parent = to_pci_dev(priv->dev->parent);
+
+ priv->dvsec = dev_get_platdata(&pdev->dev);
+ if (!priv->dvsec) {
+ dev_err(&pdev->dev, "Platform data not found\n");
+ return -ENODEV;
+ }
+
+ /* Remap and access the discovery table header */
+ priv->disc_table = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(priv->disc_table))
+ return PTR_ERR(priv->disc_table);
+
+ pmt_telem_populate_header(priv->disc_table, &priv->header);
+
+ /* Local access and BARID only for now */
+ switch (priv->header.access_type) {
+ case TELEM_ACCESS_LOCAL:
+ if (priv->header.tbir) {
+ dev_err(&pdev->dev,
+ "Unsupported BAR index %d for access type %d\n",
+ priv->header.tbir, priv->header.access_type);
+ return -EINVAL;
+ }
+ break;
+
+ case TELEM_ACCESS_BARID:
+ break;
+
+ default:
+ dev_err(&pdev->dev, "Unsupported access type %d\n",
+ priv->header.access_type);
+ return -EINVAL;
+ }
+
+ priv->base_addr = pci_resource_start(parent, priv->header.tbir) +
+ priv->header.base_offset;
+
+ ret = alloc_chrdev_region(&priv->devt, 0, 1, TELEM_DEV_NAME);
+ if (ret) {
+ dev_err(&pdev->dev,
+ "PMT telemetry chrdev_region error: %d\n", ret);
+ return ret;
+ }
+
+ ret = xa_alloc(&telem_array, &priv->devid, priv, TELEM_XA_LIMIT,
+ GFP_KERNEL);
+ if (ret)
+ goto fail_xa_alloc;
+
+ ret = pmt_telem_create_dev(priv);
+ if (ret)
+ goto fail_create_dev;
+
+ return 0;
+
+fail_create_dev:
+ xa_erase(&telem_array, priv->devid);
+fail_xa_alloc:
+ unregister_chrdev_region(priv->devt, 1);
+
+ return ret;
+}
+
+static int pmt_telem_remove(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv = platform_get_drvdata(pdev);
+
+ device_destroy(&pmt_telem_class, priv->devt);
+ cdev_del(&priv->cdev);
+
+ xa_erase(&telem_array, priv->devid);
+ unregister_chrdev_region(priv->devt, 1);
+
+ return 0;
+}
+
+static const struct platform_device_id pmt_telem_table[] = {
+ {
+ .name = TELEM_DEV_NAME,
+ },
+ {}
+};
+MODULE_DEVICE_TABLE(platform, pmt_telem_table);
+
+static struct platform_driver pmt_telem_driver = {
+ .driver = {
+ .name = TELEM_DEV_NAME,
+ },
+ .probe = pmt_telem_probe,
+ .remove = pmt_telem_remove,
+ .id_table = pmt_telem_table,
+};
+
+static int __init pmt_telem_init(void)
+{
+ int ret;
+
+ ret = class_register(&pmt_telem_class);
+ if (ret)
+ return ret;
+
+ ret = platform_driver_register(&pmt_telem_driver);
+ if (ret)
+ class_unregister(&pmt_telem_class);
+
+ return ret;
+}
+module_init(pmt_telem_init);
+
+static void __exit pmt_telem_exit(void)
+{
+ platform_driver_unregister(&pmt_telem_driver);
+ class_unregister(&pmt_telem_class);
+ xa_destroy(&telem_array);
+}
+module_exit(pmt_telem_exit);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel PMT Telemetry driver");
+MODULE_ALIAS("platform:" TELEM_DEV_NAME);
+MODULE_LICENSE("GPL v2");
--
2.20.1
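
For readers following the header parsing in pmt_telem_populate_header(), here is a minimal userspace sketch of how the bit fields decode. It mirrors the TELEM_ACCESS/TELEM_TYPE/TELEM_SIZE/TELEM_TBIR_MASK macros from the patch. GENMASK32, struct telem_fields and telem_decode() are hypothetical names used only for illustration. The reading that bits 27:12 hold a count of 32-bit words is an assumption drawn from the ">> 10" shift combined with the "size is in bytes" comment.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l), 32-bit only. */
#define GENMASK32(h, l) ((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/* Hypothetical container for the decoded fields (not the driver's struct). */
struct telem_fields {
	uint8_t  access_type;	/* bits 3:0 of the first dword */
	uint8_t  telem_type;	/* bits 7:4 of the first dword */
	uint32_t size_bytes;	/* bits 27:12, assumed to be a dword count */
	uint8_t  tbir;		/* low 3 bits of the base offset dword */
	uint32_t base_offset;	/* base offset with the BAR index cleared */
};

static struct telem_fields telem_decode(uint32_t dword0, uint32_t base)
{
	struct telem_fields f;

	f.access_type = dword0 & GENMASK32(3, 0);
	f.telem_type  = (dword0 & GENMASK32(7, 4)) >> 4;
	/* ">> 12" extracts the field, "<< 2" scales to bytes: net ">> 10". */
	f.size_bytes  = (dword0 & GENMASK32(27, 12)) >> 10;
	f.tbir        = base & GENMASK32(2, 0);
	/* Same result as the patch's "base_offset ^= tbir". */
	f.base_offset = base & ~GENMASK32(2, 0);

	return f;
}
```

With dword0 = 0x00040012 and base = 0x1002, this yields access type 2 (TELEM_ACCESS_BARID), type 1, a size of 256 bytes, BAR index 2 and a base offset of 0x1000. The sketch keeps the size in a 32-bit field because a full 16-bit count multiplied by 4 does not fit in 16 bits.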

2020-05-08 02:37:37

by David E. Box

Subject: Re: [PATCH 3/3] platform/x86: Intel PMT Telemetry capability driver

On Tue, 2020-05-05 at 16:49 +0300, Andy Shevchenko wrote:
> ...
>
> > + /* TODO: replace with device properties??? */
>
> So, please, fulfill. swnode I guess is what you are looking for.

I kept the platform data in v2 because swnode properties don't look
like a good fit. We are only passing information that was read from the
PCI device. It is not hard-coded, platform-specific data.

David

2020-05-08 09:17:28

by Andy Shevchenko

Subject: Re: [PATCH v2 2/3] mfd: Intel Platform Monitoring Technology support

On Fri, May 8, 2020 at 5:18 AM David E. Box <[email protected]> wrote:
>
> Intel Platform Monitoring Technology (PMT) is an architecture for
> enumerating and accessing hardware monitoring facilities. PMT supports
> multiple types of monitoring capabilities. This driver creates platform
> devices for each type so that they may be managed by capability specific
> drivers (to be introduced). Capabilities are discovered using PCIe DVSEC
> ids. Support is included for the 3 current capability types, Telemetry,
> Watcher, and Crashlog. The features are available on new Intel platforms
> starting from Tiger Lake for which support is added. Tiger Lake however
> will not support Watcher and Crashlog even though the capabilities appear
> on the device. So add a quirk facility and use it to disable them.

Thank you for the update.
Some nitpicks below.

...

> + case DVSEC_INTEL_ID_TELEM:

Is this from the spec? Or can we also spell TELEMETRY ?

> + name = TELEM_DEV_NAME;

Ditto for all occurrences.

> + break;

...

> + cell = devm_kcalloc(&pdev->dev, header->num_entries,
> + sizeof(*cell), GFP_KERNEL);

I think if you use a temporary
struct device *dev = &pdev->dev;
you may squeeze this to one line and make others smaller as well.

> + if (!cell)
> + return -ENOMEM;

...

> + res->start = pdev->resource[header->tbir].start +
> + header->offset +
> + (i * (INTEL_DVSEC_ENTRY_SIZE << 2));

Outer parentheses are redundant. And perhaps last two lines can be one.

...

> +static int
> +pmt_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> + u16 vid;
> + u32 table;

> + int ret, pos = 0, last_pos = 0;

Redundant assignment of pos.

> + while ((pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC))) {
> + pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vid);
> + if (vid != PCI_VENDOR_ID_INTEL)
> + continue;
> +

> + last_pos = pos;

Can we simply use a boolean flag?

> + }
> +
> + if (!last_pos) {
> + dev_err(&pdev->dev, "No supported PMT capabilities found.\n");
> + return -ENODEV;
> + }

> +}

...

> +};

> +

Extra blank line.

> +module_pci_driver(pmt_pci_driver);

...

+ bits.h since GENMASK() is in use.

> +#include <linux/types.h>

...

> +enum pmt_quirks {
> + /* Watcher capability not supported */
> + PMT_QUIRK_NO_WATCHER = (1 << 0),

BIT() ?

> +
> + /* Crashlog capability not supported */
> + PMT_QUIRK_NO_CRASHLOG = (1 << 1),

BIT() ?

> +};

--
With Best Regards,
Andy Shevchenko
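
Two of the suggestions above (BIT() for the quirk values, and a boolean flag in place of last_pos) can be sketched in standalone C as follows. This is only an illustration, not the v2 code: BIT() is redefined locally, enum pmt_cap and its members are hypothetical, and the config-space walk is modelled as a plain array of DVSEC vendor IDs so the snippet can run outside the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n)			(1UL << (n))	/* local stand-in for linux/bits.h */
#define PCI_VENDOR_ID_INTEL	0x8086

enum pmt_quirks {
	/* Watcher capability not supported */
	PMT_QUIRK_NO_WATCHER	= BIT(0),
	/* Crashlog capability not supported */
	PMT_QUIRK_NO_CRASHLOG	= BIT(1),
};

/* Hypothetical capability identifiers, for illustration only. */
enum pmt_cap { PMT_CAP_TELEMETRY, PMT_CAP_WATCHER, PMT_CAP_CRASHLOG };

/* Returns false when a quirk disables the given capability. */
static bool pmt_cap_enabled(enum pmt_cap cap, unsigned long quirks)
{
	switch (cap) {
	case PMT_CAP_WATCHER:
		return !(quirks & PMT_QUIRK_NO_WATCHER);
	case PMT_CAP_CRASHLOG:
		return !(quirks & PMT_QUIRK_NO_CRASHLOG);
	default:
		return true;
	}
}

/* Boolean-flag form of the "did the walk find anything" check. */
static bool any_intel_dvsec(const uint16_t *vids, size_t n)
{
	bool found = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (vids[i] == PCI_VENDOR_ID_INTEL)
			found = true;
	}

	return found;
}
```

The flag records only whether any Intel DVSEC was seen, which is all the -ENODEV check after the loop needs.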

2020-05-08 10:00:20

by Andy Shevchenko

Subject: Re: [PATCH v2 3/3] platform/x86: Intel PMT Telemetry capability driver

On Fri, May 8, 2020 at 5:18 AM David E. Box <[email protected]> wrote:
>
> PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
> The Telemetry capability provides access to device telemetry metrics that
> provide hardware performance data to users from continuous, memory mapped,
> read-only register spaces.
>
> Register mappings are not provided by the driver. Instead, a GUID is read
> from a header for each endpoint. The GUID identifies the device and is to
> be used with an XML, provided by the vendor, to discover the available set
> of metrics and their register mapping. This allows firmware updates to
> modify the register space without needing to update the driver every time
> with new mappings. Firmware writes a new GUID in this case to specify the
> new mapping. Software tools with access to the associated XML file can
> then interpret the changes.
>
> This module manages access to all PMT Telemetry endpoints on a system,
> regardless of the device exporting them. It creates a pmt_telemetry class
> to manage the list. For each endpoint, sysfs files provide GUID and size
> information as well as a pointer to the parent device the telemetry comes
> from. Software may discover the association between endpoints and devices
> by iterating through the list in sysfs, or by looking for the existence of

ABI needs documentation.

> the class folder under the device of interest. A device node of the same
> name allows software to then map the telemetry space for direct access.

...

> +config INTEL_PMT_TELEM

TELEMETRY

...

> +obj-$(CONFIG_INTEL_PMT_TELEM) += intel_pmt_telem.o

telemetry

(Inside the file it's fine to have telem)

...

> + priv->dvsec = dev_get_platdata(&pdev->dev);
> + if (!priv->dvsec) {
> + dev_err(&pdev->dev, "Platform data not found\n");
> + return -ENODEV;
> + }

I don't see how it is being used.

--
With Best Regards,
Andy Shevchenko

2020-05-08 10:01:33

by Andy Shevchenko

Subject: Re: [PATCH v2 0/3] Intel Platform Monitoring Technology

On Fri, May 8, 2020 at 5:18 AM David E. Box <[email protected]> wrote:
>
> Intel Platform Monitoring Technology (PMT) is an architecture for
> enumerating and accessing hardware monitoring capabilities on a device.
> With customers increasingly asking for hardware telemetry, engineers not
> only have to figure out how to measure and collect data, but also how to
> deliver it and make it discoverable. The latter may be through some device
> specific method requiring device specific tools to collect the data. This
> in turn requires customers to manage a suite of different tools in order to
> collect the differing assortment of monitoring data on their systems. Even
> when such information can be provided in kernel drivers, they may require
> constant maintenance to update register mappings as they change with
> firmware updates and new versions of hardware. PMT provides a solution for
> discovering and reading telemetry from a device through a hardware agnostic
> framework that allows for updates to systems without requiring patches to
> the kernel or software tools.
>
> PMT defines several capabilities to support collecting monitoring data from
> hardware. All are discoverable as separate instances of the PCIe Designated
> Vendor extended capability (DVSEC) with the Intel vendor code. The DVSEC ID
> field uniquely identifies the capability. Each DVSEC also provides a BAR
> offset to a header that defines capability-specific attributes, including
> GUID, feature type, offset and length, as well as configuration settings
> where applicable. The GUID uniquely identifies the register space of any
> monitor data exposed by the capability. The GUID is associated with an XML
> file from the vendor that describes the mapping of the register space along
> with properties of the monitor data. This allows vendors to perform
> firmware updates that can change the mapping (e.g. add new metrics) without
> requiring any changes to drivers or software tools. The new mapping is
> confirmed by an updated GUID, read from the hardware, which software uses
> with a new XML.
>
> The current capabilities defined by PMT are Telemetry, Watcher, and
> Crashlog. The Telemetry capability provides access to a continuous block
> of read-only data. The Watcher capability provides access to hardware
> sampling and tracing features. Crashlog provides access to device crash
> dumps. While there is some relationship between capabilities (Watcher can
> be configured to sample from the Telemetry data set), each exists as a
> standalone feature with no dependency on any other. The design therefore
> splits them into individual, capability-specific drivers. MFD is used to
> create platform devices for each capability so that they may be managed by
> their own driver. The PMT architecture is (for the most part) agnostic to
> the type of device it can collect from. Device nodes are consequently generic
> in naming, e.g. /dev/telem<n> and /dev/smplr<n>. Each capability driver
> creates a class to manage the list of devices supporting it. Software can
> determine which devices support a PMT feature by searching through each
> device node entry in the sysfs class folder. It can additionally determine
> if a particular device supports a PMT feature by checking for a PMT class
> folder in the device folder.
>
> This patch set provides support for the PMT framework, along with support
> for Telemetry on Tiger Lake.
>

Some nitpicks on the individual patches. Also, you forgot to send the
series to the PDx86 mailing list and its maintainers (only I was included).

> Changes from V1:
>
> - In the telemetry driver, set the device in device_create() to
> the parent pci device (the monitoring device) for clear
> association in sysfs. Was set before to the platform device
> created by the pci parent.
> - Move telem struct into driver and delete unneeded header file.
> - Start telem device numbering from 0 instead of 1. 1 was used
> due to anticipated changes, no longer needed.
> - Use helper macros suggested by Andy S.
> - Rename class to pmt_telemetry, spelling out full name
> - Move monitor device name defines to common header
> - Coding style, spelling, and Makefile/MAINTAINERS ordering fixes
>
> David E. Box (3):
> PCI: Add #defines for Designated Vendor-Specific Capability
> mfd: Intel Platform Monitoring Technology support
> platform/x86: Intel PMT Telemetry capability driver
>
> MAINTAINERS | 6 +
> drivers/mfd/Kconfig | 10 +
> drivers/mfd/Makefile | 1 +
> drivers/mfd/intel_pmt.c | 170 ++++++++++++
> drivers/platform/x86/Kconfig | 10 +
> drivers/platform/x86/Makefile | 1 +
> drivers/platform/x86/intel_pmt_telem.c | 362 +++++++++++++++++++++++++
> include/linux/intel-dvsec.h | 48 ++++
> include/uapi/linux/pci_regs.h | 5 +
> 9 files changed, 613 insertions(+)
> create mode 100644 drivers/mfd/intel_pmt.c
> create mode 100644 drivers/platform/x86/intel_pmt_telem.c
> create mode 100644 include/linux/intel-dvsec.h
>
> --
> 2.20.1
>


--
With Best Regards,
Andy Shevchenko

2020-05-09 16:33:09

by David E. Box

Subject: Re: [PATCH v2 3/3] platform/x86: Intel PMT Telemetry capability driver

On Fri, 2020-05-08 at 12:57 +0300, Andy Shevchenko wrote:
> On Fri, May 8, 2020 at 5:18 AM David E. Box <
> [email protected]> wrote:
> > PMT Telemetry is a capability of the Intel Platform Monitoring
> > Technology.
> > The Telemetry capability provides access to device telemetry
> > metrics that
> > provide hardware performance data to users from continuous, memory
> > mapped,
> > read-only register spaces.
> >
> > Register mappings are not provided by the driver. Instead, a GUID
> > is read
> > from a header for each endpoint. The GUID identifies the device and
> > is to
> > be used with an XML, provided by the vendor, to discover the
> > available set
> > of metrics and their register mapping. This allows firmware
> > updates to
> > modify the register space without needing to update the driver
> > every time
> > with new mappings. Firmware writes a new GUID in this case to
> > specify the
> > new mapping. Software tools with access to the associated XML file
> > can
> > then interpret the changes.
> >
> > This module manages access to all PMT Telemetry endpoints on a
> > system,
> > regardless of the device exporting them. It creates a pmt_telemetry
> > class
> > to manage the list. For each endpoint, sysfs files provide GUID and
> > size
> > information as well as a pointer to the parent device the telemetry
> > comes
> > from. Software may discover the association between endpoints and
> > devices
> > by iterating through the list in sysfs, or by looking for the
> > existence of
>
> ABI needs documentation.

We will be releasing a Linux software spec for PMT. We are waiting on
public release of the PMT spec. For this patch we did document the
sysfs class ABI.

>
> > the class folder under the device of interest. A device node of
> > the same
> > name allows software to then map the telemetry space for direct
> > access.
>
> ...
>
> > +config INTEL_PMT_TELEM
>
> TELEMETRY
>
> ...
>
> > +obj-$(CONFIG_INTEL_PMT_TELEM) += intel_pmt_telem.o
>
> telemetry
>
> (Inside the file it's fine to have telem)
>
> ...
>
> > + priv->dvsec = dev_get_platdata(&pdev->dev);
> > + if (!priv->dvsec) {
> > + dev_err(&pdev->dev, "Platform data not found\n");
> > + return -ENODEV;
> > + }
>
> I don't see how it is being used.

Good catch :). This was initially used to pass the DVSEC info from the
PCI device to the telemetry driver. But with the recent changes, all of the
needed info is now read from the driver's memory resource. I had not noticed
that the dvsec fields are no longer used. Will remove in the next version.

Okay on other comments.

David

2020-07-14 06:22:59

by David E. Box

Subject: [PATCH V3 1/3] PCI: Add defines for Designated Vendor-Specific Capability

Add PCIe DVSEC extended capability ID and defines for the header offsets.
Defined in PCIe r5.0, sec 7.9.6.

Signed-off-by: David E. Box <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
include/uapi/linux/pci_regs.h | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index f9701410d3b5..09daa9f07b6b 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -720,6 +720,7 @@
#define PCI_EXT_CAP_ID_DPC 0x1D /* Downstream Port Containment */
#define PCI_EXT_CAP_ID_L1SS 0x1E /* L1 PM Substates */
#define PCI_EXT_CAP_ID_PTM 0x1F /* Precision Time Measurement */
+#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
#define PCI_EXT_CAP_ID_DLF 0x25 /* Data Link Feature */
#define PCI_EXT_CAP_ID_PL_16GT 0x26 /* Physical Layer 16.0 GT/s */
#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT
@@ -1062,6 +1063,10 @@
#define PCI_L1SS_CTL1_LTR_L12_TH_SCALE 0xe0000000 /* LTR_L1.2_THRESHOLD_Scale */
#define PCI_L1SS_CTL2 0x0c /* Control 2 Register */

+/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
+#define PCI_DVSEC_HEADER1 0x4 /* Vendor-Specific Header1 */
+#define PCI_DVSEC_HEADER2 0x8 /* Vendor-Specific Header2 */
+
/* Data Link Feature */
#define PCI_DLF_CAP 0x04 /* Capabilities Register */
#define PCI_DLF_EXCHANGE_ENABLE 0x80000000 /* Data Link Feature Exchange Enable */
--
2.20.1
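
To illustrate what sits at the two offsets defined above, here is a small sketch that decodes the header dwords, assuming the standard PCIe DVSEC layout: Header 1 carries the vendor ID in bits 15:0, the revision in bits 19:16 and the length in bits 31:20, and Header 2 carries the DVSEC ID in bits 15:0. struct dvsec_info and dvsec_decode() are hypothetical helpers and are not part of this patch.

```c
#include <stdint.h>

#define PCI_DVSEC_HEADER1	0x4	/* Vendor-Specific Header1 */
#define PCI_DVSEC_HEADER2	0x8	/* Vendor-Specific Header2 */

/* Hypothetical holder for the decoded header fields. */
struct dvsec_info {
	uint16_t vendor;	/* Header1 bits 15:0 */
	uint8_t  rev;		/* Header1 bits 19:16 */
	uint16_t len;		/* Header1 bits 31:20, length in bytes */
	uint16_t id;		/* Header2 bits 15:0 */
};

/*
 * hdr1 and hdr2 are the dwords read from config space at
 * pos + PCI_DVSEC_HEADER1 and pos + PCI_DVSEC_HEADER2, where pos is
 * the offset of the DVSEC extended capability.
 */
static struct dvsec_info dvsec_decode(uint32_t hdr1, uint32_t hdr2)
{
	struct dvsec_info info;

	info.vendor = hdr1 & 0xffff;
	info.rev    = (hdr1 >> 16) & 0xf;
	info.len    = (hdr1 >> 20) & 0xfff;
	info.id     = hdr2 & 0xffff;

	return info;
}
```

For example, hdr1 = 0x01018086 decodes to vendor 0x8086, revision 1 and length 16. A 16-bit config read at pos + PCI_DVSEC_HEADER1 returns only the vendor ID, which is how the MFD driver in this series filters for Intel.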

2020-07-14 06:23:07

by David E. Box

Subject: [PATCH V3 0/3] Intel Platform Monitoring Technology

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring capabilities on a device.
With customers increasingly asking for hardware telemetry, engineers not
only have to figure out how to measure and collect data, but also how to
deliver it and make it discoverable. The latter may be through some device
specific method requiring device specific tools to collect the data. This
in turn requires customers to manage a suite of different tools in order to
collect the differing assortment of monitoring data on their systems. Even
when such information can be provided in kernel drivers, they may require
constant maintenance to update register mappings as they change with
firmware updates and new versions of hardware. PMT provides a solution for
discovering and reading telemetry from a device through a hardware agnostic
framework that allows for updates to systems without requiring patches to
the kernel or software tools.

PMT defines several capabilities to support collecting monitoring data from
hardware. All are discoverable as separate instances of the PCIe Designated
Vendor extended capability (DVSEC) with the Intel vendor code. The DVSEC ID
field uniquely identifies the capability. Each DVSEC also provides a BAR
offset to a header that defines capability-specific attributes, including
GUID, feature type, offset and length, as well as configuration settings
where applicable. The GUID uniquely identifies the register space of any
monitor data exposed by the capability. The GUID is associated with an XML
file from the vendor that describes the mapping of the register space along
with properties of the monitor data. This allows vendors to perform
firmware updates that can change the mapping (e.g. add new metrics) without
requiring any changes to drivers or software tools. The new mapping is
confirmed by an updated GUID, read from the hardware, which software uses
with a new XML.

The current capabilities defined by PMT are Telemetry, Watcher, and
Crashlog. The Telemetry capability provides access to a continuous block
of read-only data. The Watcher capability provides access to hardware
sampling and tracing features. Crashlog provides access to device crash
dumps. While there is some relationship between capabilities (Watcher can
be configured to sample from the Telemetry data set), each exists as a
standalone feature with no dependency on any other. The design therefore
splits them into individual, capability-specific drivers. MFD is used to
create platform devices for each capability so that they may be managed by
their own driver. The PMT architecture is (for the most part) agnostic to
the type of device it can collect from. Device nodes are consequently generic
in naming, e.g. /dev/telem<n> and /dev/smplr<n>. Each capability driver
creates a class to manage the list of devices supporting it. Software can
determine which devices support a PMT feature by searching through each
device node entry in the sysfs class folder. It can additionally determine
if a particular device supports a PMT feature by checking for a PMT class
folder in the device folder.

This patch set provides support for the PMT framework, along with support
for Telemetry on Tiger Lake.

Changes from V2:

Please excuse this delayed V3 as we dealt with last minute hardware
changes.

- In order to handle certain HW bugs from the telemetry capability
driver, create a single platform device per capability instead of
a device per entry. Add the entry data as device resources and
let the capability driver manage them as a set allowing for
cleaner HW bug resolution.
- Handle discovery table offset bug in intel_pmt.c
- Handle overlapping regions in intel_pmt_telemetry.c
- Add description of sysfs class to testing ABI.
- Don't check size and count until confirming support for the PMT
capability to avoid bailing out when we need to skip it.
- Remove unneeded header file. Move code to the intel_pmt.c, the
only place where it's needed.
- Remove now unused platform data.
- Add missing header files types.h, bits.h.
- Rename file name and build options from telem to telemetry.
- Code cleanup suggested by Andy S.
- x86 mailing list added.

Changes from V1:

- In the telemetry driver, set the device in device_create() to
the parent pci device (the monitoring device) for clear
association in sysfs. Was set before to the platform device
created by the pci parent.
- Move telem struct into driver and delete unneeded header file.
- Start telem device numbering from 0 instead of 1. 1 was used
due to anticipated changes, no longer needed.
- Use helper macros suggested by Andy S.
- Rename class to pmt_telemetry, spelling out full name
- Move monitor device name defines to common header
- Coding style, spelling, and Makefile/MAINTAINERS ordering fixes

David E. Box (3):
PCI: Add defines for Designated Vendor-Specific Capability
mfd: Intel Platform Monitoring Technology support
platform/x86: Intel PMT Telemetry capability driver

.../ABI/testing/sysfs-class-pmt_telemetry | 46 ++
MAINTAINERS | 6 +
drivers/mfd/Kconfig | 10 +
drivers/mfd/Makefile | 1 +
drivers/mfd/intel_pmt.c | 218 +++++++++
drivers/platform/x86/Kconfig | 10 +
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_telemetry.c | 454 ++++++++++++++++++
include/uapi/linux/pci_regs.h | 5 +
9 files changed, 751 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-class-pmt_telemetry
create mode 100644 drivers/mfd/intel_pmt.c
create mode 100644 drivers/platform/x86/intel_pmt_telemetry.c

--
2.20.1
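
As a rough picture of the userspace side described in this cover letter, the sketch below reads the size and offset attributes of one endpoint and maps its device node. All helper names are hypothetical. It assumes the mmap behaviour of the v2 telemetry driver posted earlier in this thread: the mapping starts at the page containing the telemetry region, "offset" is the position of the data within that first page, and any mapping that could become writable is rejected, hence O_RDONLY together with MAP_SHARED. Paths follow the sysfs-class-pmt_telemetry ABI text in patch 3/3.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Read one number from a sysfs attribute; base 0 also accepts "0x..." text.
 * Returns -1 if the file cannot be read or does not start with a number.
 */
static long read_sysfs_long(const char *path)
{
	char buf[32], *end;
	long val;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	val = strtol(buf, &end, 0);
	return end == buf ? -1 : val;
}

/* Bytes to map: the in-page offset plus the size, rounded up to whole pages. */
static size_t telem_map_len(size_t offset, size_t size, size_t page)
{
	return (offset + size + page - 1) / page * page;
}

/* Map a telemetry device node read-only; returns NULL on any failure. */
static void *telem_map(const char *devnode, size_t len)
{
	int fd = open(devnode, O_RDONLY);
	void *base;

	if (fd < 0)
		return NULL;
	/* MAP_PRIVATE would keep VM_MAYWRITE set, which the v2 driver refuses. */
	base = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the descriptor is closed */

	return base == MAP_FAILED ? NULL : base;
}
```

A caller would read /sys/class/pmt_telemetry/telem0/size and .../offset with read_sysfs_long(), compute the length with telem_map_len(offset, size, sysconf(_SC_PAGESIZE)), call telem_map("/dev/telem0", len), and then find the first telemetry byte at (char *)base + offset. The v2 driver also required CAP_SYS_ADMIN to open the node, so on that version the open() fails for unprivileged users.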

2020-07-14 06:23:19

by David E. Box

Subject: [PATCH V3 3/3] platform/x86: Intel PMT Telemetry capability driver

PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
The Telemetry capability provides access to device telemetry metrics that
provide hardware performance data to users from continuous, memory mapped,
read-only register spaces.

Register mappings are not provided by the driver. Instead, a GUID is read
from a header for each endpoint. The GUID identifies the device and is to
be used with an XML, provided by the vendor, to discover the available set
of metrics and their register mapping. This allows firmware updates to
modify the register space without needing to update the driver every time
with new mappings. Firmware writes a new GUID in this case to specify the
new mapping. Software tools with access to the associated XML file can
then interpret the changes.

This module manages access to all PMT Telemetry endpoints on a system,
independent of the device exporting them. It creates a pmt_telemetry class
to manage the devices. For each telemetry endpoint, sysfs files provide
GUID and size information as well as a pointer to the parent device the
telemetry came from. Software may discover the association between
endpoints and devices by iterating through the list in sysfs, or by looking
for the existence of the class folder under the device of interest. A
device node of the same name allows software to then map the telemetry
space for direct access.

This patch also creates a PCI device ID list for early telemetry hardware
that requires workarounds for known issues.

Signed-off-by: David E. Box <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
.../ABI/testing/sysfs-class-pmt_telemetry | 46 ++
MAINTAINERS | 1 +
drivers/platform/x86/Kconfig | 10 +
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_telemetry.c | 454 ++++++++++++++++++
5 files changed, 512 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-class-pmt_telemetry
create mode 100644 drivers/platform/x86/intel_pmt_telemetry.c

diff --git a/Documentation/ABI/testing/sysfs-class-pmt_telemetry b/Documentation/ABI/testing/sysfs-class-pmt_telemetry
new file mode 100644
index 000000000000..381924549ecb
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-pmt_telemetry
@@ -0,0 +1,46 @@
+What: /sys/class/pmt_telemetry/
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ The pmt_telemetry/ class directory contains information for
+ devices that expose hardware telemetry using Intel Platform
+ Monitoring Technology (PMT).
+
+What: /sys/class/pmt_telemetry/telem<x>
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ The telem<x> directory contains files describing an instance of
+ a PMT telemetry device that exposes hardware telemetry. Each
+ telem<x> directory has an associated /dev/telem<x> node. This
+ node may be opened and mapped to access the telemetry space of
+ the device. The register layout of the telemetry space is
+ determined from an XML file that matches the PCI device ID and
+ GUID for the device.
+
+What: /sys/class/pmt_telemetry/telem<x>/guid
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ (RO) The GUID for this telemetry device. The GUID identifies
+ the version of the XML file for the parent device that is to
+ be used to get the register layout.
+
+What: /sys/class/pmt_telemetry/telem<x>/size
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ (RO) The size of the telemetry region in bytes that corresponds to
+ the mapping size for the /dev/telem<x> device node.
+
+What: /sys/class/pmt_telemetry/telem<x>/offset
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ (RO) The offset of the telemetry region in bytes that corresponds to
+ the mapping for the /dev/telem<x> device node.
diff --git a/MAINTAINERS b/MAINTAINERS
index 2e42bf0c41ab..ebc145894abd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8849,6 +8849,7 @@ INTEL PMT DRIVER
M: "David E. Box" <[email protected]>
S: Maintained
F: drivers/mfd/intel_pmt.c
+F: drivers/platform/x86/intel_pmt_*

INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
M: Stanislav Yakovlev <[email protected]>
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 0581a54cf562..5e1f7ce6e69f 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -1396,6 +1396,16 @@ config INTEL_TELEMETRY
directly via debugfs files. Various tools may use
this interface for SoC state monitoring.

+config INTEL_PMT_TELEMETRY
+ tristate "Intel Platform Monitoring Technology (PMT) Telemetry driver"
+ help
+ The Intel Platform Monitoring Technology (PMT) Telemetry driver provides
+ access to hardware telemetry metrics on devices that support the
+ feature.
+
+ For more information, see
+ <file:Documentation/ABI/testing/sysfs-class-pmt_telemetry>
+
endif # X86_PLATFORM_DEVICES

config PMC_ATOM
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 2b85852a1a87..95cd3d0be17f 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -139,6 +139,7 @@ obj-$(CONFIG_INTEL_MFLD_THERMAL) += intel_mid_thermal.o
obj-$(CONFIG_INTEL_MID_POWER_BUTTON) += intel_mid_powerbtn.o
obj-$(CONFIG_INTEL_MRFLD_PWRBTN) += intel_mrfld_pwrbtn.o
obj-$(CONFIG_INTEL_PMC_CORE) += intel_pmc_core.o intel_pmc_core_pltdrv.o
+obj-$(CONFIG_INTEL_PMT_TELEMETRY) += intel_pmt_telemetry.o
obj-$(CONFIG_INTEL_PUNIT_IPC) += intel_punit_ipc.o
obj-$(CONFIG_INTEL_SCU_IPC) += intel_scu_ipc.o
obj-$(CONFIG_INTEL_SCU_PCI) += intel_scu_pcidrv.o
diff --git a/drivers/platform/x86/intel_pmt_telemetry.c b/drivers/platform/x86/intel_pmt_telemetry.c
new file mode 100644
index 000000000000..e1856fc8c209
--- /dev/null
+++ b/drivers/platform/x86/intel_pmt_telemetry.c
@@ -0,0 +1,454 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitoring Technology Telemetry driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Author: "David E. Box" <[email protected]>
+ */
+
+#include <linux/bits.h>
+#include <linux/cdev.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/xarray.h>
+
+#define TELEM_DEV_NAME "pmt_telemetry"
+
+/* Telemetry access types */
+#define TELEM_ACCESS_FUTURE 1
+#define TELEM_ACCESS_BARID 2
+#define TELEM_ACCESS_LOCAL 3
+
+#define TELEM_GUID_OFFSET 0x4
+#define TELEM_BASE_OFFSET 0x8
+#define TELEM_TBIR_MASK GENMASK(2, 0)
+#define TELEM_ACCESS(v) ((v) & GENMASK(3, 0))
+#define TELEM_TYPE(v) (((v) & GENMASK(7, 4)) >> 4)
+/* size field (bits 27:12) is in dwords; >> 10 converts to bytes */
+#define TELEM_SIZE(v) (((v) & GENMASK(27, 12)) >> 10)
+
+#define TELEM_XA_START 0
+#define TELEM_XA_MAX INT_MAX
+#define TELEM_XA_LIMIT XA_LIMIT(TELEM_XA_START, TELEM_XA_MAX)
+
+/* Used by client hardware to identify a fixed telemetry entry */
+#define TELEM_CLIENT_FIXED_BLOCK_GUID 0x10000000
+
+static DEFINE_XARRAY_ALLOC(telem_array);
+
+struct pmt_telem_priv;
+
+struct telem_header {
+ u8 access_type;
+ u8 telem_type;
+ u16 size;
+ u32 guid;
+ u32 base_offset;
+ u8 tbir;
+};
+
+struct pmt_telem_entry {
+ struct pmt_telem_priv *priv;
+ struct telem_header header;
+ struct resource *header_res;
+ unsigned long base_addr;
+ void __iomem *disc_table;
+ struct cdev cdev;
+ dev_t devt;
+ int devid;
+};
+
+struct pmt_telem_priv {
+ struct pmt_telem_entry *entry;
+ int num_entries;
+ struct device *dev;
+};
+
+/*
+ * devfs
+ */
+static int pmt_telem_open(struct inode *inode, struct file *filp)
+{
+ struct pmt_telem_priv *priv;
+ struct pmt_telem_entry *entry;
+ struct pci_driver *pci_drv;
+ struct pci_dev *pci_dev;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ entry = container_of(inode->i_cdev, struct pmt_telem_entry, cdev);
+ priv = entry->priv;
+ pci_dev = to_pci_dev(priv->dev->parent);
+
+ pci_drv = pci_dev_driver(pci_dev);
+ if (!pci_drv)
+ return -ENODEV;
+
+ filp->private_data = entry;
+ get_device(&pci_dev->dev);
+
+ if (!try_module_get(pci_drv->driver.owner)) {
+ put_device(&pci_dev->dev);
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+static int pmt_telem_release(struct inode *inode, struct file *filp)
+{
+ struct pmt_telem_entry *entry = filp->private_data;
+ struct pci_dev *pci_dev = to_pci_dev(entry->priv->dev->parent);
+ struct pci_driver *pci_drv = pci_dev_driver(pci_dev);
+
+ put_device(&pci_dev->dev);
+ module_put(pci_drv->driver.owner);
+
+ return 0;
+}
+
+static int pmt_telem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+ struct pmt_telem_entry *entry = filp->private_data;
+ struct pmt_telem_priv *priv;
+ unsigned long vsize = vma->vm_end - vma->vm_start;
+ unsigned long phys = entry->base_addr;
+ unsigned long pfn = PFN_DOWN(phys);
+ unsigned long psize;
+
+ priv = entry->priv;
+ psize = (PFN_UP(entry->base_addr + entry->header.size) - pfn) *
+ PAGE_SIZE;
+ if (vsize > psize) {
+ dev_err(priv->dev, "Requested mmap size is too large\n");
+ return -EINVAL;
+ }
+
+ if ((vma->vm_flags & VM_WRITE) || (vma->vm_flags & VM_MAYWRITE))
+ return -EPERM;
+
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+ if (io_remap_pfn_range(vma, vma->vm_start, pfn, vsize,
+ vma->vm_page_prot))
+ return -EINVAL;
+
+ return 0;
+}
+
+static const struct file_operations pmt_telem_fops = {
+ .owner = THIS_MODULE,
+ .open = pmt_telem_open,
+ .mmap = pmt_telem_mmap,
+ .release = pmt_telem_release,
+};
+
+/*
+ * sysfs
+ */
+static ssize_t guid_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_entry *entry = dev_get_drvdata(dev);
+
+ return sprintf(buf, "0x%x\n", entry->header.guid);
+}
+static DEVICE_ATTR_RO(guid);
+
+static ssize_t size_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_entry *entry = dev_get_drvdata(dev);
+
+ /* Display buffer size in bytes */
+ return sprintf(buf, "%u\n", entry->header.size);
+}
+static DEVICE_ATTR_RO(size);
+
+static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_entry *entry = dev_get_drvdata(dev);
+
+ /* Display buffer offset in bytes */
+ return sprintf(buf, "%lu\n", offset_in_page(entry->base_addr));
+}
+static DEVICE_ATTR_RO(offset);
+
+static struct attribute *pmt_telem_attrs[] = {
+ &dev_attr_guid.attr,
+ &dev_attr_size.attr,
+ &dev_attr_offset.attr,
+ NULL
+};
+ATTRIBUTE_GROUPS(pmt_telem);
+
+struct class pmt_telem_class = {
+ .owner = THIS_MODULE,
+ .name = "pmt_telemetry",
+ .dev_groups = pmt_telem_groups,
+};
+
+/*
+ * driver initialization
+ */
+static const struct pci_device_id pmt_telem_early_client_pci_ids[] = {
+ { PCI_VDEVICE(INTEL, 0x9a0d) }, /* TGL */
+ { }
+};
+
+static bool pmt_telem_is_early_client_hw(struct device *dev)
+{
+ struct pci_dev *parent;
+
+ parent = to_pci_dev(dev->parent);
+
+ return !!pci_match_id(pmt_telem_early_client_pci_ids, parent);
+}
+
+static int pmt_telem_create_dev(struct pmt_telem_priv *priv,
+ struct pmt_telem_entry *entry)
+{
+ struct pci_dev *pci_dev;
+ struct device *dev;
+ int ret;
+
+ cdev_init(&entry->cdev, &pmt_telem_fops);
+ ret = cdev_add(&entry->cdev, entry->devt, 1);
+ if (ret) {
+ dev_err(priv->dev, "Could not add char dev\n");
+ return ret;
+ }
+
+ pci_dev = to_pci_dev(priv->dev->parent);
+ dev = device_create(&pmt_telem_class, &pci_dev->dev, entry->devt,
+ entry, "telem%d", entry->devid);
+ if (IS_ERR(dev)) {
+ dev_err(priv->dev, "Could not create device node\n");
+ cdev_del(&entry->cdev);
+ }
+
+ return PTR_ERR_OR_ZERO(dev);
+}
+
+static void pmt_telem_populate_header(void __iomem *disc_offset,
+ struct telem_header *header)
+{
+ header->access_type = TELEM_ACCESS(readb(disc_offset));
+ header->telem_type = TELEM_TYPE(readb(disc_offset));
+ header->size = TELEM_SIZE(readl(disc_offset));
+ header->guid = readl(disc_offset + TELEM_GUID_OFFSET);
+ header->base_offset = readl(disc_offset + TELEM_BASE_OFFSET);
+
+ /*
+ * For non-local access types the lower 3 bits of base offset
+ * contains the index of the base address register where the
+ * telemetry can be found.
+ */
+ header->tbir = header->base_offset & TELEM_TBIR_MASK;
+ header->base_offset ^= header->tbir;
+}
+
+static int pmt_telem_add_entry(struct pmt_telem_priv *priv,
+ struct pmt_telem_entry *entry)
+{
+ struct resource *res = entry->header_res;
+ struct pci_dev *pci_dev = to_pci_dev(priv->dev->parent);
+ int ret;
+
+ pmt_telem_populate_header(entry->disc_table, &entry->header);
+
+ /* Local access and BARID only for now */
+ switch (entry->header.access_type) {
+ case TELEM_ACCESS_LOCAL:
+ if (entry->header.tbir) {
+ dev_err(priv->dev,
+ "Unsupported BAR index %d for access type %d\n",
+ entry->header.tbir, entry->header.access_type);
+ return -EINVAL;
+ }
+
+ /*
+ * For access_type LOCAL, the base address is as follows:
+ * base address = header address + header length + base offset
+ */
+ entry->base_addr = res->start + resource_size(res) +
+ entry->header.base_offset;
+ break;
+
+ case TELEM_ACCESS_BARID:
+ entry->base_addr = pci_dev->resource[entry->header.tbir].start +
+ entry->header.base_offset;
+ break;
+
+ default:
+ dev_err(priv->dev, "Unsupported access type %d\n",
+ entry->header.access_type);
+ return -EINVAL;
+ }
+
+ ret = alloc_chrdev_region(&entry->devt, 0, 1, TELEM_DEV_NAME);
+ if (ret) {
+ dev_err(priv->dev,
+ "PMT telemetry chrdev_region error: %d\n", ret);
+ return ret;
+ }
+
+ ret = xa_alloc(&telem_array, &entry->devid, entry, TELEM_XA_LIMIT,
+ GFP_KERNEL);
+ if (ret)
+ goto fail_xa_alloc;
+
+ /* Set priv first: the char device can be opened once it is created */
+ entry->priv = priv;
+ ret = pmt_telem_create_dev(priv, entry);
+ if (ret)
+ goto fail_create_dev;
+ priv->num_entries++;
+ return 0;
+
+fail_create_dev:
+ xa_erase(&telem_array, entry->devid);
+fail_xa_alloc:
+ unregister_chrdev_region(entry->devt, 1);
+
+ return ret;
+}
+
+static bool pmt_telem_region_overlaps(struct platform_device *pdev,
+ void __iomem *disc_table)
+{
+ u32 guid;
+
+ guid = readl(disc_table + TELEM_GUID_OFFSET);
+
+ return guid == TELEM_CLIENT_FIXED_BLOCK_GUID;
+}
+
+static void pmt_telem_remove_entries(struct pmt_telem_priv *priv)
+{
+ int i;
+
+ for (i = 0; i < priv->num_entries; i++) {
+ device_destroy(&pmt_telem_class, priv->entry[i].devt);
+ cdev_del(&priv->entry[i].cdev);
+ xa_erase(&telem_array, priv->entry[i].devid);
+ unregister_chrdev_region(priv->entry[i].devt, 1);
+ }
+}
+
+static int pmt_telem_probe(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv;
+ struct pmt_telem_entry *entry;
+ bool early_hw;
+ int i;
+
+ priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, priv);
+ priv->dev = &pdev->dev;
+
+ priv->entry = devm_kcalloc(&pdev->dev, pdev->num_resources,
+ sizeof(struct pmt_telem_entry), GFP_KERNEL);
+ if (!priv->entry)
+ return -ENOMEM;
+
+ /* Assign unconditionally so early_hw is never read uninitialized */
+ early_hw = pmt_telem_is_early_client_hw(&pdev->dev);
+
+ for (i = 0, entry = priv->entry; i < pdev->num_resources;
+ i++, entry++) {
+ int ret;
+
+ entry->header_res = platform_get_resource(pdev, IORESOURCE_MEM,
+ i);
+ if (!entry->header_res) {
+ pmt_telem_remove_entries(priv);
+ return -ENODEV;
+ }
+
+ entry->disc_table = devm_platform_ioremap_resource(pdev, i);
+ if (IS_ERR(entry->disc_table)) {
+ pmt_telem_remove_entries(priv);
+ return PTR_ERR(entry->disc_table);
+ }
+
+ if (pmt_telem_region_overlaps(pdev, entry->disc_table) &&
+ early_hw)
+ continue;
+
+ ret = pmt_telem_add_entry(priv, entry);
+ if (ret) {
+ pmt_telem_remove_entries(priv);
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+static int pmt_telem_remove(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv = platform_get_drvdata(pdev);
+
+ pmt_telem_remove_entries(priv);
+
+ return 0;
+}
+
+static const struct platform_device_id pmt_telem_table[] = {
+ {
+ .name = "pmt_telemetry",
+ },
+ {}
+};
+MODULE_DEVICE_TABLE(platform, pmt_telem_table);
+
+static struct platform_driver pmt_telem_driver = {
+ .driver = {
+ .name = TELEM_DEV_NAME,
+ },
+ .probe = pmt_telem_probe,
+ .remove = pmt_telem_remove,
+ .id_table = pmt_telem_table,
+};
+
+static int __init pmt_telem_init(void)
+{
+ int ret = class_register(&pmt_telem_class);
+
+ if (ret)
+ return ret;
+
+ ret = platform_driver_register(&pmt_telem_driver);
+ if (ret)
+ class_unregister(&pmt_telem_class);
+
+ return ret;
+}
+module_init(pmt_telem_init);
+
+static void __exit pmt_telem_exit(void)
+{
+ platform_driver_unregister(&pmt_telem_driver);
+ class_unregister(&pmt_telem_class);
+ xa_destroy(&telem_array);
+}
+module_exit(pmt_telem_exit);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel PMT Telemetry driver");
+MODULE_ALIAS("platform:" TELEM_DEV_NAME);
+MODULE_LICENSE("GPL v2");
--
2.20.1

2020-07-14 06:24:09

by David E. Box

Subject: [PATCH V3 2/3] mfd: Intel Platform Monitoring Technology support

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring facilities. PMT supports
multiple types of monitoring capabilities. This driver creates platform
devices for each type so that they may be managed by capability specific
drivers (to be introduced). Capabilities are discovered using PCIe DVSEC
IDs. Support is included for the three current capability types: Telemetry,
Watcher, and Crashlog. The features are available on new Intel platforms
starting from Tiger Lake, for which support is added.
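
For orientation, the way each discovery entry becomes a memory resource can
be sketched as below. This is a standalone illustration that mirrors the
arithmetic in pmt_add_dev() from this patch (entry size counted in dwords,
entries laid out back to back after the table offset). The struct and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct entry_range {
	uint64_t start;	/* first byte of the discovery entry */
	uint64_t end;	/* last byte of the entry (inclusive) */
};

/*
 * Hypothetical helper: byte range of discovery entry `index`.
 * entry_size is counted in dwords, so each entry spans entry_size * 4
 * bytes starting at bar_start + table_offset.
 */
struct entry_range pmt_entry_range(uint64_t bar_start, uint32_t table_offset,
				   uint8_t entry_size, unsigned int index)
{
	uint64_t bytes = (uint64_t)entry_size << 2;
	struct entry_range r;

	assert(entry_size != 0);	/* the driver rejects zero-sized entries */

	r.start = bar_start + table_offset + index * bytes;
	r.end = r.start + bytes - 1;
	return r;
}
```

With a BAR at 0x1000, a table offset of 0x20 and 4-dword entries, entry 1
would cover 0x1030 through 0x103f.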

This patch also adds a quirk mechanism for several early hardware
differences and bugs. For Tiger Lake, do not support the Watcher and Crashlog
capabilities since they will not be compatible with future products. Also,
use a quirk to fix the discovery table offset.
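
The table offset quirk can be illustrated with a small standalone sketch.
It follows INTEL_DVSEC_TABLE_BAR(), INTEL_DVSEC_TABLE_OFFSET() and the
PMT_QUIRK_TABLE_SHIFT handling in this patch; the function names below are
hypothetical and the code is only meant to show the two decodings side by
side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The low three bits of the DVSEC table dword select the BAR. */
uint8_t dvsec_table_bar(uint32_t table)
{
	return table & 0x7;
}

/*
 * The remaining bits give the discovery table offset. Normally the
 * masked value is used in place; with the table-shift quirk the masked
 * value is shifted down by three bits instead.
 */
uint32_t dvsec_table_offset(uint32_t table, bool shift_quirk)
{
	uint32_t offset = table & ~UINT32_C(0x7);

	assert((offset & 0x7) == 0);	/* BAR bits are cleared by the mask */

	return shift_quirk ? offset >> 3 : offset;
}
```

For a table dword of 0x1235 this yields BAR 5 and offset 0x1230 on hardware
without the quirk, and offset 0x246 when the quirk is set.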

Signed-off-by: David E. Box <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
---
MAINTAINERS | 5 +
drivers/mfd/Kconfig | 10 ++
drivers/mfd/Makefile | 1 +
drivers/mfd/intel_pmt.c | 218 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 234 insertions(+)
create mode 100644 drivers/mfd/intel_pmt.c

diff --git a/MAINTAINERS b/MAINTAINERS
index b4a43a9e7fbc..2e42bf0c41ab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8845,6 +8845,11 @@ F: drivers/mfd/intel_soc_pmic*
F: include/linux/mfd/intel_msic.h
F: include/linux/mfd/intel_soc_pmic*

+INTEL PMT DRIVER
+M: "David E. Box" <[email protected]>
+S: Maintained
+F: drivers/mfd/intel_pmt.c
+
INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
M: Stanislav Yakovlev <[email protected]>
L: [email protected]
diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index a37d7d171382..1a62ce2c68d9 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -670,6 +670,16 @@ config MFD_INTEL_PMC_BXT
Register and P-unit access. In addition this creates devices
for iTCO watchdog and telemetry that are part of the PMC.

+config MFD_INTEL_PMT
+ tristate "Intel Platform Monitoring Technology support"
+ depends on PCI
+ select MFD_CORE
+ help
+ The Intel Platform Monitoring Technology (PMT) is an interface that
+ provides access to hardware monitor registers. This driver supports
+ Telemetry, Watcher, and Crashlog PMT capabilities/devices for
+ platforms starting from Tiger Lake.
+
config MFD_IPAQ_MICRO
bool "Atmel Micro ASIC (iPAQ h3100/h3600/h3700) Support"
depends on SA1100_H3100 || SA1100_H3600
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 9367a92f795a..1961b4737985 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -216,6 +216,7 @@ obj-$(CONFIG_MFD_INTEL_LPSS_PCI) += intel-lpss-pci.o
obj-$(CONFIG_MFD_INTEL_LPSS_ACPI) += intel-lpss-acpi.o
obj-$(CONFIG_MFD_INTEL_MSIC) += intel_msic.o
obj-$(CONFIG_MFD_INTEL_PMC_BXT) += intel_pmc_bxt.o
+obj-$(CONFIG_MFD_INTEL_PMT) += intel_pmt.o
obj-$(CONFIG_MFD_PALMAS) += palmas.o
obj-$(CONFIG_MFD_VIPERBOARD) += viperboard.o
obj-$(CONFIG_MFD_RC5T583) += rc5t583.o rc5t583-irq.o
diff --git a/drivers/mfd/intel_pmt.c b/drivers/mfd/intel_pmt.c
new file mode 100644
index 000000000000..0924eca25db0
--- /dev/null
+++ b/drivers/mfd/intel_pmt.c
@@ -0,0 +1,218 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitoring Technology MFD driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Authors: David E. Box <[email protected]>
+ */
+
+#include <linux/bits.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/mfd/core.h>
+#include <linux/types.h>
+
+/* Intel DVSEC capability vendor space offsets */
+#define INTEL_DVSEC_ENTRIES 0xA
+#define INTEL_DVSEC_SIZE 0xB
+#define INTEL_DVSEC_TABLE 0xC
+#define INTEL_DVSEC_TABLE_BAR(x) ((x) & GENMASK(2, 0))
+#define INTEL_DVSEC_TABLE_OFFSET(x) ((x) & GENMASK(31, 3))
+#define INTEL_DVSEC_ENTRY_SIZE 4
+
+/* PMT capabilities */
+#define DVSEC_INTEL_ID_TELEMETRY 2
+#define DVSEC_INTEL_ID_WATCHER 3
+#define DVSEC_INTEL_ID_CRASHLOG 4
+
+#define TELEMETRY_DEV_NAME "pmt_telemetry"
+#define WATCHER_DEV_NAME "pmt_watcher"
+#define CRASHLOG_DEV_NAME "pmt_crashlog"
+
+struct intel_dvsec_header {
+ u16 length;
+ u16 id;
+ u8 num_entries;
+ u8 entry_size;
+ u8 tbir;
+ u32 offset;
+};
+
+enum pmt_quirks {
+ /* Watcher capability not supported */
+ PMT_QUIRK_NO_WATCHER = BIT(0),
+
+ /* Crashlog capability not supported */
+ PMT_QUIRK_NO_CRASHLOG = BIT(1),
+
+ /* Use shift instead of mask to read discovery table offset */
+ PMT_QUIRK_TABLE_SHIFT = BIT(2),
+};
+
+struct pmt_platform_info {
+ unsigned long quirks;
+};
+
+static const struct pmt_platform_info tgl_info = {
+ .quirks = PMT_QUIRK_NO_WATCHER | PMT_QUIRK_NO_CRASHLOG |
+ PMT_QUIRK_TABLE_SHIFT,
+};
+
+static const struct pmt_platform_info pmt_info = {
+};
+
+static int
+pmt_add_dev(struct pci_dev *pdev, struct intel_dvsec_header *header,
+ struct pmt_platform_info *info)
+{
+ struct device *dev = &pdev->dev;
+ struct resource *res, *tmp;
+ struct mfd_cell *cell;
+ const char *name;
+ int count = header->num_entries;
+ int size = header->entry_size;
+ int i;
+
+ switch (header->id) {
+ case DVSEC_INTEL_ID_TELEMETRY:
+ name = TELEMETRY_DEV_NAME;
+ break;
+ case DVSEC_INTEL_ID_WATCHER:
+ if (info->quirks & PMT_QUIRK_NO_WATCHER) {
+ dev_info(dev, "Watcher not supported\n");
+ return 0;
+ }
+ name = WATCHER_DEV_NAME;
+ break;
+ case DVSEC_INTEL_ID_CRASHLOG:
+ if (info->quirks & PMT_QUIRK_NO_CRASHLOG) {
+ dev_info(dev, "Crashlog not supported\n");
+ return 0;
+ }
+ name = CRASHLOG_DEV_NAME;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (!header->num_entries || !header->entry_size) {
+ dev_warn(dev, "Invalid count or size for %s header\n", name);
+ return -EINVAL;
+ }
+
+ cell = devm_kzalloc(dev, sizeof(*cell), GFP_KERNEL);
+ if (!cell)
+ return -ENOMEM;
+
+ res = devm_kcalloc(dev, count, sizeof(*res), GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ if (info->quirks & PMT_QUIRK_TABLE_SHIFT)
+ header->offset >>= 3;
+
+ for (i = 0, tmp = res; i < count; i++, tmp++) {
+ tmp->start = pdev->resource[header->tbir].start +
+ header->offset + i * (size << 2);
+ tmp->end = tmp->start + (size << 2) - 1;
+ tmp->flags = IORESOURCE_MEM;
+ }
+
+ cell->resources = res;
+ cell->num_resources = count;
+ cell->name = name;
+
+ return devm_mfd_add_devices(dev, PLATFORM_DEVID_AUTO, cell, 1, NULL, 0,
+ NULL);
+}
+
+static int
+pmt_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct intel_dvsec_header header;
+ struct pmt_platform_info *info;
+ bool found_devices = false;
+ int ret, pos = 0;
+ u32 table;
+ u16 vid;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ info = devm_kmemdup(&pdev->dev, (void *)id->driver_data, sizeof(*info),
+ GFP_KERNEL);
+ if (!info)
+ return -ENOMEM;
+
+ /*
+ * Advance pos in the loop condition so a skipped DVSEC cannot spin.
+ */
+ while ((pos = pci_find_next_ext_capability(pdev, pos,
+ PCI_EXT_CAP_ID_DVSEC))) {
+ pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vid);
+ if (vid != PCI_VENDOR_ID_INTEL)
+ continue;
+
+ pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER2,
+ &header.id);
+ pci_read_config_byte(pdev, pos + INTEL_DVSEC_ENTRIES,
+ &header.num_entries);
+ pci_read_config_byte(pdev, pos + INTEL_DVSEC_SIZE,
+ &header.entry_size);
+ pci_read_config_dword(pdev, pos + INTEL_DVSEC_TABLE,
+ &table);
+
+ header.tbir = INTEL_DVSEC_TABLE_BAR(table);
+ header.offset = INTEL_DVSEC_TABLE_OFFSET(table);
+
+ ret = pmt_add_dev(pdev, &header, info);
+ if (ret)
+ dev_warn(&pdev->dev,
+ "Failed to add devices for DVSEC id %d\n",
+ header.id);
+ found_devices = true;
+ }
+
+ if (!found_devices) {
+ dev_err(&pdev->dev, "No supported PMT capabilities found.\n");
+ return -ENODEV;
+ }
+
+ pm_runtime_put(&pdev->dev);
+ pm_runtime_allow(&pdev->dev);
+
+ return 0;
+}
+
+static void pmt_pci_remove(struct pci_dev *pdev)
+{
+ pm_runtime_forbid(&pdev->dev);
+ pm_runtime_get_sync(&pdev->dev);
+}
+
+#define PCI_DEVICE_ID_INTEL_PMT_TGL 0x9a0d
+
+static const struct pci_device_id pmt_pci_ids[] = {
+ { PCI_DEVICE_DATA(INTEL, PMT_TGL, &tgl_info) },
+ { }
+};
+MODULE_DEVICE_TABLE(pci, pmt_pci_ids);
+
+static struct pci_driver pmt_pci_driver = {
+ .name = "intel-pmt",
+ .id_table = pmt_pci_ids,
+ .probe = pmt_pci_probe,
+ .remove = pmt_pci_remove,
+};
+module_pci_driver(pmt_pci_driver);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel Platform Monitoring Technology MFD driver");
+MODULE_LICENSE("GPL v2");
--
2.20.1

2020-07-14 08:43:44

by Andy Shevchenko

Subject: Re: [PATCH V3 1/3] PCI: Add defines for Designated Vendor-Specific Capability

On Tue, Jul 14, 2020 at 9:22 AM David E. Box
<[email protected]> wrote:
>
> Add PCIe DVSEC extended capability ID and defines for the header offsets.
> Defined in PCIe r5.0, sec 7.9.6.
>

Reviewed-by: Andy Shevchenko <[email protected]>

> Signed-off-by: David E. Box <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> include/uapi/linux/pci_regs.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index f9701410d3b5..09daa9f07b6b 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -720,6 +720,7 @@
> #define PCI_EXT_CAP_ID_DPC 0x1D /* Downstream Port Containment */
> #define PCI_EXT_CAP_ID_L1SS 0x1E /* L1 PM Substates */
> #define PCI_EXT_CAP_ID_PTM 0x1F /* Precision Time Measurement */
> +#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
> #define PCI_EXT_CAP_ID_DLF 0x25 /* Data Link Feature */
> #define PCI_EXT_CAP_ID_PL_16GT 0x26 /* Physical Layer 16.0 GT/s */
> #define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT
> @@ -1062,6 +1063,10 @@
> #define PCI_L1SS_CTL1_LTR_L12_TH_SCALE 0xe0000000 /* LTR_L1.2_THRESHOLD_Scale */
> #define PCI_L1SS_CTL2 0x0c /* Control 2 Register */
>
> +/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
> +#define PCI_DVSEC_HEADER1 0x4 /* Vendor-Specific Header1 */
> +#define PCI_DVSEC_HEADER2 0x8 /* Vendor-Specific Header2 */
> +
> /* Data Link Feature */
> #define PCI_DLF_CAP 0x04 /* Capabilities Register */
> #define PCI_DLF_EXCHANGE_ENABLE 0x80000000 /* Data Link Feature Exchange Enable */
> --
> 2.20.1
>


--
With Best Regards,
Andy Shevchenko

2020-07-14 08:55:14

by Andy Shevchenko

Subject: Re: [PATCH V3 3/3] platform/x86: Intel PMT Telemetry capability driver

On Tue, Jul 14, 2020 at 9:22 AM David E. Box
<[email protected]> wrote:
>
> PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
> The Telemetry capability provides access to device telemetry metrics that
> provide hardware performance data to users from continuous, memory mapped,
> read-only register spaces.
>
> Register mappings are not provided by the driver. Instead, a GUID is read
> from a header for each endpoint. The GUID identifies the device and is to
> be used with an XML, provided by the vendor, to discover the available set
> of metrics and their register mapping. This allows firmware updates to
> modify the register space without needing to update the driver every time
> with new mappings. Firmware writes a new GUID in this case to specify the
> new mapping. Software tools with access to the associated XML file can
> then interpret the changes.
>
> This module manages access to all PMT Telemetry endpoints on a system,
> independent of the device exporting them. It creates a pmt_telemetry class
> to manage the devices. For each telemetry endpoint, sysfs files provide
> GUID and size information as well as a pointer to the parent device the
> telemetry came from. Software may discover the association between
> endpoints and devices by iterating through the list in sysfs, or by looking
> for the existence of the class folder under the device of interest. A
> device node of the same name allows software to then map the telemetry
> space for direct access.
>
> This patch also creates an pci device id list for early telemetry hardware
> that requires workarounds for known issues.

Some more style issues, after addressing feel free to add
Reviewed-by: Andy Shevchenko <[email protected]>

> Signed-off-by: David E. Box <[email protected]>
> Signed-off-by: Alexander Duyck <[email protected]>

Since you are submitting this the order of the above SoB chain is a
bit strange. I think something like

SoB: Alexander
Co-developed-by: Alexander
SoB: David

is expected (same for patch 2).

...

> +Contact: David Box <[email protected]>
> +Description:
> + The telem<x> directory contains files describing an instance of
> + a PMT telemetry device that exposes hardware telemetry. Each
> + telem<x> directory has an associated /dev/telem<x> node. This
> + node may be opened and mapped to access the telemetry space of
> + the device. The register layout of the telemetry space is
> + determined from an XML file that matches the pci device id and

PCI

> + guid for the device.

GUID

Same for all code where it appears.

...

> + psize = (PFN_UP(entry->base_addr + entry->header.size) - pfn) *
> + PAGE_SIZE;

I wouldn't mind having this on one line.

...

> +static ssize_t guid_show(struct device *dev, struct device_attribute *attr,
> + char *buf)

Ditto.

...

> +static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
> + char *buf)

Ditto.

...

> +static bool pmt_telem_is_early_client_hw(struct device *dev)
> +{

> + struct pci_dev *parent;
> +
> + parent = to_pci_dev(dev->parent);

Can be one line.

> + return !!pci_match_id(pmt_telem_early_client_pci_ids, parent);
> +}

...

> + entry->header_res = platform_get_resource(pdev, IORESOURCE_MEM,
> + i);

One line, please.

--
With Best Regards,
Andy Shevchenko

2020-07-15 05:59:07

by kernel test robot

Subject: Re: [PATCH V3 2/3] mfd: Intel Platform Monitoring Technology support

Hi David,

I love your patch! Perhaps something to improve:

[auto build test WARNING on ljones-mfd/for-mfd-next]
[also build test WARNING on pci/next linus/master v5.8-rc5 next-20200714]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/David-E-Box/Intel-Platform-Monitoring-Technology/20200714-142630
base: https://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git for-mfd-next
config: x86_64-allyesconfig (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 02946de3802d3bc65bc9f2eb9b8d4969b5a7add8)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> drivers/mfd/intel_pmt.c:67:39: warning: unused variable 'pmt_info' [-Wunused-const-variable]
static const struct pmt_platform_info pmt_info = {
^
1 warning generated.

vim +/pmt_info +67 drivers/mfd/intel_pmt.c

66
> 67 static const struct pmt_platform_info pmt_info = {
68 };
69

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (1.76 kB)
.config.gz (73.59 kB)

2020-07-15 07:40:16

by Alexey Budankov

Subject: Re: [PATCH V3 3/3] platform/x86: Intel PMT Telemetry capability driver

Hi David,

On 14.07.2020 9:23, David E. Box wrote:
> PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
> The Telemetry capability provides access to device telemetry metrics that
> provide hardware performance data to users from continuous, memory mapped,
> read-only register spaces.
>
> Register mappings are not provided by the driver. Instead, a GUID is read
> from a header for each endpoint. The GUID identifies the device and is to
> be used with an XML, provided by the vendor, to discover the available set
> of metrics and their register mapping. This allows firmware updates to
> modify the register space without needing to update the driver every time
> with new mappings. Firmware writes a new GUID in this case to specify the
> new mapping. Software tools with access to the associated XML file can
> then interpret the changes.
>
> This module manages access to all PMT Telemetry endpoints on a system,
> independent of the device exporting them. It creates a pmt_telemetry class
> to manage the devices. For each telemetry endpoint, sysfs files provide
> GUID and size information as well as a pointer to the parent device the
> telemetry came from. Software may discover the association between
> endpoints and devices by iterating through the list in sysfs, or by looking
> for the existence of the class folder under the device of interest. A
> device node of the same name allows software to then map the telemetry
> space for direct access.
>
> This patch also creates a PCI device ID list for early telemetry hardware
> that requires workarounds for known issues.
>
> Signed-off-by: David E. Box <[email protected]>
> Signed-off-by: Alexander Duyck <[email protected]>
> ---
> .../ABI/testing/sysfs-class-pmt_telemetry | 46 ++
> MAINTAINERS | 1 +
> drivers/platform/x86/Kconfig | 10 +
> drivers/platform/x86/Makefile | 1 +
> drivers/platform/x86/intel_pmt_telemetry.c | 454 ++++++++++++++++++
> 5 files changed, 512 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-class-pmt_telemetry
> create mode 100644 drivers/platform/x86/intel_pmt_telemetry.c
>
> diff --git a/Documentation/ABI/testing/sysfs-class-pmt_telemetry b/Documentation/ABI/testing/sysfs-class-pmt_telemetry
> new file mode 100644
> index 000000000000..381924549ecb
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-class-pmt_telemetry
> @@ -0,0 +1,46 @@
> +What: /sys/class/pmt_telemetry/
> +Date: July 2020
> +KernelVersion: 5.9
> +Contact: David Box <[email protected]>
> +Description:
> + The pmt_telemetry/ class directory contains information for
> + devices that expose hardware telemetry using Intel Platform
> + Monitoring Technology (PMT)
> +
> +What: /sys/class/pmt_telemetry/telem<x>
> +Date: July 2020
> +KernelVersion: 5.9
> +Contact: David Box <[email protected]>
> +Description:
> + The telem<x> directory contains files describing an instance of
> + a PMT telemetry device that exposes hardware telemetry. Each
> + telem<x> directory has an associated /dev/telem<x> node. This
> + node may be opened and mapped to access the telemetry space of
> + the device. The register layout of the telemetry space is
> + determined from an XML file that matches the pci device id and
> + guid for the device.
> +
> +What: /sys/class/pmt_telemetry/telem<x>/guid
> +Date: July 2020
> +KernelVersion: 5.9
> +Contact: David Box <[email protected]>
> +Description:
> + (RO) The guid for this telemetry device. The guid identifies
> + the version of the XML file for the parent device that is to
> + be used to get the register layout.
> +
> +What: /sys/class/pmt_telemetry/telem<x>/size
> +Date: July 2020
> +KernelVersion: 5.9
> +Contact: David Box <[email protected]>
> +Description:
> + (RO) The size of telemetry region in bytes that corresponds to
> + the mapping size for the /dev/telem<x> device node.
> +
> +What: /sys/class/pmt_telemetry/telem<x>/offset
> +Date: July 2020
> +KernelVersion: 5.9
> +Contact: David Box <[email protected]>
> +Description:
> + (RO) The offset of telemetry region in bytes that corresponds to
> + the mapping for the /dev/telem<x> device node.
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 2e42bf0c41ab..ebc145894abd 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8849,6 +8849,7 @@ INTEL PMT DRIVER
> M: "David E. Box" <[email protected]>
> S: Maintained
> F: drivers/mfd/intel_pmt.c
> +F: drivers/platform/x86/intel_pmt_*
>
> INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
> M: Stanislav Yakovlev <[email protected]>
> diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
> index 0581a54cf562..5e1f7ce6e69f 100644
> --- a/drivers/platform/x86/Kconfig
> +++ b/drivers/platform/x86/Kconfig
> @@ -1396,6 +1396,16 @@ config INTEL_TELEMETRY
> directly via debugfs files. Various tools may use
> this interface for SoC state monitoring.
>
> +config INTEL_PMT_TELEMETRY
> + tristate "Intel Platform Monitoring Technology (PMT) Telemetry driver"
> + help
> + The Intel Platform Monitoring Technology (PMT) Telemetry driver provides
> + access to hardware telemetry metrics on devices that support the
> + feature.
> +
> + For more information, see
> + <file:Documentation/ABI/testing/sysfs-class-pmt_telemetry>
> +
> endif # X86_PLATFORM_DEVICES
>
> config PMC_ATOM
> diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
> index 2b85852a1a87..95cd3d0be17f 100644
> --- a/drivers/platform/x86/Makefile
> +++ b/drivers/platform/x86/Makefile
> @@ -139,6 +139,7 @@ obj-$(CONFIG_INTEL_MFLD_THERMAL) += intel_mid_thermal.o
> obj-$(CONFIG_INTEL_MID_POWER_BUTTON) += intel_mid_powerbtn.o
> obj-$(CONFIG_INTEL_MRFLD_PWRBTN) += intel_mrfld_pwrbtn.o
> obj-$(CONFIG_INTEL_PMC_CORE) += intel_pmc_core.o intel_pmc_core_pltdrv.o
> +obj-$(CONFIG_INTEL_PMT_TELEMETRY) += intel_pmt_telemetry.o
> obj-$(CONFIG_INTEL_PUNIT_IPC) += intel_punit_ipc.o
> obj-$(CONFIG_INTEL_SCU_IPC) += intel_scu_ipc.o
> obj-$(CONFIG_INTEL_SCU_PCI) += intel_scu_pcidrv.o
> diff --git a/drivers/platform/x86/intel_pmt_telemetry.c b/drivers/platform/x86/intel_pmt_telemetry.c
> new file mode 100644
> index 000000000000..e1856fc8c209
> --- /dev/null
> +++ b/drivers/platform/x86/intel_pmt_telemetry.c
> @@ -0,0 +1,454 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Intel Platform Monitoring Technology Telemetry driver
> + *
> + * Copyright (c) 2020, Intel Corporation.
> + * All Rights Reserved.
> + *
> + * Author: "David E. Box" <[email protected]>
> + */
> +
> +#include <linux/bits.h>
> +#include <linux/cdev.h>
> +#include <linux/io-64-nonatomic-lo-hi.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +#include <linux/uaccess.h>
> +#include <linux/xarray.h>
> +
> +#define TELEM_DEV_NAME "pmt_telemetry"
> +
> +/* Telemetry access types */
> +#define TELEM_ACCESS_FUTURE 1
> +#define TELEM_ACCESS_BARID 2
> +#define TELEM_ACCESS_LOCAL 3
> +
> +#define TELEM_GUID_OFFSET 0x4
> +#define TELEM_BASE_OFFSET 0x8
> +#define TELEM_TBIR_MASK GENMASK(2, 0)
> +#define TELEM_ACCESS(v) ((v) & GENMASK(3, 0))
> +#define TELEM_TYPE(v) (((v) & GENMASK(7, 4)) >> 4)
> +/* size is in bytes */
> +#define TELEM_SIZE(v) (((v) & GENMASK(27, 12)) >> 10)
> +
> +#define TELEM_XA_START 0
> +#define TELEM_XA_MAX INT_MAX
> +#define TELEM_XA_LIMIT XA_LIMIT(TELEM_XA_START, TELEM_XA_MAX)
> +
> +/* Used by client hardware to identify a fixed telemetry entry*/
> +#define TELEM_CLIENT_FIXED_BLOCK_GUID 0x10000000
> +
> +static DEFINE_XARRAY_ALLOC(telem_array);
> +
> +struct pmt_telem_priv;
> +
> +struct telem_header {
> + u8 access_type;
> + u8 telem_type;
> + u16 size;
> + u32 guid;
> + u32 base_offset;
> + u8 tbir;
> +};
> +
> +struct pmt_telem_entry {
> + struct pmt_telem_priv *priv;
> + struct telem_header header;
> + struct resource *header_res;
> + unsigned long base_addr;
> + void __iomem *disc_table;
> + struct cdev cdev;
> + dev_t devt;
> + int devid;
> +};
> +
> +struct pmt_telem_priv {
> + struct pmt_telem_entry *entry;
> + int num_entries;
> + struct device *dev;
> +};
> +
> +/*
> + * devfs
> + */
> +static int pmt_telem_open(struct inode *inode, struct file *filp)
> +{
> + struct pmt_telem_priv *priv;
> + struct pmt_telem_entry *entry;
> + struct pci_driver *pci_drv;
> + struct pci_dev *pci_dev;
> +
> + if (!capable(CAP_SYS_ADMIN))

Thanks for supplying these patches.
Is there any reason not to also expose this feature to CAP_PERFMON privileged
processes, which currently have access to the kernel's performance monitoring
features without root/CAP_SYS_ADMIN credentials? This could be done with a
perfmon_capable() call, available starting from v5.8.

Thanks,
Alexei

> + return -EPERM;
> +
> + entry = container_of(inode->i_cdev, struct pmt_telem_entry, cdev);
> + priv = entry->priv;
> + pci_dev = to_pci_dev(priv->dev->parent);
> +
> + pci_drv = pci_dev_driver(pci_dev);
> + if (!pci_drv)
> + return -ENODEV;
> +
> + filp->private_data = entry;
> + get_device(&pci_dev->dev);
> +
> + if (!try_module_get(pci_drv->driver.owner)) {
> + put_device(&pci_dev->dev);
> + return -ENODEV;
> + }
> +
> + return 0;
> +}
> +
> +static int pmt_telem_release(struct inode *inode, struct file *filp)
> +{
> + struct pmt_telem_entry *entry = filp->private_data;
> + struct pci_dev *pci_dev = to_pci_dev(entry->priv->dev->parent);
> + struct pci_driver *pci_drv = pci_dev_driver(pci_dev);
> +
> + put_device(&pci_dev->dev);
> + module_put(pci_drv->driver.owner);
> +
> + return 0;
> +}
> +
> +static int pmt_telem_mmap(struct file *filp, struct vm_area_struct *vma)
> +{
> + struct pmt_telem_entry *entry = filp->private_data;
> + struct pmt_telem_priv *priv;
> + unsigned long vsize = vma->vm_end - vma->vm_start;
> + unsigned long phys = entry->base_addr;
> + unsigned long pfn = PFN_DOWN(phys);
> + unsigned long psize;
> +
> + priv = entry->priv;
> + psize = (PFN_UP(entry->base_addr + entry->header.size) - pfn) *
> + PAGE_SIZE;
> + if (vsize > psize) {
> + dev_err(priv->dev, "Requested mmap size is too large\n");
> + return -EINVAL;
> + }
> +
> + if ((vma->vm_flags & VM_WRITE) || (vma->vm_flags & VM_MAYWRITE))
> + return -EPERM;
> +
> + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> +
> + if (io_remap_pfn_range(vma, vma->vm_start, pfn, vsize,
> + vma->vm_page_prot))
> + return -EINVAL;
> +
> + return 0;
> +}
> +
> +static const struct file_operations pmt_telem_fops = {
> + .owner = THIS_MODULE,
> + .open = pmt_telem_open,
> + .mmap = pmt_telem_mmap,
> + .release = pmt_telem_release,
> +};
> +
> +/*
> + * sysfs
> + */
> +static ssize_t guid_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct pmt_telem_entry *entry = dev_get_drvdata(dev);
> +
> + return sprintf(buf, "0x%x\n", entry->header.guid);
> +}
> +static DEVICE_ATTR_RO(guid);
> +
> +static ssize_t size_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct pmt_telem_entry *entry = dev_get_drvdata(dev);
> +
> + /* Display buffer size in bytes */
> + return sprintf(buf, "%u\n", entry->header.size);
> +}
> +static DEVICE_ATTR_RO(size);
> +
> +static ssize_t offset_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct pmt_telem_entry *entry = dev_get_drvdata(dev);
> +
> + /* Display buffer offset in bytes */
> + return sprintf(buf, "%lu\n", offset_in_page(entry->base_addr));
> +}
> +static DEVICE_ATTR_RO(offset);
> +
> +static struct attribute *pmt_telem_attrs[] = {
> + &dev_attr_guid.attr,
> + &dev_attr_size.attr,
> + &dev_attr_offset.attr,
> + NULL
> +};
> +ATTRIBUTE_GROUPS(pmt_telem);
> +
> +struct class pmt_telem_class = {
> + .owner = THIS_MODULE,
> + .name = "pmt_telemetry",
> + .dev_groups = pmt_telem_groups,
> +};
> +
> +/*
> + * driver initialization
> + */
> +static const struct pci_device_id pmt_telem_early_client_pci_ids[] = {
> + { PCI_VDEVICE(INTEL, 0x9a0d) }, /* TGL */
> + { }
> +};
> +
> +static bool pmt_telem_is_early_client_hw(struct device *dev)
> +{
> + struct pci_dev *parent;
> +
> + parent = to_pci_dev(dev->parent);
> +
> + return !!pci_match_id(pmt_telem_early_client_pci_ids, parent);
> +}
> +
> +static int pmt_telem_create_dev(struct pmt_telem_priv *priv,
> + struct pmt_telem_entry *entry)
> +{
> + struct pci_dev *pci_dev;
> + struct device *dev;
> + int ret;
> +
> + cdev_init(&entry->cdev, &pmt_telem_fops);
> + ret = cdev_add(&entry->cdev, entry->devt, 1);
> + if (ret) {
> + dev_err(priv->dev, "Could not add char dev\n");
> + return ret;
> + }
> +
> + pci_dev = to_pci_dev(priv->dev->parent);
> + dev = device_create(&pmt_telem_class, &pci_dev->dev, entry->devt,
> + entry, "telem%d", entry->devid);
> + if (IS_ERR(dev)) {
> + dev_err(priv->dev, "Could not create device node\n");
> + cdev_del(&entry->cdev);
> + }
> +
> + return PTR_ERR_OR_ZERO(dev);
> +}
> +
> +static void pmt_telem_populate_header(void __iomem *disc_offset,
> + struct telem_header *header)
> +{
> + header->access_type = TELEM_ACCESS(readb(disc_offset));
> + header->telem_type = TELEM_TYPE(readb(disc_offset));
> + header->size = TELEM_SIZE(readl(disc_offset));
> + header->guid = readl(disc_offset + TELEM_GUID_OFFSET);
> + header->base_offset = readl(disc_offset + TELEM_BASE_OFFSET);
> +
> + /*
> + * For non-local access types the lower 3 bits of base offset
> + * contains the index of the base address register where the
> + * telemetry can be found.
> + */
> + header->tbir = header->base_offset & TELEM_TBIR_MASK;
> + header->base_offset ^= header->tbir;
> +}
> +
> +static int pmt_telem_add_entry(struct pmt_telem_priv *priv,
> + struct pmt_telem_entry *entry)
> +{
> + struct resource *res = entry->header_res;
> + struct pci_dev *pci_dev = to_pci_dev(priv->dev->parent);
> + int ret;
> +
> + pmt_telem_populate_header(entry->disc_table, &entry->header);
> +
> + /* Local access and BARID only for now */
> + switch (entry->header.access_type) {
> + case TELEM_ACCESS_LOCAL:
> + if (entry->header.tbir) {
> + dev_err(priv->dev,
> + "Unsupported BAR index %d for access type %d\n",
> + entry->header.tbir, entry->header.access_type);
> + return -EINVAL;
> + }
> +
> + /*
> + * For access_type LOCAL, the base address is as follows:
> + * base address = header address + header length + base offset
> + */
> + entry->base_addr = res->start + resource_size(res) +
> + entry->header.base_offset;
> + break;
> +
> + case TELEM_ACCESS_BARID:
> + entry->base_addr = pci_dev->resource[entry->header.tbir].start +
> + entry->header.base_offset;
> + break;
> +
> + default:
> + dev_err(priv->dev, "Unsupported access type %d\n",
> + entry->header.access_type);
> + return -EINVAL;
> + }
> +
> + ret = alloc_chrdev_region(&entry->devt, 0, 1, TELEM_DEV_NAME);
> + if (ret) {
> + dev_err(priv->dev,
> + "PMT telemetry chrdev_region error: %d\n", ret);
> + return ret;
> + }
> +
> + ret = xa_alloc(&telem_array, &entry->devid, entry, TELEM_XA_LIMIT,
> + GFP_KERNEL);
> + if (ret)
> + goto fail_xa_alloc;
> +
> + ret = pmt_telem_create_dev(priv, entry);
> + if (ret)
> + goto fail_create_dev;
> +
> + entry->priv = priv;
> + priv->num_entries++;
> + return 0;
> +
> +fail_create_dev:
> + xa_erase(&telem_array, entry->devid);
> +fail_xa_alloc:
> + unregister_chrdev_region(entry->devt, 1);
> +
> + return ret;
> +}
> +
> +static bool pmt_telem_region_overlaps(struct platform_device *pdev,
> + void __iomem *disc_table)
> +{
> + u32 guid;
> +
> + guid = readl(disc_table + TELEM_GUID_OFFSET);
> +
> + return guid == TELEM_CLIENT_FIXED_BLOCK_GUID;
> +}
> +
> +static void pmt_telem_remove_entries(struct pmt_telem_priv *priv)
> +{
> + int i;
> +
> + for (i = 0; i < priv->num_entries; i++) {
> + device_destroy(&pmt_telem_class, priv->entry[i].devt);
> + cdev_del(&priv->entry[i].cdev);
> + xa_erase(&telem_array, priv->entry[i].devid);
> + unregister_chrdev_region(priv->entry[i].devt, 1);
> + }
> +}
> +
> +static int pmt_telem_probe(struct platform_device *pdev)
> +{
> + struct pmt_telem_priv *priv;
> + struct pmt_telem_entry *entry;
> + bool early_hw = false;
> + int i;
> +
> + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
> + if (!priv)
> + return -ENOMEM;
> +
> + platform_set_drvdata(pdev, priv);
> + priv->dev = &pdev->dev;
> +
> + priv->entry = devm_kcalloc(&pdev->dev, pdev->num_resources,
> + sizeof(struct pmt_telem_entry), GFP_KERNEL);
> + if (!priv->entry)
> + return -ENOMEM;
> +
> + if (pmt_telem_is_early_client_hw(&pdev->dev))
> + early_hw = true;
> +
> + for (i = 0, entry = priv->entry; i < pdev->num_resources;
> + i++, entry++) {
> + int ret;
> +
> + entry->header_res = platform_get_resource(pdev, IORESOURCE_MEM,
> + i);
> + if (!entry->header_res) {
> + pmt_telem_remove_entries(priv);
> + return -ENODEV;
> + }
> +
> + entry->disc_table = devm_platform_ioremap_resource(pdev, i);
> + if (IS_ERR(entry->disc_table)) {
> + pmt_telem_remove_entries(priv);
> + return PTR_ERR(entry->disc_table);
> + }
> +
> + if (pmt_telem_region_overlaps(pdev, entry->disc_table) &&
> + early_hw)
> + continue;
> +
> + ret = pmt_telem_add_entry(priv, entry);
> + if (ret) {
> + pmt_telem_remove_entries(priv);
> + return ret;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static int pmt_telem_remove(struct platform_device *pdev)
> +{
> + struct pmt_telem_priv *priv = platform_get_drvdata(pdev);
> +
> + pmt_telem_remove_entries(priv);
> +
> + return 0;
> +}
> +
> +static const struct platform_device_id pmt_telem_table[] = {
> + {
> + .name = "pmt_telemetry",
> + },
> + {}
> +};
> +MODULE_DEVICE_TABLE(platform, pmt_telem_table);
> +
> +static struct platform_driver pmt_telem_driver = {
> + .driver = {
> + .name = TELEM_DEV_NAME,
> + },
> + .probe = pmt_telem_probe,
> + .remove = pmt_telem_remove,
> + .id_table = pmt_telem_table,
> +};
> +
> +static int __init pmt_telem_init(void)
> +{
> + int ret = class_register(&pmt_telem_class);
> +
> + if (ret)
> + return ret;
> +
> + ret = platform_driver_register(&pmt_telem_driver);
> + if (ret)
> + class_unregister(&pmt_telem_class);
> +
> + return ret;
> +}
> +module_init(pmt_telem_init);
> +
> +static void __exit pmt_telem_exit(void)
> +{
> + platform_driver_unregister(&pmt_telem_driver);
> + class_unregister(&pmt_telem_class);
> + xa_destroy(&telem_array);
> +}
> +module_exit(pmt_telem_exit);
> +
> +MODULE_AUTHOR("David E. Box <[email protected]>");
> +MODULE_DESCRIPTION("Intel PMT Telemetry driver");
> +MODULE_ALIAS("platform:" TELEM_DEV_NAME);
> +MODULE_LICENSE("GPL v2");
>
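
A minimal userspace sketch of how a tool might size its mapping of
/dev/telem<x>, assuming the semantics in the quoted patch: the sysfs "size"
and "offset" files give the region length and its offset within the first
page, and pmt_telem_mmap() maps whole pages starting at the page that
contains the region. The helper names telem_map_len() and telem_map() are
hypothetical, not part of the patch. As far as I can tell from the
VM_MAYWRITE check, the node has to be opened O_RDONLY and mapped MAP_SHARED
with PROT_READ for the mapping to be accepted.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: number of bytes to request from mmap() so that the
 * whole telemetry region is covered. Mirrors the driver's psize computation,
 * i.e. (offset + size) rounded up to a multiple of the page size.
 */
size_t telem_map_len(size_t offset, size_t size, size_t page_size)
{
	assert(page_size != 0 && offset < page_size);
	return (offset + size + page_size - 1) / page_size * page_size;
}

/*
 * Hypothetical helper: map a telemetry node read-only. "offset" and "size"
 * are the values read from the sysfs files of the same telem<x> device.
 * Returns a pointer to the first telemetry byte, or NULL on failure.
 * *map_base and *map_len receive the values needed for munmap().
 */
const volatile void *telem_map(const char *node, size_t offset, size_t size,
			       void **map_base, size_t *map_len)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;
	int fd;

	if (page <= 0 || offset >= (size_t)page)
		return NULL;

	fd = open(node, O_RDONLY);
	if (fd < 0)
		return NULL;

	*map_len = telem_map_len(offset, size, (size_t)page);
	p = mmap(NULL, *map_len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping keeps its own reference to the file */
	if (p == MAP_FAILED)
		return NULL;

	*map_base = p;
	return (const volatile char *)p + offset;
}
```

A caller would read "size" and "offset" from /sys/class/pmt_telemetry/telem0/,
pass them to telem_map("/dev/telem0", ...), and interpret the returned bytes
using the XML that matches the GUID.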

2020-07-15 23:59:32

by David E. Box

Subject: Re: [PATCH V3 3/3] platform/x86: Intel PMT Telemetry capability driver

On Wed, 2020-07-15 at 10:39 +0300, Alexey Budankov wrote:
> Hi David,
>
> On 14.07.2020 9:23, David E. Box wrote:

...

> >
> > +static int pmt_telem_open(struct inode *inode, struct file *filp)
> > +{
> > + struct pmt_telem_priv *priv;
> > + struct pmt_telem_entry *entry;
> > + struct pci_driver *pci_drv;
> > + struct pci_dev *pci_dev;
> > +
> > + if (!capable(CAP_SYS_ADMIN))
>
> Thanks for supplying these patches.
> Is there any reason not to also expose this feature to CAP_PERFMON
> privileged processes, which currently have access to the kernel's
> performance monitoring features without root/CAP_SYS_ADMIN credentials?
> This could be done with a perfmon_capable() call, available starting
> from v5.8.

The new capability is well suited for this feature. I'll make the
change. Thanks.

David
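
The effect of the agreed change can be modelled in plain C. This is only a
userspace sketch of the decision logic, assuming perfmon_capable() behaves
as in v5.8, where it accepts either CAP_PERFMON or CAP_SYS_ADMIN. The
function names and the bitmask representation of the effective capability
set are hypothetical; the capability numbers (21 and 38) are the ones from
include/uapi/linux/capability.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Capability numbers as in include/uapi/linux/capability.h */
enum {
	MODEL_CAP_SYS_ADMIN = 21,
	MODEL_CAP_PERFMON   = 38,
};

/* Hypothetical: test one bit in a 64-bit effective capability mask. */
bool has_cap(uint64_t effective, unsigned int cap)
{
	assert(cap < 64);
	return (effective >> cap) & 1;
}

/* Model of perfmon_capable(): CAP_PERFMON, with CAP_SYS_ADMIN as fallback. */
bool perfmon_capable_model(uint64_t effective)
{
	return has_cap(effective, MODEL_CAP_PERFMON) ||
	       has_cap(effective, MODEL_CAP_SYS_ADMIN);
}

/* Model of the gate at the top of pmt_telem_open(): 0 or -EPERM. */
int telem_open_check(uint64_t effective)
{
	return perfmon_capable_model(effective) ? 0 : -EPERM;
}
```

With this gate, a monitoring tool that holds only CAP_PERFMON can open the
node, while existing CAP_SYS_ADMIN users keep working unchanged.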

2020-07-16 02:58:28

by Randy Dunlap

Subject: Re: [PATCH V3 3/3] platform/x86: Intel PMT Telemetry capability driver

On 7/13/20 11:23 PM, David E. Box wrote:
> diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
> index 0581a54cf562..5e1f7ce6e69f 100644
> --- a/drivers/platform/x86/Kconfig
> +++ b/drivers/platform/x86/Kconfig
> @@ -1396,6 +1396,16 @@ config INTEL_TELEMETRY
> directly via debugfs files. Various tools may use
> this interface for SoC state monitoring.
>
> +config INTEL_PMT_TELEMETRY
> + tristate "Intel Platform Monitoring Technology (PMT) Telemetry driver"
> + help
> + The Intel Platform Monitoring Technology (PMT) Telemetry driver provides
> + access to hardware telemetry metrics on devices that support the
> + feature.
> +
> + For more information, see
> + <file:Documentation/ABI/testing/sysfs-class-pmt_telemetry>
> +
> endif # X86_PLATFORM_DEVICES
>
> config PMC_ATOM

The text under "help" should be indented with one tab + 2 spaces,
as is done in patch 2/3.

--
~Randy

2020-07-16 02:58:35

by Randy Dunlap

Subject: Re: [PATCH V3 1/3] PCI: Add defines for Designated Vendor-Specific Capability

On 7/13/20 11:23 PM, David E. Box wrote:
> Add PCIe DVSEC extended capability ID and defines for the header offsets.
> Defined in PCIe r5.0, sec 7.9.6.
>
> Signed-off-by: David E. Box <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> include/uapi/linux/pci_regs.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index f9701410d3b5..09daa9f07b6b 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -720,6 +720,7 @@
> +#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
> @@ -1062,6 +1063,10 @@
> +/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
> +#define PCI_DVSEC_HEADER1 0x4 /* Vendor-Specific Header1 */
> +#define PCI_DVSEC_HEADER2 0x8 /* Vendor-Specific Header2 */

Just a little comment: It would make more sense to me to
s/DVSEC/DVSPEC/g.

But then I don't have the PCIe documentation.

--
~Randy

2020-07-16 06:00:03

by Alexey Budankov

Subject: Re: [PATCH V3 3/3] platform/x86: Intel PMT Telemetry capability driver


On 16.07.2020 2:59, David E. Box wrote:
> On Wed, 2020-07-15 at 10:39 +0300, Alexey Budankov wrote:
>> Hi David,
>>
>> On 14.07.2020 9:23, David E. Box wrote:
>
> ...
>
>>>
>>> +static int pmt_telem_open(struct inode *inode, struct file *filp)
>>> +{
>>> + struct pmt_telem_priv *priv;
>>> + struct pmt_telem_entry *entry;
>>> + struct pci_driver *pci_drv;
>>> + struct pci_dev *pci_dev;
>>> +
>>> + if (!capable(CAP_SYS_ADMIN))
>>
>> Thanks for supplying these patches.
>> Is there any reason not to also expose this feature to CAP_PERFMON
>> privileged processes, which currently have access to the kernel's
>> performance monitoring features without root/CAP_SYS_ADMIN credentials?
>> This could be done with a perfmon_capable() call, available starting
>> from v5.8.
>
> The new capability is well suited for this feature. I'll make the
> change. Thanks.

I appreciate your cooperation. Thanks!

Alexei

2020-07-16 15:08:00

by Bjorn Helgaas

Subject: Re: [PATCH V3 1/3] PCI: Add defines for Designated Vendor-Specific Capability

On Wed, Jul 15, 2020 at 07:55:11PM -0700, Randy Dunlap wrote:
> On 7/13/20 11:23 PM, David E. Box wrote:
> > Add PCIe DVSEC extended capability ID and defines for the header offsets.
> > Defined in PCIe r5.0, sec 7.9.6.
> >
> > Signed-off-by: David E. Box <[email protected]>
> > Acked-by: Bjorn Helgaas <[email protected]>
> > ---
> > include/uapi/linux/pci_regs.h | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> > index f9701410d3b5..09daa9f07b6b 100644
> > --- a/include/uapi/linux/pci_regs.h
> > +++ b/include/uapi/linux/pci_regs.h
> > @@ -720,6 +720,7 @@
> > +#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
> > @@ -1062,6 +1063,10 @@
> > +/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
> > +#define PCI_DVSEC_HEADER1 0x4 /* Vendor-Specific Header1 */
> > +#define PCI_DVSEC_HEADER2 0x8 /* Vendor-Specific Header2 */
>
> Just a little comment: It would make more sense to me to
> s/DVSEC/DVSPEC/g.

Yeah, that is confusing, but "DVSEC" is the term used in the spec. I
think it stands for "Designated Vendor-Specific Extended Capability".

2020-07-16 15:09:03

by Randy Dunlap

Subject: Re: [PATCH V3 1/3] PCI: Add defines for Designated Vendor-Specific Capability

On 7/16/20 8:07 AM, Bjorn Helgaas wrote:
> On Wed, Jul 15, 2020 at 07:55:11PM -0700, Randy Dunlap wrote:
>> On 7/13/20 11:23 PM, David E. Box wrote:
>>> Add PCIe DVSEC extended capability ID and defines for the header offsets.
>>> Defined in PCIe r5.0, sec 7.9.6.
>>>
>>> Signed-off-by: David E. Box <[email protected]>
>>> Acked-by: Bjorn Helgaas <[email protected]>
>>> ---
>>> include/uapi/linux/pci_regs.h | 5 +++++
>>> 1 file changed, 5 insertions(+)
>>>
>>> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
>>> index f9701410d3b5..09daa9f07b6b 100644
>>> --- a/include/uapi/linux/pci_regs.h
>>> +++ b/include/uapi/linux/pci_regs.h
>>> @@ -720,6 +720,7 @@
>>> +#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
>>> @@ -1062,6 +1063,10 @@
>>> +/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
>>> +#define PCI_DVSEC_HEADER1 0x4 /* Vendor-Specific Header1 */
>>> +#define PCI_DVSEC_HEADER2 0x8 /* Vendor-Specific Header2 */
>>
>> Just a little comment: It would make more sense to me to
>> s/DVSEC/DVSPEC/g.
>
> Yeah, that is confusing, but "DVSEC" is the term used in the spec. I
> think it stands for "Designated Vendor-Specific Extended Capability".

Right. I noticed that after I sent the email.

thanks.
--
~Randy

2020-07-16 17:21:00

by Alexander Duyck

Subject: Re: [PATCH V3 1/3] PCI: Add defines for Designated Vendor-Specific Capability



On 7/15/2020 7:55 PM, Randy Dunlap wrote:
> On 7/13/20 11:23 PM, David E. Box wrote:
>> Add PCIe DVSEC extended capability ID and defines for the header offsets.
>> Defined in PCIe r5.0, sec 7.9.6.
>>
>> Signed-off-by: David E. Box <[email protected]>
>> Acked-by: Bjorn Helgaas <[email protected]>
>> ---
>> include/uapi/linux/pci_regs.h | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
>> index f9701410d3b5..09daa9f07b6b 100644
>> --- a/include/uapi/linux/pci_regs.h
>> +++ b/include/uapi/linux/pci_regs.h
>> @@ -720,6 +720,7 @@
>> +#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
>> @@ -1062,6 +1063,10 @@
>> +/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
>> +#define PCI_DVSEC_HEADER1 0x4 /* Vendor-Specific Header1 */
>> +#define PCI_DVSEC_HEADER2 0x8 /* Vendor-Specific Header2 */
>
> Just a little comment: It would make more sense to me to
> s/DVSEC/DVSPEC/g.
>
> But then I don't have the PCIe documentation.

Arguably some of the confusion might be from the patch title. DVSEC is an
acronym for Designated Vendor-Specific Extended Capability, if I recall
correctly. It would probably be best to call that out, since "extended"
implies it lives in the config space accessible via the memory mapped
config.

2020-07-16 18:34:19

by David E. Box

Subject: Re: [PATCH V3 1/3] PCI: Add defines for Designated Vendor-Specific Capability

On Thu, 2020-07-16 at 10:18 -0700, Alexander Duyck wrote:
>
> On 7/15/2020 7:55 PM, Randy Dunlap wrote:
> > On 7/13/20 11:23 PM, David E. Box wrote:
> > > Add PCIe DVSEC extended capability ID and defines for the header offsets.
> > > Defined in PCIe r5.0, sec 7.9.6.
> > >
> > > Signed-off-by: David E. Box <[email protected]>
> > > Acked-by: Bjorn Helgaas <[email protected]>
> > > ---
> > > include/uapi/linux/pci_regs.h | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> > > index f9701410d3b5..09daa9f07b6b 100644
> > > --- a/include/uapi/linux/pci_regs.h
> > > +++ b/include/uapi/linux/pci_regs.h
> > > @@ -720,6 +720,7 @@
> > > +#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
> > > @@ -1062,6 +1063,10 @@
> > > +/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
> > > +#define PCI_DVSEC_HEADER1 0x4 /* Vendor-Specific Header1 */
> > > +#define PCI_DVSEC_HEADER2 0x8 /* Vendor-Specific Header2 */

These comments I'll fix to say "Designated Vendor-Specific"

> >
> > Just a little comment: It would make more sense to me to
> > s/DVSEC/DVSPEC/g.
> >
> > But then I don't have the PCIe documentation.
>
> Arguably some of the confusion might be from the patch title. DVSEC is
> an acronym for Designated Vendor-Specific Extended Capability, if I
> recall correctly. It would probably be best to call that out, since
> "extended" implies it lives in the config space accessible via the
> memory mapped config.

I'll change the patch title as well, but agree DVSEC is better as it's
consistent with the spec.

Thanks

David

2020-07-17 19:06:02

by David E. Box

Subject: [PATCH V4 1/3] PCI: Add defines for Designated Vendor-Specific Extended Capability

Add PCIe Designated Vendor-Specific Extended Capability (DVSEC) and defines
for the header offsets. Defined in PCIe r5.0, sec 7.9.6.

Signed-off-by: David E. Box <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
---
include/uapi/linux/pci_regs.h | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index f9701410d3b5..beafeee39e44 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -720,6 +720,7 @@
#define PCI_EXT_CAP_ID_DPC 0x1D /* Downstream Port Containment */
#define PCI_EXT_CAP_ID_L1SS 0x1E /* L1 PM Substates */
#define PCI_EXT_CAP_ID_PTM 0x1F /* Precision Time Measurement */
+#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
#define PCI_EXT_CAP_ID_DLF 0x25 /* Data Link Feature */
#define PCI_EXT_CAP_ID_PL_16GT 0x26 /* Physical Layer 16.0 GT/s */
#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT
@@ -1062,6 +1063,10 @@
#define PCI_L1SS_CTL1_LTR_L12_TH_SCALE 0xe0000000 /* LTR_L1.2_THRESHOLD_Scale */
#define PCI_L1SS_CTL2 0x0c /* Control 2 Register */

+/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
+#define PCI_DVSEC_HEADER1 0x4 /* Designated Vendor-Specific Header1 */
+#define PCI_DVSEC_HEADER2 0x8 /* Designated Vendor-Specific Header2 */
+
/* Data Link Feature */
#define PCI_DLF_CAP 0x04 /* Capabilities Register */
#define PCI_DLF_EXCHANGE_ENABLE 0x80000000 /* Data Link Feature Exchange Enable */
--
2.20.1

2020-07-17 19:06:09

by David E. Box

Subject: [PATCH V4 0/3] Intel Platform Monitoring Technology

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring capabilities on a device.
With customers increasingly asking for hardware telemetry, engineers not
only have to figure out how to measure and collect data, but also how to
deliver it and make it discoverable. The latter may be through some device
specific method requiring device specific tools to collect the data. This
in turn requires customers to manage a suite of different tools in order to
collect the differing assortment of monitoring data on their systems. Even
when such information can be provided in kernel drivers, they may require
constant maintenance to update register mappings as they change with
firmware updates and new versions of hardware. PMT provides a solution for
discovering and reading telemetry from a device through a hardware agnostic
framework that allows for updates to systems without requiring patches to
the kernel or software tools.

PMT defines several capabilities to support collecting monitoring data from
hardware. All are discoverable as separate instances of the PCIe Designated
Vendor-Specific Extended Capability (DVSEC) with the Intel vendor code. The
DVSEC ID field uniquely identifies the capability. Each DVSEC also provides
a BAR offset to a header that defines capability-specific attributes,
including GUID, feature type, offset and length, as well as configuration
settings where applicable. The GUID uniquely identifies the register space
of any monitor data exposed by the capability. The GUID is associated with
an XML file from the vendor that describes the mapping of the register
space along with properties of the monitor data. This allows vendors to
perform firmware updates that can change the mapping (e.g. add new metrics)
without requiring any changes to drivers or software tools. The new mapping
is confirmed by an updated GUID, read from the hardware, which software
uses with a new XML.
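
As an illustration of the discovery step, the sketch below decodes the two
DVSEC header dwords found at PCI_DVSEC_HEADER1 (0x4) and PCI_DVSEC_HEADER2
(0x8) from the start of the capability. The field layout (vendor ID in bits
15:0, revision in 19:16 and length in 31:20 of header 1, DVSEC ID in bits
15:0 of header 2) follows the PCIe DVSEC definition. The struct and function
names are hypothetical, and reading the dwords from config space is left to
the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INTEL_VENDOR_ID 0x8086

/* Hypothetical container for the decoded DVSEC header fields. */
struct dvsec_info {
	uint16_t vendor;	/* DVSEC Vendor ID, header 1 bits 15:0 */
	uint8_t  rev;		/* DVSEC Revision, header 1 bits 19:16 */
	uint16_t len;		/* DVSEC Length in bytes, header 1 bits 31:20 */
	uint16_t id;		/* DVSEC ID, header 2 bits 15:0 */
};

/* Split the raw header dwords into their fields. */
struct dvsec_info dvsec_decode(uint32_t hdr1, uint32_t hdr2)
{
	struct dvsec_info d = {
		.vendor = hdr1 & 0xffff,
		.rev    = (hdr1 >> 16) & 0xf,
		.len    = (hdr1 >> 20) & 0xfff,
		.id     = hdr2 & 0xffff,
	};

	return d;
}

/* Only DVSEC instances carrying the Intel vendor code are PMT candidates. */
bool dvsec_is_intel(const struct dvsec_info *d)
{
	assert(d != NULL);
	return d->vendor == INTEL_VENDOR_ID;
}
```

A driver would walk the extended capability list for PCI_EXT_CAP_ID_DVSEC
(0x23), decode each instance this way, and then use the DVSEC ID to decide
which PMT capability the instance describes.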

The current capabilities defined by PMT are Telemetry, Watcher, and
Crashlog. The Telemetry capability provides access to a continuous block
of read-only data. The Watcher capability provides access to hardware
sampling and tracing features. Crashlog provides access to device crash
dumps. While there is some relationship between capabilities (Watcher can
be configured to sample from the Telemetry data set), each exists as a
standalone feature with no dependency on any other. The design therefore
splits them into individual, capability-specific drivers. MFD is used to
create platform devices for each capability so that they may be managed by
their own driver. The PMT architecture is (for the most part) agnostic to
the type of device it can collect from. Device nodes are consequently
generic in naming, e.g. /dev/telem<n> and /dev/smplr<n>. Each capability
driver creates a class to manage the list of devices supporting it.
Software can determine which devices support a PMT feature by searching
through each device node entry in the sysfs class folder. It can
additionally determine if a particular device supports a PMT feature by
checking for a PMT class folder in the device folder.
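
The class folder search mentioned above could look roughly like the sketch
below from userspace. It assumes the telemetry class directory is
/sys/class/pmt_telemetry and that entries are named telem<n>, as in the ABI
document added by this series. The function name is made up for the
example, and the directory is passed in so the code can be tried against
any path.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Print every entry in class_dir whose name starts with "telem" and
 * return how many were found, or -1 if the directory cannot be opened
 * (for example because no PMT telemetry driver is loaded).
 */
static int list_telem_devices(const char *class_dir)
{
	struct dirent *de;
	int count = 0;
	DIR *dir;

	assert(class_dir != NULL);

	dir = opendir(class_dir);
	if (!dir)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		if (strncmp(de->d_name, "telem", 5) != 0)
			continue;	/* skips "." and ".." as well */
		printf("%s/%s\n", class_dir, de->d_name);
		count++;
	}

	closedir(dir);
	return count;
}
```

Calling list_telem_devices("/sys/class/pmt_telemetry") on a system with the
driver loaded would be expected to print one line per telemetry endpoint.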

This patch set provides support for the PMT framework, along with support
for Telemetry on Tiger Lake.

Changes from V3:
- Write out full acronym for DVSEC in PCI patch commit message and
add 'Designated' to comments
- remove unused variable caught by kernel test robot <[email protected]>
- Add required Co-developed-by signoffs, noted by Andy
- Allow access using new CAP_PERFMON capability as suggested by
Alexey Budankov
- Fix spacing in Kconfig, noted by Randy
- Other style changes and fixups suggested by Andy

Changes from V2:
- In order to handle certain HW bugs from the telemetry capability
driver, create a single platform device per capability instead of
a device per entry. Add the entry data as device resources and
let the capability driver manage them as a set allowing for
cleaner HW bug resolution.
- Handle discovery table offset bug in intel_pmt.c
- Handle overlapping regions in intel_pmt_telemetry.c
- Add description of sysfs class to testing ABI.
- Don't check size and count until confirming support for the PMT
capability to avoid bailing out when we need to skip it.
- Remove unneeded header file. Move code to the intel_pmt.c, the
only place where it's needed.
- Remove now unused platform data.
- Add missing header files types.h, bits.h.
- Rename file name and build options from telem to telemetry.
- Code cleanup suggested by Andy S.
- x86 mailing list added.

Changes from V1:
- In the telemetry driver, set the device in device_create() to
the parent PCI device (the monitoring device) for clear
association in sysfs. Was set before to the platform device
created by the PCI parent.
- Move telem struct into driver and delete unneeded header file.
- Start telem device numbering from 0 instead of 1. 1 was used
due to anticipated changes, no longer needed.
- Use helper macros suggested by Andy S.
- Rename class to pmt_telemetry, spelling out full name
- Move monitor device name defines to common header
- Coding style, spelling, and Makefile/MAINTAINERS ordering fixes

David E. Box (3):
PCI: Add defines for Designated Vendor-Specific Extended Capability
mfd: Intel Platform Monitoring Technology support
platform/x86: Intel PMT Telemetry capability driver

.../ABI/testing/sysfs-class-pmt_telemetry | 46 ++
MAINTAINERS | 6 +
drivers/mfd/Kconfig | 10 +
drivers/mfd/Makefile | 1 +
drivers/mfd/intel_pmt.c | 215 +++++++++
drivers/platform/x86/Kconfig | 10 +
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_telemetry.c | 448 ++++++++++++++++++
include/uapi/linux/pci_regs.h | 5 +
9 files changed, 742 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-class-pmt_telemetry
create mode 100644 drivers/mfd/intel_pmt.c
create mode 100644 drivers/platform/x86/intel_pmt_telemetry.c

--
2.20.1

2020-07-17 19:06:37

by David E. Box

Subject: [PATCH V4 2/3] mfd: Intel Platform Monitoring Technology support

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring facilities. PMT supports
multiple types of monitoring capabilities. This driver creates platform
devices for each type so that they may be managed by capability specific
drivers (to be introduced). Capabilities are discovered using PCIe DVSEC
ids. Support is included for the 3 current capability types, Telemetry,
Watcher, and Crashlog. The features are available on new Intel platforms
starting from Tiger Lake for which support is added.

Also add a quirk mechanism for several early hardware differences and bugs.
For Tiger Lake, do not support Watcher and Crashlog capabilities since they
will not be compatible with future products. Also, use a quirk to fix the
discovery table offset.
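
For reference, the arithmetic behind the discovery table handling can be
sketched in plain C as below. The bit positions mirror the
INTEL_DVSEC_TABLE_BAR() and INTEL_DVSEC_TABLE_OFFSET() macros in this
patch, and the shift_quirk flag stands in for PMT_QUIRK_TABLE_SHIFT. The
struct and function names are invented for the example, and the BAR start
address is simply a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive physical address range of one discovery table entry. */
struct disc_range {
	uint64_t start;
	uint64_t end;
};

/* BAR index: bits 2:0 of the DVSEC table register. */
static unsigned int dvsec_table_bir(uint32_t table)
{
	return table & 0x7;
}

/*
 * Table offset: bits 31:3 of the register, used as a byte offset.
 * With the quirk set, the masked value is additionally shifted right
 * by 3, as done for early Tiger Lake hardware in this patch.
 */
static uint32_t dvsec_table_offset(uint32_t table, bool shift_quirk)
{
	uint32_t offset = table & 0xfffffff8u;

	return shift_quirk ? offset >> 3 : offset;
}

/* Entry sizes are in dwords, so each entry spans entry_size * 4 bytes. */
static struct disc_range disc_entry_range(uint64_t bar_start, uint32_t offset,
					  unsigned int entry_size_dwords,
					  unsigned int index)
{
	uint64_t bytes = (uint64_t)entry_size_dwords << 2;
	struct disc_range r;

	assert(entry_size_dwords > 0);

	r.start = bar_start + offset + index * bytes;
	r.end = r.start + bytes - 1;
	return r;
}
```

With a table value of 0x2001, for instance, the BAR index is 1 and the
offset is 0x2000 without the quirk or 0x400 with it.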

Reviewed-by: Andy Shevchenko <[email protected]>
Co-developed-by: Alexander Duyck <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: David E. Box <[email protected]>
---
MAINTAINERS | 5 +
drivers/mfd/Kconfig | 10 ++
drivers/mfd/Makefile | 1 +
drivers/mfd/intel_pmt.c | 215 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 231 insertions(+)
create mode 100644 drivers/mfd/intel_pmt.c

diff --git a/MAINTAINERS b/MAINTAINERS
index b4a43a9e7fbc..2e42bf0c41ab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8845,6 +8845,11 @@ F: drivers/mfd/intel_soc_pmic*
F: include/linux/mfd/intel_msic.h
F: include/linux/mfd/intel_soc_pmic*

+INTEL PMT DRIVER
+M: "David E. Box" <[email protected]>
+S: Maintained
+F: drivers/mfd/intel_pmt.c
+
INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
M: Stanislav Yakovlev <[email protected]>
L: [email protected]
diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index a37d7d171382..1a62ce2c68d9 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -670,6 +670,16 @@ config MFD_INTEL_PMC_BXT
Register and P-unit access. In addition this creates devices
for iTCO watchdog and telemetry that are part of the PMC.

+config MFD_INTEL_PMT
+ tristate "Intel Platform Monitoring Technology support"
+ depends on PCI
+ select MFD_CORE
+ help
+ The Intel Platform Monitoring Technology (PMT) is an interface that
+ provides access to hardware monitor registers. This driver supports
+ Telemetry, Watcher, and Crashlog PMT capabilities/devices for
+ platforms starting from Tiger Lake.
+
config MFD_IPAQ_MICRO
bool "Atmel Micro ASIC (iPAQ h3100/h3600/h3700) Support"
depends on SA1100_H3100 || SA1100_H3600
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 9367a92f795a..1961b4737985 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -216,6 +216,7 @@ obj-$(CONFIG_MFD_INTEL_LPSS_PCI) += intel-lpss-pci.o
obj-$(CONFIG_MFD_INTEL_LPSS_ACPI) += intel-lpss-acpi.o
obj-$(CONFIG_MFD_INTEL_MSIC) += intel_msic.o
obj-$(CONFIG_MFD_INTEL_PMC_BXT) += intel_pmc_bxt.o
+obj-$(CONFIG_MFD_INTEL_PMT) += intel_pmt.o
obj-$(CONFIG_MFD_PALMAS) += palmas.o
obj-$(CONFIG_MFD_VIPERBOARD) += viperboard.o
obj-$(CONFIG_MFD_RC5T583) += rc5t583.o rc5t583-irq.o
diff --git a/drivers/mfd/intel_pmt.c b/drivers/mfd/intel_pmt.c
new file mode 100644
index 000000000000..6857eaf4ff86
--- /dev/null
+++ b/drivers/mfd/intel_pmt.c
@@ -0,0 +1,215 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitoring Technology MFD driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Authors: David E. Box <[email protected]>
+ */
+
+#include <linux/bits.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/mfd/core.h>
+#include <linux/types.h>
+
+/* Intel DVSEC capability vendor space offsets */
+#define INTEL_DVSEC_ENTRIES 0xA
+#define INTEL_DVSEC_SIZE 0xB
+#define INTEL_DVSEC_TABLE 0xC
+#define INTEL_DVSEC_TABLE_BAR(x) ((x) & GENMASK(2, 0))
+#define INTEL_DVSEC_TABLE_OFFSET(x) ((x) & GENMASK(31, 3))
+#define INTEL_DVSEC_ENTRY_SIZE 4
+
+/* PMT capabilities */
+#define DVSEC_INTEL_ID_TELEMETRY 2
+#define DVSEC_INTEL_ID_WATCHER 3
+#define DVSEC_INTEL_ID_CRASHLOG 4
+
+#define TELEMETRY_DEV_NAME "pmt_telemetry"
+#define WATCHER_DEV_NAME "pmt_watcher"
+#define CRASHLOG_DEV_NAME "pmt_crashlog"
+
+struct intel_dvsec_header {
+ u16 length;
+ u16 id;
+ u8 num_entries;
+ u8 entry_size;
+ u8 tbir;
+ u32 offset;
+};
+
+enum pmt_quirks {
+ /* Watcher capability not supported */
+ PMT_QUIRK_NO_WATCHER = BIT(0),
+
+ /* Crashlog capability not supported */
+ PMT_QUIRK_NO_CRASHLOG = BIT(1),
+
+ /* Use shift instead of mask to read discovery table offset */
+ PMT_QUIRK_TABLE_SHIFT = BIT(2),
+};
+
+struct pmt_platform_info {
+ unsigned long quirks;
+};
+
+static const struct pmt_platform_info tgl_info = {
+ .quirks = PMT_QUIRK_NO_WATCHER | PMT_QUIRK_NO_CRASHLOG |
+ PMT_QUIRK_TABLE_SHIFT,
+};
+
+static int
+pmt_add_dev(struct pci_dev *pdev, struct intel_dvsec_header *header,
+ struct pmt_platform_info *info)
+{
+ struct device *dev = &pdev->dev;
+ struct resource *res, *tmp;
+ struct mfd_cell *cell;
+ const char *name;
+ int count = header->num_entries;
+ int size = header->entry_size;
+ int i;
+
+ switch (header->id) {
+ case DVSEC_INTEL_ID_TELEMETRY:
+ name = TELEMETRY_DEV_NAME;
+ break;
+ case DVSEC_INTEL_ID_WATCHER:
+ if (info->quirks & PMT_QUIRK_NO_WATCHER) {
+ dev_info(dev, "Watcher not supported\n");
+ return 0;
+ }
+ name = WATCHER_DEV_NAME;
+ break;
+ case DVSEC_INTEL_ID_CRASHLOG:
+ if (info->quirks & PMT_QUIRK_NO_CRASHLOG) {
+ dev_info(dev, "Crashlog not supported\n");
+ return 0;
+ }
+ name = CRASHLOG_DEV_NAME;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (!header->num_entries || !header->entry_size) {
+ dev_warn(dev, "Invalid count or size for %s header\n", name);
+ return -EINVAL;
+ }
+
+ cell = devm_kzalloc(dev, sizeof(*cell), GFP_KERNEL);
+ if (!cell)
+ return -ENOMEM;
+
+ res = devm_kcalloc(dev, count, sizeof(*res), GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ if (info->quirks & PMT_QUIRK_TABLE_SHIFT)
+ header->offset >>= 3;
+
+ for (i = 0, tmp = res; i < count; i++, tmp++) {
+ tmp->start = pdev->resource[header->tbir].start +
+ header->offset + i * (size << 2);
+ tmp->end = tmp->start + (size << 2) - 1;
+ tmp->flags = IORESOURCE_MEM;
+ }
+
+ cell->resources = res;
+ cell->num_resources = count;
+ cell->name = name;
+
+ return devm_mfd_add_devices(dev, PLATFORM_DEVID_AUTO, cell, 1, NULL, 0,
+ NULL);
+}
+
+static int
+pmt_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct intel_dvsec_header header;
+ struct pmt_platform_info *info;
+ bool found_devices = false;
+ int ret, pos = 0;
+ u32 table;
+ u16 vid;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ info = devm_kmemdup(&pdev->dev, (void *)id->driver_data, sizeof(*info),
+ GFP_KERNEL);
+ if (!info)
+ return -ENOMEM;
+
+ pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC);
+ while (pos) {
+ pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vid);
+ if (vid != PCI_VENDOR_ID_INTEL)
+ continue;
+
+ pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER2,
+ &header.id);
+ pci_read_config_byte(pdev, pos + INTEL_DVSEC_ENTRIES,
+ &header.num_entries);
+ pci_read_config_byte(pdev, pos + INTEL_DVSEC_SIZE,
+ &header.entry_size);
+ pci_read_config_dword(pdev, pos + INTEL_DVSEC_TABLE,
+ &table);
+
+ header.tbir = INTEL_DVSEC_TABLE_BAR(table);
+ header.offset = INTEL_DVSEC_TABLE_OFFSET(table);
+
+ ret = pmt_add_dev(pdev, &header, info);
+ if (ret)
+ dev_warn(&pdev->dev,
+ "Failed to add devices for DVSEC id %d\n",
+ header.id);
+ found_devices = true;
+
+ pos = pci_find_next_ext_capability(pdev, pos,
+ PCI_EXT_CAP_ID_DVSEC);
+ }
+
+ if (!found_devices) {
+ dev_err(&pdev->dev, "No supported PMT capabilities found.\n");
+ return -ENODEV;
+ }
+
+ pm_runtime_put(&pdev->dev);
+ pm_runtime_allow(&pdev->dev);
+
+ return 0;
+}
+
+static void pmt_pci_remove(struct pci_dev *pdev)
+{
+ pm_runtime_forbid(&pdev->dev);
+ pm_runtime_get_sync(&pdev->dev);
+}
+
+#define PCI_DEVICE_ID_INTEL_PMT_TGL 0x9a0d
+
+static const struct pci_device_id pmt_pci_ids[] = {
+ { PCI_DEVICE_DATA(INTEL, PMT_TGL, &tgl_info) },
+ { }
+};
+MODULE_DEVICE_TABLE(pci, pmt_pci_ids);
+
+static struct pci_driver pmt_pci_driver = {
+ .name = "intel-pmt",
+ .id_table = pmt_pci_ids,
+ .probe = pmt_pci_probe,
+ .remove = pmt_pci_remove,
+};
+module_pci_driver(pmt_pci_driver);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel Platform Monitoring Technology MFD driver");
+MODULE_LICENSE("GPL v2");
--
2.20.1

2020-07-17 19:06:57

by David E. Box

Subject: [PATCH V4 3/3] platform/x86: Intel PMT Telemetry capability driver

PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
The Telemetry capability provides access to device telemetry metrics that
provide hardware performance data to users from continuous, memory mapped,
read-only register spaces.

Register mappings are not provided by the driver. Instead, a GUID is read
from a header for each endpoint. The GUID identifies the device and is to
be used with an XML file, provided by the vendor, to discover the available
set of metrics and their register mapping. This allows firmware updates to
modify the register space without needing to update the driver every time
with new mappings. Firmware writes a new GUID in this case to specify the
new mapping. Software tools with access to the associated XML file can
then interpret the changes.

The module manages access to all PMT Telemetry endpoints on a system,
independent of the device exporting them. It creates a pmt_telemetry class
to manage the devices. For each telemetry endpoint, sysfs files provide
GUID and size information as well as a pointer to the parent device the
telemetry came from. Software may discover the association between
endpoints and devices by iterating through the list in sysfs, or by looking
for the existence of the class folder under the device of interest. A
device node of the same name allows software to then map the telemetry
space for direct access.
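
A userspace consumer might use these files roughly as sketched below. This
is only an illustration: the helper names are invented, and it assumes the
guid, size and offset attributes and the /dev/telem<x> node behave as
described in the ABI document in this patch. Because the driver maps whole
pages, the sketch maps offset + size bytes and reads the data starting at
offset within the mapping. The node is opened read-only and mapped with
MAP_SHARED, since the driver rejects mappings that could be made writable.
Real hardware may also have access width requirements that a byte-wise copy
does not honor.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Read one number ("123" or "0x7b") from the file dir/attr. */
static int read_attr_ulong(const char *dir, const char *attr,
			   unsigned long *val)
{
	char path[512], buf[64], *end;
	FILE *f;
	int n;

	assert(dir != NULL && attr != NULL && val != NULL);

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	*val = strtoul(buf, &end, 0);
	return (end == buf || errno) ? -1 : 0;
}

/* Copy size bytes found offset bytes into a read-only mapping of node. */
static int telem_snapshot(const char *node, unsigned long offset,
			  unsigned long size, uint8_t *out)
{
	const volatile uint8_t *map;
	unsigned long i;
	int fd;

	assert(node != NULL && out != NULL);

	fd = open(node, O_RDONLY);
	if (fd < 0)
		return -1;

	map = mmap(NULL, offset + size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after close */
	if (map == MAP_FAILED)
		return -1;

	for (i = 0; i < size; i++)
		out[i] = map[offset + i];

	munmap((void *)map, offset + size);
	return 0;
}
```

A tool would typically read guid first to pick the matching XML file, then
read size and offset and call telem_snapshot() on the corresponding device
node.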

Also create a PCI device id list for early telemetry hardware that requires
workarounds for known issues.

Reviewed-by: Andy Shevchenko <[email protected]>
Co-developed-by: Alexander Duyck <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: David E. Box <[email protected]>
---
.../ABI/testing/sysfs-class-pmt_telemetry | 46 ++
MAINTAINERS | 1 +
drivers/platform/x86/Kconfig | 10 +
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_telemetry.c | 448 ++++++++++++++++++
5 files changed, 506 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-class-pmt_telemetry
create mode 100644 drivers/platform/x86/intel_pmt_telemetry.c

diff --git a/Documentation/ABI/testing/sysfs-class-pmt_telemetry b/Documentation/ABI/testing/sysfs-class-pmt_telemetry
new file mode 100644
index 000000000000..b0b096db9cae
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-pmt_telemetry
@@ -0,0 +1,46 @@
+What: /sys/class/pmt_telemetry/
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ The pmt_telemetry/ class directory contains information for
+ devices that expose hardware telemetry using Intel Platform
+ Monitoring Technology (PMT).
+
+What: /sys/class/pmt_telemetry/telem<x>
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ The telem<x> directory contains files describing an instance of
+ a PMT telemetry device that exposes hardware telemetry. Each
+ telem<x> directory has an associated /dev/telem<x> node. This
+ node may be opened and mapped to access the telemetry space of
+ the device. The register layout of the telemetry space is
+ determined from an XML file that matches the PCI device id and
+ GUID for the device.
+
+What: /sys/class/pmt_telemetry/telem<x>/guid
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ (RO) The GUID for this telemetry device. The GUID identifies
+ the version of the XML file for the parent device that is to
+ be used to get the register layout.
+
+What: /sys/class/pmt_telemetry/telem<x>/size
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ (RO) The size of telemetry region in bytes that corresponds to
+ the mapping size for the /dev/telem<x> device node.
+
+What: /sys/class/pmt_telemetry/telem<x>/offset
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ (RO) The offset of telemetry region in bytes that corresponds to
+ the mapping for the /dev/telem<x> device node.
diff --git a/MAINTAINERS b/MAINTAINERS
index 2e42bf0c41ab..ebc145894abd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8849,6 +8849,7 @@ INTEL PMT DRIVER
M: "David E. Box" <[email protected]>
S: Maintained
F: drivers/mfd/intel_pmt.c
+F: drivers/platform/x86/intel_pmt_*

INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
M: Stanislav Yakovlev <[email protected]>
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 0581a54cf562..8552b094d005 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -1339,6 +1339,16 @@ config INTEL_PMC_CORE
- LTR Ignore
- MPHY/PLL gating status (Sunrisepoint PCH only)

+config INTEL_PMT_TELEMETRY
+ tristate "Intel Platform Monitoring Technology (PMT) Telemetry driver"
+ help
+ The Intel Platform Monitoring Technology (PMT) Telemetry driver provides
+ access to hardware telemetry metrics on devices that support the
+ feature.
+
+ For more information, see
+ <file:Documentation/ABI/testing/sysfs-class-pmt_telemetry>
+
config INTEL_PUNIT_IPC
tristate "Intel P-Unit IPC Driver"
help
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 2b85852a1a87..95cd3d0be17f 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -139,6 +139,7 @@ obj-$(CONFIG_INTEL_MFLD_THERMAL) += intel_mid_thermal.o
obj-$(CONFIG_INTEL_MID_POWER_BUTTON) += intel_mid_powerbtn.o
obj-$(CONFIG_INTEL_MRFLD_PWRBTN) += intel_mrfld_pwrbtn.o
obj-$(CONFIG_INTEL_PMC_CORE) += intel_pmc_core.o intel_pmc_core_pltdrv.o
+obj-$(CONFIG_INTEL_PMT_TELEMETRY) += intel_pmt_telemetry.o
obj-$(CONFIG_INTEL_PUNIT_IPC) += intel_punit_ipc.o
obj-$(CONFIG_INTEL_SCU_IPC) += intel_scu_ipc.o
obj-$(CONFIG_INTEL_SCU_PCI) += intel_scu_pcidrv.o
diff --git a/drivers/platform/x86/intel_pmt_telemetry.c b/drivers/platform/x86/intel_pmt_telemetry.c
new file mode 100644
index 000000000000..544a84d72cf7
--- /dev/null
+++ b/drivers/platform/x86/intel_pmt_telemetry.c
@@ -0,0 +1,448 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitoring Technology Telemetry driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Author: "David E. Box" <[email protected]>
+ */
+
+#include <linux/bits.h>
+#include <linux/cdev.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/xarray.h>
+
+#define TELEM_DEV_NAME "pmt_telemetry"
+
+/* Telemetry access types */
+#define TELEM_ACCESS_FUTURE 1
+#define TELEM_ACCESS_BARID 2
+#define TELEM_ACCESS_LOCAL 3
+
+#define TELEM_GUID_OFFSET 0x4
+#define TELEM_BASE_OFFSET 0x8
+#define TELEM_TBIR_MASK GENMASK(2, 0)
+#define TELEM_ACCESS(v) ((v) & GENMASK(3, 0))
+#define TELEM_TYPE(v) (((v) & GENMASK(7, 4)) >> 4)
+/* size is in bytes */
+#define TELEM_SIZE(v) (((v) & GENMASK(27, 12)) >> 10)
+
+#define TELEM_XA_START 0
+#define TELEM_XA_MAX INT_MAX
+#define TELEM_XA_LIMIT XA_LIMIT(TELEM_XA_START, TELEM_XA_MAX)
+
+/* Used by client hardware to identify a fixed telemetry entry */
+#define TELEM_CLIENT_FIXED_BLOCK_GUID 0x10000000
+
+static DEFINE_XARRAY_ALLOC(telem_array);
+
+struct pmt_telem_priv;
+
+struct telem_header {
+ u8 access_type;
+ u8 telem_type;
+ u16 size;
+ u32 guid;
+ u32 base_offset;
+ u8 tbir;
+};
+
+struct pmt_telem_entry {
+ struct pmt_telem_priv *priv;
+ struct telem_header header;
+ struct resource *header_res;
+ unsigned long base_addr;
+ void __iomem *disc_table;
+ struct cdev cdev;
+ dev_t devt;
+ int devid;
+};
+
+struct pmt_telem_priv {
+ struct pmt_telem_entry *entry;
+ int num_entries;
+ struct device *dev;
+};
+
+/*
+ * devfs
+ */
+static int pmt_telem_open(struct inode *inode, struct file *filp)
+{
+ struct pmt_telem_priv *priv;
+ struct pmt_telem_entry *entry;
+ struct pci_driver *pci_drv;
+ struct pci_dev *pci_dev;
+
+ if (!perfmon_capable())
+ return -EPERM;
+
+ entry = container_of(inode->i_cdev, struct pmt_telem_entry, cdev);
+ priv = entry->priv;
+ pci_dev = to_pci_dev(priv->dev->parent);
+
+ pci_drv = pci_dev_driver(pci_dev);
+ if (!pci_drv)
+ return -ENODEV;
+
+ filp->private_data = entry;
+ get_device(&pci_dev->dev);
+
+ if (!try_module_get(pci_drv->driver.owner)) {
+ put_device(&pci_dev->dev);
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+static int pmt_telem_release(struct inode *inode, struct file *filp)
+{
+ struct pmt_telem_entry *entry = filp->private_data;
+ struct pci_dev *pci_dev = to_pci_dev(entry->priv->dev->parent);
+ struct pci_driver *pci_drv = pci_dev_driver(pci_dev);
+
+ put_device(&pci_dev->dev);
+ module_put(pci_drv->driver.owner);
+
+ return 0;
+}
+
+static int pmt_telem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+ struct pmt_telem_entry *entry = filp->private_data;
+ struct pmt_telem_priv *priv;
+ unsigned long vsize = vma->vm_end - vma->vm_start;
+ unsigned long phys = entry->base_addr;
+ unsigned long pfn = PFN_DOWN(phys);
+ unsigned long psize;
+
+ priv = entry->priv;
+ psize = (PFN_UP(entry->base_addr + entry->header.size) - pfn) * PAGE_SIZE;
+ if (vsize > psize) {
+ dev_err(priv->dev, "Requested mmap size is too large\n");
+ return -EINVAL;
+ }
+
+ if ((vma->vm_flags & VM_WRITE) || (vma->vm_flags & VM_MAYWRITE))
+ return -EPERM;
+
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+ if (io_remap_pfn_range(vma, vma->vm_start, pfn, vsize,
+ vma->vm_page_prot))
+ return -EINVAL;
+
+ return 0;
+}
+
+static const struct file_operations pmt_telem_fops = {
+ .owner = THIS_MODULE,
+ .open = pmt_telem_open,
+ .mmap = pmt_telem_mmap,
+ .release = pmt_telem_release,
+};
+
+/*
+ * sysfs
+ */
+static ssize_t guid_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct pmt_telem_entry *entry = dev_get_drvdata(dev);
+
+ return sprintf(buf, "0x%x\n", entry->header.guid);
+}
+static DEVICE_ATTR_RO(guid);
+
+static ssize_t size_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_entry *entry = dev_get_drvdata(dev);
+
+ /* Display buffer size in bytes */
+ return sprintf(buf, "%u\n", entry->header.size);
+}
+static DEVICE_ATTR_RO(size);
+
+static ssize_t offset_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct pmt_telem_entry *entry = dev_get_drvdata(dev);
+
+ /* Display buffer offset in bytes */
+ return sprintf(buf, "%lu\n", offset_in_page(entry->base_addr));
+}
+static DEVICE_ATTR_RO(offset);
+
+static struct attribute *pmt_telem_attrs[] = {
+ &dev_attr_guid.attr,
+ &dev_attr_size.attr,
+ &dev_attr_offset.attr,
+ NULL
+};
+ATTRIBUTE_GROUPS(pmt_telem);
+
+static struct class pmt_telem_class = {
+ .owner = THIS_MODULE,
+ .name = "pmt_telemetry",
+ .dev_groups = pmt_telem_groups,
+};
+
+/*
+ * driver initialization
+ */
+static const struct pci_device_id pmt_telem_early_client_pci_ids[] = {
+ { PCI_VDEVICE(INTEL, 0x9a0d) }, /* TGL */
+ { }
+};
+
+static bool pmt_telem_is_early_client_hw(struct device *dev)
+{
+ struct pci_dev *parent = to_pci_dev(dev->parent);
+
+ return !!pci_match_id(pmt_telem_early_client_pci_ids, parent);
+}
+
+static int pmt_telem_create_dev(struct pmt_telem_priv *priv,
+ struct pmt_telem_entry *entry)
+{
+ struct pci_dev *pci_dev;
+ struct device *dev;
+ int ret;
+
+ cdev_init(&entry->cdev, &pmt_telem_fops);
+ ret = cdev_add(&entry->cdev, entry->devt, 1);
+ if (ret) {
+ dev_err(priv->dev, "Could not add char dev\n");
+ return ret;
+ }
+
+ pci_dev = to_pci_dev(priv->dev->parent);
+ dev = device_create(&pmt_telem_class, &pci_dev->dev, entry->devt,
+ entry, "telem%d", entry->devid);
+ if (IS_ERR(dev)) {
+ dev_err(priv->dev, "Could not create device node\n");
+ cdev_del(&entry->cdev);
+ }
+
+ return PTR_ERR_OR_ZERO(dev);
+}
+
+static void pmt_telem_populate_header(void __iomem *disc_offset,
+ struct telem_header *header)
+{
+ header->access_type = TELEM_ACCESS(readb(disc_offset));
+ header->telem_type = TELEM_TYPE(readb(disc_offset));
+ header->size = TELEM_SIZE(readl(disc_offset));
+ header->guid = readl(disc_offset + TELEM_GUID_OFFSET);
+ header->base_offset = readl(disc_offset + TELEM_BASE_OFFSET);
+
+ /*
+ * For non-local access types the lower 3 bits of base offset
+ * contains the index of the base address register where the
+ * telemetry can be found.
+ */
+ header->tbir = header->base_offset & TELEM_TBIR_MASK;
+ header->base_offset ^= header->tbir;
+}
+
+static int pmt_telem_add_entry(struct pmt_telem_priv *priv,
+ struct pmt_telem_entry *entry)
+{
+ struct resource *res = entry->header_res;
+ struct pci_dev *pci_dev = to_pci_dev(priv->dev->parent);
+ int ret;
+
+ pmt_telem_populate_header(entry->disc_table, &entry->header);
+
+ /* Local access and BARID only for now */
+ switch (entry->header.access_type) {
+ case TELEM_ACCESS_LOCAL:
+ if (entry->header.tbir) {
+ dev_err(priv->dev,
+ "Unsupported BAR index %d for access type %d\n",
+ entry->header.tbir, entry->header.access_type);
+ return -EINVAL;
+ }
+
+ /*
+ * For access_type LOCAL, the base address is as follows:
+ * base address = header address + header length + base offset
+ */
+ entry->base_addr = res->start + resource_size(res) +
+ entry->header.base_offset;
+ break;
+
+ case TELEM_ACCESS_BARID:
+ entry->base_addr = pci_dev->resource[entry->header.tbir].start +
+ entry->header.base_offset;
+ break;
+
+ default:
+ dev_err(priv->dev, "Unsupported access type %d\n",
+ entry->header.access_type);
+ return -EINVAL;
+ }
+
+ ret = alloc_chrdev_region(&entry->devt, 0, 1, TELEM_DEV_NAME);
+ if (ret) {
+ dev_err(priv->dev,
+ "PMT telemetry chrdev_region error: %d\n", ret);
+ return ret;
+ }
+
+ ret = xa_alloc(&telem_array, &entry->devid, entry, TELEM_XA_LIMIT,
+ GFP_KERNEL);
+ if (ret)
+ goto fail_xa_alloc;
+
+ ret = pmt_telem_create_dev(priv, entry);
+ if (ret)
+ goto fail_create_dev;
+
+ entry->priv = priv;
+ priv->num_entries++;
+ return 0;
+
+fail_create_dev:
+ xa_erase(&telem_array, entry->devid);
+fail_xa_alloc:
+ unregister_chrdev_region(entry->devt, 1);
+
+ return ret;
+}
+
+static bool pmt_telem_region_overlaps(struct platform_device *pdev,
+ void __iomem *disc_table)
+{
+ u32 guid;
+
+ guid = readl(disc_table + TELEM_GUID_OFFSET);
+
+ return guid == TELEM_CLIENT_FIXED_BLOCK_GUID;
+}
+
+static void pmt_telem_remove_entries(struct pmt_telem_priv *priv)
+{
+ int i;
+
+ for (i = 0; i < priv->num_entries; i++) {
+ device_destroy(&pmt_telem_class, priv->entry[i].devt);
+ cdev_del(&priv->entry[i].cdev);
+ xa_erase(&telem_array, priv->entry[i].devid);
+ unregister_chrdev_region(priv->entry[i].devt, 1);
+ }
+}
+
+static int pmt_telem_probe(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv;
+ struct pmt_telem_entry *entry;
+ bool early_hw = false;
+ int i;
+
+ priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, priv);
+ priv->dev = &pdev->dev;
+
+ priv->entry = devm_kcalloc(&pdev->dev, pdev->num_resources,
+ sizeof(struct pmt_telem_entry), GFP_KERNEL);
+ if (!priv->entry)
+ return -ENOMEM;
+
+ if (pmt_telem_is_early_client_hw(&pdev->dev))
+ early_hw = true;
+
+ for (i = 0, entry = priv->entry; i < pdev->num_resources;
+ i++, entry++) {
+ int ret;
+
+ entry->header_res = platform_get_resource(pdev, IORESOURCE_MEM, i);
+ if (!entry->header_res) {
+ pmt_telem_remove_entries(priv);
+ return -ENODEV;
+ }
+
+ entry->disc_table = devm_platform_ioremap_resource(pdev, i);
+ if (IS_ERR(entry->disc_table)) {
+ pmt_telem_remove_entries(priv);
+ return PTR_ERR(entry->disc_table);
+ }
+
+ if (pmt_telem_region_overlaps(pdev, entry->disc_table) &&
+ early_hw)
+ continue;
+
+ ret = pmt_telem_add_entry(priv, entry);
+ if (ret) {
+ pmt_telem_remove_entries(priv);
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+static int pmt_telem_remove(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv = platform_get_drvdata(pdev);
+
+ pmt_telem_remove_entries(priv);
+
+ return 0;
+}
+
+static const struct platform_device_id pmt_telem_table[] = {
+ {
+ .name = "pmt_telemetry",
+ },
+ {}
+};
+MODULE_DEVICE_TABLE(platform, pmt_telem_table);
+
+static struct platform_driver pmt_telem_driver = {
+ .driver = {
+ .name = TELEM_DEV_NAME,
+ },
+ .probe = pmt_telem_probe,
+ .remove = pmt_telem_remove,
+ .id_table = pmt_telem_table,
+};
+
+static int __init pmt_telem_init(void)
+{
+ int ret = class_register(&pmt_telem_class);
+
+ if (ret)
+ return ret;
+
+ ret = platform_driver_register(&pmt_telem_driver);
+ if (ret)
+ class_unregister(&pmt_telem_class);
+
+ return ret;
+}
+module_init(pmt_telem_init);
+
+static void __exit pmt_telem_exit(void)
+{
+ platform_driver_unregister(&pmt_telem_driver);
+ class_unregister(&pmt_telem_class);
+ xa_destroy(&telem_array);
+}
+module_exit(pmt_telem_exit);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel PMT Telemetry driver");
+MODULE_ALIAS("platform:" TELEM_DEV_NAME);
+MODULE_LICENSE("GPL v2");
--
2.20.1

2020-07-17 20:14:36

by Andy Shevchenko

Subject: Re: [PATCH V4 1/3] PCI: Add defines for Designated Vendor-Specific Extended Capability

On Fri, Jul 17, 2020 at 10:05 PM David E. Box
<[email protected]> wrote:
>
> Add PCIe Designated Vendor-Specific Extended Capability (DVSEC) and defines
> for the header offsets. Defined in PCIe r5.0, sec 7.9.6.
>

FWIW,
Reviewed-by: Andy Shevchenko <[email protected]>

> Signed-off-by: David E. Box <[email protected]>
> Acked-by: Bjorn Helgaas <[email protected]>
> ---
> include/uapi/linux/pci_regs.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index f9701410d3b5..beafeee39e44 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -720,6 +720,7 @@
> #define PCI_EXT_CAP_ID_DPC 0x1D /* Downstream Port Containment */
> #define PCI_EXT_CAP_ID_L1SS 0x1E /* L1 PM Substates */
> #define PCI_EXT_CAP_ID_PTM 0x1F /* Precision Time Measurement */
> +#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
> #define PCI_EXT_CAP_ID_DLF 0x25 /* Data Link Feature */
> #define PCI_EXT_CAP_ID_PL_16GT 0x26 /* Physical Layer 16.0 GT/s */
> #define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT
> @@ -1062,6 +1063,10 @@
> #define PCI_L1SS_CTL1_LTR_L12_TH_SCALE 0xe0000000 /* LTR_L1.2_THRESHOLD_Scale */
> #define PCI_L1SS_CTL2 0x0c /* Control 2 Register */
>
> +/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
> +#define PCI_DVSEC_HEADER1 0x4 /* Designated Vendor-Specific Header1 */
> +#define PCI_DVSEC_HEADER2 0x8 /* Designated Vendor-Specific Header2 */
> +
> /* Data Link Feature */
> #define PCI_DLF_CAP 0x04 /* Capabilities Register */
> #define PCI_DLF_EXCHANGE_ENABLE 0x80000000 /* Data Link Feature Exchange Enable */
> --
> 2.20.1
>


--
With Best Regards,
Andy Shevchenko

2020-07-27 11:00:40

by Andy Shevchenko

Subject: Re: [PATCH V4 0/3] Intel Platform Monitoring Technology

On Fri, Jul 17, 2020 at 10:05 PM David E. Box
<[email protected]> wrote:
>
> Intel Platform Monitoring Technology (PMT) is an architecture for
> enumerating and accessing hardware monitoring capabilities on a device.
> With customers increasingly asking for hardware telemetry, engineers not
> only have to figure out how to measure and collect data, but also how to
> deliver it and make it discoverable. The latter may be through some device
> specific method requiring device specific tools to collect the data. This
> in turn requires customers to manage a suite of different tools in order to
> collect the differing assortment of monitoring data on their systems. Even
> when such information can be provided in kernel drivers, they may require
> constant maintenance to update register mappings as they change with
> firmware updates and new versions of hardware. PMT provides a solution for
> discovering and reading telemetry from a device through a hardware agnostic
> framework that allows for updates to systems without requiring patches to
> the kernel or software tools.
>
> PMT defines several capabilities to support collecting monitoring data from
> hardware. All are discoverable as separate instances of the PCIE Designated
> Vendor extended capability (DVSEC) with the Intel vendor code. The DVSEC ID
> field uniquely identifies the capability. Each DVSEC also provides a BAR
> offset to a header that defines capability-specific attributes, including
> GUID, feature type, offset and length, as well as configuration settings
> where applicable. The GUID uniquely identifies the register space of any
> monitor data exposed by the capability. The GUID is associated with an XML
> file from the vendor that describes the mapping of the register space along
> with properties of the monitor data. This allows vendors to perform
> firmware updates that can change the mapping (e.g. add new metrics) without
> requiring any changes to drivers or software tools. The new mapping is
> confirmed by an updated GUID, read from the hardware, which software uses
> with a new XML.
>
> The current capabilities defined by PMT are Telemetry, Watcher, and
> Crashlog. The Telemetry capability provides access to a continuous block
> of read only data. The Watcher capability provides access to hardware
> sampling and tracing features. Crashlog provides access to device crash
> dumps. While there is some relationship between capabilities (Watcher can
> be configured to sample from the Telemetry data set) each exists as stand
> alone features with no dependency on any other. The design therefore splits
> them into individual, capability specific drivers. MFD is used to create
> platform devices for each capability so that they may be managed by their
> own driver. The PMT architecture is (for the most part) agnostic to the
> type of device it can collect from. Device nodes are consequently generic
> in naming, e.g. /dev/telem<n> and /dev/smplr<n>. Each capability driver
> creates a class to manage the list of devices supporting it. Software can
> determine which devices support a PMT feature by searching through each
> device node entry in the sysfs class folder. It can additionally determine
> if a particular device supports a PMT feature by checking for a PMT class
> folder in the device folder.
>
> This patch set provides support for the PMT framework, along with support
> for Telemetry on Tiger Lake.
>

I assume this goes thru MFD tree.

> Changes from V3:
> - Write out full acronym for DVSEC in PCI patch commit message and
> add 'Designated' to comments
> - remove unused variable caught by kernel test robot <[email protected]>
> - Add required Co-developed-by signoffs, noted by Andy
> - Allow access using new CAP_PERFMON capability as suggested by
> Alexey Budankov
> - Fix spacing in Kconfig, noted by Randy
> - Other style changes and fixups suggested by Andy
>
> Changes from V2:
> - In order to handle certain HW bugs from the telemetry capability
> driver, create a single platform device per capability instead of
> a device per entry. Add the entry data as device resources and
> let the capability driver manage them as a set allowing for
> cleaner HW bug resolution.
> - Handle discovery table offset bug in intel_pmt.c
> - Handle overlapping regions in intel_pmt_telemetry.c
> - Add description of sysfs class to testing ABI.
> - Don't check size and count until confirming support for the PMT
> capability to avoid bailing out when we need to skip it.
> - Remove unneeded header file. Move code to the intel_pmt.c, the
> only place where it's needed.
> - Remove now unused platform data.
> - Add missing header files types.h, bits.h.
> - Rename file name and build options from telem to telemetry.
> - Code cleanup suggested by Andy S.
> - x86 mailing list added.
>
> Changes from V1:
> - In the telemetry driver, set the device in device_create() to
> the parent PCI device (the monitoring device) for clear
> association in sysfs. Was set before to the platform device
> created by the PCI parent.
> - Move telem struct into driver and delete unneeded header file.
> - Start telem device numbering from 0 instead of 1. 1 was used
> due to anticipated changes, no longer needed.
> - Use helper macros suggested by Andy S.
> - Rename class to pmt_telemetry, spelling out full name
> - Move monitor device name defines to common header
> - Coding style, spelling, and Makefile/MAINTAINERS ordering fixes
>
> David E. Box (3):
> PCI: Add defines for Designated Vendor-Specific Extended Capability
> mfd: Intel Platform Monitoring Technology support
> platform/x86: Intel PMT Telemetry capability driver
>
> .../ABI/testing/sysfs-class-pmt_telemetry | 46 ++
> MAINTAINERS | 6 +
> drivers/mfd/Kconfig | 10 +
> drivers/mfd/Makefile | 1 +
> drivers/mfd/intel_pmt.c | 215 +++++++++
> drivers/platform/x86/Kconfig | 10 +
> drivers/platform/x86/Makefile | 1 +
> drivers/platform/x86/intel_pmt_telemetry.c | 448 ++++++++++++++++++
> include/uapi/linux/pci_regs.h | 5 +
> 9 files changed, 742 insertions(+)
> create mode 100644 Documentation/ABI/testing/sysfs-class-pmt_telemetry
> create mode 100644 drivers/mfd/intel_pmt.c
> create mode 100644 drivers/platform/x86/intel_pmt_telemetry.c
>
> --
> 2.20.1
>


--
With Best Regards,
Andy Shevchenko

2020-07-27 18:50:17

by David E. Box

Subject: Re: [PATCH V4 0/3] Intel Platform Monitoring Technology

On Mon, 2020-07-27 at 13:23 +0300, Andy Shevchenko wrote:
> On Fri, Jul 17, 2020 at 10:05 PM David E. Box
> <[email protected]> wrote:
> > Intel Platform Monitoring Technology (PMT) is an architecture for
> > enumerating and accessing hardware monitoring capabilities on a
> > device.
> > With customers increasingly asking for hardware telemetry,
> > engineers not
> > only have to figure out how to measure and collect data, but also
> > how to
> > deliver it and make it discoverable. The latter may be through some
> > device
> > specific method requiring device specific tools to collect the
> > data. This
> > in turn requires customers to manage a suite of different tools in
> > order to
> > collect the differing assortment of monitoring data on their
> > systems. Even
> > when such information can be provided in kernel drivers, they may
> > require
> > constant maintenance to update register mappings as they change
> > with
> > firmware updates and new versions of hardware. PMT provides a
> > solution for
> > discovering and reading telemetry from a device through a hardware
> > agnostic
> > framework that allows for updates to systems without requiring
> > patches to
> > the kernel or software tools.
> >
> > PMT defines several capabilities to support collecting monitoring
> > data from
> > hardware. All are discoverable as separate instances of the PCIE
> > Designated
> > Vendor extended capability (DVSEC) with the Intel vendor code. The
> > DVSEC ID
> > field uniquely identifies the capability. Each DVSEC also provides
> > a BAR
> > offset to a header that defines capability-specific attributes,
> > including
> > GUID, feature type, offset and length, as well as configuration
> > settings
> > where applicable. The GUID uniquely identifies the register space
> > of any
> > monitor data exposed by the capability. The GUID is associated with
> > an XML
> > file from the vendor that describes the mapping of the register
> > space along
> > with properties of the monitor data. This allows vendors to perform
> > firmware updates that can change the mapping (e.g. add new metrics)
> > without
> > requiring any changes to drivers or software tools. The new mapping
> > is
> > confirmed by an updated GUID, read from the hardware, which
> > software uses
> > with a new XML.
> >
> > The current capabilities defined by PMT are Telemetry, Watcher, and
> > Crashlog. The Telemetry capability provides access to a continuous
> > block
> > of read only data. The Watcher capability provides access to
> > hardware
> > sampling and tracing features. Crashlog provides access to device
> > crash
> > dumps. While there is some relationship between capabilities
> > (Watcher can
> > be configured to sample from the Telemetry data set) each exists as
> > stand
> > alone features with no dependency on any other. The design
> > therefore splits
> > them into individual, capability specific drivers. MFD is used to
> > create
> > platform devices for each capability so that they may be managed by
> > their
> > own driver. The PMT architecture is (for the most part) agnostic to
> > the
> > type of device it can collect from. Device nodes are consequently
> > generic
> > in naming, e.g. /dev/telem<n> and /dev/smplr<n>. Each capability
> > driver
> > creates a class to manage the list of devices supporting
> > it. Software can
> > determine which devices support a PMT feature by searching through
> > each
> > device node entry in the sysfs class folder. It can additionally
> > determine
> > if a particular device supports a PMT feature by checking for a PMT
> > class
> > folder in the device folder.
> >
> > This patch set provides support for the PMT framework, along with
> > support
> > for Telemetry on Tiger Lake.
> >
>
> I assume this goes thru MFD tree.

Yes, looking for pull by MFD. Thanks Andy.


2020-07-28 07:59:31

by Lee Jones

Subject: Re: [PATCH V4 2/3] mfd: Intel Platform Monitoring Technology support

On Fri, 17 Jul 2020, David E. Box wrote:

> Intel Platform Monitoring Technology (PMT) is an architecture for
> enumerating and accessing hardware monitoring facilities. PMT supports
> multiple types of monitoring capabilities. This driver creates platform
> devices for each type so that they may be managed by capability specific
> drivers (to be introduced). Capabilities are discovered using PCIe DVSEC
> ids. Support is included for the 3 current capability types, Telemetry,
> Watcher, and Crashlog. The features are available on new Intel platforms
> starting from Tiger Lake for which support is added.
>
> Also add a quirk mechanism for several early hardware differences and bugs.
> For Tiger Lake, do not support Watcher and Crashlog capabilities since they
> will not be compatible with future product. Also, use a quirk to fix
> the discovery table offset.
>
> Reviewed-by: Andy Shevchenko <[email protected]>
> Co-developed-by: Alexander Duyck <[email protected]>
> Signed-off-by: Alexander Duyck <[email protected]>
> Signed-off-by: David E. Box <[email protected]>

This should be in chronological order.

> ---
> MAINTAINERS | 5 +
> drivers/mfd/Kconfig | 10 ++
> drivers/mfd/Makefile | 1 +
> drivers/mfd/intel_pmt.c | 215 ++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 231 insertions(+)
> create mode 100644 drivers/mfd/intel_pmt.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b4a43a9e7fbc..2e42bf0c41ab 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8845,6 +8845,11 @@ F: drivers/mfd/intel_soc_pmic*
> F: include/linux/mfd/intel_msic.h
> F: include/linux/mfd/intel_soc_pmic*
>
> +INTEL PMT DRIVER
> +M: "David E. Box" <[email protected]>
> +S: Maintained
> +F: drivers/mfd/intel_pmt.c
> +
> INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
> M: Stanislav Yakovlev <[email protected]>
> L: [email protected]
> diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
> index a37d7d171382..1a62ce2c68d9 100644
> --- a/drivers/mfd/Kconfig
> +++ b/drivers/mfd/Kconfig
> @@ -670,6 +670,16 @@ config MFD_INTEL_PMC_BXT
> Register and P-unit access. In addition this creates devices
> for iTCO watchdog and telemetry that are part of the PMC.
>
> +config MFD_INTEL_PMT
> + tristate "Intel Platform Monitoring Technology support"

Nit: "Intel Platform Monitoring Technology (PMT) support"

> + depends on PCI
> + select MFD_CORE
> + help
> + The Intel Platform Monitoring Technology (PMT) is an interface that
> + provides access to hardware monitor registers. This driver supports
> + Telemetry, Watcher, and Crashlog PMT capabilities/devices for
> + platforms starting from Tiger Lake.
> +
> config MFD_IPAQ_MICRO
> bool "Atmel Micro ASIC (iPAQ h3100/h3600/h3700) Support"
> depends on SA1100_H3100 || SA1100_H3600
> diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
> index 9367a92f795a..1961b4737985 100644
> --- a/drivers/mfd/Makefile
> +++ b/drivers/mfd/Makefile
> @@ -216,6 +216,7 @@ obj-$(CONFIG_MFD_INTEL_LPSS_PCI) += intel-lpss-pci.o
> obj-$(CONFIG_MFD_INTEL_LPSS_ACPI) += intel-lpss-acpi.o
> obj-$(CONFIG_MFD_INTEL_MSIC) += intel_msic.o
> obj-$(CONFIG_MFD_INTEL_PMC_BXT) += intel_pmc_bxt.o
> +obj-$(CONFIG_MFD_INTEL_PMT) += intel_pmt.o
> obj-$(CONFIG_MFD_PALMAS) += palmas.o
> obj-$(CONFIG_MFD_VIPERBOARD) += viperboard.o
> obj-$(CONFIG_MFD_RC5T583) += rc5t583.o rc5t583-irq.o
> diff --git a/drivers/mfd/intel_pmt.c b/drivers/mfd/intel_pmt.c
> new file mode 100644
> index 000000000000..6857eaf4ff86
> --- /dev/null
> +++ b/drivers/mfd/intel_pmt.c
> @@ -0,0 +1,215 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Intel Platform Monitoring Technology MFD driver

s/MFD/(PMT)/

> + * Copyright (c) 2020, Intel Corporation.
> + * All Rights Reserved.
> + *
> + * Authors: David E. Box <[email protected]>

Looks odd to use a plural for a single author.

> + */
> +
> +#include <linux/bits.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +#include <linux/platform_device.h>
> +#include <linux/pm.h>
> +#include <linux/pm_runtime.h>
> +#include <linux/mfd/core.h>
> +#include <linux/types.h>

Alphabetical please.

> +/* Intel DVSEC capability vendor space offsets */
> +#define INTEL_DVSEC_ENTRIES 0xA
> +#define INTEL_DVSEC_SIZE 0xB
> +#define INTEL_DVSEC_TABLE 0xC
> +#define INTEL_DVSEC_TABLE_BAR(x) ((x) & GENMASK(2, 0))
> +#define INTEL_DVSEC_TABLE_OFFSET(x) ((x) & GENMASK(31, 3))
> +#define INTEL_DVSEC_ENTRY_SIZE 4
> +
> +/* PMT capabilities */
> +#define DVSEC_INTEL_ID_TELEMETRY 2
> +#define DVSEC_INTEL_ID_WATCHER 3
> +#define DVSEC_INTEL_ID_CRASHLOG 4
> +
> +#define TELEMETRY_DEV_NAME "pmt_telemetry"
> +#define WATCHER_DEV_NAME "pmt_watcher"
> +#define CRASHLOG_DEV_NAME "pmt_crashlog"

Please don't define names of things. It makes grepping a pain, at the
very least. Just use the 'raw' string in-place.
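
A minimal userspace sketch of the suggestion above, with the name strings written in place instead of behind *_DEV_NAME defines. The helper name pmt_cell_name() is hypothetical and not part of the patch; the DVSEC IDs and strings are taken from the quoted hunk.

```c
#include <stddef.h>
#include <stdint.h>

/* DVSEC IDs as defined in the quoted patch. */
#define DVSEC_INTEL_ID_TELEMETRY	2
#define DVSEC_INTEL_ID_WATCHER		3
#define DVSEC_INTEL_ID_CRASHLOG		4

/*
 * Hypothetical helper: map a DVSEC ID to its platform device name.
 * The strings appear literally, so grepping for "pmt_telemetry"
 * lands on the place where the name is actually used.
 * Returns NULL for an unknown ID.
 */
static const char *pmt_cell_name(uint16_t id)
{
	switch (id) {
	case DVSEC_INTEL_ID_TELEMETRY:
		return "pmt_telemetry";
	case DVSEC_INTEL_ID_WATCHER:
		return "pmt_watcher";
	case DVSEC_INTEL_ID_CRASHLOG:
		return "pmt_crashlog";
	default:
		return NULL;
	}
}
```

In the driver itself the same effect would come from assigning the literals directly in the existing switch; the separate function here only keeps the sketch self-contained.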

> +struct intel_dvsec_header {
> + u16 length;
> + u16 id;
> + u8 num_entries;
> + u8 entry_size;
> + u8 tbir;
> + u32 offset;
> +};
> +
> +enum pmt_quirks {
> + /* Watcher capability not supported */
> + PMT_QUIRK_NO_WATCHER = BIT(0),
> +
> + /* Crashlog capability not supported */
> + PMT_QUIRK_NO_CRASHLOG = BIT(1),
> +
> + /* Use shift instead of mask to read discovery table offset */
> + PMT_QUIRK_TABLE_SHIFT = BIT(2),
> +};
> +
> +struct pmt_platform_info {
> + unsigned long quirks;
> +};
> +
> +static const struct pmt_platform_info tgl_info = {
> + .quirks = PMT_QUIRK_NO_WATCHER | PMT_QUIRK_NO_CRASHLOG |
> + PMT_QUIRK_TABLE_SHIFT,
> +};
> +
> +static int
> +pmt_add_dev(struct pci_dev *pdev, struct intel_dvsec_header *header,
> + struct pmt_platform_info *info)

My personal preference is to a) only break when you have to and b) to
align with the '('. Perhaps point b) is satisfied and it's just the
patch format that's shifting the tab though?

> +{
> + struct device *dev = &pdev->dev;
> + struct resource *res, *tmp;
> + struct mfd_cell *cell;
> + const char *name;
> + int count = header->num_entries;
> + int size = header->entry_size;
> + int i;
> +
> + switch (header->id) {
> + case DVSEC_INTEL_ID_TELEMETRY:
> + name = TELEMETRY_DEV_NAME;
> + break;
> + case DVSEC_INTEL_ID_WATCHER:
> + if (info->quirks & PMT_QUIRK_NO_WATCHER) {
> + dev_info(dev, "Watcher not supported\n");
> + return 0;
> + }
> + name = WATCHER_DEV_NAME;
> + break;
> + case DVSEC_INTEL_ID_CRASHLOG:
> + if (info->quirks & PMT_QUIRK_NO_CRASHLOG) {
> + dev_info(dev, "Crashlog not supported\n");
> + return 0;
> + }
> + name = CRASHLOG_DEV_NAME;
> + break;
> + default:
> + return -EINVAL;

Doesn't deserve an error message?

> + }
> +
> + if (!header->num_entries || !header->entry_size) {
> + dev_warn(dev, "Invalid count or size for %s header\n", name);
> + return -EINVAL;

If you're returning an error, this should be dev_err().

Even if you only handle it as a warning at the call site.

> + }
> +
> + cell = devm_kzalloc(dev, sizeof(*cell), GFP_KERNEL);
> + if (!cell)
> + return -ENOMEM;
> +
> + res = devm_kcalloc(dev, count, sizeof(*res), GFP_KERNEL);
> + if (!res)
> + return -ENOMEM;
> +
> + if (info->quirks & PMT_QUIRK_TABLE_SHIFT)
> + header->offset >>= 3;
> +
> + for (i = 0, tmp = res; i < count; i++, tmp++) {
> + tmp->start = pdev->resource[header->tbir].start +
> + header->offset + i * (size << 2);

Deserves a comment I think.
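
For illustration, the arithmetic in this loop can be sketched in plain userspace C. This assumes, based on the `size << 2` in the hunk, that entry_size counts 32-bit dwords; struct region and pmt_fill_regions() are hypothetical stand-ins for struct resource and the loop body, and the mask macros mirror INTEL_DVSEC_TABLE_BAR/OFFSET.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors of the patch's table register masks. */
#define TABLE_BAR(x)	((x) & 0x7u)	/* bits 2:0: BAR index */
#define TABLE_OFFSET(x)	((x) & ~0x7u)	/* bits 31:3: table offset */

struct region {		/* stand-in for struct resource */
	uint64_t start;
	uint64_t end;	/* inclusive, as in struct resource */
};

/*
 * Fill out[0..count-1] with one region per discovery table entry.
 * bar_start is the base of the BAR selected by TABLE_BAR(table).
 * entry_size is assumed to be in dwords, so an entry spans
 * entry_size * 4 bytes. table_shift models PMT_QUIRK_TABLE_SHIFT.
 */
static void pmt_fill_regions(struct region *out, unsigned int count,
			     uint64_t bar_start, uint32_t table,
			     uint8_t entry_size, int table_shift)
{
	uint32_t offset = TABLE_OFFSET(table);
	uint32_t bytes = (uint32_t)entry_size << 2;	/* dwords to bytes */
	unsigned int i;

	assert(out != NULL);

	if (table_shift)
		offset >>= 3;

	for (i = 0; i < count; i++) {
		/* Entries are packed back to back after the table offset. */
		out[i].start = bar_start + offset + (uint64_t)i * bytes;
		out[i].end = out[i].start + bytes - 1;
	}
}
```

With a BAR at 0x1000, a table value of 0x41 and an entry size of 4 dwords, the first entry covers 0x1040 to 0x104f and the second starts at 0x1050. A one-line comment in the driver stating the dword-to-byte conversion would probably be enough.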

> + tmp->end = tmp->start + (size << 2) - 1;
> + tmp->flags = IORESOURCE_MEM;
> + }
> +
> + cell->resources = res;
> + cell->num_resources = count;
> + cell->name = name;
> +
> + return devm_mfd_add_devices(dev, PLATFORM_DEVID_AUTO, cell, 1, NULL, 0,
> + NULL);
> +}
> +
> +static int
> +pmt_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> + struct intel_dvsec_header header;
> + struct pmt_platform_info *info;
> + bool found_devices = false;
> + int ret, pos = 0;
> + u32 table;
> + u16 vid;
> +
> + ret = pcim_enable_device(pdev);
> + if (ret)
> + return ret;
> +
> + info = devm_kmemdup(&pdev->dev, (void *)id->driver_data, sizeof(*info),
> + GFP_KERNEL);
> + if (!info)
> + return -ENOMEM;
> +
> + pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC);
> + while (pos) {

If you do:

do {
int pos;

pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC);
if (!pos)
break;

Then you can invoke pci_find_next_ext_capability() once, no?
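
A userspace sketch of that loop shape, under stated assumptions: find_next() is a mock of pci_find_next_ext_capability() over a hypothetical table sorted by position, and the add_ok flag stands in for the result of pmt_add_dev(). Note that pos is kept outside the loop body here, since it is the cursor carried between iterations. With the lookup at the top, a `continue` on a vendor mismatch or a failed add always advances to the next capability, whereas in the quoted while loop a `continue` skips the trailing lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL 0x8086

/* Hypothetical stand-in for one DVSEC instance in config space. */
struct fake_dvsec {
	int pos;	/* config space offset, nonzero */
	uint16_t vid;	/* DVSEC vendor ID */
	int add_ok;	/* whether adding the device would succeed */
};

/* Mock lookup: first entry located after pos, or 0 when none is left. */
static int find_next(const struct fake_dvsec *caps, size_t n, int pos)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (caps[i].pos > pos)
			return caps[i].pos;
	return 0;
}

/* Walk all capabilities; return how many devices were added. */
static int pmt_walk(const struct fake_dvsec *caps, size_t n)
{
	int found = 0;
	int pos = 0;	/* cursor, must persist across iterations */
	size_t i;

	do {
		pos = find_next(caps, n, pos);	/* the only lookup call */
		if (!pos)
			break;

		for (i = 0; i < n && caps[i].pos != pos; i++)
			;
		assert(i < n);	/* pos came from caps, so it must be there */

		if (caps[i].vid != VENDOR_INTEL)
			continue;	/* safe: lookup reruns at loop top */
		if (!caps[i].add_ok)
			continue;	/* a failed add is not counted */
		found++;
	} while (pos);

	return found;
}
```

The driver would then test the count (or a flag set only on success) after the loop to decide whether to return -ENODEV.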

> + pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vid);
> + if (vid != PCI_VENDOR_ID_INTEL)
> + continue;
> +
> + pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER2,
> + &header.id);
> + pci_read_config_byte(pdev, pos + INTEL_DVSEC_ENTRIES,
> + &header.num_entries);
> + pci_read_config_byte(pdev, pos + INTEL_DVSEC_SIZE,
> + &header.entry_size);
> + pci_read_config_dword(pdev, pos + INTEL_DVSEC_TABLE,
> + &table);
> +
> + header.tbir = INTEL_DVSEC_TABLE_BAR(table);
> + header.offset = INTEL_DVSEC_TABLE_OFFSET(table);
> +
> + ret = pmt_add_dev(pdev, &header, info);
> + if (ret)
> + dev_warn(&pdev->dev,
> + "Failed to add devices for DVSEC id %d\n",

"device", so not all devices, right?

> + header.id);

Don't you want to continue here?

Else you're going to set found_devices for a failed device.

> + found_devices = true;
> +
> + pos = pci_find_next_ext_capability(pdev, pos,
> + PCI_EXT_CAP_ID_DVSEC);
> + }
> +
> + if (!found_devices) {
> + dev_err(&pdev->dev, "No supported PMT capabilities found.\n");
> + return -ENODEV;
> + }
> +
> + pm_runtime_put(&pdev->dev);
> + pm_runtime_allow(&pdev->dev);
> +
> + return 0;
> +}
> +
> +static void pmt_pci_remove(struct pci_dev *pdev)
> +{
> + pm_runtime_forbid(&pdev->dev);
> + pm_runtime_get_sync(&pdev->dev);
> +}
> +
> +#define PCI_DEVICE_ID_INTEL_PMT_TGL 0x9a0d

What's this for?

If this is PCI_DEVICE_DATA magic, it would be worth tying it to the
struct i.e. remove the empty line between it and the table below.

> +static const struct pci_device_id pmt_pci_ids[] = {
> + { PCI_DEVICE_DATA(INTEL, PMT_TGL, &tgl_info) },
> + { }
> +};
> +MODULE_DEVICE_TABLE(pci, pmt_pci_ids);
> +
> +static struct pci_driver pmt_pci_driver = {
> + .name = "intel-pmt",
> + .id_table = pmt_pci_ids,
> + .probe = pmt_pci_probe,
> + .remove = pmt_pci_remove,
> +};
> +module_pci_driver(pmt_pci_driver);
> +
> +MODULE_AUTHOR("David E. Box <[email protected]>");
> +MODULE_DESCRIPTION("Intel Platform Monitoring Technology MFD driver");

s/MFD/(PMT)/

> +MODULE_LICENSE("GPL v2");

--
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

2020-07-28 20:38:24

by David E. Box

Subject: Re: [PATCH V4 2/3] mfd: Intel Platform Monitoring Technology support

Hi Lee,

Thanks for this thorough review. Ack on all the comments with
particular thanks for spotting the missing continue.

David


2020-07-29 21:36:40

by David E. Box

Subject: [PATCH V5 3/3] platform/x86: Intel PMT Telemetry capability driver

PMT Telemetry is a capability of the Intel Platform Monitoring Technology.
The Telemetry capability provides access to device telemetry metrics that
provide hardware performance data to users from continuous, memory mapped,
read-only register spaces.

Register mappings are not provided by the driver. Instead, a GUID is read
from a header for each endpoint. The GUID identifies the device and is to
be used with an XML, provided by the vendor, to discover the available set
of metrics and their register mapping. This allows firmware updates to
modify the register space without needing to update the driver every time
with new mappings. Firmware writes a new GUID in this case to specify the
new mapping. Software tools with access to the associated XML file can
then interpret the changes.

The module manages access to all PMT Telemetry endpoints on a system,
independent of the device exporting them. It creates a pmt_telemetry class
to manage the devices. For each telemetry endpoint, sysfs files provide
GUID and size information as well as a pointer to the parent device the
telemetry came from. Software may discover the association between
endpoints and devices by iterating through the list in sysfs, or by looking
for the existence of the class folder under the device of interest. A
device node of the same name allows software to then map the telemetry
space for direct access.
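
As a rough userspace illustration of the flow described above, the sketch below parses the text of the guid, size, and offset attributes and works out how many bytes to request from mmap(). The helper names (telem_parse_attr, telem_map_len) are hypothetical and not part of this patch. The page rounding mirrors the limit the driver's mmap handler appears to enforce, and the idea that the data begins "offset" bytes into the mapping is an assumption drawn from the ABI description.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse the text of a sysfs attribute such as "0x1a2b3c4d\n" (guid) or
 * "256\n" (size, offset). Base 0 lets strtoul accept hex and decimal.
 * Returns 0 on success, -1 if no number could be parsed.
 */
static int telem_parse_attr(const char *text, unsigned long *out)
{
	char *end;

	errno = 0;
	*out = strtoul(text, &end, 0);
	if (errno || end == text)
		return -1;
	return 0;
}

/*
 * Number of bytes to pass to mmap(): the telemetry region starts "offset"
 * bytes into its first page and is "size" bytes long, so round the sum up
 * to a whole number of pages.
 */
static unsigned long telem_map_len(unsigned long offset, unsigned long size,
				   unsigned long page_size)
{
	return (offset + size + page_size - 1) / page_size * page_size;
}
```

A tool would presumably read size and offset from /sys/class/pmt_telemetry/telem<x>/, open /dev/telem<x> read-only, map telem_map_len() bytes with PROT_READ and MAP_SHARED (the handler rejects mappings that could become writable), and start reading at the returned pointer plus offset.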

Also create a PCI device id list for early telemetry hardware that requires
workarounds for known issues.

Co-developed-by: Alexander Duyck <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: David E. Box <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
---
.../ABI/testing/sysfs-class-pmt_telemetry | 46 ++
MAINTAINERS | 1 +
drivers/platform/x86/Kconfig | 10 +
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_telemetry.c | 448 ++++++++++++++++++
5 files changed, 506 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-class-pmt_telemetry
create mode 100644 drivers/platform/x86/intel_pmt_telemetry.c

diff --git a/Documentation/ABI/testing/sysfs-class-pmt_telemetry b/Documentation/ABI/testing/sysfs-class-pmt_telemetry
new file mode 100644
index 000000000000..b0b096db9cae
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-pmt_telemetry
@@ -0,0 +1,46 @@
+What: /sys/class/pmt_telemetry/
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ The pmt_telemetry/ class directory contains information for
+ devices that expose hardware telemetry using Intel Platform
+ Monitoring Technology (PMT)
+
+What: /sys/class/pmt_telemetry/telem<x>
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ The telem<x> directory contains files describing an instance of
+ a PMT telemetry device that exposes hardware telemetry. Each
+ telem<x> directory has an associated /dev/telem<x> node. This
+ node may be opened and mapped to access the telemetry space of
+ the device. The register layout of the telemetry space is
+ determined from an XML file that matches the PCI device id and
+ GUID for the device.
+
+What: /sys/class/pmt_telemetry/telem<x>/guid
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ (RO) The GUID for this telemetry device. The GUID identifies
+ the version of the XML file for the parent device that is to
+ be used to get the register layout.
+
+What: /sys/class/pmt_telemetry/telem<x>/size
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ (RO) The size of telemetry region in bytes that corresponds to
+ the mapping size for the /dev/telem<x> device node.
+
+What: /sys/class/pmt_telemetry/telem<x>/offset
+Date: July 2020
+KernelVersion: 5.9
+Contact: David Box <[email protected]>
+Description:
+ (RO) The offset of telemetry region in bytes that corresponds to
+ the mapping for the /dev/telem<x> device node.
diff --git a/MAINTAINERS b/MAINTAINERS
index b69429c70330..40794cc721af 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8849,6 +8849,7 @@ INTEL PMT DRIVER
M: "David E. Box" <[email protected]>
S: Maintained
F: drivers/mfd/intel_pmt.c
+F: drivers/platform/x86/intel_pmt_*

INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
M: Stanislav Yakovlev <[email protected]>
diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig
index 0581a54cf562..8552b094d005 100644
--- a/drivers/platform/x86/Kconfig
+++ b/drivers/platform/x86/Kconfig
@@ -1339,6 +1339,16 @@ config INTEL_PMC_CORE
- LTR Ignore
- MPHY/PLL gating status (Sunrisepoint PCH only)

+config INTEL_PMT_TELEMETRY
+ tristate "Intel Platform Monitoring Technology (PMT) Telemetry driver"
+ help
+ The Intel Platform Monitoring Technology (PMT) Telemetry driver provides
+ access to hardware telemetry metrics on devices that support the
+ feature.
+
+ For more information, see
+ <file:Documentation/ABI/testing/sysfs-class-pmt_telemetry>
+
config INTEL_PUNIT_IPC
tristate "Intel P-Unit IPC Driver"
help
diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile
index 2b85852a1a87..95cd3d0be17f 100644
--- a/drivers/platform/x86/Makefile
+++ b/drivers/platform/x86/Makefile
@@ -139,6 +139,7 @@ obj-$(CONFIG_INTEL_MFLD_THERMAL) += intel_mid_thermal.o
obj-$(CONFIG_INTEL_MID_POWER_BUTTON) += intel_mid_powerbtn.o
obj-$(CONFIG_INTEL_MRFLD_PWRBTN) += intel_mrfld_pwrbtn.o
obj-$(CONFIG_INTEL_PMC_CORE) += intel_pmc_core.o intel_pmc_core_pltdrv.o
+obj-$(CONFIG_INTEL_PMT_TELEMETRY) += intel_pmt_telemetry.o
obj-$(CONFIG_INTEL_PUNIT_IPC) += intel_punit_ipc.o
obj-$(CONFIG_INTEL_SCU_IPC) += intel_scu_ipc.o
obj-$(CONFIG_INTEL_SCU_PCI) += intel_scu_pcidrv.o
diff --git a/drivers/platform/x86/intel_pmt_telemetry.c b/drivers/platform/x86/intel_pmt_telemetry.c
new file mode 100644
index 000000000000..17f814ece30a
--- /dev/null
+++ b/drivers/platform/x86/intel_pmt_telemetry.c
@@ -0,0 +1,448 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitoring Technology Telemetry driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Author: "David E. Box" <[email protected]>
+ */
+
+#include <linux/bits.h>
+#include <linux/cdev.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+#include <linux/xarray.h>
+
+#define TELEM_DEV_NAME "pmt_telemetry"
+
+/* Telemetry access types */
+#define TELEM_ACCESS_FUTURE 1
+#define TELEM_ACCESS_BARID 2
+#define TELEM_ACCESS_LOCAL 3
+
+#define TELEM_GUID_OFFSET 0x4
+#define TELEM_BASE_OFFSET 0x8
+#define TELEM_TBIR_MASK GENMASK(2, 0)
+#define TELEM_ACCESS(v) ((v) & GENMASK(3, 0))
+#define TELEM_TYPE(v) (((v) & GENMASK(7, 4)) >> 4)
+/* size field (bits 27:12) counts dwords; shifting by 10 yields bytes */
+#define TELEM_SIZE(v) (((v) & GENMASK(27, 12)) >> 10)
+
+#define TELEM_XA_START 0
+#define TELEM_XA_MAX INT_MAX
+#define TELEM_XA_LIMIT XA_LIMIT(TELEM_XA_START, TELEM_XA_MAX)
+
+/* Used by client hardware to identify a fixed telemetry entry */
+#define TELEM_CLIENT_FIXED_BLOCK_GUID 0x10000000
+
+static DEFINE_XARRAY_ALLOC(telem_array);
+
+struct pmt_telem_priv;
+
+struct telem_header {
+ u8 access_type;
+ u8 telem_type;
+ u32 size;
+ u32 guid;
+ u32 base_offset;
+ u8 tbir;
+};
+
+struct pmt_telem_entry {
+ struct pmt_telem_priv *priv;
+ struct telem_header header;
+ struct resource *header_res;
+ unsigned long base_addr;
+ void __iomem *disc_table;
+ struct cdev cdev;
+ dev_t devt;
+ int devid;
+};
+
+struct pmt_telem_priv {
+ struct pmt_telem_entry *entry;
+ int num_entries;
+ struct device *dev;
+};
+
+/*
+ * devfs
+ */
+static int pmt_telem_open(struct inode *inode, struct file *filp)
+{
+ struct pmt_telem_priv *priv;
+ struct pmt_telem_entry *entry;
+ struct pci_driver *pci_drv;
+ struct pci_dev *pci_dev;
+
+ if (!perfmon_capable())
+ return -EPERM;
+
+ entry = container_of(inode->i_cdev, struct pmt_telem_entry, cdev);
+ priv = entry->priv;
+ pci_dev = to_pci_dev(priv->dev->parent);
+
+ pci_drv = pci_dev_driver(pci_dev);
+ if (!pci_drv)
+ return -ENODEV;
+
+ filp->private_data = entry;
+ get_device(&pci_dev->dev);
+
+ if (!try_module_get(pci_drv->driver.owner)) {
+ put_device(&pci_dev->dev);
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+static int pmt_telem_release(struct inode *inode, struct file *filp)
+{
+ struct pmt_telem_entry *entry = filp->private_data;
+ struct pci_dev *pci_dev = to_pci_dev(entry->priv->dev->parent);
+ struct pci_driver *pci_drv = pci_dev_driver(pci_dev);
+
+ put_device(&pci_dev->dev);
+ module_put(pci_drv->driver.owner);
+
+ return 0;
+}
+
+static int pmt_telem_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+ struct pmt_telem_entry *entry = filp->private_data;
+ struct pmt_telem_priv *priv;
+ unsigned long vsize = vma->vm_end - vma->vm_start;
+ unsigned long phys = entry->base_addr;
+ unsigned long pfn = PFN_DOWN(phys);
+ unsigned long psize;
+
+ priv = entry->priv;
+ psize = (PFN_UP(entry->base_addr + entry->header.size) - pfn) * PAGE_SIZE;
+ if (vsize > psize) {
+ dev_err(priv->dev, "Requested mmap size is too large\n");
+ return -EINVAL;
+ }
+
+ if ((vma->vm_flags & VM_WRITE) || (vma->vm_flags & VM_MAYWRITE))
+ return -EPERM;
+
+ vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
+ if (io_remap_pfn_range(vma, vma->vm_start, pfn, vsize,
+ vma->vm_page_prot))
+ return -EINVAL;
+
+ return 0;
+}
+
+static const struct file_operations pmt_telem_fops = {
+ .owner = THIS_MODULE,
+ .open = pmt_telem_open,
+ .mmap = pmt_telem_mmap,
+ .release = pmt_telem_release,
+};
+
+/*
+ * sysfs
+ */
+static ssize_t guid_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct pmt_telem_entry *entry = dev_get_drvdata(dev);
+
+ return sprintf(buf, "0x%x\n", entry->header.guid);
+}
+static DEVICE_ATTR_RO(guid);
+
+static ssize_t size_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pmt_telem_entry *entry = dev_get_drvdata(dev);
+
+ /* Display buffer size in bytes */
+ return sprintf(buf, "%u\n", entry->header.size);
+}
+static DEVICE_ATTR_RO(size);
+
+static ssize_t offset_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct pmt_telem_entry *entry = dev_get_drvdata(dev);
+
+ /* Display buffer offset in bytes */
+ return sprintf(buf, "%lu\n", offset_in_page(entry->base_addr));
+}
+static DEVICE_ATTR_RO(offset);
+
+static struct attribute *pmt_telem_attrs[] = {
+ &dev_attr_guid.attr,
+ &dev_attr_size.attr,
+ &dev_attr_offset.attr,
+ NULL
+};
+ATTRIBUTE_GROUPS(pmt_telem);
+
+static struct class pmt_telem_class = {
+ .owner = THIS_MODULE,
+ .name = "pmt_telemetry",
+ .dev_groups = pmt_telem_groups,
+};
+
+/*
+ * driver initialization
+ */
+static const struct pci_device_id pmt_telem_early_client_pci_ids[] = {
+ { PCI_VDEVICE(INTEL, 0x9a0d) }, /* TGL */
+ { }
+};
+
+static bool pmt_telem_is_early_client_hw(struct device *dev)
+{
+ struct pci_dev *parent = to_pci_dev(dev->parent);
+
+ return !!pci_match_id(pmt_telem_early_client_pci_ids, parent);
+}
+
+static int pmt_telem_create_dev(struct pmt_telem_priv *priv,
+ struct pmt_telem_entry *entry)
+{
+ struct pci_dev *pci_dev;
+ struct device *dev;
+ int ret;
+
+ cdev_init(&entry->cdev, &pmt_telem_fops);
+ ret = cdev_add(&entry->cdev, entry->devt, 1);
+ if (ret) {
+ dev_err(priv->dev, "Could not add char dev\n");
+ return ret;
+ }
+
+ pci_dev = to_pci_dev(priv->dev->parent);
+ dev = device_create(&pmt_telem_class, &pci_dev->dev, entry->devt,
+ entry, "telem%d", entry->devid);
+ if (IS_ERR(dev)) {
+ dev_err(priv->dev, "Could not create device node\n");
+ cdev_del(&entry->cdev);
+ }
+
+ return PTR_ERR_OR_ZERO(dev);
+}
+
+static void pmt_telem_populate_header(void __iomem *disc_offset,
+ struct telem_header *header)
+{
+ header->access_type = TELEM_ACCESS(readb(disc_offset));
+ header->telem_type = TELEM_TYPE(readb(disc_offset));
+ header->size = TELEM_SIZE(readl(disc_offset));
+ header->guid = readl(disc_offset + TELEM_GUID_OFFSET);
+ header->base_offset = readl(disc_offset + TELEM_BASE_OFFSET);
+
+ /*
+ * For non-local access types the lower 3 bits of base offset
+ * contains the index of the base address register where the
+ * telemetry can be found.
+ */
+ header->tbir = header->base_offset & TELEM_TBIR_MASK;
+ header->base_offset ^= header->tbir;
+}
+
+static int pmt_telem_add_entry(struct pmt_telem_priv *priv,
+ struct pmt_telem_entry *entry)
+{
+ struct resource *res = entry->header_res;
+ struct pci_dev *pci_dev = to_pci_dev(priv->dev->parent);
+ int ret;
+
+ pmt_telem_populate_header(entry->disc_table, &entry->header);
+
+ /* Local access and BARID only for now */
+ switch (entry->header.access_type) {
+ case TELEM_ACCESS_LOCAL:
+ if (entry->header.tbir) {
+ dev_err(priv->dev,
+ "Unsupported BAR index %d for access type %d\n",
+ entry->header.tbir, entry->header.access_type);
+ return -EINVAL;
+ }
+
+ /*
+ * For access_type LOCAL, the base address is as follows:
+ * base address = header address + header length + base offset
+ */
+ entry->base_addr = res->start + resource_size(res) +
+ entry->header.base_offset;
+ break;
+
+ case TELEM_ACCESS_BARID:
+ entry->base_addr = pci_dev->resource[entry->header.tbir].start +
+ entry->header.base_offset;
+ break;
+
+ default:
+ dev_err(priv->dev, "Unsupported access type %d\n",
+ entry->header.access_type);
+ return -EINVAL;
+ }
+
+ ret = alloc_chrdev_region(&entry->devt, 0, 1, TELEM_DEV_NAME);
+ if (ret) {
+ dev_err(priv->dev,
+ "PMT telemetry chrdev_region error: %d\n", ret);
+ return ret;
+ }
+
+ ret = xa_alloc(&telem_array, &entry->devid, entry, TELEM_XA_LIMIT,
+ GFP_KERNEL);
+ if (ret)
+ goto fail_xa_alloc;
+
+ ret = pmt_telem_create_dev(priv, entry);
+ if (ret)
+ goto fail_create_dev;
+
+ entry->priv = priv;
+ priv->num_entries++;
+ return 0;
+
+fail_create_dev:
+ xa_erase(&telem_array, entry->devid);
+fail_xa_alloc:
+ unregister_chrdev_region(entry->devt, 1);
+
+ return ret;
+}
+
+static bool pmt_telem_region_overlaps(struct platform_device *pdev,
+ void __iomem *disc_table)
+{
+ u32 guid;
+
+ guid = readl(disc_table + TELEM_GUID_OFFSET);
+
+ return guid == TELEM_CLIENT_FIXED_BLOCK_GUID;
+}
+
+static void pmt_telem_remove_entries(struct pmt_telem_priv *priv)
+{
+ int i;
+
+ for (i = 0; i < priv->num_entries; i++) {
+ device_destroy(&pmt_telem_class, priv->entry[i].devt);
+ cdev_del(&priv->entry[i].cdev);
+ xa_erase(&telem_array, priv->entry[i].devid);
+ unregister_chrdev_region(priv->entry[i].devt, 1);
+ }
+}
+
+static int pmt_telem_probe(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv;
+ struct pmt_telem_entry *entry;
+ bool early_hw = false;
+ int i;
+
+ priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, priv);
+ priv->dev = &pdev->dev;
+
+ priv->entry = devm_kcalloc(&pdev->dev, pdev->num_resources,
+ sizeof(struct pmt_telem_entry), GFP_KERNEL);
+ if (!priv->entry)
+ return -ENOMEM;
+
+ if (pmt_telem_is_early_client_hw(&pdev->dev))
+ early_hw = true;
+
+ for (i = 0, entry = priv->entry; i < pdev->num_resources;
+ i++, entry++) {
+ int ret;
+
+ entry->header_res = platform_get_resource(pdev, IORESOURCE_MEM, i);
+ if (!entry->header_res) {
+ pmt_telem_remove_entries(priv);
+ return -ENODEV;
+ }
+
+ entry->disc_table = devm_platform_ioremap_resource(pdev, i);
+ if (IS_ERR(entry->disc_table)) {
+ pmt_telem_remove_entries(priv);
+ return PTR_ERR(entry->disc_table);
+ }
+
+ if (pmt_telem_region_overlaps(pdev, entry->disc_table) &&
+ early_hw)
+ continue;
+
+ ret = pmt_telem_add_entry(priv, entry);
+ if (ret) {
+ pmt_telem_remove_entries(priv);
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+static int pmt_telem_remove(struct platform_device *pdev)
+{
+ struct pmt_telem_priv *priv = platform_get_drvdata(pdev);
+
+ pmt_telem_remove_entries(priv);
+
+ return 0;
+}
+
+static const struct platform_device_id pmt_telem_table[] = {
+ {
+ .name = "pmt_telemetry",
+ },
+ {}
+};
+MODULE_DEVICE_TABLE(platform, pmt_telem_table);
+
+static struct platform_driver pmt_telem_driver = {
+ .driver = {
+ .name = TELEM_DEV_NAME,
+ },
+ .probe = pmt_telem_probe,
+ .remove = pmt_telem_remove,
+ .id_table = pmt_telem_table,
+};
+
+static int __init pmt_telem_init(void)
+{
+ int ret = class_register(&pmt_telem_class);
+
+ if (ret)
+ return ret;
+
+ ret = platform_driver_register(&pmt_telem_driver);
+ if (ret)
+ class_unregister(&pmt_telem_class);
+
+ return ret;
+}
+module_init(pmt_telem_init);
+
+static void __exit pmt_telem_exit(void)
+{
+ platform_driver_unregister(&pmt_telem_driver);
+ class_unregister(&pmt_telem_class);
+ xa_destroy(&telem_array);
+}
+module_exit(pmt_telem_exit);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel PMT Telemetry driver");
+MODULE_ALIAS("platform:" TELEM_DEV_NAME);
+MODULE_LICENSE("GPL v2");
--
2.20.1
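
For readers following pmt_telem_populate_header() in the patch above, here is a standalone sketch of the same field extraction using plain shifts and masks. It is only an illustration: the struct and function names are hypothetical, and the bit positions are taken from the TELEM_* macros in the patch rather than from a hardware specification.

```c
#include <stdint.h>

struct telem_fields {
	uint8_t access_type;	/* bits 3:0 of the first dword */
	uint8_t telem_type;	/* bits 7:4 of the first dword */
	uint32_t size_bytes;	/* bits 27:12 hold a dword count */
	uint8_t tbir;		/* low 3 bits of the base offset dword */
	uint32_t base_offset;	/* base offset with the BAR index cleared */
};

static struct telem_fields telem_decode(uint32_t dword0, uint32_t base)
{
	struct telem_fields f;

	f.access_type = dword0 & 0xf;
	f.telem_type = (dword0 >> 4) & 0xf;
	/* Extract the 16-bit dword count, then multiply by 4 for bytes. */
	f.size_bytes = ((dword0 >> 12) & 0xffff) * 4;
	f.tbir = base & 0x7;
	f.base_offset = base & ~UINT32_C(0x7);
	return f;
}
```

The sketch keeps the byte count in 32 bits, since a 16-bit dword count multiplied by four can exceed 16 bits.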

2020-07-29 21:36:43

by David E. Box

Subject: [PATCH V5 1/3] PCI: Add defines for Designated Vendor-Specific Extended Capability

Add PCIe Designated Vendor-Specific Extended Capability (DVSEC) and defines
for the header offsets. Defined in PCIe r5.0, sec 7.9.6.
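
As an illustration of what sits at those two offsets, the sketch below decodes the DVSEC header fields from raw register values. The field layout (vendor ID in bits 15:0, revision in 19:16, and length in 31:20 of Header 1; DVSEC ID in bits 15:0 of Header 2) follows the PCIe specification section cited above. The helper names are hypothetical and are not added by this patch.

```c
#include <stdint.h>

/* DVSEC Header 1 (offset 0x4): vendor ID 15:0, revision 19:16, length 31:20 */
static uint16_t dvsec_vendor(uint32_t hdr1)
{
	return hdr1 & 0xffff;
}

static uint8_t dvsec_rev(uint32_t hdr1)
{
	return (hdr1 >> 16) & 0xf;
}

static uint16_t dvsec_len(uint32_t hdr1)
{
	return (hdr1 >> 20) & 0xfff;
}

/* DVSEC Header 2 (offset 0x8): DVSEC ID in bits 15:0 */
static uint16_t dvsec_id(uint32_t hdr2)
{
	return hdr2 & 0xffff;
}
```

A driver that reads only a 16-bit word at each offset, as the PMT driver in this series does, gets the vendor ID and the DVSEC ID directly.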

Signed-off-by: David E. Box <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
---
include/uapi/linux/pci_regs.h | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index f9701410d3b5..beafeee39e44 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -720,6 +720,7 @@
#define PCI_EXT_CAP_ID_DPC 0x1D /* Downstream Port Containment */
#define PCI_EXT_CAP_ID_L1SS 0x1E /* L1 PM Substates */
#define PCI_EXT_CAP_ID_PTM 0x1F /* Precision Time Measurement */
+#define PCI_EXT_CAP_ID_DVSEC 0x23 /* Designated Vendor-Specific */
#define PCI_EXT_CAP_ID_DLF 0x25 /* Data Link Feature */
#define PCI_EXT_CAP_ID_PL_16GT 0x26 /* Physical Layer 16.0 GT/s */
#define PCI_EXT_CAP_ID_MAX PCI_EXT_CAP_ID_PL_16GT
@@ -1062,6 +1063,10 @@
#define PCI_L1SS_CTL1_LTR_L12_TH_SCALE 0xe0000000 /* LTR_L1.2_THRESHOLD_Scale */
#define PCI_L1SS_CTL2 0x0c /* Control 2 Register */

+/* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
+#define PCI_DVSEC_HEADER1 0x4 /* Designated Vendor-Specific Header1 */
+#define PCI_DVSEC_HEADER2 0x8 /* Designated Vendor-Specific Header2 */
+
/* Data Link Feature */
#define PCI_DLF_CAP 0x04 /* Capabilities Register */
#define PCI_DLF_EXCHANGE_ENABLE 0x80000000 /* Data Link Feature Exchange Enable */
--
2.20.1

2020-07-29 21:37:08

by David E. Box

Subject: [PATCH V5 2/3] mfd: Intel Platform Monitoring Technology support

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring facilities. PMT supports
multiple types of monitoring capabilities. This driver creates platform
devices for each type so that they may be managed by capability specific
drivers (to be introduced). Capabilities are discovered using PCIe DVSEC
ids. Support is included for the 3 current capability types, Telemetry,
Watcher, and Crashlog. The features are available on new Intel platforms
starting from Tiger Lake for which support is added.

Also add a quirk mechanism for several early hardware differences and bugs.
For Tiger Lake, do not support Watcher and Crashlog capabilities since they
will not be compatible with future products. Also, use a quirk to fix
the discovery table offset.
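
To make the table offset quirk concrete, the sketch below reproduces the arithmetic of pmt_add_dev() in plain C: split the table dword into BAR index and offset, optionally apply the shift quirk, and compute where each discovery entry starts. The names are hypothetical, and the behaviour is inferred from the macros and loop in this patch, not from a hardware specification.

```c
#include <stdint.h>

struct table_loc {
	unsigned int bar;	/* BAR index, bits 2:0 of the table dword */
	uint32_t offset;	/* byte offset into that BAR */
};

static struct table_loc dvsec_table_decode(uint32_t table, int shift_quirk)
{
	struct table_loc loc;

	loc.bar = table & 0x7;
	loc.offset = table & ~UINT32_C(0x7);
	/* Early hardware needs the masked value shifted right by 3. */
	if (shift_quirk)
		loc.offset >>= 3;
	return loc;
}

/* Start address of discovery entry i; entry_size is counted in dwords. */
static uint64_t entry_start(uint64_t bar_start, uint32_t offset,
			    unsigned int entry_size, unsigned int i)
{
	return bar_start + offset + (uint64_t)i * (entry_size * 4u);
}
```

Each resource handed to the capability driver then spans entry_size * 4 bytes starting at entry_start().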

Co-developed-by: Alexander Duyck <[email protected]>
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: David E. Box <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
---
MAINTAINERS | 5 +
drivers/mfd/Kconfig | 10 ++
drivers/mfd/Makefile | 1 +
drivers/mfd/intel_pmt.c | 220 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 236 insertions(+)
create mode 100644 drivers/mfd/intel_pmt.c

diff --git a/MAINTAINERS b/MAINTAINERS
index f0569cf304ca..b69429c70330 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8845,6 +8845,11 @@ F: drivers/mfd/intel_soc_pmic*
F: include/linux/mfd/intel_msic.h
F: include/linux/mfd/intel_soc_pmic*

+INTEL PMT DRIVER
+M: "David E. Box" <[email protected]>
+S: Maintained
+F: drivers/mfd/intel_pmt.c
+
INTEL PRO/WIRELESS 2100, 2200BG, 2915ABG NETWORK CONNECTION SUPPORT
M: Stanislav Yakovlev <[email protected]>
L: [email protected]
diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index a37d7d171382..5dd05f1b8ce5 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -670,6 +670,16 @@ config MFD_INTEL_PMC_BXT
Register and P-unit access. In addition this creates devices
for iTCO watchdog and telemetry that are part of the PMC.

+config MFD_INTEL_PMT
+ tristate "Intel Platform Monitoring Technology (PMT) support"
+ depends on PCI
+ select MFD_CORE
+ help
+ The Intel Platform Monitoring Technology (PMT) is an interface that
+ provides access to hardware monitor registers. This driver supports
+ Telemetry, Watcher, and Crashlog PMT capabilities/devices for
+ platforms starting from Tiger Lake.
+
config MFD_IPAQ_MICRO
bool "Atmel Micro ASIC (iPAQ h3100/h3600/h3700) Support"
depends on SA1100_H3100 || SA1100_H3600
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index 9367a92f795a..1961b4737985 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -216,6 +216,7 @@ obj-$(CONFIG_MFD_INTEL_LPSS_PCI) += intel-lpss-pci.o
obj-$(CONFIG_MFD_INTEL_LPSS_ACPI) += intel-lpss-acpi.o
obj-$(CONFIG_MFD_INTEL_MSIC) += intel_msic.o
obj-$(CONFIG_MFD_INTEL_PMC_BXT) += intel_pmc_bxt.o
+obj-$(CONFIG_MFD_INTEL_PMT) += intel_pmt.o
obj-$(CONFIG_MFD_PALMAS) += palmas.o
obj-$(CONFIG_MFD_VIPERBOARD) += viperboard.o
obj-$(CONFIG_MFD_RC5T583) += rc5t583.o rc5t583-irq.o
diff --git a/drivers/mfd/intel_pmt.c b/drivers/mfd/intel_pmt.c
new file mode 100644
index 000000000000..0e572b105101
--- /dev/null
+++ b/drivers/mfd/intel_pmt.c
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Platform Monitoring Technology PMT driver
+ *
+ * Copyright (c) 2020, Intel Corporation.
+ * All Rights Reserved.
+ *
+ * Author: David E. Box <[email protected]>
+ */
+
+#include <linux/bits.h>
+#include <linux/kernel.h>
+#include <linux/mfd/core.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/types.h>
+
+/* Intel DVSEC capability vendor space offsets */
+#define INTEL_DVSEC_ENTRIES 0xA
+#define INTEL_DVSEC_SIZE 0xB
+#define INTEL_DVSEC_TABLE 0xC
+#define INTEL_DVSEC_TABLE_BAR(x) ((x) & GENMASK(2, 0))
+#define INTEL_DVSEC_TABLE_OFFSET(x) ((x) & GENMASK(31, 3))
+#define INTEL_DVSEC_ENTRY_SIZE 4
+
+/* PMT capabilities */
+#define DVSEC_INTEL_ID_TELEMETRY 2
+#define DVSEC_INTEL_ID_WATCHER 3
+#define DVSEC_INTEL_ID_CRASHLOG 4
+
+struct intel_dvsec_header {
+ u16 length;
+ u16 id;
+ u8 num_entries;
+ u8 entry_size;
+ u8 tbir;
+ u32 offset;
+};
+
+enum pmt_quirks {
+ /* Watcher capability not supported */
+ PMT_QUIRK_NO_WATCHER = BIT(0),
+
+ /* Crashlog capability not supported */
+ PMT_QUIRK_NO_CRASHLOG = BIT(1),
+
+ /* Use shift instead of mask to read discovery table offset */
+ PMT_QUIRK_TABLE_SHIFT = BIT(2),
+};
+
+struct pmt_platform_info {
+ unsigned long quirks;
+};
+
+static const struct pmt_platform_info tgl_info = {
+ .quirks = PMT_QUIRK_NO_WATCHER | PMT_QUIRK_NO_CRASHLOG |
+ PMT_QUIRK_TABLE_SHIFT,
+};
+
+static int pmt_add_dev(struct pci_dev *pdev, struct intel_dvsec_header *header,
+ struct pmt_platform_info *info)
+{
+ struct device *dev = &pdev->dev;
+ struct resource *res, *tmp;
+ struct mfd_cell *cell;
+ const char *name;
+ int count = header->num_entries;
+ int size = header->entry_size;
+ int id = header->id;
+ int i;
+
+ switch (id) {
+ case DVSEC_INTEL_ID_TELEMETRY:
+ name = "pmt_telemetry";
+ break;
+ case DVSEC_INTEL_ID_WATCHER:
+ if (info->quirks & PMT_QUIRK_NO_WATCHER) {
+ dev_info(dev, "Watcher not supported\n");
+ return 0;
+ }
+ name = "pmt_watcher";
+ break;
+ case DVSEC_INTEL_ID_CRASHLOG:
+ if (info->quirks & PMT_QUIRK_NO_CRASHLOG) {
+ dev_info(dev, "Crashlog not supported\n");
+ return 0;
+ }
+ name = "pmt_crashlog";
+ break;
+ default:
+ dev_err(dev, "Unrecognized PMT capability: %d\n", id);
+ return -EINVAL;
+ }
+
+ if (!header->num_entries || !header->entry_size) {
+ dev_err(dev, "Invalid count or size for %s header\n", name);
+ return -EINVAL;
+ }
+
+ cell = devm_kzalloc(dev, sizeof(*cell), GFP_KERNEL);
+ if (!cell)
+ return -ENOMEM;
+
+ res = devm_kcalloc(dev, count, sizeof(*res), GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ if (info->quirks & PMT_QUIRK_TABLE_SHIFT)
+ header->offset >>= 3;
+
+ /*
+ * The PMT DVSEC contains the starting offset and count for a block of
+ * discovery tables, each providing access to monitoring facilities for
+ * a section of the device. Create a resource list of these tables to
+ * provide to the driver.
+ */
+ for (i = 0, tmp = res; i < count; i++, tmp++) {
+ tmp->start = pdev->resource[header->tbir].start +
+ header->offset + i * (size << 2);
+ tmp->end = tmp->start + (size << 2) - 1;
+ tmp->flags = IORESOURCE_MEM;
+ }
+
+ cell->resources = res;
+ cell->num_resources = count;
+ cell->name = name;
+
+ return devm_mfd_add_devices(dev, PLATFORM_DEVID_AUTO, cell, 1, NULL, 0,
+ NULL);
+}
+
+static int pmt_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct pmt_platform_info *info;
+ bool found_devices = false;
+ int ret, pos = 0;
+
+ ret = pcim_enable_device(pdev);
+ if (ret)
+ return ret;
+
+ info = devm_kmemdup(&pdev->dev, (void *)id->driver_data, sizeof(*info),
+ GFP_KERNEL);
+ if (!info)
+ return -ENOMEM;
+
+ do {
+ struct intel_dvsec_header header;
+ u32 table;
+ u16 vid;
+
+ pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC);
+ if (!pos)
+ break;
+
+ pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vid);
+ if (vid != PCI_VENDOR_ID_INTEL)
+ continue;
+
+ pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER2,
+ &header.id);
+ pci_read_config_byte(pdev, pos + INTEL_DVSEC_ENTRIES,
+ &header.num_entries);
+ pci_read_config_byte(pdev, pos + INTEL_DVSEC_SIZE,
+ &header.entry_size);
+ pci_read_config_dword(pdev, pos + INTEL_DVSEC_TABLE,
+ &table);
+
+ header.tbir = INTEL_DVSEC_TABLE_BAR(table);
+ header.offset = INTEL_DVSEC_TABLE_OFFSET(table);
+
+ ret = pmt_add_dev(pdev, &header, info);
+ if (ret) {
+ dev_warn(&pdev->dev,
+ "Failed to add device for DVSEC id %d\n",
+ header.id);
+ continue;
+ }
+
+ found_devices = true;
+ } while (true);
+
+ if (!found_devices) {
+ dev_err(&pdev->dev, "No supported PMT capabilities found.\n");
+ return -ENODEV;
+ }
+
+ pm_runtime_put(&pdev->dev);
+ pm_runtime_allow(&pdev->dev);
+
+ return 0;
+}
+
+static void pmt_pci_remove(struct pci_dev *pdev)
+{
+ pm_runtime_forbid(&pdev->dev);
+ pm_runtime_get_sync(&pdev->dev);
+}
+
+#define PCI_DEVICE_ID_INTEL_PMT_TGL 0x9a0d
+static const struct pci_device_id pmt_pci_ids[] = {
+ { PCI_DEVICE_DATA(INTEL, PMT_TGL, &tgl_info) },
+ { }
+};
+MODULE_DEVICE_TABLE(pci, pmt_pci_ids);
+
+static struct pci_driver pmt_pci_driver = {
+ .name = "intel-pmt",
+ .id_table = pmt_pci_ids,
+ .probe = pmt_pci_probe,
+ .remove = pmt_pci_remove,
+};
+module_pci_driver(pmt_pci_driver);
+
+MODULE_AUTHOR("David E. Box <[email protected]>");
+MODULE_DESCRIPTION("Intel Platform Monitoring Technology PMT driver");
+MODULE_LICENSE("GPL v2");
--
2.20.1

2020-07-29 21:37:32

by David E. Box

Subject: [PATCH V5 0/3] Intel Platform Monitoring Technology

Intel Platform Monitoring Technology (PMT) is an architecture for
enumerating and accessing hardware monitoring capabilities on a device.
With customers increasingly asking for hardware telemetry, engineers not
only have to figure out how to measure and collect data, but also how to
deliver it and make it discoverable. The latter may be through some device
specific method requiring device specific tools to collect the data. This
in turn requires customers to manage a suite of different tools in order to
collect the differing assortment of monitoring data on their systems. Even
when such information can be provided in kernel drivers, they may require
constant maintenance to update register mappings as they change with
firmware updates and new versions of hardware. PMT provides a solution for
discovering and reading telemetry from a device through a hardware agnostic
framework that allows for updates to systems without requiring patches to
the kernel or software tools.

PMT defines several capabilities to support collecting monitoring data from
hardware. All are discoverable as separate instances of the PCIe Designated
Vendor-Specific Extended Capability (DVSEC) with the Intel vendor code. The DVSEC ID
field uniquely identifies the capability. Each DVSEC also provides a BAR
offset to a header that defines capability-specific attributes, including
GUID, feature type, offset and length, as well as configuration settings
where applicable. The GUID uniquely identifies the register space of any
monitor data exposed by the capability. The GUID is associated with an XML
file from the vendor that describes the mapping of the register space along
with properties of the monitor data. This allows vendors to perform
firmware updates that can change the mapping (e.g. add new metrics) without
requiring any changes to drivers or software tools. The new mapping is
confirmed by an updated GUID, read from the hardware, which software uses
with a new XML.
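
One way a tool might act on the GUID is sketched below: keep a table of known GUIDs and pick the matching description file after reading the value from the hardware. This is purely illustrative. The GUID values and file names are made-up placeholders, and the lookup helper is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct guid_map {
	uint32_t guid;
	const char *xml;	/* placeholder name, not a real vendor file */
};

/* Made-up entries standing in for mappings before and after an update. */
static const struct guid_map known_maps[] = {
	{ 0x11111111, "example-old.xml" },
	{ 0x22222222, "example-new.xml" },
};

/* Return the description file for guid, or NULL if it is unknown. */
static const char *xml_for_guid(uint32_t guid)
{
	size_t i;

	for (i = 0; i < sizeof(known_maps) / sizeof(known_maps[0]); i++)
		if (known_maps[i].guid == guid)
			return known_maps[i].xml;
	return NULL;
}
```

A NULL result would tell the tool that firmware now reports a mapping for which it has no description yet.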

The current capabilities defined by PMT are Telemetry, Watcher, and
Crashlog. The Telemetry capability provides access to a continuous block
of read only data. The Watcher capability provides access to hardware
sampling and tracing features. Crashlog provides access to device crash
dumps. While there is some relationship between capabilities (Watcher can
be configured to sample from the Telemetry data set), each exists as a
standalone feature with no dependency on any other. The design therefore splits
them into individual, capability specific drivers. MFD is used to create
platform devices for each capability so that they may be managed by their
own driver. The PMT architecture is (for the most part) agnostic to the
type of device it can collect from. Device nodes are consequently generic
in naming, e.g. /dev/telem<n> and /dev/smplr<n>. Each capability driver
creates a class to manage the list of devices supporting it. Software can
determine which devices support a PMT feature by searching through each
device node entry in the sysfs class folder. It can additionally determine
if a particular device supports a PMT feature by checking for a PMT class
folder in the device folder.
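
The class folder search described above could look roughly like the following userspace sketch, which lists entries named telem<n> in a directory. The helper names are hypothetical. The class path a caller would pass, /sys/class/pmt_telemetry, comes from the ABI document in patch 3.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if name looks like "telem<digits>", else 0. */
static int is_telem_name(const char *name)
{
	const char *p;

	if (strncmp(name, "telem", 5) != 0 || name[5] == '\0')
		return 0;
	for (p = name + 5; *p; p++)
		if (!isdigit((unsigned char)*p))
			return 0;
	return 1;
}

/* Print every telemetry endpoint under class_dir; return count or -1. */
static int list_telem_endpoints(const char *class_dir)
{
	DIR *dir = opendir(class_dir);
	struct dirent *de;
	int count = 0;

	if (!dir)
		return -1;
	while ((de = readdir(dir)) != NULL) {
		if (is_telem_name(de->d_name)) {
			printf("%s/%s\n", class_dir, de->d_name);
			count++;
		}
	}
	closedir(dir);
	return count;
}
```

Pointing the same function at a pmt_telemetry folder under a specific device's sysfs directory would cover the second lookup the paragraph mentions, assuming that folder uses the same entry names.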

This patch set provides support for the PMT framework, along with support
for Telemetry on Tiger Lake.

Changes from V4:
- Replace MFD with PMT in driver title
- Fix commit tags in chronological order
- Fix includes in alphabetical order
- Use 'raw' string instead of defines for device names
- Add an error message when returning an error code for
unrecognized capability id
- Use dev_err instead of dev_warn for messages when returning
an error
- Change while loop to call pci_find_next_ext_capability once
- Add missing continue in while loop
- Keep PCI platform defines using PCI_DEVICE_DATA magic tied to
the pci_device_id table
- Comment and kernel message cleanup

Changes from V3:
- Write out full acronym for DVSEC in PCI patch commit message and
add 'Designated' to comments
- remove unused variable caught by kernel test robot <[email protected]>
- Add required Co-developed-by signoffs, noted by Andy
- Allow access using new CAP_PERFMON capability as suggested by
Alexey Budankov
- Fix spacing in Kconfig, noted by Randy
- Other style changes and fixups suggested by Andy

Changes from V2:
- In order to handle certain HW bugs from the telemetry capability
driver, create a single platform device per capability instead of
a device per entry. Add the entry data as device resources and
let the capability driver manage them as a set allowing for
cleaner HW bug resolution.
- Handle discovery table offset bug in intel_pmt.c
- Handle overlapping regions in intel_pmt_telemetry.c
- Add description of sysfs class to testing ABI.
- Don't check size and count until confirming support for the PMT
capability to avoid bailing out when we need to skip it.
- Remove unneeded header file. Move code to the intel_pmt.c, the
only place where it's needed.
- Remove now unused platform data.
- Add missing header files types.h, bits.h.
- Rename file name and build options from telem to telemetry.
- Code cleanup suggested by Andy S.
- x86 mailing list added.

Changes from V1:
- In the telemetry driver, set the device in device_create() to
the parent PCI device (the monitoring device) for clear
association in sysfs. Was set before to the platform device
created by the PCI parent.
- Move telem struct into driver and delete unneeded header file.
- Start telem device numbering from 0 instead of 1. 1 was used
due to anticipated changes, no longer needed.
- Use helper macros suggested by Andy S.
- Rename class to pmt_telemetry, spelling out full name
- Move monitor device name defines to common header
- Coding style, spelling, and Makefile/MAINTAINERS ordering fixes

David E. Box (3):
PCI: Add defines for Designated Vendor-Specific Extended Capability
mfd: Intel Platform Monitoring Technology support
platform/x86: Intel PMT Telemetry capability driver

.../ABI/testing/sysfs-class-pmt_telemetry     |  46 ++
MAINTAINERS                                   |   6 +
drivers/mfd/Kconfig                           |  10 +
drivers/mfd/Makefile                          |   1 +
drivers/mfd/intel_pmt.c                       | 220 +++++++++
drivers/platform/x86/Kconfig                  |  10 +
drivers/platform/x86/Makefile                 |   1 +
drivers/platform/x86/intel_pmt_telemetry.c    | 448 ++++++++++++++++++
include/uapi/linux/pci_regs.h                 |   5 +
9 files changed, 747 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-class-pmt_telemetry
create mode 100644 drivers/mfd/intel_pmt.c
create mode 100644 drivers/platform/x86/intel_pmt_telemetry.c

--
2.20.1

2020-07-29 23:01:03

by Mark D Rustad

Subject: Re: [PATCH V4 2/3] mfd: Intel Platform Monitoring Technology support

at 12:58 AM, Lee Jones <[email protected]> wrote:

> If you do:
>
> do {
> int pos;
>
> pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC);
> if (!pos)
> break;
>
> Then you can invoke pci_find_next_ext_capability() once, no?

Part of your suggestion here won't work, because pos needs to be
initialized to 0 the first time. As such it needs to be declared and
initialized outside the loop. Other than that it may be ok.
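
A minimal userspace sketch of the point above, assuming a mocked lookup
in place of the kernel's pci_find_next_ext_capability(). The names
fake_pci_dev, fake_ext_cap, fake_find_next_ext_capability() and
count_dvsec() are hypothetical and exist only for illustration; this is
not the code from the patch. It shows pos declared and zeroed outside
the loop, so the first call searches from the start and the lookup
still has a single call site:

```c
#include <assert.h>
#include <stddef.h>

/* Same value the PCI patch in this series adds to pci_regs.h. */
#define PCI_EXT_CAP_ID_DVSEC 0x23

/* Hypothetical stand-ins for a device's extended capability list. */
struct fake_ext_cap {
	int pos;	/* config-space offset of this capability */
	int id;		/* extended capability ID */
};

struct fake_pci_dev {
	const struct fake_ext_cap *caps;	/* sorted by ascending pos */
	size_t ncaps;
};

/*
 * Mock that mimics the calling convention of the kernel helper:
 * start == 0 searches from the beginning, otherwise the search resumes
 * after offset 'start'. Returns 0 when nothing further matches.
 */
int fake_find_next_ext_capability(const struct fake_pci_dev *dev,
				  int start, int cap)
{
	size_t i;

	for (i = 0; i < dev->ncaps; i++) {
		if (dev->caps[i].pos > start && dev->caps[i].id == cap)
			return dev->caps[i].pos;
	}
	return 0;
}

/* Count DVSEC instances using one call site inside the loop. */
int count_dvsec(const struct fake_pci_dev *dev)
{
	int pos = 0;	/* must live outside the loop and start at 0 */
	int count = 0;

	assert(dev != NULL);

	do {
		pos = fake_find_next_ext_capability(dev, pos,
						    PCI_EXT_CAP_ID_DVSEC);
		if (!pos)
			break;
		count++;
	} while (1);

	return count;
}
```

If pos were declared inside the loop body, it would be uninitialized on
every pass and the previous offset would be lost, so the search could
neither start at 0 nor advance.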

--
Mark Rustad, [email protected]



2020-07-30 17:55:23

by David E. Box

Subject: Re: [PATCH V4 2/3] mfd: Intel Platform Monitoring Technology support

On Wed, 2020-07-29 at 15:59 -0700, Mark D Rustad wrote:
> at 12:58 AM, Lee Jones <[email protected]> wrote:
>
> > If you do:
> >
> > do {
> > int pos;
> >
> > pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC);
> > if (!pos)
> > break;
> >
> > Then you can invoke pci_find_next_ext_capability() once, no?
>
> Part of your suggestion here won't work, because pos needs to be
> initialized to 0 the first time. As such it needs to be declared and
> initialized outside the loop. Other than that it may be ok.

Already done in V5. Thanks.

David

2020-07-31 06:20:13

by Lee Jones

Subject: Re: [PATCH V4 2/3] mfd: Intel Platform Monitoring Technology support

On Wed, 29 Jul 2020, Mark D Rustad wrote:

> at 12:58 AM, Lee Jones <[email protected]> wrote:
>
> > If you do:
> >
> > do {
> > int pos;
> >
> > pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DVSEC);
> > if (!pos)
> > break;
> >
> > Then you can invoke pci_find_next_ext_capability() once, no?
>
> Part of your suggestion here won't work, because pos needs to be initialized
> to 0 the first time. As such it needs to be declared and initialized outside
> the loop. Other than that it may be ok.

Right. It was just an example I quickly hacked out.

Feel free to move the variable, or make it static, etc.

--
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

2020-08-10 14:17:29

by David E. Box

Subject: Re: [PATCH V5 0/3] Intel Platform Monitoring Technology

Friendly ping.

On Wed, 2020-07-29 at 14:37 -0700, David E. Box wrote:
> Intel Platform Monitoring Technology (PMT) is an architecture for
> enumerating and accessing hardware monitoring capabilities on a
> device.

2020-08-10 14:44:01

by Umesh A

Subject: Re: [PATCH V5 0/3] Intel Platform Monitoring Technology

MCTP and PLDM are the latest in platform management technology. Software
applications and drivers can be implemented on the PCIe platform.
I previously spent some time on this.

On Mon, Aug 10, 2020 at 7:49 PM David E. Box
<[email protected]> wrote:
>
> Friendly ping.

2020-08-11 08:05:39

by Lee Jones

Subject: Re: [PATCH V5 0/3] Intel Platform Monitoring Technology

On Mon, 10 Aug 2020, David E. Box wrote:

> Friendly ping.

Don't do that. Sending contentless pings is seldom helpful.

If you think your set has been dropped please just send a [RESEND].

This is probably worth doing anyway, since you've sent v2, v3, v4 and
now v5 has reply-tos of one another. The thread has become quite
messy as a result.

Also please take the time to identify where we are with respect to the
current release cycle. The merge-window is open presently. Meaning
that most maintainers are busy, either sending out pull-requests or
ramping up for the next cycle (or just taking a quick breather).

--
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

2020-08-11 14:51:10

by David E. Box

Subject: Re: [PATCH V5 0/3] Intel Platform Monitoring Technology

On Tue, 2020-08-11 at 09:04 +0100, Lee Jones wrote:
> On Mon, 10 Aug 2020, David E. Box wrote:
>
> > Friendly ping.
>
> Don't do that. Sending contentless pings is seldom helpful.
>
> If you think your set has been dropped please just send a [RESEND].
>
> This is probably worth doing anyway, since you've sent v2, v3, v4 and
> now v5 has reply-tos of one another. The thread has become quite
> messy as a result.
>
> Also please take the time to identify where we are with respect to the
> current release cycle. The merge-window is open presently. Meaning
> that most maintainers are busy, either sending out pull-requests or
> ramping up for the next cycle (or just taking a quick breather).
>

No problem. I'll resend v5 in a new thread when rc1 is tagged. Thanks.