2023-06-21 09:38:01

by Yicong Yang

[permalink] [raw]
Subject: [PATCH v6 0/5] Improve PTT filter interface and some fixes

From: Yicong Yang <[email protected]>

This series tends to improve the PTT's filter interface in 2 aspects (Patch 2&3):
- Support dynamically filter updating to response to hotplug
Previous the supported filter list is settled down once the driver probed and
it maybe out-of-date if hotplug events happen later. User need to reload the
driver to update list. Patch 1/2 enable the driver to update the list by
registering a PCI bus notifier and the filter list will always be the latest.
- Export the available filters through sysfs
Previous user needs to calculate the filters and filter value using device's
BDF number, which requires the user to know the hardware well. Patch 3/3 tends
to export the available filter information through sysfs attributes, the filter
value will be gotten by reading the file. This will be more user friendly.

In order to support above function, this series also includes a patch 1/4 to factor
out the allocation and release function of PTT filters.

Also includes an improvement and a fix. Patch 4 tends to set proper PMU capability
to avoid collecting unnecessary data to save the storage. Patch 5 fix an improper
use of pci_irq_vector() which have potential problem.

Change since v5:
- PERF_PMU_CAP_EXCLUSIVE is still needed so keep it as is, thanks Junhao for pointing it
Link: https://lore.kernel.org/all/[email protected]/

Change since v4:
- Add tags for Patch 2,3,5 and tweak some comments/docs in Patch 3
Link: https://lore.kernel.org/all/[email protected]/

Change since v3:
- Addressed the comment from Jonathan. Add tags for Patch 1 and 4. Thanks.
- Add one bugfix in Patch 5/5
Link: https://lore.kernel.org/linux-pci/[email protected]/

Change since v2:
- Fix one possible issue for dereferencing a NULL pointer
Link: https://lore.kernel.org/linux-pci/[email protected]/

Change since v1:
- Drop the patch for handling the cpumask since it seems to be redundant
- Refine of the codes per Jonathan
- Add Patch 1/4 for refactor the filters allocation and release
- Thanks the review of Jonathan.
Link: https://lore.kernel.org/linux-pci/[email protected]/T/#m47e4de552d69920035214b3e91080cdc185f61f5

Yicong Yang (5):
hwtracing: hisi_ptt: Factor out filter allocation and release
operation
hwtracing: hisi_ptt: Add support for dynamically updating the filter
list
hwtracing: hisi_ptt: Export available filters through sysfs
hwtracing: hisi_ptt: Advertise PERF_PMU_CAP_NO_EXCLUDE for PTT PMU
hwtracing: hisi_ptt: Fix potential sleep in atomic context

.../ABI/testing/sysfs-devices-hisi_ptt | 52 ++
Documentation/trace/hisi-ptt.rst | 12 +-
drivers/hwtracing/ptt/hisi_ptt.c | 444 ++++++++++++++++--
drivers/hwtracing/ptt/hisi_ptt.h | 56 +++
4 files changed, 526 insertions(+), 38 deletions(-)

--
2.24.0



2023-06-21 09:41:30

by Yicong Yang

[permalink] [raw]
Subject: [PATCH v6 5/5] hwtracing: hisi_ptt: Fix potential sleep in atomic context

From: Yicong Yang <[email protected]>

We're using pci_irq_vector() to obtain the interrupt number and then
bind it to the CPU start perf under the protection of spinlock in
pmu::start(). pci_irq_vector() might sleep since [1] because it will
call msi_domain_get_virq() to get the MSI interrupt number and it
needs to acquire dev->msi.data->mutex. Getting a mutex will sleep on
contention. So use pci_irq_vector() in an atomic context is problematic.

This patch cached the interrupt number in the probe() and uses the
cached data instead to avoid potential sleep.

[1] commit 82ff8e6b78fc ("PCI/MSI: Use msi_get_virq() in pci_get_vector()")

Fixes: ff0de066b463 ("hwtracing: hisi_ptt: Add trace function support for HiSilicon PCIe Tune and Trace device")
Reviewed-by: Jonathan Cameron <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
---
drivers/hwtracing/ptt/hisi_ptt.c | 12 +++++-------
drivers/hwtracing/ptt/hisi_ptt.h | 2 ++
2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
index 103fb6b9bffb..ba081b6d2435 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.c
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -341,13 +341,13 @@ static int hisi_ptt_register_irq(struct hisi_ptt *hisi_ptt)
if (ret < 0)
return ret;

- ret = devm_request_threaded_irq(&pdev->dev,
- pci_irq_vector(pdev, HISI_PTT_TRACE_DMA_IRQ),
+ hisi_ptt->trace_irq = pci_irq_vector(pdev, HISI_PTT_TRACE_DMA_IRQ);
+ ret = devm_request_threaded_irq(&pdev->dev, hisi_ptt->trace_irq,
NULL, hisi_ptt_isr, 0,
DRV_NAME, hisi_ptt);
if (ret) {
pci_err(pdev, "failed to request irq %d, ret = %d\n",
- pci_irq_vector(pdev, HISI_PTT_TRACE_DMA_IRQ), ret);
+ hisi_ptt->trace_irq, ret);
return ret;
}

@@ -1098,8 +1098,7 @@ static void hisi_ptt_pmu_start(struct perf_event *event, int flags)
* core in event_function_local(). If CPU passed is offline we'll fail
* here, just log it since we can do nothing here.
*/
- ret = irq_set_affinity(pci_irq_vector(hisi_ptt->pdev, HISI_PTT_TRACE_DMA_IRQ),
- cpumask_of(cpu));
+ ret = irq_set_affinity(hisi_ptt->trace_irq, cpumask_of(cpu));
if (ret)
dev_warn(dev, "failed to set the affinity of trace interrupt\n");

@@ -1394,8 +1393,7 @@ static int hisi_ptt_cpu_teardown(unsigned int cpu, struct hlist_node *node)
* Also make sure the interrupt bind to the migrated CPU as well. Warn
* the user on failure here.
*/
- if (irq_set_affinity(pci_irq_vector(hisi_ptt->pdev, HISI_PTT_TRACE_DMA_IRQ),
- cpumask_of(target)))
+ if (irq_set_affinity(hisi_ptt->trace_irq, cpumask_of(target)))
dev_warn(dev, "failed to set the affinity of trace interrupt\n");

hisi_ptt->trace_ctrl.on_cpu = target;
diff --git a/drivers/hwtracing/ptt/hisi_ptt.h b/drivers/hwtracing/ptt/hisi_ptt.h
index 164012dba4ec..e17f045d7e72 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.h
+++ b/drivers/hwtracing/ptt/hisi_ptt.h
@@ -201,6 +201,7 @@ struct hisi_ptt_pmu_buf {
* @pdev: pci_dev of this PTT device
* @tune_lock: lock to serialize the tune process
* @pmu_lock: lock to serialize the perf process
+ * @trace_irq: interrupt number used by trace
* @upper_bdf: the upper BDF range of the PCI devices managed by this PTT device
* @lower_bdf: the lower BDF range of the PCI devices managed by this PTT device
* @port_filters: the filter list of root ports
@@ -221,6 +222,7 @@ struct hisi_ptt {
struct pci_dev *pdev;
struct mutex tune_lock;
spinlock_t pmu_lock;
+ int trace_irq;
u32 upper_bdf;
u32 lower_bdf;

--
2.24.0


2023-06-21 09:43:16

by Yicong Yang

[permalink] [raw]
Subject: [PATCH v6 4/5] hwtracing: hisi_ptt: Advertise PERF_PMU_CAP_NO_EXCLUDE for PTT PMU

From: Yicong Yang <[email protected]>

The PTT trace collects PCIe TLP headers from the PCIe link and don't
have the ability to exclude certain context. It doesn't support itrace
as well. So replace PERF_PMU_CAP_ITRACE with PERF_PMU_CAP_NO_EXCLUDE.
This will greatly save the storage of final data. Tested tracing idle
link for ~15s, without this patch we'll collect ~28.682MB data for
additional information and with this patch it reduced to ~0.226MB.

Reviewed-by: Jonathan Cameron <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
---
drivers/hwtracing/ptt/hisi_ptt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
index 5c366a757573..103fb6b9bffb 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.c
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -1212,7 +1212,7 @@ static int hisi_ptt_register_pmu(struct hisi_ptt *hisi_ptt)

hisi_ptt->hisi_ptt_pmu = (struct pmu) {
.module = THIS_MODULE,
- .capabilities = PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_ITRACE,
+ .capabilities = PERF_PMU_CAP_EXCLUSIVE | PERF_PMU_CAP_NO_EXCLUDE,
.task_ctx_nr = perf_sw_context,
.attr_groups = hisi_ptt_pmu_groups,
.event_init = hisi_ptt_pmu_event_init,
--
2.24.0


2023-06-21 09:43:16

by Yicong Yang

[permalink] [raw]
Subject: [PATCH v6 3/5] hwtracing: hisi_ptt: Export available filters through sysfs

From: Yicong Yang <[email protected]>

The PTT can only filter the traced TLP headers by the Root Ports or the
Requester ID of the Endpoint, which are located on the same PCIe core of
the PTT device. The filter value used is derived from the BDF number of
the supported Root Port or the Endpoint. It's not friendly enough for the
users since it requires the user to be familiar enough with the platform
and calculate the filter value manually.

This patch export the available filters through sysfs. Each available
filters is presented as an individual file with the name of the BDF
number of the related PCIe device. The files are created under
$(PTT PMU dir)/available_root_port_filters and
$(PTT PMU dir)/available_requester_filters respectively. The filter
value can be known by reading the related file.

Then the users can easily know the available filters for trace and get
the filter values without calculating.

Reviewed-by: Jonathan Cameron <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
---
.../ABI/testing/sysfs-devices-hisi_ptt | 52 +++++
Documentation/trace/hisi-ptt.rst | 6 +
drivers/hwtracing/ptt/hisi_ptt.c | 207 ++++++++++++++++++
drivers/hwtracing/ptt/hisi_ptt.h | 14 ++
4 files changed, 279 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-hisi_ptt b/Documentation/ABI/testing/sysfs-devices-hisi_ptt
index 82de6d710266..d7e206b4901c 100644
--- a/Documentation/ABI/testing/sysfs-devices-hisi_ptt
+++ b/Documentation/ABI/testing/sysfs-devices-hisi_ptt
@@ -59,3 +59,55 @@ Description: (RW) Control the allocated buffer watermark of outbound packets.
The available tune data is [0, 1, 2]. Writing a negative value
will return an error, and out of range values will be converted
to 2. The value indicates a probable level of the event.
+
+What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/root_port_filters
+Date: May 2023
+KernelVersion: 6.5
+Contact: Yicong Yang <[email protected]>
+Description: This directory contains the files providing the PCIe Root Port filters
+ information used for PTT trace. Each file is named after the supported
+ Root Port device name <domain>:<bus>:<device>.<function>.
+
+ See the description of the "filter" in Documentation/trace/hisi-ptt.rst
+ for more information.
+
+What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/root_port_filters/multiselect
+Date: May 2023
+KernelVersion: 6.5
+Contact: Yicong Yang <[email protected]>
+Description: (Read) Indicates if this kind of filter can be selected at the same
+ time as others filters, or must be used on it's own. 1 indicates
+ the former case and 0 indicates the latter.
+
+What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/root_port_filters/<bdf>
+Date: May 2023
+KernelVersion: 6.5
+Contact: Yicong Yang <[email protected]>
+Description: (Read) Indicates the filter value of this Root Port filter, which
+ can be used to control the TLP headers to trace by the PTT trace.
+
+What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/requester_filters
+Date: May 2023
+KernelVersion: 6.5
+Contact: Yicong Yang <[email protected]>
+Description: This directory contains the files providing the PCIe Requester filters
+ information used for PTT trace. Each file is named after the supported
+ Endpoint device name <domain>:<bus>:<device>.<function>.
+
+ See the description of the "filter" in Documentation/trace/hisi-ptt.rst
+ for more information.
+
+What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/requester_filters/multiselect
+Date: May 2023
+KernelVersion: 6.5
+Contact: Yicong Yang <[email protected]>
+Description: (Read) Indicates if this kind of filter can be selected at the same
+ time as others filters, or must be used on it's own. 1 indicates
+ the former case and 0 indicates the latter.
+
+What: /sys/devices/hisi_ptt<sicl_id>_<core_id>/requester_filters/<bdf>
+Date: May 2023
+KernelVersion: 6.5
+Contact: Yicong Yang <[email protected]>
+Description: (Read) Indicates the filter value of this Requester filter, which
+ can be used to control the TLP headers to trace by the PTT trace.
diff --git a/Documentation/trace/hisi-ptt.rst b/Documentation/trace/hisi-ptt.rst
index 69c538153838..989255eb5622 100644
--- a/Documentation/trace/hisi-ptt.rst
+++ b/Documentation/trace/hisi-ptt.rst
@@ -148,6 +148,12 @@ For example, if the desired filter is Endpoint function 0000:01:00.1 the filter
value will be 0x00101. If the desired filter is Root Port 0000:00:10.0 then
then filter value is calculated as 0x80001.

+The driver also presents every supported Root Port and Requester filter through
+sysfs. Each filter will be an individual file with name of its related PCIe
+device name (domain:bus:device.function). The files of Root Port filters are
+under $(PTT PMU dir)/root_port_filters and files of Requester filters
+are under $(PTT PMU dir)/requester_filters.
+
Note that multiple Root Ports can be specified at one time, but only one
Endpoint function can be specified in one trace. Specifying both Root Port
and function at the same time is not supported. Driver maintains a list of
diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
index 31471488fc91..5c366a757573 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.c
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -405,6 +405,144 @@ hisi_ptt_alloc_add_filter(struct hisi_ptt *hisi_ptt, u16 devid, bool is_port)
return filter;
}

+static ssize_t hisi_ptt_filter_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct hisi_ptt_filter_desc *filter;
+ unsigned long filter_val;
+
+ filter = container_of(attr, struct hisi_ptt_filter_desc, attr);
+ filter_val = hisi_ptt_get_filter_val(filter->devid, filter->is_port) |
+ (filter->is_port ? HISI_PTT_PMU_FILTER_IS_PORT : 0);
+
+ return sysfs_emit(buf, "0x%05lx\n", filter_val);
+}
+
+static int hisi_ptt_create_rp_filter_attr(struct hisi_ptt *hisi_ptt,
+ struct hisi_ptt_filter_desc *filter)
+{
+ struct kobject *kobj = &hisi_ptt->hisi_ptt_pmu.dev->kobj;
+
+ sysfs_attr_init(&filter->attr.attr);
+ filter->attr.attr.name = filter->name;
+ filter->attr.attr.mode = 0400; /* DEVICE_ATTR_ADMIN_RO */
+ filter->attr.show = hisi_ptt_filter_show;
+
+ return sysfs_add_file_to_group(kobj, &filter->attr.attr,
+ HISI_PTT_RP_FILTERS_GRP_NAME);
+}
+
+static void hisi_ptt_remove_rp_filter_attr(struct hisi_ptt *hisi_ptt,
+ struct hisi_ptt_filter_desc *filter)
+{
+ struct kobject *kobj = &hisi_ptt->hisi_ptt_pmu.dev->kobj;
+
+ sysfs_remove_file_from_group(kobj, &filter->attr.attr,
+ HISI_PTT_RP_FILTERS_GRP_NAME);
+}
+
+static int hisi_ptt_create_req_filter_attr(struct hisi_ptt *hisi_ptt,
+ struct hisi_ptt_filter_desc *filter)
+{
+ struct kobject *kobj = &hisi_ptt->hisi_ptt_pmu.dev->kobj;
+
+ sysfs_attr_init(&filter->attr.attr);
+ filter->attr.attr.name = filter->name;
+ filter->attr.attr.mode = 0400; /* DEVICE_ATTR_ADMIN_RO */
+ filter->attr.show = hisi_ptt_filter_show;
+
+ return sysfs_add_file_to_group(kobj, &filter->attr.attr,
+ HISI_PTT_REQ_FILTERS_GRP_NAME);
+}
+
+static void hisi_ptt_remove_req_filter_attr(struct hisi_ptt *hisi_ptt,
+ struct hisi_ptt_filter_desc *filter)
+{
+ struct kobject *kobj = &hisi_ptt->hisi_ptt_pmu.dev->kobj;
+
+ sysfs_remove_file_from_group(kobj, &filter->attr.attr,
+ HISI_PTT_REQ_FILTERS_GRP_NAME);
+}
+
+static int hisi_ptt_create_filter_attr(struct hisi_ptt *hisi_ptt,
+ struct hisi_ptt_filter_desc *filter)
+{
+ int ret;
+
+ if (filter->is_port)
+ ret = hisi_ptt_create_rp_filter_attr(hisi_ptt, filter);
+ else
+ ret = hisi_ptt_create_req_filter_attr(hisi_ptt, filter);
+
+ if (ret)
+ pci_err(hisi_ptt->pdev, "failed to create sysfs attribute for filter %s\n",
+ filter->name);
+
+ return ret;
+}
+
+static void hisi_ptt_remove_filter_attr(struct hisi_ptt *hisi_ptt,
+ struct hisi_ptt_filter_desc *filter)
+{
+ if (filter->is_port)
+ hisi_ptt_remove_rp_filter_attr(hisi_ptt, filter);
+ else
+ hisi_ptt_remove_req_filter_attr(hisi_ptt, filter);
+}
+
+static void hisi_ptt_remove_all_filter_attributes(void *data)
+{
+ struct hisi_ptt_filter_desc *filter;
+ struct hisi_ptt *hisi_ptt = data;
+
+ mutex_lock(&hisi_ptt->filter_lock);
+
+ list_for_each_entry(filter, &hisi_ptt->req_filters, list)
+ hisi_ptt_remove_filter_attr(hisi_ptt, filter);
+
+ list_for_each_entry(filter, &hisi_ptt->port_filters, list)
+ hisi_ptt_remove_filter_attr(hisi_ptt, filter);
+
+ hisi_ptt->sysfs_inited = false;
+ mutex_unlock(&hisi_ptt->filter_lock);
+}
+
+static int hisi_ptt_init_filter_attributes(struct hisi_ptt *hisi_ptt)
+{
+ struct hisi_ptt_filter_desc *filter;
+ int ret;
+
+ mutex_lock(&hisi_ptt->filter_lock);
+
+ /*
+ * Register the reset callback in the first stage. In reset we traverse
+ * the filters list to remove the sysfs attributes so the callback can
+ * be called safely even without below filter attributes creation.
+ */
+ ret = devm_add_action(&hisi_ptt->pdev->dev,
+ hisi_ptt_remove_all_filter_attributes,
+ hisi_ptt);
+ if (ret)
+ goto out;
+
+ list_for_each_entry(filter, &hisi_ptt->port_filters, list) {
+ ret = hisi_ptt_create_filter_attr(hisi_ptt, filter);
+ if (ret)
+ goto out;
+ }
+
+ list_for_each_entry(filter, &hisi_ptt->req_filters, list) {
+ ret = hisi_ptt_create_filter_attr(hisi_ptt, filter);
+ if (ret)
+ goto out;
+ }
+
+ hisi_ptt->sysfs_inited = true;
+out:
+ mutex_unlock(&hisi_ptt->filter_lock);
+ return ret;
+}
+
static void hisi_ptt_update_filters(struct work_struct *work)
{
struct delayed_work *delayed_work = to_delayed_work(work);
@@ -429,6 +567,18 @@ static void hisi_ptt_update_filters(struct work_struct *work)
filter = hisi_ptt_alloc_add_filter(hisi_ptt, info.devid, info.is_port);
if (!filter)
continue;
+
+ /*
+ * If filters' sysfs entries hasn't been initialized,
+ * then we're still at probe stage. Add the filters to
+ * the list and later hisi_ptt_init_filter_attributes()
+ * will create sysfs attributes for all the filters.
+ */
+ if (hisi_ptt->sysfs_inited &&
+ hisi_ptt_create_filter_attr(hisi_ptt, filter)) {
+ hisi_ptt_del_free_filter(hisi_ptt, filter);
+ continue;
+ }
} else {
struct hisi_ptt_filter_desc *tmp;
struct list_head *target_list;
@@ -438,6 +588,9 @@ static void hisi_ptt_update_filters(struct work_struct *work)

list_for_each_entry_safe(filter, tmp, target_list, list)
if (filter->devid == info.devid) {
+ if (hisi_ptt->sysfs_inited)
+ hisi_ptt_remove_filter_attr(hisi_ptt, filter);
+
hisi_ptt_del_free_filter(hisi_ptt, filter);
break;
}
@@ -663,10 +816,58 @@ static struct attribute_group hisi_ptt_pmu_format_group = {
.attrs = hisi_ptt_pmu_format_attrs,
};

+static ssize_t hisi_ptt_filter_multiselect_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct dev_ext_attribute *ext_attr;
+
+ ext_attr = container_of(attr, struct dev_ext_attribute, attr);
+ return sysfs_emit(buf, "%s\n", (char *)ext_attr->var);
+}
+
+static struct dev_ext_attribute root_port_filters_multiselect = {
+ .attr = {
+ .attr = { .name = "multiselect", .mode = 0400 },
+ .show = hisi_ptt_filter_multiselect_show,
+ },
+ .var = "1",
+};
+
+static struct attribute *hisi_ptt_pmu_root_ports_attrs[] = {
+ &root_port_filters_multiselect.attr.attr,
+ NULL
+};
+
+static struct attribute_group hisi_ptt_pmu_root_ports_group = {
+ .name = HISI_PTT_RP_FILTERS_GRP_NAME,
+ .attrs = hisi_ptt_pmu_root_ports_attrs,
+};
+
+static struct dev_ext_attribute requester_filters_multiselect = {
+ .attr = {
+ .attr = { .name = "multiselect", .mode = 0400 },
+ .show = hisi_ptt_filter_multiselect_show,
+ },
+ .var = "0",
+};
+
+static struct attribute *hisi_ptt_pmu_requesters_attrs[] = {
+ &requester_filters_multiselect.attr.attr,
+ NULL
+};
+
+static struct attribute_group hisi_ptt_pmu_requesters_group = {
+ .name = HISI_PTT_REQ_FILTERS_GRP_NAME,
+ .attrs = hisi_ptt_pmu_requesters_attrs,
+};
+
static const struct attribute_group *hisi_ptt_pmu_groups[] = {
&hisi_ptt_cpumask_attr_group,
&hisi_ptt_pmu_format_group,
&hisi_ptt_tune_group,
+ &hisi_ptt_pmu_root_ports_group,
+ &hisi_ptt_pmu_requesters_group,
NULL
};

@@ -1147,6 +1348,12 @@ static int hisi_ptt_probe(struct pci_dev *pdev,
return ret;
}

+ ret = hisi_ptt_init_filter_attributes(hisi_ptt);
+ if (ret) {
+ pci_err(pdev, "failed to init sysfs filter attributes, ret = %d", ret);
+ return ret;
+ }
+
return 0;
}

diff --git a/drivers/hwtracing/ptt/hisi_ptt.h b/drivers/hwtracing/ptt/hisi_ptt.h
index d598411e9cb8..164012dba4ec 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.h
+++ b/drivers/hwtracing/ptt/hisi_ptt.h
@@ -11,6 +11,7 @@

#include <linux/bits.h>
#include <linux/cpumask.h>
+#include <linux/device.h>
#include <linux/kfifo.h>
#include <linux/list.h>
#include <linux/mutex.h>
@@ -139,14 +140,25 @@ struct hisi_ptt_trace_ctrl {
u32 type:4;
};

+/*
+ * sysfs attribute group name for root port filters and requester filters:
+ * /sys/devices/hisi_ptt<sicl_id>_<core_id>/root_port_filters
+ * and
+ * /sys/devices/hisi_ptt<sicl_id>_<core_id>/requester_filters
+ */
+#define HISI_PTT_RP_FILTERS_GRP_NAME "root_port_filters"
+#define HISI_PTT_REQ_FILTERS_GRP_NAME "requester_filters"
+
/**
* struct hisi_ptt_filter_desc - Descriptor of the PTT trace filter
+ * @attr: sysfs attribute of this filter
* @list: entry of this descriptor in the filter list
* @is_port: the PCI device of the filter is a Root Port or not
* @name: name of this filter, same as the name of the related PCI device
* @devid: the PCI device's devid of the filter
*/
struct hisi_ptt_filter_desc {
+ struct device_attribute attr;
struct list_head list;
bool is_port;
char *name;
@@ -194,6 +206,7 @@ struct hisi_ptt_pmu_buf {
* @port_filters: the filter list of root ports
* @req_filters: the filter list of requester ID
* @filter_lock: lock to protect the filters
+ * @sysfs_inited: whether the filters' sysfs entries has been initialized
* @port_mask: port mask of the managed root ports
* @work: delayed work for filter updating
* @filter_update_lock: spinlock to protect the filter update fifo
@@ -221,6 +234,7 @@ struct hisi_ptt {
struct list_head port_filters;
struct list_head req_filters;
struct mutex filter_lock;
+ bool sysfs_inited;
u16 port_mask;

/*
--
2.24.0


2023-06-21 09:51:08

by Yicong Yang

[permalink] [raw]
Subject: [PATCH v6 2/5] hwtracing: hisi_ptt: Add support for dynamically updating the filter list

From: Yicong Yang <[email protected]>

The PCIe devices supported by the PTT trace can be removed/rescanned by
hotplug or through sysfs. Add support for dynamically updating the
available filter list by registering a PCI bus notifier block. Then user
can always get latest information about available tracing filters and
driver can block the invalid filters of which related devices no longer
exist in the system.

Reviewed-by: Jonathan Cameron <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
---
Documentation/trace/hisi-ptt.rst | 6 +-
drivers/hwtracing/ptt/hisi_ptt.c | 170 +++++++++++++++++++++++++++++--
drivers/hwtracing/ptt/hisi_ptt.h | 40 ++++++++
3 files changed, 205 insertions(+), 11 deletions(-)

diff --git a/Documentation/trace/hisi-ptt.rst b/Documentation/trace/hisi-ptt.rst
index 4f87d8e21065..69c538153838 100644
--- a/Documentation/trace/hisi-ptt.rst
+++ b/Documentation/trace/hisi-ptt.rst
@@ -153,9 +153,9 @@ Endpoint function can be specified in one trace. Specifying both Root Port
and function at the same time is not supported. Driver maintains a list of
available filters and will check the invalid inputs.

-Currently the available filters are detected in driver's probe. If the supported
-devices are removed/added after probe, you may need to reload the driver to update
-the filters.
+The available filters will be dynamically updated, which means you will always
+get correct filter information when hotplug events happen, or when you manually
+remove/rescan the devices.

2. Type
-------
diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c
index 548cfef51ace..31471488fc91 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.c
+++ b/drivers/hwtracing/ptt/hisi_ptt.c
@@ -357,24 +357,42 @@ static int hisi_ptt_register_irq(struct hisi_ptt *hisi_ptt)
static void hisi_ptt_del_free_filter(struct hisi_ptt *hisi_ptt,
struct hisi_ptt_filter_desc *filter)
{
+ if (filter->is_port)
+ hisi_ptt->port_mask &= ~hisi_ptt_get_filter_val(filter->devid, true);
+
list_del(&filter->list);
+ kfree(filter->name);
kfree(filter);
}

static struct hisi_ptt_filter_desc *
-hisi_ptt_alloc_add_filter(struct hisi_ptt *hisi_ptt, struct pci_dev *pdev)
+hisi_ptt_alloc_add_filter(struct hisi_ptt *hisi_ptt, u16 devid, bool is_port)
{
struct hisi_ptt_filter_desc *filter;
+ u8 devfn = devid & 0xff;
+ char *filter_name;
+
+ filter_name = kasprintf(GFP_KERNEL, "%04x:%02x:%02x.%d", pci_domain_nr(hisi_ptt->pdev->bus),
+ PCI_BUS_NUM(devid), PCI_SLOT(devfn), PCI_FUNC(devfn));
+ if (!filter_name) {
+ pci_err(hisi_ptt->pdev, "failed to allocate name for filter %04x:%02x:%02x.%d\n",
+ pci_domain_nr(hisi_ptt->pdev->bus), PCI_BUS_NUM(devid),
+ PCI_SLOT(devfn), PCI_FUNC(devfn));
+ return NULL;
+ }

filter = kzalloc(sizeof(*filter), GFP_KERNEL);
if (!filter) {
pci_err(hisi_ptt->pdev, "failed to add filter for %s\n",
- pci_name(pdev));
+ filter_name);
+ kfree(filter_name);
return NULL;
}

- filter->devid = PCI_DEVID(pdev->bus->number, pdev->devfn);
- filter->is_port = pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT;
+ filter->name = filter_name;
+ filter->is_port = is_port;
+ filter->devid = devid;
+
if (filter->is_port) {
list_add_tail(&filter->list, &hisi_ptt->port_filters);

@@ -387,6 +405,102 @@ hisi_ptt_alloc_add_filter(struct hisi_ptt *hisi_ptt, struct pci_dev *pdev)
return filter;
}

+static void hisi_ptt_update_filters(struct work_struct *work)
+{
+ struct delayed_work *delayed_work = to_delayed_work(work);
+ struct hisi_ptt_filter_update_info info;
+ struct hisi_ptt_filter_desc *filter;
+ struct hisi_ptt *hisi_ptt;
+
+ hisi_ptt = container_of(delayed_work, struct hisi_ptt, work);
+
+ if (!mutex_trylock(&hisi_ptt->filter_lock)) {
+ schedule_delayed_work(&hisi_ptt->work, HISI_PTT_WORK_DELAY_MS);
+ return;
+ }
+
+ while (kfifo_get(&hisi_ptt->filter_update_kfifo, &info)) {
+ if (info.is_add) {
+ /*
+ * Notify the users if failed to add this filter, others
+ * still work and available. See the comments in
+ * hisi_ptt_init_filters().
+ */
+ filter = hisi_ptt_alloc_add_filter(hisi_ptt, info.devid, info.is_port);
+ if (!filter)
+ continue;
+ } else {
+ struct hisi_ptt_filter_desc *tmp;
+ struct list_head *target_list;
+
+ target_list = info.is_port ? &hisi_ptt->port_filters :
+ &hisi_ptt->req_filters;
+
+ list_for_each_entry_safe(filter, tmp, target_list, list)
+ if (filter->devid == info.devid) {
+ hisi_ptt_del_free_filter(hisi_ptt, filter);
+ break;
+ }
+ }
+ }
+
+ mutex_unlock(&hisi_ptt->filter_lock);
+}
+
+/*
+ * A PCI bus notifier is used here for dynamically updating the filter
+ * list.
+ */
+static int hisi_ptt_notifier_call(struct notifier_block *nb, unsigned long action,
+ void *data)
+{
+ struct hisi_ptt *hisi_ptt = container_of(nb, struct hisi_ptt, hisi_ptt_nb);
+ struct hisi_ptt_filter_update_info info;
+ struct pci_dev *pdev, *root_port;
+ struct device *dev = data;
+ u32 port_devid;
+
+ pdev = to_pci_dev(dev);
+ root_port = pcie_find_root_port(pdev);
+ if (!root_port)
+ return 0;
+
+ port_devid = PCI_DEVID(root_port->bus->number, root_port->devfn);
+ if (port_devid < hisi_ptt->lower_bdf ||
+ port_devid > hisi_ptt->upper_bdf)
+ return 0;
+
+ info.is_port = pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT;
+ info.devid = PCI_DEVID(pdev->bus->number, pdev->devfn);
+
+ switch (action) {
+ case BUS_NOTIFY_ADD_DEVICE:
+ info.is_add = true;
+ break;
+ case BUS_NOTIFY_DEL_DEVICE:
+ info.is_add = false;
+ break;
+ default:
+ return 0;
+ }
+
+ /*
+ * The FIFO size is 16 which is sufficient for almost all the cases,
+ * since each PCIe core will have most 8 Root Ports (typically only
+ * 1~4 Root Ports). On failure log the failed filter and let user
+ * handle it.
+ */
+ if (kfifo_in_spinlocked(&hisi_ptt->filter_update_kfifo, &info, 1,
+ &hisi_ptt->filter_update_lock))
+ schedule_delayed_work(&hisi_ptt->work, 0);
+ else
+ pci_warn(hisi_ptt->pdev,
+ "filter update fifo overflow for target %s\n",
+ pci_name(pdev));
+
+ return 0;
+}
+
static int hisi_ptt_init_filters(struct pci_dev *pdev, void *data)
{
struct pci_dev *root_port = pcie_find_root_port(pdev);
@@ -407,7 +521,8 @@ static int hisi_ptt_init_filters(struct pci_dev *pdev, void *data)
* should be partial initialized and users would know which filter fails
* through the log. Other functions of PTT device are still available.
*/
- filter = hisi_ptt_alloc_add_filter(hisi_ptt, pdev);
+ filter = hisi_ptt_alloc_add_filter(hisi_ptt, PCI_DEVID(pdev->bus->number, pdev->devfn),
+ pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT);
if (!filter)
return -ENOMEM;

@@ -466,8 +581,13 @@ static int hisi_ptt_init_ctrls(struct hisi_ptt *hisi_ptt)
int ret;
u32 reg;

+ INIT_DELAYED_WORK(&hisi_ptt->work, hisi_ptt_update_filters);
+ INIT_KFIFO(hisi_ptt->filter_update_kfifo);
+ spin_lock_init(&hisi_ptt->filter_update_lock);
+
INIT_LIST_HEAD(&hisi_ptt->port_filters);
INIT_LIST_HEAD(&hisi_ptt->req_filters);
+ mutex_init(&hisi_ptt->filter_lock);

ret = hisi_ptt_config_trace_buf(hisi_ptt);
if (ret)
@@ -620,6 +740,7 @@ static int hisi_ptt_trace_valid_filter(struct hisi_ptt *hisi_ptt, u64 config)
{
unsigned long val, port_mask = hisi_ptt->port_mask;
struct hisi_ptt_filter_desc *filter;
+ int ret = 0;

hisi_ptt->trace_ctrl.is_port = FIELD_GET(HISI_PTT_PMU_FILTER_IS_PORT, config);
val = FIELD_GET(HISI_PTT_PMU_FILTER_VAL_MASK, config);
@@ -633,16 +754,20 @@ static int hisi_ptt_trace_valid_filter(struct hisi_ptt *hisi_ptt, u64 config)
* For Requester ID filters, walk the available filter list to see
* whether we have one matched.
*/
+ mutex_lock(&hisi_ptt->filter_lock);
if (!hisi_ptt->trace_ctrl.is_port) {
list_for_each_entry(filter, &hisi_ptt->req_filters, list) {
if (val == hisi_ptt_get_filter_val(filter->devid, filter->is_port))
- return 0;
+ goto out;
}
} else if (bitmap_subset(&val, &port_mask, BITS_PER_LONG)) {
- return 0;
+ goto out;
}

- return -EINVAL;
+ ret = -EINVAL;
+out:
+ mutex_unlock(&hisi_ptt->filter_lock);
+ return ret;
}

static void hisi_ptt_pmu_init_configs(struct hisi_ptt *hisi_ptt, struct perf_event *event)
@@ -916,6 +1041,31 @@ static int hisi_ptt_register_pmu(struct hisi_ptt *hisi_ptt)
&hisi_ptt->hisi_ptt_pmu);
}

+static void hisi_ptt_unregister_filter_update_notifier(void *data)
+{
+ struct hisi_ptt *hisi_ptt = data;
+
+ bus_unregister_notifier(&pci_bus_type, &hisi_ptt->hisi_ptt_nb);
+
+ /* Cancel any work that has been queued */
+ cancel_delayed_work_sync(&hisi_ptt->work);
+}
+
+/* Register the bus notifier for dynamically updating the filter list */
+static int hisi_ptt_register_filter_update_notifier(struct hisi_ptt *hisi_ptt)
+{
+ int ret;
+
+ hisi_ptt->hisi_ptt_nb.notifier_call = hisi_ptt_notifier_call;
+ ret = bus_register_notifier(&pci_bus_type, &hisi_ptt->hisi_ptt_nb);
+ if (ret)
+ return ret;
+
+ return devm_add_action_or_reset(&hisi_ptt->pdev->dev,
+ hisi_ptt_unregister_filter_update_notifier,
+ hisi_ptt);
+}
+
/*
* The DMA of PTT trace can only use direct mappings due to some
* hardware restriction. Check whether there is no IOMMU or the
@@ -987,6 +1137,10 @@ static int hisi_ptt_probe(struct pci_dev *pdev,
return ret;
}

+ ret = hisi_ptt_register_filter_update_notifier(hisi_ptt);
+ if (ret)
+ pci_warn(pdev, "failed to register filter update notifier, ret = %d", ret);
+
ret = hisi_ptt_register_pmu(hisi_ptt);
if (ret) {
pci_err(pdev, "failed to register PMU device, ret = %d", ret);
diff --git a/drivers/hwtracing/ptt/hisi_ptt.h b/drivers/hwtracing/ptt/hisi_ptt.h
index 5beb1648c93a..d598411e9cb8 100644
--- a/drivers/hwtracing/ptt/hisi_ptt.h
+++ b/drivers/hwtracing/ptt/hisi_ptt.h
@@ -11,12 +11,15 @@

#include <linux/bits.h>
#include <linux/cpumask.h>
+#include <linux/kfifo.h>
#include <linux/list.h>
#include <linux/mutex.h>
+#include <linux/notifier.h>
#include <linux/pci.h>
#include <linux/perf_event.h>
#include <linux/spinlock.h>
#include <linux/types.h>
+#include <linux/workqueue.h>

#define DRV_NAME "hisi_ptt"

@@ -71,6 +74,11 @@
#define HISI_PTT_WAIT_TRACE_TIMEOUT_US 100UL
#define HISI_PTT_WAIT_POLL_INTERVAL_US 10UL

+/* FIFO size for dynamically updating the PTT trace filter list. */
+#define HISI_PTT_FILTER_UPDATE_FIFO_SIZE 16
+/* Delay time for filter updating work */
+#define HISI_PTT_WORK_DELAY_MS 100UL
+
#define HISI_PCIE_CORE_PORT_ID(devfn) ((PCI_SLOT(devfn) & 0x7) << 1)

/* Definition of the PMU configs */
@@ -135,11 +143,25 @@ struct hisi_ptt_trace_ctrl {
* struct hisi_ptt_filter_desc - Descriptor of the PTT trace filter
* @list: entry of this descriptor in the filter list
* @is_port: the PCI device of the filter is a Root Port or not
+ * @name: name of this filter, same as the name of the related PCI device
* @devid: the PCI device's devid of the filter
*/
struct hisi_ptt_filter_desc {
struct list_head list;
bool is_port;
+ char *name;
+ u16 devid;
+};
+
+/**
+ * struct hisi_ptt_filter_update_info - Information for PTT filter updating
+ * @is_port: the PCI device to update is a Root Port or not
+ * @is_add: adding to the filter or not
+ * @devid: the PCI device's devid of the filter
+ */
+struct hisi_ptt_filter_update_info {
+ bool is_port;
+ bool is_add;
u16 devid;
};

@@ -160,6 +182,7 @@ struct hisi_ptt_pmu_buf {
/**
* struct hisi_ptt - Per PTT device data
* @trace_ctrl: the control information of PTT trace
+ * @hisi_ptt_nb: dynamic filter update notifier
* @hotplug_node: node for register cpu hotplug event
* @hisi_ptt_pmu: the pum device of trace
* @iobase: base IO address of the device
@@ -170,10 +193,15 @@ struct hisi_ptt_pmu_buf {
* @lower_bdf: the lower BDF range of the PCI devices managed by this PTT device
* @port_filters: the filter list of root ports
* @req_filters: the filter list of requester ID
+ * @filter_lock: lock to protect the filters
* @port_mask: port mask of the managed root ports
+ * @work: delayed work for filter updating
+ * @filter_update_lock: spinlock to protect the filter update fifo
+ * @filter_update_fifo: fifo of the filters waiting to update the filter list
*/
struct hisi_ptt {
struct hisi_ptt_trace_ctrl trace_ctrl;
+ struct notifier_block hisi_ptt_nb;
struct hlist_node hotplug_node;
struct pmu hisi_ptt_pmu;
void __iomem *iobase;
@@ -192,7 +220,19 @@ struct hisi_ptt {
*/
struct list_head port_filters;
struct list_head req_filters;
+ struct mutex filter_lock;
u16 port_mask;
+
+ /*
+ * We use a delayed work here to avoid indefinitely waiting for
+ * the hisi_ptt->mutex which protecting the filter list. The
+ * work will be delayed only if the mutex can not be held,
+ * otherwise no delay will be applied.
+ */
+ struct delayed_work work;
+ spinlock_t filter_update_lock;
+ DECLARE_KFIFO(filter_update_kfifo, struct hisi_ptt_filter_update_info,
+ HISI_PTT_FILTER_UPDATE_FIFO_SIZE);
};

#define to_hisi_ptt(pmu) container_of(pmu, struct hisi_ptt, hisi_ptt_pmu)
--
2.24.0


2023-06-21 11:27:08

by Suzuki K Poulose

[permalink] [raw]
Subject: Re: [PATCH v6 0/5] Improve PTT filter interface and some fixes

On Wed, 21 Jun 2023 17:27:59 +0800, Yicong Yang wrote:
> From: Yicong Yang <[email protected]>
>
> This series tends to improve the PTT's filter interface in 2 aspects (Patch 2&3):
> - Support dynamically filter updating to response to hotplug
> Previous the supported filter list is settled down once the driver probed and
> it maybe out-of-date if hotplug events happen later. User need to reload the
> driver to update list. Patch 1/2 enable the driver to update the list by
> registering a PCI bus notifier and the filter list will always be the latest.
> - Export the available filters through sysfs
> Previous user needs to calculate the filters and filter value using device's
> BDF number, which requires the user to know the hardware well. Patch 3/3 tends
> to export the available filter information through sysfs attributes, the filter
> value will be gotten by reading the file. This will be more user friendly.
>
> [...]

Applied, thanks!

[1/5] hwtracing: hisi_ptt: Factor out filter allocation and release operation
commit: a3ecaba7017f5d02d1ad60229cc14d5f0cda0c20
[2/5] hwtracing: hisi_ptt: Add support for dynamically updating the filter list
commit: 556ef09392dbc2d0b9aad5fd880d5d11addfc40d
[3/5] hwtracing: hisi_ptt: Export available filters through sysfs
commit: 6373c463ac894e41cab24469d1947ff91aaea486
[4/5] hwtracing: hisi_ptt: Advertise PERF_PMU_CAP_NO_EXCLUDE for PTT PMU
commit: 45c90292ad0e275ef4b870838b3b5273b3ef8ade
[5/5] hwtracing: hisi_ptt: Fix potential sleep in atomic context
commit: 6c50384ef8b94a527445e3694ae6549e1f15d859

Best regards,
--
Suzuki K Poulose <[email protected]>