From: Yicong Yang <[email protected]>
This series mainly fix and optimize the several usage of the driver:
- Add more events to complement the TLP bandwidth counting
- Fix the wrong counting on using the event group
- Properly check the target filter to avoid invalid filter value
- Optimize the handling of related events which are not in an event group
- Update the docs
Junhao He (4):
drivers/perf: hisi_pcie: Check the target filter properly
drivers/perf: hisi_pcie: Relax the check on related events
drivers/perf: hisi_pcie: Merge find_related_event() and
get_event_idx()
docs: perf: Update usage for target filter of hisi-pcie-pmu
Yicong Yang (3):
drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_filter()
drivers/perf: hisi_pcie: Fix incorrect counting under metric mode
drivers/perf: hisi_pcie: Add more events for counting TLP bandwidth
.../admin-guide/perf/hisi-pcie-pmu.rst | 31 ++++--
drivers/perf/hisilicon/hisi_pcie_pmu.c | 99 ++++++++++---------
2 files changed, 74 insertions(+), 56 deletions(-)
--
2.24.0
From: Yicong Yang <[email protected]>
Factor out retrieving of the register value for the
corresponding event from hisi_pcie_config_filter() into a
new function hisi_pcie_pmu_get_filter() allowing future reuse.
Signed-off-by: Yicong Yang <[email protected]>
---
drivers/perf/hisilicon/hisi_pcie_pmu.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
index b90ba8aca3fa..11a819cd07f2 100644
--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
+++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
@@ -216,10 +216,8 @@ static void hisi_pcie_pmu_writeq(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset,
writeq_relaxed(val, pcie_pmu->base + offset);
}
-static void hisi_pcie_pmu_config_filter(struct perf_event *event)
+static u64 hisi_pcie_pmu_get_filter(struct perf_event *event)
{
- struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
- struct hw_perf_event *hwc = &event->hw;
u64 port, trig_len, thr_len, len_mode;
u64 reg = HISI_PCIE_INIT_SET;
@@ -256,6 +254,15 @@ static void hisi_pcie_pmu_config_filter(struct perf_event *event)
else
reg |= FIELD_PREP(HISI_PCIE_LEN_M, HISI_PCIE_LEN_M_DEFAULT);
+ return reg;
+}
+
+static void hisi_pcie_pmu_config_filter(struct perf_event *event)
+{
+ struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
+ struct hw_perf_event *hwc = &event->hw;
+ u64 reg = hisi_pcie_pmu_get_filter(event);
+
hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, hwc->idx, reg);
}
--
2.24.0
From: Junhao He <[email protected]>
If we use two events with the same filter and related event type
(see the following example), the driver check whether they are related
events and are in the same group, otherwise the function
hisi_pcie_pmu_find_related_event() return -EINVAL, then the 2nd event
cannot count but the 1st event is running, although the PCIe PMU has
other idle counters.
In this case, The perf event scheduler will make the two events to
multiplex a counter, if the user use the formula
(1st event_value / 2nd event_value) to calculate the bandwidth, he/she
won't get the correct value, because they are not counting at the
same period.
This patch tries to fix this by making the related events to use
different idle counters if they are not in the same event group.
And finally, I'm going to say. The related events are best used in the
same group [1]. There are two ways to know if they are related events.
a) By event name, such as the latency events "xxx_latency, xxx_cnt" or
bandwidth events "xxx_flux, xxx_time".
b) By event type, such as "event=0xXXXX, event=0x1XXXX".
Use group to count the related events:
[1] -e "{pmu_name/xxx_latency,port=1/,pmu_name/xxx_cnt,port=1/}"
example:
1st event: hisi_pcie0_core1/event=0x804,port=1
2nd event: hisi_pcie0_core1/event=0x10804,port=1
test cmd:
perf stata -e hisi_pcie0_core1/event=0x804,port=1/ \
-e hisi_pcie0_core1/event=0x10804,port=1/
before patch:
25,281 hisi_pcie0_core1/event=0x804,port=1/ (49.91%)
470,598 hisi_pcie0_core1/event=0x10804,port=1/ (50.09%)
after patch:
24,147 hisi_pcie0_core1/event=0x804,port=1/
474,558 hisi_pcie0_core1/event=0x10804,port=1/
Signed-off-by: Junhao He <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
---
drivers/perf/hisilicon/hisi_pcie_pmu.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
index b91f03c02c57..1b45aeb82859 100644
--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
+++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
@@ -408,14 +408,10 @@ static int hisi_pcie_pmu_find_related_event(struct hisi_pcie_pmu *pcie_pmu,
if (!sibling)
continue;
- if (!hisi_pcie_pmu_cmp_event(sibling, event))
- continue;
-
/* Related events must be used in group */
- if (sibling->group_leader == event->group_leader)
+ if (hisi_pcie_pmu_cmp_event(sibling, event) &&
+ sibling->group_leader == event->group_leader)
return idx;
- else
- return -EINVAL;
}
return idx;
--
2.24.0
From: Junhao He <[email protected]>
One of the "port" and "bdf" target filter interface must be set, and
the related events should preferably used in the same group.
Update the usage in the documentation.
Signed-off-by: Junhao He <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
---
.../admin-guide/perf/hisi-pcie-pmu.rst | 31 ++++++++++++++-----
1 file changed, 23 insertions(+), 8 deletions(-)
diff --git a/Documentation/admin-guide/perf/hisi-pcie-pmu.rst b/Documentation/admin-guide/perf/hisi-pcie-pmu.rst
index 7e863662e2d4..c98ea4b0f0d4 100644
--- a/Documentation/admin-guide/perf/hisi-pcie-pmu.rst
+++ b/Documentation/admin-guide/perf/hisi-pcie-pmu.rst
@@ -37,9 +37,20 @@ Example usage of perf::
hisi_pcie0_core0/rx_mwr_cnt/ [kernel PMU event]
------------------------------------------
- $# perf stat -e hisi_pcie0_core0/rx_mwr_latency/
- $# perf stat -e hisi_pcie0_core0/rx_mwr_cnt/
- $# perf stat -g -e hisi_pcie0_core0/rx_mwr_latency/ -e hisi_pcie0_core0/rx_mwr_cnt/
+ $# perf stat -e hisi_pcie0_core0/rx_mwr_latency,port=0xffff/
+ $# perf stat -e hisi_pcie0_core0/rx_mwr_cnt,port=0xffff/
+
+The related events usually used to calculate the bandwidth, latency or others.
+They need to start and end counting at the same time, therefore related events
+are best used in the same event group to get the expected value. There are two
+ways to know if they are related events:
+a) By event name, such as the latency events "xxx_latency, xxx_cnt" or
+ bandwidth events "xxx_flux, xxx_time".
+b) By event type, such as "event=0xXXXX, event=0x1XXXX".
+
+Example usage of perf group::
+
+ $# perf stat -e "{hisi_pcie0_core0/rx_mwr_latency,port=0xffff/,hisi_pcie0_core0/rx_mwr_cnt,port=0xffff/}"
The current driver does not support sampling. So "perf record" is unsupported.
Also attach to a task is unsupported for PCIe PMU.
@@ -51,8 +62,12 @@ Filter options
PMU could only monitor the performance of traffic downstream target Root
Ports or downstream target Endpoint. PCIe PMU driver support "port" and
- "bdf" interfaces for users, and these two interfaces aren't supported at the
- same time.
+ "bdf" interfaces for users.
+ Please notice that, one of these two interfaces must be set, and can not
+ be supported at the same time. If they are both set, only "port" filter is
+ valid.
+ If "port" filter not being set or is set explicitly to zero (default), the
+ "bdf" filter will be in effect, because "bdf=0" meaning 0000:000:00.0.
- port
@@ -95,7 +110,7 @@ Filter options
Example usage of perf::
- $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,trig_len=0x4,trig_mode=1/ sleep 5
+ $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,trig_len=0x4,trig_mode=1/ sleep 5
3. Threshold filter
@@ -109,7 +124,7 @@ Filter options
Example usage of perf::
- $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,thr_len=0x4,thr_mode=1/ sleep 5
+ $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,thr_len=0x4,thr_mode=1/ sleep 5
4. TLP Length filter
@@ -127,4 +142,4 @@ Filter options
Example usage of perf::
- $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,len_mode=0x1/ sleep 5
+ $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,len_mode=0x1/ sleep 5
--
2.24.0
From: Junhao He <[email protected]>
The PMU can monitor traffic of certain target Root Port or downstream
target Endpoint. User can specify the target filter by the "port" or
"bdf" option respectively. The PMU can only monitor the Root Port or
Endpoint on the same PCIe core so the value of "port" or "bdf" should
be valid and will be checked by the driver.
Currently at least and only one of "port" and "bdf" option must be set.
If "port" filter is not set or is set explicitly to zero (default),
driver will regard the user specifies a "bdf" option since "port" option
is a bitmask of the target Root Ports and zero is not a valid
value.
If user not explicitly set "port" or "bdf" filter, the driver uses "bdf"
default value (zero) to set target filter, but driver will skip the
check of bdf=0, although it's a valid value (meaning 0000:000:00.0).
Then the user just gets zero.
Therefore, we need to check if both "port" and "bdf" are invalid, then
return failure and report warning.
Testing:
before the patch:
0 hisi_pcie0_core1/rx_mrd_flux/
0 hisi_pcie0_core1/rx_mrd_flux,port=0/
24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/
0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/
<not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
after the patch:
<not supported> hisi_pcie0_core1/rx_mrd_flux/
<not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/
24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/
<not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/
<not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
Signed-off-by: Junhao He <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
---
drivers/perf/hisilicon/hisi_pcie_pmu.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
index 83be3390686c..b91f03c02c57 100644
--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
+++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
@@ -306,10 +306,10 @@ static bool hisi_pcie_pmu_valid_filter(struct perf_event *event,
if (hisi_pcie_get_trig_len(event) > HISI_PCIE_TRIG_MAX_VAL)
return false;
- if (requester_id) {
- if (!hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
- return false;
- }
+ /* Need to explicitly set filter of "port" or "bdf" */
+ if (!hisi_pcie_get_port(event) &&
+ !hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
+ return false;
return true;
}
--
2.24.0
From: Junhao He <[email protected]>
The function xxx_find_related_event() scan all working events to find
related events. During this process, we also can find the idle counters.
If not found related events, return the first idle counter to simplify
the code.
Signed-off-by: Junhao He <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
---
drivers/perf/hisilicon/hisi_pcie_pmu.c | 55 ++++++++++----------------
1 file changed, 21 insertions(+), 34 deletions(-)
diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
index 1b45aeb82859..2edde66675e7 100644
--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
+++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
@@ -397,16 +397,24 @@ static u64 hisi_pcie_pmu_read_counter(struct perf_event *event)
return hisi_pcie_pmu_readq(pcie_pmu, event->hw.event_base, idx);
}
-static int hisi_pcie_pmu_find_related_event(struct hisi_pcie_pmu *pcie_pmu,
- struct perf_event *event)
+/*
+ * Check all work events, if a relevant event is found then we return it
+ * first, otherwise return the first idle counter (need to reset).
+ */
+static int hisi_pcie_pmu_get_event_idx(struct hisi_pcie_pmu *pcie_pmu,
+ struct perf_event *event)
{
+ int first_idle = HISI_PCIE_MAX_COUNTERS;
struct perf_event *sibling;
int idx;
for (idx = 0; idx < HISI_PCIE_MAX_COUNTERS; idx++) {
sibling = pcie_pmu->hw_events[idx];
- if (!sibling)
+ if (!sibling) {
+ if (first_idle == HISI_PCIE_MAX_COUNTERS)
+ first_idle = idx;
continue;
+ }
/* Related events must be used in group */
if (hisi_pcie_pmu_cmp_event(sibling, event) &&
@@ -414,19 +422,7 @@ static int hisi_pcie_pmu_find_related_event(struct hisi_pcie_pmu *pcie_pmu,
return idx;
}
- return idx;
-}
-
-static int hisi_pcie_pmu_get_event_idx(struct hisi_pcie_pmu *pcie_pmu)
-{
- int idx;
-
- for (idx = 0; idx < HISI_PCIE_MAX_COUNTERS; idx++) {
- if (!pcie_pmu->hw_events[idx])
- return idx;
- }
-
- return -EINVAL;
+ return first_idle;
}
static void hisi_pcie_pmu_event_update(struct perf_event *event)
@@ -552,27 +548,18 @@ static int hisi_pcie_pmu_add(struct perf_event *event, int flags)
hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
- /* Check all working events to find a related event. */
- idx = hisi_pcie_pmu_find_related_event(pcie_pmu, event);
- if (idx < 0)
- return idx;
-
- /* Current event shares an enabled counter with the related event */
- if (idx < HISI_PCIE_MAX_COUNTERS) {
- hwc->idx = idx;
- goto start_count;
- }
-
- idx = hisi_pcie_pmu_get_event_idx(pcie_pmu);
- if (idx < 0)
- return idx;
+ idx = hisi_pcie_pmu_get_event_idx(pcie_pmu, event);
+ if (idx == HISI_PCIE_MAX_COUNTERS)
+ return -EAGAIN;
hwc->idx = idx;
- pcie_pmu->hw_events[idx] = event;
- /* Reset Counter to avoid previous statistic interference. */
- hisi_pcie_pmu_reset_counter(pcie_pmu, idx);
-start_count:
+ /* No enabled counter found with related event, reset it */
+ if (!pcie_pmu->hw_events[idx]) {
+ hisi_pcie_pmu_reset_counter(pcie_pmu, idx);
+ pcie_pmu->hw_events[idx] = event;
+ }
+
if (flags & PERF_EF_START)
hisi_pcie_pmu_start(event, PERF_EF_RELOAD);
--
2.24.0
From: Yicong Yang <[email protected]>
A typical PCIe transaction is consisted of various TLP packets in both
direction. For counting bandwidth only memory read events are exported
currently. Add memory write and completion counting events of both
direction to complementation.
Signed-off-by: Yicong Yang <[email protected]>
---
drivers/perf/hisilicon/hisi_pcie_pmu.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
index 9623bed93876..83be3390686c 100644
--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
+++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
@@ -726,10 +726,18 @@ static struct attribute *hisi_pcie_pmu_events_attr[] = {
HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_cnt, 0x10210),
HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_latency, 0x0011),
HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_cnt, 0x10011),
+ HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_flux, 0x0104),
+ HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_time, 0x10104),
HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_flux, 0x0804),
HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_time, 0x10804),
+ HISI_PCIE_PMU_EVENT_ATTR(rx_cpl_flux, 0x2004),
+ HISI_PCIE_PMU_EVENT_ATTR(rx_cpl_time, 0x12004),
+ HISI_PCIE_PMU_EVENT_ATTR(tx_mwr_flux, 0x0105),
+ HISI_PCIE_PMU_EVENT_ATTR(tx_mwr_time, 0x10105),
HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_flux, 0x0405),
HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_time, 0x10405),
+ HISI_PCIE_PMU_EVENT_ATTR(tx_cpl_flux, 0x1005),
+ HISI_PCIE_PMU_EVENT_ATTR(tx_cpl_time, 0x11005),
NULL
};
--
2.24.0
From: Yicong Yang <[email protected]>
The metric counting shows incorrect results if the events in the
metric group using the same event but different filter options.
This is because we only judge the event code to decide whether
the event in the metric group should share the same hardware
counter, but ignore the settings of the filter.
For example, on a platform of 2 ports 0x1 and 0x2 but only port
0x1 has a downstream PCIe NVME device. The metric counting
shows both ports have the same counts because we misassign these
two events to one same hardware counter:
[root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}'
Performance counter stats for 'system wide':
7907484924 hisi_pcie0_core1/event=0x0104,port=0x2/
7907484924 hisi_pcie0_core1/event=0x0104,port=0x1/
10.153863691 seconds time elapsed
Fix this by using the whole config rather than the event only
to judge whether two events are the same and should share the
same hardware counter. With this patch, the metric counting in
the above case tends to be corrected:
[root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}'
Performance counter stats for 'system wide':
0 hisi_pcie0_core1/event=0x0104,port=0x2/
8123122077 hisi_pcie0_core1/event=0x0104,port=0x1/
10.152875631 seconds time elapsed
Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
Signed-off-by: Yicong Yang <[email protected]>
---
drivers/perf/hisilicon/hisi_pcie_pmu.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
index 11a819cd07f2..9623bed93876 100644
--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
+++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
@@ -314,10 +314,15 @@ static bool hisi_pcie_pmu_valid_filter(struct perf_event *event,
return true;
}
+/*
+ * Check Whether two events share the same config. The same config means not
+ * only the event code, but also the filter settings of the two events are
+ * the same.
+ */
static bool hisi_pcie_pmu_cmp_event(struct perf_event *target,
struct perf_event *event)
{
- return hisi_pcie_get_real_event(target) == hisi_pcie_get_real_event(event);
+ return hisi_pcie_pmu_get_filter(target) == hisi_pcie_pmu_get_filter(event);
}
static bool hisi_pcie_pmu_validate_event_group(struct perf_event *event)
--
2.24.0
On Thu, 8 Feb 2024 12:06:43 +0000
Jonathan Cameron <[email protected]> wrote:
> On Sun, 4 Feb 2024 15:45:21 +0800
> Yicong Yang <[email protected]> wrote:
>
> > From: Yicong Yang <[email protected]>
> >
> > Factor out retrieving of the register value for the
> > corresponding event from hisi_pcie_config_filter() into a
> > new function hisi_pcie_pmu_get_filter() allowing future reuse.
> >
> > Signed-off-by: Yicong Yang <[email protected]>
>
> Reviewed-by: Jonathan Cameron <[email protected]>
On second thoughts, this might benefit from a clearer name.
Perhaps just call it exactly what it is
hisi_pcie_pmu_get_ctrl_reg_val_to_set()
It incorporates the event code as well as the filter.
Maybe we want to rename pmu_config_filter() as well to
pmu_config_counter() which I think is the real meaning?
>
> > ---
> > drivers/perf/hisilicon/hisi_pcie_pmu.c | 13 ++++++++++---
> > 1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > index b90ba8aca3fa..11a819cd07f2 100644
> > --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> > @@ -216,10 +216,8 @@ static void hisi_pcie_pmu_writeq(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset,
> > writeq_relaxed(val, pcie_pmu->base + offset);
> > }
> >
> > -static void hisi_pcie_pmu_config_filter(struct perf_event *event)
> > +static u64 hisi_pcie_pmu_get_filter(struct perf_event *event)
> > {
> > - struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
> > - struct hw_perf_event *hwc = &event->hw;
> > u64 port, trig_len, thr_len, len_mode;
> > u64 reg = HISI_PCIE_INIT_SET;
> >
> > @@ -256,6 +254,15 @@ static void hisi_pcie_pmu_config_filter(struct perf_event *event)
> > else
> > reg |= FIELD_PREP(HISI_PCIE_LEN_M, HISI_PCIE_LEN_M_DEFAULT);
> >
> > + return reg;
> > +}
> > +
> > +static void hisi_pcie_pmu_config_filter(struct perf_event *event)
> > +{
> > + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
> > + struct hw_perf_event *hwc = &event->hw;
> > + u64 reg = hisi_pcie_pmu_get_filter(event);
> > +
> > hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, hwc->idx, reg);
> > }
> >
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Sun, 4 Feb 2024 15:45:22 +0800
Yicong Yang <[email protected]> wrote:
> From: Yicong Yang <[email protected]>
>
> The metric counting shows incorrect results if the events in the
> metric group using the same event but different filter options.
> This is because we only judge the event code to decide whether
> the event in the metric group should share the same hardware
> counter, but ignore the settings of the filter.
>
> For example, on a platform of 2 ports 0x1 and 0x2 but only port
> 0x1 has a downstream PCIe NVME device. The metric counting
> shows both ports have the same counts because we misassign these
> two events to one same hardware counter:
> [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}'
>
> Performance counter stats for 'system wide':
>
> 7907484924 hisi_pcie0_core1/event=0x0104,port=0x2/
> 7907484924 hisi_pcie0_core1/event=0x0104,port=0x1/
>
> 10.153863691 seconds time elapsed
>
> Fix this by using the whole config rather than the event only
> to judge whether two events are the same and should share the
> same hardware counter. With this patch, the metric counting in
> the above case tends to be corrected:
>
> [root@localhost perf-iostat]# ./perf stat -e '{hisi_pcie0_core1/event=0x0104,port=0x2/,hisi_pcie0_core1/event=0x0104,port=0x1/}'
>
> Performance counter stats for 'system wide':
>
> 0 hisi_pcie0_core1/event=0x0104,port=0x2/
> 8123122077 hisi_pcie0_core1/event=0x0104,port=0x1/
>
> 10.152875631 seconds time elapsed
>
> Fixes: 8404b0fbc7fb ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU")
> Signed-off-by: Yicong Yang <[email protected]>
> ---
> drivers/perf/hisilicon/hisi_pcie_pmu.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> index 11a819cd07f2..9623bed93876 100644
> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> @@ -314,10 +314,15 @@ static bool hisi_pcie_pmu_valid_filter(struct perf_event *event,
> return true;
> }
>
> +/*
> + * Check Whether two events share the same config. The same config means not
> + * only the event code, but also the filter settings of the two events are
> + * the same.
> + */
> static bool hisi_pcie_pmu_cmp_event(struct perf_event *target,
> struct perf_event *event)
> {
> - return hisi_pcie_get_real_event(target) == hisi_pcie_get_real_event(event);
> + return hisi_pcie_pmu_get_filter(target) == hisi_pcie_pmu_get_filter(event);
This is why I looked more closely at what get_filter() was doing in previous patch.
Not obvious from the naming here that this compares the event + filters rather than
now just comparing the filters.
> }
>
> static bool hisi_pcie_pmu_validate_event_group(struct perf_event *event)
On Sun, 4 Feb 2024 15:45:23 +0800
Yicong Yang <[email protected]> wrote:
> From: Yicong Yang <[email protected]>
>
> A typical PCIe transaction is consisted of various TLP packets in both
> direction. For counting bandwidth only memory read events are exported
> currently. Add memory write and completion counting events of both
> direction to complementation.
complementation?
>
> Signed-off-by: Yicong Yang <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/perf/hisilicon/hisi_pcie_pmu.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> index 9623bed93876..83be3390686c 100644
> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> @@ -726,10 +726,18 @@ static struct attribute *hisi_pcie_pmu_events_attr[] = {
> HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_cnt, 0x10210),
> HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_latency, 0x0011),
> HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_cnt, 0x10011),
> + HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_flux, 0x0104),
> + HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_time, 0x10104),
> HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_flux, 0x0804),
> HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_time, 0x10804),
> + HISI_PCIE_PMU_EVENT_ATTR(rx_cpl_flux, 0x2004),
> + HISI_PCIE_PMU_EVENT_ATTR(rx_cpl_time, 0x12004),
> + HISI_PCIE_PMU_EVENT_ATTR(tx_mwr_flux, 0x0105),
> + HISI_PCIE_PMU_EVENT_ATTR(tx_mwr_time, 0x10105),
> HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_flux, 0x0405),
> HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_time, 0x10405),
> + HISI_PCIE_PMU_EVENT_ATTR(tx_cpl_flux, 0x1005),
> + HISI_PCIE_PMU_EVENT_ATTR(tx_cpl_time, 0x11005),
> NULL
> };
>
On Sun, 4 Feb 2024 15:45:24 +0800
Yicong Yang <[email protected]> wrote:
> From: Junhao He <[email protected]>
>
> The PMU can monitor traffic of certain target Root Port or downstream
> target Endpoint. User can specify the target filter by the "port" or
> "bdf" option respectively. The PMU can only monitor the Root Port or
> Endpoint on the same PCIe core so the value of "port" or "bdf" should
> be valid and will be checked by the driver.
>
> Currently at least and only one of "port" and "bdf" option must be set.
> If "port" filter is not set or is set explicitly to zero (default),
> driver will regard the user specifies a "bdf" option since "port" option
> is a bitmask of the target Root Ports and zero is not a valid
> value.
>
> If user not explicitly set "port" or "bdf" filter, the driver uses "bdf"
> default value (zero) to set target filter, but driver will skip the
> check of bdf=0, although it's a valid value (meaning 0000:000:00.0).
> Then the user just gets zero.
>
> Therefore, we need to check if both "port" and "bdf" are invalid, then
> return failure and report warning.
>
> Testing:
> before the patch:
> 0 hisi_pcie0_core1/rx_mrd_flux/
> 0 hisi_pcie0_core1/rx_mrd_flux,port=0/
> 24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/
> 0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/
> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
Nice to include an example that works for bdf
hisi_pcie0_core1/rx_mrd_flux,bdf=1,port=0
or something like that?
>
> after the patch:
> <not supported> hisi_pcie0_core1/rx_mrd_flux/
> <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/
> 24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/
> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/
> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
>
> Signed-off-by: Junhao He <[email protected]>
> Signed-off-by: Yicong Yang <[email protected]>
Clearly the current situation is wrong, but perhaps we can
have a more intuitive scheme (could be added as a follow up patch)
and have the driver figure out which port the bdf lies below?
Maybe that's a job for userspace tooling rather than the driver, but
the driver already has verification code and it wouldn't be hard
to not just check the rp is ours, but also set the filter to specify
that rp, or maybe just set the mask to include them all?
Jonathan
> ---
> drivers/perf/hisilicon/hisi_pcie_pmu.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> index 83be3390686c..b91f03c02c57 100644
> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> @@ -306,10 +306,10 @@ static bool hisi_pcie_pmu_valid_filter(struct perf_event *event,
> if (hisi_pcie_get_trig_len(event) > HISI_PCIE_TRIG_MAX_VAL)
> return false;
>
> - if (requester_id) {
> - if (!hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
> - return false;
> - }
> + /* Need to explicitly set filter of "port" or "bdf" */
> + if (!hisi_pcie_get_port(event) &&
> + !hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
> + return false;
>
> return true;
> }
On Sun, 4 Feb 2024 15:45:21 +0800
Yicong Yang <[email protected]> wrote:
> From: Yicong Yang <[email protected]>
>
> Factor out retrieving of the register value for the
> corresponding event from hisi_pcie_config_filter() into a
> new function hisi_pcie_pmu_get_filter() allowing future reuse.
>
> Signed-off-by: Yicong Yang <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/perf/hisilicon/hisi_pcie_pmu.c | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> index b90ba8aca3fa..11a819cd07f2 100644
> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> @@ -216,10 +216,8 @@ static void hisi_pcie_pmu_writeq(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset,
> writeq_relaxed(val, pcie_pmu->base + offset);
> }
>
> -static void hisi_pcie_pmu_config_filter(struct perf_event *event)
> +static u64 hisi_pcie_pmu_get_filter(struct perf_event *event)
> {
> - struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
> - struct hw_perf_event *hwc = &event->hw;
> u64 port, trig_len, thr_len, len_mode;
> u64 reg = HISI_PCIE_INIT_SET;
>
> @@ -256,6 +254,15 @@ static void hisi_pcie_pmu_config_filter(struct perf_event *event)
> else
> reg |= FIELD_PREP(HISI_PCIE_LEN_M, HISI_PCIE_LEN_M_DEFAULT);
>
> + return reg;
> +}
> +
> +static void hisi_pcie_pmu_config_filter(struct perf_event *event)
> +{
> + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
> + struct hw_perf_event *hwc = &event->hw;
> + u64 reg = hisi_pcie_pmu_get_filter(event);
> +
> hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, hwc->idx, reg);
> }
>
On Sun, 4 Feb 2024 15:45:26 +0800
Yicong Yang <[email protected]> wrote:
> From: Junhao He <[email protected]>
>
> The function xxx_find_related_event() scan all working events to find
> related events. During this process, we also can find the idle counters.
> If not found related events, return the first idle counter to simplify
> the code.
>
> Signed-off-by: Junhao He <[email protected]>
> Signed-off-by: Yicong Yang <[email protected]>
A suggestion inline to avoid the magic HISI_PCIE_MAX_COUNTER value
being used outside of the function.
> ---
> drivers/perf/hisilicon/hisi_pcie_pmu.c | 55 ++++++++++----------------
> 1 file changed, 21 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> index 1b45aeb82859..2edde66675e7 100644
> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> @@ -397,16 +397,24 @@ static u64 hisi_pcie_pmu_read_counter(struct perf_event *event)
> return hisi_pcie_pmu_readq(pcie_pmu, event->hw.event_base, idx);
> }
>
> -static int hisi_pcie_pmu_find_related_event(struct hisi_pcie_pmu *pcie_pmu,
> - struct perf_event *event)
> +/*
> + * Check all work events, if a relevant event is found then we return it
> + * first, otherwise return the first idle counter (need to reset).
> + */
> +static int hisi_pcie_pmu_get_event_idx(struct hisi_pcie_pmu *pcie_pmu,
> + struct perf_event *event)
> {
> + int first_idle = HISI_PCIE_MAX_COUNTERS;
int first_idle = -EAGAIN;
> struct perf_event *sibling;
> int idx;
>
> for (idx = 0; idx < HISI_PCIE_MAX_COUNTERS; idx++) {
> sibling = pcie_pmu->hw_events[idx];
> - if (!sibling)
> + if (!sibling) {
> + if (first_idle == HISI_PCIE_MAX_COUNTERS)
if (first_idle == -EAGAIN)
> + first_idle = idx;
> continue;
> + }
>
> /* Related events must be used in group */
> if (hisi_pcie_pmu_cmp_event(sibling, event) &&
> @@ -414,19 +422,7 @@ static int hisi_pcie_pmu_find_related_event(struct hisi_pcie_pmu *pcie_pmu,
> return idx;
> }
>
> - return idx;
> -}
> -
> -static int hisi_pcie_pmu_get_event_idx(struct hisi_pcie_pmu *pcie_pmu)
> -{
> - int idx;
> -
> - for (idx = 0; idx < HISI_PCIE_MAX_COUNTERS; idx++) {
> - if (!pcie_pmu->hw_events[idx])
> - return idx;
> - }
> -
> - return -EINVAL;
> + return first_idle;
Then this will return -EAGAIN;
> }
>
> static void hisi_pcie_pmu_event_update(struct perf_event *event)
> @@ -552,27 +548,18 @@ static int hisi_pcie_pmu_add(struct perf_event *event, int flags)
>
> hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
>
> - /* Check all working events to find a related event. */
> - idx = hisi_pcie_pmu_find_related_event(pcie_pmu, event);
> - if (idx < 0)
> - return idx;
> -
> - /* Current event shares an enabled counter with the related event */
> - if (idx < HISI_PCIE_MAX_COUNTERS) {
> - hwc->idx = idx;
> - goto start_count;
> - }
> -
> - idx = hisi_pcie_pmu_get_event_idx(pcie_pmu);
> - if (idx < 0)
> - return idx;
> + idx = hisi_pcie_pmu_get_event_idx(pcie_pmu, event);
> + if (idx == HISI_PCIE_MAX_COUNTERS)
> + return -EAGAIN;
Perhaps simpler to handle first_idle == HISI_PCIE_MAX_COUNTERS as
an error return in hisi_pcie_pmu_get_event_idx - see above.
if (idx < 0)
return idx;
>
> hwc->idx = idx;
> - pcie_pmu->hw_events[idx] = event;
> - /* Reset Counter to avoid previous statistic interference. */
> - hisi_pcie_pmu_reset_counter(pcie_pmu, idx);
>
> -start_count:
> + /* No enabled counter found with related event, reset it */
> + if (!pcie_pmu->hw_events[idx]) {
> + hisi_pcie_pmu_reset_counter(pcie_pmu, idx);
> + pcie_pmu->hw_events[idx] = event;
> + }
> +
> if (flags & PERF_EF_START)
> hisi_pcie_pmu_start(event, PERF_EF_RELOAD);
>
On 2024/2/8 20:18, Jonathan Cameron wrote:
> On Thu, 8 Feb 2024 12:06:43 +0000
> Jonathan Cameron <[email protected]> wrote:
>
>> On Sun, 4 Feb 2024 15:45:21 +0800
>> Yicong Yang <[email protected]> wrote:
>>
>>> From: Yicong Yang <[email protected]>
>>>
>>> Factor out retrieving of the register value for the
>>> corresponding event from hisi_pcie_config_filter() into a
>>> new function hisi_pcie_pmu_get_filter() allowing future reuse.
>>>
>>> Signed-off-by: Yicong Yang <[email protected]>
>>
>> Reviewed-by: Jonathan Cameron <[email protected]>
>
> On second thoughts, this might benefit from a clearer name.
> Perhaps just call it exactly what it is
> hisi_pcie_pmu_get_ctrl_reg_val_to_set()
>
> It incorporates the event code as well as the filter.
> Maybe we want to rename pmu_config_filter() as well to
> pmu_config_counter() which I think is the real meaning?
>
make sense to me. we can simply use the reg name as the suffix,
it'll be more clearer:
hisi_pcie_pmu_get_event_ctrl_val()
hisi_pcie_pmu_config_event_ctrl()
>
>>
>>> ---
>>> drivers/perf/hisilicon/hisi_pcie_pmu.c | 13 ++++++++++---
>>> 1 file changed, 10 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>>> index b90ba8aca3fa..11a819cd07f2 100644
>>> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
>>> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>>> @@ -216,10 +216,8 @@ static void hisi_pcie_pmu_writeq(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset,
>>> writeq_relaxed(val, pcie_pmu->base + offset);
>>> }
>>>
>>> -static void hisi_pcie_pmu_config_filter(struct perf_event *event)
>>> +static u64 hisi_pcie_pmu_get_filter(struct perf_event *event)
>>> {
>>> - struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
>>> - struct hw_perf_event *hwc = &event->hw;
>>> u64 port, trig_len, thr_len, len_mode;
>>> u64 reg = HISI_PCIE_INIT_SET;
>>>
>>> @@ -256,6 +254,15 @@ static void hisi_pcie_pmu_config_filter(struct perf_event *event)
>>> else
>>> reg |= FIELD_PREP(HISI_PCIE_LEN_M, HISI_PCIE_LEN_M_DEFAULT);
>>>
>>> + return reg;
>>> +}
>>> +
>>> +static void hisi_pcie_pmu_config_filter(struct perf_event *event)
>>> +{
>>> + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
>>> + struct hw_perf_event *hwc = &event->hw;
>>> + u64 reg = hisi_pcie_pmu_get_filter(event);
>>> +
>>> hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, hwc->idx, reg);
>>> }
>>>
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> [email protected]
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
> .
>
On 2024/2/8 20:20, Jonathan Cameron wrote:
> On Sun, 4 Feb 2024 15:45:23 +0800
> Yicong Yang <[email protected]> wrote:
>
>> From: Yicong Yang <[email protected]>
>>
>> A typical PCIe transaction is consisted of various TLP packets in both
>> direction. For counting bandwidth only memory read events are exported
>> currently. Add memory write and completion counting events of both
>> direction to complementation.
>
> complementation?
>
sorry for the typo. will fix.
>>
>> Signed-off-by: Yicong Yang <[email protected]>
>
> Reviewed-by: Jonathan Cameron <[email protected]>
Thanks.
>
>> ---
>> drivers/perf/hisilicon/hisi_pcie_pmu.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> index 9623bed93876..83be3390686c 100644
>> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> @@ -726,10 +726,18 @@ static struct attribute *hisi_pcie_pmu_events_attr[] = {
>> HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_cnt, 0x10210),
>> HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_latency, 0x0011),
>> HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_cnt, 0x10011),
>> + HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_flux, 0x0104),
>> + HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_time, 0x10104),
>> HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_flux, 0x0804),
>> HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_time, 0x10804),
>> + HISI_PCIE_PMU_EVENT_ATTR(rx_cpl_flux, 0x2004),
>> + HISI_PCIE_PMU_EVENT_ATTR(rx_cpl_time, 0x12004),
>> + HISI_PCIE_PMU_EVENT_ATTR(tx_mwr_flux, 0x0105),
>> + HISI_PCIE_PMU_EVENT_ATTR(tx_mwr_time, 0x10105),
>> HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_flux, 0x0405),
>> HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_time, 0x10405),
>> + HISI_PCIE_PMU_EVENT_ATTR(tx_cpl_flux, 0x1005),
>> + HISI_PCIE_PMU_EVENT_ATTR(tx_cpl_time, 0x11005),
>> NULL
>> };
>>
>
> .
>
On 2024/2/8 20:29, Jonathan Cameron wrote:
> On Sun, 4 Feb 2024 15:45:24 +0800
> Yicong Yang <[email protected]> wrote:
>
>> From: Junhao He <[email protected]>
>>
>> The PMU can monitor traffic of certain target Root Port or downstream
>> target Endpoint. User can specify the target filter by the "port" or
>> "bdf" option respectively. The PMU can only monitor the Root Port or
>> Endpoint on the same PCIe core so the value of "port" or "bdf" should
>> be valid and will be checked by the driver.
>>
>> Currently at least and only one of "port" and "bdf" option must be set.
>> If "port" filter is not set or is set explicitly to zero (default),
>> driver will regard the user specifies a "bdf" option since "port" option
>> is a bitmask of the target Root Ports and zero is not a valid
>> value.
>>
>> If user not explicitly set "port" or "bdf" filter, the driver uses "bdf"
>> default value (zero) to set target filter, but driver will skip the
>> check of bdf=0, although it's a valid value (meaning 0000:000:00.0).
>> Then the user just gets zero.
>>
>> Therefore, we need to check if both "port" and "bdf" are invalid, then
>> return failure and report warning.
>>
>> Testing:
>> before the patch:
>> 0 hisi_pcie0_core1/rx_mrd_flux/
>> 0 hisi_pcie0_core1/rx_mrd_flux,port=0/
>> 24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/
>> 0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/
>> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
>
> Nice to include an example that works for bdf
> hisi_pcie0_core1/rx_mrd_flux,bdf=1,port=0
> or something like that?
>>
>> after the patch:
>> <not supported> hisi_pcie0_core1/rx_mrd_flux/
>> <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/
>> 24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/
>> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/
>> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
>>
>> Signed-off-by: Junhao He <[email protected]>
>> Signed-off-by: Yicong Yang <[email protected]>
>
> Clearly the current situation is wrong, but perhaps we can
> have a more intuitive scheme (could be added as a follow up patch)
> and have the driver figure out which port the bdf lies below?
>
> Maybe that's a job for userspace tooling rather than the driver, but
> the driver already has verification code and it wouldn't be hard
> to not just check the rp is ours, but also set the filter to specify
> that rp, or maybe just set the mask to include them all?
>
To do a check should be simple, we can decode the bdf and find the target
endpoint and related root port for doing the check.
Another example is what we've done in hisi_ptt that we maintian a list of
supported root ports and endpoints, but that will be a bit more complex.
> Jonathan
>
>
>> ---
>> drivers/perf/hisilicon/hisi_pcie_pmu.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> index 83be3390686c..b91f03c02c57 100644
>> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> @@ -306,10 +306,10 @@ static bool hisi_pcie_pmu_valid_filter(struct perf_event *event,
>> if (hisi_pcie_get_trig_len(event) > HISI_PCIE_TRIG_MAX_VAL)
>> return false;
>>
>> - if (requester_id) {
>> - if (!hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
>> - return false;
>> - }
>> + /* Need to explicitly set filter of "port" or "bdf" */
>> + if (!hisi_pcie_get_port(event) &&
>> + !hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
>> + return false;
>>
>> return true;
>> }
>
> .
>
On 2024/2/21 17:46, Yicong Yang wrote:
> On 2024/2/8 20:29, Jonathan Cameron wrote:
>> On Sun, 4 Feb 2024 15:45:24 +0800
>> Yicong Yang <[email protected]> wrote:
>>
>>> From: Junhao He <[email protected]>
>>>
>>> The PMU can monitor traffic of certain target Root Port or downstream
>>> target Endpoint. User can specify the target filter by the "port" or
>>> "bdf" option respectively. The PMU can only monitor the Root Port or
>>> Endpoint on the same PCIe core so the value of "port" or "bdf" should
>>> be valid and will be checked by the driver.
>>>
>>> Currently at least and only one of "port" and "bdf" option must be set.
>>> If "port" filter is not set or is set explicitly to zero (default),
>>> driver will regard the user specifies a "bdf" option since "port" option
>>> is a bitmask of the target Root Ports and zero is not a valid
>>> value.
>>>
>>> If user not explicitly set "port" or "bdf" filter, the driver uses "bdf"
>>> default value (zero) to set target filter, but driver will skip the
>>> check of bdf=0, although it's a valid value (meaning 0000:000:00.0).
>>> Then the user just gets zero.
>>>
>>> Therefore, we need to check if both "port" and "bdf" are invalid, then
>>> return failure and report warning.
>>>
>>> Testing:
>>> before the patch:
>>> 0 hisi_pcie0_core1/rx_mrd_flux/
>>> 0 hisi_pcie0_core1/rx_mrd_flux,port=0/
>>> 24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/
>>> 0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/
>>> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
>> Nice to include an example that works for bdf
>> hisi_pcie0_core1/rx_mrd_flux,bdf=1,port=0
>> or something like that?
>>> after the patch:
>>> <not supported> hisi_pcie0_core1/rx_mrd_flux/
>>> <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/
>>> 24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/
>>> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/
>>> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
>>>
>>> Signed-off-by: Junhao He <[email protected]>
>>> Signed-off-by: Yicong Yang <[email protected]>
>> Clearly the current situation is wrong, but perhaps we can
>> have a more intuitive scheme (could be added as a follow up patch)
>> and have the driver figure out which port the bdf lies below?
>>
>> Maybe that's a job for userspace tooling rather than the driver, but
>> the driver already has verification code and it wouldn't be hard
>> to not just check the rp is ours, but also set the filter to specify
>> that rp, or maybe just set the mask to include them all?
>>
> To do a check should be simple, we can decode the bdf and find the target
> endpoint and related root port for doing the check.
The function hisi_pcie_pmu_valid_requester_id() already does this.
It can get the RP of the bdf, then check whether the RP is within
the RP range of the PCIe PMU.
> Another example is what we've done in hisi_ptt that we maintian a list of
> supported root ports and endpoints, but that will be a bit more complex.
>
>> Jonathan
>>
>>
>>> ---
>>> drivers/perf/hisilicon/hisi_pcie_pmu.c | 8 ++++----
>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>>> index 83be3390686c..b91f03c02c57 100644
>>> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
>>> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>>> @@ -306,10 +306,10 @@ static bool hisi_pcie_pmu_valid_filter(struct perf_event *event,
>>> if (hisi_pcie_get_trig_len(event) > HISI_PCIE_TRIG_MAX_VAL)
>>> return false;
>>>
>>> - if (requester_id) {
>>> - if (!hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
>>> - return false;
>>> - }
>>> + /* Need to explicitly set filter of "port" or "bdf" */
>>> + if (!hisi_pcie_get_port(event) &&
>>> + !hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
>>> + return false;
>>>
>>> return true;
>>> }
>> .
>>
On 2024/2/8 20:29, Jonathan Cameron wrote:
> On Sun, 4 Feb 2024 15:45:24 +0800
> Yicong Yang <[email protected]> wrote:
>
>> From: Junhao He <[email protected]>
>>
>> The PMU can monitor traffic of certain target Root Port or downstream
>> target Endpoint. User can specify the target filter by the "port" or
>> "bdf" option respectively. The PMU can only monitor the Root Port or
>> Endpoint on the same PCIe core so the value of "port" or "bdf" should
>> be valid and will be checked by the driver.
>>
>> Currently at least and only one of "port" and "bdf" option must be set.
>> If "port" filter is not set or is set explicitly to zero (default),
>> driver will regard the user specifies a "bdf" option since "port" option
>> is a bitmask of the target Root Ports and zero is not a valid
>> value.
>>
>> If user not explicitly set "port" or "bdf" filter, the driver uses "bdf"
>> default value (zero) to set target filter, but driver will skip the
>> check of bdf=0, although it's a valid value (meaning 0000:000:00.0).
>> Then the user just gets zero.
>>
>> Therefore, we need to check if both "port" and "bdf" are invalid, then
>> return failure and report warning.
>>
>> Testing:
>> before the patch:
>> 0 hisi_pcie0_core1/rx_mrd_flux/
>> 0 hisi_pcie0_core1/rx_mrd_flux,port=0/
>> 24,124 hisi_pcie0_core1/rx_mrd_flux,port=1/
>> 0 hisi_pcie0_core1/rx_mrd_flux,bdf=0/
>> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
> Nice to include an example that works for bdf
> hisi_pcie0_core1/rx_mrd_flux,bdf=1,port=0
> or something like that?
Yes, I will do that.
These combined parameter test cases have been validated.
>> after the patch:
>> <not supported> hisi_pcie0_core1/rx_mrd_flux/
>> <not supported> hisi_pcie0_core1/rx_mrd_flux,port=0/
>> 24,153 hisi_pcie0_core1/rx_mrd_flux,port=1/
>> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=0/
>> <not supported> hisi_pcie0_core1/rx_mrd_flux,bdf=1/
>>
>> Signed-off-by: Junhao He <[email protected]>
>> Signed-off-by: Yicong Yang <[email protected]>
> Clearly the current situation is wrong, but perhaps we can
> have a more intuitive scheme (could be added as a follow up patch)
> and have the driver figure out which port the bdf lies below?
>
> Maybe that's a job for userspace tooling rather than the driver, but
> the driver already has verification code and it wouldn't be hard
> to not just check the rp is ours, but also set the filter to specify
> that rp, or maybe just set the mask to include them all?
>
> Jonathan
>
>
>> ---
>> drivers/perf/hisilicon/hisi_pcie_pmu.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> index 83be3390686c..b91f03c02c57 100644
>> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> @@ -306,10 +306,10 @@ static bool hisi_pcie_pmu_valid_filter(struct perf_event *event,
>> if (hisi_pcie_get_trig_len(event) > HISI_PCIE_TRIG_MAX_VAL)
>> return false;
>>
>> - if (requester_id) {
>> - if (!hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
>> - return false;
>> - }
>> + /* Need to explicitly set filter of "port" or "bdf" */
>> + if (!hisi_pcie_get_port(event) &&
>> + !hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
>> + return false;
>>
>> return true;
>> }
>
Hi, Jonathan
On 2024/2/8 20:39, Jonathan Cameron wrote:
> On Sun, 4 Feb 2024 15:45:26 +0800
> Yicong Yang <[email protected]> wrote:
>
>> From: Junhao He <[email protected]>
>>
>> The function xxx_find_related_event() scan all working events to find
>> related events. During this process, we also can find the idle counters.
>> If not found related events, return the first idle counter to simplify
>> the code.
>>
>> Signed-off-by: Junhao He <[email protected]>
>> Signed-off-by: Yicong Yang <[email protected]>
> A suggestion inline to avoid the magic HISI_PCIE_MAX_COUNTER value
> being used outside of the function.
Thanks for the comments, will fix in next version.
>
>> ---
>> drivers/perf/hisilicon/hisi_pcie_pmu.c | 55 ++++++++++----------------
>> 1 file changed, 21 insertions(+), 34 deletions(-)
>>
>> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> index 1b45aeb82859..2edde66675e7 100644
>> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> @@ -397,16 +397,24 @@ static u64 hisi_pcie_pmu_read_counter(struct perf_event *event)
>> return hisi_pcie_pmu_readq(pcie_pmu, event->hw.event_base, idx);
>> }
>>
>> -static int hisi_pcie_pmu_find_related_event(struct hisi_pcie_pmu *pcie_pmu,
>> - struct perf_event *event)
>> +/*
>> + * Check all work events, if a relevant event is found then we return it
>> + * first, otherwise return the first idle counter (need to reset).
>> + */
>> +static int hisi_pcie_pmu_get_event_idx(struct hisi_pcie_pmu *pcie_pmu,
>> + struct perf_event *event)
>> {
>> + int first_idle = HISI_PCIE_MAX_COUNTERS;
> int first_idle = -EAGAIN;
Yes, I will do that.
>> struct perf_event *sibling;
>> int idx;
>>
>> for (idx = 0; idx < HISI_PCIE_MAX_COUNTERS; idx++) {
>> sibling = pcie_pmu->hw_events[idx];
>> - if (!sibling)
>> + if (!sibling) {
>> + if (first_idle == HISI_PCIE_MAX_COUNTERS)
> if (first_idle == -EAGAIN)
Ok, will fix it.
>
>> + first_idle = idx;
>> continue;
>> + }
>>
>> /* Related events must be used in group */
>> if (hisi_pcie_pmu_cmp_event(sibling, event) &&
>> @@ -414,19 +422,7 @@ static int hisi_pcie_pmu_find_related_event(struct hisi_pcie_pmu *pcie_pmu,
>> return idx;
>> }
>>
>> - return idx;
>> -}
>> -
>> -static int hisi_pcie_pmu_get_event_idx(struct hisi_pcie_pmu *pcie_pmu)
>> -{
>> - int idx;
>> -
>> - for (idx = 0; idx < HISI_PCIE_MAX_COUNTERS; idx++) {
>> - if (!pcie_pmu->hw_events[idx])
>> - return idx;
>> - }
>> -
>> - return -EINVAL;
>> + return first_idle;
> Then this will return -EAGAIN;
Okay
>> }
>>
>> static void hisi_pcie_pmu_event_update(struct perf_event *event)
>> @@ -552,27 +548,18 @@ static int hisi_pcie_pmu_add(struct perf_event *event, int flags)
>>
>> hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
>>
>> - /* Check all working events to find a related event. */
>> - idx = hisi_pcie_pmu_find_related_event(pcie_pmu, event);
>> - if (idx < 0)
>> - return idx;
>> -
>> - /* Current event shares an enabled counter with the related event */
>> - if (idx < HISI_PCIE_MAX_COUNTERS) {
>> - hwc->idx = idx;
>> - goto start_count;
>> - }
>> -
>> - idx = hisi_pcie_pmu_get_event_idx(pcie_pmu);
>> - if (idx < 0)
>> - return idx;
>> + idx = hisi_pcie_pmu_get_event_idx(pcie_pmu, event);
>> + if (idx == HISI_PCIE_MAX_COUNTERS)
>> + return -EAGAIN;
> Perhaps simpler to handle first_idle == HISI_PCIE_MAX_COUNTERS as
> an error return in hisi_pcie_pmu_get_event_idx - see above.
>
> if (idx < 0)
> return idx;
Sure.
>
>>
>> hwc->idx = idx;
>> - pcie_pmu->hw_events[idx] = event;
>> - /* Reset Counter to avoid previous statistic interference. */
>> - hisi_pcie_pmu_reset_counter(pcie_pmu, idx);
>>
>> -start_count:
>> + /* No enabled counter found with related event, reset it */
>> + if (!pcie_pmu->hw_events[idx]) {
>> + hisi_pcie_pmu_reset_counter(pcie_pmu, idx);
>> + pcie_pmu->hw_events[idx] = event;
>> + }
>> +
>> if (flags & PERF_EF_START)
>> hisi_pcie_pmu_start(event, PERF_EF_RELOAD);
>>
>