2021-11-15 09:49:55

by Ravi Bangoria

[permalink] [raw]
Subject: [RFC] perf/amd/ibs: Move ibs pmus under perf_sw_context

Ideally, a pmu which is present in each hw thread belongs to
perf_hw_context, but perf_hw_context has limitation of allowing only
one pmu (a core pmu) and thus other hw pmus need to use either sw or
invalid context which limits pmu functionalities.

This is not a new problem. It has been raised in the past, for example,
Arm big.LITTLE (same for Intel ADL) and s390 had this issue:

Arm: https://lore.kernel.org/lkml/20160425175837.GB3141@leverpostej
s390: https://lore.kernel.org/lkml/[email protected]

Arm big.LITTLE (followed by Intel ADL) solved this by allowing multiple
(heterogeneous) pmus inside perf_hw_context. It makes sense as they are
still registering single pmu for each hw thread.

s390 solved it by moving 2nd hw pmu to perf_sw_context, though that 2nd
hw pmu is count mode only, i.e. no sampling.

AMD IBS also has similar problem. IBS pmu is present in each hw thread.
But because of perf_hw_context restriction, currently it belongs to
perf_invalid_context and thus important functionalities like per-task
profiling is not possible with IBS pmu. Moving it to perf_sw_context
will:
- allow per-task monitoring
- allow cgroup wise profiling
- allow grouping of IBS with other pmu events
- disallow multiplexing

Please let me know if I missed any major benefit or drawback of
perf_sw_context. I'm also not sure how easy it would be to lift
perf_hw_context restriction and start allowing more pmus in it.

Suggestions?

Signed-off-by: Ravi Bangoria <[email protected]>
---
arch/x86/events/amd/ibs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 9739019d4b67..3cedd3f0c364 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -533,7 +533,7 @@ static struct attribute *ibs_op_format_attrs[] = {

static struct perf_ibs perf_ibs_fetch = {
.pmu = {
- .task_ctx_nr = perf_invalid_context,
+ .task_ctx_nr = perf_sw_context,

.event_init = perf_ibs_init,
.add = perf_ibs_add,
@@ -558,7 +558,7 @@ static struct perf_ibs perf_ibs_fetch = {

static struct perf_ibs perf_ibs_op = {
.pmu = {
- .task_ctx_nr = perf_invalid_context,
+ .task_ctx_nr = perf_sw_context,

.event_init = perf_ibs_init,
.add = perf_ibs_add,
--
2.27.0



2021-11-15 11:19:37

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC] perf/amd/ibs: Move ibs pmus under perf_sw_context

On Mon, Nov 15, 2021 at 03:18:38PM +0530, Ravi Bangoria wrote:
> Ideally, a pmu which is present in each hw thread belongs to
> perf_hw_context, but perf_hw_context has limitation of allowing only
> one pmu (a core pmu) and thus other hw pmus need to use either sw or
> invalid context which limits pmu functionalities.
>
> This is not a new problem. It has been raised in the past, for example,
> Arm big.LITTLE (same for Intel ADL) and s390 had this issue:
>
> Arm: https://lore.kernel.org/lkml/20160425175837.GB3141@leverpostej
> s390: https://lore.kernel.org/lkml/[email protected]
>
> Arm big.LITTLE (followed by Intel ADL) solved this by allowing multiple
> (heterogeneous) pmus inside perf_hw_context. It makes sense as they are
> still registering single pmu for each hw thread.
>
> s390 solved it by moving 2nd hw pmu to perf_sw_context, though that 2nd
> hw pmu is count mode only, i.e. no sampling.
>
> AMD IBS also has similar problem. IBS pmu is present in each hw thread.
> But because of perf_hw_context restriction, currently it belongs to
> perf_invalid_context and thus important functionalities like per-task
> profiling is not possible with IBS pmu. Moving it to perf_sw_context
> will:
> - allow per-task monitoring
> - allow cgroup wise profiling
> - allow grouping of IBS with other pmu events
> - disallow multiplexing
>
> Please let me know if I missed any major benefit or drawback of
> perf_sw_context. I'm also not sure how easy it would be to lift
> perf_hw_context restriction and start allowing more pmus in it.
>
> Suggestions?

Same as I do every time this comes up... this patch is still lingering
and wanting TLC:

https://lore.kernel.org/lkml/[email protected]/

2021-11-15 12:02:32

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [RFC] perf/amd/ibs: Move ibs pmus under perf_sw_context



On 15-Nov-21 4:47 PM, Peter Zijlstra wrote:
> On Mon, Nov 15, 2021 at 03:18:38PM +0530, Ravi Bangoria wrote:
>> Ideally, a pmu which is present in each hw thread belongs to
>> perf_hw_context, but perf_hw_context has limitation of allowing only
>> one pmu (a core pmu) and thus other hw pmus need to use either sw or
>> invalid context which limits pmu functionalities.
>>
>> This is not a new problem. It has been raised in the past, for example,
>> Arm big.LITTLE (same for Intel ADL) and s390 had this issue:
>>
>> Arm: https://lore.kernel.org/lkml/20160425175837.GB3141@leverpostej
>> s390: https://lore.kernel.org/lkml/[email protected]
>>
>> Arm big.LITTLE (followed by Intel ADL) solved this by allowing multiple
>> (heterogeneous) pmus inside perf_hw_context. It makes sense as they are
>> still registering single pmu for each hw thread.
>>
>> s390 solved it by moving 2nd hw pmu to perf_sw_context, though that 2nd
>> hw pmu is count mode only, i.e. no sampling.
>>
>> AMD IBS also has similar problem. IBS pmu is present in each hw thread.
>> But because of perf_hw_context restriction, currently it belongs to
>> perf_invalid_context and thus important functionalities like per-task
>> profiling is not possible with IBS pmu. Moving it to perf_sw_context
>> will:
>> - allow per-task monitoring
>> - allow cgroup wise profiling
>> - allow grouping of IBS with other pmu events
>> - disallow multiplexing
>>
>> Please let me know if I missed any major benefit or drawback of
>> perf_sw_context. I'm also not sure how easy it would be to lift
>> perf_hw_context restriction and start allowing more pmus in it.
>>
>> Suggestions?
>
> Same as I do every time this comes up... this patch is still lingering
> and wanting TLC:
>
> https://lore.kernel.org/lkml/[email protected]/

Thanks for the pointer Peter. I have looked at the patch and it is quite complex,
altering the very way perf event scheduling works.

I don't dispute that is the right 'fix' for the issue, but do you think adding a
new perf context can help alleviate some of the issues in the interim?

Ravi

2021-11-15 12:07:46

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC] perf/amd/ibs: Move ibs pmus under perf_sw_context

On Mon, Nov 15, 2021 at 05:31:21PM +0530, Ravi Bangoria wrote:
>
>
> On 15-Nov-21 4:47 PM, Peter Zijlstra wrote:
> > On Mon, Nov 15, 2021 at 03:18:38PM +0530, Ravi Bangoria wrote:
> >> Ideally, a pmu which is present in each hw thread belongs to
> >> perf_hw_context, but perf_hw_context has limitation of allowing only
> >> one pmu (a core pmu) and thus other hw pmus need to use either sw or
> >> invalid context which limits pmu functionalities.
> >>
> >> This is not a new problem. It has been raised in the past, for example,
> >> Arm big.LITTLE (same for Intel ADL) and s390 had this issue:
> >>
> >> Arm: https://lore.kernel.org/lkml/20160425175837.GB3141@leverpostej
> >> s390: https://lore.kernel.org/lkml/[email protected]
> >>
> >> Arm big.LITTLE (followed by Intel ADL) solved this by allowing multiple
> >> (heterogeneous) pmus inside perf_hw_context. It makes sense as they are
> >> still registering single pmu for each hw thread.
> >>
> >> s390 solved it by moving 2nd hw pmu to perf_sw_context, though that 2nd
> >> hw pmu is count mode only, i.e. no sampling.
> >>
> >> AMD IBS also has similar problem. IBS pmu is present in each hw thread.
> >> But because of perf_hw_context restriction, currently it belongs to
> >> perf_invalid_context and thus important functionalities like per-task
> >> profiling is not possible with IBS pmu. Moving it to perf_sw_context
> >> will:
> >> - allow per-task monitoring
> >> - allow cgroup wise profiling
> >> - allow grouping of IBS with other pmu events
> >> - disallow multiplexing
> >>
> >> Please let me know if I missed any major benefit or drawback of
> >> perf_sw_context. I'm also not sure how easy it would be to lift
> >> perf_hw_context restriction and start allowing more pmus in it.
> >>
> >> Suggestions?
> >
> > Same as I do every time this comes up... this patch is still lingering
> > and wanting TLC:
> >
> > https://lore.kernel.org/lkml/[email protected]/
>
> Thanks for the pointer Peter. I have looked at the patch and it is quite complex,
> altering the very way perf event scheduling works.
>
> I don't dispute that is the right 'fix' for the issue, but do you think adding a
> new perf context can help alleviate some of the issues in the interim?

And take away the motivation for people to do the right thing? How does
that work out in my favour?

2021-11-17 04:31:18

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [RFC] perf/amd/ibs: Move ibs pmus under perf_sw_context



On 15-Nov-21 5:37 PM, Peter Zijlstra wrote:
> On Mon, Nov 15, 2021 at 05:31:21PM +0530, Ravi Bangoria wrote:
>>
>>
>> On 15-Nov-21 4:47 PM, Peter Zijlstra wrote:
>>> On Mon, Nov 15, 2021 at 03:18:38PM +0530, Ravi Bangoria wrote:
>>>> Ideally, a pmu which is present in each hw thread belongs to
>>>> perf_hw_context, but perf_hw_context has limitation of allowing only
>>>> one pmu (a core pmu) and thus other hw pmus need to use either sw or
>>>> invalid context which limits pmu functionalities.
>>>>
>>>> This is not a new problem. It has been raised in the past, for example,
>>>> Arm big.LITTLE (same for Intel ADL) and s390 had this issue:
>>>>
>>>> Arm: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20160425175837.GB3141%40leverpostej&amp;data=04%7C01%7Cravi.bangoria%40amd.com%7C8b30ce7e52c0414dc65e08d9a8307149%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637725748302455206%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=59fepiNH9du3ZR0owjPcquQ4YLc%2B8hE74H%2BpdnQggQY%3D&amp;reserved=0
>>>> s390: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20160606082124.GA30154%40twins.programming.kicks-ass.net&amp;data=04%7C01%7Cravi.bangoria%40amd.com%7C8b30ce7e52c0414dc65e08d9a8307149%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637725748302455206%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=cSEGVgRbClDXeGJmEuGr1P8YXSIvhx%2FxbQjmdydQiuU%3D&amp;reserved=0
>>>>
>>>> Arm big.LITTLE (followed by Intel ADL) solved this by allowing multiple
>>>> (heterogeneous) pmus inside perf_hw_context. It makes sense as they are
>>>> still registering single pmu for each hw thread.
>>>>
>>>> s390 solved it by moving 2nd hw pmu to perf_sw_context, though that 2nd
>>>> hw pmu is count mode only, i.e. no sampling.
>>>>
>>>> AMD IBS also has similar problem. IBS pmu is present in each hw thread.
>>>> But because of perf_hw_context restriction, currently it belongs to
>>>> perf_invalid_context and thus important functionalities like per-task
>>>> profiling is not possible with IBS pmu. Moving it to perf_sw_context
>>>> will:
>>>> - allow per-task monitoring
>>>> - allow cgroup wise profiling
>>>> - allow grouping of IBS with other pmu events
>>>> - disallow multiplexing
>>>>
>>>> Please let me know if I missed any major benefit or drawback of
>>>> perf_sw_context. I'm also not sure how easy it would be to lift
>>>> perf_hw_context restriction and start allowing more pmus in it.
>>>>
>>>> Suggestions?
>>>
>>> Same as I do every time this comes up... this patch is still lingering
>>> and wanting TLC:
>>>
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20181010104559.GO5728%40hirez.programming.kicks-ass.net%2F&amp;data=04%7C01%7Cravi.bangoria%40amd.com%7C8b30ce7e52c0414dc65e08d9a8307149%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637725748302455206%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=sn7QOMMjrzsbBsvdEAPttouujDel0JuDptIFMHkm3Q0%3D&amp;reserved=0
>>
>> Thanks for the pointer Peter. I have looked at the patch and it is quite complex,
>> altering the very way perf event scheduling works.
>>
>> I don't dispute that is the right 'fix' for the issue, but do you think adding a
>> new perf context can help alleviate some of the issues in the interim?
>
> And take away the motivation for people to do the right thing? How does
> that work out in my favour?
>

:-) Understood Peter. I will take a stab at making your patch upstream ready.

Ravi