2019-12-08 04:20:23

by Qian Cai

Subject: [PATCH] x86/resctrl: fix an imbalance in domain_remove_cpu

domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
is true where it will initialize d->mbm_over. However,
domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
checking r->mon_capable. Hence, it triggers a debugobjects warning when
offlining CPUs because those timer debugobjects are never initialized.

ODEBUG: assert_init not available (active state 0) object type:
timer_list hint: 0x0
WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
debug_print_object+0xfe/0x140
Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS
I40 05/23/2018
RIP: 0010:debug_print_object+0xfe/0x140
Call Trace:
debug_object_assert_init+0x1f5/0x240
del_timer+0x6f/0xf0
try_to_grab_pending+0x42/0x3c0
cancel_delayed_work+0x7d/0x150
resctrl_offline_cpu+0x3c0/0x520
cpuhp_invoke_callback+0x197/0x1120
cpuhp_thread_fun+0x252/0x2f0
smpboot_thread_fn+0x255/0x440
kthread+0x1e6/0x210
ret_from_fork+0x3a/0x50

Fixes: e33026831bdb ("x86/intel_rdt/mbm: Handle counter overflow")
Signed-off-by: Qian Cai <[email protected]>
---
arch/x86/kernel/cpu/resctrl/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 03eb90d00af0..89049b343c7a 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -618,7 +618,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		if (static_branch_unlikely(&rdt_mon_enable_key))
 			rmdir_mondata_subdir_allrdtgrp(r, d->id);
 		list_del(&d->list);
-		if (is_mbm_enabled())
+		if (r->mon_capable && is_mbm_enabled())
 			cancel_delayed_work(&d->mbm_over);
 		if (is_llc_occupancy_enabled() && has_busy_rmid(r, d)) {
 			/*
--
2.21.0 (Apple Git-122.2)


2019-12-10 07:46:58

by Yu Chen

Subject: Re: [PATCH] x86/resctrl: fix an imbalance in domain_remove_cpu

Hi Qian,

On Sun, Dec 8, 2019 at 12:14 PM Qian Cai <[email protected]> wrote:
>
> domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
> is true where it will initialize d->mbm_over. However,
> domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
> checking r->mon_capable. Hence, it triggers a debugobjects warning when
> offlining CPUs because those timer debugobjects are never initialized.
>
Could you elaborate a little more on the failure symptom?
If I understand correctly, the error you described occurs when
r->mon_capable is false while is_mbm_enabled() returns true?
Which means rdt_mon_features is non-zero on this platform?
get_rdt_mon_resources() invokes rdt_get_mon_l3_config(), and the only
way it can leave r->mon_capable unset is a failure in dom_data_init()
due to a kcalloc() failure? And the logic in get_rdt_resources() is to
ignore the returned error if the RDT allocation feature is supported on
this platform? If this is the case, r->mon_capable is not an indicator
of whether the overflow thread has been created, right?
Can we simply remove the r->mon_capable check in domain_add_cpu() and
invoke domain_setup_mon_state() directly?
> ODEBUG: assert_init not available (active state 0) object type:
> timer_list hint: 0x0
> WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
> debug_print_object+0xfe/0x140
> Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS
> I40 05/23/2018
> RIP: 0010:debug_print_object+0xfe/0x140
> Call Trace:
> debug_object_assert_init+0x1f5/0x240
> del_timer+0x6f/0xf0
> try_to_grab_pending+0x42/0x3c0
> cancel_delayed_work+0x7d/0x150
> resctrl_offline_cpu+0x3c0/0x520
> cpuhp_invoke_callback+0x197/0x1120
> cpuhp_thread_fun+0x252/0x2f0
> smpboot_thread_fn+0x255/0x440
> kthread+0x1e6/0x210
> ret_from_fork+0x3a/0x50
>
> Fixes: e33026831bdb ("x86/intel_rdt/mbm: Handle counter overflow")
> Signed-off-by: Qian Cai <[email protected]>
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 03eb90d00af0..89049b343c7a 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -618,7 +618,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
> if (static_branch_unlikely(&rdt_mon_enable_key))
> rmdir_mondata_subdir_allrdtgrp(r, d->id);
> list_del(&d->list);
> - if (is_mbm_enabled())
> + if (r->mon_capable && is_mbm_enabled())
> cancel_delayed_work(&d->mbm_over);
Hmm, it looks like there are two places within this function
that invoke cancel_delayed_work(&d->mbm_over);
why not add the check to both of them?

thanks,
Y

2019-12-10 12:12:51

by Qian Cai

Subject: Re: [PATCH] x86/resctrl: fix an imbalance in domain_remove_cpu



> On Dec 10, 2019, at 2:44 AM, Ryan Chen <[email protected]> wrote:
>
> Could you elaborate a little more on the failure symptom?
> If I understand correctly, the error you described was due to
> r->mon_capable set to false while is_mbm_enabled() returns true?

Yes.

> Which means on this platform rdt_mon_features is non zero?

No idea. I did add some debug code, which indicated that resctrl_online_cpu() found 3 resources in for_each_capable_rdt_resource(r). Only the first one had r->mon_capable set.

> And in get_rdt_mon_resources() it will invoke rdt_get_mon_l3_config(),
> however the only possible failure to do not set r->mon_capable is that it
> failed in dom_data_init() due to kcalloc() failure? Then the logic in

Very likely. Should be easy to confirm.

> get_rdt_resources() is that it will ignore the return error if rdt allocate
> feature is supported on this platform? If

Yes.

> this is the case, the r->mon_capable
> is not an indicator for whether the overflow thread has been created, right?

I am not sure about that.

> Can we simply remove the check of r->mon_capable in domain_add_cpu() and
> invoke domain_setup_mon_state() directly?

That should work too, but it aligns so neatly with the r->alloc_capable check above that I am not sure it is a good idea to break it.

> Humm, it looks like there are two places within this function
> invoked cancel_delayed_work(&d->mbm_over),
> why not adding the check for both of them?

Because I am not sure about the second one. It was never executed because of "cpu != d->mbm_work_cpu", even after offlining all CPUs except CPU 0, and it has not caused any problem so far, so I could not test it. But I can see why it might need a similar check too, if d->mbm_work_cpu is non-zero and could trigger the same imbalance.

2019-12-10 18:08:34

by Qian Cai

Subject: Re: [PATCH] x86/resctrl: fix an imbalance in domain_remove_cpu



> On Dec 10, 2019, at 2:55 AM, Ryan Chen <[email protected]> wrote:
>
> Hi Qian,
>
> On Sun, Dec 8, 2019 at 12:14 PM Qian Cai <[email protected]> wrote:
>>
>> domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
>> is true where it will initialize d->mbm_over. However,
>> domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
>> checking r->mon_capable. Hence, it triggers a debugobjects warning when
>> offlining CPUs because those timer debugobjects are never initialized.
>>
> Could you elaborate a little more on the failure symptom?
> If I understand correctly, the error you described was due to
> r->mon_capable set to false while is_mbm_enabled() returns true?
> Which means on this platform rdt_mon_features is non zero?
> And in get_rdt_mon_resources() it will invoke rdt_get_mon_l3_config(),
> however the only possible failure to do not set r->mon_capable is that it
> failed in dom_data_init() due to kcalloc() failure? Then the logic in
> get_rdt_resources() is that it will ignore the return error if rdt allocate
> feature is supported on this platform? If this is the case, the r->mon_capable
> is not an indicator for whether the overflow thread has been created, right?
> Can we simply remove the check of r->mon_capable in domain_add_cpu() and
> invoke domain_setup_mon_state() directly?

Actually,

domain_add_cpu r->name = L3, r->alloc_capable = 1, r->mon_capable = 1
domain_add_cpu r->name = L3DATA, r->alloc_capable = 1, r->mon_capable = 0
domain_add_cpu r->name = L3CODE, r->alloc_capable = 1, r->mon_capable = 0

rdt_get_mon_l3_config() will only set r->mon_capable = 1 for L3.

>> ODEBUG: assert_init not available (active state 0) object type:
>> timer_list hint: 0x0
>> WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
>> debug_print_object+0xfe/0x140
>> Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS
>> I40 05/23/2018
>> RIP: 0010:debug_print_object+0xfe/0x140
>> Call Trace:
>> debug_object_assert_init+0x1f5/0x240
>> del_timer+0x6f/0xf0
>> try_to_grab_pending+0x42/0x3c0
>> cancel_delayed_work+0x7d/0x150
>> resctrl_offline_cpu+0x3c0/0x520
>> cpuhp_invoke_callback+0x197/0x1120
>> cpuhp_thread_fun+0x252/0x2f0
>> smpboot_thread_fn+0x255/0x440
>> kthread+0x1e6/0x210
>> ret_from_fork+0x3a/0x50
>>
>> Fixes: e33026831bdb ("x86/intel_rdt/mbm: Handle counter overflow")
>> Signed-off-by: Qian Cai <[email protected]>
>> ---
>> arch/x86/kernel/cpu/resctrl/core.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 03eb90d00af0..89049b343c7a 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -618,7 +618,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
>> if (static_branch_unlikely(&rdt_mon_enable_key))
>> rmdir_mondata_subdir_allrdtgrp(r, d->id);
>> list_del(&d->list);
>> - if (is_mbm_enabled())
>> + if (r->mon_capable && is_mbm_enabled())
>> cancel_delayed_work(&d->mbm_over);
> Humm, it looks like there are two places within this function
> invoked cancel_delayed_work(&d->mbm_over),
> why not adding the check for both of them?

Here it only checks L3, so it correctly skips cancel_delayed_work()
for L3DATA and L3CODE. Recall from above that only L3 has
r->mon_capable set.

	if (r == &rdt_resources_all[RDT_RESOURCE_L3]) {
		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
			cancel_delayed_work(&d->mbm_over);

Hence, an r->mon_capable check seems redundant here.

2019-12-10 18:46:18

by Reinette Chatre

Subject: Re: [PATCH] x86/resctrl: fix an imbalance in domain_remove_cpu

Hi Qian,

On 12/10/2019 10:06 AM, Qian Cai wrote:
>> On Dec 10, 2019, at 2:55 AM, Ryan Chen <[email protected]> wrote:
>>
>> Hi Qian,
>>
>> On Sun, Dec 8, 2019 at 12:14 PM Qian Cai <[email protected]> wrote:
>>>
>>> domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
>>> is true where it will initialize d->mbm_over. However,
>>> domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
>>> checking r->mon_capable. Hence, it triggers a debugobjects warning when
>>> offlining CPUs because those timer debugobjects are never initialized.
>>>
>> Could you elaborate a little more on the failure symptom?
>> If I understand correctly, the error you described was due to
>> r->mon_capable set to false while is_mbm_enabled() returns true?
>> Which means on this platform rdt_mon_features is non zero?
>> And in get_rdt_mon_resources() it will invoke rdt_get_mon_l3_config(),
>> however the only possible failure to do not set r->mon_capable is that it
>> failed in dom_data_init() due to kcalloc() failure? Then the logic in
>> get_rdt_resources() is that it will ignore the return error if rdt allocate
>> feature is supported on this platform? If this is the case, the r->mon_capable
>> is not an indicator for whether the overflow thread has been created, right?
>> Can we simply remove the check of r->mon_capable in domain_add_cpu() and
>> invoke domain_setup_mon_state() directly?
>
> Actually,
>
> domain_add_cpu r->name = L3, r->alloc_capable = 1, r->mon_capable = 1
> domain_add_cpu r->name = L3DATA, r->alloc_capable = 1, r->mon_capable = 0
> domain_add_cpu r->name = L3CODE, r->alloc_capable = 1, r->mon_capable = 0
>
> rdt_get_mon_l3_config() will only set r->mon_capable = 1 for L3.
>
>>> ODEBUG: assert_init not available (active state 0) object type:
>>> timer_list hint: 0x0
>>> WARNING: CPU: 143 PID: 789 at lib/debugobjects.c:484
>>> debug_print_object+0xfe/0x140
>>> Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 Compute Module, BIOS
>>> I40 05/23/2018
>>> RIP: 0010:debug_print_object+0xfe/0x140
>>> Call Trace:
>>> debug_object_assert_init+0x1f5/0x240
>>> del_timer+0x6f/0xf0
>>> try_to_grab_pending+0x42/0x3c0
>>> cancel_delayed_work+0x7d/0x150
>>> resctrl_offline_cpu+0x3c0/0x520
>>> cpuhp_invoke_callback+0x197/0x1120
>>> cpuhp_thread_fun+0x252/0x2f0
>>> smpboot_thread_fn+0x255/0x440
>>> kthread+0x1e6/0x210
>>> ret_from_fork+0x3a/0x50
>>>
>>> Fixes: e33026831bdb ("x86/intel_rdt/mbm: Handle counter overflow")
>>> Signed-off-by: Qian Cai <[email protected]>
>>> ---
>>> arch/x86/kernel/cpu/resctrl/core.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>>> index 03eb90d00af0..89049b343c7a 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>>> @@ -618,7 +618,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
>>> if (static_branch_unlikely(&rdt_mon_enable_key))
>>> rmdir_mondata_subdir_allrdtgrp(r, d->id);
>>> list_del(&d->list);
>>> - if (is_mbm_enabled())
>>> + if (r->mon_capable && is_mbm_enabled())
>>> cancel_delayed_work(&d->mbm_over);
>> Humm, it looks like there are two places within this function
>> invoked cancel_delayed_work(&d->mbm_over),
>> why not adding the check for both of them?
>
> Here it only check L3, so it will skip correctly for L3DATA and L3CODE
> to not call cancel_delayed_work(). Recalled the above that only L3 will
> have r->capable set.
>
> if (r == &rdt_resources_all[RDT_RESOURCE_L3]) {
> if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
> cancel_delayed_work(&d->mbm_over);
>
> Hence, r->mon_capable check seems redundant here.
>

Thank you very much for catching this. Your change looks like the right
thing to do. As Ryan pointed out, this is not obvious at first, and
anyone looking back at this commit later may benefit from some more
details. For example, how about:


"A system that supports resource monitoring may have multiple resources
while not all of these resources are capable of monitoring. Monitoring
related state is initialized only for resources that are capable of
monitoring and correspondingly this state should subsequently only be
removed from these resources that are capable of monitoring.

domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
is true where it will initialize d->mbm_over. However,
domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
checking r->mon_capable resulting in an attempt to cancel d->mbm_over on
all resources, even those that never initialized d->mbm_over because
they are not capable of monitoring. Hence, it triggers a debugobjects
warning when offlining CPUs because those timer debugobjects are never
initialized.

ODEBUG:..."


Reinette

2019-12-10 19:10:18

by Qian Cai

Subject: Re: [PATCH] x86/resctrl: fix an imbalance in domain_remove_cpu



> On Dec 10, 2019, at 1:44 PM, Reinette Chatre <[email protected]> wrote:
>
>
> "A system that supports resource monitoring may have multiple resources
> while not all of these resources are capable of monitoring. Monitoring
> related state is initialized only for resources that are capable of
> monitoring and correspondingly this state should subsequently only be
> removed from these resources that are capable of monitoring.
>
> domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
> is true where it will initialize d->mbm_over. However,
> domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
> checking r->mon_capable resulting in an attempt to cancel d->mbm_over on
> all resources, even those that never initialized d->mbm_over because
> they are not capable of monitoring. Hence, it triggers a debugobjects
> warning when offlining CPUs because those timer debugobjects are never
> initialized.
>
> ODEBUG:..."

Looks better to me. Do you want me to send a v2, or could you update it when merging?

2019-12-10 20:16:09

by Reinette Chatre

Subject: Re: [PATCH] x86/resctrl: fix an imbalance in domain_remove_cpu

Hi Qian,

On 12/10/2019 11:08 AM, Qian Cai wrote:
>
>
>> On Dec 10, 2019, at 1:44 PM, Reinette Chatre <[email protected]> wrote:
>>
>>
>> "A system that supports resource monitoring may have multiple resources
>> while not all of these resources are capable of monitoring. Monitoring
>> related state is initialized only for resources that are capable of
>> monitoring and correspondingly this state should subsequently only be
>> removed from these resources that are capable of monitoring.
>>
>> domain_add_cpu() calls domain_setup_mon_state() only when r->mon_capable
>> is true where it will initialize d->mbm_over. However,
>> domain_remove_cpu() calls cancel_delayed_work(&d->mbm_over) without
>> checking r->mon_capable resulting in an attempt to cancel d->mbm_over on
>> all resources, even those that never initialized d->mbm_over because
>> they are not capable of monitoring. Hence, it triggers a debugobjects
>> warning when offlining CPUs because those timer debugobjects are never
>> initialized.
>>
>> ODEBUG:..."
>
> Looks better to me. Do you want me to send a v2 for it or you could update it for merging?
>

Could you please send v2? I am not the one that provides final approval
for inclusion nor the one that will take care of merging afterwards.

Thank you very much

Reinette

2019-12-12 12:00:20

by Yu Chen

Subject: Re: [PATCH] x86/resctrl: fix an imbalance in domain_remove_cpu

On Wed, Dec 11, 2019 at 2:06 AM Qian Cai <[email protected]> wrote:
>
> Here it only check L3, so it will skip correctly for L3DATA and L3CODE
> to not call cancel_delayed_work(). Recalled the above that only L3 will
> have r->capable set.
>
> if (r == &rdt_resources_all[RDT_RESOURCE_L3]) {
> if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
> cancel_delayed_work(&d->mbm_over);
>
> Hence, r->mon_capable check seems redundant here.
>
I see. Thanks for explaining.

--
thanks,
Ryan