2019-04-12 23:55:02

by Dexuan Cui

Subject: [PATCH] smp: Do not warn if smp_call_function_single() is doing a self call.

If smp_call_function_single() is calling the function for itself, it's safe
to run with irqs_disabled() == true.

I hit the warning because I'm in the below path in the .suspend callback of
a "syscore_ops" to support hibernation for a VM running on Hyper-V:

hv_synic_cleanup() ->
clockevents_unbind_device() ->
clockevents_unbind() ->
smp_call_function_single().

When the .suspend callback runs, only CPU0 is online and irqs_disabled() is
true.

Signed-off-by: Dexuan Cui <[email protected]>
---
kernel/smp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index f4cf1b0bb3b8..4fdf6a378def 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -288,7 +288,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 	 * can't happen.
 	 */
 	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
-		     && !oops_in_progress);
+		     && cpu != smp_processor_id() && !oops_in_progress);

 	csd = &csd_stack;
 	if (!wait) {
--
2.19.1
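
As an aside for readers of this thread, the effect of the one-line change can be checked with a small userspace model. This is only a sketch, not kernel code: `struct call_ctx`, `warn_before()` and `warn_after()` are hypothetical names, and the kernel's `cpu_online()`, `irqs_disabled()` and `oops_in_progress` are replaced by plain fields. It models the warning condition alone and says nothing about whether a caller is otherwise safe with IRQs disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model, not kernel code: state that the kernel reads from
 * globals is passed in explicitly. All names here are hypothetical. */
struct call_ctx {
	int  this_cpu;          /* CPU executing the call */
	bool this_cpu_online;   /* stands in for cpu_online(this_cpu) */
	bool irqs_disabled;     /* stands in for irqs_disabled() */
	bool oops_in_progress;  /* stands in for oops_in_progress */
};

/* Condition of the WARN_ON_ONCE() before the patch. */
static bool warn_before(const struct call_ctx *c, int target_cpu)
{
	assert(c);
	(void)target_cpu;       /* the original check ignores the target */
	return c->this_cpu_online && c->irqs_disabled &&
	       !c->oops_in_progress;
}

/* Condition after the patch: a self call no longer warns. */
static bool warn_after(const struct call_ctx *c, int target_cpu)
{
	assert(c);
	return c->this_cpu_online && c->irqs_disabled &&
	       target_cpu != c->this_cpu && !c->oops_in_progress;
}
```

With CPU0 online and IRQs disabled, as in the .suspend path described above, the model warns for a call to CPU0 before the change and stays silent after it; a call to any other CPU warns in both versions.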


2019-04-14 07:01:54

by Vitaly Kuznetsov

Subject: Re: [PATCH] smp: Do not warn if smp_call_function_single() is doing a self call.

Dexuan Cui <[email protected]> writes:

> If smp_call_function_single() is calling the function for itself, it's safe
> to run with irqs_disabled() == true.
>
> I hit the warning because I'm in the below path in the .suspend callback of
> a "syscore_ops" to support hibernation for a VM running on Hyper-V:
>
> hv_synic_cleanup() ->
> clockevents_unbind_device() ->
> clockevents_unbind() ->
> smp_call_function_single().
>

I'd suggest fixing clockevents_unbind() instead, something like
(completely untested):

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 5e77662dd2d9..d14e881a8808 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -418,8 +418,17 @@ static void __clockevents_unbind(void *arg)
 static int clockevents_unbind(struct clock_event_device *ced, int cpu)
 {
 	struct ce_unbind cu = { .ce = ced, .res = -ENODEV };
+	int this_cpu;
+
+	this_cpu = get_cpu();
+
+	if (cpu != this_cpu)
+		smp_call_function_single(cpu, __clockevents_unbind, &cu, 1);
+	else
+		__clockevents_unbind(&cu);
+
+	put_cpu();

-	smp_call_function_single(cpu, __clockevents_unbind, &cu, 1);
 	return cu.res;
 }

> When the .suspend callback runs, only CPU0 is online and irqs_disabled() is
> true.
>
> Signed-off-by: Dexuan Cui <[email protected]>
> ---
> kernel/smp.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index f4cf1b0bb3b8..4fdf6a378def 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -288,7 +288,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
> * can't happen.
> */
> WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
> - && !oops_in_progress);
> + && cpu != smp_processor_id() && !oops_in_progress);

You already have 'this_cpu', no need to call smp_processor_id().

>
> csd = &csd_stack;
> if (!wait) {

--
Vitaly
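
For readers following along, the dispatch pattern in the untested diff above (call the function directly when the target is the current CPU, otherwise go through a cross-CPU call) can be sketched in userspace. Everything here is a hypothetical model: `run_on_cpu()`, `model_cross_call()` and `set_result()` are invented for illustration, and `model_cross_call()` merely counts invocations instead of sending an IPI.

```c
#include <assert.h>

typedef void (*call_func_t)(void *info);

/* Number of times the modelled cross-CPU path was taken. */
static int cross_calls;

/* Hypothetical stand-in for a cross-CPU call: it only records that
 * the remote path was chosen and then runs the function inline. */
static void model_cross_call(int cpu, call_func_t func, void *info)
{
	(void)cpu;
	cross_calls++;
	func(info);
}

/* Run func for 'cpu': directly if it is the calling CPU, through the
 * modelled cross-CPU path otherwise. */
static void run_on_cpu(int cpu, int this_cpu, call_func_t func, void *info)
{
	assert(func);
	if (cpu != this_cpu)
		model_cross_call(cpu, func, info);
	else
		func(info);
}

/* Example callback: stores a success code through the info pointer. */
static void set_result(void *info)
{
	*(int *)info = 0;
}
```

In the model, a call that targets the current CPU leaves `cross_calls` untouched, which is the property the diff relies on to avoid the warning in smp_call_function_single().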

2019-04-15 12:24:14

by Peter Zijlstra

Subject: Re: [PATCH] smp: Do not warn if smp_call_function_single() is doing a self call.

On Fri, Apr 12, 2019 at 11:53:57PM +0000, Dexuan Cui wrote:
> If smp_call_function_single() is calling the function for itself, it's safe
> to run with irqs_disabled() == true.
>
> I hit the warning because I'm in the below path in the .suspend callback of
> a "syscore_ops" to support hibernation for a VM running on Hyper-V:
>
> hv_synic_cleanup() ->
> clockevents_unbind_device() ->
> clockevents_unbind() ->
> smp_call_function_single().
>
> When the .suspend callback runs, only CPU0 is online and irqs_disabled() is
> true.

Pray tell, how well do you think mutex_lock() works with interrupts
disabled?

2019-04-15 23:42:27

by Dexuan Cui

Subject: RE: [PATCH] smp: Do not warn if smp_call_function_single() is doing a self call.

> From: Peter Zijlstra <[email protected]>
> Sent: Monday, April 15, 2019 5:21 AM
> To: Dexuan Cui <[email protected]>
>
> On Fri, Apr 12, 2019 at 11:53:57PM +0000, Dexuan Cui wrote:
> > If smp_call_function_single() is calling the function for itself, it's safe
> > to run with irqs_disabled() == true.
> >
> > I hit the warning because I'm in the below path in the .suspend callback of
> > a "syscore_ops" to support hibernation for a VM running on Hyper-V:
> >
> > hv_synic_cleanup() ->
> > clockevents_unbind_device() ->
> > clockevents_unbind() ->
> > smp_call_function_single().
> >
> > When the .suspend callback runs, only CPU0 is online and irqs_disabled() is
> > true.
>
> Pray tell, how well do you think mutex_lock() works with interrupts
> disabled?

Good point. I realized generally speaking this patch makes no sense, so let me
try the solution proposed by Vitaly, i.e. fix clockevents_unbind() instead.

Thanks,
-- Dexuan

2019-04-16 09:33:10

by Peter Zijlstra

Subject: Re: [PATCH] smp: Do not warn if smp_call_function_single() is doing a self call.

On Mon, Apr 15, 2019 at 11:39:57PM +0000, Dexuan Cui wrote:
> > From: Peter Zijlstra <[email protected]>
> > Sent: Monday, April 15, 2019 5:21 AM
> > To: Dexuan Cui <[email protected]>
> >
> > On Fri, Apr 12, 2019 at 11:53:57PM +0000, Dexuan Cui wrote:
> > > If smp_call_function_single() is calling the function for itself, it's safe
> > > to run with irqs_disabled() == true.
> > >
> > > I hit the warning because I'm in the below path in the .suspend callback of
> > > a "syscore_ops" to support hibernation for a VM running on Hyper-V:
> > >
> > > hv_synic_cleanup() ->
> > > clockevents_unbind_device() ->
> > > clockevents_unbind() ->
> > > smp_call_function_single().
> > >
> > > When the .suspend callback runs, only CPU0 is online and irqs_disabled() is
> > > true.
> >
> > Pray tell, how well do you think mutex_lock() works with interrupts
> > disabled?
>
> Good point. I realized generally speaking this patch makes no sense, so let me
> try the solution proposed by Vitaly, i.e. fix clockevents_unbind() instead.

That's still not the problem. You're calling clockevents_unbind_device()
with IRQs disabled, that's not correct. It doesn't matter what
clockevents_unbind() does thereafter.

You simply cannot do any of this with IRQs disabled, end of story.

2019-04-16 11:21:49

by Vitaly Kuznetsov

Subject: Re: [PATCH] smp: Do not warn if smp_call_function_single() is doing a self call.

Peter Zijlstra <[email protected]> writes:

> On Mon, Apr 15, 2019 at 11:39:57PM +0000, Dexuan Cui wrote:
>> > From: Peter Zijlstra <[email protected]>
>> > Sent: Monday, April 15, 2019 5:21 AM
>> > To: Dexuan Cui <[email protected]>
>> >
>> > On Fri, Apr 12, 2019 at 11:53:57PM +0000, Dexuan Cui wrote:
>> > > If smp_call_function_single() is calling the function for itself, it's safe
>> > > to run with irqs_disabled() == true.
>> > >
>> > > I hit the warning because I'm in the below path in the .suspend callback of
>> > > a "syscore_ops" to support hibernation for a VM running on Hyper-V:
>> > >
>> > > hv_synic_cleanup() ->
>> > > clockevents_unbind_device() ->
>> > > clockevents_unbind() ->
>> > > smp_call_function_single().
>> > >
>> > > When the .suspend callback runs, only CPU0 is online and irqs_disabled() is
>> > > true.
>> >
>> > Pray tell, how well do you think mutex_lock() works with interrupts
>> > disabled?
>>
>> Good point. I realized generally speaking this patch makes no sense, so let me
>> try the solution proposed by Vitaly, i.e. fix clockevents_unbind() instead.
>
> That's still not the problem. You're calling clockevents_unbind_device()
> with IRQs disabled, that's not correct. It doesn't matter what
> clockevents_unbind() does thereafter.
>

True. And before we start digging deeper into this, let's step back: why
do we need to do clockevents_unbind_device() on hibernation? Can we just
disable the device and re-enable it on resume?

Actually, all usages of clockevents_unbind_device() in kernel are
limited to Hyper-V and with Michael's patches moving this out of VMBus
driver I think it can go away completely.

--
Vitaly

2019-04-16 20:14:01

by Thomas Gleixner

Subject: Re: [PATCH] smp: Do not warn if smp_call_function_single() is doing a self call.

On Tue, 16 Apr 2019, Vitaly Kuznetsov wrote:

> Peter Zijlstra <[email protected]> writes:
>
> > On Mon, Apr 15, 2019 at 11:39:57PM +0000, Dexuan Cui wrote:
> >> > From: Peter Zijlstra <[email protected]>
> >> > Sent: Monday, April 15, 2019 5:21 AM
> >> > To: Dexuan Cui <[email protected]>
> >> >
> >> > On Fri, Apr 12, 2019 at 11:53:57PM +0000, Dexuan Cui wrote:
> >> > > If smp_call_function_single() is calling the function for itself, it's safe
> >> > > to run with irqs_disabled() == true.
> >> > >
> >> > > I hit the warning because I'm in the below path in the .suspend callback of
> >> > > a "syscore_ops" to support hibernation for a VM running on Hyper-V:
> >> > >
> >> > > hv_synic_cleanup() ->
> >> > > clockevents_unbind_device() ->
> >> > > clockevents_unbind() ->
> >> > > smp_call_function_single().
> >> > >
> >> > > When the .suspend callback runs, only CPU0 is online and irqs_disabled() is
> >> > > true.
> >> >
> >> > Pray tell, how well do you think mutex_lock() works with interrupts
> >> > disabled?
> >>
> >> Good point. I realized generally speaking this patch makes no sense, so let me
> >> try the solution proposed by Vitaly, i.e. fix clockevents_unbind() instead.
> >
> > That's still not the problem. You're calling clockevents_unbind_device()
> > with IRQs disabled, that's not correct. It doesn't matter what
> > clockevents_unbind() does thereafter.
> >
>
> True. And before we start digging deeper into this, let's step back: why
> do we need to do clockevents_unbind_device() on hibernation? Can we just
> disable the device and re-enable it on resume?

Yes. That's the right thing to do. Simple solution is to implement the
suspend/resume callbacks on the clock events device and be done with it.

> Actually, all usages of clockevents_unbind_device() in kernel are
> limited to Hyper-V and with Michael's patches moving this out of VMBus
> driver I think it can go away completely.

Correct. There was a driver which required that, but that's gone by now and
of course nobody noticed that it was the last user. The reason why this
exists was to allow switching out an active clocksource similar to the
sysfs unbind file but without user space interaction.

Thanks,

tglx
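
For readers, the shape of that suggestion can be sketched in userspace. The real `struct clock_event_device` does carry `suspend` and `resume` callbacks; everything else below is a hypothetical model: the mock struct keeps only the fields the sketch needs, `timer_enabled` is an invented piece of device state, and `model_suspend()` / `model_resume()` stand in for whatever core code invokes the callbacks.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock with only the fields this sketch needs; the real struct
 * clock_event_device in <linux/clockchips.h> has many more. */
struct mock_clock_event_device {
	const char *name;
	bool timer_enabled;     /* hypothetical device state */
	void (*suspend)(struct mock_clock_event_device *dev);
	void (*resume)(struct mock_clock_event_device *dev);
};

/* Quiesce the device instead of unbinding it. */
static void mock_timer_suspend(struct mock_clock_event_device *dev)
{
	dev->timer_enabled = false;
}

/* Bring the device back to its pre-suspend state. */
static void mock_timer_resume(struct mock_clock_event_device *dev)
{
	dev->timer_enabled = true;
}

static struct mock_clock_event_device mock_timer = {
	.name          = "mock-timer",
	.timer_enabled = true,
	.suspend       = mock_timer_suspend,
	.resume        = mock_timer_resume,
};

/* Stand-ins for the caller: the callbacks are optional, so each one
 * is invoked only if the driver set it. */
static void model_suspend(struct mock_clock_event_device *dev)
{
	assert(dev);
	if (dev->suspend)
		dev->suspend(dev);
}

static void model_resume(struct mock_clock_event_device *dev)
{
	assert(dev);
	if (dev->resume)
		dev->resume(dev);
}
```

The point of the design is that the device stays registered across the suspend/resume cycle, so nothing on the .suspend path needs clockevents_unbind_device() or a cross-CPU call.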

2019-04-17 23:51:58

by Dexuan Cui

Subject: RE: [PATCH] smp: Do not warn if smp_call_function_single() is doing a self call.

> From: Thomas Gleixner <[email protected]>
> Sent: Tuesday, April 16, 2019 1:13 PM
> > ...
> > True. And before we start digging deeper into this, let's step back: why
> > do we need to do clockevents_unbind_device() on hibernation? Can we just
> > disable the device and re-enable it on resume?

We do clockevents_unbind_device as part of hv_synic_cleanup(), which is
called as a CPU hotplug callback: see vmbus_bus_init():
ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "hyperv/vmbus:online",
hv_synic_init, hv_synic_cleanup);

Yes, it looks like the right thing is to implement the suspend/resume callbacks
of the clock_event_device. Thank you for the suggestion! I'll look into this.

> Yes. That's the right thing to do. Simple solution is to implement the
> suspend/resume callbacks on the clock events device and be done with it.

Agreed.

> > Actually, all usages of clockevents_unbind_device() in kernel are
> > limited to Hyper-V and with Michael's patches moving this out of VMBus
> > driver I think it can go away completely.

Thanks for the heads-up! I'll rebase to Michael's patches.

> Correct. There was a driver which required that, but that's gone by now and
> of course nobody noticed that it was the last user. The reason why this
> exists was to allow switching out an active clocksource similar to the
> sysfs unbind file but without user space interaction.
>
> tglx

Thanks for sharing the background!

- Dexuan