LinuxLists.cc - Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug

2009-09-11 07:45:49

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

Peter Zijlstra wrote:
> On Thu, 2009-07-30 at 17:30 -0700, [email protected] wrote:
>
>> ------------------------------------------------------
>> Subject: generic-ipi: fix the race between generic_smp_call_function_*() and hotplug_cfd()
>> From: Xiao Guangrong <[email protected]>
>>
>> There is a race between generic_smp_call_function_*() and hotplug_cfd() in
>> many cases, see below examples:
>>
>> 1: hotplug_cfd() can free cfd->cpumask, the system will crash if the
>> cpu's cfd still in the call_function list:
>>
>>
>> CPU A: CPU B
>>
>> smp_call_function_many() ......
>> cpu_down() ......
>> hotplug_cfd() -> ......
>> free_cpumask_var(cfd->cpumask) (receive function IPI interrupte)
>> /* read cfd->cpumask */
>> generic_smp_call_function_interrupt() ->
>> cpumask_test_and_clear_cpu(cpu, data->cpumask)
>>
>> CRASH!!!
>>
>> 2: It's not handle call_function list when cpu down, It's will lead to
>> dead-wait if other path is waiting this cpu to execute function
>>
>> CPU A: CPU B
>>
>> smp_call_function_many(wait=0)
>> ...... CPU B down
>> smp_call_function_many() --> (cpu down before recevie function
>> csd_lock(&data->csd); IPI interrupte)
>>
>> DEAD-WAIT!!!!
>>
>> So, CPU A will dead-wait in csd_lock(), the same as
>> smp_call_function_single()
>
> On re-reading this, I'm wondering if 2 is a real case.
>
> I'm thinking it should never happen since you're supposed to do things
> like get_online_cpus() around stuff like this, but then again, I doubt
> we actually do.
>

I think that get_online_cpus() and stop_machine() can't avoid this
race,

1: For the first example in my patch's changlog:

CPU A: CPU B

smp_call_function_many(wait=0) ......
cpu_down() ......
hotplug_cfd() -> ......
free_cpumask_var(cfd->cpumask) (receive function IPI interrupte)
/* read cfd->cpumask */
generic_smp_call_function_interrupt() ->
cpumask_test_and_clear_cpu(cpu, data->cpumask)

CRASH!!!

CPU A call smp_call_function_many(wait=0) that want CPU B to call
a specific function, after smp_call_function_many() return, we let
CPU A offline immediately. Unfortunately, if CPU B receives this
IPI interrupt after CPU A down, it will crash like above description.

2: For the second example in my patch's changlog:

If CPU B is dying, like below:

_cpu_down()
{
......

/* We suppose that have below sequences:
* before call __stop_machine(), CPU B is online (in cpu_online_mask),
* in this time, CPU A call smp_call_function_many(wait=0) and want
* CPU B to call a specific function, after CPU A finish it, CPU B
* go to __stop_machine() and disable it's interrupt
* (suppose CPU B not receive IPI interrupt in this time now)
*/
err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
......
}

Now, CPU B is down, but it's not handle CPU A's request, it cause that
can't clean the CSD_FLAG_LOCK flag of CPU A's cfd_data, if CPU A
call smp_call_function_many() next time. it will block in
csd_lock() -> csd_lock_wait(data) forever.

>> Signed-off-by: Xiao Guangrong <[email protected]>
>> Cc: Ingo Molnar <[email protected]>
>> Cc: Jens Axboe <[email protected]>
>> Cc: Nick Piggin <[email protected]>
>> Cc: Peter Zijlstra <[email protected]>
>> Cc: Rusty Russell <[email protected]>
>> Signed-off-by: Andrew Morton <[email protected]>
>> ---
>>
>> kernel/smp.c | 38 ++++++++++++++++++++++++++++----------
>> 1 file changed, 28 insertions(+), 10 deletions(-)
>>
>> diff -puN kernel/smp.c~generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd kernel/smp.c
>> --- a/kernel/smp.c~generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd
>> +++ a/kernel/smp.c
>> @@ -113,14 +113,10 @@ void generic_exec_single(int cpu, struct
>> csd_lock_wait(data);
>> }
>>
>> -/*
>> - * Invoked by arch to handle an IPI for call function. Must be called with
>> - * interrupts disabled.
>> - */
>> -void generic_smp_call_function_interrupt(void)
>> +static void
>> +__generic_smp_call_function_interrupt(int cpu, int run_callbacks)
>> {
>> struct call_function_data *data;
>> - int cpu = smp_processor_id();
>>
>> /*
>> * Ensure entry is visible on call_function_queue after we have
>
> Also, if this is the last version, we're still not using run_callbacks
> for anything..
>

It's not the last version and fixed in another patch, see below URL please:
http://marc.info/?l=linux-mm-commits&m=124900028228350&w=2

Thanks,
Xiao

2009-09-11 07:51:06

by Peter Zijlstra

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

On Fri, 2009-09-11 at 15:45 +0800, Xiao Guangrong wrote:
> CPU A call smp_call_function_many(wait=0) that want CPU B to call

Ah, right, the .wait=0 is the clue, that makes all this
get/put_onlines_cpus() useless..

You're right, carry on ;-)

2009-09-11 19:09:36

by Suresh Siddha

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

On Fri, 2009-09-11 at 00:45 -0700, Xiao Guangrong wrote:
> I think that get_online_cpus() and stop_machine() can't avoid this
> race,
>
> 1: For the first example in my patch's changlog:
>
> CPU A: CPU B
>
> smp_call_function_many(wait=0) ......
> cpu_down() ......
> hotplug_cfd() -> ......
> free_cpumask_var(cfd->cpumask) (receive function IPI interrupte)
> /* read cfd->cpumask */
> generic_smp_call_function_interrupt() ->
> cpumask_test_and_clear_cpu(cpu, data->cpumask)
>
> CRASH!!!
>
> CPU A call smp_call_function_many(wait=0) that want CPU B to call
> a specific function, after smp_call_function_many() return, we let
> CPU A offline immediately. Unfortunately, if CPU B receives this
> IPI interrupt after CPU A down, it will crash like above description.

How can cpu B receive the IPI interrupt after cpu A is down?

As part of the cpu A going down, we first do the stop machine. i.e.,
schedule the stop machine worker threads on each cpu. So, by the time
all the worker threads on all the cpu's get scheduled and synchronized,
ipi on B should get delivered.

>
> 2: For the second example in my patch's changlog:
>
> If CPU B is dying, like below:
>
> _cpu_down()
> {
> ......
>
> /* We suppose that have below sequences:
> * before call __stop_machine(), CPU B is online (in cpu_online_mask),
> * in this time, CPU A call smp_call_function_many(wait=0) and want
> * CPU B to call a specific function, after CPU A finish it, CPU B
> * go to __stop_machine() and disable it's interrupt
> * (suppose CPU B not receive IPI interrupt in this time now)
> */
> err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
> ......
> }
>
> Now, CPU B is down, but it's not handle CPU A's request, it cause that
> can't clean the CSD_FLAG_LOCK flag of CPU A's cfd_data, if CPU A
> call smp_call_function_many() next time. it will block in
> csd_lock() -> csd_lock_wait(data) forever.

Here also, by the time the stop machine threads are scheduled on cpu B,
cpu B should service that IPI.

smp_call_function with wait=0 will still ensure that IPI's are
registered at the destination cpu. But smp_call_function call will
return before the interrupt handler is run and completed.

So, by the time we schedule stop machine threads and when they are all
online and get synchronized (after this point only we disable interrupts
by moving to STOPMACHINE_DISABLE_IRQ state), we should have serviced the
pending smp call function IPI's.

I am still not convinced about the need for this patch. Am I missing
something? Please elaborate the need with more sequence of steps.

thanks,
suresh

2009-09-14 07:23:11

by Xiao Guangrong

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

Suresh Siddha wrote:

>> CPU A call smp_call_function_many(wait=0) that want CPU B to call
>> a specific function, after smp_call_function_many() return, we let
>> CPU A offline immediately. Unfortunately, if CPU B receives this
>> IPI interrupt after CPU A down, it will crash like above description.
>
> How can cpu B receive the IPI interrupt after cpu A is down?
>
> As part of the cpu A going down, we first do the stop machine. i.e.,
> schedule the stop machine worker threads on each cpu. So, by the time
> all the worker threads on all the cpu's get scheduled and synchronized,
> ipi on B should get delivered.
>

Actually, those two examples have the same reason, that is how long
the destination CPU will receive the IPI interruption?

If the stop machine threads can schedule in CPU B during the IPI
interruption delivering, It will occur those issue.

I understand what you say but let me confuse is how we ensure it? The IPI
interruption is delivered over the APIC bus, It need several CPU instruction
cycle I guess, I also have read the spec of Intel 64 and IA-32, but not find
the answer, could you point out for me?

Thanks,
Xiao

2009-09-15 00:17:26

by Suresh Siddha

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

On Mon, 2009-09-14 at 00:22 -0700, Xiao Guangrong wrote:
>
> Suresh Siddha wrote:
>
> >> CPU A call smp_call_function_many(wait=0) that want CPU B to call
> >> a specific function, after smp_call_function_many() return, we let
> >> CPU A offline immediately. Unfortunately, if CPU B receives this
> >> IPI interrupt after CPU A down, it will crash like above description.
> >
> > How can cpu B receive the IPI interrupt after cpu A is down?
> >
> > As part of the cpu A going down, we first do the stop machine. i.e.,
> > schedule the stop machine worker threads on each cpu. So, by the time
> > all the worker threads on all the cpu's get scheduled and synchronized,
> > ipi on B should get delivered.
> >
>
> Actually, those two examples have the same reason, that is how long
> the destination CPU will receive the IPI interruption?
>
> If the stop machine threads can schedule in CPU B during the IPI
> interruption delivering, It will occur those issue.
>
> I understand what you say but let me confuse is how we ensure it? The IPI
> interruption is delivered over the APIC bus, It need several CPU instruction
> cycle I guess, I also have read the spec of Intel 64 and IA-32, but not find
> the answer, could you point out for me?

Xiao, There is quite a bit of time between the time a particular cpu
sends a smp call function IPI (with wait == 0) and the time that cpu
starts running the stop machine thread and the whole system proceeds
with stop machine. With in this time, typically the smp call function
destination will receive the ipi interrupt. But theoretically the
problem you explain might happen.

>From P4 onwards, interrupts are delivered over system bus and with NHM
it is QPI. Also, the mechanism of scheduling the stop machine thread on
a particular cpu involves resched IPI etc.

Nevertheless, Have you seen a real hang or system crash due to this? If
so, on what platform?

Ideally, for xapic based platform, clear status of sender APIC ICR's
delivery status indicates that the interrupt is registered at the
receiver. for x2apic based platform, sending another interrupt will
ensure that the previous interrupt was delivered.

If you have indeed seen a crash related to this, can you review and give
the appended patch a try and see if it fixes the issue? If you agree
with the fix, then I will send the patch with a detailed change log etc.

Your current fix is not clean and not complete in my opinion (as calling
interrupt handlers manually and not doing the callbacks etc might cause
other side affects). Thanks.
---

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 9e3d8af..69ec2a9 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -93,8 +93,9 @@ void generic_smp_call_function_single_interrupt(void);
void generic_smp_call_function_interrupt(void);
void ipi_call_lock(void);
void ipi_call_unlock(void);
void ipi_call_lock_irq(void);
void ipi_call_unlock_irq(void);
+void quiesce_smp_call_functions(void);
#endif

/*
diff --git a/kernel/smp.c b/kernel/smp.c
index 8e21850..d13a888 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -479,6 +479,27 @@ int smp_call_function(void (*func)(void *), void *info, int wait)
}
EXPORT_SYMBOL(smp_call_function);

+void quiesce_smp_call_functions(void)
+{
+ struct call_single_queue *q = &__get_cpu_var(call_single_queue);
+ bool empty;
+ unsigned long flags;
+
+ do {
+ cpu_relax();
+ spin_lock_irqsave(&q->lock, flags);
+ empty = list_empty(&q->list);
+ spin_unlock_irqrestore(&q->lock, flags);
+ } while (!empty);
+
+ do {
+ cpu_relax();
+ spin_lock_irqsave(&call_function.lock, flags);
+ empty = list_empty(&call_function.queue);
+ spin_unlock_irqrestore(&call_function.lock, flags);
+ } while (!empty);
+}
+
void ipi_call_lock(void)
{
spin_lock(&call_function.lock);
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 912823e..dd2d90f 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -86,6 +86,9 @@ static void stop_cpu(struct work_struct *unused)
curstate = state;
switch (curstate) {
case STOPMACHINE_DISABLE_IRQ:
+#ifdef CONFIG_USE_GENERIC_SMP_HELPERS
+ quiesce_smp_call_functions();
+#endif
local_irq_disable();
hard_irq_disable();
break;

2009-09-15 02:04:05

by Xiao Guangrong

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

Suresh Siddha wrote:
> On Mon, 2009-09-14 at 00:22 -0700, Xiao Guangrong wrote:
>> Suresh Siddha wrote:
>>
>>>> CPU A call smp_call_function_many(wait=0) that want CPU B to call
>>>> a specific function, after smp_call_function_many() return, we let
>>>> CPU A offline immediately. Unfortunately, if CPU B receives this
>>>> IPI interrupt after CPU A down, it will crash like above description.
>>> How can cpu B receive the IPI interrupt after cpu A is down?
>>>
>>> As part of the cpu A going down, we first do the stop machine. i.e.,
>>> schedule the stop machine worker threads on each cpu. So, by the time
>>> all the worker threads on all the cpu's get scheduled and synchronized,
>>> ipi on B should get delivered.
>>>
>> Actually, those two examples have the same reason, that is how long
>> the destination CPU will receive the IPI interruption?
>>
>> If the stop machine threads can schedule in CPU B during the IPI
>> interruption delivering, It will occur those issue.
>>
>> I understand what you say but let me confuse is how we ensure it? The IPI
>> interruption is delivered over the APIC bus, It need several CPU instruction
>> cycle I guess, I also have read the spec of Intel 64 and IA-32, but not find
>> the answer, could you point out for me?
>
> Xiao, There is quite a bit of time between the time a particular cpu
> sends a smp call function IPI (with wait == 0) and the time that cpu
> starts running the stop machine thread and the whole system proceeds
> with stop machine. With in this time, typically the smp call function
> destination will receive the ipi interrupt. But theoretically the
> problem you explain might happen.
>

Yeah, though this case is very infrequent, but we can't avoid it.

>>From P4 onwards, interrupts are delivered over system bus and with NHM
> it is QPI. Also, the mechanism of scheduling the stop machine thread on
> a particular cpu involves resched IPI etc.
>
> Nevertheless, Have you seen a real hang or system crash due to this? If
> so, on what platform?
>
> Ideally, for xapic based platform, clear status of sender APIC ICR's
> delivery status indicates that the interrupt is registered at the
> receiver. for x2apic based platform, sending another interrupt will
> ensure that the previous interrupt was delivered.
>
> If you have indeed seen a crash related to this, can you review and give
> the appended patch a try and see if it fixes the issue? If you agree
> with the fix, then I will send the patch with a detailed change log etc.
>

I not seen this crash, just afraid it's unsafe while I review the code,
I try to generate this crash, but as you know, the race point is hard
to control.

> Your current fix is not clean and not complete in my opinion (as calling
> interrupt handlers manually and not doing the callbacks etc might cause
> other side affects). Thanks.

It is not the last version and doing the callbacks in another patch,
see below URL please:
http://marc.info/?l=linux-mm-commits&m=124900028228350&w=2

I think we do better handle this in CPU down path in kernel/smp.c, It's very
safe and let people easy to understand.

> ---
>
> diff --git a/include/linux/smp.h b/include/linux/smp.h
> index 9e3d8af..69ec2a9 100644
> --- a/include/linux/smp.h
> +++ b/include/linux/smp.h
> @@ -93,8 +93,9 @@ void generic_smp_call_function_single_interrupt(void);
> void generic_smp_call_function_interrupt(void);
> void ipi_call_lock(void);
> void ipi_call_unlock(void);
> void ipi_call_lock_irq(void);
> void ipi_call_unlock_irq(void);
> +void quiesce_smp_call_functions(void);
> #endif
>
> /*
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 8e21850..d13a888 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -479,6 +479,27 @@ int smp_call_function(void (*func)(void *), void *info, int wait)
> }
> EXPORT_SYMBOL(smp_call_function);
>
> +void quiesce_smp_call_functions(void)
> +{
> + struct call_single_queue *q = &__get_cpu_var(call_single_queue);
> + bool empty;
> + unsigned long flags;
> +
> + do {
> + cpu_relax();
> + spin_lock_irqsave(&q->lock, flags);
> + empty = list_empty(&q->list);
> + spin_unlock_irqrestore(&q->lock, flags);
> + } while (!empty);
> +
> + do {
> + cpu_relax();
> + spin_lock_irqsave(&call_function.lock, flags);
> + empty = list_empty(&call_function.queue);
> + spin_unlock_irqrestore(&call_function.lock, flags);
> + } while (!empty);
> +}
> +

Why we need waiting CPU to handle this? It make no sense because the CPU is dying,
we can simple ignore the IPI request.

> void ipi_call_lock(void)
> {
> spin_lock(&call_function.lock);
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 912823e..dd2d90f 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -86,6 +86,9 @@ static void stop_cpu(struct work_struct *unused)
> curstate = state;
> switch (curstate) {
> case STOPMACHINE_DISABLE_IRQ:
> +#ifdef CONFIG_USE_GENERIC_SMP_HELPERS
> + quiesce_smp_call_functions();
> +#endif

It seems ugly, we can define a noop function if CONFIG_USE_GENERIC_SMP_HELPERS is not
defined.

Another problem is that all CPU must call quiesce_smp_call_functions() here, but only
dying CPU need do it.

> local_irq_disable();
> hard_irq_disable();

It will cause another race, if CPU A send a IPI interruption after CPU B call
quiesce_smp_call_functions() and disable IRQ, it will case the same problem.
(in this time, CPU B is enter stop machine, but CPU A is not)

Thanks,
Xiao

2009-09-16 02:20:46

by Suresh Siddha

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

On Mon, 2009-09-14 at 19:03 -0700, Xiao Guangrong wrote:
> > Your current fix is not clean and not complete in my opinion (as calling
> > interrupt handlers manually and not doing the callbacks etc might cause
> > other side affects). Thanks.
>
> It is not the last version and doing the callbacks in another patch,
> see below URL please:
> http://marc.info/?l=linux-mm-commits&m=124900028228350&w=2

I am referring to this latest patch only. We are calling the interrupt
handler manually and not doing the callbacks in that context. In future,
we might see other side affects if we miss some of these smp ipi's.

Clean solution is to ensure that there are no unhandled smp call
function handlers and then continue with the cpu offline.

> Another problem is that all CPU must call quiesce_smp_call_functions() here, but only
> dying CPU need do it.

In stop_machine() all cpu's will wait for each other to come to the
rendezvous point. so this is completely ok (infact this is what is
happening if some cpu is already handling some ipi's etc. I am just
making it more explicit).

>
> > local_irq_disable();
> > hard_irq_disable();
>
> It will cause another race, if CPU A send a IPI interruption after CPU B call
> quiesce_smp_call_functions() and disable IRQ, it will case the same problem.
> (in this time, CPU B is enter stop machine, but CPU A is not)

No. By the time we call quiesce_ipis(), all the cpu's are already in
stop machine FIFO threads and no one else can send IPI (i.e, all the
cpus have moved past the STOPMACHINE_PREPARE state). This is when we are
calling the quiesce_smp_call_functions().

thanks,
suresh

2009-09-17 03:01:23

by Xiao Guangrong

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

Suresh Siddha wrote:
> On Mon, 2009-09-14 at 19:03 -0700, Xiao Guangrong wrote:
>>> Your current fix is not clean and not complete in my opinion (as calling
>>> interrupt handlers manually and not doing the callbacks etc might cause
>>> other side affects). Thanks.
>> It is not the last version and doing the callbacks in another patch,
>> see below URL please:
>> http://marc.info/?l=linux-mm-commits&m=124900028228350&w=2
>
> I am referring to this latest patch only. We are calling the interrupt
> handler manually and not doing the callbacks in that context. In future,
> we might see other side affects if we miss some of these smp ipi's.
>

um, maybe you are right.

> Clean solution is to ensure that there are no unhandled smp call
> function handlers and then continue with the cpu offline.
>

Yeah, I agree it.

>> Another problem is that all CPU must call quiesce_smp_call_functions() here, but only
>> dying CPU need do it.
>
> In stop_machine() all cpu's will wait for each other to come to the
> rendezvous point. so this is completely ok (infact this is what is
> happening if some cpu is already handling some ipi's etc. I am just
> making it more explicit).

Waiting all cpu handle its IPI interruption in stop_machine() path will make
it slower, I'm not sure it's OK.

>
>>> local_irq_disable();
>>> hard_irq_disable();
>> It will cause another race, if CPU A send a IPI interruption after CPU B call
>> quiesce_smp_call_functions() and disable IRQ, it will case the same problem.
>> (in this time, CPU B is enter stop machine, but CPU A is not)
>
> No. By the time we call quiesce_ipis(), all the cpu's are already in
> stop machine FIFO threads and no one else can send IPI (i.e, all the
> cpus have moved past the STOPMACHINE_PREPARE state). This is when we are
> calling the quiesce_smp_call_functions().
>

Sorry for let you misunderstand, It's not clear explanation here.

The preempt is enabled when CPU enter STOPMACHINE_DISABLE_IRQ state, so
other task will preempt it and send IPI interruption.
But, Andrew point out my mistake that the stop machine workqueue thread
is the highest priority and with "SCHED_FIFO" scheduler, so It not happen,
please ignore this comment.

How about manual check/handle pending IPI interruption in the CPU context?
like this:

---
include/linux/smp.h | 3 +++
kernel/cpu.c | 3 +++
2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 9e3d8af..a9ea518 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -95,6 +95,9 @@ void ipi_call_lock(void);
void ipi_call_unlock(void);
void ipi_call_lock_irq(void);
void ipi_call_unlock_irq(void);
+#else
+static inline void generic_smp_call_function_single_interrupt(void) { }
+static inline void generic_smp_call_function_interrupt(void) { }
#endif

/*
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6ba0f1e..4ba7f92 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -173,6 +173,9 @@ static int __ref take_cpu_down(void *_param)
struct take_cpu_down_param *param = _param;
int err;

+ generic_smp_call_function_interrupt();
+ generic_smp_call_function_single_interrupt();
+
/* Ensure this CPU doesn't handle any more interrupts. */
err = __cpu_disable();
if (err < 0)
--
1.6.1.2

Thanks,
Xiao

2009-09-17 07:45:35

by Peter Zijlstra

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

On Thu, 2009-09-17 at 11:00 +0800, Xiao Guangrong wrote:
> Waiting all cpu handle its IPI interruption in stop_machine() path will make
> it slower, I'm not sure it's OK.

It is, its an absolute slow path.

2009-09-19 02:17:21

by Suresh Siddha

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

On Wed, 2009-09-16 at 20:00 -0700, Xiao Guangrong wrote:
> How about manual check/handle pending IPI interruption in the CPU context?
> like this:
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -173,6 +173,9 @@ static int __ref take_cpu_down(void *_param)
> struct take_cpu_down_param *param = _param;
> int err;
>
> + generic_smp_call_function_interrupt();
> + generic_smp_call_function_single_interrupt();
> +

At this place, how will you ensure that the smp_call_function initiated
by this dying cpu has reached and got serviced at its destination?

All the other cpu's have disabled interrupts in the stop machine state
by the time we come here and we can't wait.

2009-09-21 02:56:16

by Xiao Guangrong

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

Suresh Siddha wrote:
> On Wed, 2009-09-16 at 20:00 -0700, Xiao Guangrong wrote:
>> How about manual check/handle pending IPI interruption in the CPU context?
>> like this:
>> --- a/kernel/cpu.c
>> +++ b/kernel/cpu.c
>> @@ -173,6 +173,9 @@ static int __ref take_cpu_down(void *_param)
>> struct take_cpu_down_param *param = _param;
>> int err;
>>
>> + generic_smp_call_function_interrupt();
>> + generic_smp_call_function_single_interrupt();
>> +
>
> At this place, how will you ensure that the smp_call_function initiated
> by this dying cpu has reached and got serviced at its destination?
>

Suresh, sorry for my poor English, Do you mean that how we ensure it has
pending IPI request in the dying cpu?

generic_smp_call_function_*() will check it, if the cpu has pending request,
then handle it, else directly return.

> All the other cpu's have disabled interrupts in the stop machine state
> by the time we come here and we can't wait.
>

Why we can't wait? It manual check/handle the pending IPI request not wait
interruption happen.

It not has race here because all cpu's interruption is disabled, and it not
make stop machine slow because only the dying cpu can enter take_cpu_down(),
we just wait the dying cpu handle it's pending request.

Am I misunderstand something?

Thanks,
Xiao

2009-09-21 03:12:06

by Suresh Siddha

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

On Sun, 2009-09-20 at 19:55 -0700, Xiao Guangrong wrote:
>
> Suresh Siddha wrote:
> > On Wed, 2009-09-16 at 20:00 -0700, Xiao Guangrong wrote:
> >> How about manual check/handle pending IPI interruption in the CPU context?
> >> like this:
> >> --- a/kernel/cpu.c
> >> +++ b/kernel/cpu.c
> >> @@ -173,6 +173,9 @@ static int __ref take_cpu_down(void *_param)
> >> struct take_cpu_down_param *param = _param;
> >> int err;
> >>
> >> + generic_smp_call_function_interrupt();
> >> + generic_smp_call_function_single_interrupt();
> >> +
> >
> > At this place, how will you ensure that the smp_call_function initiated
> > by this dying cpu has reached and got serviced at its destination?
> >
>
> Suresh, sorry for my poor English, Do you mean that how we ensure it has
> pending IPI request in the dying cpu?
>
> generic_smp_call_function_*() will check it, if the cpu has pending request,
> then handle it, else directly return.
>

I am referring to the missing csd_lock_wait() here that you had in the
first version of your patch. Let's say, if cpu X is going offline, we
need to ensure that the smp_call_function() initiated by cpu X (i.e.,
smp_call_function IPI sent to some other cpu's from cpu X) got serviced
before cpu X goes offline. We can't do csd_lock_wait() here, as that
might deadlock (as all the other cpu's are already in stop machine with
interrupts disabled).

> > All the other cpu's have disabled interrupts in the stop machine state
> > by the time we come here and we can't wait.
> >
>
> Why we can't wait? It manual check/handle the pending IPI request not wait
> interruption happen.
>
> It not has race here because all cpu's interruption is disabled, and it not
> make stop machine slow because only the dying cpu can enter take_cpu_down(),
> we just wait the dying cpu handle it's pending request.
>
> Am I misunderstand something?
>
> Thanks,
> Xiao
>

2009-09-21 04:05:53

by Xiao Guangrong

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

Suresh Siddha wrote:
> On Sun, 2009-09-20 at 19:55 -0700, Xiao Guangrong wrote:
>> Suresh Siddha wrote:
>>> On Wed, 2009-09-16 at 20:00 -0700, Xiao Guangrong wrote:
>>>> How about manual check/handle pending IPI interruption in the CPU context?
>>>> like this:
>>>> --- a/kernel/cpu.c
>>>> +++ b/kernel/cpu.c
>>>> @@ -173,6 +173,9 @@ static int __ref take_cpu_down(void *_param)
>>>> struct take_cpu_down_param *param = _param;
>>>> int err;
>>>>
>>>> + generic_smp_call_function_interrupt();
>>>> + generic_smp_call_function_single_interrupt();
>>>> +
>>> At this place, how will you ensure that the smp_call_function initiated
>>> by this dying cpu has reached and got serviced at its destination?
>>>
>> Suresh, sorry for my poor English, Do you mean that how we ensure it has
>> pending IPI request in the dying cpu?
>>
>> generic_smp_call_function_*() will check it, if the cpu has pending request,
>> then handle it, else directly return.
>>
>
> I am referring to the missing csd_lock_wait() here that you had in the
> first version of your patch. Let's say, if cpu X is going offline, we
> need to ensure that the smp_call_function() initiated by cpu X (i.e.,
> smp_call_function IPI sent to some other cpu's from cpu X) got serviced
> before cpu X goes offline. We can't do csd_lock_wait() here, as that
> might deadlock (as all the other cpu's are already in stop machine with
> interrupts disabled).
>

It not happen because the preemption is disabled while send IPI request and
can't schedule to stop machine path, it also stop cpu down.

Thanks,
Xiao

2009-09-22 05:33:13

by Suresh Siddha

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

On Sun, 2009-09-20 at 21:04 -0700, Xiao Guangrong wrote:
>
> Suresh Siddha wrote:
> >
> > I am referring to the missing csd_lock_wait() here that you had in the
> > first version of your patch. Let's say, if cpu X is going offline, we
> > need to ensure that the smp_call_function() initiated by cpu X (i.e.,
> > smp_call_function IPI sent to some other cpu's from cpu X) got serviced
> > before cpu X goes offline. We can't do csd_lock_wait() here, as that
> > might deadlock (as all the other cpu's are already in stop machine with
> > interrupts disabled).
> >
>
> It not happen because the preemption is disabled while send IPI request and
> can't schedule to stop machine path, it also stop cpu down.

Xiao, I am getting confused. I am referring to case '1' mentioned by you
here http://marc.info/?l=linux-kernel&m=125265516529139&w=2

thanks,
suresh

2009-09-22 06:53:40

by Xiao Guangrong

[permalink] [raw]

Subject: Re: + generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patch added to -mm tree

Suresh Siddha wrote:
> On Sun, 2009-09-20 at 21:04 -0700, Xiao Guangrong wrote:
>> Suresh Siddha wrote:
>>> I am referring to the missing csd_lock_wait() here that you had in the
>>> first version of your patch. Let's say, if cpu X is going offline, we
>>> need to ensure that the smp_call_function() initiated by cpu X (i.e.,
>>> smp_call_function IPI sent to some other cpu's from cpu X) got serviced
>>> before cpu X goes offline. We can't do csd_lock_wait() here, as that
>>> might deadlock (as all the other cpu's are already in stop machine with
>>> interrupts disabled).
>>>
>> It not happen because the preemption is disabled while send IPI request and
>> can't schedule to stop machine path, it also stop cpu down.
>
> Xiao, I am getting confused. I am referring to case '1' mentioned by you
> here http://marc.info/?l=linux-kernel&m=125265516529139&w=2
>

Ah, your meaning is that we can't do csd_lock_wait() in the CPU_DEAD
notification path in my first version patch? like below:

+static int
+hotplug_cfd(struct notifier_block *nfb, unsigned long action, void *hcpu)
+{
...
+
+#ifdef CONFIG_HOTPLUG_CPU
+ case CPU_UP_CANCELED:
+ case CPU_UP_CANCELED_FROZEN:
+
+ case CPU_DEAD:
+ case CPU_DEAD_FROZEN:
+ local_irq_save(flags);
+ __generic_smp_call_function_interrupt(cpu, 0);
+ __generic_smp_call_function_single_interrupt(cpu, 0);
+ local_irq_restore(flags);
+
/* Do you mean we can't do csd_lock_wait() here??? */
+ csd_lock_wait(&cfd->csd);
+ free_cpumask_var(cfd->cpumask);
+ break;
+#endif
+ };
+
+ return NOTIFY_OK;
+}

The CPU_DEAD notification is not sent in stop machine path, you can
see _cpu_down() function in kernel/cpu.c

Suresh, If I misunderstand your words again, could your elaborate it?

My first version patch is not clean and not complete that you point out in
previous mail:
" I am referring to this latest patch only. We are calling the interrupt
handler manually and not doing the callbacks in that context. In future,
we might see other side affects if we miss some of these smp ipi's."

How about the second patch?

Thanks,
Xiao