2005-04-26 02:23:46

by Shaohua Li

[permalink] [raw]
Subject: [PATCH]broadcast IPI race condition on CPU hotplug

Hi,
After a CPU is booted but before it's officially up (set online map, and
enable interrupt), the CPU possibly will receive a broadcast IPI. After
it's up, it will handle the stale interrupt soon and maybe cause oops if
it's a smp-call-function-interrupt. This is quite possible in CPU
hotplug case, but nearly can't occur at boot time. Below patch replaces
broadcast IPI with send_ipi_mask just like the cluster mode.

Thanks,
Shaohua

Signed-off-by: Li Shaohua<[email protected]>

--- a/include/asm-i386/mach-default/mach_ipi.h 2005-04-26 09:07:51.695390216 +0800
+++ b/include/asm-i386/mach-default/mach_ipi.h 2005-04-26 09:26:59.235937536 +0800
@@ -11,20 +11,16 @@ static inline void send_IPI_mask(cpumask

static inline void send_IPI_allbutself(int vector)
{
- /*
- * if there are no other CPUs in the system then we get an APIC send
- * error if we try to broadcast, thus avoid sending IPIs in this case.
- */
- if (!(num_online_cpus() > 1))
- return;
+ cpumask_t mask = cpu_online_map;
+ cpu_clear(smp_processor_id(), mask);

- __send_IPI_shortcut(APIC_DEST_ALLBUT, vector);
- return;
+ if (likely(!cpus_empty(mask)))
+ send_IPI_mask(mask, vector);
}

static inline void send_IPI_all(int vector)
{
- __send_IPI_shortcut(APIC_DEST_ALLINC, vector);
+ send_IPI_mask(cpu_online_map, vector);
}

#endif /* __ASM_MACH_IPI_H */
--- a/arch/x86_64/kernel/genapic_flat.c 2005-03-02 15:38:09.000000000 +0800
+++ b/arch/x86_64/kernel/genapic_flat.c 2005-04-26 09:26:09.298529176 +0800
@@ -45,20 +45,19 @@ static void flat_init_apic_ldr(void)
apic_write_around(APIC_LDR, val);
}

+static void flat_send_IPI_mask(cpumask_t cpumask, int vector);
static void flat_send_IPI_allbutself(int vector)
{
- /*
- * if there are no other CPUs in the system then
- * we get an APIC send error if we try to broadcast.
- * thus we have to avoid sending IPIs in this case.
- */
- if (num_online_cpus() > 1)
- __send_IPI_shortcut(APIC_DEST_ALLBUT, vector, APIC_DEST_LOGICAL);
+ cpumask_t mask = cpu_online_map;
+ cpu_clear(smp_processor_id(), mask);
+
+ if (likely(!cpus_empty(mask)))
+ flat_send_IPI_mask(mask, vector);
}

static void flat_send_IPI_all(int vector)
{
- __send_IPI_shortcut(APIC_DEST_ALLINC, vector, APIC_DEST_LOGICAL);
+ flat_send_IPI_mask(cpu_online_map, vector);
}

static void flat_send_IPI_mask(cpumask_t cpumask, int vector)


2005-04-26 04:23:26

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: [PATCH]broadcast IPI race condition on CPU hotplug

Hi Shaohua,

On Tue, 26 Apr 2005, Li Shaohua wrote:

> After a CPU is booted but before it's officially up (set online map, and
> enable interrupt), the CPU possibly will receive a broadcast IPI. After
> it's up, it will handle the stale interrupt soon and maybe cause oops if
> it's a smp-call-function-interrupt. This is quite possible in CPU
> hotplug case, but nearly can't occur at boot time. Below patch replaces
> broadcast IPI with send_ipi_mask just like the cluster mode.

Ok, but isn't it sufficient to use send_ipi_mask in smp_call_function
instead?

Thanks,
Zwane

2005-04-26 04:28:57

by Shaohua Li

[permalink] [raw]
Subject: Re: [PATCH]broadcast IPI race condition on CPU hotplug

Hi,
On Tue, 2005-04-26 at 12:25, Zwane Mwaikambo wrote:
>
> > After a CPU is booted but before it's officially up (set online map, and
> > enable interrupt), the CPU possibly will receive a broadcast IPI. After
> > it's up, it will handle the stale interrupt soon and maybe cause oops if
> > it's a smp-call-function-interrupt. This is quite possible in CPU
> > hotplug case, but nearly can't occur at boot time. Below patch replaces
> > broadcast IPI with send_ipi_mask just like the cluster mode.
>
> Ok, but isn't it sufficient to use send_ipi_mask in smp_call_function
> instead?
I'm not sure if other routines using broadcast IPI have this bug. Fixing
the send_ipi_all API looks more generic. Is there any reason we should
use broadcast IPI?

Thanks,
Shaohua

2005-04-26 04:45:39

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: [PATCH]broadcast IPI race condition on CPU hotplug

On Tue, 26 Apr 2005, Li Shaohua wrote:

> On Tue, 2005-04-26 at 12:25, Zwane Mwaikambo wrote:
> >
> > > After a CPU is booted but before it's officially up (set online map, and
> > > enable interrupt), the CPU possibly will receive a broadcast IPI. After
> > > it's up, it will handle the stale interrupt soon and maybe cause oops if
> > > it's a smp-call-function-interrupt. This is quite possible in CPU
> > > hotplug case, but nearly can't occur at boot time. Below patch replaces
> > > broadcast IPI with send_ipi_mask just like the cluster mode.
> >
> > Ok, but isn't it sufficient to use send_ipi_mask in smp_call_function
> > instead?
> I'm not sure if other routines using broadcast IPI have this bug. Fixing
> the send_ipi_all API looks more generic. Is there any reason we should
> use broadcast IPI?

I'd prefer only touching smp_call_function because the others are
primitives, we should only be fixing the users of the primitives,
otherwise we'll end up with code which won't be as versatile.

Thanks,
Zwane

2005-04-26 05:21:16

by Shaohua Li

[permalink] [raw]
Subject: Re: [PATCH]broadcast IPI race condition on CPU hotplug

Hi,
On Tue, 2005-04-26 at 12:47, Zwane Mwaikambo wrote:
> > > > After a CPU is booted but before it's officially up (set online map, and
> > > > enable interrupt), the CPU possibly will receive a broadcast IPI. After
> > > > it's up, it will handle the stale interrupt soon and maybe cause oops if
> > > > it's a smp-call-function-interrupt. This is quite possible in CPU
> > > > hotplug case, but nearly can't occur at boot time. Below patch replaces
> > > > broadcast IPI with send_ipi_mask just like the cluster mode.
> > >
> > > Ok, but isn't it sufficient to use send_ipi_mask in smp_call_function
> > > instead?
> > I'm not sure if other routines using broadcast IPI have this bug. Fixing
> > the send_ipi_all API looks more generic. Is there any reason we should
> > use broadcast IPI?
>
> I'd prefer only touching smp_call_function because the others are
> primitives, we should only be fixing the users of the primitives,
> otherwise we'll end up with code which won't be as versatile.
Ok, here it it. It's against -mm tree.

Thanks,
Shaohua

Signed-off-by: Li Shaohua<[email protected]>

--- a/arch/i386/kernel/smp.c 2005-04-26 08:47:08.762344760 +0800
+++ b/arch/i386/kernel/smp.c 2005-04-26 13:05:44.558585200 +0800
@@ -527,6 +527,7 @@ int smp_call_function (void (*func) (voi
{
struct call_data_struct data;
int cpus;
+ cpumask_t mask;

/* Holding any lock stops cpus from going down. */
spin_lock(&call_lock);
@@ -536,6 +537,8 @@ int smp_call_function (void (*func) (voi
spin_unlock(&call_lock);
return 0;
}
+ mask = cpu_online_map;
+ cpu_clear(smp_processor_id(), mask);

/* Can deadlock when called with interrupts disabled */
WARN_ON(irqs_disabled());
@@ -551,7 +554,7 @@ int smp_call_function (void (*func) (voi
mb();

/* Send a message to all other CPUs and wait for them to respond */
- send_IPI_allbutself(CALL_FUNCTION_VECTOR);
+ send_IPI_mask(mask, CALL_FUNCTION_VECTOR);

/* Wait for response */
while (atomic_read(&data.started) != cpus)
--- a/arch/x86_64/kernel/smp.c 2005-04-12 10:12:16.000000000 +0800
+++ b/arch/x86_64/kernel/smp.c 2005-04-26 13:07:26.715055056 +0800
@@ -304,10 +304,12 @@ static void __smp_call_function (void (*
{
struct call_data_struct data;
int cpus = num_online_cpus()-1;
+ cpumask_t mask = cpu_online_map;

if (!cpus)
return;

+ cpu_clear(smp_processor_id(), mask);
data.func = func;
data.info = info;
atomic_set(&data.started, 0);
@@ -318,7 +320,7 @@ static void __smp_call_function (void (*
call_data = &data;
wmb();
/* Send a message to all other CPUs and wait for them to respond */
- send_IPI_allbutself(CALL_FUNCTION_VECTOR);
+ send_IPI_mask(mask, CALL_FUNCTION_VECTOR);

/* Wait for response */
while (atomic_read(&data.started) != cpus)


2005-04-26 13:22:15

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH]broadcast IPI race condition on CPU hotplug

On Tue, Apr 26, 2005 at 10:20:44AM +0800, Li Shaohua wrote:
> Hi,
> After a CPU is booted but before it's officially up (set online map, and
> enable interrupt), the CPU possibly will receive a broadcast IPI. After
> it's up, it will handle the stale interrupt soon and maybe cause oops if
> it's a smp-call-function-interrupt. This is quite possible in CPU
> hotplug case, but nearly can't occur at boot time. Below patch replaces
> broadcast IPI with send_ipi_mask just like the cluster mode.

No way we are making this common operation much slower just
to fix an obscure race at boot time. PLease come up with a fix
that only impacts the boot process.

Thanks,
-Andi

2005-04-27 00:58:19

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: [PATCH]broadcast IPI race condition on CPU hotplug

On Tue, 26 Apr 2005, Andi Kleen wrote:

> On Tue, Apr 26, 2005 at 10:20:44AM +0800, Li Shaohua wrote:
> > Hi,
> > After a CPU is booted but before it's officially up (set online map, and
> > enable interrupt), the CPU possibly will receive a broadcast IPI. After
> > it's up, it will handle the stale interrupt soon and maybe cause oops if
> > it's a smp-call-function-interrupt. This is quite possible in CPU
> > hotplug case, but nearly can't occur at boot time. Below patch replaces
> > broadcast IPI with send_ipi_mask just like the cluster mode.
>
> No way we are making this common operation much slower just
> to fix an obscure race at boot time. PLease come up with a fix
> that only impacts the boot process.

The fix only for smp_call_function seems fine to me since we really don't
want to broadcast to all processors but only online ones.
smp_call_function is particularly dangerous because the interrupt handler
accesses stack data which wouldn't be persistent due to the calling
processor not waiting for non online processors.

Thanks,
Zwane

2005-04-27 01:14:52

by Shaohua Li

[permalink] [raw]
Subject: Re: [PATCH]broadcast IPI race condition on CPU hotplug

On Tue, 2005-04-26 at 21:21, Andi Kleen wrote:
> On Tue, Apr 26, 2005 at 10:20:44AM +0800, Li Shaohua wrote:
> > Hi,
> > After a CPU is booted but before it's officially up (set online map, and
> > enable interrupt), the CPU possibly will receive a broadcast IPI. After
> > it's up, it will handle the stale interrupt soon and maybe cause oops if
> > it's a smp-call-function-interrupt. This is quite possible in CPU
> > hotplug case, but nearly can't occur at boot time. Below patch replaces
> > broadcast IPI with send_ipi_mask just like the cluster mode.
>
> No way we are making this common operation much slower just
> to fix an obscure race at boot time. PLease come up with a fix
> that only impacts the boot process.
We can't prevent a CPU to receive a broadcast interrupt. Ack the
interrupt and mark the cpu online can't be atomic operation, so the CPU
either receives unexpected interrupt or loses interrupt.

Thanks,
Shaohua

2005-04-27 12:56:51

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH]broadcast IPI race condition on CPU hotplug

On Wed, Apr 27, 2005 at 09:11:59AM +0800, Li Shaohua wrote:
> On Tue, 2005-04-26 at 21:21, Andi Kleen wrote:
> > On Tue, Apr 26, 2005 at 10:20:44AM +0800, Li Shaohua wrote:
> > > Hi,
> > > After a CPU is booted but before it's officially up (set online map, and
> > > enable interrupt), the CPU possibly will receive a broadcast IPI. After
> > > it's up, it will handle the stale interrupt soon and maybe cause oops if
> > > it's a smp-call-function-interrupt. This is quite possible in CPU
> > > hotplug case, but nearly can't occur at boot time. Below patch replaces
> > > broadcast IPI with send_ipi_mask just like the cluster mode.
> >
> > No way we are making this common operation much slower just
> > to fix an obscure race at boot time. PLease come up with a fix
> > that only impacts the boot process.
> We can't prevent a CPU to receive a broadcast interrupt. Ack the
> interrupt and mark the cpu online can't be atomic operation, so the CPU
> either receives unexpected interrupt or loses interrupt.

Cant you just check at the end of the CPU bootup if the CPU
got such an APIC interrupt and ack it then?

-Andi

2005-04-28 03:18:44

by Shaohua Li

[permalink] [raw]
Subject: Re: [PATCH]broadcast IPI race condition on CPU hotplug

On Wed, 2005-04-27 at 20:56, Andi Kleen wrote:
> > > No way we are making this common operation much slower just
> > > to fix an obscure race at boot time. PLease come up with a fix
> > > that only impacts the boot process.
> > We can't prevent a CPU to receive a broadcast interrupt. Ack the
> > interrupt and mark the cpu online can't be atomic operation, so the CPU
> > either receives unexpected interrupt or loses interrupt.
>
> Cant you just check at the end of the CPU bootup if the CPU
> got such an APIC interrupt and ack it then?
Ok, There are two solutions:
1. boot sequence hold the 'call_lock' before LAPIC is initialized. So no
broadcast IPI will be received. But you possibly think the lock is hold
too long time.
2. Hold the lock later.
The boot sequence does:
a.hold 'call_lock' (prevent upcoming IPI)
b.enable interrupt (let stale IPI get handled)
c.set cpu online
d.unlock the lock.
The smp_call_function_interrupt does:
if (cpu is offline)
ack the interrupt and return
But this approach will add check in smp_call_function_interrupt and
enable interrupt before set online map. Don't know if it's an issue.

Seems no other approaches. The ISR and IRR registers are read only.
BTW, send_ipi_mask_bitmask sends a CPU group one time, is it really much
slower than broadcast IPI? I guess we can ignore the overhead, since the
overhead is lighter against the call function interrupt handler.

Thanks,
Shaohua