2001-11-27 13:20:25

by Maneesh Soni

Subject: smp_call_function & BH handlers

Hi,

Why is it ok to call smp_call_function() from bottom half handlers? This
could lead to a deadlock of the kind we encountered (tried on the 2.4.14 kernel).

CPU 0                                 CPU 1
-----                                 -----
schedule()                            do_fork
read_lock(&tasklist_lock)             spinning for write_lock_irq(&tasklist_lock)
.
.
.
interrupted by a timer handler
calls smp_call_function()
waiting for response from CPU 1
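
The interleaving above can be modelled in userspace as a tiny wait-for check. This is only a sketch under stated assumptions, not kernel code: the types and function names are hypothetical, an IPI wait is assumed to resolve as soon as the target CPU has interrupts enabled, and a lock wait is assumed to resolve as soon as the lock holder is no longer blocked on anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum wait_kind { WAIT_NONE, WAIT_LOCK, WAIT_IPI_ACK };

struct cpu_model {
    enum wait_kind kind;  /* what this CPU is blocked on */
    int target;           /* the CPU it is waiting for */
    bool irqs_on;         /* can this CPU take an IPI right now? */
};

/* One pass over all CPUs; returns true if any wait was resolved. */
static bool resolve_pass(struct cpu_model c[], int n)
{
    bool progressed = false;

    for (int i = 0; i < n; i++) {
        if (c[i].kind == WAIT_NONE)
            continue;
        assert(c[i].target >= 0 && c[i].target < n);
        /* Assumption: an IPI is acknowledged once the target has IRQs on. */
        if (c[i].kind == WAIT_IPI_ACK && c[c[i].target].irqs_on) {
            c[i].kind = WAIT_NONE;
            progressed = true;
        /* Assumption: an unblocked holder eventually drops the lock. */
        } else if (c[i].kind == WAIT_LOCK &&
                   c[c[i].target].kind == WAIT_NONE) {
            c[i].kind = WAIT_NONE;
            progressed = true;
        }
    }
    return progressed;
}

/* True if some CPU is still blocked once no further progress is possible. */
static bool model_deadlocks(struct cpu_model c[], int n)
{
    while (resolve_pass(c, n))
        ;
    for (int i = 0; i < n; i++)
        if (c[i].kind != WAIT_NONE)
            return true;
    return false;
}

/* The two-CPU scenario from the diagram above. */
static bool tasklist_scenario_deadlocks(bool cpu1_irqs_on_while_spinning)
{
    struct cpu_model c[2] = {
        /* CPU 0: holds the read lock, waits for CPU 1 to answer the IPI. */
        { WAIT_IPI_ACK, 1, true },
        /* CPU 1: spins for the write lock that CPU 0 holds for reading. */
        { WAIT_LOCK, 0, cpu1_irqs_on_while_spinning },
    };

    return model_deadlocks(c, 2);
}
```

With the spinning CPU's interrupts off, as in write_lock_irq(), tasklist_scenario_deadlocks(false) reports a deadlock; passing true models a spinner that can still take the IPI, and the cycle is broken.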

IMO this looks like a generic problem, not specific to tasklist_lock, and it can
happen with other locks as well. Possible solutions to the above problem are:

(1) Do not use smp_call_function() even from bottom half handlers.
(2) Enable interrupts while a CPU has to spin in xxx_lock_irq(), and disable
them again when the CPU gets the lock.
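
Option (2) could look roughly like the following userspace sketch. It is an illustration only: the interrupt flag, the lock, the "pending IPI" stand-in and every function name here are hypothetical simulations, not kernel APIs, and the real xxx_lock_irq() implementations are not shown in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simulated state; hypothetical stand-ins, not kernel APIs. */
static bool sim_irqs_enabled = true;
static atomic_flag sim_lock = ATOMIC_FLAG_INIT;
static int ipis_serviced;

/* Stand-in for an IPI arriving while this CPU spins: it is serviced
 * only if (simulated) interrupts are enabled at that moment. */
static void sim_pending_ipi(void)
{
    if (sim_irqs_enabled)
        ipis_serviced++;
}

/* Option (2): keep interrupts off only for the acquisition attempt
 * itself and turn them back on for each iteration spent spinning.
 * Returns the number of spins, with the lock held and interrupts off. */
static int sim_lock_irq(void (*spin_hook)(int spin))
{
    int spins = 0;

    assert(spin_hook);
    for (;;) {
        sim_irqs_enabled = false;
        if (!atomic_flag_test_and_set(&sim_lock))
            return spins;            /* got the lock, IRQs stay off */
        sim_irqs_enabled = true;     /* window in which IPIs can run */
        sim_pending_ipi();
        spin_hook(spins++);
    }
}

/* Hypothetical hook playing the other CPU: drops the lock on spin 2. */
static void release_on_third_spin(int spin)
{
    if (spin == 2)
        atomic_flag_clear(&sim_lock);
}

/* Runs the scenario once, starting with the lock held elsewhere. */
static int run_demo(void)
{
    ipis_serviced = 0;
    sim_irqs_enabled = true;
    atomic_flag_clear(&sim_lock);
    atomic_flag_test_and_set(&sim_lock);
    return sim_lock_irq(release_on_third_spin);
}
```

In this simulation every spin leaves a window in which the pending IPI is serviced, so a sender waiting on this CPU would not be stuck behind the lock. Whether reopening interrupts inside a real lock primitive is acceptable is a separate question that the sketch does not address.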

That said, the deadlock we faced does not occur if schedule() uses
read_lock_irq(&tasklist_lock).

The comments above smp_call_function() also say that it can return a negative
status code upon failure. But it does not do that; it keeps waiting for a
response from the other CPUs. Why is it necessary to wait for a response if we
specify nowait in the parameters?

I hope I have not missed anything here.

Thanks
Maneesh

--
Maneesh Soni
IBM Linux Technology Center,
IBM India Software Lab, Bangalore.
Phone: +91-80-5044999 email: [email protected]
http://lse.sourceforge.net/locking/rcupdate.html



2001-11-27 13:31:56

by Ingo Molnar

Subject: Re: smp_call_function & BH handlers


On Tue, 27 Nov 2001, Maneesh Soni wrote:

> Why is it ok to call smp_call_function from bottom half handlers? [...]

which part of the kernel is calling smp_call_function() from bh contexts?

Ingo

2001-11-27 16:22:09

by Maneesh Soni

Subject: Re: smp_call_function & BH handlers

On Tue, Nov 27, 2001 at 04:28:49PM +0100, Ingo Molnar wrote:
>
> On Tue, 27 Nov 2001, Maneesh Soni wrote:
>
> > Why is it ok to call smp_call_function from bottom half handlers? [...]
>
> which part of the kernel is calling smp_call_function() from bh contexts?
>
> Ingo

I am working with Dipankar on Read-Copy Update, and experimenting with
smp_call_function(). We believed the comments for this routine and ran into
this problem. That is why the question came up. I have not yet searched the
kernel sources for such callers, so I am not sure whether any actually
exist.

Maneesh


2001-11-27 16:41:11

by Ingo Molnar

Subject: Re: smp_call_function & BH handlers


On Tue, 27 Nov 2001, Maneesh Soni wrote:

> I am working with Dipankar on Read-Copy Update, and experimenting with
> smp_call_function(). We believed the comments for this routine and
> ran into this problem. That is why the question came up. I have not yet
> searched the kernel sources for such callers, so I am not sure whether
> any actually exist.

we had similar lockup problems before, eg. TLB flushes initiated from
IRQ/BH contexts - which is illegal now. Generally it's not safe to assume
that every CPU is responsive to synchronous events triggered from IRQ/BH
contexts. Every read_lock user is prone to this problem.

Ingo

2001-11-27 19:25:57

by Dipankar Sarma

Subject: Re: smp_call_function & BH handlers


In article <[email protected]> Ingo Molnar wrote:

> On Tue, 27 Nov 2001, Maneesh Soni wrote:

>> I am working with Dipankar on Read-Copy Update, and experimenting with
>> smp_call_function(). We believed the comments for this routine and
>> ran into this problem. That is why the question came up. I have not yet
>> searched the kernel sources for such callers, so I am not sure whether
>> any actually exist.

> we had similar lockup problems before, eg. TLB flushes initiated from
> IRQ/BH contexts - which is illegal now. Generally it's not safe to assume
> that every CPU is responsive to synchronous events triggered from IRQ/BH
> contexts. Every read_lock user is prone to this problem.

Thanks for the clarification. Should we update the
function header for smp_call_function() to say that it is illegal
to use it from both IRQ and BH contexts?

Along the same lines, I am wondering whether the fact that a nowait
broadcast IPI sender waits for the IPI handlers to start on all other
CPUs is just a by-product of the implementation. I can see the need for
two types of such IPIs: 1. send the broadcast IPI and forget about it,
and 2. send the broadcast IPI and wait for completion of the handlers.

Is there a need for the Linux kernel to have a broadcast IPI
mechanism that waits for the start of the IPI handler elsewhere but
not until the end?
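
The three behaviours being compared (send and forget, wait for the handlers to start, wait for the handlers to finish) can be written down as a small predicate. This is a hedged userspace sketch: the enum, the struct and the function name are hypothetical and only illustrate the distinction; they are not the kernel's interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three sender behaviours discussed here. */
enum ipi_mode {
    IPI_FORGET,       /* 1. send the broadcast IPI and return at once */
    IPI_WAIT_START,   /* wait until every handler has started */
    IPI_WAIT_FINISH,  /* 2. wait until every handler has completed */
};

/* Progress of one broadcast: how many other CPUs have started and
 * finished running the handler so far. */
struct ipi_progress {
    int started;
    int finished;
};

/* May the sender return yet, given the mode and the progress so far? */
static bool sender_may_return(enum ipi_mode mode,
                              const struct ipi_progress *p,
                              int other_cpus)
{
    /* A handler cannot finish before it starts. */
    assert(p->finished <= p->started && p->started <= other_cpus);

    switch (mode) {
    case IPI_FORGET:
        return true;
    case IPI_WAIT_START:
        return p->started == other_cpus;
    case IPI_WAIT_FINISH:
        return p->finished == other_cpus;
    }
    return false;
}
```

In these terms, the behaviour described for nowait in this thread corresponds to IPI_WAIT_START, and the question is whether anything needs that mode as opposed to IPI_FORGET or IPI_WAIT_FINISH.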

Thanks
Dipankar
--
Dipankar Sarma <[email protected]> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.