2005-02-02 18:54:58

by Kevin Hilman

[permalink] [raw]
Subject: Real-Time Preemption and GFP_ATOMIC

While testing an older driver on an -RT kernel (currently using
-V0.7.37-03), I noticed something strange.

The driver was triggering a "sleeping function called from invalid
context" BUG(). It was coming from a case where the driver was doing
a __get_free_page(GFP_ATOMIC) while interrupts were disabled (example
trace below). I know this is probably real bug and it shouldn't be
allocating memory with interrupts disabled, but shouldn't this be
possible? Isn't the role of GFP_ATOMIC to say that "this caller
cannot sleep".

To produce the following trace, I wrote a simple moudle which just has
this as its init_module routine:

local_irq_disable();
p = __get_free_page(GFP_ATOMIC);
local_irq_enable();

And here's the trace:

BUG: sleeping function called from invalid context insmod(2126) at kernel/rt.c:1448
in_atomic():0 [00000000], irqs_disabled():1
[<c0102fa3>] dump_stack+0x23/0x30 (20)
[<c01133c5>] __might_sleep+0xe5/0x100 (36)
[<c012fbb8>] __spin_lock+0x38/0x60 (24)
[<c012fc8d>] _spin_lock_irqsave+0x1d/0x30 (16)
[<c0140e5c>] buffered_rmqueue+0x1c/0x190 (40)
[<c01413ce>] __alloc_pages+0x34e/0x390 (76)
[<c0141437>] __get_free_pages+0x27/0x50 (12)
[<c883205a>] kmod_init+0x5a/0x74 [kmod] (24)
[<c0138232>] sys_init_module+0x232/0x260 (28)
[<c010299c>] syscall_call+0x7/0xb (-8124)
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c0133e3d>] .... print_traces+0x1d/0x60
.....[<c0102fa3>] .. ( <= dump_stack+0x23/0x30)





2005-02-02 22:32:05

by Esben Nielsen

[permalink] [raw]
Subject: Re: Real-Time Preemption and GFP_ATOMIC

On 2 Feb 2005, Kevin Hilman wrote:

> While testing an older driver on an -RT kernel (currently using
> -V0.7.37-03), I noticed something strange.
>
> The driver was triggering a "sleeping function called from invalid
> context" BUG(). It was coming from a case where the driver was doing
> a __get_free_page(GFP_ATOMIC) while interrupts were disabled (example
> trace below). I know this is probably real bug and it shouldn't be
> allocating memory with interrupts disabled, but shouldn't this be
> possible? Isn't the role of GFP_ATOMIC to say that "this caller
> cannot sleep".
>
The problem is that almost all locks are replaced by mutexex which
can make yuo sleep. That includes locks around the various allocation
structures.

This is one of those places where I think Ingo have gone too far, but I
see that the code in mm/ is not fitted for for doing anything else
but what Ingo have done right now. It would require some rewriting to fix
it.

The basic allocations should be of the free-list form
res = first_free;
if(res) {
first_free = res->next;
}
return res;

I se no problem in protecting this kind of operation by a raw spinlock.
Using a mutex to protect such a list would be a waste: You would have to
lock and unlock the mutex's spinlock twice! If it was made that way, i.e.
the very basic free-list operation was taken out in front of the more
complicated stuff in mm/slap.c GFP_ATOMIC could be made to work as usual.

The hard job is that the refill operation has to be done under a mutex
under PREEMPT_RT. I.e. suddenly there are two locks to take care off.

Esben


2005-02-10 15:59:50

by Ingo Molnar

[permalink] [raw]
Subject: Re: Real-Time Preemption and GFP_ATOMIC


* Kevin Hilman <[email protected]> wrote:

> To produce the following trace, I wrote a simple moudle which just has
> this as its init_module routine:
>
> local_irq_disable();
> p = __get_free_page(GFP_ATOMIC);
> local_irq_enable();

in the PREEMPT_RT kernel almost everything might sleep, so the general
rule is to not call anything 'complex' from an IRQs-off section.
Depending on which is easier in your code, if you want to fix it up for
PREEMPT_RT then either move the GFP_ATOMIC allocation from under the
irqs-off section, or introduce a spinlock for the irqs-off section and
use spin_lock_irqsave(). (that is almost always needed anyway if you
really needed the irqs-off section.)

Ingo