Marcelo,
my blk_grow_request_list() patch in -pre5 is buggy. It
can cause boot-time lockups. The window is fairly small,
but I just hit it.
drivers/ide/ide-probe.c:init_irq() does cli().
It calls down to blk_init_free_list() and
blk_grow_request_list().
blk_grow_request_list() does spin_unlock_irq(), which
unconditionally re-enables local interrupts and so is illegal
inside cli(). An interrupt comes in and the CPU locks up in
irq_enter(), spinning on global_irq_lock, which this CPU
already holds.
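To spell out the difference between the two lock/unlock pairings, here is a toy userspace model, not kernel code. It assumes a single flag can stand in for the CPU's interrupt-enable state (the real 2.4 SMP cli() also involves global_irq_lock, which is ignored here), and every model_* name is made up for illustration.

```c
#include <stdbool.h>

/* Toy model: one flag stands in for the CPU's interrupt-enable state.
 * This is an assumption for illustration, not how the kernel tracks it. */
bool irqs_enabled = true;

/* Models cli(): the caller wants interrupts off from here on. */
void model_cli(void) { irqs_enabled = false; }

/* Models spin_lock_irq(): disables interrupts, remembers nothing. */
void model_lock_irq(void) { irqs_enabled = false; }

/* Models spin_unlock_irq(): enables interrupts unconditionally. */
void model_unlock_irq(void) { irqs_enabled = true; }

/* Models spin_lock_irqsave(): saves the previous state, then disables. */
unsigned long model_lock_irqsave(void)
{
    unsigned long flags = irqs_enabled ? 1UL : 0UL;

    irqs_enabled = false;
    return flags;
}

/* Models spin_unlock_irqrestore(): puts back whatever was saved. */
void model_unlock_irqrestore(unsigned long flags)
{
    irqs_enabled = (flags != 0);
}

/* Old pairing, entered with interrupts already off. Returns the
 * interrupt state the cli() caller sees afterwards. */
bool state_after_irq_pair(void)
{
    model_cli();
    model_lock_irq();
    model_unlock_irq();
    return irqs_enabled;
}

/* Patched pairing, entered with interrupts already off. */
bool state_after_irqsave_pair(void)
{
    unsigned long flags;

    model_cli();
    flags = model_lock_irqsave();
    model_unlock_irqrestore(flags);
    return irqs_enabled;
}
```

In this model the old pairing hands control back to the cli() caller with interrupts on, while the irqsave/irqrestore pairing leaves them off as the caller expects.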
Below is the patch. (That's the last spin_lock_irq()
anyone will be seeing from me :))
Andre, init_irq() is somewhat broken - it appears to
assume that cli() will keep interrupts disabled, but it
calls functions which can sleep. If these functions
_do_ sleep, interrupts will be enabled, which is presumably
not what IDE wants to happen.
--- 2.4.19-pre5/drivers/block/ll_rw_blk.c~ide-lockup Fri Mar 29 21:19:11 2002
+++ 2.4.19-pre5-akpm/drivers/block/ll_rw_blk.c Fri Mar 29 21:20:04 2002
@@ -336,14 +336,16 @@ void generic_unplug_device(void *data)
*/
int blk_grow_request_list(request_queue_t *q, int nr_requests)
{
- spin_lock_irq(&io_request_lock);
+ unsigned long flags;
+
+ spin_lock_irqsave(&io_request_lock, flags);
while (q->nr_requests < nr_requests) {
struct request *rq;
int rw;
- spin_unlock_irq(&io_request_lock);
+ spin_unlock_irqrestore(&io_request_lock, flags);
rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
- spin_lock_irq(&io_request_lock);
+ spin_lock_irqsave(&io_request_lock, flags);
if (rq == NULL)
break;
memset(rq, 0, sizeof(*rq));
@@ -356,7 +358,7 @@ int blk_grow_request_list(request_queue_
q->batch_requests = q->nr_requests / 4;
if (q->batch_requests > 32)
q->batch_requests = 32;
- spin_unlock_irq(&io_request_lock);
+ spin_unlock_irqrestore(&io_request_lock, flags);
return q->nr_requests;
}
> - spin_unlock_irq(&io_request_lock);
> + spin_unlock_irqrestore(&io_request_lock, flags);
> rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
Great patch.
kmem_cache_alloc with SLAB_KERNEL can sleep, i.e. you've just converted
an obvious bug into a rare, difficult to find bug. What about trying to
fix it?
I agree that this won't happen during boot, but what about a hotplug PCI
ide controller?
--
Manfred
Manfred Spraul wrote:
>
> > - spin_unlock_irq(&io_request_lock);
> > + spin_unlock_irqrestore(&io_request_lock, flags);
> > rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
>
> Great patch.
> kmem_cache_alloc with SLAB_KERNEL can sleep, i.e. you've just converted
> an obvious bug into a rare, difficult to find bug. What about trying to
> fix it?
Gimme a break, Manfred. The patch fixes the new bug. Which was
hardly obvious. The longstanding (as in years-old) bug was
pointed out to the maintainer.
It may not even be a bug. Certainly I don't think it's
worth my time to fiddle with it. But you're at liberty to.
> I agree that this won't happen during boot, but what about a hotplug PCI
> ide controller?
The kernel calls request_irq() inside cli() in lots of places.
That's the same bug: "if you called cli(), how come you're
allowing kmalloc to clear it?".
In 2.4, this is a design wart. In 2.5, it will go BUG() if
the page allocator performs I/O.
> The kernel calls request_irq() inside cli() in lots of places.
> That's the same bug: "if you called cli(), how come you're
> allowing kmalloc to clear it?".
Those places should, if possible, be fixed. I take patches. If we can get 2.4
to BUG() on those kmalloc violations and clean them up, it sounds like
progress.
Alan Cox wrote:
>
> > The kernel calls request_irq() inside cli() in lots of places.
> > That's the same bug: "if you called cli(), how come you're
> > allowing kmalloc to clear it?".
>
> Those places should if possible be fixed. I take patches. If we can get 2.4
> to BUG() on those kmalloc violations and clean them up it sounds like
> progress
What I'd like is a debugging function `can_sleep()'. This
is good for documentary purposes, and will catch bugs.
So kmalloc() would gain:
if (gfp_flags & __GFP_WAIT)
can_sleep();
can_sleep() would do the following:
- If CONFIG_PREEMPT, check the locking depth (minus BKL depth),
whine if non-zero.
- If inside cli(), whine.
- If inside __cli(), also whine (not really a bug, but a design error).
- whining will include generation of a backtrace.
I suspect a 2.4 version would generate too many bug reports :)
It would have to implement its own lock depth accounting if
we want the sleep-inside-spinlock checking.
There's some arch-dependent stuff in there. I'll do a 2.5
patch. I suspect it'll generate showers of stuff. We can
feed fixes back into 2.4.
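A rough userspace sketch of the checks listed above, just to pin down the idea. The struct sleep_ctx state and its field names are hypothetical; a real kernel version would read per-CPU and arch-dependent state instead, and would dump a backtrace where this sketch only prints the caller's name.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical debug state describing the calling context. */
struct sleep_ctx {
    int  lock_depth;     /* spinlocks held, assumed to exclude the BKL */
    bool in_global_cli;  /* inside cli() */
    bool in_local_cli;   /* inside __cli() */
};

/* Whine once per violated rule and return the number of complaints.
 * Zero means sleeping looks safe from this context. */
int can_sleep(const struct sleep_ctx *ctx, const char *where)
{
    int complaints = 0;

    if (ctx->lock_depth != 0) {
        fprintf(stderr, "%s: may sleep with %d lock(s) held\n",
                where, ctx->lock_depth);
        complaints++;
    }
    if (ctx->in_global_cli) {
        fprintf(stderr, "%s: may sleep inside cli()\n", where);
        complaints++;
    }
    if (ctx->in_local_cli) {
        /* Not strictly a bug, but flagged as a design error. */
        fprintf(stderr, "%s: may sleep inside __cli()\n", where);
        complaints++;
    }
    return complaints;
}
```

Returning a count rather than calling BUG() keeps the sketch testable; whether a real version should warn or halt is a separate decision.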
On Sat, 30 Mar 2002 11:06:25 -0800,
Andrew Morton wrote:
>What I'd like is a debugging function `can_sleep()'. This
>is good for documentary purposes, and will catch bugs.
>
>So kmalloc() would gain:
>
> if (gfp_flags & __GFP_WAIT)
> can_sleep();
can_sleep_if(gfp_flags & __GFP_WAIT) would be better. With debugging
off, can_sleep_if() is
do { } while(0)
and with debugging on it is
if (unlikely(condition)) {
whine(__stringify(condition));
}
One line instead of two, no references to variables when debugging is
off, automatically adds unlikely.
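As a compilable userspace sketch of that macro: DEBUG_SLEEP, atomic_depth and whine_count are hypothetical names, `#cond` stands in for the kernel's __stringify(), and unlikely() is defined here via the GCC/Clang __builtin_expect builtin. The extra atomic_depth test is my reading of where the context check would go; the outline above leaves that inside whine().

```c
#include <stdio.h>

#define DEBUG_SLEEP 1  /* hypothetical switch; set to 0 to compile checks out */

/* GCC/Clang branch hint, mirroring the kernel's unlikely(). */
#define unlikely(x) __builtin_expect(!!(x), 0)

int atomic_depth;  /* hypothetical: nonzero means "must not sleep here" */
int whine_count;   /* number of complaints issued so far */

/* Report a possibly-sleeping call made from atomic context. */
void whine(const char *cond)
{
    whine_count++;
    fprintf(stderr, "may sleep in atomic context: %s\n", cond);
}

#if DEBUG_SLEEP
/* Complain when the caller may sleep (cond) but must not (atomic_depth). */
#define can_sleep_if(cond)                             \
    do {                                               \
        if (unlikely(cond) && atomic_depth != 0)       \
            whine(#cond);                              \
    } while (0)
#else
/* No debugging: the condition is never evaluated or referenced. */
#define can_sleep_if(cond) do { } while (0)
#endif
```

A caller would then write a single line such as can_sleep_if(gfp_flags & __GFP_WAIT); in place of the two-line if/can_sleep() form.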
--- 2.4/drivers/block/ll_rw_blk.c Mon Apr 1 10:53:25 2002
+++ build-2.4/drivers/block/ll_rw_blk.c Mon Apr 1 11:00:21 2002
@@ -336,14 +336,17 @@
*/
int blk_grow_request_list(request_queue_t *q, int nr_requests)
{
- spin_lock_irq(&io_request_lock);
+ unsigned long flags;
+ /* Several broken drivers assume that this function doesn't sleep,
+ * this causes system hangs during boot.
+ * As a temporary fix, make the function non-blocking.
+ */
+ spin_lock_irqsave(&io_request_lock, flags);
while (q->nr_requests < nr_requests) {
struct request *rq;
int rw;
- spin_unlock_irq(&io_request_lock);
- rq = kmem_cache_alloc(request_cachep, SLAB_KERNEL);
- spin_lock_irq(&io_request_lock);
+ rq = kmem_cache_alloc(request_cachep, SLAB_ATOMIC);
if (rq == NULL)
break;
memset(rq, 0, sizeof(*rq));
@@ -356,7 +359,7 @@
q->batch_requests = q->nr_requests / 4;
if (q->batch_requests > 32)
q->batch_requests = 32;
- spin_unlock_irq(&io_request_lock);
+ spin_unlock_irqrestore(&io_request_lock, flags);
return q->nr_requests;
}