2001-07-23 23:40:06

by Peter T. Breuer

Subject: what's the semaphore in requests for?

What's the semaphore field in requests for? Are driver writers supposed
to be using it?

The reason I ask is that I've been chasing an smp bug in a block driver
of mine for a week. The bug only shows up in 2.4 kernels (not in the same
code under 2.2.18) and only with smp ("nosmp" squashes it). It only
shows up when running dd in user space copying from my device to
a disk device. It doesn't show when copying to /dev/null.

The symptom is a complete kernel lockup. Not even sysreq works.
It's driving me crazy. It seems to get very easy to trigger in 2.4.6,
while it was hard or impossible to trigger back in 2.4.0 and 2.4.1.

I have added the sgi kdb stuff in order to get a handle. For a while I
was getting some ouches from the nmi watchdog saying that one cpu was
locked, followed by a jump into the kdb monitor. But I'm not getting that
now. In any case I haven't learned how to use kdb properly yet, so
I couldn't make out much from the stack info.

The bug maybe shows on write from a local disk to the device too, but
it's at least 10 times as hard to trigger that way. It does NOT trigger
when writing to the device from /dev/zero. I'm not sure it shows in all
my smp machines either .. most of them have been slightly unstable
under 2.4.* anyway, locking up on timescales of 1 day to a week. Could
be apic (asus and dell bx), but I was running my own machine noapic and
it didn't affect the bug.

The block driver is largely in userspace. All the kernel half does
is transfer requests to a local queue (with the io lock still held, of
course). The userspace daemon cycles continuously, doing ioctls that
copy the requests (bh by bh) into userspace, where they're treated via
some networking calls, and then returns an ack via another ioctl.
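In outline, the daemon's cycle is something like this (a sketch only;
the ioctl names and numbers here are invented, not the driver's real
ones):

    /* Userspace half, sketched. MYDEV_GET_REQ/MYDEV_ACK_REQ are
     * placeholder ioctls, invented for illustration. */
    #include <fcntl.h>
    #include <sys/ioctl.h>

    #define MYDEV_GET_REQ 0x4d01    /* placeholder numbers */
    #define MYDEV_ACK_REQ 0x4d02

    static void daemon_loop(int fd)
    {
        static char buffer[128 * 1024];    /* one request's worth */

        for (;;) {
            /* copy the next queued request, bh by bh, to userspace */
            if (ioctl(fd, MYDEV_GET_REQ, buffer) < 0)
                break;
            /* ... treat the data via networking calls ... */
            ioctl(fd, MYDEV_ACK_REQ, buffer);    /* acknowledge */
        }
    }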

The driver's local queue is protected by a semaphore. The thing that
puzzles me is that the bug shows only when copying to a disk device,
not to /dev/null, through userspace! Is it that the lifetime of a
request is much longer than expected?

I have the impression that the bug is dependent on speed too. If I
limit the speed of the device, I think I don't see the bug - but
definitive results are very hard to come by because I have to copy
about 2GB from the device to be sure of triggering it.

Oh well, if anyone has any insight or any plans for further hunting,
please let me know.

Peter


2001-07-24 07:45:10

by Jens Axboe

Subject: Re: what's the semaphore in requests for?

On Tue, Jul 24 2001, Peter T. Breuer wrote:
> What's the semaphore field in requests for? Are driver writers supposed
> to be using it?

Drivers can use it if they want completion to be signalled for a request
(see end_that_request_last). However, see 2.4.7 where it's now ->waiting
and the interface changed.
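
Something like this, roughly (a sketch, pre-2.4.7 fields):

    /* sketch: synchronous wait on a request's completion via ->sem */
    struct semaphore sem;

    sema_init(&sem, 0);    /* start locked */
    req->sem = &sem;       /* end_that_request_last() will up() this */
    /* ... queue the request ... */
    down(&sem);            /* sleep until the request is finished */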

> The block driver is largely in userspace. All the kernel half does
> is transfer requests to a local queue (with the io lock still held, of
> course). The userspace daemon cycles continuously, doing ioctls that
> copy the requests (bh by bh) into userspace, where they're treated via
> some networking calls, and then returns an ack via another ioctl.
>
> The driver's local queue is protected by a semaphore. The thing that
> puzzles me is that the bug shows only when copying to a disk device,
> not to /dev/null, through userspace! Is it that the lifetime of a
> request is much longer than expected?

Well, all the explanations in the world don't help much -- show the
code.

--
Jens Axboe

2001-07-28 22:34:32

by Peter T. Breuer

Subject: Re: what's the semaphore in requests for?

"A month of sundays ago ptb wrote:"
> What's the semaphore field in requests for? Are driver writers supposed
> to be using it?

It seems nobody knows.

> The reason I ask is that I've been chasing an smp bug in a block driver
> of mine for a week. The bug only shows up in 2.4 kernels (not in the same
> code under 2.2.18) and only with smp ("nosmp" squashes it). It only

I've made more progress in seeking this bug. The test is
just dd if=/dev/mine of=/dev/null bs=4k over 2GB of data.

2 processors + 1 userspace helper daemon on device = no bug
2 processors + 2 userspace helper daemons on device = bug (lockup)
1 processor + 1 userspace helper daemon on device = no bug
1 processor + 2 userspace helper daemons on device = no bug

Seeing this, I added a semaphore that forces the helper daemons to
exclude each other as they enter the kernel in their ioctl calls.
Still the lockup occurred with two processors and two daemons.

IMO that's impossible. With the semaphore, the daemons should have
behaved like one daemon, since they don't maintain any state. They just
run in two threads instead of one. I was careful to lock the entire
daemon interaction cycle with the kernel (a get and an ack ioctl) into
one atomic unit with the semaphore, not just exclude simultaneous entry.
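
In sketch form (the ioctl names and the ack_req helper are invented
here for illustration), the exclusion in the ioctl handler was:

    /* sketch only: one semaphore brackets the whole get+ack cycle,
     * so two daemons cannot interleave their cycles */
    case MYDEV_GET_REQ:
        down(&dev->daemon_sem);            /* cycle begins */
        result = get_req(slot, buffer);
        break;
    case MYDEV_ACK_REQ:
        result = ack_req(slot, buffer);
        up(&dev->daemon_sem);              /* cycle ends */
        break;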

OK .. so let's treat this as an opportunity to learn something more
about the kernel.

I believe the above data indicates that the act of doing an ioctl
may prompt activity in the kernel request function, perhaps as the
scheduler triggers the helper daemon process on the way into the
kernel. And perhaps that leads to the kernel request function for
the device running twice simultaneously? It runs when the device
unplugs, surely, and never any other time?

I've been through adding spinlocks to exclude the kernel request
function and the helper daemon ioctls on shared resources. Surely
if there were a problem there I'd see it with 2 cpus and 1 helper
daemon!

Peter

2001-07-30 08:22:38

by Jens Axboe

Subject: Re: what's the semaphore in requests for?

On Sun, Jul 29 2001, Peter T. Breuer wrote:
> "A month of sundays ago ptb wrote:"
> > What's the semaphore field in requests for? Are driver writers supposed
> > to be using it?
>
> It seems nobody knows.

Seems you don't get mail sent to you?! I answered this on the 24th:

http://asimov.lib.uaa.alaska.edu/linux-kernel/archive/2001-Week-30/0165.html

> > The reason I ask is that I've been chasing an smp bug in a block driver
> > of mine for a week. The bug only shows up in 2.4 kernels (not in the same
> > code under 2.2.18) and only with smp ("nosmp" squashes it). It only
>
> I've made more progress in seeking this bug. The test is
> just dd if=/dev/mine of=/dev/null bs=4k over 2GB of data.
>
> 2 processors + 1 userspace helper daemon on device = no bug
> 2 processors + 2 userspace helper daemons on device = bug (lockup)
> 1 processor + 1 userspace helper daemon on device = no bug
> 1 processor + 2 userspace helper daemons on device = no bug
>
> Seeing this, I added a semaphore that forces the helper daemons to
> exclude each other as they enter the kernel in their ioctl calls.
> Still the lockup occurred with two processors and two daemons.

And I'll restate here what I said then too -- SHOW THE CODE! Or send me
a crystal ball and I'll be happy to solve your races for you.

--
Jens Axboe

2001-07-30 14:15:30

by Peter T. Breuer

Subject: Re: what's the semaphore in requests for?

"Jens Axboe wrote:"
> http://asimov.lib.uaa.alaska.edu/linux-kernel/archive/2001-Week-30/0165.html

You say there [of the semaphore field in requests]:

Drivers can use it if they want completion to be signalled for a request
(see end_that_request_last). However, see 2.4.7 where it's now ->waiting
and the interface changed.

end_that_request_last() up()s the semaphore if it's non-null, but for that
to make sense, someone must down() it. Nobody does (in ll_rw_blk.c), so I
assume it's entirely for my use in controlling access to the request.

So I don't believe it's involved in my problem.


> > 2 processors + 1 userspace helper daemon on device = no bug
> > 2 processors + 2 userspace helper daemons on device = bug (lockup)
> > 1 processor + 1 userspace helper daemon on device = no bug
> > 1 processor + 2 userspace helper daemons on device = no bug

> And I'll restate here what I said then too -- SHOW THE CODE! Or send me
> a crystal ball and I'll be happy to solve your races for you.

Crystal balls would be nice. I'll see if I can get it down to something
sendable. I can confirm the above results since I tried them again. After
about 1.2GB of transfers, one cpu ended up not listening to NMI and the
other was stuck in a spinlock (__down_writelock_failed, from memory),
having called my request fn from the generic_unplug_device function;
the request fn was in turn trying to take the write spinlock on the
device's private request queue. The spinlocks aren't around sections
of code that can sleep.
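
For reference, the unplug path looks roughly like this (paraphrased
from memory of ll_rw_blk.c, so treat it as a sketch). Note that the
request function is entered under io_request_lock with local
interrupts off.

    /* paraphrased from 2.4 ll_rw_blk.c, from memory -- not verbatim */
    static void generic_unplug_device(void *data)
    {
        request_queue_t *q = (request_queue_t *) data;
        unsigned long flags;

        spin_lock_irqsave(&io_request_lock, flags);
        if (q->plugged) {
            q->plugged = 0;
            if (!list_empty(&q->queue_head))
                q->request_fn(q);    /* i.e. my request function */
        }
        spin_unlock_irqrestore(&io_request_lock, flags);
    }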

Peter

2001-07-31 18:46:21

by Peter T. Breuer

Subject: Re: what's the semaphore in requests for?

"ptb wrote:"
> "Jens Axboe wrote:"
> > > > The reason I ask is that I've been chasing an smp bug in a block driver
> > > > of mine for a week. The bug only shows up in 2.4 kernels (not in the same
> > > > code under 2.2.18) and only with smp ("nosmp" squashes it). It only
> > > 2 processors + 1 userspace helper daemon on device = no bug
> > > 2 processors + 2 userspace helper daemons on device = bug (lockup)
> > > 1 processor + 1 userspace helper daemon on device = no bug
> > > 1 processor + 2 userspace helper daemons on device = no bug
> > And I'll restate here what I said then too -- SHOW THE CODE! Or send me
> > a crystal ball and I'll be happy to solve your races for you.

Let me try this question:

Can the device request function be called from an interrupt?

(and is this newish?). I'm talking about when the plug is released.

All would be explained if the private spinlock were taken by the
request function in interrupt context while it was already held by the
ioctl functions that the userspace daemons use to transfer data
from the private queue up to userspace and back.
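
Schematically, the feared deadlock is:

    /*
     * cpu0, process context (daemon's ioctl):
     *     read_lock(&dev->queue_spinlock);
     *     ... interrupt arrives on this cpu ...
     *
     * cpu0, interrupt context (request fn via unplug):
     *     write_lock(&dev->queue_spinlock);   <-- spins forever,
     *                                             cpu0 is dead
     */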

I thought the request function ran as requests were added to the
queue, which comes from pressure from the block layer.

> static void do_request(request_queue_t *q)
> {
>     struct request *req;    /* CURRENT yields a struct request * */
>
>     while (!QUEUE_EMPTY) {
>         struct mydevice *dev;
>
>         req = CURRENT;
>         dev = &dev_array[MINOR(req->rq_dev) >> SHIFT];
>         blkdev_dequeue_request(req);
>         write_lock(&dev->queue_spinlock);
>         // transfer req to the private queue
>         list_add(&req->queue, &dev->queue);
>         write_unlock(&dev->queue_spinlock);
>         // notify listeners on the wait queue
>         wake_up_interruptible(&dev->wq);
>     }
> }

I now think the request function runs with interrupts disabled locally,
so the raw spinlock access is OK here. But it wasn't OK in the
ioctl functions? ...

> int
> get_req (struct slot *slot, char *buffer)
> {
>     struct dev_request request;
>     struct request *req;
>     int result = 0;
>     unsigned start_time = jiffies;
>     struct mydevice *dev = slot->dev;
>     unsigned timeout = dev->req_timeo * HZ;
>     extern struct timezone sys_tz;
>
>     down (&dev->queue_lock);
>     read_lock(&dev->queue_spinlock);
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

since maybe the request function could have run in the context of an
interrupt while this spinlock was held, which deadlocks the cpu?

But then why doesn't the problem show itself with just one daemon
running on a 2-cpu machine?

I've run more tests, and using irqsave on the private spinlock everywhere
seems to cure all ills. But I'm still very hazy as to what is going on.
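
Concretely, the cure is to take the private lock like this everywhere
outside the request function itself (sketch):

    /* hold the private queue lock with local interrupts off, so the
     * request function cannot run on this cpu while we hold it */
    unsigned long flags;

    read_lock_irqsave(&dev->queue_spinlock, flags);
    /* ... copy the request's buffers up to userspace ... */
    read_unlock_irqrestore(&dev->queue_spinlock, flags);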


Peter