Peter, Paul,
The original message (below) didn't get any response.
Could you elaborate on the impact of this NBD issue? Based on your
description, having requests just sit on the queue would _seem_ to be
quite serious. Is this purely a bookkeeping issue or is there a more
subtle side-effect of this inaccurate request count?
You stated that your patch fixes this issue "not in the proper way".
If a fix is trully needed what is the proper way?
please advise, thanks.
Mike
On 4/29/07, Peter Zijlstra <[email protected]> wrote:
> On Sun, 2007-04-29 at 21:41 +0200, Rogier Wolff wrote:
> > On Tue, Apr 17, 2007 at 10:37:38PM -0700, Andrew Morton wrote:
> > > Florin, can we please see /proc/meminfo as well?
> > >
> > > Also the result of `echo m > /proc/sysrq-trigger'
> >
> > Hi,
> >
> > It's been a while since this thread died out, but maybe I'm
> > having the same problem. Networking, large part of memory is
> > buffering writes.....
> >
> > In my case I'm using NBD.
> >
> > Oh,
> >
> > /sys/block/nbd0/stat gives:
> > 636 88 5353 1700 991 19554 162272 63156 43 1452000 61802352
> > I put some debugging stuff in nbd, and it DOES NOT KNOW about the
> > 43 requests that the io scheduler claims are in flight at the
> > driver....
>
> AFAIK nbd is a tad broken; the following patch used to fix it, although
> not in the proper way. Hence it never got merged.
>
> There is a race where the plug state of the device queue gets confused,
> which causes requests to just sit on the queue, without further action.
>
> ---
>
> Subject: nbd: request_fn fixup
>
> Dropping the queue_lock opens up a nasty race, fix this race by
> plugging the device when we're done.
>
> Also includes a small cleanup.
>
> Signed-off-by: Peter Zijlstra <[email protected]>
> CC: Daniel Phillips <[email protected]>
> CC: Pavel Machek <[email protected]>
> ---
> drivers/block/nbd.c | 67 ++++++++++++++++++++++++++++++++++++++--------------
> 1 file changed, 49 insertions(+), 18 deletions(-)
>
> Index: linux-2.6/drivers/block/nbd.c
> ===================================================================
> --- linux-2.6.orig/drivers/block/nbd.c 2006-09-07 17:20:52.000000000 +0200
> +++ linux-2.6/drivers/block/nbd.c 2006-09-07 17:35:05.000000000 +0200
> @@ -97,20 +97,24 @@ static const char *nbdcmd_to_ascii(int c
> }
> #endif /* NDEBUG */
>
> -static void nbd_end_request(struct request *req)
> +static void __nbd_end_request(struct request *req)
> {
> int uptodate = (req->errors == 0) ? 1 : 0;
> - request_queue_t *q = req->q;
> - unsigned long flags;
>
> dprintk(DBG_BLKDEV, "%s: request %p: %s\n", req->rq_disk->disk_name,
> req, uptodate? "done": "failed");
>
> - spin_lock_irqsave(q->queue_lock, flags);
> - if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
> + if (!end_that_request_first(req, uptodate, req->nr_sectors))
> end_that_request_last(req, uptodate);
> - }
> - spin_unlock_irqrestore(q->queue_lock, flags);
> +}
> +
> +static void nbd_end_request(struct request *req)
> +{
> + request_queue_t *q = req->q;
> +
> + spin_lock_irq(q->queue_lock);
> + __nbd_end_request(req);
> + spin_unlock_irq(q->queue_lock);
> }
>
> /*
> @@ -435,10 +439,8 @@ static void do_nbd_request(request_queue
> mutex_unlock(&lo->tx_lock);
> printk(KERN_ERR "%s: Attempted send on closed socket\n",
> lo->disk->disk_name);
> - req->errors++;
> - nbd_end_request(req);
> spin_lock_irq(q->queue_lock);
> - continue;
> + goto error_out;
> }
>
> lo->active_req = req;
> @@ -463,10 +465,13 @@ static void do_nbd_request(request_queue
>
> error_out:
> req->errors++;
> - spin_unlock(q->queue_lock);
> - nbd_end_request(req);
> - spin_lock(q->queue_lock);
> + __nbd_end_request(req);
> }
> + /*
> + * q->queue_lock has been dropped, this opens up a race
> + * plug the device to close it.
> + */
> + blk_plug_device(q);
> return;
> }
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>