Hi Pavel,
we use nbd for our diskless systems, and it looks to me like
it has some serious problems in 2.5.x... Can you apply this patch
and forward it to Linus?

There were:
* Missing disk's queue initialization
* Driver should use list_del_init: put_request now verifies
  that req->queuelist is empty, and list_del was incompatible
  with this.
* I converted nbd_end_request back to end_that_request_{first,last}
  as I saw no reason why the driver should do it itself... and
  blk_put_request has no place under queue_lock, so apparently when
  the semantics changed nobody went through the drivers...
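
The list_del_init point can be illustrated with a minimal userspace sketch. This is not kernel code: the struct and the helpers unlink_only() and unlink_and_reinit() are made up for this example and only mimic what list_del and list_del_init do to the removed entry's own pointers. The emptiness check stands in for the kind of check put_request is described as doing on req->queuelist.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* An entry counts as "empty" when it points back at itself. */
static int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Mimics list_del(): unlink, but leave the entry's own pointers stale. */
static void unlink_only(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Mimics list_del_init(): unlink, then make the entry self-linked again. */
static void unlink_and_reinit(struct list_head *entry)
{
	unlink_only(entry);
	init_list_head(entry);
}

/* Returns 1 if an entry removed with the chosen helper passes the emptiness check. */
static int removed_entry_looks_empty(int use_reinit)
{
	struct list_head queue, req;

	init_list_head(&queue);
	init_list_head(&req);
	add_tail(&req, &queue);

	if (use_reinit)
		unlink_and_reinit(&req);
	else
		unlink_only(&req);

	assert(list_is_empty(&queue));	/* the queue itself is empty either way */
	return list_is_empty(&req);
}
```

In this model, removed_entry_looks_empty(0) reports that the entry still looks linked, while removed_entry_looks_empty(1) reports it as empty, which is why a later "is the queuelist empty?" check only passes after the re-initialising variant.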
Thanks,
Petr Vandrovec
diff -urdN linux/drivers/block/nbd.c linux/drivers/block/nbd.c
--- linux/drivers/block/nbd.c 2003-02-28 20:56:05.000000000 +0100
+++ linux/drivers/block/nbd.c 2003-03-01 22:53:36.000000000 +0100
@@ -76,22 +76,15 @@
{
int uptodate = (req->errors == 0) ? 1 : 0;
request_queue_t *q = req->q;
- struct bio *bio;
- unsigned nsect;
unsigned long flags;
#ifdef PARANOIA
requests_out++;
#endif
spin_lock_irqsave(q->queue_lock, flags);
- while((bio = req->bio) != NULL) {
- nsect = bio_sectors(bio);
- blk_finished_io(nsect);
- req->bio = bio->bi_next;
- bio->bi_next = NULL;
- bio_endio(bio, nsect << 9, uptodate ? 0 : -EIO);
+ if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
+ end_that_request_last(req);
}
- blk_put_request(req);
spin_unlock_irqrestore(q->queue_lock, flags);
}
@@ -243,7 +236,7 @@
req = list_entry(tmp, struct request, queuelist);
if (req != xreq)
continue;
- list_del(&req->queuelist);
+ list_del_init(&req->queuelist);
spin_unlock(&lo->queue_lock);
return req;
}
@@ -322,7 +315,7 @@
spin_lock(&lo->queue_lock);
if (!list_empty(&lo->queue_head)) {
req = list_entry(lo->queue_head.next, struct request, queuelist);
- list_del(&req->queuelist);
+ list_del_init(&req->queuelist);
}
spin_unlock(&lo->queue_lock);
if (req) {
@@ -387,7 +380,7 @@
if (req->errors) {
printk(KERN_ERR "nbd: nbd_send_req failed\n");
spin_lock(&lo->queue_lock);
- list_del(&req->queuelist);
+ list_del_init(&req->queuelist);
spin_unlock(&lo->queue_lock);
nbd_end_request(req);
spin_lock_irq(q->queue_lock);
@@ -592,6 +585,7 @@
disk->first_minor = i;
disk->fops = &nbd_fops;
disk->private_data = &nbd_dev[i];
+ disk->queue = &nbd_queue;
sprintf(disk->disk_name, "nbd%d", i);
set_capacity(disk, 0x3ffffe);
add_disk(disk);
Hi!
> we use nbd for our diskless systems, and it looks to me like
> it has some serious problems in 2.5.x... Can you apply this patch
> and forward it to Linus?
>
> There were:
> * Missing disk's queue initialization
> * Driver should use list_del_init: put_request now verifies
> that req->queuelist is empty, and list_del was incompatible
> with this.
> * I converted nbd_end_request back to end_that_request_{first,last}
> as I saw no reason why the driver should do it itself... and
> blk_put_request has no place under queue_lock, so apparently when
> the semantics changed nobody went through the drivers...
I do not think this is a good idea. I am not sure who converted it to
bio, but he surely had a good reason to do that.
> diff -urdN linux/drivers/block/nbd.c linux/drivers/block/nbd.c
> --- linux/drivers/block/nbd.c 2003-02-28 20:56:05.000000000 +0100
> +++ linux/drivers/block/nbd.c 2003-03-01 22:53:36.000000000 +0100
> @@ -76,22 +76,15 @@
> {
> int uptodate = (req->errors == 0) ? 1 : 0;
> request_queue_t *q = req->q;
> - struct bio *bio;
> - unsigned nsect;
> unsigned long flags;
>
> #ifdef PARANOIA
> requests_out++;
> #endif
> spin_lock_irqsave(q->queue_lock, flags);
> - while((bio = req->bio) != NULL) {
> - nsect = bio_sectors(bio);
> - blk_finished_io(nsect);
> - req->bio = bio->bi_next;
> - bio->bi_next = NULL;
> - bio_endio(bio, nsect << 9, uptodate ? 0 : -EIO);
> + if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
> + end_that_request_last(req);
> }
> - blk_put_request(req);
> spin_unlock_irqrestore(q->queue_lock, flags);
> }
>
--
Horseback riding is like software...
...vgf orggre jura vgf serr.
On 3 Mar 03 at 19:39, Pavel Machek wrote:
> > we use nbd for our diskless systems, and it looks to me like
> > it has some serious problems in 2.5.x... Can you apply this patch
> > and forward it to Linus?
> >
> > There were:
> > * Missing disk's queue initialization
> > * Driver should use list_del_init: put_request now verifies
> > that req->queuelist is empty, and list_del was incompatible
> > with this.
> > * I converted nbd_end_request back to end_that_request_{first,last}
> > as I saw no reason why the driver should do it itself... and
> > blk_put_request has no place under queue_lock, so apparently when
> > the semantics changed nobody went through the drivers...
>
> I do not think this is a good idea. I am not sure who converted it to
> bio, but he surely had a good reason to do that.
I think that at the beginning of the 2.5.x series there was some thinking
about removing end_that_request* completely from the API. As that never
happened, and __end_that_request_first()/end_that_request_last() are
definitely of better quality (for example, they do not ignore req->waiting...)
than the open-coded nbd loop, I prefer using end_that_request* over
open-coding the bio traversal.

If you want, then just replace blk_put_request() with __blk_put_request()
instead of the first change. But I personally would not trust such code, as
the next time something in bio changes, nbd will miss the change again.
Petr Vandrovec
> > diff -urdN linux/drivers/block/nbd.c linux/drivers/block/nbd.c
> > --- linux/drivers/block/nbd.c 2003-02-28 20:56:05.000000000 +0100
> > +++ linux/drivers/block/nbd.c 2003-03-01 22:53:36.000000000 +0100
> > @@ -76,22 +76,15 @@
> > {
> > int uptodate = (req->errors == 0) ? 1 : 0;
> > request_queue_t *q = req->q;
> > - struct bio *bio;
> > - unsigned nsect;
> > unsigned long flags;
> >
> > #ifdef PARANOIA
> > requests_out++;
> > #endif
> > spin_lock_irqsave(q->queue_lock, flags);
> > - while((bio = req->bio) != NULL) {
> > - nsect = bio_sectors(bio);
> > - blk_finished_io(nsect);
> > - req->bio = bio->bi_next;
> > - bio->bi_next = NULL;
> > - bio_endio(bio, nsect << 9, uptodate ? 0 : -EIO);
> > + if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
> > + end_that_request_last(req);
> > }
> > - blk_put_request(req);
> > spin_unlock_irqrestore(q->queue_lock, flags);
> > }
> >
On Mon, Mar 03 2003, Petr Vandrovec wrote:
> On 3 Mar 03 at 19:39, Pavel Machek wrote:
> > > we use nbd for our diskless systems, and it looks to me like
> > > it has some serious problems in 2.5.x... Can you apply this patch
> > > and forward it to Linus?
> > >
> > > There were:
> > > * Missing disk's queue initialization
> > > * Driver should use list_del_init: put_request now verifies
> > > that req->queuelist is empty, and list_del was incompatible
> > > with this.
> > > * I converted nbd_end_request back to end_that_request_{first,last}
> > > as I saw no reason why the driver should do it itself... and
> > > blk_put_request has no place under queue_lock, so apparently when
> > > the semantics changed nobody went through the drivers...
> >
> > I do not think this is a good idea. I am not sure who converted it to
> > bio, but he surely had a good reason to do that.
>
> I think that at the beginning of the 2.5.x series there was some thinking
> about removing end_that_request* completely from the API. As that never
> happened, and __end_that_request_first()/end_that_request_last() are
> definitely of better quality (for example, they do not ignore req->waiting...)
> than the open-coded nbd loop, I prefer using end_that_request* over
> open-coding the bio traversal.
>
> If you want, then just replace blk_put_request() with __blk_put_request()
> instead of the first change. But I personally would not trust such code, as
> the next time something in bio changes, nbd will miss the change again.
I agree with the change, there's no reason for nbd to implement its own
end_request handling. I was the one to do the bio conversion, doing XXX
drivers at one time...
A small correction to your patch, you need not hold queue_lock when
calling end_that_request_first() (which is the costly part of ending a
request), so

	if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
		unsigned long flags;

		spin_lock_irqsave(q->queue_lock, flags);
		end_that_request_last(req);
		spin_unlock_irqrestore(q->queue_lock, flags);
	}

would be enough. That depends on the driver having pulled the request
off the list in the first place, which nbd has.
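
The locking rule described here can be modelled in plain userspace C. The sketch below is only an illustration under stated assumptions: mock_queue, mock_request and every mock_* function are hypothetical stand-ins, not the block layer's types or API. It encodes two assumptions taken from the text above: the "first" step needs no queue lock once the driver has unlinked the request, and the "last" step must run with the lock held.

```c
#include <assert.h>

/* Hypothetical stand-ins; not the kernel's types or functions. */
struct mock_queue {
	int lock_held;
	int lock_acquisitions;
};

struct mock_request {
	struct mock_queue *q;
	unsigned int sectors_left;
	int on_list;	/* 1 while still linked on the driver's list */
	int finished;
};

static void mock_lock(struct mock_queue *q)
{
	assert(!q->lock_held);
	q->lock_held = 1;
	q->lock_acquisitions++;
}

static void mock_unlock(struct mock_queue *q)
{
	assert(q->lock_held);
	q->lock_held = 0;
}

/*
 * Completion work on the request's data. It touches only the request,
 * so in this model it is safe without the lock as long as the driver
 * already unlinked the request. Returns 0 when nothing is left.
 */
static int mock_end_first(struct mock_request *req, unsigned int nsect)
{
	assert(!req->on_list);
	if (nsect > req->sectors_left)
		nsect = req->sectors_left;
	req->sectors_left -= nsect;
	return req->sectors_left != 0;
}

/* Final step gives the request back to the queue, so the lock must be held. */
static void mock_end_last(struct mock_request *req)
{
	assert(req->q->lock_held);
	req->finished = 1;
}

/* Whole-request completion with the lock taken only around the last step. */
static void mock_nbd_end_request(struct mock_request *req)
{
	if (!mock_end_first(req, req->sectors_left)) {
		mock_lock(req->q);
		mock_end_last(req);
		mock_unlock(req->q);
	}
}
```

Completing an eight-sector request that is already off the list takes the mock lock exactly once, and only for the final step.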
Also, it looks like it would be much better to simply let the queue lock
for an nbd_device be inherited from nbd_device->lo_lock.
--
Jens Axboe
On Wed, Mar 05 2003, Petr Vandrovec wrote:
> On 5 Mar 03 at 10:21, Jens Axboe wrote:
> > On Mon, Mar 03 2003, Petr Vandrovec wrote:
> > > On 3 Mar 03 at 19:39, Pavel Machek wrote:
> > >
> > > I think that at the beginning of the 2.5.x series there was some thinking
> > > about removing end_that_request* completely from the API. As that never
> > > happened, and __end_that_request_first()/end_that_request_last() are
> > > definitely of better quality (for example, they do not ignore req->waiting...)
> > > than the open-coded nbd loop, I prefer using end_that_request* over
> > > open-coding the bio traversal.
> > >
> > > If you want, then just replace blk_put_request() with __blk_put_request()
> > > instead of the first change. But I personally would not trust such code, as
> > > the next time something in bio changes, nbd will miss the change again.
> >
> > I agree with the change, there's no reason for nbd to implement its own
> > end_request handling. I was the one to do the bio conversion, doing XXX
> > drivers at one time...
> >
> > A small correction to your patch, you need not hold queue_lock when
> > calling end_that_request_first() (which is the costly part of ending a
> > request), so
> >
> > if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
> > unsigned long flags;
> >
> > spin_lock_irqsave(q->queue_lock, flags);
> > end_that_request_last(req);
> > spin_unlock_irqrestore(q->queue_lock, flags);
> > }
> >
> > would be enough. That depends on the driver having pulled the request
> > off the list in the first place, which nbd has.
>
> But it also finishes the whole request at once, so probably with:
>
> if (!end_that_request_first(...)) {
> ...
> } else {
> BUG();
> }
Sure
> I had a patch for 2.5.3 which finished the request partially after each chunk
> (usually 1500 bytes) received from the server, but it did not make any
> difference in performance at that time (probably because of the way the
> nbd server works and the speed of the network between server and client). I'll
> try it again now...
Yes, that might still make sense, especially now since we actually pass
down partially completed chunks. But the bio end_io must support it, or
you will see no difference at all. And I don't think any of them do :).
Linus played with adding it to the multi-page fs helpers, but I think he
abandoned it. Should make larger read-aheads on slow media (floppy) work
a lot nicer, though.
--
Jens Axboe
On 5 Mar 03 at 10:21, Jens Axboe wrote:
> On Mon, Mar 03 2003, Petr Vandrovec wrote:
> > On 3 Mar 03 at 19:39, Pavel Machek wrote:
> >
> > I think that at the beginning of the 2.5.x series there was some thinking
> > about removing end_that_request* completely from the API. As that never
> > happened, and __end_that_request_first()/end_that_request_last() are
> > definitely of better quality (for example, they do not ignore req->waiting...)
> > than the open-coded nbd loop, I prefer using end_that_request* over
> > open-coding the bio traversal.
> >
> > If you want, then just replace blk_put_request() with __blk_put_request()
> > instead of the first change. But I personally would not trust such code, as
> > the next time something in bio changes, nbd will miss the change again.
>
> I agree with the change, there's no reason for nbd to implement its own
> end_request handling. I was the one to do the bio conversion, doing XXX
> drivers at one time...
>
> A small correction to your patch, you need not hold queue_lock when
> calling end_that_request_first() (which is the costly part of ending a
> request), so
>
> if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
> unsigned long flags;
>
> spin_lock_irqsave(q->queue_lock, flags);
> end_that_request_last(req);
> spin_unlock_irqrestore(q->queue_lock, flags);
> }
>
> would be enough. That depends on the driver having pulled the request
> off the list in the first place, which nbd has.
But it also finishes the whole request at once, so probably with:

	if (!end_that_request_first(...)) {
		...
	} else {
		BUG();
	}

I had a patch for 2.5.3 which finished the request partially after each chunk
(usually 1500 bytes) received from the server, but it did not make any
difference in performance at that time (probably because of the way the
nbd server works and the speed of the network between server and client). I'll
try it again now...
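
As a rough idea of what per-chunk completion means, here is a small userspace model. It is not the 2.5.3 patch mentioned above and not kernel code: mock_request and the mock_* functions are hypothetical names invented for this sketch. It assumes the return convention used throughout this thread (0 means the request is done, nonzero means sectors are still pending) and that only whole 512-byte sectors can be completed, so leftover bytes from a chunk are carried over to the next one.

```c
#include <assert.h>

#define SECTOR_SIZE 512u

/* Hypothetical userspace model of a request; not the kernel's struct request. */
struct mock_request {
	unsigned int sectors_left;	/* sectors not yet completed */
	unsigned int bytes_pending;	/* received bytes not yet forming a full sector */
	unsigned int completions;	/* times the whole request was finished */
};

/* Return convention assumed here: 0 = fully completed, 1 = sectors pending. */
static int mock_end_first(struct mock_request *req, unsigned int nsect)
{
	if (nsect > req->sectors_left)
		nsect = req->sectors_left;
	req->sectors_left -= nsect;
	return req->sectors_left != 0;
}

static void mock_end_last(struct mock_request *req)
{
	req->completions++;
}

/* Account for one network chunk; returns 1 once the request is finished. */
static int mock_chunk_received(struct mock_request *req, unsigned int bytes)
{
	unsigned int nsect;

	if (req->sectors_left == 0)
		return 1;		/* already finished, nothing to do */

	req->bytes_pending += bytes;
	nsect = req->bytes_pending / SECTOR_SIZE;
	req->bytes_pending %= SECTOR_SIZE;

	if (nsect == 0)
		return 0;		/* not even one full sector yet */
	if (mock_end_first(req, nsect))
		return 0;		/* more sectors outstanding */

	mock_end_last(req);
	return 1;
}
```

For an eight-sector (4096-byte) request fed with chunks of 1500, 1500 and 1096 bytes, the model completes two sectors, then three, then the final three, and calls the "last" step exactly once.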
Petr Vandrovec