2007-01-03 08:09:27

by Andrew Morton

Subject: Re: [PATCH] 4/4 block: explicit plugging

On Wed, 3 Jan 2007 08:48:28 +0100
Jens Axboe <[email protected]> wrote:

> This is a patch to perform block device plugging explicitly in the submitting
> process context rather than implicitly by the block device.

I don't think anyone will regret the passing of address_space_operations.sync_page().

Do you have any benchmarks which got faster with these changes?


2007-01-03 08:19:16

by Jens Axboe

Subject: Re: [PATCH] 4/4 block: explicit plugging

On Wed, Jan 03 2007, Andrew Morton wrote:
> On Wed, 3 Jan 2007 08:48:28 +0100
> Jens Axboe <[email protected]> wrote:
>
> > This is a patch to perform block device plugging explicitly in the submitting
> > process context rather than implicitly by the block device.
>
> I don't think anyone will regret the passing of
> address_space_operations.sync_page().

Hardly :-)

> Do you have any benchmarks which got faster with these changes?

On the hardware I have immediately available, I see no regressions wrt
performance. With instrumentation it's simple to demonstrate that most
of the queueing activity of an io heavy benchmark spends less time in
the kernel (most merging activity takes place outside of the queue lock,
hence queueing is lock free).

I've asked Ken to run this series on some of his big iron, I hope he'll
have some results for us soonish. I can run some pseudo benchmarks on a
4-way here with some simulated storage to demonstrate the locking
improvements.

I don't see 3/4 and 4/4 on lkml yet, I wonder if they got lost.

--
Jens Axboe

2007-01-03 21:51:27

by Chen, Kenneth W

Subject: RE: [PATCH] 4/4 block: explicit plugging

Jens Axboe wrote on Wednesday, January 03, 2007 12:22 AM
> > Do you have any benchmarks which got faster with these changes?
>
> On the hardware I have immediately available, I see no regressions wrt
> performance. With instrumentation it's simple to demonstrate that most
> of the queueing activity of an io heavy benchmark spends less time in
> the kernel (most merging activity takes place outside of the queue lock,
> hence queueing is lock free).
>
> I've asked Ken to run this series on some of his big iron, I hope he'll
> have some results for us soonish.

We are having some trouble with the patch set: some of our fiber channel
host controllers don't initialize properly anymore, and we thus lost a
whole bunch of disks (somewhere around 200 disks out of 900) at boot time.
Presumably FC loop initialization commands are done through the block
layer etc. I haven't looked into the problem closely.

Jens, I assume the spin lock bug in __blk_run_queue is fixed in this
patch set?

- Ken

2007-01-03 22:26:42

by Jens Axboe

Subject: Re: [PATCH] 4/4 block: explicit plugging

On Wed, Jan 03 2007, Chen, Kenneth W wrote:
> Jens Axboe wrote on Wednesday, January 03, 2007 12:22 AM
> > > Do you have any benchmarks which got faster with these changes?
> >
> > On the hardware I have immediately available, I see no regressions wrt
> > performance. With instrumentation it's simple to demonstrate that most
> > of the queueing activity of an io heavy benchmark spends less time in
> > the kernel (most merging activity takes place outside of the queue lock,
> > hence queueing is lock free).
> >
> > I've asked Ken to run this series on some of his big iron, I hope he'll
> > have some results for us soonish.
>
> We are having some trouble with the patch set: some of our fiber channel
> host controllers don't initialize properly anymore, and we thus lost a
> whole bunch of disks (somewhere around 200 disks out of 900) at boot time.
> Presumably FC loop initialization commands are done through the block
> layer etc. I haven't looked into the problem closely.
>
> Jens, I assume the spin lock bug in __blk_run_queue is fixed in this
> patch set?

It is. Are you still seeing problems after the initial mail exchange we
had prior to christmas, or are you referencing that initial problem?

It's not likely to be a block layer issue, more likely the SCSI <->
block interactions. If you mail me a new dmesg (if your problem is with
the __blk_run_queue() fixups), I can take a look. Otherwise please do
test with the __blk_run_queue() fixup, just use the current patchset.

--
Jens Axboe

2007-01-03 22:34:44

by Chen, Kenneth W

Subject: RE: [PATCH] 4/4 block: explicit plugging

Jens Axboe wrote on Wednesday, January 03, 2007 2:30 PM
> > We are having some trouble with the patch set: some of our fiber channel
> > host controllers don't initialize properly anymore, and we thus lost a
> > whole bunch of disks (somewhere around 200 disks out of 900) at boot time.
> > Presumably FC loop initialization commands are done through the block
> > layer etc. I haven't looked into the problem closely.
> >
> > Jens, I assume the spin lock bug in __blk_run_queue is fixed in this
> > patch set?
>
> It is. Are you still seeing problems after the initial mail exchange we
> had prior to christmas,

Yes. Not the same kernel panic, but a problem with FC loop reset itself.


> or are you referencing that initial problem?

No, we got past that point, thanks to the bug fix patch you gave me
prior to Christmas. That fixed a kernel panic on boot up.


> It's not likely to be a block layer issue, more likely the SCSI <->
> block interactions. If you mail me a new dmesg (if your problem is with
> the __blk_run_queue() fixups), I can take a look. Otherwise please do
> test with the __blk_run_queue() fixup, just use the current patchset.

I will just retake the tip of your plug tree and retest.

2007-01-04 14:36:10

by Jens Axboe

Subject: Re: [PATCH] 4/4 block: explicit plugging

On Wed, Jan 03 2007, Chen, Kenneth W wrote:
> Jens Axboe wrote on Wednesday, January 03, 2007 2:30 PM
> > > We are having some trouble with the patch set: some of our fiber channel
> > > host controllers don't initialize properly anymore, and we thus lost a
> > > whole bunch of disks (somewhere around 200 disks out of 900) at boot time.
> > > Presumably FC loop initialization commands are done through the block
> > > layer etc. I haven't looked into the problem closely.
> > >
> > > Jens, I assume the spin lock bug in __blk_run_queue is fixed in this
> > > patch set?
> >
> > It is. Are you still seeing problems after the initial mail exchange we
> > had prior to christmas,
>
> Yes. Not the same kernel panic, but a problem with FC loop reset itself.
>
>
> > or are you referencing that initial problem?
>
> No, we got past that point, thanks to the bug fix patch you gave me
> prior to Christmas. That fixed a kernel panic on boot up.
>
>
> > It's not likely to be a block layer issue, more likely the SCSI <->
> > block interactions. If you mail me a new dmesg (if your problem is with
> > the __blk_run_queue() fixups), I can take a look. Otherwise please do
> > test with the __blk_run_queue() fixup, just use the current patchset.
>
> I will just retake the tip of your plug tree and retest.

That would be great! There's a busy race fixed in the current branch;
make sure that one is included as well.

From 9174fea2184187209b1f851137bd1612728fae2c Mon Sep 17 00:00:00 2001
From: Jens Axboe <[email protected]>
Date: Thu, 4 Jan 2007 10:42:33 +0100
Subject: [PATCH] scsi: race in checking sdev->device_busy

Save some code, create a new out label for the path that already checks
the busy count and delays the queue if necessary.

Signed-off-by: Jens Axboe <[email protected]>
---
drivers/scsi/scsi_lib.c | 11 ++++-------
1 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index fce5e2f..3ffa35d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1509,12 +1509,9 @@ static void scsi_request_fn(struct request_queue *q)
* Dispatch the command to the low-level driver.
*/
rtn = scsi_dispatch_cmd(cmd);
- if (rtn) {
- if (sdev->device_busy == 0)
- blk_delay_queue(q, SCSI_QUEUE_DELAY);
- goto out_nolock;
- }
spin_lock_irq(q->queue_lock);
+ if (rtn)
+ goto out_delay;
}

goto out;
@@ -1533,13 +1530,13 @@ static void scsi_request_fn(struct request_queue *q)
spin_lock_irq(q->queue_lock);
blk_requeue_request(q, req);
sdev->device_busy--;
+out_delay:
if (sdev->device_busy == 0)
blk_delay_queue(q, SCSI_QUEUE_DELAY);
- out:
+out:
/* must be careful here...if we trigger the ->remove() function
* we cannot be holding the q lock */
spin_unlock_irq(q->queue_lock);
- out_nolock:
put_device(&sdev->sdev_gendev);
spin_lock_irq(q->queue_lock);
}
--
1.5.0.rc0.gd222


--
Jens Axboe

2007-01-05 22:04:20

by Chen, Kenneth W

Subject: RE: [PATCH] 4/4 block: explicit plugging

Andrew Morton wrote on Wednesday, January 03, 2007 12:09 AM
> Do you have any benchmarks which got faster with these changes?

Jens Axboe wrote on Wednesday, January 03, 2007 12:22 AM
> I've asked Ken to run this series on some of his big iron, I hope he'll
> have some results for us soonish. I can run some pseudo benchmarks on a
> 4-way here with some simulated storage to demonstrate the locking
> improvements.

Jens Axboe wrote on Thursday, January 04, 2007 6:39 AM
> > I will just retake the tip of your plug tree and retest.
>
> That would be great! There's a busy race fixed in the current branch,
> make sure that one is included as well.


Good news: the tip of the plug tree fixed the FC loop reset issue we were
seeing earlier.

Performance-wise, our big db benchmark run came out with a 0.14% regression
compared to 2.6.20-rc2. It is small enough that we declared it a noise-level
change. Unfortunately our internal profiling tool broke on 2.6.20-rc2, so I
don't have an execution profile to post.

- Ken