LinuxLists.cc - smp dead lock of io_request_lock/queue

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 2004-01-12 at 17:32, Peter Yao wrote:
> Hi,
> I found a smp dead lock in io_request_lock/queue_lock patch in redhat's
> 2.4.18-4 kernel.

1) Bugs in vendor kernels are best reported to the vendor, in this case
in http://bugzilla.redhat.com
2) 2.4.18-4 is really old and has been superceeded by updates a few
dozen times
3) 2.4.18-4 has both remote and local security issues and datacorruption
issues

Attachments:

signature.asc (189.00 B)
This is a digitally signed message part

2004-01-12 09:19:30

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12 2004, Arjan van de Ven wrote:
> On Mon, 2004-01-12 at 17:32, Peter Yao wrote:
> > Hi,
> > I found a smp dead lock in io_request_lock/queue_lock patch in redhat's
> > 2.4.18-4 kernel.
>
> 1) Bugs in vendor kernels are best reported to the vendor, in this case
> in http://bugzilla.redhat.com
> 2) 2.4.18-4 is really old and has been superceeded by updates a few
> dozen times
> 3) 2.4.18-4 has both remote and local security issues and datacorruption
> issues

4) The problem reported and patch looks real, though, maybe a little
credit and thanks would be appropriate as well?

--
Jens Axboe

2004-01-12 09:19:52

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12 2004, Jens Axboe wrote:
> On Mon, Jan 12 2004, Arjan van de Ven wrote:
> > On Mon, 2004-01-12 at 17:32, Peter Yao wrote:
> > > Hi,
> > > I found a smp dead lock in io_request_lock/queue_lock patch in redhat's
> > > 2.4.18-4 kernel.
> >
> > 1) Bugs in vendor kernels are best reported to the vendor, in this case
> > in http://bugzilla.redhat.com
> > 2) 2.4.18-4 is really old and has been superceeded by updates a few
> > dozen times
> > 3) 2.4.18-4 has both remote and local security issues and datacorruption
> > issues
>
> 4) The problem reported and patch looks real, though, maybe a little
> credit and thanks would be appropriate as well?

... and still exists in your 2.4.21 based kernel.

--
Jens Axboe

2004-01-12 09:22:41

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12 2004, Arjan van de Ven wrote:
> On Mon, Jan 12, 2004 at 10:19:46AM +0100, Jens Axboe wrote:
> >
> > ... and still exists in your 2.4.21 based kernel.
>
> The RHL 2.4.21 kernels don't have the locking patch at all...

But RHEL3 does:

http://kernelnewbies.org/kernels/rhel3/SOURCES/linux-2.4.21-iorl.patch

and the bug is there.

--
Jens Axboe

2004-01-12 09:21:34

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12, 2004 at 10:19:46AM +0100, Jens Axboe wrote:
>
> ... and still exists in your 2.4.21 based kernel.

The RHL 2.4.21 kernels don't have the locking patch at all...

Attachments:

(No filename) (178.00 B)
(No filename) (189.00 B)
Download all attachments

2004-01-12 13:31:25

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 2004-01-12 at 04:22, Jens Axboe wrote:
> On Mon, Jan 12 2004, Arjan van de Ven wrote:
> > On Mon, Jan 12, 2004 at 10:19:46AM +0100, Jens Axboe wrote:
> > >
> > > ... and still exists in your 2.4.21 based kernel.
> >
> > The RHL 2.4.21 kernels don't have the locking patch at all...
>
> But RHEL3 does:
>
> http://kernelnewbies.org/kernels/rhel3/SOURCES/linux-2.4.21-iorl.patch
>
> and the bug is there.

But in RHEL3 the bug is fixed already (not in a released kernel, but the
fix went into our internal kernel some time back and will be in our next
update kernel). From my internal bk tree for this stuff:

[dledford@compaq RHEL3-scsi]$ bk changes -r1.23
[email protected], 2003-11-10 17:19:54-05:00, [email protected]
drivers/scsi/scsi_error.c
Don't panic if the eh thread is dead, instead do the same thing that
scsi_softirq_handler does and just complete the command as a failed
command.
Change when we wake the eh thread in scsi_times_out to accomodate
the changes to the mlqueue operations.
Clear blocked status on the host and all devices in scsi_restart_operations
-> Don't grab the host_lock in scsi_restart_operations, we aren't doing
anything that needs it. Just goose the queues unconditionally,
scsi_request_fn() will know to not send commands if they shouldn't
go for some reason.
Make sure we account SCSI_STATE_MLQUEUE commands as not being failed
commands in scsi_unjam_host.

But, Jens is right, it's a real bug. I just fixed it in a different
way. And my fix is dependent on other changes in our scsi stack as
well.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-12 14:08:58

by Martin Peschke3

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

Hi,

is there a way to merge all (or at least the common denominator) of
Redhat's and SuSE's changes into the vanilla 2.4 SCSI stack?
The SCSI part of Marcelo's kernel seems to be rather backlevel,
considering all those fixes and enhancements added by the mentioned
distributors and their SCSI experts. As this discussion underlines,
there are a lot of common problems and sometimes even common
approaches. I am convinced that a number of patches ought to be
incorporated into the mainline kernel. Though, I must admit
that I am at a loss with how this could be achieved given the
unresolved question of 2.4 SCSI maintenance
(which has certainly played a part in building up those piles
of SCSI patches).

Mit freundlichen Gr??en / with kind regards

Martin Peschke

IBM Deutschland Entwicklung GmbH
Linux for eServer Development
Phone: +49-(0)7031-16-2349

Doug Ledford <[email protected]>@vger.kernel.org on 12/01/2004 14:27:53

Sent by: [email protected]

To: Jens Axboe <[email protected]>
cc: Arjan Van de Ven <[email protected]>, Peter Yao
<[email protected]>, [email protected], linux-scsi
mailing list <[email protected]>
Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 2004-01-12 at 04:22, Jens Axboe wrote:
> On Mon, Jan 12 2004, Arjan van de Ven wrote:
> > On Mon, Jan 12, 2004 at 10:19:46AM +0100, Jens Axboe wrote:
> > >
> > > ... and still exists in your 2.4.21 based kernel.
> >
> > The RHL 2.4.21 kernels don't have the locking patch at all...
>
> But RHEL3 does:
>
> http://kernelnewbies.org/kernels/rhel3/SOURCES/linux-2.4.21-iorl.patch
>
> and the bug is there.

But in RHEL3 the bug is fixed already (not in a released kernel, but the
fix went into our internal kernel some time back and will be in our next
update kernel). From my internal bk tree for this stuff:

[dledford@compaq RHEL3-scsi]$ bk changes -r1.23
[email protected], 2003-11-10 17:19:54-05:00, [email protected]
drivers/scsi/scsi_error.c
Don't panic if the eh thread is dead, instead do the same thing that
scsi_softirq_handler does and just complete the command as a failed
command.
Change when we wake the eh thread in scsi_times_out to accomodate
the changes to the mlqueue operations.
Clear blocked status on the host and all devices in
scsi_restart_operations
-> Don't grab the host_lock in scsi_restart_operations, we aren't doing
anything that needs it. Just goose the queues unconditionally,
scsi_request_fn() will know to not send commands if they shouldn't
go for some reason.
Make sure we account SCSI_STATE_MLQUEUE commands as not being failed
commands in scsi_unjam_host.

But, Jens is right, it's a real bug. I just fixed it in a different
way. And my fix is dependent on other changes in our scsi stack as
well.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-12 14:12:05

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12, 2004 at 03:07:55PM +0100, Martin Peschke3 wrote:
> Hi,
>
> is there a way to merge all (or at least the common denominator) of
> Red Hat's and SuSE's changes into the vanilla 2.4 SCSI stack?

Since 2.4 is basically frozen, and said patches are only performance
optimisations and not functionality enhancements I would think this is a bad
idea; if you need the small performance gain, 2.6 is imo a far better place
nowadays. The 2.4 SCSI stack seems rather stable and destabilizing it this
late in the cycle sounds bad with the alternative of 2.6 being available.

Greetings,
Arjan van de Ven

Attachments:

(No filename) (613.00 B)
(No filename) (189.00 B)
Download all attachments

2004-01-12 14:14:26

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12 2004, Martin Peschke3 wrote:
> Hi,
>
> is there a way to merge all (or at least the common denominator) of
> Redhat's and SuSE's changes into the vanilla 2.4 SCSI stack?
> The SCSI part of Marcelo's kernel seems to be rather backlevel,
> considering all those fixes and enhancements added by the mentioned
> distributors and their SCSI experts. As this discussion underlines,
> there are a lot of common problems and sometimes even common
> approaches. I am convinced that a number of patches ought to be
> incorporated into the mainline kernel. Though, I must admit
> that I am at a loss with how this could be achieved given the
> unresolved question of 2.4 SCSI maintenance
> (which has certainly played a part in building up those piles
> of SCSI patches).

I have mixed feelings about that. One on side, I'd love to merge the
scalability patches in mainline. We've had a significant number of bugs
in this area in the past, and it seems a shame that we all have to fix
them independently because we deviate from mainline. On the other hand,
2.4 is pretty much closed. There wont be a big number of new distro 2.4
kernels.

Had you asked me 6 months ago I probably would have advocated merging
them, but right now I think it's a waste of time.

--
Jens Axboe

2004-01-12 15:08:45

by Martin Peschke3

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

Hi,

I understand that people tend to be careful with optimizations
or enhancements. In that regard, this particular patch is a
disputable example.

But, my request has been meant in a broader sense.
This iorl-patch is just one out of roughly a dozen on
both RH's side and SuSE's side. I have been asking
for getting together on the list and trying to match and merge
these piles of patches, which are possibly not all as disputable
as the patch discussed in this thread, i.e. pure (partially
vintage) bugfixes.

If people agree in that course also about a clean, common
iorl-patch, that would be another step forward, in my opinion.

Mit freundlichen Gr??en / with kind regards

Martin Peschke

IBM Deutschland Entwicklung GmbH
Linux for eServer Development
Phone: +49-(0)7031-16-2349

Jens Axboe <[email protected]>@vger.kernel.org on 12/01/2004 15:13:30

Sent by: [email protected]

To: Martin Peschke3/Germany/IBM@IBMDE
cc: Doug Ledford <[email protected]>, Arjan Van de Ven
<[email protected]>, Peter Yao <[email protected]>,
[email protected], linux-scsi mailing list
<[email protected]>
Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12 2004, Martin Peschke3 wrote:
> Hi,
>
> is there a way to merge all (or at least the common denominator) of
> Redhat's and SuSE's changes into the vanilla 2.4 SCSI stack?
> The SCSI part of Marcelo's kernel seems to be rather backlevel,
> considering all those fixes and enhancements added by the mentioned
> distributors and their SCSI experts. As this discussion underlines,
> there are a lot of common problems and sometimes even common
> approaches. I am convinced that a number of patches ought to be
> incorporated into the mainline kernel. Though, I must admit
> that I am at a loss with how this could be achieved given the
> unresolved question of 2.4 SCSI maintenance
> (which has certainly played a part in building up those piles
> of SCSI patches).

I have mixed feelings about that. One on side, I'd love to merge the
scalability patches in mainline. We've had a significant number of bugs
in this area in the past, and it seems a shame that we all have to fix
them independently because we deviate from mainline. On the other hand,
2.4 is pretty much closed. There wont be a big number of new distro 2.4
kernels.

Had you asked me 6 months ago I probably would have advocated merging
them, but right now I think it's a waste of time.

--
Jens Axboe

2004-01-12 15:13:41

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12, 2004 at 04:07:40PM +0100, Martin Peschke3 wrote:
> as the patch discussed in this thread, i.e. pure (partially
> vintage) bugfixes.

Both SuSE and Red Hat submit bugfixes they put in the respective trees to
marcelo already. There will not be many "pure bugfixes" that you can find in
vendor trees but not in marcelo's tree.

> If people agree in that course also about a clean, common
> iorl-patch, that would be another step forward, in my opinion.

None of the vendors in question is still doing development based on 2.4 (in
fact I suspect no linux vendor/distro still is) so if you want vendors to go
to a joint patch for such a specific enhancement, which WILL include
development, I really don't see the point. Neither SuSE nor Red Hat will be
jumping from joy to do a lot of work to replace a patch that works in their
respective existing products with something else with no clear gain at all,
while requiring significant work and stability risks.

Greetings,
Arjan van de Ven

Attachments:

(No filename) (0.98 kB)
(No filename) (189.00 B)
Download all attachments

2004-01-12 15:13:24

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 2004-01-12 at 09:13, Jens Axboe wrote:
> On Mon, Jan 12 2004, Martin Peschke3 wrote:
> > Hi,
> >
> > is there a way to merge all (or at least the common denominator) of
> > Redhat's and SuSE's changes into the vanilla 2.4 SCSI stack?
> > The SCSI part of Marcelo's kernel seems to be rather backlevel,
> > considering all those fixes and enhancements added by the mentioned
> > distributors and their SCSI experts. As this discussion underlines,
> > there are a lot of common problems and sometimes even common
> > approaches. I am convinced that a number of patches ought to be
> > incorporated into the mainline kernel. Though, I must admit
> > that I am at a loss with how this could be achieved given the
> > unresolved question of 2.4 SCSI maintenance
> > (which has certainly played a part in building up those piles
> > of SCSI patches).
>
> I have mixed feelings about that. One on side, I'd love to merge the
> scalability patches in mainline. We've had a significant number of bugs
> in this area in the past, and it seems a shame that we all have to fix
> them independently because we deviate from mainline.

Agreed (I've had to fix the iorl patch 3 or 4 times for things that
weren't bugs in the iorl patch until mainline got updated with a new
lock somewhere or things like that).

> On the other hand,
> 2.4 is pretty much closed. There wont be a big number of new distro 2.4
> kernels.
>
> Had you asked me 6 months ago I probably would have advocated merging
> them, but right now I think it's a waste of time.

>From my standpoint, we are going to be maintaining our 2.4 kernel RPMs
for a long time, so my preference is to have it in mainline. On top of
the performance stuff I have also been building some actual bug fix
patches. They depend on the behavior of the patched kernels, and in
some cases would be difficult to put on top of a non-iorl patched scsi
stack. In any case, my current plans include putting my 2.4 scsi stack
stuff up for perusal on linux-scsi.bkbits.net/scsi-dledford-2.4 as soon
as I can sort through the patches and break them into small pieces.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-12 15:25:31

by James Bottomley

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 2004-01-12 at 10:08, Doug Ledford wrote:
> >From my standpoint, we are going to be maintaining our 2.4 kernel RPMs
> for a long time, so my preference is to have it in mainline. On top of
> the performance stuff I have also been building some actual bug fix
> patches. They depend on the behavior of the patched kernels, and in
> some cases would be difficult to put on top of a non-iorl patched scsi
> stack. In any case, my current plans include putting my 2.4 scsi stack
> stuff up for perusal on linux-scsi.bkbits.net/scsi-dledford-2.4 as soon
> as I can sort through the patches and break them into small pieces.

I was thinking about opening at least a 2.4 drivers tree given some of
the issues in 2.4 and driver updates swirling around on this list.

But, if you're going to be maintaining 2.4 for Red Hat, would you like
to take point position for reviewing 2.4 patches and passing them on to
Marcelo via a BK tree?

Thanks,

James

2004-01-12 15:44:24

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12 2004, James Bottomley wrote:
> On Mon, 2004-01-12 at 10:08, Doug Ledford wrote:
> > >From my standpoint, we are going to be maintaining our 2.4 kernel RPMs
> > for a long time, so my preference is to have it in mainline. On top of
> > the performance stuff I have also been building some actual bug fix
> > patches. They depend on the behavior of the patched kernels, and in
> > some cases would be difficult to put on top of a non-iorl patched scsi
> > stack. In any case, my current plans include putting my 2.4 scsi stack
> > stuff up for perusal on linux-scsi.bkbits.net/scsi-dledford-2.4 as soon
> > as I can sort through the patches and break them into small pieces.
>
> I was thinking about opening at least a 2.4 drivers tree given some of
> the issues in 2.4 and driver updates swirling around on this list.
>
> But, if you're going to be maintaining 2.4 for Red Hat, would you like
> to take point position for reviewing 2.4 patches and passing them on to
> Marcelo via a BK tree?

I think that would be a swell idea (assigning a 2.4 SCSI point person),
for driver updates. It'll be hard to pass tested stuff though, if it all
comes from vendor trees which are vastly different in some places.

Changing core drastically is not an option in my book, for 2.4.

--
Jens Axboe

2004-01-12 15:57:08

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 2004-01-12 at 10:43, Jens Axboe wrote:
> On Mon, Jan 12 2004, James Bottomley wrote:
> > On Mon, 2004-01-12 at 10:08, Doug Ledford wrote:
> > > >From my standpoint, we are going to be maintaining our 2.4 kernel RPMs
> > > for a long time, so my preference is to have it in mainline. On top of
> > > the performance stuff I have also been building some actual bug fix
> > > patches. They depend on the behavior of the patched kernels, and in
> > > some cases would be difficult to put on top of a non-iorl patched scsi
> > > stack. In any case, my current plans include putting my 2.4 scsi stack
> > > stuff up for perusal on linux-scsi.bkbits.net/scsi-dledford-2.4 as soon
> > > as I can sort through the patches and break them into small pieces.
> >
> > I was thinking about opening at least a 2.4 drivers tree given some of
> > the issues in 2.4 and driver updates swirling around on this list.
> >
> > But, if you're going to be maintaining 2.4 for Red Hat, would you like
> > to take point position for reviewing 2.4 patches and passing them on to
> > Marcelo via a BK tree?
>
> I think that would be a swell idea (assigning a 2.4 SCSI point person),
> for driver updates. It'll be hard to pass tested stuff though, if it all
> comes from vendor trees which are vastly different in some places.
>
> Changing core drastically is not an option in my book, for 2.4.

Well, the scsi-dledford-2.4 tree is intended to be someplace I can put
all the stuff I'm having to carry forward in our kernels, so that's
distinctly different than a driver update only tree. I could do that
separately and I have no problem doing that. As for the other stuff,
I'm not pushing to necessarily get any of my changes into mainline. I
would be happy if they make it there sometime as that would relieve load
off of me, but at the same time I *am* making some changes to the core
code (sorry Jens, but there are some ways in which the 2.4 core scsi
code is just too broken to believe and leaving it broken just means
dealing with everyone that points it out in bugzilla entries) and I know
people are loath to change core code in mainline, so I kinda figured a
lot of that stuff would either A) stay separate or B) only after myself
and other interested parties agree on a patch and that patch is widely
tested and known good then maybe it might get moved over, up to Marcelo.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-12 16:05:15

by James Bottomley

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 2004-01-12 at 10:52, Doug Ledford wrote:
> Well, the scsi-dledford-2.4 tree is intended to be someplace I can put
> all the stuff I'm having to carry forward in our kernels, so that's
> distinctly different than a driver update only tree. I could do that
> separately and I have no problem doing that.

I'll take that as a "yes" then ;-)

Thanks for doing this, beacuse I really wasn't looking forward to trying
to sort it all out.

> As for the other stuff,
> I'm not pushing to necessarily get any of my changes into mainline. I
> would be happy if they make it there sometime as that would relieve load
> off of me, but at the same time I *am* making some changes to the core
> code (sorry Jens, but there are some ways in which the 2.4 core scsi
> code is just too broken to believe and leaving it broken just means
> dealing with everyone that points it out in bugzilla entries) and I know
> people are loath to change core code in mainline, so I kinda figured a
> lot of that stuff would either A) stay separate or B) only after myself
> and other interested parties agree on a patch and that patch is widely
> tested and known good then maybe it might get moved over, up to Marcelo.

I trust your judgement about this, so it sounds like we have the
beginnings of a good working model for 2.4

James

2004-01-12 16:09:17

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 2004-01-12 at 11:04, James Bottomley wrote:
> On Mon, 2004-01-12 at 10:52, Doug Ledford wrote:
> > Well, the scsi-dledford-2.4 tree is intended to be someplace I can put
> > all the stuff I'm having to carry forward in our kernels, so that's
> > distinctly different than a driver update only tree. I could do that
> > separately and I have no problem doing that.
>
> I'll take that as a "yes" then ;-)
>
> Thanks for doing this, beacuse I really wasn't looking forward to trying
> to sort it all out.

No problem. I'll set up a tree later today and start watching for 2.4
driver updates.

> I trust your judgement about this, so it sounds like we have the
> beginnings of a good working model for 2.4

Cool. I'm good with that.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-12 19:48:39

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > as the patch discussed in this thread, i.e. pure (partially
> > vintage) bugfixes.
>
> Both SuSE and Red Hat submit bugfixes they put in the respective trees to
> marcelo already. There will not be many "pure bugfixes" that you can find in
> vendor trees but not in marcelo's tree.

I haven't seen SCSI patches sumission for 2.4 from the vendors on linux-scsi
for ages. In fact I asked Jens & Doug two times whether they could sort out
the distro patches for the 2.4 stack and post them, but it seems they're busy
enough with real work so this never happened.

2004-01-12 19:54:55

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 2004-01-12 at 14:48, Christoph Hellwig wrote:
> On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > > as the patch discussed in this thread, i.e. pure (partially
> > > vintage) bugfixes.
> >
> > Both SuSE and Red Hat submit bugfixes they put in the respective trees to
> > marcelo already. There will not be many "pure bugfixes" that you can find in
> > vendor trees but not in marcelo's tree.
>
> I haven't seen SCSI patches sumission for 2.4 from the vendors on linux-scsi
> for ages. In fact I asked Jens & Doug two times whether they could sort out
> the distro patches for the 2.4 stack and post them, but it seems they're busy
> enough with real work so this never happened.

More or less. But part of it also is that a lot of the patches I've
written are on top of other patches that people don't want (aka, the
iorl patch). I've got a mlqueue patch that actually *really* should go
into mainline because it solves a slew of various problems in one go,
but the current version of the patch depends on some semantic changes
made by the iorl patch. So, sorting things out can sometimes be
difficult. But, I've been told to go ahead and do what I can as far as
getting the stuff out, so I'm taking some time to try and get a bk tree
out there so people can see what I'm talking about. Once I've got the
full tree out there, then it might be possible to start picking and
choosing which things to port against mainline so that they don't depend
on patches like the iorl patch.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-12 20:04:01

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12, 2004 at 02:51:42PM -0500, Doug Ledford wrote:
> More or less. But part of it also is that a lot of the patches I've
> written are on top of other patches that people don't want (aka, the
> iorl patch).

I'm wondering whether we want it now that 2.4 is basically frozen, but
I don't think there was a strong case against it say 4 or 5 month ago.
OTOH given that success (or lack thereof) I had pushing core changes
through Marcelo the chances it had even if scsi folks ACKed wouldn't
have been too high.

> I've got a mlqueue patch that actually *really* should go
> into mainline because it solves a slew of various problems in one go,
> but the current version of the patch depends on some semantic changes
> made by the iorl patch. So, sorting things out can sometimes be
> difficult. But, I've been told to go ahead and do what I can as far as
> getting the stuff out, so I'm taking some time to try and get a bk tree
> out there so people can see what I'm talking about. Once I've got the
> full tree out there, then it might be possible to start picking and
> choosing which things to port against mainline so that they don't depend
> on patches like the iorl patch.

I personally just don't care enough about 2.4 anymore, so I don't think
I'll invest major amounts of time into it. Even though the scsi changes
you've done are fairly huge I'm wondering whether we should just throw
it all in anyway - given that you said you'll have to care for the 2.4
scsi stack for a longer time for RH and no one else seems to be interested
doing maintaince.

2004-01-12 21:13:22

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, Jan 12 2004, Christoph Hellwig wrote:
> On Mon, Jan 12, 2004 at 02:51:42PM -0500, Doug Ledford wrote:
> > More or less. But part of it also is that a lot of the patches I've
> > written are on top of other patches that people don't want (aka, the
> > iorl patch).
>
> I'm wondering whether we want it now that 2.4 is basically frozen, but
> I don't think there was a strong case against it say 4 or 5 month ago.
> OTOH given that success (or lack thereof) I had pushing core changes
> through Marcelo the chances it had even if scsi folks ACKed wouldn't
> have been too high.

That's the key point, is it appropriate to merge now...

But I can completely back Doug on the point he made wrt pusing stuff
back to mainline - it was hard, because we deviated too much. And that
is also what I stressed would be the most important argument for merging
the iorl + scsi core changes.

> > I've got a mlqueue patch that actually *really* should go
> > into mainline because it solves a slew of various problems in one go,
> > but the current version of the patch depends on some semantic changes
> > made by the iorl patch. So, sorting things out can sometimes be
> > difficult. But, I've been told to go ahead and do what I can as far as
> > getting the stuff out, so I'm taking some time to try and get a bk tree
> > out there so people can see what I'm talking about. Once I've got the
> > full tree out there, then it might be possible to start picking and
> > choosing which things to port against mainline so that they don't depend
> > on patches like the iorl patch.
>
> I personally just don't care enough about 2.4 anymore, so I don't think
> I'll invest major amounts of time into it. Even though the scsi changes
> you've done are fairly huge I'm wondering whether we should just throw
> it all in anyway - given that you said you'll have to care for the 2.4
> scsi stack for a longer time for RH and no one else seems to be interested
> doing maintaince.

Ditto.

--
Jens Axboe

2004-01-13 21:06:53

by Marcelo Tosatti

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 12 Jan 2004, Doug Ledford wrote:

> On Mon, 2004-01-12 at 14:48, Christoph Hellwig wrote:
> > On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > > > as the patch discussed in this thread, i.e. pure (partially
> > > > vintage) bugfixes.
> > >
> > > Both SuSE and Red Hat submit bugfixes they put in the respective trees to
> > > marcelo already. There will not be many "pure bugfixes" that you can find in
> > > vendor trees but not in marcelo's tree.
> >
> > I haven't seen SCSI patches sumission for 2.4 from the vendors on linux-scsi
> > for ages. In fact I asked Jens & Doug two times whether they could sort out
> > the distro patches for the 2.4 stack and post them, but it seems they're busy
> > enough with real work so this never happened.
>
> More or less. But part of it also is that a lot of the patches I've
> written are on top of other patches that people don't want (aka, the
> iorl patch). I've got a mlqueue patch that actually *really* should go
> into mainline because it solves a slew of various problems in one go,
> but the current version of the patch depends on some semantic changes
> made by the iorl patch. So, sorting things out can sometimes be
> difficult. But, I've been told to go ahead and do what I can as far as
> getting the stuff out, so I'm taking some time to try and get a bk tree
> out there so people can see what I'm talking about. Once I've got the
> full tree out there, then it might be possible to start picking and
> choosing which things to port against mainline so that they don't depend
> on patches like the iorl patch.

Hi,

Merging "straight" _bugfixes_ (I think all we agree that merging
performance enhancements at this point in 2.4 is not interesting) which,
as you said, get reviewed by Jens/James/others is OK.

What is this mlqueue patchset fixing ? To tell you the truth, I'm not
aware of the SCSI stack problems.

2004-01-15 17:10:36

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

>
> "not in a released kernel..." Do I read this right? That you have a fix
> for a critical bug and it hasn't been pushed to customers yet? How about
> security bugs, has the fix you pushed in RH-9.0 been push to EL customers?

RHL9 does not suffer this bug by virtue of not having the code in the first
place.

RHEL has the corner case fixed pushed already

Attachments:

(No filename) (363.00 B)
(No filename) (189.00 B)
Download all attachments

2004-01-15 17:05:27

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Thu, Jan 15 2004, Bill Davidsen wrote:
> Doug Ledford wrote:
> >On Mon, 2004-01-12 at 04:22, Jens Axboe wrote:
> >
> >>On Mon, Jan 12 2004, Arjan van de Ven wrote:
> >>
> >>>On Mon, Jan 12, 2004 at 10:19:46AM +0100, Jens Axboe wrote:
> >>>
> >>>>... and still exists in your 2.4.21 based kernel.
> >>>
> >>>The RHL 2.4.21 kernels don't have the locking patch at all...
> >>
> >>But RHEL3 does:
> >>
> >>http://kernelnewbies.org/kernels/rhel3/SOURCES/linux-2.4.21-iorl.patch
> >>
> >>and the bug is there.
> >
> >
> >But in RHEL3 the bug is fixed already (not in a released kernel, but the
> >fix went into our internal kernel some time back and will be in our next
> >update kernel). From my internal bk tree for this stuff:
>
> "not in a released kernel..." Do I read this right? That you have a fix
> for a critical bug and it hasn't been pushed to customers yet? How about
> security bugs, has the fix you pushed in RH-9.0 been push to EL customers?

Calm down, it's a bug fix for a deadlock that _could_ trigger on SMP
only in SCSI error recovery.

--
Jens Axboe

2004-01-15 17:03:00

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

Doug Ledford wrote:
> On Mon, 2004-01-12 at 04:22, Jens Axboe wrote:
>
>>On Mon, Jan 12 2004, Arjan van de Ven wrote:
>>
>>>On Mon, Jan 12, 2004 at 10:19:46AM +0100, Jens Axboe wrote:
>>>
>>>>... and still exists in your 2.4.21 based kernel.
>>>
>>>The RHL 2.4.21 kernels don't have the locking patch at all...
>>
>>But RHEL3 does:
>>
>>http://kernelnewbies.org/kernels/rhel3/SOURCES/linux-2.4.21-iorl.patch
>>
>>and the bug is there.
>
>
> But in RHEL3 the bug is fixed already (not in a released kernel, but the
> fix went into our internal kernel some time back and will be in our next
> update kernel). From my internal bk tree for this stuff:

"not in a released kernel..." Do I read this right? That you have a fix
for a critical bug and it hasn't been pushed to customers yet? How about
security bugs, has the fix you pushed in RH-9.0 been push to EL customers?

> [dledford@compaq RHEL3-scsi]$ bk changes -r1.23
> [email protected], 2003-11-10 17:19:54-05:00, [email protected]
> drivers/scsi/scsi_error.c
> Don't panic if the eh thread is dead, instead do the same thing that
> scsi_softirq_handler does and just complete the command as a failed
> command.
> Change when we wake the eh thread in scsi_times_out to accomodate
> the changes to the mlqueue operations.
> Clear blocked status on the host and all devices in scsi_restart_operations
> -> Don't grab the host_lock in scsi_restart_operations, we aren't doing
> anything that needs it. Just goose the queues unconditionally,
> scsi_request_fn() will know to not send commands if they shouldn't
> go for some reason.
> Make sure we account SCSI_STATE_MLQUEUE commands as not being failed
> commands in scsi_unjam_host.
>
> But, Jens is right, it's a real bug. I just fixed it in a different
> way. And my fix is dependent on other changes in our scsi stack as
> well.

Yes, thanks to Peter for that fix, nice that someone provides timely
fixes...

--
bill davidsen <[email protected]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

2004-01-15 17:19:31

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

Doug Ledford wrote:

> More or less. But part of it also is that a lot of the patches I've
> written are on top of other patches that people don't want (aka, the
> iorl patch). I've got a mlqueue patch that actually *really* should go
> into mainline because it solves a slew of various problems in one go,
> but the current version of the patch depends on some semantic changes
> made by the iorl patch. So, sorting things out can sometimes be
> difficult. But, I've been told to go ahead and do what I can as far as
> getting the stuff out, so I'm taking some time to try and get a bk tree
> out there so people can see what I'm talking about. Once I've got the
> full tree out there, then it might be possible to start picking and
> choosing which things to port against mainline so that they don't depend
> on patches like the iorl patch.

If it leads to a more stable kernel, I don't see why iorl can't go in
(user perspective) because RH is going to maintain it instead of trying
to find a developer who is competent and willing to do the boring task
of backfitting bugfixes to sub-optimal code.

The only problem I see would be getting testing before calling it
stable. Putting out a "giant SCSI patch" for test, then into a -test
kernel should solve that. The fact that RH is stuck supporting this for
at least five years is certainly both motivation and field test for any
changes ;-)

Clearly Marcello has the decision, but it looks from here as if
stability would be improved by something like this. Assuming that no
other vendor objects, of course.

--
bill davidsen <[email protected]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

2004-01-15 19:34:54

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Thu, 2004-01-15 at 12:01, Bill Davidsen wrote:
> Doug Ledford wrote:
> > On Mon, 2004-01-12 at 04:22, Jens Axboe wrote:
> >
> >>On Mon, Jan 12 2004, Arjan van de Ven wrote:
> >>
> >>>On Mon, Jan 12, 2004 at 10:19:46AM +0100, Jens Axboe wrote:
> >>>
> >>>>... and still exists in your 2.4.21 based kernel.
> >>>
> >>>The RHL 2.4.21 kernels don't have the locking patch at all...
> >>
> >>But RHEL3 does:
> >>
> >>http://kernelnewbies.org/kernels/rhel3/SOURCES/linux-2.4.21-iorl.patch
> >>
> >>and the bug is there.
> >
> >
> > But in RHEL3 the bug is fixed already (not in a released kernel, but the
> > fix went into our internal kernel some time back and will be in our next
> > update kernel). From my internal bk tree for this stuff:
>
> "not in a released kernel..." Do I read this right? That you have a fix
> for a critical bug and it hasn't been pushed to customers yet?

No, you don't read this right. We have a fix for a correctness issue
that has almost 0% chance of ever triggering in real life, has exactly 0
bug reports of it ever happening, and which has been integrated into our
tree. Obviously, we always push new kernels to all of our customers
every time we have this situation, or about twice a day...

> How about
> security bugs, has the fix you pushed in RH-9.0 been push to EL customers?
>
> > [dledford@compaq RHEL3-scsi]$ bk changes -r1.23
> > [email protected], 2003-11-10 17:19:54-05:00, [email protected]
> > drivers/scsi/scsi_error.c
> > Don't panic if the eh thread is dead, instead do the same thing that
> > scsi_softirq_handler does and just complete the command as a failed
> > command.
> > Change when we wake the eh thread in scsi_times_out to accomodate
> > the changes to the mlqueue operations.
> > Clear blocked status on the host and all devices in scsi_restart_operations
> > -> Don't grab the host_lock in scsi_restart_operations, we aren't doing
> > anything that needs it. Just goose the queues unconditionally,
> > scsi_request_fn() will know to not send commands if they shouldn't
> > go for some reason.
> > Make sure we account SCSI_STATE_MLQUEUE commands as not being failed
> > commands in scsi_unjam_host.
> >
> > But, Jens is right, it's a real bug. I just fixed it in a different
> > way. And my fix is dependent on other changes in our scsi stack as
> > well.
>
> Yes, thanks to Peter for that fix, nice that someone provides timely
> fixes...

Puh-Leeze. If you actually read the source code, you would see just how
damn near impossible this bug is. But, since you obviously didn't, let
me enlighten you:

1) The offending code is part of the error handler code, specifically
the section that kicks drives back into gear after recovery is complete.

2) The only section of code it's racing against is scsi_request_fn.

3) The race only happens if scsi_request_fn() detects that
host->in_recovery is not set.

So, a sample would be something like this

CPU1

spin_lock_irqsave(device_lock)
scsi_request_fn()
check host->in_recovery == 0
[ start window ]
make a couple other sanity checks
[ end window ]
spin_lock(host_lock)
modify host command counts
spin_unlock(host_lock)
finish prepping command
spin_unlock(device_lock)
scsi_dispatch_cmd()
spin_lock(device_lock)
return
spin_unlock(device_lock)

Now, in that little, itty, bitty window in scsi_request_fn(), we have to
A) get some kind of error that starts the error handler subsystem, B)
all outstanding commands have to complete (since we don't start the
error handler thread until the controller is quiescient), C) we have to
resolve the problem and get to the point of kicking the host back in
gear, and finally D) we have to grab the host_lock before
scsi_request_fn tries to and then we have to try and grab the specific
device_lock that scsi_request_fn is holding.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-16 15:41:21

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

Doug Ledford wrote:
> On Thu, 2004-01-15 at 12:01, Bill Davidsen wrote:

>>"not in a released kernel..." Do I read this right? That you have a fix
>>for a critical bug and it hasn't been pushed to customers yet?
>
>
> No, you don't read this right. We have a fix for a correctness issue
> that has almost 0% chance of ever triggering in real life, has exactly 0
> bug reports of it ever happening, and which has been integrated into our
> tree. Obviously, we always push new kernels to all of our customers
> every time we have this situation, or about twice a day...

I actually had in mind a notification so customers would know it was
there if they actually see a similar problem, and a way to get the fix
if needed. Since a patch was posted, I assume that's it's not quite as
unlikely as you seem to imply.

We have two copies of RHEL-3.0 in-house, and I have a budget line item
to upgrade 38 servers from RH-8.0 this year. Since they are all SMP/SCSI
I think my concern with this bug, and the bug notification process in
general is germane. I don't want to apply fixes for problems I don't
have, but I want to know what fixes are out there so if I have a similar
problem I can apply the fix(es) which might be related before spending a
lot of time chasing the problem.

--
bill davidsen <[email protected]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

2004-01-17 13:15:18

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Tue, 2004-01-13 at 15:55, Marcelo Tosatti wrote:
> On Mon, 12 Jan 2004, Doug Ledford wrote:
>
> > On Mon, 2004-01-12 at 14:48, Christoph Hellwig wrote:
> > > On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > > > > as the patch discussed in this thread, i.e. pure (partially
> > > > > vintage) bugfixes.
> > > >
> > > > Both SuSE and Red Hat submit bugfixes they put in the respective trees to
> > > > marcelo already. There will not be many "pure bugfixes" that you can find in
> > > > vendor trees but not in marcelo's tree.
> > >
> > > I haven't seen SCSI patches sumission for 2.4 from the vendors on linux-scsi
> > > for ages. In fact I asked Jens & Doug two times whether they could sort out
> > > the distro patches for the 2.4 stack and post them, but it seems they're busy
> > > enough with real work so this never happened.
> >
> > More or less. But part of it also is that a lot of the patches I've
> > written are on top of other patches that people don't want (aka, the
> > iorl patch). I've got a mlqueue patch that actually *really* should go
> > into mainline because it solves a slew of various problems in one go,
> > but the current version of the patch depends on some semantic changes
> > made by the iorl patch. So, sorting things out can sometimes be
> > difficult. But, I've been told to go ahead and do what I can as far as
> > getting the stuff out, so I'm taking some time to try and get a bk tree
> > out there so people can see what I'm talking about. Once I've got the
> > full tree out there, then it might be possible to start picking and
> > choosing which things to port against mainline so that they don't depend
> > on patches like the iorl patch.
>
> Hi,
>
> Merging "straight" _bugfixes_

OK, I've got some new bk trees established on linux-scsi.bkbits.net.
There is a 2.4-for-marcelo, which is where I will stick stuff I think is
ready to go to you. There is a 2.4-drivers. Then there is a
2.4-everything. The everything tree will be a place to put any and all
patches that aren't just experimental patches. So, just about any Red
Hat SCSI patch is fair game. Same for the mods SuSE has made. Pushing
changes from here to 2.4-for-marcelo on a case by case basis is more or
less my plan. Right now, these trees are all just perfect clones of
Marcelo's tree. As I make changes and commit things, I'll update the
lists with the new stuff.

> (I think all we agree that merging
> performance enhancements at this point in 2.4 is not interesting)

Well, I can actually make a fairly impressive argument for the iorl
patch if you ask me. It's your decision if you want it, but here's my
take on the issue:

1) Red Hat has shipped the iorl patch in our 2.4.9 Advanced Server 2.1
release, our 2.4.18 based Advanced Server 2.1 for IPF release, and our
current 2.4.21 based RHEL 3 release. So, it's getting *lots* of testing
outside the community. Including things like all the specweb
benchmarks, Oracle TPC benchmarks, and all of our customers are using
this patch. That's a *lot* of real world, beat it to death testing.

2) As I recall, SuSE has shipped either their own iorl type patch, or
they used one of our versions of the patch, don't really know which.
But, I know they've got the same basic functionality. Between 1 and 2,
that greatly increases the field testing of the iorl patch and greatly
reduces the breadth of testing of the regular kernel scsi stack.

3) OK, you can call it a performance patch if you want, but that
doesn't really do it justice. So, here's a little history. I used the
lockmeter patch when I was writing the iorl patch to get a clue just how
much difference it made. Obviously, on single disk systems it makes
very little difference (but it does make some difference because the
host controller and scsi_request_fn use different locks even on a single
device and that does help some). But, on a test machine with 4 CPUs and
something like 14 disks, without the patch a disk benchmark was using
100% of CPU and hitting something like 98% contention on the
io_request_lock. On the same system with the patch, benchmark CPU usage
dropped to something like 12% and because we weren't spinning any more
scsi_request_fn and make_request dropped off the profile radar entirely
and contention was basically down to low single digits. Consider my
home server has 7 disks in a raid5 array, it certainly makes a
difference to me ;-)

4) The last issue. 2.6 already has individual host locks for drivers.
The iorl patch for 2.4 adds the same thing. So, adding the iorl patch
to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6.
Right now it takes some fairly convoluted #ifdef statements to get the
locking right in a driver that supports both 2.4 and 2.6. Adding the
iorl patch allows driver authors to basically state that they don't
support anything prior to whatever version of 2.4 it goes into and
remove a bunch of #ifdef crap.

> which,
> as you said, get reviewed by Jens/James/others is OK.

That's the reason for the bk trees. Anyone can check them out that way.

> What is this mlqueue patchset fixing ? To tell you the truth, I'm not
> aware of the SCSI stack problems.

OK. This is a different problem entirely. The current 2.4 scsi stack
has a problem handling QUEUE_FULL and BUSY status commands. Amongst
other things, it retries the command immediately and will basically hang
the SCSI bus if the low level driver doesn't intervene. Most well
written drivers have workaround code in them (including my aic7xxx_old
driver). The mlqueue patch changes it so that the mid layer implements
delayed retries of QUEUE_FULL and BUSY commands. It also makes the mid
layer recognize the situation where you get a QUEUE_FULL or BUSY command
status with 0 outstanding commands. This used to cause the mid layer to
hang on that device because there would be no command completion to kick
the queue back into gear. Now, with the mlqueue patch, it detects 0
outstanding commands and sets a 1/5 second timer that restarts command
flow after the delay. This solved several IBM bugs in one go. Again,
this is a problem that older drivers with workaround code didn't need,
but newer drivers such as the IBM zfcp (is that right?) and the open
source Emulex driver don't have the standard workarounds for the bug and
were getting hit by it. By fixing the problem properly, a lot of
drivers could actually remove their workaround code and use the mid
layer code. The first version of this patch just solves the hang and
immediate retry problem. Ongoing work will fix the patch to do more
things correctly, such as replaying commands in their original order,
stuff like that.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-17 13:18:42

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Thu, 2004-01-15 at 12:17, Bill Davidsen wrote:
> Doug Ledford wrote:
>
> > More or less. But part of it also is that a lot of the patches I've
> > written are on top of other patches that people don't want (aka, the
> > iorl patch). I've got a mlqueue patch that actually *really* should go
> > into mainline because it solves a slew of various problems in one go,
> > but the current version of the patch depends on some semantic changes
> > made by the iorl patch. So, sorting things out can sometimes be
> > difficult. But, I've been told to go ahead and do what I can as far as
> > getting the stuff out, so I'm taking some time to try and get a bk tree
> > out there so people can see what I'm talking about. Once I've got the
> > full tree out there, then it might be possible to start picking and
> > choosing which things to port against mainline so that they don't depend
> > on patches like the iorl patch.
>
> If it leads to a more stable kernel, I don't see why iorl can't go in
> (user perspective) because RH is going to maintain it instead of trying
> to find a developer who is competent and willing to do the boring task
> of backfitting bugfixes to sub-optimal code.

We actually intended to leave it out of RHEL3. But, once we started
doing performance testing of RHEL3 vs. AS2.1, it was obvious that if we
didn't put the patch back in we could kiss all of our benchmark results
goodbye. Seriously, it makes that much difference on server systems.

> The only problem I see would be getting testing before calling it
> stable. Putting out a "giant SCSI patch" for test, then into a -test
> kernel should solve that. The fact that RH is stuck supporting this for
> at least five years is certainly both motivation and field test for any
> changes ;-)

See me last email on the testing issue.

> Clearly Marcello has the decision, but it looks from here as if
> stability would be improved by something like this. Assuming that no
> other vendor objects, of course.

Stability, maybe. Ease of writing drivers that work in both 2.4 and 2.6
using the same locking logic, absolutely.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-17 15:19:29

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

Doug Ledford wrote:
> On Thu, 2004-01-15 at 12:17, Bill Davidsen wrote:
>
>>Doug Ledford wrote:
>>
>>
>>>More or less. But part of it also is that a lot of the patches I've
>>>written are on top of other patches that people don't want (aka, the
>>>iorl patch). I've got a mlqueue patch that actually *really* should go
>>>into mainline because it solves a slew of various problems in one go,
>>>but the current version of the patch depends on some semantic changes
>>>made by the iorl patch. So, sorting things out can sometimes be
>>>difficult. But, I've been told to go ahead and do what I can as far as
>>>getting the stuff out, so I'm taking some time to try and get a bk tree
>>>out there so people can see what I'm talking about. Once I've got the
>>>full tree out there, then it might be possible to start picking and
>>>choosing which things to port against mainline so that they don't depend
>>>on patches like the iorl patch.
>>
>>If it leads to a more stable kernel, I don't see why iorl can't go in
>>(user perspective) because RH is going to maintain it instead of trying
>>to find a developer who is competent and willing to do the boring task
>>of backfitting bugfixes to sub-optimal code.
>
>
> We actually intended to leave it out of RHEL3. But, once we started
> doing performance testing of RHEL3 vs. AS2.1, it was obvious that if we
> didn't put the patch back in we could kiss all of our benchmark results
> goodbye. Seriously, it makes that much difference on server systems.

I'm running 30+ usenet servers, currently on older RH versions. Better
performance would certainly be a plus, although stability WRT lockups
has been an issue, as has operation when the number of threads gets very
high. First tests of RHEL looked very promising for stability, I'm
hoping to get the okay to do serious production testing next week.

>>The only problem I see would be getting testing before calling it
>>stable. Putting out a "giant SCSI patch" for test, then into a -test
>>kernel should solve that. The fact that RH is stuck supporting this for
>>at least five years is certainly both motivation and field test for any
>>changes ;-)
>
>
> See me last email on the testing issue.
>
>
>>Clearly Marcello has the decision, but it looks from here as if
>>stability would be improved by something like this. Assuming that no
>>other vendor objects, of course.
>
>
> Stability, maybe. Ease of writing drivers that work in both 2.4 and 2.6
> using the same locking logic, absolutely.
>

--
bill davidsen <[email protected]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

2004-01-17 16:12:23

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Sat, 2004-01-17 at 10:16, Bill Davidsen wrote:
> Doug Ledford wrote:
> > On Thu, 2004-01-15 at 12:17, Bill Davidsen wrote:
> >
> >>Doug Ledford wrote:
> >>
> >>
> >>>More or less. But part of it also is that a lot of the patches I've
> >>>written are on top of other patches that people don't want (aka, the
> >>>iorl patch). I've got a mlqueue patch that actually *really* should go
> >>>into mainline because it solves a slew of various problems in one go,
> >>>but the current version of the patch depends on some semantic changes
> >>>made by the iorl patch. So, sorting things out can sometimes be
> >>>difficult. But, I've been told to go ahead and do what I can as far as
> >>>getting the stuff out, so I'm taking some time to try and get a bk tree
> >>>out there so people can see what I'm talking about. Once I've got the
> >>>full tree out there, then it might be possible to start picking and
> >>>choosing which things to port against mainline so that they don't depend
> >>>on patches like the iorl patch.
> >>
> >>If it leads to a more stable kernel, I don't see why iorl can't go in
> >>(user perspective) because RH is going to maintain it instead of trying
> >>to find a developer who is competent and willing to do the boring task
> >>of backfitting bugfixes to sub-optimal code.
> >
> >
> > We actually intended to leave it out of RHEL3. But, once we started
> > doing performance testing of RHEL3 vs. AS2.1, it was obvious that if we
> > didn't put the patch back in we could kiss all of our benchmark results
> > goodbye. Seriously, it makes that much difference on server systems.
>
> I'm running 30+ usenet servers, currently on older RH versions. Better
> performance would certainly be a plus, although stability WRT lockups
> has been an issue, as has operation when the number of threads gets very
> high. First tests of RHEL looked very promising for stability, I'm
> hoping to get the okay to do serious production testing next week.

Well, when you think about it, the iorl patch makes lockups less likely,
and when they do occur they are likely to effect a smaller subset of the
scsi subsystem. When you have one lock and you get a lockup, the whole
system stops. When you have one lock per device and one lock per
controller, when they deadlock, they only deadlock a subset of the scsi
subsystem.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-17 16:58:45

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Sat, Jan 17, 2004 at 08:10:00AM -0500, Doug Ledford wrote:
> 4) The last issue. 2.6 already has individual host locks for drivers.
> The iorl patch for 2.4 adds the same thing. So, adding the iorl patch
> to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6.
> Right now it takes some fairly convoluted #ifdef statements to get the
> locking right in a driver that supports both 2.4 and 2.6. Adding the
> iorl patch allows driver authors to basically state that they don't
> support anything prior to whatever version of 2.4 it goes into and
> remove a bunch of #ifdef crap.

Well, no. For one thing all the iorl patches miss the scsi_assign_lock
interface from 2.6 which makes drivers a big ifdef hell (especially
as the AS2.1 patch uses a different name for the lock as 3.0), and even
if it was there the use of that function is strongly discuraged in 2.6
in favour of just using the host_lock.

2004-01-17 19:17:16

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Sat, Jan 17, 2004 at 02:07:33PM -0500, Doug Ledford wrote:
> #ifdef SCSI_HAS_HOST_LOCK
> #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,0)
> adapter->lock_ptr = &adapter->lock;
> host->lock = &adapter->lock;
> #else
> adapter->lock_ptr = &adapter->lock;
> host->host_lock = &adapter->lock;
> #endif
> #else
> adapter->lock_ptr = &io_request_lock;
> #endif

Still looks wrong for the 2.6 case which should just be;

adapter->lock_ptr = shost->host_lock;

as I just stated in the review for the megaraid update.

2004-01-17 19:12:16

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Sat, 2004-01-17 at 11:58, Christoph Hellwig wrote:
> On Sat, Jan 17, 2004 at 08:10:00AM -0500, Doug Ledford wrote:
> > 4) The last issue. 2.6 already has individual host locks for drivers.
> > The iorl patch for 2.4 adds the same thing. So, adding the iorl patch
> > to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6.
> > Right now it takes some fairly convoluted #ifdef statements to get the
> > locking right in a driver that supports both 2.4 and 2.6. Adding the
> > iorl patch allows driver authors to basically state that they don't
> > support anything prior to whatever version of 2.4 it goes into and
> > remove a bunch of #ifdef crap.
>
> Well, no. For one thing all the iorl patches miss the scsi_assign_lock
> interface from 2.6 which makes drivers a big ifdef hell (especially
> as the AS2.1 patch uses a different name for the lock as 3.0), and even
> if it was there the use of that function is strongly discuraged in 2.6
> in favour of just using the host_lock.

Yeah, I saw that too. Of course, it's not like that's my fault :-P I
had the 2.4.9 version of the iorl patch before 2.5 had a host lock, so
2.5 *could* have followed the 2.4.9-iorl patch convention of using
host->lock as the name instead of host->host_lock and then we wouldn't
be here. But, because 2.6 uses host->host_lock, and because usage of
the scsi_assign_lock() is discouraged, I made the iorl patch in RHEL3
follow the 2.6 convention of using host->host_lock. In hindsight, I
should have just followed the 2.4.9-iorl patch convention. Then a
driver could have just done this:

#ifdef SCSI_HAS_HOST_LOCK
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,0)
adapter->lock_ptr = &adapter->lock;
host->lock = &adapter->lock;
#else
adapter->lock_ptr = &adapter->lock;
host->host_lock = &adapter->lock;
#endif
#else
adapter->lock_ptr = &io_request_lock;
#endif

Then you just always use adapter->lock_ptr for spin locks in the driver
and you magically work in all kernel releases. Now, by going to
host->host_lock in 2.4 we get rid of one of the if statements. This
isn't impossible to do if both Red Hat and SuSE just release their next
update kernel with host->lock changed to host->host_lock.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-17 19:26:26

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Sat, 2004-01-17 at 14:17, Christoph Hellwig wrote:
> On Sat, Jan 17, 2004 at 02:07:33PM -0500, Doug Ledford wrote:
> > #ifdef SCSI_HAS_HOST_LOCK
> > #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,0)
> > adapter->lock_ptr = &adapter->lock;
> > host->lock = &adapter->lock;
> > #else
> > adapter->lock_ptr = &adapter->lock;
> > host->host_lock = &adapter->lock;
> > #endif
> > #else
> > adapter->lock_ptr = &io_request_lock;
> > #endif
>
> Still looks wrong for the 2.6 case which should just be;
>
> adapter->lock_ptr = shost->host_lock;
>
> as I just stated in the review for the megaraid update.

I should pay more attention to what goes on in 2.6. That was a crap
decision. There are drivers out there that use global data and need one
lock across all controllers. Putting a lock in the scsi host struct for
per controller locks is a waste of space. Let the driver decide if it
needs a global driver lock, a per controller lock (in which case it
should be part of the driver private host struct), or just the global
io_request_lock. It really is up to the driver and putting it into the
scsi host struct doesn't make sense.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-17 19:30:10

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Sat, Jan 17, 2004 at 02:21:43PM -0500, Doug Ledford wrote:
> I should pay more attention to what goes on in 2.6. That was a crap
> decision. There are drivers out there that use global data and need one
> lock across all controllers. Putting a lock in the scsi host struct for
> per controller locks is a waste of space. Let the driver decide if it
> needs a global driver lock, a per controller lock (in which case it
> should be part of the driver private host struct), or just the global
> io_request_lock. It really is up to the driver and putting it into the
> scsi host struct doesn't make sense.

I tend to disagree. Giving the driver control over the lock used by the
midlayer is just wrong. If it has global state (which a good driver better
shouldn't) it's up to the driver to protect it. That we still have a way
to override that lock is the big bug rather, and in fact only three drivers
are doing that, and they all override it with a lock that's per-HBA anyway.

>
> --
> Doug Ledford <[email protected]> 919-754-3700 x44233
> Red Hat, Inc.
> 1801 Varsity Dr.
> Raleigh, NC 27606
>
>
---end quoted text---

--
Christoph Hellwig <[email protected]> - Freelance Hacker
Contact me for driver hacking and kernel development consulting

2004-01-17 20:41:23

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Sat, 2004-01-17 at 14:29, Christoph Hellwig wrote:
> On Sat, Jan 17, 2004 at 02:21:43PM -0500, Doug Ledford wrote:
> > I should pay more attention to what goes on in 2.6. That was a crap
> > decision. There are drivers out there that use global data and need one
> > lock across all controllers. Putting a lock in the scsi host struct for
> > per controller locks is a waste of space. Let the driver decide if it
> > needs a global driver lock, a per controller lock (in which case it
> > should be part of the driver private host struct), or just the global
> > io_request_lock. It really is up to the driver and putting it into the
> > scsi host struct doesn't make sense.
>
> I tend to disagree. Giving the driver control over the lock used by the
> midlayer is just wrong. If it has global state (which a good driver better
> shouldn't) it's up to the driver to protect it. That we still have a way
> to override that lock is the big bug rather, and in fact only three drivers
> are doing that, and they all override it with a lock that's per-HBA anyway.

/me calls Christoph various unrepeatable names :-P

I'm exactly the opposite. In my opinion, the mid layer should be using
the device locks for its internal operations and calling into the low
level drivers with no locks held. The fact that the mid layer still
tries to lock drivers for the drivers is a bug IMO. Every time I see a
driver do spin_unlock_irq(host->host_lock); do driver work;
spin_lock_irq(host->host_lock); return; it really points out that the
mid layer has no business locking down drivers for them.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-19 21:38:11

by Martin Peschke3

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

Doug Ledford wrote:
> On Tue, 2004-01-13 at 15:55, Marcelo Tosatti wrote:
> > What is this mlqueue patchset fixing ? To tell you the truth, I'm not
> > aware of the SCSI stack problems.
>
> OK. This is a different problem entirely. The current 2.4 scsi stack
> has a problem handling QUEUE_FULL and BUSY status commands. Amongst
> other things, it retries the command immediately and will basically hang
> the SCSI bus if the low level driver doesn't intervene. Most well
> written drivers have workaround code in them (including my aic7xxx_old
> driver). The mlqueue patch changes it so that the mid layer implements
> delayed retries of QUEUE_FULL and BUSY commands. It also makes the mid
> layer recognize the situation where you get a QUEUE_FULL or BUSY command
> status with 0 outstanding commands. This used to cause the mid layer to
> hang on that device because there would be no command completion to kick
> the queue back into gear.

http://marc.theaimsgroup.com/?t=106001205900001&r=1&w=2
We also stumbled about this problem last year. Our patch wasn't
as sophisticated as Doug's new patch. However, it worked,
and some fix is certainly needed.

> Now, with the mlqueue patch, it detects 0
> outstanding commands and sets a 1/5 second timer that restarts command
> flow after the delay. This solved several IBM bugs in one go.

This includes a panic
http://marc.theaimsgroup.com/?t=105167914900002&r=1&w=2
, hung SCSI hosts due to an error handling deadlock related to
commands snoozing in the mlqueue (can't find a link),
as well as the problem mentioned above.

> Again,
> this is a problem that older drivers with workaround code didn't need,
> but newer drivers such as the IBM zfcp (is that right?) and the open
> source Emulex driver don't have the standard workarounds for the bug and
> were getting hit by it.

Doug, you are right with regard to zfcp. But given the above listing
of is issues, I don't see how we could perfectly work around :-)

> By fixing the problem properly, a lot of
> drivers could actually remove their workaround code and use the mid
> layer code. The first version of this patch just solves the hang and
> immediate retry problem. Ongoing work will fix the patch to do more
> things correctly, such as replaying commands in their original order,
> stuff like that.

Sounds good.

Mit freundlichen Gr??en / with kind regards

Martin Peschke

IBM Deutschland Entwicklung GmbH
Linux for eServer Development
Phone: +49-(0)7031-16-2349

Doug Ledford <[email protected]> on 17/01/2004 14:10:00

To: Marcelo Tosatti <[email protected]>
cc: Christoph Hellwig <[email protected]>, Arjan Van de Ven
<[email protected]>, Martin Peschke3/Germany/IBM@IBMDE, Jens Axboe
<[email protected]>, Peter Yao <[email protected]>,
[email protected], linux-scsi mailing list
<[email protected]>, [email protected]
Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Tue, 2004-01-13 at 15:55, Marcelo Tosatti wrote:
> On Mon, 12 Jan 2004, Doug Ledford wrote:
>
> > On Mon, 2004-01-12 at 14:48, Christoph Hellwig wrote:
> > > On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > > > > as the patch discussed in this thread, i.e. pure (partially
> > > > > vintage) bugfixes.
> > > >
> > > > Both SuSE and Red Hat submit bugfixes they put in the respective
trees to
> > > > marcelo already. There will not be many "pure bugfixes" that you
can find in
> > > > vendor trees but not in marcelo's tree.
> > >
> > > I haven't seen SCSI patches sumission for 2.4 from the vendors on
linux-scsi
> > > for ages. In fact I asked Jens & Doug two times whether they could
sort out
> > > the distro patches for the 2.4 stack and post them, but it seems
they're busy
> > > enough with real work so this never happened.
> >
> > More or less. But part of it also is that a lot of the patches I've
> > written are on top of other patches that people don't want (aka, the
> > iorl patch). I've got a mlqueue patch that actually *really* should go
> > into mainline because it solves a slew of various problems in one go,
> > but the current version of the patch depends on some semantic changes
> > made by the iorl patch. So, sorting things out can sometimes be
> > difficult. But, I've been told to go ahead and do what I can as far as
> > getting the stuff out, so I'm taking some time to try and get a bk tree
> > out there so people can see what I'm talking about. Once I've got the
> > full tree out there, then it might be possible to start picking and
> > choosing which things to port against mainline so that they don't
depend
> > on patches like the iorl patch.
>
> Hi,
>
> Merging "straight" _bugfixes_

OK, I've got some new bk trees established on linux-scsi.bkbits.net.
There is a 2.4-for-marcelo, which is where I will stick stuff I think is
ready to go to you. There is a 2.4-drivers. Then there is a
2.4-everything. The everything tree will be a place to put any and all
patches that aren't just experimental patches. So, just about any Red
Hat SCSI patch is fair game. Same for the mods SuSE has made. Pushing
changes from here to 2.4-for-marcelo on a case by case basis is more or
less my plan. Right now, these trees are all just perfect clones of
Marcelo's tree. As I make changes and commit things, I'll update the
lists with the new stuff.

> (I think all we agree that merging
> performance enhancements at this point in 2.4 is not interesting)

Well, I can actually make a fairly impressive argument for the iorl
patch if you ask me. It's your decision if you want it, but here's my
take on the issue:

1) Red Hat has shipped the iorl patch in our 2.4.9 Advanced Server 2.1
release, our 2.4.18 based Advanced Server 2.1 for IPF release, and our
current 2.4.21 based RHEL 3 release. So, it's getting *lots* of testing
outside the community. Including things like all the specweb
benchmarks, Oracle TPC benchmarks, and all of our customers are using
this patch. That's a *lot* of real world, beat it to death testing.

2) As I recall, SuSE has shipped either their own iorl type patch, or
they used one of our versions of the patch, don't really know which.
But, I know they've got the same basic functionality. Between 1 and 2,
that greatly increases the field testing of the iorl patch and greatly
reduces the breadth of testing of the regular kernel scsi stack.

3) OK, you can call it a performance patch if you want, but that
doesn't really do it justice. So, here's a little history. I used the
lockmeter patch when I was writing the iorl patch to get a clue just how
much difference it made. Obviously, on single disk systems it makes
very little difference (but it does make some difference because the
host controller and scsi_request_fn use different locks even on a single
device and that does help some). But, on a test machine with 4 CPUs and
something like 14 disks, without the patch a disk benchmark was using
100% of CPU and hitting something like 98% contention on the
io_request_lock. On the same system with the patch, benchmark CPU usage
dropped to something like 12% and because we weren't spinning any more
scsi_request_fn and make_request dropped off the profile radar entirely
and contention was basically down to low single digits. Consider my
home server has 7 disks in a raid5 array, it certainly makes a
difference to me ;-)

4) The last issue. 2.6 already has individual host locks for drivers.
The iorl patch for 2.4 adds the same thing. So, adding the iorl patch
to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6.
Right now it takes some fairly convoluted #ifdef statements to get the
locking right in a driver that supports both 2.4 and 2.6. Adding the
iorl patch allows driver authors to basically state that they don't
support anything prior to whatever version of 2.4 it goes into and
remove a bunch of #ifdef crap.

> which,
> as you said, get reviewed by Jens/James/others is OK.

That's the reason for the bk trees. Anyone can check them out that way.

> What is this mlqueue patchset fixing ? To tell you the truth, I'm not
> aware of the SCSI stack problems.

OK. This is a different problem entirely. The current 2.4 scsi stack
has a problem handling QUEUE_FULL and BUSY status commands. Amongst
other things, it retries the command immediately and will basically hang
the SCSI bus if the low level driver doesn't intervene. Most well
written drivers have workaround code in them (including my aic7xxx_old
driver). The mlqueue patch changes it so that the mid layer implements
delayed retries of QUEUE_FULL and BUSY commands. It also makes the mid
layer recognize the situation where you get a QUEUE_FULL or BUSY command
status with 0 outstanding commands. This used to cause the mid layer to
hang on that device because there would be no command completion to kick
the queue back into gear. Now, with the mlqueue patch, it detects 0
outstanding commands and sets a 1/5 second timer that restarts command
flow after the delay. This solved several IBM bugs in one go. Again,
this is a problem that older drivers with workaround code didn't need,
but newer drivers such as the IBM zfcp (is that right?) and the open
source Emulex driver don't have the standard workarounds for the bug and
were getting hit by it. By fixing the problem properly, a lot of
drivers could actually remove their workaround code and use the mid
layer code. The first version of this patch just solves the hang and
immediate retry problem. Ongoing work will fix the patch to do more
things correctly, such as replaying commands in their original order,
stuff like that.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-20 02:28:43

by Marcelo Tosatti

[permalink] [raw]

Subject: [2.4] scsi per-host lock was Re: smp dead lock of io_request_lock/queue_lock patch

Doug,

Where can I find a copy of uptodate iorl and mlqueue patchsets ?

I cant enjoy the feeling of splitting a global lock at this point in 2.4's
lifetime, but it has been in RedHat's and SuSE's trees for long.

A reason to have it its the dependacy of mlqueue patch. You said most
drivers are handling the BUSY/QUEUE_FULL hangs by themselves, you mention
"IBM zfcp" / Emulex which dont. What other drivers suffer from the same
problem ?

On Sat, 17 Jan 2004, Doug Ledford wrote:

> On Tue, 2004-01-13 at 15:55, Marcelo Tosatti wrote:
> > On Mon, 12 Jan 2004, Doug Ledford wrote:
> >
> > > On Mon, 2004-01-12 at 14:48, Christoph Hellwig wrote:
> > > > On Mon, Jan 12, 2004 at 04:12:30PM +0100, Arjan van de Ven wrote:
> > > > > > as the patch discussed in this thread, i.e. pure (partially
> > > > > > vintage) bugfixes.
> > > > >
> > > > > Both SuSE and Red Hat submit bugfixes they put in the respective trees to
> > > > > marcelo already. There will not be many "pure bugfixes" that you can find in
> > > > > vendor trees but not in marcelo's tree.
> > > >
> > > > I haven't seen SCSI patches sumission for 2.4 from the vendors on linux-scsi
> > > > for ages. In fact I asked Jens & Doug two times whether they could sort out
> > > > the distro patches for the 2.4 stack and post them, but it seems they're busy
> > > > enough with real work so this never happened.
> > >
> > > More or less. But part of it also is that a lot of the patches I've
> > > written are on top of other patches that people don't want (aka, the
> > > iorl patch). I've got a mlqueue patch that actually *really* should go
> > > into mainline because it solves a slew of various problems in one go,
> > > but the current version of the patch depends on some semantic changes
> > > made by the iorl patch. So, sorting things out can sometimes be
> > > difficult. But, I've been told to go ahead and do what I can as far as
> > > getting the stuff out, so I'm taking some time to try and get a bk tree
> > > out there so people can see what I'm talking about. Once I've got the
> > > full tree out there, then it might be possible to start picking and
> > > choosing which things to port against mainline so that they don't depend
> > > on patches like the iorl patch.
> >
> > Hi,
> >
> > Merging "straight" _bugfixes_
>
> OK, I've got some new bk trees established on linux-scsi.bkbits.net.
> There is a 2.4-for-marcelo, which is where I will stick stuff I think is
> ready to go to you. There is a 2.4-drivers. Then there is a
> 2.4-everything. The everything tree will be a place to put any and all
> patches that aren't just experimental patches. So, just about any Red
> Hat SCSI patch is fair game. Same for the mods SuSE has made. Pushing
> changes from here to 2.4-for-marcelo on a case by case basis is more or
> less my plan. Right now, these trees are all just perfect clones of
> Marcelo's tree. As I make changes and commit things, I'll update the
> lists with the new stuff.
>
> > (I think all we agree that merging
> > performance enhancements at this point in 2.4 is not interesting)
>
> Well, I can actually make a fairly impressive argument for the iorl
> patch if you ask me. It's your decision if you want it, but here's my
> take on the issue:
>
> 1) Red Hat has shipped the iorl patch in our 2.4.9 Advanced Server 2.1
> release, our 2.4.18 based Advanced Server 2.1 for IPF release, and our
> current 2.4.21 based RHEL 3 release. So, it's getting *lots* of testing
> outside the community. Including things like all the specweb
> benchmarks, Oracle TPC benchmarks, and all of our customers are using
> this patch. That's a *lot* of real world, beat it to death testing.
>
> 2) As I recall, SuSE has shipped either their own iorl type patch, or
> they used one of our versions of the patch, don't really know which.
> But, I know they've got the same basic functionality. Between 1 and 2,
> that greatly increases the field testing of the iorl patch and greatly
> reduces the breadth of testing of the regular kernel scsi stack.
>
> 3) OK, you can call it a performance patch if you want, but that
> doesn't really do it justice. So, here's a little history. I used the
> lockmeter patch when I was writing the iorl patch to get a clue just how
> much difference it made. Obviously, on single disk systems it makes
> very little difference (but it does make some difference because the
> host controller and scsi_request_fn use different locks even on a single
> device and that does help some). But, on a test machine with 4 CPUs and
> something like 14 disks, without the patch a disk benchmark was using
> 100% of CPU and hitting something like 98% contention on the
> io_request_lock. On the same system with the patch, benchmark CPU usage
> dropped to something like 12% and because we weren't spinning any more
> scsi_request_fn and make_request dropped off the profile radar entirely
> and contention was basically down to low single digits. Consider my
> home server has 7 disks in a raid5 array, it certainly makes a
> difference to me ;-)
>
> 4) The last issue. 2.6 already has individual host locks for drivers.
> The iorl patch for 2.4 adds the same thing. So, adding the iorl patch
> to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6.
> Right now it takes some fairly convoluted #ifdef statements to get the
> locking right in a driver that supports both 2.4 and 2.6. Adding the
> iorl patch allows driver authors to basically state that they don't
> support anything prior to whatever version of 2.4 it goes into and
> remove a bunch of #ifdef crap.

2004-01-20 03:25:10

[permalink] [raw]

Subject: Re: [2.4] scsi per-host lock was Re: smp dead lock of io_request_lock/queue_lock patch

On Mon, 2004-01-19 at 21:19, Marcelo Tosatti wrote:
> Doug,
>
> Where can I find a copy of uptodate iorl and mlqueue patchsets ?

Red Hat's 2.4.21-9.EL kernel? ;-)

Seriously, I've already set up 3 new bk trees on linux-scsi.bkbits.net.
They are 2.4-for-marcelo, 2.4-drivers, and 2.4-everything. Right now
the trees are still just clones of your current tree. The plan,
however, is to sort through all of the scsi patches we put into our
kernels and put them into the various trees. Obviously, the everything
tree will have both the iorl and mlqueue patches plus the other stuff.
The for-marcelo tree is stuff that is agreed to be appropriate for
mainline. The drivers tree is where I'll start sticking driver updates
that come across linux-scsi for the 2.4 kernel.

> I cant enjoy the feeling of splitting a global lock at this point in 2.4's
> lifetime, but it has been in RedHat's and SuSE's trees for long.
>
> A reason to have it its the dependacy of mlqueue patch. You said most
> drivers are handling the BUSY/QUEUE_FULL hangs by themselves, you mention
> "IBM zfcp" / Emulex which dont. What other drivers suffer from the same
> problem ?

Well, that's a hard to answer question. First, the problem only shows
up on drivers that actually enabled tagged queuing and use a reasonably
deep queue depth (normally). So, that automatically strikes out most of
the older ISA and crap hardware drivers since they only do a command at
a time. The better drivers; aic7xxx (new and old), ncr53c8xx, sym1 and
sym2, qla2x00, etc.; work around the problem. It's mainly newer drivers
that vendors are starting to write and who don't know the history of the
broken mid layer handling of QUEUE_FULL and BUSY situations. So, I
cited the two I know of off the top of my head, and there may be others
but I would actually have to look at the source code to identify them
(when 98% of the scsi market is QLogic and NCR/SYM and aic7xxx, those
other drivers fall through the cracks). Of course, one of the other
issues is that each driver that tries to handle this situation itself
does it in a different manner, and the correctness of those varying
implementations is also an issue to consider.

--
Doug Ledford <[email protected]> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606

2004-01-20 07:54:02

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

On Sat, Jan 17 2004, Doug Ledford wrote:
> On Sat, 2004-01-17 at 11:58, Christoph Hellwig wrote:
> > On Sat, Jan 17, 2004 at 08:10:00AM -0500, Doug Ledford wrote:
> > > 4) The last issue. 2.6 already has individual host locks for drivers.
> > > The iorl patch for 2.4 adds the same thing. So, adding the iorl patch
> > > to 2.4 makes it easier to have drivers be the same between 2.4 and 2.6.
> > > Right now it takes some fairly convoluted #ifdef statements to get the
> > > locking right in a driver that supports both 2.4 and 2.6. Adding the
> > > iorl patch allows driver authors to basically state that they don't
> > > support anything prior to whatever version of 2.4 it goes into and
> > > remove a bunch of #ifdef crap.
> >
> > Well, no. For one thing all the iorl patches miss the scsi_assign_lock
> > interface from 2.6 which makes drivers a big ifdef hell (especially
> > as the AS2.1 patch uses a different name for the lock as 3.0), and even
> > if it was there the use of that function is strongly discuraged in 2.6
> > in favour of just using the host_lock.
>
> Yeah, I saw that too. Of course, it's not like that's my fault :-P I
> had the 2.4.9 version of the iorl patch before 2.5 had a host lock, so
> 2.5 *could* have followed the 2.4.9-iorl patch convention of using
> host->lock as the name instead of host->host_lock and then we wouldn't
> be here. But, because 2.6 uses host->host_lock, and because usage of

First of all, ->lock is a _bad_ name. And secondly, we probably did this
at around the same time (the 2.5 block code was already "done" by the
time 2.5 opened, if you recall it was one of the first things merged in
2.5.1-pre1). The fact that I chose the better name (which is really a
rare incident, my names typically suck) is just history :-)

> the scsi_assign_lock() is discouraged, I made the iorl patch in RHEL3
> follow the 2.6 convention of using host->host_lock. In hindsight, I
> should have just followed the 2.4.9-iorl patch convention. Then a
> driver could have just done this:
>
> #ifdef SCSI_HAS_HOST_LOCK
> #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,0)
> adapter->lock_ptr = &adapter->lock;
> host->lock = &adapter->lock;
> #else
> adapter->lock_ptr = &adapter->lock;
> host->host_lock = &adapter->lock;
> #endif
> #else
> adapter->lock_ptr = &io_request_lock;
> #endif
>
> Then you just always use adapter->lock_ptr for spin locks in the driver
> and you magically work in all kernel releases. Now, by going to
> host->host_lock in 2.4 we get rid of one of the if statements. This
> isn't impossible to do if both Red Hat and SuSE just release their next
> update kernel with host->lock changed to host->host_lock.

I can certainly help make that happen, I completely agree it's needed.

--
Jens Axboe

2004-01-25 00:32:06

by Kurt Garloff

[permalink] [raw]

Subject: Re: smp dead lock of io_request_lock/queue_lock patch

Doug,

On Sat, Jan 17, 2004 at 08:10:00AM -0500, Doug Ledford wrote:
> On Tue, 2004-01-13 at 15:55, Marcelo Tosatti wrote:
> > Merging "straight" _bugfixes_
>
> OK, I've got some new bk trees established on linux-scsi.bkbits.net.
> There is a 2.4-for-marcelo, which is where I will stick stuff I think is
> ready to go to you. There is a 2.4-drivers. Then there is a
> 2.4-everything. The everything tree will be a place to put any and all
> patches that aren't just experimental patches. So, just about any Red
> Hat SCSI patch is fair game. Same for the mods SuSE has made.

Not everything we have is straight bugfixes :-\

> Pushing
> changes from here to 2.4-for-marcelo on a case by case basis is more or
> less my plan. Right now, these trees are all just perfect clones of
> Marcelo's tree. As I make changes and commit things, I'll update the
> lists with the new stuff.

Find the collection of changes we've done to SCSI scanning at
http://www.suse.de/~garloff/linux/scsi-scan/scsi-scanning.html
(currently syncing, might take an hour before the patches are visible)

It contains
* various blacklist updates
* various new parameters that influence the way SCSI scanning is done
(one of the areas that generated amazingly many support calls and
HW vendor requests)
* a refactoring of the code to make it resemble 2.6
* support for REPORT_LUNS (backport from 2.6 as well)

The amount of parameters is larger than in 2.6, you can also instruct
the code not to use REPORT_LUNS at all, you can blacklist devices
for REPORT_LUNS (BLIST_NOREPLUN) and you can tell the code to try
REPORT_LUNS on SCSI-2 devices (if the host adapter reports to support
more than 8 LUNs, so you avoid the usual USB screwups).

I'd certainly appreciate if this is merged into more 2.4 trees and I'll
try to find the time to push some improvements into 2.6. If anybody is
faster, I would not mind either ;-)

Destructive criticism appreciated ;-)

Doug, would you mind to post the BUSY and QUEUE_FULL fixes for review?

Regards,
--
Kurt Garloff <[email protected]> Cologne, DE
SUSE LINUX AG, Nuernberg, DE SUSE Labs (Head)

Attachments:

(No filename) (2.12 kB)
(No filename) (189.00 B)
Download all attachments

2004-03-08 21:27:22