1999-03-20 22:39:14

by Linus Torvalds

Subject: Re: disk head scheduling

On Sat, 20 Mar 1999, Gerard Roudier wrote:
> The kernel IO-request queue is one queue and a device TASK SET is also one
> queue, the SCSI sub-system passes requests between the kernel request
> queue and device queues.

The kernel IO-request queue is NOT one queue.

This is a common misunderstanding, and I'm really fed up with it.

The generic kernel low-level driver interface was designed to have
multiple request queues, and the fact is that sadly the SCSI layer
doesn't understand this, so the SCSI layer uses just one queue.

The IDE driver is much better in this regard, for example. The IDE driver
uses one queue per controller, rather than one global queue. That means
that the generic code doesn't waste time trying to schedule requests for
different controllers; it only schedules them on a per-controller basis.

There is indeed a global upper limit of total requests that the kernel
will queue up, but that's a separate issue of memory management, and is
actually just a single define. It means that all IO requests are allocated
from the same pool of request entries, but that does NOT MEAN, and has
NEVER MEANT that there is only one queue.
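
As a rough illustration of the distinction drawn above (a simplified
sketch, not the actual kernel source; every name and size here is made up
for the example), a single capped pool of request entries can feed any
number of independent queues:

```c
#include <stddef.h>

#define NR_REQUEST 8   /* hypothetical global cap: just one define */
#define NR_QUEUES  2   /* hypothetical: e.g. one queue per controller */

struct request {
    int in_use;              /* entry is currently allocated */
    unsigned long sector;    /* starting sector of the request */
    struct request *next;    /* next request on the same queue */
};

/* The only global part: one pool shared by every queue. */
static struct request all_requests[NR_REQUEST];

/* Independent queue heads; each one can be sorted and serviced alone. */
static struct request *queue_head[NR_QUEUES];

/* Take a free entry from the shared pool, or NULL once the cap is hit. */
static struct request *get_request(void)
{
    for (size_t i = 0; i < NR_REQUEST; i++) {
        if (!all_requests[i].in_use) {
            all_requests[i].in_use = 1;
            all_requests[i].next = NULL;
            return &all_requests[i];
        }
    }
    return NULL;
}

/* Put a request on queue q. Returns 0, or -1 if the pool is exhausted. */
static int add_request(int q, unsigned long sector)
{
    struct request *req = get_request();

    if (!req)
        return -1;
    req->sector = sector;
    req->next = queue_head[q];
    queue_head[q] = req;
    return 0;
}

/* Number of requests currently linked on queue q. */
static int queue_length(int q)
{
    int n = 0;

    for (struct request *r = queue_head[q]; r; r = r->next)
        n++;
    return n;
}
```

The cap limits how many requests exist in total; it says nothing about how
many queues those requests are spread over.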

Sorry for shouting, but this is one of my major gripes with the SCSI
layer: it was just badly designed. And now that bad design means that
driver writers believe that the generic kernel is doing something stupid.
It isn't. It's the SCSI layer that is stupid, and it wouldn't have to be.

Just to give you an idea: right now the generic disk interface layer in
the kernel sees only one queue for all SCSI devices, which means that it
tries to sort all the requests according to that. That obviously fails: in
the presence of multiple disks, the sorting criteria just do not make all
that much sense.

So the generic layer spends time sorting the requests onto one queue,
doesn't do a good job because of that, and then the SCSI layer spends MORE
time unsorting the requests.
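
A small sketch of why one shared queue sorts badly. It assumes, purely for
illustration, an ordering by sector number alone; the real ordering rule
is not reproduced here, and all names are hypothetical. Requests for two
disks end up interleaved on the shared queue, so the lower layer has to
pick them apart again per device:

```c
#include <stdlib.h>

struct sketch_req {
    int dev;                /* which disk the request is for */
    unsigned long sector;   /* starting sector on that disk */
};

/* Simplified shared-queue ordering: sector only, device ignored. */
static int by_sector(const void *a, const void *b)
{
    const struct sketch_req *x = a, *y = b;

    return (x->sector > y->sector) - (x->sector < y->sector);
}

/* What per-device queues give you: group by device, then by sector. */
static int by_dev_then_sector(const void *a, const void *b)
{
    const struct sketch_req *x = a, *y = b;

    if (x->dev != y->dev)
        return (x->dev > y->dev) - (x->dev < y->dev);
    return by_sector(a, b);
}

/* Count how often neighbouring requests target different disks. */
static int device_switches(const struct sketch_req *q, size_t n)
{
    int switches = 0;

    for (size_t i = 1; i < n; i++)
        if (q[i].dev != q[i - 1].dev)
            switches++;
    return switches;
}
```

Sorting four requests at sectors 100, 120, 140 and 160 that alternate
between two disks with by_sector() leaves every neighbouring pair on a
different disk; by_dev_then_sector() leaves a single boundary between the
two disks.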

This means, for example, that while the kernel is making up a request for
one SCSI disk and has "plugged" the request queue in order to give just
one large request to the disk, that "plug" will now plug all other SCSI
devices too - even though that wasn't the reason for doing the plugging.
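
The plugging side effect can be sketched the same way. This is a toy
model under assumed names, not the kernel's plugging code: a plug is
modelled as a flag on a queue, so whatever shares the queue shares the
plug.

```c
#include <stdbool.h>

#define NR_DISKS 2   /* hypothetical number of disks */

struct sketch_queue {
    bool plugged;   /* while set, nothing on this queue is started */
    int pending;    /* requests waiting on this queue */
};

/* Layout A: one queue shared by every disk. */
static struct sketch_queue shared_queue;

/* Layout B: one queue per disk (or per controller). */
static struct sketch_queue per_disk_queue[NR_DISKS];

/* I/O may start only if the queue has work and is not plugged. */
static bool can_start(const struct sketch_queue *q)
{
    return q->pending > 0 && !q->plugged;
}

/* Layout A: the disk number is irrelevant, every disk sees the same plug. */
static bool shared_can_start(int disk)
{
    (void)disk;
    return can_start(&shared_queue);
}

/* Layout B: only the plug on this disk's own queue matters. */
static bool split_can_start(int disk)
{
    return can_start(&per_disk_queue[disk]);
}
```

With the shared layout, plugging while a request for disk 0 is being built
also holds back disk 1. With per-disk queues, disk 1 keeps going.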

With just one controller, this usually isn't a problem, but if you have
multiple controllers you can only blame the SCSI layer if it doesn't scale
up nicely.

This is one reason why people like Leonard decide to not use the SCSI
layer at all for their newer drivers.