2006-03-26 06:41:08

by L A Walsh

Subject: Block I/O Schedulers: Can they be made selectable/device? @runtime?

Is it still the case that block I/O schedulers (AS, CFQ, etc.)
are only selectable at boot time?

How difficult would it be to allow multiple, concurrent I/O
schedulers running on different block devices?

How close is the kernel to "being there"? I.e. if someone has a
"regular" hard disk and a high-end solid state disk, can
Linux allow whichever algorithm is best for the hardware?
(or applications if they are run on separate block devices)?

-l





2006-03-26 07:08:00

by Valdis Klētnieks

Subject: Re: Block I/O Schedulers: Can they be made selectable/device? @runtime?

On Sat, 25 Mar 2006 22:41:00 PST, Linda Walsh said:
> Is it still the case that block I/O schedulers (AS, CFQ, etc.)
> are only selectable at boot time?

Hasn't been for quite some time. CPU schedulers are stuck at boot time, even
if you have the 'plugsched' patch (and if you don't, you're stuck with the one
scheduler in-tree currently). There was a patch posted a few days ago
that allowed on-the-fly changing of plugsched, but that's still too bleeding
edge even for me... ;)

> How difficult would it be to allow multiple, concurrent I/O
> schedulers running on different block devices?

From my /etc/rc.local:

echo cfq > /sys/block/hda/queue/scheduler
echo noop > /sys/block/hdb/queue/scheduler

(hda is a real disk with ext3 partitions on it, hdb is a DVD/CD/RW that almost
always has exactly one process reading or writing to it at a given time, so doing
things in the order requested is just fine).

Simple enough? ;)

(This *does* require that you built more than one scheduler, and possibly
that you make sure they're loaded if you built them as modules...)
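
Since a write to the scheduler file only makes sense for a scheduler that is built in or loaded, an rc.local could check the advertised list before switching. The following is a minimal sketch, not something from the original mail: the helper names sched_available and set_sched, and the optional sysfs-root argument, are assumptions made for illustration.

```shell
# Hypothetical helper: succeed if scheduler $1 appears in the list $2,
# where $2 has the format printed by /sys/block/DEV/queue/scheduler,
# e.g. "noop anticipatory deadline [cfq]".
sched_available() {
    want=$1
    # Strip the brackets that mark the active scheduler, then compare words.
    for name in $(printf '%s\n' "$2" | tr -d '[]'); do
        [ "$name" = "$want" ] && return 0
    done
    return 1
}

# Hypothetical helper: switch device $1 to scheduler $2 if it is offered.
# $3 optionally overrides the sysfs root (an assumption, handy for dry runs).
set_sched() {
    file="${3:-/sys}/block/$1/queue/scheduler"
    [ -r "$file" ] || { echo "no scheduler file for $1" >&2; return 1; }
    if sched_available "$2" "$(cat "$file")"; then
        echo "$2" > "$file"
    else
        echo "$2 not built in or not loaded for $1" >&2
        return 1
    fi
}
```

With these defined, "set_sched hda cfq" and "set_sched hdb noop" would do the same as the two echo lines above, but fail with a message when the requested scheduler is missing.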



2006-03-27 03:20:32

by L A Walsh

Subject: Re: Block I/O Schedulers: Can they be made selectable/device? @runtime?

Valdis Kletnieks wrote:
> Hasn't been for quite some time.
> From my /etc/rc.local:
>
Great...the file "Documentation/as_iosched.txt" is apparently
out of date.
> echo cfq > /sys/block/hda/queue/scheduler
> echo noop > /sys/block/hdb/queue/scheduler
>
> (hda is a real disk with ext3 partitions on it, hdb is a DVD/CD/RW that almost
> always has exactly one process reading or writing to it at a given time, so doing
> things in the order requested is just fine).
>
> Simple enough? ;)
>
---
Sounds fine. I don't suppose it's too much to ask, but where
should I have found the updated information? :-)

-l



2006-03-27 06:17:45

by Randy Dunlap

Subject: Re: Block I/O Schedulers: Can they be made selectable/device? @runtime?

On Sun, 26 Mar 2006 19:20:27 -0800 Linda Walsh wrote:

> Valdis Kletnieks wrote:
> > Hasn't been for quite some time.
> > From my /etc/rc.local:
> >
> Great...the file "Documentation/as_iosched.txt" is apparently
> out of date.

What kernel version are you looking at?
That file is now Documentation/block/as_iosched.txt .


> > echo cfq > /sys/block/hda/queue/scheduler
> > echo noop > /sys/block/hdb/queue/scheduler
> >
> > (hda is a real disk with ext3 partitions on it, hdb is a DVD/CD/RW that almost
> > always has exactly one process reading or writing to it at a given time, so doing
> > things in the order requested is just fine).
> >
> > Simple enough? ;)
> >
> ---
> Sounds fine. I don't suppose it's too much to ask, but where
> should I have found the updated information? :-)

Patches accepted... Please summarize what you have found, even if not in
patch format (and I'll make it a patch).

---
~Randy

2006-03-27 08:41:21

by Valdis Klētnieks

Subject: [PATCH] 2.6.16 Block I/O Schedulers - document runtime selection

On Sun, 26 Mar 2006 22:19:52 PST, "Randy.Dunlap" said:

> Patches accepted... Please summarize what you have found, even if not in
> patch format (and I'll make it a patch).

From: Valdis Kletnieks <[email protected]>

We added the ability to change a block device's IO elevator scheduler both
at kernel boot and on-the-fly, but we only documented the elevator= boot
parameter. Add a quick how-to on doing it on the fly.

Signed-off-by: Valdis Kletnieks <[email protected]>
---
--- linux-2.6.16-mm1/Documentation/block/switching-sched.txt.new 2006-03-27 03:26:25.000000000 -0500
+++ linux-2.6.16-mm1/Documentation/block/switching-sched.txt 2006-03-27 03:33:39.000000000 -0500
@@ -0,0 +1,22 @@
+As of the Linux 2.6.mumble kernel, it is now possible to change the
+IO scheduler for a given block device on the fly (thus making it possible,
+for instance, to set the CFQ scheduler for the system default, but
+set a specific device to use the anticipatory or noop schedulers - which
+can improve that device's throughput).
+
+To set a specific scheduler, simply do this:
+
+echo SCHEDNAME > /sys/block/DEV/queue/scheduler
+
+where SCHEDNAME is the name of a defined IO scheduler, and DEV is the
+device name (hda, hdb, sda, or whatever you happen to have).
+
+The list of defined schedulers can be found by simply doing
+a "cat /sys/block/DEV/queue/scheduler" - the list of valid names
+will be displayed, with the currently selected scheduler in brackets:
+
+# cat /sys/block/hda/queue/scheduler
+noop anticipatory deadline [cfq]
+# echo anticipatory > /sys/block/hda/queue/scheduler
+# cat /sys/block/hda/queue/scheduler
+noop [anticipatory] deadline cfq
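
A script that needs the current setting can pick the bracketed entry out of that listing. The sketch below is an illustration only and is not part of the patch: the names active_sched and report_scheds, and the optional sysfs-root argument, are assumptions.

```shell
# Hypothetical helper: print the bracketed (active) scheduler name from a
# line such as "noop anticipatory deadline [cfq]". Prints nothing if the
# line contains no bracketed entry.
active_sched() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active scheduler for every block device that exposes one.
# $1 optionally overrides the sysfs root (an assumption, used for testing).
report_scheds() {
    for f in "${1:-/sys}"/block/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f%/queue/scheduler}
        printf '%s: %s\n' "${dev##*/}" "$(active_sched "$(cat "$f")")"
    done
}
```

Running report_scheds with no argument walks /sys/block and prints one "DEV: SCHEDNAME" line per device.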





2006-03-29 04:10:31

by Tejun Heo

Subject: Re: Block I/O Schedulers: Can they be made selectable/device? @runtime?

Linda Walsh wrote:
> Is it still the case that block I/O schedulers (AS, CFQ, etc.)
> are only selectable at boot time?
>
> How difficult would it be to allow multiple, concurrent I/O
> schedulers running on different block devices?
>
> How close is the kernel to "being there"? I.e. if someone has a
> "regular" hard disk and a high-end solid state disk, can
> Linux allow whichever algorithm is best for the hardware?
> (or applications if they are run on separate block devices)?
>

Hello, Linda, Jens.

Actually, I've been thinking about related stuff for some time. E.g., it
doesn't make much sense to use any scheduler other than noop for SSDs
and it also doesn't make much sense to plug requests for milliseconds to
such devices. So, what I'm currently thinking is...

* Give LLDD a chance to say that it doesn't need fancy scheduling.

* Automagically tune plugging time. We can maintain a running average of
request turn-around time and use a fraction of it to plug the device. This
should give good enough merging behavior while not adding excessive
delay to seek time.

* Don't leave devices with queue depth > 1 idle. For queued
devices, we can push the first request fast so that the head moves into
proximity of what would probably follow. So, don't plug the first
request, plug from the second.
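
The arithmetic behind the plug-time idea can be sketched in a few lines. This is only an illustration in shell integer math, working in microseconds; a real implementation would live in the block layer. The 1/8 smoothing weight, the 1/4 fraction, and the function names are all assumptions, not values proposed here.

```shell
# Hypothetical illustration of the plug-time arithmetic, in integer
# microseconds. The 1/8 weight and the 1/4 fraction are assumptions.

# Fold one new turn-around sample ($2) into the running average ($1)
# using an exponentially weighted moving average.
update_avg() {
    echo $(( $1 + ($2 - $1) / 8 ))
}

# Plug the device for a fraction of the average turn-around time.
plug_time() {
    echo $(( $1 / 4 ))
}
```

A caller would feed each completed request's turn-around time through update_avg and use plug_time on the result, so a device whose requests complete quickly ends up with a correspondingly short plug delay.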

Any gotchas I've missed?

Thanks.

--
tejun

2006-03-29 07:24:30

by Jens Axboe

Subject: Re: Block I/O Schedulers: Can they be made selectable/device? @runtime?

On Wed, Mar 29 2006, Tejun Heo wrote:
> Linda Walsh wrote:
> >Is it still the case that block I/O schedulers (AS, CFQ, etc.)
> >are only selectable at boot time?
> >
> >How difficult would it be to allow multiple, concurrent I/O
> >schedulers running on different block devices?
> >
> >How close is the kernel to "being there"? I.e. if someone has a
> >"regular" hard disk and a high-end solid state disk, can
> >Linux allow whichever algorithm is best for the hardware?
> >(or applications if they are run on separate block devices)?
> >
>
> Hello, Linda, Jens.
>
> Actually, I've been thinking about related stuff for some time. E.g., it
> doesn't make much sense to use any scheduler other than noop for SSDs
> and it also doesn't make much sense to plug requests for milliseconds to
> such devices. So, what I'm currently thinking is...
>
> * Give LLDD a chance to say that it doesn't need fancy scheduling.

Something I've been meaning to do for ages as well. I figure the
simplest way is to define a small set of profiles, along the lines of

enum {
        BLK_QUEUE_TYPE_HD,      /* rotating hard disk */
        BLK_QUEUE_TYPE_SS,      /* solid state device */
        BLK_QUEUE_TYPE_CDROM,   /* optical drive */
};

Make BLK_QUEUE_TYPE_HD the default setting, and then let a driver set
the type with something like:

q = blk_init_queue(rfn, lock);
blk_set_queue_type(q, BLK_QUEUE_TYPE_SS);
...

and be done with it.
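
Until something like that exists in the kernel, the same profile idea can be approximated from userspace on top of the sysfs interface. A minimal sketch: the profile names mirror the enum above, while the function name and the scheduler chosen for each profile are assumptions drawn from this thread (cfq for a regular disk, noop for solid state and for a CD/DVD drive), not kernel behaviour.

```shell
# Hypothetical mapping from a queue profile to an elevator name. The
# choices are assumptions taken from the discussion, not kernel policy.
sched_for_profile() {
    case $1 in
        hd)    echo cfq  ;;   # seek-bound disk: let the elevator sort
        ss)    echo noop ;;   # solid state: no seek cost to optimise
        cdrom) echo noop ;;   # usually a single reader or writer
        *)     echo "unknown profile: $1" >&2; return 1 ;;
    esac
}
```

The output can then be written to /sys/block/DEV/queue/scheduler for the device in question, where DEV is a placeholder for the device name.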

> * Automagically tune plugging time. We can maintain a running average of
> request turn-around time and use a fraction of it to plug the device. This
> should give good enough merging behavior while not adding excessive
> delay to seek time.

Sounds like too much work for little (or zero) benefit. The current
heuristics are a little rough, and if you can show a tangible benefit
from actually looking/calculating this stuff, then we can talk :-)

> * Don't leave devices with queue depth > 1 idle. For queued
> devices, we can push the first request fast so that the head moves into
> proximity of what would probably follow. So, don't plug the first
> request, plug from the second.

It's a trade-off: if the next I/O is mergeable, it will still be a loss. But
generally I like the idea!

--
Jens Axboe