Subject: [PATCH] virtio_blk: use noop elevator by default

Hi Rusty,

Would it make sense to use noop by default? After all we do not know
what is behind the backend driver and the hypervisor is likely to do its
own scheduling anyway. I guess this is the reason the Xen guys took this
approach.

What do you think about the patch below?

- Fernando

---

From: Fernando Luis Vazquez Cao <[email protected]>
Subject: [PATCH] virtio_blk: use noop elevator by default

Using the noop elevator by default seems to be the safest bet because we
do not know what is behind the backend driver, and the hypervisor is
likely to do its own scheduling anyway.

Signed-off-by: Fernando Luis Vazquez Cao <[email protected]>
---

diff -urNp linux-2.6.27-rc4/drivers/block/virtio_blk.c linux-2.6.27-rc4-fixes/drivers/block/virtio_blk.c
--- linux-2.6.27-rc4/drivers/block/virtio_blk.c 2008-08-26 21:26:01.000000000 +0900
+++ linux-2.6.27-rc4-fixes/drivers/block/virtio_blk.c 2008-08-26 21:22:03.000000000 +0900
@@ -237,6 +237,8 @@ static int virtblk_probe(struct virtio_d
goto out_put_disk;
}

+ elevator_init(rq, "noop");
+
if (index < 26) {
sprintf(vblk->disk->disk_name, "vd%c", 'a' + index % 26);
} else if (index < (26 + 1) * 26) {


2008-08-26 14:39:16

by Jens Axboe

Subject: Re: [PATCH] virtio_blk: use noop elevator by default

On Tue, Aug 26 2008, Fernando Luis Vázquez Cao wrote:
> Hi Rusty,
>
> Would it make sense to use noop by default? After all we do not know
> what is behind the backend driver and the hypervisor is likely to do its
> own scheduling anyway. I guess this is the reason the Xen guys took this
> approach.
>
> What do you think about the patch below?

I plan to include some variant of disk profiling for 2.6.28 which will
let e.g. CFQ turn off idling for such device types; I think that is a
better solution.
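
A minimal sketch of the kind of check that would enable, assuming the
profiling surfaces as a per-queue flag the elevator can test (the
function body below is illustrative, not the final patch):

	/* Illustrative only: bail out of idling when the queue says the
	 * device gains nothing from it (SSDs, paravirt backends, ...). */
	static void cfq_arm_slice_timer(struct cfq_data *cfqd)
	{
		if (blk_queue_nonrot(cfqd->queue))
			return;

		/* ... normal idling logic would follow here ... */
	}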

--
Jens Axboe

Subject: Re: [PATCH] virtio_blk: use noop elevator by default

On Tue, 2008-08-26 at 16:39 +0200, Jens Axboe wrote:
> On Tue, Aug 26 2008, Fernando Luis Vázquez Cao wrote:
> > Hi Rusty,
> >
> > Would it make sense to use noop by default? After all we do not know
> > what is behind the backend driver and the hypervisor is likely to do its
> > own scheduling anyway. I guess this is the reason the Xen guys took this
> > approach.
> >
> > What do you think about the patch below?
>
> I plan to include some variant of disk profiling for 2.6.28 which will
> let e.g. CFQ turn off idling for such device types; I think that is a
> better solution.

Hi Jens,

That is good news. With the proliferation of intelligent disk
controllers and SSDs, the disk profiling approach seems to be the right
way to go in general, and I think it was badly needed.

From your example, my wild guess is that disk profiling will be an I/O
controller-specific auto-tuning feature, not new functionality in the
generic elevator layer. Is this interpretation correct? Would it make
sense in some cases to switch elevators automatically depending on the
characteristics of the underlying device instead (e.g. when none of the
extra features CFQ provides are needed)?

I would like to take a look at those patches, so I peeked into your git
tree, but I could not find them (I probably chose the wrong branches).
Are they accessible through your git repository on kernel.org?

Thanks!

Fernando

Subject: Re: [PATCH] virtio_blk: use noop elevator by default

On Tue, 2008-08-26 at 16:39 +0200, Jens Axboe wrote:
> On Tue, Aug 26 2008, Fernando Luis Vázquez Cao wrote:
> > Hi Rusty,
> >
> > Would it make sense to use noop by default? After all we do not know
> > what is behind the backend driver and the hypervisor is likely to do its
> > own scheduling anyway. I guess this is the reason the Xen guys took this
> > approach.
> >
> > What do you think about the patch below?
>
> I plan to include some variant of disk profiling for 2.6.28 which will
> let e.g. CFQ turn off idling for such device types; I think that is a
> better solution.

Thank you for the advice. I will be sending patches that take advantage
of the profiling capability merged for 2.6.28.

Subject: [PATCH 1/3] block: add queue flag for paravirt frontend drivers

As is the case with SSD devices, we do not want to idle in AS/CFQ when
the block device is a paravirt front-end driver. This patch adds a flag
(QUEUE_FLAG_VIRT) which should be used by front-end drivers such as
virtio_blk and xen-blkfront to indicate a paravirtualized device.

Signed-off-by: Fernando Luis Vazquez Cao <[email protected]>
---

diff -urNp linux-2.6.28-rc2-orig/include/linux/blkdev.h linux-2.6.28-rc2/include/linux/blkdev.h
--- linux-2.6.28-rc2-orig/include/linux/blkdev.h 2008-10-27 17:41:58.000000000 +0900
+++ linux-2.6.28-rc2/include/linux/blkdev.h 2008-10-27 17:34:49.000000000 +0900
@@ -449,6 +449,7 @@ struct request_queue
#define QUEUE_FLAG_FAIL_IO 12 /* fake timeout */
#define QUEUE_FLAG_STACKABLE 13 /* supports request stacking */
#define QUEUE_FLAG_NONROT 14 /* non-rotational device (SSD) */
+#define QUEUE_FLAG_VIRT QUEUE_FLAG_NONROT /* paravirt device */

static inline int queue_is_locked(struct request_queue *q)
{
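
Note that the elevators need no new code to consume this flag: because
QUEUE_FLAG_VIRT aliases QUEUE_FLAG_NONROT, the existing non-rotational
predicate already reports paravirt queues. For reference (mirroring the
blkdev.h definition of this era; treat the exact form as illustrative):

	/* True for SSDs and, via the alias above, paravirt devices. */
	#define blk_queue_nonrot(q)	test_bit(QUEUE_FLAG_NONROT, &(q)->queue_flags)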

Subject: [PATCH 2/3] virtio_blk: set queue paravirt flag

As a paravirt front-end driver, virtio_blk is not a rotational device,
so we want to avoid idling in AS/CFQ. Tell the block layer about this.

Signed-off-by: Fernando Luis Vazquez Cao <[email protected]>
---

diff -urNp linux-2.6.28-rc2-orig/drivers/block/virtio_blk.c linux-2.6.28-rc2/drivers/block/virtio_blk.c
--- linux-2.6.28-rc2-orig/drivers/block/virtio_blk.c 2008-10-27 17:41:53.000000000 +0900
+++ linux-2.6.28-rc2/drivers/block/virtio_blk.c 2008-10-27 17:34:32.000000000 +0900
@@ -237,6 +237,8 @@ static int virtblk_probe(struct virtio_d
goto out_put_disk;
}

+ queue_flag_set_unlocked(QUEUE_FLAG_VIRT, vblk->disk->queue);
+
if (index < 26) {
sprintf(vblk->disk->disk_name, "vd%c", 'a' + index % 26);
} else if (index < (26 + 1) * 26) {

Subject: [PATCH 3/3] xen-blkfront: set queue paravirt flag

Xen's blkfront sets noop as the default I/O scheduler at initialization
time to avoid elevator overheads such as idling, but with the advent of
basic disk profiling capabilities this is not necessary anymore. We
should just tell the block layer that we are a paravirt front-end driver
and the elevator will automatically make the necessary adjustments.

Signed-off-by: Fernando Luis Vazquez Cao <[email protected]>
---

diff -urNp linux-2.6.28-rc2-orig/drivers/block/xen-blkfront.c linux-2.6.28-rc2/drivers/block/xen-blkfront.c
--- linux-2.6.28-rc2-orig/drivers/block/xen-blkfront.c 2008-10-27 17:41:53.000000000 +0900
+++ linux-2.6.28-rc2/drivers/block/xen-blkfront.c 2008-10-27 17:38:59.000000000 +0900
@@ -343,7 +343,7 @@ static int xlvbd_init_blk_queue(struct g
if (rq == NULL)
return -1;

- elevator_init(rq, "noop");
+ queue_flag_set_unlocked(QUEUE_FLAG_VIRT, rq);

/* Hard sector size and max sectors impersonate the equiv. hardware. */
blk_queue_hardsect_size(rq, sector_size);

2008-10-27 12:57:40

by Jens Axboe

Subject: Re: [PATCH 1/3] block: add queue flag for paravirt frontend drivers

On Mon, Oct 27 2008, Fernando Luis Vázquez Cao wrote:
> As is the case with SSD devices, we do not want to idle in AS/CFQ when
> the block device is a paravirt front-end driver. This patch adds a flag
> (QUEUE_FLAG_VIRT) which should be used by front-end drivers such as
> virtio_blk and xen-blkfront to indicate a paravirtualized device.

All three patches look fine, although we could just reuse
QUEUE_FLAG_NONROT directly. But I agree it makes sense to make the
distinction, so I've just applied 1-3.

--
Jens Axboe

2008-11-04 23:23:54

by Jeremy Fitzhardinge

Subject: Re: [PATCH 1/3] block: add queue flag for paravirt frontend drivers

Jens Axboe wrote:
> On Mon, Oct 27 2008, Fernando Luis Vázquez Cao wrote:
>
>> As is the case with SSD devices, we do not want to idle in AS/CFQ when
>> the block device is a paravirt front-end driver. This patch adds a flag
>> (QUEUE_FLAG_VIRT) which should be used by front-end drivers such as
>> virtio_blk and xen-blkfront to indicate a paravirtualized device.
>>
>
> All three patches look fine, although we could just reuse
> QUEUE_FLAG_NONROT directly. But I agree it makes sense to make the
> distinction, so I've just applied 1-3.
>

I guess in theory you could imagine that the virtual device is mapped
directly onto a physical device, and the host OS does no scheduling, in
which case it would be appropriate for the guest to do the work. But
otherwise I think this makes sense.

J

2008-11-05 09:22:08

by Jens Axboe

Subject: Re: [PATCH 1/3] block: add queue flag for paravirt frontend drivers

On Tue, Nov 04 2008, Jeremy Fitzhardinge wrote:
> Jens Axboe wrote:
> >On Mon, Oct 27 2008, Fernando Luis Vázquez Cao wrote:
> >
> >>As is the case with SSD devices, we do not want to idle in AS/CFQ when
> >>the block device is a paravirt front-end driver. This patch adds a flag
> >>(QUEUE_FLAG_VIRT) which should be used by front-end drivers such as
> >>virtio_blk and xen-blkfront to indicate a paravirtualized device.
> >>
> >
> >All three patches look fine, although we could just reuse
> >QUEUE_FLAG_NONROT directly. But I agree it makes sense to make the
> >distinction, so I've just applied 1-3.
> >
>
> I guess in theory you could imagine that the virtual device is mapped
> directly onto a physical device, and the host OS does no scheduling, in
> which case it would be appropriate for the guest to do the work. But
> otherwise I think this makes sense.

For that specific case, it should just not set the flag.

--
Jens Axboe

Subject: Re: [PATCH 1/3] block: add queue flag for paravirt frontend drivers

On Wed, 2008-11-05 at 10:20 +0100, Jens Axboe wrote:
> On Tue, Nov 04 2008, Jeremy Fitzhardinge wrote:
> > Jens Axboe wrote:
> > >On Mon, Oct 27 2008, Fernando Luis Vázquez Cao wrote:
> > >
> > >>As is the case with SSD devices, we do not want to idle in AS/CFQ when
> > >>the block device is a paravirt front-end driver. This patch adds a flag
> > >>(QUEUE_FLAG_VIRT) which should be used by front-end drivers such as
> > >>virtio_blk and xen-blkfront to indicate a paravirtualized device.
> > >>
> > >
> > >All three patches look fine, although we could just reuse
> > >QUEUE_FLAG_NONROT directly. But I agree it makes sense to make the
> > >distinction, so I've just applied 1-3.
> > >
> >
> > I guess in theory you could imagine that the virtual device is mapped
> > directly onto a physical device, and the host OS does no scheduling, in
> > which case it would be appropriate for the guest to do the work. But
> > otherwise I think this makes sense.
>
> For that specific case, it should just not set the flag.

When the virtual device is mapped directly onto a physical device, the
host OS or the hypervisor could notify the guest of this fact through
the PCI configuration space (the bytes reserved for vendor-defined
purposes look like a good candidate to me). In that case, the guest-side
driver would just check the configuration space and set or clear the
flag accordingly.
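
A rough sketch of the guest-side check I have in mind follows; the
vendor capability layout, the byte offset, and the VBLK_PHYS_BACKED
value are hypothetical, only the PCI and queue-flag helpers are
existing kernel APIs:

	#include <linux/pci.h>
	#include <linux/blkdev.h>

	/* Hypothetical: value the host stores in its vendor-defined
	 * capability when the backing store is a passed-through physical
	 * disk that the host does not schedule itself. */
	#define VBLK_PHYS_BACKED	0x01

	static void vblk_tune_queue(struct pci_dev *pdev, struct request_queue *q)
	{
		int pos;
		u8 backing = 0;

		/* Look for a vendor-defined capability in config space. */
		pos = pci_find_capability(pdev, PCI_CAP_ID_VNDR);
		if (pos) {
			/* The "backing type" byte offset is made up here. */
			pci_read_config_byte(pdev, pos + 3, &backing);
			if (backing == VBLK_PHYS_BACKED) {
				/* Host does no scheduling of its own, so
				 * idling in the guest still pays off. */
				queue_flag_clear_unlocked(QUEUE_FLAG_VIRT, q);
				return;
			}
		}

		/* Default: assume the host schedules for us; skip idling. */
		queue_flag_set_unlocked(QUEUE_FLAG_VIRT, q);
	}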

The good thing about this approach is that it can be used with both
paravirt drivers and regular drivers (which is important for fully
virtualized guests).

I have just started implementing this idea, but it would be great if
you could let me know your take on it.

- Fernando