Subject: request->ioprio

(CCing LKML)

Hi Jens, Rusty,

Trying to implement i/o tracking all the way up to the page cache (so
that cfq and the future cgroup-based I/O controllers can schedule
buffered I/O properly) I noticed that struct request's ioprio is
initialized but never used for I/O scheduling purposes. Indeed there
seems to be one single user of this member: virtio_blk. Virtio uses
struct request's ioprio in the request() function of the virtio block
driver, which just copies the ioprio value to the output header of
virtblk_req.

Is this the intended use of struct request's ioprio? Is it OK for device
drivers to use it? If the answer to the previous two questions is no, I
would like to send some clean-up patches.

- Fernando


2008-08-06 17:24:20

by Divyesh Shah

Subject: Re: request->ioprio


On Aug 6, 2008, at 2:32 AM, Fernando Luis Vázquez Cao wrote:

> (CCing LKML)
>
> Hi Jens, Rusty,
>
> Trying to implement i/o tracking all the way up to the page cache (so
> that cfq and the future cgroup-based I/O controllers can schedule
> buffered I/O properly) I noticed that struct request's ioprio is
> initialized but never used for I/O scheduling purposes. Indeed there
> seems to be one single user of this member: virtio_blk. Virtio uses
> struct request's ioprio in the request() function of the virtio block
> driver, which just copies the ioprio value to the output header of
> virtblk_req.
>
> Is this the intended use of struct request's ioprio? Is it OK for
> device
> drivers to use it? If the answer to the previous two questions is no, I
> would like to send some clean-up patches.

Naveen Gupta sent a priority-based anticipatory IO scheduler patchset
earlier which uses request->ioprio and the struct request seems to be
the logical place to keep the ioprio. So, please don't cleanup the
ioprio from there.

Thanks,
Divyesh

>
> - Fernando
>


2008-08-08 06:19:58

by Rusty Russell

Subject: Re: request->ioprio

On Wednesday 06 August 2008 19:16:36 Fernando Luis Vázquez Cao wrote:
> Hi Jens, Rusty,

Hi Fernando,

> Trying to implement i/o tracking all the way up to the page cache (so
> that cfq and the future cgroup-based I/O controllers can schedule
> buffered I/O properly) I noticed that struct request's ioprio is
> initialized but never used for I/O scheduling purposes. Indeed there
> seems to be one single user of this member: virtio_blk.

Hey, do I win a prize? :)

> Virtio uses
> struct request's ioprio in the request() function of the virtio block
> driver, which just copies the ioprio value to the output header of
> virtblk_req.

Yes, we pass it through to the host, in the assumption they might want to use
it to schedule our I/Os relative to each other.

I'm a little surprised no one else uses it, but I'm sure they will...
Rusty.

Subject: Re: request->ioprio

On Thu, 2008-08-07 at 06:33 +1000, Rusty Russell wrote:
> > Trying to implement i/o tracking all the way up to the page cache (so
> > that cfq and the future cgroup-based I/O controllers can schedule
> > buffered I/O properly) I noticed that struct request's ioprio is
> > initialized but never used for I/O scheduling purposes. Indeed there
> > seems to be one single user of this member: virtio_blk.
>
> Hey, do I win a prize? :)
I think you should! :)

> > Virtio uses
> > struct request's ioprio in the request() function of the virtio block
> > driver, which just copies the ioprio value to the output header of
> > virtblk_req.
>
> Yes, we pass it through to the host, in the assumption they might want to use
> it to schedule our I/Os relative to each other.
The reason I asked is that the value of struct request's ioprio is
never used, which means it does not contain useful information.
Schedulers such as CFQ that try to keep track of the io context of io
requests (including ioprio) use struct request's elevator_private for
that. For example, CFQ embeds the cfq_io_context there, which in turn
contains struct io_context where the request's ioprio is actually
stored. In other words, we have two ioprios per request: one is a
member of struct request and the other is accessible through struct
request's elevator_private. Unfortunately only the latter is used, which
means that virtio_blk is not passing useful information to the backend
driver.

> I'm a little surprised no one else uses it, but I'm sure they will...
> Rusty.
I am not that sure, unless we clean things up. Currently bios do not
hold io context information, which is the reason why struct request's
ioprio does not contain useful information. To solve this issue it is
necessary to track IO all the way up to the page cache layer, which is
precisely what the IO tracking project that has just started is
attempting to achieve.

Besides, I guess that accessing the io context information (such as
ioprio) of a request through elevator-specific private structures is not
something we want virtio_blk (or future users) to do. I think that we
could replace struct request's ioprio with a pointer to io_context so
that IO context information could be accessed in a consistent
elevator-agnostic way. Another possible approach could be to keep
struct request's ioprio synchronized with the IO context information in
the private structures. We could even extend the elevator API to provide
IO context information, but my feeling is that the first is the cleanest
approach.

What is your take on this?

Subject: [PATCH] virtio_blk: use a wrapper function to access io context information of IO requests

Rusty, Jens

What do you think about the patch below? Does it make sense?

---

struct request has an ioprio member but it is never updated because
currently bios do not hold io context information. The implication of
this is that virtio_blk ends up passing useless information to the
backend driver.

That said, some IO schedulers such as CFQ do store io context
information in struct request, but use private members for that, which
means that information cannot be directly accessed in an IO
scheduler-independent way.

This patch adds a function to obtain the ioprio of a request. We should
avoid accessing ioprio directly and use this function instead, so that
its users do not have to care about changes in block layer structures or
the currently active IO controller.

This patch does not introduce any functional changes but paves the way
for future clean-ups and enhancements.

Signed-off-by: Fernando Luis Vazquez Cao <[email protected]>
---

diff -urNp linux-2.6.27-rc3/drivers/block/virtio_blk.c linux-2.6.27-rc3-fixes/drivers/block/virtio_blk.c
--- linux-2.6.27-rc3/drivers/block/virtio_blk.c 2008-08-13 14:25:46.000000000 +0900
+++ linux-2.6.27-rc3-fixes/drivers/block/virtio_blk.c 2008-08-13 16:36:34.000000000 +0900
@@ -84,11 +84,11 @@ static bool do_req(struct request_queue
if (blk_fs_request(vbr->req)) {
vbr->out_hdr.type = 0;
vbr->out_hdr.sector = vbr->req->sector;
- vbr->out_hdr.ioprio = vbr->req->ioprio;
+ vbr->out_hdr.ioprio = req_get_bio(vbr->req);
} else if (blk_pc_request(vbr->req)) {
vbr->out_hdr.type = VIRTIO_BLK_T_SCSI_CMD;
vbr->out_hdr.sector = 0;
- vbr->out_hdr.ioprio = vbr->req->ioprio;
+ vbr->out_hdr.ioprio = req_get_bio(vbr->req);
} else {
/* We don't put anything else in the queue. */
BUG();
diff -urNp linux-2.6.27-rc3/include/linux/blkdev.h linux-2.6.27-rc3-fixes/include/linux/blkdev.h
--- linux-2.6.27-rc3/include/linux/blkdev.h 2008-08-13 14:25:57.000000000 +0900
+++ linux-2.6.27-rc3-fixes/include/linux/blkdev.h 2008-08-13 16:36:40.000000000 +0900
@@ -233,6 +233,11 @@ struct request {
struct request *next_rq;
};

+static inline unsigned short req_get_bio(struct request *req)
+{
+ return req->ioprio;
+}
+
/*
* State information carried for REQ_TYPE_PM_SUSPEND and REQ_TYPE_PM_RESUME
* requests. Some step values could eventually be made generic.

2008-08-14 04:15:34

by Rusty Russell

Subject: Re: request->ioprio

On Wednesday 13 August 2008 17:06:03 Fernando Luis Vázquez Cao wrote:
> Besides, I guess that accessing the io context information (such as
> ioprio) of a request through elevator-specific private structures is not
> something we want virtio_blk (or future users) to do.

The only semantic I assumed was "higher is better". The server (ie. host) can
really only use the information to schedule between I/Os for that particular
guest anyway.

But it sounds like I should be passing "0" in there unconditionally until the
kernel semantics are sorted out and I can do something more intelligent? I
haven't checked, but I assume that's actually what's happening at the moment
(the field is zero)?

Rusty.

Subject: Re: request->ioprio

On Thu, 2008-08-14 at 12:16 +1000, Rusty Russell wrote:
> On Wednesday 13 August 2008 17:06:03 Fernando Luis Vázquez Cao wrote:
> > Besides, I guess that accessing the io context information (such as
> > ioprio) of a request through elevator-specific private structures is not
> > something we want virtio_blk (or future users) to do.
>
> The only semantic I assumed was "higher is better". The server (ie. host) can
> really only use the information to schedule between I/Os for that particular
> guest anyway.
>
> But it sounds like I should be passing "0" in there unconditionally until the
> kernel semantics are sorted out and I can do something more intelligent? I
> haven't checked, but I assume that's actually what's happening at the moment
> (the field is zero)?
Yes, with the current implementation the field is always zero, but
things might change. Instead of passing 0 unconditionally I think we
could use a function that extracts/calculates the ioprio of requests.
The patch I sent you yesterday is the first step in that direction. Is
this a valid approach for you?

2008-08-14 04:42:44

by Rusty Russell

Subject: Re: [PATCH] virtio_blk: use a wrapper function to access io context information of IO requests

On Wednesday 13 August 2008 18:14:31 Fernando Luis Vázquez Cao wrote:
> Rusty, Jens
>
> What do you think about the patch below? Does it make sense?

Yes. Obviously it makes more sense for Jens to queue this so:

Acked-by: Rusty Russell <[email protected]>

Thanks!
Rusty.

Subject: Re: [PATCH] virtio_blk: use a wrapper function to access io context information of IO requests

On Thu, 2008-08-14 at 14:42 +1000, Rusty Russell wrote:
> On Wednesday 13 August 2008 18:14:31 Fernando Luis Vázquez Cao wrote:
> > Rusty, Jens
> >
> > What do you think about the patch below? Does it make sense?
>
> Yes. Obviously it makes more sense for Jens to queue this so:
>
> Acked-by: Rusty Russell <[email protected]>

Thank you Rusty. I'll be responding to this email with an updated patch
because I noticed the function I introduced is misnamed (it should be
req_get_ioprio() not req_get_bio()).

Subject: [PATCH] virtio_blk: use a wrapper function to access io context information of IO requests

struct request has an ioprio member but it is never updated because
currently bios do not hold io context information. The implication of
this is that virtio_blk ends up passing useless information to the
backend driver.

That said, some IO schedulers such as CFQ do store io context
information in struct request, but use private members for that, which
means that information cannot be directly accessed in an IO
scheduler-independent way.

This patch adds a function to obtain the ioprio of a request. We should
avoid accessing ioprio directly and use this function instead, so that
its users do not have to care about future changes in block layer
structures or what the currently active IO controller is.

This patch does not introduce any functional changes but paves the way
for future clean-ups and enhancements.

Signed-off-by: Fernando Luis Vazquez Cao <[email protected]>
Acked-by: Rusty Russell <[email protected]>
---

diff -urNp linux-2.6.27-rc3/drivers/block/virtio_blk.c linux-2.6.27-rc3-fixes/drivers/block/virtio_blk.c
--- linux-2.6.27-rc3/drivers/block/virtio_blk.c 2008-08-14 13:51:38.000000000 +0900
+++ linux-2.6.27-rc3-fixes/drivers/block/virtio_blk.c 2008-08-14 13:53:13.000000000 +0900
@@ -84,11 +84,11 @@ static bool do_req(struct request_queue
if (blk_fs_request(vbr->req)) {
vbr->out_hdr.type = 0;
vbr->out_hdr.sector = vbr->req->sector;
- vbr->out_hdr.ioprio = vbr->req->ioprio;
+ vbr->out_hdr.ioprio = req_get_ioprio(vbr->req);
} else if (blk_pc_request(vbr->req)) {
vbr->out_hdr.type = VIRTIO_BLK_T_SCSI_CMD;
vbr->out_hdr.sector = 0;
- vbr->out_hdr.ioprio = vbr->req->ioprio;
+ vbr->out_hdr.ioprio = req_get_ioprio(vbr->req);
} else {
/* We don't put anything else in the queue. */
BUG();
diff -urNp linux-2.6.27-rc3/include/linux/blkdev.h linux-2.6.27-rc3-fixes/include/linux/blkdev.h
--- linux-2.6.27-rc3/include/linux/blkdev.h 2008-08-14 13:51:47.000000000 +0900
+++ linux-2.6.27-rc3-fixes/include/linux/blkdev.h 2008-08-14 13:53:34.000000000 +0900
@@ -233,6 +233,11 @@ struct request {
struct request *next_rq;
};

+static inline unsigned short req_get_ioprio(struct request *req)
+{
+ return req->ioprio;
+}
+
/*
* State information carried for REQ_TYPE_PM_SUSPEND and REQ_TYPE_PM_RESUME
* requests. Some step values could eventually be made generic.

2008-08-14 07:58:19

by Jens Axboe

Subject: Re: [PATCH] virtio_blk: use a wrapper function to access io context information of IO requests

On Thu, Aug 14 2008, Fernando Luis Vázquez Cao wrote:
> struct request has an ioprio member but it is never updated because
> currently bios do not hold io context information. The implication of
> this is that virtio_blk ends up passing useless information to the
> backend driver.
>
> That said, some IO schedulers such as CFQ do store io context
> information in struct request, but use private members for that, which
> means that that information cannot be directly accessed in a IO
> scheduler-independent way.
>
> This patch adds a function to obtain the ioprio of a request. We should
> avoid accessing ioprio directly and use this function instead, so that
> its users do not have to care about future changes in block layer
> structures or what the currently active IO controller is.
>
> This patch does not introduce any functional changes but paves the way
> for future clean-ups and enhancements.
>
> Signed-off-by: Fernando Luis Vazquez Cao <[email protected]>
> Acked-by: Rusty Russell <[email protected]>

Fine with me, applied.

--
Jens Axboe

Subject: Re: request->ioprio

On Thu, 2008-08-14 at 12:16 +1000, Rusty Russell wrote:
> On Wednesday 13 August 2008 17:06:03 Fernando Luis Vázquez Cao wrote:
> > Besides, I guess that accessing the io context information (such as
> > ioprio) of a request through elevator-specific private structures is not
> > something we want virtio_blk (or future users) to do.
>
> The only semantic I assumed was "higher is better". The server (ie. host) can
> really only use the information to schedule between I/Os for that particular
> guest anyway.
Does that mean you are not going to incorporate the prio class system
that is used in Linux?

Please note that the three upper bits of ioprio contain the ioprio
class. We currently have three classes encoded as follows:

IOPRIO_CLASS_RT: 1
IOPRIO_CLASS_BE: 2
IOPRIO_CLASS_IDLE: 3

As things stand now, if we passed ioprio as is to the backend driver,
requests would get priority inverted. For example, requests in the idle
class would be prioritized to the detriment of the real-time ones.

This problem could be solved in req_get_ioprio() (already in Jens'
tree), or, alternatively, we could change the enum where the ioprio
classes are defined.

What are your plans regarding prio classes?

2008-08-16 07:13:52

by Rusty Russell

Subject: Re: request->ioprio

On Friday 15 August 2008 15:51:02 Fernando Luis Vázquez Cao wrote:
> On Thu, 2008-08-14 at 12:16 +1000, Rusty Russell wrote:
> > On Wednesday 13 August 2008 17:06:03 Fernando Luis Vázquez Cao wrote:
> > > Besides, I guess that accessing the io context information (such as
> > > ioprio) of a request through elevator-specific private structures is
> > > not something we want virtio_blk (or future users) to do.
> >
> > The only semantic I assumed was "higher is better". The server (ie.
> > host) can really only use the information to schedule between I/Os for
> > that particular guest anyway.
>
> Does that mean you are not going to incorporate the prio class system
> that is used in Linux?

Actually, since it's unused at the moment, we can define it however we want.
But note that this is an ABI; while the kernel-internal definitions are
fluid, this semantic must stay the same (even if the actual values differ).

So we should probably put an explicit mapping function there anyway.

Thanks,
Rusty.