2007-01-10 22:49:31

by Kiyoshi Ueda

[permalink] [raw]
Subject: [RFC PATCH 0/3] blk_end_request: full I/O completion handler

Hi Jens and device driver maintainers,

I would like to make block interface to hook in before completing
each chunk of request (for errors) to allow request-based multipath.
The first approach was to add a new hook in __end_that_request_first().
- http://marc.theaimsgroup.com/?l=linux-scsi&m=115520444515914&w=2
- http://marc.theaimsgroup.com/?l=linux-kernel&m=116656637425880&w=2
However, Jens pointed out that redesigning ->end_io() as a full
completion handler would be better:

On Thu, 21 Dec 2006 08:49:47 +0100, Jens Axboe <[email protected]> wrote:
> Ok, I see what you are getting at. The current ->end_io() is called when
> the request has fully completed, you want notification for each chunk
> potentially completed.
>
> I think a better design here would be to use ->end_io() as the full
> completion handler, similar to how bio->bi_end_io() works. A request
> originating from __make_request() would set something ala:
.....
> instead of calling the functions manually. That would allow you to get
> notification right at the beginning and do what you need, without adding
> a special hook for this.

So I made a tentative patch based on Jens' suggestion.
Are the interface and the design acceptable for block layer and
device drivers, though a little bit big modification of device drivers
is needed to take this approach?

What is changed:
o blk_end_request() is a helper for drivers to call a full completion
handler (req->end_io) for the request.
(end_that_request_* are no longer used from outside of the handlers.)
o If blk_end_request() returns 0, the request is completed.
If blk_end_request() returns 1, the request isn't completed and
the caller should decide reactions for it.
o __make_request() sets blk_end_io() in req->end_io
o Added some templates to existing req->end_io users so they can be
full completion handler.

The interface of blk_end_request():
o About the 4th argument (int locked)
blk_end_request() has to get a queue lock before it calls
end_that_request_last(). On the other hand, some callers need to
call blk_end_request() with the lock held.
Callers can tell blk_end_request() by this argument whether a queue
lock needs to be held in blk_end_request().

o About the 5th and the 6th arguments (int (callback)(void *), void *arg)
Most of callers do caller-specific work between
end_that_request_{first,chunk} and end_that_request_last.
Callers now pass a callback for such work by these arguments.

By using the callback feature, critical sections protected by
single queue lock might need to be divided.
(e.g. In scsi_end_request(), blk_queue_end_tag() and
end_that_request_last() was protected by 1 spinlock but
1 spinlock is used for each call. See PATCH#3)
If it can't be divided, the caller has to get the lock before
calling blk_end_request(), though it may have performance affects.

To cover callers which wants to call only __end_that_request_first()
like PIO mode in cdrom_newpc_intr(), the callback can return int value.
If the return value is non 0, blk_end_request() returns 1 immediately
without the request completion. (See cdrom_newpc_intr() in PATCH#3)

If this is acceptable approach, I will break down, clean up and repost
the patch.

Thanks,
Kiyoshi Ueda