This patch adds a new "end_io_first" hook to __end_that_request_first()
for request-based device-mapper.
Signed-off-by: Kiyoshi Ueda <[email protected]>
Signed-off-by: Jun'ichi Nomura <[email protected]>
diff -rupN 1-blk-get-request-irqrestore/block/ll_rw_blk.c 2-add-generic-hook/block/ll_rw_blk.c
--- 1-blk-get-request-irqrestore/block/ll_rw_blk.c 2006-12-15 10:21:29.000000000 -0500
+++ 2-add-generic-hook/block/ll_rw_blk.c 2006-12-15 10:23:30.000000000 -0500
@@ -260,6 +260,7 @@ static void rq_init(request_queue_t *q,
rq->data = NULL;
rq->nr_phys_segments = 0;
rq->sense = NULL;
+ rq->end_io_first = NULL;
rq->end_io = NULL;
rq->end_io_data = NULL;
rq->completion_data = NULL;
@@ -3216,6 +3217,22 @@ static int __end_that_request_first(stru
blk_add_trace_rq(req->q, req, BLK_TA_COMPLETE);
+ if (!uptodate) {
+ if (blk_fs_request(req) && !(req->cmd_flags & REQ_QUIET))
+ printk("end_request: I/O error, dev %s, sector %llu\n",
+ req->rq_disk ? req->rq_disk->disk_name : "?",
+ (unsigned long long)req->sector);
+ }
+
+ if (blk_fs_request(req) && req->rq_disk) {
+ const int rw = rq_data_dir(req);
+
+ disk_stat_add(req->rq_disk, sectors[rw], nr_bytes >> 9);
+ }
+
+ if (req->end_io_first)
+ return req->end_io_first(req, uptodate, nr_bytes);
+
/*
* extend uptodate bool to allow < 0 value to be direct io error
*/
@@ -3230,19 +3247,6 @@ static int __end_that_request_first(stru
if (!blk_pc_request(req))
req->errors = 0;
- if (!uptodate) {
- if (blk_fs_request(req) && !(req->cmd_flags & REQ_QUIET))
- printk("end_request: I/O error, dev %s, sector %llu\n",
- req->rq_disk ? req->rq_disk->disk_name : "?",
- (unsigned long long)req->sector);
- }
-
- if (blk_fs_request(req) && req->rq_disk) {
- const int rw = rq_data_dir(req);
-
- disk_stat_add(req->rq_disk, sectors[rw], nr_bytes >> 9);
- }
-
total_bytes = bio_nbytes = 0;
while ((bio = req->bio) != NULL) {
int nbytes;
diff -rupN 1-blk-get-request-irqrestore/include/linux/blkdev.h 2-add-generic-hook/include/linux/blkdev.h
--- 1-blk-get-request-irqrestore/include/linux/blkdev.h 2006-12-11 14:32:53.000000000 -0500
+++ 2-add-generic-hook/include/linux/blkdev.h 2006-12-15 10:23:30.000000000 -0500
@@ -126,6 +126,7 @@ void copy_io_context(struct io_context *
void swap_io_context(struct io_context **ioc1, struct io_context **ioc2);
struct request;
+typedef int (rq_end_first_fn)(struct request *, int, int);
typedef void (rq_end_io_fn)(struct request *, int);
struct request_list {
@@ -312,6 +313,7 @@ struct request {
/*
* completion callback.
*/
+ rq_end_first_fn *end_io_first;
rq_end_io_fn *end_io;
void *end_io_data;
};
On Tue, Dec 19 2006, Kiyoshi Ueda wrote:
> This patch adds new "end_io_first" hook in __end_that_request_first()
> for request-based device-mapper.
What's this for, lack of stacking?
--
Jens Axboe
Hi Jens,
Sorry for the insufficient explanation.
On Wed, 20 Dec 2006 14:49:24 +0100, Jens Axboe <[email protected]> wrote:
> On Tue, Dec 19 2006, Kiyoshi Ueda wrote:
> > This patch adds new "end_io_first" hook in __end_that_request_first()
> > for request-based device-mapper.
>
> What's this for, lack of stacking?
I don't fully understand what "lack of stacking" means, but
I guess it means "Is the existing hook in end_that_request_last()
not enough?" If so, it is indeed not enough.
(If the guess is wrong, please let me know.)
The new hook is needed for error handling in dm.
For example, when an error occurs on a request, dm-multipath
wants to try another path before returning EIO to the application.
Without the new hook, by the time end_that_request_last() is called,
the bios have already been finished with an error and can't be retried.
Thanks,
Kiyoshi Ueda
On Wed, Dec 20 2006, Kiyoshi Ueda wrote:
> Hi Jens,
>
> Sorry for the insufficient explanation.
>
> On Wed, 20 Dec 2006 14:49:24 +0100, Jens Axboe <[email protected]> wrote:
> > On Tue, Dec 19 2006, Kiyoshi Ueda wrote:
> > > This patch adds new "end_io_first" hook in __end_that_request_first()
> > > for request-based device-mapper.
> >
> > What's this for, lack of stacking?
>
> I don't understand the meaning of "lack of stacking" well but
> I guess that it means "Is the existing hook in end_that_request_last()
> not enough?" If so, the answer is no.
> (If the guess is wrong, please let me know.)
>
> The new hook is needed for error handling in dm.
> For example, when an error occurred on a request, dm-multipath
> wants to try another path before returning EIO to application.
> Without the new hook, at the point of end_that_request_last(),
> the bios are already finished with error and can't be retried.
Ok, I see what you are getting at. The current ->end_io() is called when
the request has fully completed, you want notification for each chunk
potentially completed.
I think a better design here would be to use ->end_io() as the full
completion handler, similar to how bio->bi_end_io() works. A request
originating from __make_request() would set something ala:
	int fs_end_io(struct request *rq, int error, unsigned int nr_bytes)
	{
		if (!__end_that_request_first(rq, error, nr_bytes)) {
			end_that_request_last(rq, error);
			return 0;
		}
		return 1;
	}
and normal io completion from a driver would use a helper:
	int blk_complete_io(struct request *rq, int error, unsigned int nr_bytes)
	{
		return rq->end_io(rq, error, nr_bytes);
	}
instead of calling the functions manually. That would allow you to get
notification right at the beginning and do what you need, without adding
a special hook for this.
When designing these things, never be afraid to change some of the core
bits. It is a lot better than hacking around the current code, if it
doesn't quite fit your needs.
--
Jens Axboe
Hi Jens,
On Thu, 21 Dec 2006 08:49:47 +0100, Jens Axboe <[email protected]> wrote:
> > The new hook is needed for error handling in dm.
> > For example, when an error occurred on a request, dm-multipath
> > wants to try another path before returning EIO to application.
> > Without the new hook, at the point of end_that_request_last(),
> > the bios are already finished with error and can't be retried.
>
> Ok, I see what you are getting at. The current ->end_io() is called when
> the request has fully completed, you want notification for each chunk
> potentially completed.
>
> I think a better design here would be to use ->end_io() as the full
> completion handler, similar to how bio->bi_end_io() works. A request
> originating from __make_request() would set something ala:
>
> int fs_end_io(struct request *rq, int error, unsigned int nr_bytes)
> {
> if (!__end_that_request_first(rq, error, nr_bytes)) {
> end_that_request_last(rq, error);
> return 0;
> }
>
> return 1;
> }
>
> and normal io completion from a driver would use a helper:
>
> int blk_complete_io(struct request *rq, int error, unsigned int nr_bytes)
> {
> return rq->end_io(rq, error, nr_bytes);
> }
>
> instead of calling the functions manually. That would allow you to get
> notification right at the beginning and do what you need, without adding
> a special hook for this.
I'm not sure I understand what you mean.
Something like this?
  - __make_request() sets req->end_io to fs_end_io()
  - The driver calls blk_complete_io()
      * if it succeeds, the request is done
      * if it fails, the request is not completed
        and the driver needs to retry or something
  - Current users of req->end_io() have to update/rewrite their end_io.
  - Features like mine will set their own end_io.
    It checks the error and decides whether to call fs_end_io() or not.
Depending on drivers, there are some functions called between
__end_that_request_first() and end_that_request_last().
For example:
- add_disk_randomness()
- blk_queue_end_tag()
- floppy_off()
So they might prevent such generalization.
In addition to the suggested approach, what do you think about
adding a new flag to req->cmd_flags which tells the end_io() handler
not to return the bios to the upper layer?
It would be useful for multipathing and can be done even within
the current __end_that_request_first().
For example,
	static int __end_that_request_first()
	{
		.....
		error = 0;
		if (end_io_error(uptodate))
			error = !uptodate ? -EIO : uptodate;
		.....
		if (error && (req->cmd_flags & "NEW_FLAG"))
			return 0; /* Tell the driver to call end_that_request_last() */
		total_bytes = bio_nbytes = 0;
		while ((bio = req->bio) != NULL) {
			..... /* process of finishing bios */
		}
		.....
	}
Thanks,
Kiyoshi Ueda
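The arrangement discussed above (a default fs_end_io() plus a feature-specific end_io that decides whether to call it) can be modelled in userspace. The following is only a sketch under stated assumptions: the struct fields, model_end_first() and mpath_end_io() are hypothetical stand-ins rather than kernel code, and error == 0 is taken to mean success.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: these types borrow names from the thread,
 * they are not the kernel's definitions. */
struct request;
typedef int (rq_end_io_fn)(struct request *rq, int error, unsigned int nr_bytes);

struct request {
	unsigned int bytes_left;  /* bytes not yet completed */
	int bios_ended;           /* 1 once bios were returned to the submitter */
	int retries;              /* hypothetical: times the request was requeued */
	rq_end_io_fn *end_io;
};

/* Stand-in for __end_that_request_first(): returns 0 when nothing is left. */
static int model_end_first(struct request *rq, int error, unsigned int nr_bytes)
{
	(void)error;
	if (nr_bytes >= rq->bytes_left) {
		rq->bytes_left = 0;
		return 0;
	}
	rq->bytes_left -= nr_bytes;
	return 1;
}

/* Default handler, following the fs_end_io() outline in the thread. */
static int fs_end_io(struct request *rq, int error, unsigned int nr_bytes)
{
	if (!model_end_first(rq, error, nr_bytes)) {
		rq->bios_ended = 1;  /* stand-in for end_that_request_last() */
		return 0;
	}
	return 1;
}

/* Hypothetical multipath handler: on error, keep the bios and retry. */
static int mpath_end_io(struct request *rq, int error, unsigned int nr_bytes)
{
	if (error) {
		rq->retries++;  /* a real driver would requeue on another path */
		return 1;       /* request is not complete */
	}
	return fs_end_io(rq, error, nr_bytes);
}

/* Driver-facing helper, as outlined in the thread. */
static int blk_complete_io(struct request *rq, int error, unsigned int nr_bytes)
{
	assert(rq != NULL && rq->end_io != NULL);
	return rq->end_io(rq, error, nr_bytes);
}
```

In this model a failed completion leaves bios_ended at 0, so the request can be reissued, and a later successful completion goes through fs_end_io() unchanged. It does not address the functions some drivers call between the two completion steps.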
Kiyoshi Ueda wrote:
> Hi Jens,
>
> On Thu, 21 Dec 2006 08:49:47 +0100, Jens Axboe <[email protected]> wrote:
>>> The new hook is needed for error handling in dm.
>>> For example, when an error occurred on a request, dm-multipath
>>> wants to try another path before returning EIO to application.
>>> Without the new hook, at the point of end_that_request_last(),
>>> the bios are already finished with error and can't be retried.
>> Ok, I see what you are getting at. The current ->end_io() is called when
>> the request has fully completed, you want notification for each chunk
>> potentially completed.
>>
>> I think a better design here would be to use ->end_io() as the full
>> completion handler, similar to how bio->bi_end_io() works. A request
>> originating from __make_request() would set something ala:
>>
>> int fs_end_io(struct request *rq, int error, unsigned int nr_bytes)
>> {
>> if (!__end_that_request_first(rq, error, nr_bytes)) {
>> end_that_request_last(rq, error);
>> return 0;
>> }
>>
>> return 1;
>> }
>>
>> and normal io completion from a driver would use a helper:
>>
>> int blk_complete_io(struct request *rq, int error, unsigned int nr_bytes)
>> {
>> return rq->end_io(rq, error, nr_bytes);
>> }
>>
>> instead of calling the functions manually. That would allow you to get
>> notification right at the beginning and do what you need, without adding
>> a special hook for this.
>
> I'm not confident about what you mean.
> Something like this?
> - __make_request() sets fs_end_io() to req->end_io()
> - The driver calls blk_complete_io()
> * if it succeeds, the request is done
> * if it fails, the request is not completed
> and the driver needs retry or something
> - Current users of req->end_io() have to update/rewrite their end_io.
> - Features like mine will set its own end_io.
> It checks error and decides whether calling fs_end_io() or not.
>
> Depending on drivers, there are some functions called between
> __end_that_request_first() and end_that_request_last().
> For example:
> - add_disk_randomness()
> - blk_queue_end_tag()
> - floppy_off()
> So they might prevent such generalization.
>
>
> In addition to the suggested approach, what do you think about
> adding a new flag to req->cmd_flags which lets the end_io() handler
> not to return bio to upper layer?
> It will be useful for multipathing and can be done even within
> the current __end_that_request_first().
> For example,
>
> static int __end_that_request_first()
> {
> .....
> error = 0;
> if (end_io_error(uptodate))
> error = !uptodate ? -EIO : uptodate;
> .....
> if (error && (req->cmd_flags & "NEW_FLAG"))
> return 0; /* Tell the driver to call end_that_request_last() */
>
> total_bytes = bio_nbytes = 0;
> while ((bio = req->bio) != NULL) {
> ..... /* process of finishing bios */
> }
> .....
> }
>
Who would call end_that_request_first with the new flag set? The scsi
layer or multipath layer?
The end_io_first callout was a hack around the lack of stacking and
because I was not yet sure how to handle medium errors.
We hooked into end_that_request_first, because for SCSI we can get a
medium error and the scsi layer will complete the first X bytes of a
request, then retry the leftover part itself. For this error we want to
update the request and bio fields so that when the request is resent by
the scsi layer, the scatterlist will get made with the updated values.
Maybe if FAILFAST is made to cover all errors then we would not need
this type of hack. Having multipath handle medium errors seems a little
silly though since the scsi layer knows better what to do there.
Another alternative is to do something similar to what bio based dm does
today. The bio/bvec update code and bio mapping and stacking has a
similar problem. In dm we have that bio record/details code which copies
some of the bio fields and dm-mpath also does not do partial retries.
For example, on a medium error where part of a bio is successful but the
end part fails because of a transport error and needs to be retried, this
will result in the entire bio being redriven.
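The partial-completion case described above (the first X bytes succeed, and the leftover part is retried with updated fields) comes down to advancing the request past the completed bytes. The following is a minimal userspace sketch, not the SCSI or block layer code: struct model_request and model_partial_complete() are hypothetical names, and completed byte counts are assumed to be whole 512-byte sectors.

```c
#include <assert.h>

/* Userspace model only; field names are illustrative, not the kernel's. */
struct model_request {
	unsigned long long sector;  /* start sector of the remaining I/O */
	unsigned int nr_bytes;      /* bytes still to transfer */
};

/* Complete the first good_bytes of the request and advance it past them.
 * Returns 1 if part of the request is left to retry, 0 if it is finished. */
static int model_partial_complete(struct model_request *rq,
				  unsigned int good_bytes)
{
	assert((good_bytes & 511) == 0);  /* assumption: whole sectors only */

	if (good_bytes >= rq->nr_bytes) {
		rq->sector += rq->nr_bytes >> 9;
		rq->nr_bytes = 0;
		return 0;
	}

	rq->sector += good_bytes >> 9;    /* 512-byte sectors, as in the patch */
	rq->nr_bytes -= good_bytes;
	return 1;
}
```

After a partial completion the model request describes only the leftover range, which is the state a resend would need in order to build its scatterlist from the updated values.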
Hi Mike,
On Fri, 22 Dec 2006 01:18:44 -0600, Mike Christie <[email protected]> wrote:
> > In addition to the suggested approach, what do you think about
> > adding a new flag to req->cmd_flags which lets the end_io() handler
> > not to return bio to upper layer?
> > It will be useful for multipathing and can be done even within
> > the current __end_that_request_first().
> > For example,
> >
> > static int __end_that_request_first()
> > {
> > .....
> > error = 0;
> > if (end_io_error(uptodate))
> > error = !uptodate ? -EIO : uptodate;
> > .....
> > if (error && (req->cmd_flags & "NEW_FLAG"))
> > return 0; /* Tell the driver to call end_that_request_last() */
> >
> > total_bytes = bio_nbytes = 0;
> > while ((bio = req->bio) != NULL) {
> > ..... /* process of finishing bios */
> > }
> > .....
> > }
> >
>
> Who would call end_that_request_first with the new flag set? The scsi
> layer or multipath layer?
The multipath layer sets the new flag.
The SCSI layer of an underlying device calls __end_that_request_first()
for a cloned request. The original bios that were issued to the dm
device are completed through the cloned request when no error
occurs on the clone. When an error occurs, the completion of
the bios is skipped.
Thanks,
Kiyoshi Ueda
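The flag-based flow described above can also be modelled in userspace. This is a sketch under stated assumptions: MODEL_REQ_KEEP_BIOS is a hypothetical stand-in for the unnamed "NEW_FLAG", struct model_clone is not a kernel type, error == 0 is taken to mean success, and partial completion is ignored for brevity.

```c
#include <assert.h>

#define MODEL_REQ_KEEP_BIOS 0x1u  /* hypothetical stand-in for "NEW_FLAG" */

/* Userspace model of a cloned request issued to an underlying device. */
struct model_clone {
	unsigned int cmd_flags;
	int orig_bios_ended;  /* 1 once the original bios were completed */
};

/* Stand-in for __end_that_request_first() on a cloned request.
 * Returns 0 to tell the driver to call end_that_request_last(). */
static int model_end_first_flagged(struct model_clone *clone, int error)
{
	assert(clone != 0);

	if (error && (clone->cmd_flags & MODEL_REQ_KEEP_BIOS))
		return 0;  /* skip bio completion so the upper layer can retry */

	clone->orig_bios_ended = 1;  /* bios go back to the upper layer */
	return 0;
}
```

With the flag set, an error leaves the original bios untouched so they can be reissued on another path. Without the flag, the same error completes them as before.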