LinuxLists.cc - [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

2006-12-19 22:11:34

Subject: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Currently, blk_get_request() is not ready for being called from
interrupt context because it enables interrupt forcibly in it.

Request-based device-mapper sometimes needs to get a request
in interrupt context to create a clone.
Calling blk_get_request() from interrupt context should be OK
because blk_get_request() returns NULL without sleep if no memory
unless __GFP_WAIT mask is specified, and then the interrupt context
can plug queue to retry after and return immediately.

So this patch allows blk_get_request() to be called from interrupt
context by save/restore current state of irq.

Signed-off-by: Kiyoshi Ueda <[email protected]>
Signed-off-by: Jun'ichi Nomura <[email protected]>

diff -rupN 2.6.19.1/block/ll_rw_blk.c 1-blk-get-request-irqrestore/block/ll_rw_blk.c
--- 2.6.19.1/block/ll_rw_blk.c 2006-12-11 14:32:53.000000000 -0500
+++ 1-blk-get-request-irqrestore/block/ll_rw_blk.c 2006-12-15 10:21:29.000000000 -0500
@@ -2064,9 +2064,10 @@ static void freed_request(request_queue_
* Get a free request, queue_lock must be held.
* Returns NULL on failure, with queue_lock held.
* Returns !NULL on success, with queue_lock *not held*.
+ * If flags is given, the irq state is kept when unlocking the queue_lock.
*/
static struct request *get_request(request_queue_t *q, int rw, struct bio *bio,
- gfp_t gfp_mask)
+ gfp_t gfp_mask, unsigned long *flags)
{
struct request *rq = NULL;
struct request_list *rl = &q->rq;
@@ -2119,7 +2120,10 @@ static struct request *get_request(reque
if (priv)
rl->elvpriv++;

- spin_unlock_irq(q->queue_lock);
+ if (flags)
+ spin_unlock_irqrestore(q->queue_lock, *flags);
+ else
+ spin_unlock_irq(q->queue_lock);

rq = blk_alloc_request(q, rw, priv, gfp_mask);
if (unlikely(!rq)) {
@@ -2130,7 +2134,10 @@ static struct request *get_request(reque
* Allocating task should really be put onto the front of the
* wait queue, but this is pretty rare.
*/
- spin_lock_irq(q->queue_lock);
+ if (flags)
+ spin_lock_irqsave(q->queue_lock, *flags);
+ else
+ spin_lock_irq(q->queue_lock);
freed_request(q, rw, priv);

/*
@@ -2174,7 +2181,7 @@ static struct request *get_request_wait(
{
struct request *rq;

- rq = get_request(q, rw, bio, GFP_NOIO);
+ rq = get_request(q, rw, bio, GFP_NOIO, NULL);
while (!rq) {
DEFINE_WAIT(wait);
struct request_list *rl = &q->rq;
@@ -2182,7 +2189,7 @@ static struct request *get_request_wait(
prepare_to_wait_exclusive(&rl->wait[rw], &wait,
TASK_UNINTERRUPTIBLE);

- rq = get_request(q, rw, bio, GFP_NOIO);
+ rq = get_request(q, rw, bio, GFP_NOIO, NULL);

if (!rq) {
struct io_context *ioc;
@@ -2213,16 +2220,17 @@ static struct request *get_request_wait(
struct request *blk_get_request(request_queue_t *q, int rw, gfp_t gfp_mask)
{
struct request *rq;
+ unsigned long flags;

BUG_ON(rw != READ && rw != WRITE);

- spin_lock_irq(q->queue_lock);
+ spin_lock_irqsave(q->queue_lock, flags);
if (gfp_mask & __GFP_WAIT) {
rq = get_request_wait(q, rw, NULL);
} else {
- rq = get_request(q, rw, NULL, gfp_mask);
+ rq = get_request(q, rw, NULL, gfp_mask, &flags);
if (!rq)
- spin_unlock_irq(q->queue_lock);
+ spin_unlock_irqrestore(q->queue_lock, flags);
}
/* q->queue_lock is unlocked at this point */

2006-12-20 13:47:05

by Jens Axboe

[permalink] [raw]

Subject: Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Tue, Dec 19 2006, Kiyoshi Ueda wrote:
> Currently, blk_get_request() is not ready for being called from
> interrupt context because it enables interrupt forcibly in it.
>
> Request-based device-mapper sometimes needs to get a request
> in interrupt context to create a clone.
> Calling blk_get_request() from interrupt context should be OK
> because blk_get_request() returns NULL without sleep if no memory
> unless __GFP_WAIT mask is specified, and then the interrupt context
> can plug queue to retry after and return immediately.
>
> So this patch allows blk_get_request() to be called from interrupt
> context by save/restore current state of irq.
>
>
> Signed-off-by: Kiyoshi Ueda <[email protected]>
> Signed-off-by: Jun'ichi Nomura <[email protected]>
>
> diff -rupN 2.6.19.1/block/ll_rw_blk.c 1-blk-get-request-irqrestore/block/ll_rw_blk.c
> --- 2.6.19.1/block/ll_rw_blk.c 2006-12-11 14:32:53.000000000 -0500
> +++ 1-blk-get-request-irqrestore/block/ll_rw_blk.c 2006-12-15 10:21:29.000000000 -0500
> @@ -2064,9 +2064,10 @@ static void freed_request(request_queue_
> * Get a free request, queue_lock must be held.
> * Returns NULL on failure, with queue_lock held.
> * Returns !NULL on success, with queue_lock *not held*.
> + * If flags is given, the irq state is kept when unlocking the queue_lock.
> */
> static struct request *get_request(request_queue_t *q, int rw, struct bio *bio,
> - gfp_t gfp_mask)
> + gfp_t gfp_mask, unsigned long *flags)
> {
> struct request *rq = NULL;
> struct request_list *rl = &q->rq;
> @@ -2119,7 +2120,10 @@ static struct request *get_request(reque
> if (priv)
> rl->elvpriv++;
>
> - spin_unlock_irq(q->queue_lock);
> + if (flags)
> + spin_unlock_irqrestore(q->queue_lock, *flags);
> + else
> + spin_unlock_irq(q->queue_lock);

Big NACK on this - it's not only really ugly, it's also buggy to pass
interrupt flags as function arguments. As you also mention in the 0/1
mail, this also breaks CFQ.

Why do you need in-interrupt request allocation?

--
Jens Axboe

2006-12-20 17:50:43

by Kiyoshi Ueda

[permalink] [raw]

Subject: Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Hi Jens,

Thank you for the comment.

On Wed, 20 Dec 2006 14:48:49 +0100, Jens Axboe <[email protected]> wrote:
> > static struct request *get_request(request_queue_t *q, int rw, struct bio *bio,
> > - gfp_t gfp_mask)
> > + gfp_t gfp_mask, unsigned long *flags)
> > {
> > struct request *rq = NULL;
> > struct request_list *rl = &q->rq;
> > @@ -2119,7 +2120,10 @@ static struct request *get_request(reque
> > if (priv)
> > rl->elvpriv++;
> >
> > - spin_unlock_irq(q->queue_lock);
> > + if (flags)
> > + spin_unlock_irqrestore(q->queue_lock, *flags);
> > + else
> > + spin_unlock_irq(q->queue_lock);
>
> Big NACK on this - it's not only really ugly, it's also buggy to pass
> interrupt flags as function arguments. As you also mention in the 0/1
> mail, this also breaks CFQ.
>
> Why do you need in-interrupt request allocation?

Because I'd like to use blk_get_request() in q->request_fn()
which can be called from interrupt context like below:
scsi_io_completion -> scsi_end_request -> scsi_next_command
-> scsi_run_queue -> blk_run_queue -> q->request_fn

Generally, device-mapper (dm) clones an original I/O and dispatches
the clones to underlying destination devices.
In the request-based dm patch, the clone creation and the dispatch
are done in q->request_fn(). To create the clone, blk_get_request()
is used to get a request from underlying destination device's queue.
By doing that in q->request_fn(), dm can deal with struct request
after bios are merged by __make_request().

Do you think creating another function like blk_get_request_nowait()
is acceptable?
Or request should not be allocated in q->request_fn() anyway?
Do you have any other ideas?

> --
> Jens Axboe

Thanks,
Kiyoshi Ueda

2006-12-20 18:47:31

by Jens Axboe

[permalink] [raw]

Subject: Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Wed, Dec 20 2006, Kiyoshi Ueda wrote:
> Hi Jens,
>
> Thank you for the comment.
>
> On Wed, 20 Dec 2006 14:48:49 +0100, Jens Axboe <[email protected]> wrote:
> > > static struct request *get_request(request_queue_t *q, int rw, struct bio *bio,
> > > - gfp_t gfp_mask)
> > > + gfp_t gfp_mask, unsigned long *flags)
> > > {
> > > struct request *rq = NULL;
> > > struct request_list *rl = &q->rq;
> > > @@ -2119,7 +2120,10 @@ static struct request *get_request(reque
> > > if (priv)
> > > rl->elvpriv++;
> > >
> > > - spin_unlock_irq(q->queue_lock);
> > > + if (flags)
> > > + spin_unlock_irqrestore(q->queue_lock, *flags);
> > > + else
> > > + spin_unlock_irq(q->queue_lock);
> >
> > Big NACK on this - it's not only really ugly, it's also buggy to pass
> > interrupt flags as function arguments. As you also mention in the 0/1
> > mail, this also breaks CFQ.
> >
> > Why do you need in-interrupt request allocation?
>
> Because I'd like to use blk_get_request() in q->request_fn()
> which can be called from interrupt context like below:
> scsi_io_completion -> scsi_end_request -> scsi_next_command
> -> scsi_run_queue -> blk_run_queue -> q->request_fn
>
> Generally, device-mapper (dm) clones an original I/O and dispatches
> the clones to underlying destination devices.
> In the request-based dm patch, the clone creation and the dispatch
> are done in q->request_fn(). To create the clone, blk_get_request()
> is used to get a request from underlying destination device's queue.
> By doing that in q->request_fn(), dm can deal with struct request
> after bios are merged by __make_request().
>
> Do you think creating another function like blk_get_request_nowait()
> is acceptable?
> Or request should not be allocated in q->request_fn() anyway?

You should not be allocating requests from that path, for a number of
reasons. The design isn't very nice either.

The easy way out would be to punt to a workqueue to handle the requests.

An alternative way would be to set aside some requests that you can get
at without allocation (maintain a little freelist of manually allocated
requests), and retrieve a free one from there when inside request_fn. If
you run out, just bail out of request_fn and make sure to reinvoke it
when some of your previously issued requests complete and are added back
to that freelist.

--
Jens Axboe

2006-12-20 19:11:42

by Chen, Kenneth W

[permalink] [raw]

Subject: RE: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Kiyoshi Ueda wrote on Wednesday, December 20, 2006 9:50 AM
> On Wed, 20 Dec 2006 14:48:49 +0100, Jens Axboe <[email protected]> wrote:
> > Big NACK on this - it's not only really ugly, it's also buggy to pass
> > interrupt flags as function arguments. As you also mention in the 0/1
> > mail, this also breaks CFQ.
> >
> > Why do you need in-interrupt request allocation?
>
> Because I'd like to use blk_get_request() in q->request_fn()
> which can be called from interrupt context like below:
> scsi_io_completion -> scsi_end_request -> scsi_next_command
> -> scsi_run_queue -> blk_run_queue -> q->request_fn
>
> [ ...]
>
> Do you think creating another function like blk_get_request_nowait()
> is acceptable?

You don't need to create another function. blk_get_request already
have both wait and nowait semantics via gfp_mask argument. If you can
not block, then clear __GFP_WAIT bit in the mask before calling
blk_get_request.

- Ken

2006-12-20 19:15:50

by Jens Axboe

[permalink] [raw]

Subject: Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Wed, Dec 20 2006, Chen, Kenneth W wrote:
> Kiyoshi Ueda wrote on Wednesday, December 20, 2006 9:50 AM
> > On Wed, 20 Dec 2006 14:48:49 +0100, Jens Axboe <[email protected]> wrote:
> > > Big NACK on this - it's not only really ugly, it's also buggy to pass
> > > interrupt flags as function arguments. As you also mention in the 0/1
> > > mail, this also breaks CFQ.
> > >
> > > Why do you need in-interrupt request allocation?
> >
> > Because I'd like to use blk_get_request() in q->request_fn()
> > which can be called from interrupt context like below:
> > scsi_io_completion -> scsi_end_request -> scsi_next_command
> > -> scsi_run_queue -> blk_run_queue -> q->request_fn
> >
> > [ ...]
> >
> > Do you think creating another function like blk_get_request_nowait()
> > is acceptable?
>
> You don't need to create another function. blk_get_request already
> have both wait and nowait semantics via gfp_mask argument. If you can
> not block, then clear __GFP_WAIT bit in the mask before calling
> blk_get_request.

Doesn't work, get_request() assumes that the caller grabbed the queue
lock and disabled interrupts, and does an unconditionaly

spin_unlock_irq()

inside it. So you can NEVER use get_request() for even GFP_ATOMIC
allocations, as it assumes the original context was a process context.

--
Jens Axboe

2006-12-20 21:56:05

by Kiyoshi Ueda

[permalink] [raw]

Subject: Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Hi Jens,

On Wed, 20 Dec 2006 19:49:17 +0100, Jens Axboe <[email protected]> wrote:
> > > Big NACK on this - it's not only really ugly, it's also buggy to pass
> > > interrupt flags as function arguments. As you also mention in the 0/1
> > > mail, this also breaks CFQ.
> > >
> > > Why do you need in-interrupt request allocation?
> >
> > Because I'd like to use blk_get_request() in q->request_fn()
> > which can be called from interrupt context like below:
> > scsi_io_completion -> scsi_end_request -> scsi_next_command
> > -> scsi_run_queue -> blk_run_queue -> q->request_fn
> >
> > Generally, device-mapper (dm) clones an original I/O and dispatches
> > the clones to underlying destination devices.
> > In the request-based dm patch, the clone creation and the dispatch
> > are done in q->request_fn(). To create the clone, blk_get_request()
> > is used to get a request from underlying destination device's queue.
> > By doing that in q->request_fn(), dm can deal with struct request
> > after bios are merged by __make_request().
> >
> > Do you think creating another function like blk_get_request_nowait()
> > is acceptable?
> > Or request should not be allocated in q->request_fn() anyway?
>
> You should not be allocating requests from that path, for a number of
> reasons.

Could I hear the reasons for my further work if possible?
Because of breaking current CFQ? And is there any reason?

> The design isn't very nice either.
>
> The easy way out would be to punt to a workqueue to handle the requests.
>
> An alternative way would be to set aside some requests that you can get
> at without allocation (maintain a little freelist of manually allocated
> requests), and retrieve a free one from there when inside request_fn. If
> you run out, just bail out of request_fn and make sure to reinvoke it
> when some of your previously issued requests complete and are added back
> to that freelist.

Thank you for the suggestions.
OK, I'll think other designs based on your suggestions.

Thanks,
Kiyoshi Ueda

2006-12-21 07:51:17

by Jens Axboe

[permalink] [raw]

Subject: Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Wed, Dec 20 2006, Kiyoshi Ueda wrote:
> Hi Jens,
>
> On Wed, 20 Dec 2006 19:49:17 +0100, Jens Axboe <[email protected]> wrote:
> > > > Big NACK on this - it's not only really ugly, it's also buggy to pass
> > > > interrupt flags as function arguments. As you also mention in the 0/1
> > > > mail, this also breaks CFQ.
> > > >
> > > > Why do you need in-interrupt request allocation?
> > >
> > > Because I'd like to use blk_get_request() in q->request_fn()
> > > which can be called from interrupt context like below:
> > > scsi_io_completion -> scsi_end_request -> scsi_next_command
> > > -> scsi_run_queue -> blk_run_queue -> q->request_fn
> > >
> > > Generally, device-mapper (dm) clones an original I/O and dispatches
> > > the clones to underlying destination devices.
> > > In the request-based dm patch, the clone creation and the dispatch
> > > are done in q->request_fn(). To create the clone, blk_get_request()
> > > is used to get a request from underlying destination device's queue.
> > > By doing that in q->request_fn(), dm can deal with struct request
> > > after bios are merged by __make_request().
> > >
> > > Do you think creating another function like blk_get_request_nowait()
> > > is acceptable?
> > > Or request should not be allocated in q->request_fn() anyway?
> >
> > You should not be allocating requests from that path, for a number of
> > reasons.
>
> Could I hear the reasons for my further work if possible?
> Because of breaking current CFQ? And is there any reason?

Mainly I just don't like the design, there are better ways to achieve
what you need. The block layer has certain assumptions on the context
from which rq allocation happens, and this breaks it. As I also
mentioned, you cannot pass flags around as arguments. So the patch is
even broken as-is.

--
Jens Axboe

2006-12-21 18:11:42

by Kiyoshi Ueda

[permalink] [raw]

Subject: Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Hi Jens,

OK, I understand that.
But I think that the block layer assumption (depending on "current")
is not ideal.
Anyway, thank you for the information.

Thanks,
Kiyoshi Ueda

On Thu, 21 Dec 2006 08:53:05 +0100, Jens Axboe <[email protected]> wrote:
> On Wed, Dec 20 2006, Kiyoshi Ueda wrote:
> > Hi Jens,
> >
> > On Wed, 20 Dec 2006 19:49:17 +0100, Jens Axboe <[email protected]> wrote:
> > > > > Big NACK on this - it's not only really ugly, it's also buggy to pass
> > > > > interrupt flags as function arguments. As you also mention in the 0/1
> > > > > mail, this also breaks CFQ.
> > > > >
> > > > > Why do you need in-interrupt request allocation?
> > > >
> > > > Because I'd like to use blk_get_request() in q->request_fn()
> > > > which can be called from interrupt context like below:
> > > > scsi_io_completion -> scsi_end_request -> scsi_next_command
> > > > -> scsi_run_queue -> blk_run_queue -> q->request_fn
> > > >
> > > > Generally, device-mapper (dm) clones an original I/O and dispatches
> > > > the clones to underlying destination devices.
> > > > In the request-based dm patch, the clone creation and the dispatch
> > > > are done in q->request_fn(). To create the clone, blk_get_request()
> > > > is used to get a request from underlying destination device's queue.
> > > > By doing that in q->request_fn(), dm can deal with struct request
> > > > after bios are merged by __make_request().
> > > >
> > > > Do you think creating another function like blk_get_request_nowait()
> > > > is acceptable?
> > > > Or request should not be allocated in q->request_fn() anyway?
> > >
> > > You should not be allocating requests from that path, for a number of
> > > reasons.
> >
> > Could I hear the reasons for my further work if possible?
> > Because of breaking current CFQ? And is there any reason?
>
> Mainly I just don't like the design, there are better ways to achieve
> what you need. The block layer has certain assumptions on the context
> from which rq allocation happens, and this breaks it. As I also
> mentioned, you cannot pass flags around as arguments. So the patch is
> even broken as-is.
>
> --
> Jens Axboe

2006-12-21 18:13:15

by Mike Christie

[permalink] [raw]

Subject: Re: [dm-devel] Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Mike Christie wrote:
> Jens Axboe wrote:
>> On Wed, Dec 20 2006, Kiyoshi Ueda wrote:
>>> Hi Jens,
>>>
>>> On Wed, 20 Dec 2006 19:49:17 +0100, Jens Axboe <[email protected]> wrote:
>>>>>> Big NACK on this - it's not only really ugly, it's also buggy to pass
>>>>>> interrupt flags as function arguments. As you also mention in the 0/1
>>>>>> mail, this also breaks CFQ.
>>>>>>
>>>>>> Why do you need in-interrupt request allocation?
>>>>>
>>>>> Because I'd like to use blk_get_request() in q->request_fn()
>>>>> which can be called from interrupt context like below:
>>>>> scsi_io_completion -> scsi_end_request -> scsi_next_command
>>>>> -> scsi_run_queue -> blk_run_queue -> q->request_fn
>>>>>
>>>>> Generally, device-mapper (dm) clones an original I/O and dispatches
>>>>> the clones to underlying destination devices.
>>>>> In the request-based dm patch, the clone creation and the dispatch
>>>>> are done in q->request_fn(). To create the clone, blk_get_request()
>>>>> is used to get a request from underlying destination device's queue.
>>>>> By doing that in q->request_fn(), dm can deal with struct request
>>>>> after bios are merged by __make_request().
>>>>>
>>>>> Do you think creating another function like blk_get_request_nowait()
>>>>> is acceptable?
>>>>> Or request should not be allocated in q->request_fn() anyway?
>>>> You should not be allocating requests from that path, for a number of
>>>> reasons.
>>> Could I hear the reasons for my further work if possible?
>>> Because of breaking current CFQ? And is there any reason?
>> Mainly I just don't like the design, there are better ways to achieve
>> what you need. The block layer has certain assumptions on the context
>> from which rq allocation happens, and this breaks it. As I also
>> mentioned, you cannot pass flags around as arguments. So the patch is
>> even broken as-is.
>>
>
>
> I was thinking that since this was going to be hooked into dm which has
> the make_request hook in code, could we just allocate the cloned request
> when from dm's make_request callout. The dm queue would call
> __make_request, and if it detected that the bio started a new request it
> would just allocate a second request which would be used as a clone or
> maybe the block layer could allocate the clone request for us. On the
> request_fn callout side, DM could then setup the cloned rq based on the
> original fields and pass it down to the dm-multipath request_fn. The
> dm-mutlipath request_fn then just decides which path to use based on the
> path-selector modules and then we send it off.
>

Or the block layer code could set up the clone too. elv_next_request
could prep a clone based on the orignal request for the driver then dm
would not have to worry about that part.

2006-12-21 18:19:57

by Jens Axboe

[permalink] [raw]

Subject: Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Thu, Dec 21 2006, Kiyoshi Ueda wrote:
> Hi Jens,
>
> OK, I understand that.
> But I think that the block layer assumption (depending on "current")
> is not ideal.
> Anyway, thank you for the information.

(don't top post)

Well, how else would you throttle request allocations on a process
basis? IO priorities don't extend to request allocation yet, but if/when
they do, this will extend it. It may not be ideal for your situation,
but it is for a host of other uses.

It's relatively easy to do this in the driver - you need one request
allocated at init time, as a backup. Then instead of using
blk_get_request(), you use use kmalloc or kmem_cache_alloc from the
block dev rq pool. If the allocation fails and you have nothing in
flight, you grab the request you allocated at init time and use that.
Export the rq initialization as needed from ll_rw_blk.c.

And that's it, really. Without uglification to ll_rw_blk.c or stripping
features from that.

--
Jens Axboe

2006-12-21 18:22:41

by Jens Axboe

[permalink] [raw]

Subject: Re: [dm-devel] Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Thu, Dec 21 2006, Mike Christie wrote:
> Or the block layer code could set up the clone too. elv_next_request
> could prep a clone based on the orignal request for the driver then dm
> would not have to worry about that part.

It really can't, since it doesn't know how to allocate the clone
request. I'd rather export this functionality as helpers.

--
Jens Axboe

2006-12-21 18:27:49

by Mike Christie

[permalink] [raw]

Subject: Re: [dm-devel] Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Jens Axboe wrote:
> On Wed, Dec 20 2006, Kiyoshi Ueda wrote:
>> Hi Jens,
>>
>> On Wed, 20 Dec 2006 19:49:17 +0100, Jens Axboe <[email protected]> wrote:
>>>>> Big NACK on this - it's not only really ugly, it's also buggy to pass
>>>>> interrupt flags as function arguments. As you also mention in the 0/1
>>>>> mail, this also breaks CFQ.
>>>>>
>>>>> Why do you need in-interrupt request allocation?
>>>>
>>>> Because I'd like to use blk_get_request() in q->request_fn()
>>>> which can be called from interrupt context like below:
>>>> scsi_io_completion -> scsi_end_request -> scsi_next_command
>>>> -> scsi_run_queue -> blk_run_queue -> q->request_fn
>>>>
>>>> Generally, device-mapper (dm) clones an original I/O and dispatches
>>>> the clones to underlying destination devices.
>>>> In the request-based dm patch, the clone creation and the dispatch
>>>> are done in q->request_fn(). To create the clone, blk_get_request()
>>>> is used to get a request from underlying destination device's queue.
>>>> By doing that in q->request_fn(), dm can deal with struct request
>>>> after bios are merged by __make_request().
>>>>
>>>> Do you think creating another function like blk_get_request_nowait()
>>>> is acceptable?
>>>> Or request should not be allocated in q->request_fn() anyway?
>>> You should not be allocating requests from that path, for a number of
>>> reasons.
>> Could I hear the reasons for my further work if possible?
>> Because of breaking current CFQ? And is there any reason?
>
> Mainly I just don't like the design, there are better ways to achieve
> what you need. The block layer has certain assumptions on the context
> from which rq allocation happens, and this breaks it. As I also
> mentioned, you cannot pass flags around as arguments. So the patch is
> even broken as-is.
>

I was thinking that since this was going to be hooked into dm which has
the make_request hook in code, could we just allocate the cloned request
when from dm's make_request callout. The dm queue would call
__make_request, and if it detected that the bio started a new request it
would just allocate a second request which would be used as a clone or
maybe the block layer could allocate the clone request for us. On the
request_fn callout side, DM could then setup the cloned rq based on the
original fields and pass it down to the dm-multipath request_fn. The
dm-mutlipath request_fn then just decides which path to use based on the
path-selector modules and then we send it off.

2006-12-21 18:31:15

by Mike Christie

[permalink] [raw]

Subject: Re: [dm-devel] Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Jens Axboe wrote:
> On Thu, Dec 21 2006, Mike Christie wrote:
>> Or the block layer code could set up the clone too. elv_next_request
>> could prep a clone based on the orignal request for the driver then dm
>> would not have to worry about that part.
>
> It really can't, since it doesn't know how to allocate the clone
> request. I'd rather export this functionality as helpers.
>

What do you think about dm's plan to break up make_request into a
mapping function and in to the part the builds the bio into a request.
This would fit well with them being helpers and being able to allocate
the request from the correct context.

I see patches for that did not get posted, but I thought Joe and
Alasdair used to talk about that a lot and in the dm code I think there
is sill comments about doing it. Maybe the dm comments mentioned the
merge_fn, but I guess the merge_fn did not fit what they wanted to do or
something. I think Alasdair talked about this at one of his talks at OLS
or it was in a proposal for the kernel summit. I can dig up the mail if
you want.

2006-12-21 18:37:15

by Mike Christie

[permalink] [raw]

Subject: Re: [dm-devel] Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Mike Christie wrote:
> Jens Axboe wrote:
>> On Thu, Dec 21 2006, Mike Christie wrote:
>>> Or the block layer code could set up the clone too. elv_next_request
>>> could prep a clone based on the orignal request for the driver then dm
>>> would not have to worry about that part.
>> It really can't, since it doesn't know how to allocate the clone
>> request. I'd rather export this functionality as helpers.
>>
>
> What do you think about dm's plan to break up make_request into a
> mapping function and in to the part the builds the bio into a request.
> This would fit well with them being helpers and being able to allocate
> the request from the correct context.
>
> I see patches for that did not get posted, but I thought Joe and
> Alasdair used to talk about that a lot and in the dm code I think there
> is sill comments about doing it. Maybe the dm comments mentioned the
> merge_fn, but I guess the merge_fn did not fit what they wanted to do or
> something. I think Alasdair talked about this at one of his talks at OLS
> or it was in a proposal for the kernel summit. I can dig up the mail if
> you want.
>

Ignore that. The problem would be that we may not want to decide which
path to use at map time.

2006-12-21 18:39:01

by Jens Axboe

[permalink] [raw]

Subject: Re: [dm-devel] Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Thu, Dec 21 2006, Mike Christie wrote:
> Jens Axboe wrote:
> > On Thu, Dec 21 2006, Mike Christie wrote:
> >> Or the block layer code could set up the clone too. elv_next_request
> >> could prep a clone based on the orignal request for the driver then dm
> >> would not have to worry about that part.
> >
> > It really can't, since it doesn't know how to allocate the clone
> > request. I'd rather export this functionality as helpers.
> >
>
> What do you think about dm's plan to break up make_request into a
> mapping function and in to the part the builds the bio into a request.
> This would fit well with them being helpers and being able to allocate
> the request from the correct context.

I think it sounds promising! dm probably still needs its own mempool for
request allocation, but that should be doable.

> I see patches for that did not get posted, but I thought Joe and
> Alasdair used to talk about that a lot and in the dm code I think there
> is sill comments about doing it. Maybe the dm comments mentioned the
> merge_fn, but I guess the merge_fn did not fit what they wanted to do or
> something. I think Alasdair talked about this at one of his talks at OLS
> or it was in a proposal for the kernel summit. I can dig up the mail if
> you want.

Not sure I remember the details of that one, so the mail/thread might be
useful.

--
Jens Axboe

2006-12-21 18:40:14

by Jens Axboe

[permalink] [raw]

Subject: Re: [dm-devel] Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Thu, Dec 21 2006, Mike Christie wrote:
> Mike Christie wrote:
> > Jens Axboe wrote:
> >> On Thu, Dec 21 2006, Mike Christie wrote:
> >>> Or the block layer code could set up the clone too. elv_next_request
> >>> could prep a clone based on the orignal request for the driver then dm
> >>> would not have to worry about that part.
> >> It really can't, since it doesn't know how to allocate the clone
> >> request. I'd rather export this functionality as helpers.
> >>
> >
> > What do you think about dm's plan to break up make_request into a
> > mapping function and in to the part the builds the bio into a request.
> > This would fit well with them being helpers and being able to allocate
> > the request from the correct context.
> >
> > I see patches for that did not get posted, but I thought Joe and
> > Alasdair used to talk about that a lot and in the dm code I think there
> > is sill comments about doing it. Maybe the dm comments mentioned the
> > merge_fn, but I guess the merge_fn did not fit what they wanted to do or
> > something. I think Alasdair talked about this at one of his talks at OLS
> > or it was in a proposal for the kernel summit. I can dig up the mail if
> > you want.
> >
>
> Ignore that. The problem would be that we may not want to decide which
> path to use at map time.

Latter part, or both paragraphs? Dipping into ->make_request_fn() for
some parts do seem to make sense to me. It'll be cheaper than at
potential soft irq time (from elv_next_request()).

--
Jens Axboe

2006-12-21 18:58:15

by Mike Christie

[permalink] [raw]

Subject: Re: [dm-devel] Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Jens Axboe wrote:
> On Thu, Dec 21 2006, Mike Christie wrote:
>> Mike Christie wrote:
>>> Jens Axboe wrote:
>>>> On Thu, Dec 21 2006, Mike Christie wrote:
>>>>> Or the block layer code could set up the clone too. elv_next_request
>>>>> could prep a clone based on the orignal request for the driver then dm
>>>>> would not have to worry about that part.
>>>> It really can't, since it doesn't know how to allocate the clone
>>>> request. I'd rather export this functionality as helpers.
>>>>
>>> What do you think about dm's plan to break up make_request into a
>>> mapping function and in to the part the builds the bio into a request.
>>> This would fit well with them being helpers and being able to allocate
>>> the request from the correct context.
>>>
>>> I see patches for that did not get posted, but I thought Joe and
>>> Alasdair used to talk about that a lot and in the dm code I think there
>>> is sill comments about doing it. Maybe the dm comments mentioned the
>>> merge_fn, but I guess the merge_fn did not fit what they wanted to do or
>>> something. I think Alasdair talked about this at one of his talks at OLS
>>> or it was in a proposal for the kernel summit. I can dig up the mail if
>>> you want.
>>>
>> Ignore that. The problem would be that we may not want to decide which
>> path to use at map time.
>
> Latter part, or both paragraphs? Dipping into ->make_request_fn() for
> some parts do seem to make sense to me. It'll be cheaper than at
> potential soft irq time (from elv_next_request()).
>

I think we got crisscrossed.

The original idea but using your helper suggestion would have been this:

dm->make_request_fn(bio)
{
rq = __make_request(bio)
if (this is a new request) {
allocate clone from either a real device/path specific mempool() or a
dm q mempool
}

dm->prep_fn(rq)
{
setup clone rq fields based on orig request fields.
}

dm->request_fn(rq)
{
figure out which path to use;
set rq->q;
send cloned rq to real device;
}

The second idea based on Joe and Alasdair to break up make_request would
just have been a more formal break up of the dm->make_request_fn above,
because I thought your comment about not knowing how to allocate the
clone request meant that we did not know which q's mempool to take the
request from if we were going to take the cloned request from the real
device/path's mempool. I guess this does not really matter since we can
have just a dm q mempool of requests to use for cloned requests.

2006-12-21 19:17:21

by Jens Axboe

[permalink] [raw]

Subject: Re: [dm-devel] Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Thu, Dec 21 2006, Mike Christie wrote:
> Jens Axboe wrote:
> > On Thu, Dec 21 2006, Mike Christie wrote:
> >> Mike Christie wrote:
> >>> Jens Axboe wrote:
> >>>> On Thu, Dec 21 2006, Mike Christie wrote:
> >>>>> Or the block layer code could set up the clone too. elv_next_request
> >>>>> could prep a clone based on the orignal request for the driver then dm
> >>>>> would not have to worry about that part.
> >>>> It really can't, since it doesn't know how to allocate the clone
> >>>> request. I'd rather export this functionality as helpers.
> >>>>
> >>> What do you think about dm's plan to break up make_request into a
> >>> mapping function and in to the part the builds the bio into a request.
> >>> This would fit well with them being helpers and being able to allocate
> >>> the request from the correct context.
> >>>
> >>> I see patches for that did not get posted, but I thought Joe and
> >>> Alasdair used to talk about that a lot and in the dm code I think there
> >>> is sill comments about doing it. Maybe the dm comments mentioned the
> >>> merge_fn, but I guess the merge_fn did not fit what they wanted to do or
> >>> something. I think Alasdair talked about this at one of his talks at OLS
> >>> or it was in a proposal for the kernel summit. I can dig up the mail if
> >>> you want.
> >>>
> >> Ignore that. The problem would be that we may not want to decide which
> >> path to use at map time.
> >
> > Latter part, or both paragraphs? Dipping into ->make_request_fn() for
> > some parts do seem to make sense to me. It'll be cheaper than at
> > potential soft irq time (from elv_next_request()).
> >
>
> I think we got crisscrossed.
>
> The original idea but using your helper suggestion would have been this:
>
> dm->make_request_fn(bio)
> {
> rq = __make_request(bio)
> if (this is a new request) {
> allocate clone from either a real device/path specific mempool() or a
> dm q mempool
> }
>
>
> dm->prep_fn(rq)
> {
> setup clone rq fields based on orig request fields.
> }
>
> dm->request_fn(rq)
> {
> figure out which path to use;
> set rq->q;
> send cloned rq to real device;
> }

This'll work nicely, much better.

> The second idea based on Joe and Alasdair to break up make_request would
> just have been a more formal break up of the dm->make_request_fn above,
> because I thought your comment about not knowing how to allocate the
> clone request meant that we did not know which q's mempool to take the
> request from if we were going to take the cloned request from the real
> device/path's mempool. I guess this does not really matter since we can
> have just a dm q mempool of requests to use for cloned requests.

Either approach is fine with me. Note that you need to be careful with
foreign requests on a queue, see the elevator drain logic for barriers
and scheduler switching.

--
Jens Axboe

2006-12-21 22:22:14

by Mike Christie

[permalink] [raw]

Subject: Re: [dm-devel] Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

Jens Axboe wrote:
> On Thu, Dec 21 2006, Mike Christie wrote:
>> Jens Axboe wrote:
>>> On Thu, Dec 21 2006, Mike Christie wrote:
>>>> Mike Christie wrote:
>>>>> Jens Axboe wrote:
>>>>>> On Thu, Dec 21 2006, Mike Christie wrote:
>>>>>>> Or the block layer code could set up the clone too. elv_next_request
>>>>>>> could prep a clone based on the orignal request for the driver then dm
>>>>>>> would not have to worry about that part.
>>>>>> It really can't, since it doesn't know how to allocate the clone
>>>>>> request. I'd rather export this functionality as helpers.
>>>>>>
>>>>> What do you think about dm's plan to break up make_request into a
>>>>> mapping function and in to the part the builds the bio into a request.
>>>>> This would fit well with them being helpers and being able to allocate
>>>>> the request from the correct context.
>>>>>
>>>>> I see patches for that did not get posted, but I thought Joe and
>>>>> Alasdair used to talk about that a lot and in the dm code I think there
>>>>> is sill comments about doing it. Maybe the dm comments mentioned the
>>>>> merge_fn, but I guess the merge_fn did not fit what they wanted to do or
>>>>> something. I think Alasdair talked about this at one of his talks at OLS
>>>>> or it was in a proposal for the kernel summit. I can dig up the mail if
>>>>> you want.
>>>>>
>>>> Ignore that. The problem would be that we may not want to decide which
>>>> path to use at map time.
>>> Latter part, or both paragraphs? Dipping into ->make_request_fn() for
>>> some parts do seem to make sense to me. It'll be cheaper than at
>>> potential soft irq time (from elv_next_request()).
>>>
>> I think we got crisscrossed.
>>
>> The original idea but using your helper suggestion would have been this:
>>
>> dm->make_request_fn(bio)
>> {
>> rq = __make_request(bio)
>> if (this is a new request) {
>> allocate clone from either a real device/path specific mempool() or a
>> dm q mempool
>> }
>>
>>
>> dm->prep_fn(rq)
>> {
>> setup clone rq fields based on orig request fields.
>> }
>>
>> dm->request_fn(rq)
>> {
>> figure out which path to use;
>> set rq->q;
>> send cloned rq to real device;
>> }
>
> This'll work nicely, much better.
>
>> The second idea based on Joe and Alasdair to break up make_request would
>> just have been a more formal break up of the dm->make_request_fn above,
>> because I thought your comment about not knowing how to allocate the
>> clone request meant that we did not know which q's mempool to take the
>> request from if we were going to take the cloned request from the real
>> device/path's mempool. I guess this does not really matter since we can
>> have just a dm q mempool of requests to use for cloned requests.
>
> Either approach is fine with me. Note that you need to be careful with
> foreign requests on a queue, see the elevator drain logic for barriers
> and scheduler switching.
>

What I proposed may not work so nicely as is. I remember when we tried
this before, that because __make_request lets go of the q lock, the q
can then be unplugged or it can be unplugged from __make_request if you
hit the unplug threshold so we would not be able to easily allocate a
cloned request from the dm make request callout and set it to the
request that is allocated in make_request. You have to do some surgery
to the make_request function to make this work.

Maybe your preallocted requests that are used from the request_fn is a
better idea.

2006-12-22 14:01:55

by Christoph Hellwig

[permalink] [raw]

Subject: Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Wed, Dec 20, 2006 at 12:50:02PM -0500, Kiyoshi Ueda wrote:
> Because I'd like to use blk_get_request() in q->request_fn()
> which can be called from interrupt context like below:
> scsi_io_completion -> scsi_end_request -> scsi_next_command
> -> scsi_run_queue -> blk_run_queue -> q->request_fn

> Or request should not be allocated in q->request_fn() anyway?
> Do you have any other ideas?

The right long-term fix is to make sure request_fn is only ever called
from process context. This means moving retry handling to a thread instead
of a softirq, but this will need careful performance testing and tweaking.

2006-12-22 17:30:27

by Jens Axboe

[permalink] [raw]

Subject: Re: [RFC PATCH 1/8] rqbased-dm: allow blk_get_request() to be called from interrupt context

On Fri, Dec 22 2006, Christoph Hellwig wrote:
> On Wed, Dec 20, 2006 at 12:50:02PM -0500, Kiyoshi Ueda wrote:
> > Because I'd like to use blk_get_request() in q->request_fn()
> > which can be called from interrupt context like below:
> > scsi_io_completion -> scsi_end_request -> scsi_next_command
> > -> scsi_run_queue -> blk_run_queue -> q->request_fn
>
> > Or request should not be allocated in q->request_fn() anyway?
> > Do you have any other ideas?
>
> The right long-term fix is to make sure request_fn is only ever called
> from process context. This means moving retry handling to a thread instead
> of a softirq, but this will need careful performance testing and tweaking.

It's a lot more than just retry handling. Most driver reinvoke queueing
from the completion handling, so basically most of the time request_fn
is run from either softirq (like SCSI) or interrupt context (like IDE).

--
Jens Axboe