2002-09-16 22:29:15

by Peter T. Breuer

Subject: tagged block requests

Can someone point me to some documentation or an example
or give me a quick rundown on how I should use the new
tagged block request structure in 2.5.3x?

It looks like something I want. I've already tried issuing
"special" requests as (re)ordering barriers, and that works
fine. How does the tag request interface fit in with that,
if it does?

Peter


2002-09-17 05:49:50

by Jens Axboe

Subject: Re: tagged block requests

On Tue, Sep 17 2002, Peter T. Breuer wrote:
> Can someone point me to some documentation or an example
> or give me a quick rundown on how I should use the new
> tagged block request structure in 2.5.3x?
>
> It looks like something I want. I've already tried issuing
> "special" requests as (re)ordering barriers, and that works
> fine. How does the tag request interface fit in with that,
> if it does?

The request tagging is used for hardware that can have multiple commands
in flight at any point in time. To do this, we need some sort of cookie
to differentiate between the different commands. For SCSI and IDE, we
use integer tags to do so. An example:

static void my_request_fn(request_queue_t *q)
{
	struct request *rq;

next:
	rq = elv_next_request(q);
	if (!rq)
		return;

	/*
	 * no free tag: assuming some tags are already in flight,
	 * ending those will restart queue handling
	 */
	if (blk_queue_start_tag(q, rq))
		return;

	/*
	 * now rq is tagged, rq->tag contains the integer identifier
	 * for this request
	 */
	dma_map_command();
	send_command_to_hw();
	goto next;
}

So request_fn calls blk_queue_start_tag(), which associates rq with a
free tag, if available. Then the hardware completes the request:

my_isr(..., devid, ...)
{
	struct my_dev *dev = devid;
	request_queue_t *q = &dev->queue;
	struct request *rq;
	int stat, tag;

	stat = read_device_stat(dev);

	/* tag is upper 5 bits of the 8-bit status */
	tag = (stat >> 3) & 0x1f;

	rq = blk_queue_find_tag(q, tag);

	if (stat & DEVICE_GOOD_STAT) {
		/* release the tag, then complete the request */
		blk_queue_end_tag(q, rq);
		complete_request(rq, 1);
	} else {
		/* push all in-flight requests back and restart them */
		blk_queue_invalidate_tags(q);
		lock_queue;
		my_request_fn(q);
		unlock_queue;
	}
}

On good status the tagged request is completed normally: its tag is
ended with blk_queue_end_tag() and the request is finished. On bad
status we invalidate the entire pending tag queue, because this
particular piece of hardware requires us to do so. Invalidation makes
sure the requests get moved from the tag queue back to the dispatch
queue for the device, so request_fn() gets a chance to start them over.

That's basically the API. In addition to the above,
blk_queue_init_tags(q, depth) sets up a queue for tagged operation and
blk_queue_free_tags(q) tears it down again.
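
As a rough userspace illustration of the bookkeeping described above (not the kernel code), a tag map can be modelled as an array indexed by tag. Everything named here (struct model_request, struct tag_map, the tag_map_* helpers) is hypothetical and only mirrors the roles of blk_queue_init_tags(), blk_queue_start_tag(), blk_queue_find_tag(), blk_queue_end_tag() and blk_queue_free_tags(); the linear search for a free slot is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct request; only the tag matters here. */
struct model_request {
    int tag;     /* -1 while untagged */
    int sector;  /* illustrative payload */
};

/* Hypothetical tag map: slot[i] holds the request that owns tag i. */
struct tag_map {
    struct model_request **slot;
    int depth;
    int busy;
};

/* Plays the role of blk_queue_init_tags(q, depth). Returns 0 on success. */
static int tag_map_init(struct tag_map *map, int depth)
{
    if (depth <= 0)
        return -1;
    map->slot = calloc((size_t)depth, sizeof(*map->slot));
    if (!map->slot)
        return -1;
    map->depth = depth;
    map->busy = 0;
    return 0;
}

/* Plays the role of blk_queue_free_tags(q). */
static void tag_map_free(struct tag_map *map)
{
    free(map->slot);
    map->slot = NULL;
    map->depth = 0;
    map->busy = 0;
}

/* Like blk_queue_start_tag(): 0 if rq got a tag, nonzero if none is free. */
static int tag_map_start(struct tag_map *map, struct model_request *rq)
{
    for (int tag = 0; tag < map->depth; tag++) {
        if (!map->slot[tag]) {
            map->slot[tag] = rq;
            rq->tag = tag;
            map->busy++;
            return 0;
        }
    }
    return 1;
}

/* Like blk_queue_find_tag(): the request owning this tag, or NULL. */
static struct model_request *tag_map_find(struct tag_map *map, int tag)
{
    if (tag < 0 || tag >= map->depth)
        return NULL;
    return map->slot[tag];
}

/* Like blk_queue_end_tag(): release the tag held by rq. */
static void tag_map_end(struct tag_map *map, struct model_request *rq)
{
    if (rq->tag < 0 || rq->tag >= map->depth || map->slot[rq->tag] != rq)
        return;
    map->slot[rq->tag] = NULL;
    rq->tag = -1;
    map->busy--;
}
```

With a depth of 2, a third tag_map_start() call returns nonzero until one of the first two requests is ended, which corresponds to the early return in my_request_fn() when no tag is free.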

Now how that fits in with whatever you are trying to do (which
apparently isn't tagging in the ordinary sense), I have no idea. But now
you should know what the interface does.

--
Jens Axboe

2002-09-17 08:07:17

by Andre Hedrick

Subject: Re: tagged block requests


Jens,

Help me out on this new idea of shoving tag management into BLOCK and away
from the local device. If I understand the process observed, one has to
dequeue the request from block and move it to a new listhead? If so why
do we not stuff the real queuehead with a bogus place holder? What I am
driving at specifically: why must the request be dequeued and not just
marked as in process and left on the queuehead?

If for some reason our device tanks and requires a reset or for some
unknown reason one needs to blast through the device and hit platter, we
end up frying all the data and requests with active tags in the device.

Thus one needs to invalidate the new-second listhead of requests moved off
the queuehead and find a way to stuff them back in the device queue. Now
if the device->queue is full, here is where I see us getting beat up.

We have to nuke requests in the queue or nuke the ones we are trying to
remerge. I have a real concern about this after tearing apart the
scsi-mid-layer and becoming ill. There are parts where the HBA marks the
hba->queue full and commands still arrive and are not blocked or the queue
is not unplugged or something. Thus seeing the cdb formation hitting
SCpnt->special and jumping the queue.

My question surrounds why remove the requests from the device->queue?
If we leave them on then if the device goes south and pees on itself, one
just clears the req->command_is_queued (may not exist yet) and req->tag
(may not exist yet).

If this is possible then we may be able to mark a series of requests as
ordered and strengthen the barrier operations, regardless if FUA is set in
SCSI or IDE (future opcode and messy!). This would require a group index
marker w/ associated ordering list, but this may be present already.

Sorry for being block stupid in 2.5 still.



Andre Hedrick
LAD Storage Consulting Group

2002-09-17 12:24:35

by Jens Axboe

Subject: Re: tagged block requests

On Tue, Sep 17 2002, Andre Hedrick wrote:
>
> Jens,
>
> Help me out on this new idea of shoving tag management into BLOCK and away
> from the local device. If I understand the process observed, one has to
> dequeue the request from block and move it to a new listhead? If so why
> do we not stuff the real queuehead with a bogus place holder? What I am
> driving at specifically: why must the request be dequeued and not just
> marked as in process and left on the queuehead?

blk_queue_start_tag() will dequeue the request itself, so in fact the
driver must not dequeue it before calling it. I think I'll change
blkdev_dequeue_request() to a list_del_init() though, so this won't hurt.

I see no point in leaving the request on the dispatch queue, and lots of
problems. For one, then you would have to maintain two lists which would
further increase the size of struct request. You would have to complicate
elv_next_request() to skip requests that have been tagged, instead of
just extracting the first one.
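
To make the cost of the leave-in-place alternative concrete, here is a small userspace sketch comparing the two lookups. The list type, the `tagged` flag and both function names are hypothetical; this is not how elv_next_request() is written, only an illustration of the extra walk it would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked request; "tagged" marks a request in flight. */
struct sketch_request {
    struct sketch_request *next;
    int tagged;
    int id;
};

/*
 * Separate-queue design: in-flight requests are unlinked from the
 * dispatch list, so the head is always the next one to serve.
 */
static struct sketch_request *next_when_dequeued(struct sketch_request *head)
{
    return head;
}

/*
 * Leave-in-place design: every lookup has to walk past the tagged
 * requests first. *skipped reports how many entries were stepped over.
 */
static struct sketch_request *next_when_left_in_place(struct sketch_request *head,
                                                      int *skipped)
{
    *skipped = 0;
    while (head && head->tagged) {
        (*skipped)++;
        head = head->next;
    }
    return head;
}
```

With two requests in flight at the front of the list, the second variant steps over both on every call, while the first returns immediately.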

> If for some reason our device tanks and requires a reset or for some
> unknown reason one needs to blast through the device and hit platter, we
> end up frying all the data and requests with active tags in the device.
>
> Thus one needs to invalidate the new-second listhead of requests moved off
> the queuehead and find a way to stuff them back in the device queue. Now
> if the device->queue is full, here is where I see us getting beat up.

This is the typical error case for devices using queueing, that the
hardware queue is invalidated by the device and the software queue must
be invalidated by the device driver. blk_queue_invalidate_tags() does
this for you, and even preserves the order of the requests.

I don't understand what you mean by device->queue. If you mean the hardware device
queue, that doesn't matter. The requests are now again safe and valid on
the block dispatch queue and will be served to the driver as if nothing
ever happened. The software device queue can never be full.

> We have to nuke requests in the queue or nuke the ones we are trying to
> remerge. I have a real concern about this after tearing apart the
> scsi-mid-layer and becoming ill. There are parts where the HBA marks the
> hba->queue full and commands still arrive and are not blocked or the queue
> is not unplugged or something. Thus seeing the cdb formation hitting
> SCpnt->special and jumping the queue.

I think you'll need to explain this bug some more, please detail in code
where you are seeing stuff go wrong.

> My question surrounds why remove the requests from the device->queue?
> If we leave them on then if the device goes south and pees on itself, one
> just clears the req->command_is_queued (may not exist yet) and req->tag
> (may not exist yet).

See my first paragraph on why this is a bad idea. The design is much
better with busy requests being on a separate queue, code is cleaner,
data structures slightly slimmer. Where are you seeing the
disadvantage??

> If this is possible then we may be able to mark a series of requests as
> ordered and strengthen the barrier operations, regardless if FUA is set in
> SCSI or IDE (future opcode and messy!). This would require a group index
> marker w/ associated ordering list, but this may be present already.

It's currently not possible to group requests, marking them as dependent
on each other.

--
Jens Axboe