2004-03-17 20:02:22

by Peter T. Breuer

[permalink] [raw]
Subject: floppy driver 2.6.3 question



In the 2.6.3 floppy driver, when the driver is asked to revalidate by
kernel check_disk_change (after the latter asks and the floppy signalled
media_changed), the floppy driver constructs a read bio for the first
block and submits it via submit_bio, and waits for completion of the
bio.

However, the bio's embedded completion only signals back if the
submitted bio was successful, as far as I can tell:


static int floppy_rb0_complete(struct bio *bio, unsigned int bytes_done, int err)
{
if (bio->bi_size)
return 1;

complete((struct completion*)bio->bi_private);
return 0;
}

Note that if the bi_size is nonzero, we return without signalling. Now
bi_size starts out nozero

bio.bi_size = size;

but I _think_ bi_size is zeroed along the way somewhere in end_request
(who knows?) if all goes well, so that nonzero means we still have more
to do in this bio. So if things go badly, completion is never signalled
and the submitted read is waited for forever? (and the result is never
tested).

submit_bio(READ, &bio);
generic_unplug_device(bdev_get_queue(bdev));
process_fd_request();
wait_for_completion(&complete);

__free_page(page);

My reading therefore is that we cannot do revalidation until we are
sure that the floppy is there. If we feel sure, but are wrong, the
test read of the first block will hang during the revalidation.

media_changed is tested using poll_drive(0,0). It's triggered by
kernel check_disk_change, which is run in open on the device.
Therefore I feel that opens may hang if the floppy is not there
when we think it is, such as if we yank it out just after it has been
polled, or if the floppy has bad sectors, or something like that.

Can somebody clear up my worry/confusion over how floppy_rb0_complete
can be correct to sometimes not signal completion? How can it risk
leaving us waiting forever? How can it even be called when the bio is
not yet complete? And if the bio is ended when not complete, don't we
want to know about it, because then we will be able to say that the
floppy is not there, or is invalid? Surely we don't want to wait!?




Peter



2004-03-18 07:14:28

by Jens Axboe

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

On Wed, Mar 17 2004, Peter T. Breuer wrote:
>
>
> In the 2.6.3 floppy driver, when the driver is asked to revalidate by
> kernel check_disk_change (after the latter asks and the floppy signalled
> media_changed), the floppy driver constructs a read bio for the first
> block and submits it via submit_bio, and waits for completion of the
> bio.
>
> However, the bio's embedded completion only signals back if the
> submitted bio was successful, as far as I can tell:
>
>
> static int floppy_rb0_complete(struct bio *bio, unsigned int bytes_done, int err)
> {
> if (bio->bi_size)
> return 1;
>
> complete((struct completion*)bio->bi_private);
> return 0;
> }
>
> Note that if the bi_size is nonzero, we return without signalling. Now
> bi_size starts out nozero
>
> bio.bi_size = size;
>
> but I _think_ bi_size is zeroed along the way somewhere in end_request
> (who knows?) if all goes well, so that nonzero means we still have more
> to do in this bio. So if things go badly, completion is never signalled
> and the submitted read is waited for forever? (and the result is never
> tested).

You are completely missing how it works... ->bi_end_io() is invoked on
every io completion event that the hardware generates, but typically
most only care about the final completion (and not any eventual partial
completions along the way).

So driver calls end_request* which does a bio_endio() on the number of
sectors passed in, which in turn decrements bio->bi_size. _If_
bio->bi_size never hits zero, then you have a driver bug. If things 'go
badly', then the driver will signal unsuccessful completion of X
sectors.

>
> submit_bio(READ, &bio);
> generic_unplug_device(bdev_get_queue(bdev));
> process_fd_request();
> wait_for_completion(&complete);
>
> __free_page(page);
>
> My reading therefore is that we cannot do revalidation until we are
> sure that the floppy is there. If we feel sure, but are wrong, the
> test read of the first block will hang during the revalidation.

Wrong.

--
Jens Axboe

2004-03-18 08:23:44

by Peter T. Breuer

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

Thanks very much for the response, Jens. I appreciate it.


"Also sprach Jens Axboe:"
> On Wed, Mar 17 2004, Peter T. Breuer wrote:
> >
> > In the 2.6.3 floppy driver, when the driver is asked to revalidate by
> > kernel check_disk_change (after the latter asks and the floppy signalled
> > media_changed), the floppy driver constructs a read bio for the first
> > block and submits it via submit_bio, and waits for completion of the
> > bio.
> >
> > However, the bio's embedded completion only signals back if the
> > submitted bio was successful, as far as I can tell:
> >
> >
> > static int floppy_rb0_complete(struct bio *bio, unsigned int bytes_done, int err)
> > {
> > if (bio->bi_size)
> > return 1;
> >
> > complete((struct completion*)bio->bi_private);
> > return 0;
> > }
> >
> > Note that if the bi_size is nonzero, we return without signalling. Now
> > bi_size starts out nozero
> >
> > bio.bi_size = size;
> >
> > but I _think_ bi_size is zeroed along the way somewhere in end_request
> > (who knows?) if all goes well, so that nonzero means we still have more
> > to do in this bio. So if things go badly, completion is never signalled
> > and the submitted read is waited for forever? (and the result is never
> > tested).
>
> You are completely missing how it works... ->bi_end_io() is invoked on
> every io completion event that the hardware generates,

You are saying the floppy_rb0_complete is possibly called multiple times
(presumably by end_that_request_first, or whatever_it_is_called :)? I
was not aware of that, but it makes sense, thank you. Once for each bio
in the request. But how can it be called multiple times for each bio?

I'm trying to understand the 2.6 block driver mechanisms :-(.

It's hard btw to see from the code precisely what the args to the call
of bi_end_io in end_that_request_first are. err seems to be either 0 or
-EIO (if the whole request was not uptodate) and bytes_done appears
always to be bi_size for the current bio. __make_request seems to be
able to generate a WOULDBLOCK error and bytes_done equal to the whole
request size. Yep - I'm confusicated as to whether there is some
invariant like bytes_done+bio->bi_size = const or not.


> but typically
> most only care about the final completion (and not any eventual partial
> completions along the way).

OK. Where approximately are the partial completion calls generated?


> So driver calls end_request* which does a bio_endio() on the number of
> sectors passed in, which in turn decrements bio->bi_size.

You know, I didn't see any such decrement in end_that-request_first.

> _If_
> bio->bi_size never hits zero, then you have a driver bug. If things 'go
> badly', then the driver will signal unsuccessful completion of X
> sectors.

Hmm, presumably in the case of unsuccessful completion, we still get
bio->b_size == 0 in the call to floppy_rb0_complete? That's good.
Then we will always be signalled.


> > submit_bio(READ, &bio);
> > generic_unplug_device(bdev_get_queue(bdev));
> > process_fd_request();
> > wait_for_completion(&complete);
> >
> > __free_page(page);
> >
> > My reading therefore is that we cannot do revalidation until we are
> > sure that the floppy is there. If we feel sure, but are wrong, the
> > test read of the first block will hang during the revalidation.
>
> Wrong.

OK, you are saying that the complete will be signalled after the
request has been errored. Thank you very much for the explanation!

Peter

2004-03-18 09:10:25

by Jens Axboe

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

On Thu, Mar 18 2004, Peter T. Breuer wrote:
> "Also sprach Jens Axboe:"
> > On Wed, Mar 17 2004, Peter T. Breuer wrote:
> > >
> > > In the 2.6.3 floppy driver, when the driver is asked to revalidate by
> > > kernel check_disk_change (after the latter asks and the floppy signalled
> > > media_changed), the floppy driver constructs a read bio for the first
> > > block and submits it via submit_bio, and waits for completion of the
> > > bio.
> > >
> > > However, the bio's embedded completion only signals back if the
> > > submitted bio was successful, as far as I can tell:
> > >
> > >
> > > static int floppy_rb0_complete(struct bio *bio, unsigned int bytes_done, int err)
> > > {
> > > if (bio->bi_size)
> > > return 1;
> > >
> > > complete((struct completion*)bio->bi_private);
> > > return 0;
> > > }
> > >
> > > Note that if the bi_size is nonzero, we return without signalling. Now
> > > bi_size starts out nozero
> > >
> > > bio.bi_size = size;
> > >
> > > but I _think_ bi_size is zeroed along the way somewhere in end_request
> > > (who knows?) if all goes well, so that nonzero means we still have more
> > > to do in this bio. So if things go badly, completion is never signalled
> > > and the submitted read is waited for forever? (and the result is never
> > > tested).
> >
> > You are completely missing how it works... ->bi_end_io() is invoked on
> > every io completion event that the hardware generates,
>
> You are saying the floppy_rb0_complete is possibly called multiple times
> (presumably by end_that_request_first, or whatever_it_is_called :)? I

It is called from end_that_request_first -> bio_endio() -> bi_end_io().
So _if_ the floppy driver uses partial completions (ie several calls to
end_that_request_first(), not just one at the end), then
floppy_rb0_complete will be called multiple times.

> was not aware of that, but it makes sense, thank you. Once for each bio
> in the request. But how can it be called multiple times for each bio?

A bio can contain any amount of data, basically. So if you get hardware
completions for each 512b sector and the bio contains 128K, you'd get
256 completions.

> I'm trying to understand the 2.6 block driver mechanisms :-(.
>
> It's hard btw to see from the code precisely what the args to the call
> of bi_end_io in end_that_request_first are. err seems to be either 0 or
> -EIO (if the whole request was not uptodate) and bytes_done appears
> always to be bi_size for the current bio. __make_request seems to be
> able to generate a WOULDBLOCK error and bytes_done equal to the whole
> request size. Yep - I'm confusicated as to whether there is some
> invariant like bytes_done+bio->bi_size = const or not.

I don't quite follow your last bytes_done+bio->bi_size, that doesn't
make any sense. If __make_request() calls bio_endio() with the full
size, you get floppy_rb0_complete() invoked immediately with bi_size ==
0, ie it completely ends the bio.

> > but typically
> > most only care about the final completion (and not any eventual partial
> > completions along the way).
>
> OK. Where approximately are the partial completion calls generated?

They are generated by the driver.

> > So driver calls end_request* which does a bio_endio() on the number of
> > sectors passed in, which in turn decrements bio->bi_size.
>
> You know, I didn't see any such decrement in end_that-request_first.

Follow the trace on step to bio_endio()

> > _If_
> > bio->bi_size never hits zero, then you have a driver bug. If things 'go
> > badly', then the driver will signal unsuccessful completion of X
> > sectors.
>
> Hmm, presumably in the case of unsuccessful completion, we still get
> bio->b_size == 0 in the call to floppy_rb0_complete? That's good.
> Then we will always be signalled.

Of course we do, otherwise you'd never be able to sleep on io
completion.

> > > submit_bio(READ, &bio);
> > > generic_unplug_device(bdev_get_queue(bdev));
> > > process_fd_request();
> > > wait_for_completion(&complete);
> > >
> > > __free_page(page);
> > >
> > > My reading therefore is that we cannot do revalidation until we are
> > > sure that the floppy is there. If we feel sure, but are wrong, the
> > > test read of the first block will hang during the revalidation.
> >
> > Wrong.
>
> OK, you are saying that the complete will be signalled after the
> request has been errored. Thank you very much for the explanation!

Yes

--
Jens Axboe

2004-03-18 10:05:53

by Peter T. Breuer

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

"Also sprach Jens Axboe:"
> On Thu, Mar 18 2004, Peter T. Breuer wrote:
> > You are saying the floppy_rb0_complete is possibly called multiple times
> > (presumably by end_that_request_first, or whatever_it_is_called :)? I
>
> It is called from end_that_request_first -> bio_endio() -> bi_end_io().
> So _if_ the floppy driver uses partial completions (ie several calls to
> end_that_request_first(), not just one at the end), then
> floppy_rb0_complete will be called multiple times.

Thanks - seen it now, and the bi_size decrement in bio_endio.

> > request size. Yep - I'm confusicated as to whether there is some
> > invariant like bytes_done+bio->bi_size = const or not.
>
> I don't quite follow your last bytes_done+bio->bi_size, that doesn't
> make any sense.

I mean that the call to bio->bi_endio carries a bytes_done arg, and
that I suspect that the value of bio->bi_size at that point represents
a "bytes_remaining_to_be_done", so that bytes_done+bio->bi_size will
be an invariant.

> If __make_request() calls bio_endio() with the full
> size, you get floppy_rb0_complete() invoked immediately with bi_size ==
> 0, ie it completely ends the bio.

And the value of bytes_done in the call to bio->bi_endio presumably is
tehn the full (original) size of the bio, so that the sum of the two
represents a constant invariant?

The reason why I was asking was because I was trying to figure out how
to handle removable media correctly, and following the floppy driver.

The floppy driver sends out a read_block_0 bio on device revalidation
and I was concerned because revalidation is called after a change of
state, whether for better or worse! So if the floppy change is for the
worse, then the bio will fail. But you have reassured me that it will
fail and still complete, thus allowing revalidation to continue.

The floppy driver doesn't notice if the revalidation bio failed or not,
btw (and I don't see an easy way of telling, hence my questions as to
what the bi_size and bytes_done "really" mean - I can return them). I
suppose it can tell via the drive hardware what the state is. I also
suppose reading the first block somehow triggers some kind of partition
revision or other magic?

What hurts is that when I copy what the floppy driver does in my driver,
then things hang in revalidation, with the submit_bio from the
read_block_0 hanging in a "down" (not in my code), and attempts to open
the device (which triggers the check_disk_change to run media_changed
and revalidation) hang in schedule_io or schedule with no progress.

Leaving off the read_block_0 from the revalidation helps thinsg along,
but the revalidation still seems fragile in some (as yet unknown) way,
and eventually, after a few cycles of revalidation, opens on the device
start to hang in a down or a io_schedule as before. I imagine that
perhaps the down is the one waiting for a request, but I really don't
know. 128 requests have just previously been errored due to readahead
and the check_media_changed result setting the driver request function
to error out requests. Perhaps we have run out of requests (they're
all put_ as far as I can see). Maybe the block layers get tired of
talking to a device that errors requests. I'm just feeling my way!
Any wild ideas are welcome!

Peter

2004-03-18 11:35:32

by Jens Axboe

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

On Thu, Mar 18 2004, Peter T. Breuer wrote:
> > > request size. Yep - I'm confusicated as to whether there is some
> > > invariant like bytes_done+bio->bi_size = const or not.
> >
> > I don't quite follow your last bytes_done+bio->bi_size, that doesn't
> > make any sense.
>
> I mean that the call to bio->bi_endio carries a bytes_done arg, and
> that I suspect that the value of bio->bi_size at that point represents
> a "bytes_remaining_to_be_done", so that bytes_done+bio->bi_size will
> be an invariant.

bio->bi_size always represents the number of bytes that remain for io.

> > If __make_request() calls bio_endio() with the full
> > size, you get floppy_rb0_complete() invoked immediately with bi_size ==
> > 0, ie it completely ends the bio.
>
> And the value of bytes_done in the call to bio->bi_endio presumably is
> tehn the full (original) size of the bio, so that the sum of the two
> represents a constant invariant?

bytes_done can be anything from (for fs code) 512 to bio->bi_size, in
multiples of 512.

> The reason why I was asking was because I was trying to figure out how
> to handle removable media correctly, and following the floppy driver.
>
> The floppy driver sends out a read_block_0 bio on device revalidation
> and I was concerned because revalidation is called after a change of
> state, whether for better or worse! So if the floppy change is for the
> worse, then the bio will fail. But you have reassured me that it will
> fail and still complete, thus allowing revalidation to continue.

At least that's the way it works from the block layer, the assumption is
based on the fact that floppy isn't buggy and generates a completion on
the request with the failure set in that case. Normal behaviour.

> The floppy driver doesn't notice if the revalidation bio failed or not,
> btw (and I don't see an easy way of telling, hence my questions as to
> what the bi_size and bytes_done "really" mean - I can return them). I
> suppose it can tell via the drive hardware what the state is. I also
> suppose reading the first block somehow triggers some kind of partition
> revision or other magic?

One way would be ala the attached, see how bio_endio() clears
BIO_UPTODATE on error.

> What hurts is that when I copy what the floppy driver does in my driver,
> then things hang in revalidation, with the submit_bio from the
> read_block_0 hanging in a "down" (not in my code), and attempts to open
> the device (which triggers the check_disk_change to run media_changed
> and revalidation) hang in schedule_io or schedule with no progress.
>
> Leaving off the read_block_0 from the revalidation helps thinsg along,
> but the revalidation still seems fragile in some (as yet unknown) way,
> and eventually, after a few cycles of revalidation, opens on the device
> start to hang in a down or a io_schedule as before. I imagine that
> perhaps the down is the one waiting for a request, but I really don't
> know. 128 requests have just previously been errored due to readahead
> and the check_media_changed result setting the driver request function
> to error out requests. Perhaps we have run out of requests (they're
> all put_ as far as I can see). Maybe the block layers get tired of
> talking to a device that errors requests. I'm just feeling my way!
> Any wild ideas are welcome!

If the 129th request fails, then that's a pretty good clue that you
aren't getting the requests freed. Perhaps you are overwriting something
in the request after allocating it? Always mask change ->flags (don't
just set it), and don't overwrite ->rl.

--
Jens Axboe


Attachments:
(No filename) (3.51 kB)
floppy-revalidate-failed-1 (521.00 B)
Download all attachments

2004-03-18 12:23:51

by Peter T. Breuer

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

"Also sprach Jens Axboe:"
> > The floppy driver doesn't notice if the revalidation bio failed or not,
> > btw (and I don't see an easy way of telling, hence my questions as to
> > what the bi_size and bytes_done "really" mean - I can return them). I
> > suppose it can tell via the drive hardware what the state is. I also
> > suppose reading the first block somehow triggers some kind of partition
> > revision or other magic?
>
> One way would be ala the attached, see how bio_endio() clears
> BIO_UPTODATE on error.

Very good. Applied. Thanks!

(do get_bio(&bio) before submit_bio(&bio) and do put_bio(&bio)
afterwards and examine bio.flags&BIO_UPTODATE on return from
wait_for_completion between the two).

> > know. 128 requests have just previously been errored due to readahead
> > and the check_media_changed result setting the driver request function
> > to error out requests. Perhaps we have run out of requests (they're
> > all put_ as far as I can see). Maybe the block layers get tired of
> > talking to a device that errors requests. I'm just feeling my way!
> > Any wild ideas are welcome!
>
> If the 129th request fails, then that's a pretty good clue that you
> aren't getting the requests freed. Perhaps you are overwriting something
> in the request after allocating it? Always mask change ->flags (don't
> just set it), and don't overwrite ->rl.

Good idea. rl is inviolate, but I set at least |=REQ_NOMERGE sometimes
on flags. And I pass ioctl information in fake requests by setting
the bit just beyond the edge of those currently used (__REQ_BITS) to
indicate its an ioctl and treating it specially in end request. Maybe
on error I forgot to remove the extra bit before doing put_blk_request
..

WIll look.

Thanks again.

Peter

2004-03-18 12:28:39

by Jens Axboe

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

On Thu, Mar 18 2004, Peter T. Breuer wrote:
> > > know. 128 requests have just previously been errored due to readahead
> > > and the check_media_changed result setting the driver request function
> > > to error out requests. Perhaps we have run out of requests (they're
> > > all put_ as far as I can see). Maybe the block layers get tired of
> > > talking to a device that errors requests. I'm just feeling my way!
> > > Any wild ideas are welcome!
> >
> > If the 129th request fails, then that's a pretty good clue that you
> > aren't getting the requests freed. Perhaps you are overwriting something
> > in the request after allocating it? Always mask change ->flags (don't
> > just set it), and don't overwrite ->rl.
>
> Good idea. rl is inviolate, but I set at least |=REQ_NOMERGE sometimes
> on flags. And I pass ioctl information in fake requests by setting

May I ask on what commands you set that bit?

> the bit just beyond the edge of those currently used (__REQ_BITS) to
> indicate its an ioctl and treating it specially in end request. Maybe
> on error I forgot to remove the extra bit before doing put_blk_request

Ugh, that sounds like very bad practice... The 'standard' way of doing
something like that is to flag REQ_SPECIAL and put whatever structure
you want in ->special.

--
Jens Axboe

2004-03-18 13:25:11

by Peter T. Breuer

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

"Also sprach Jens Axboe:"
> > Good idea. rl is inviolate, but I set at least |=REQ_NOMERGE sometimes
> > on flags. And I pass ioctl information in fake requests by setting
>
> May I ask on what commands you set that bit?

I set it on requests I have gotten myself via blk_get_request(..., WRITE,
GFP_ATOMIC) and which are destined to be passed onto the drivers request
queue and treated by the request function. The request function will
know what to do with them. The bit I mention below is also set on them:

> > the bit just beyond the edge of those currently used (__REQ_BITS) to
> > indicate its an ioctl and treating it specially in end request. Maybe
> > on error I forgot to remove the extra bit before doing put_blk_request
>
> Ugh, that sounds like very bad practice... The 'standard' way of doing
> something like that is to flag REQ_SPECIAL and put whatever structure
> you want in ->special.

Hmm .. I though SPECIAL was "just" to ensure ordering of requests and
I went to some lengths to ensure that if I receive a request then we
start diverting incoming requests to an alternate queue until we have
treated all the requests already on the device queue! Then we ack the
special and pass the requests back from the alternate queue. You are
telling me that I needn't have bothered since I'm the only one who
could generate a special? Owww.

Peter

2004-03-18 13:31:18

by Jens Axboe

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

On Thu, Mar 18 2004, Peter T. Breuer wrote:
> "Also sprach Jens Axboe:"
> > > Good idea. rl is inviolate, but I set at least |=REQ_NOMERGE sometimes
> > > on flags. And I pass ioctl information in fake requests by setting
> >
> > May I ask on what commands you set that bit?
>
> I set it on requests I have gotten myself via blk_get_request(..., WRITE,
> GFP_ATOMIC) and which are destined to be passed onto the drivers request
> queue and treated by the request function. The request function will
> know what to do with them. The bit I mention below is also set on them:

Ok, sounds fine.

> > > the bit just beyond the edge of those currently used (__REQ_BITS) to
> > > indicate its an ioctl and treating it specially in end request. Maybe
> > > on error I forgot to remove the extra bit before doing put_blk_request
> >
> > Ugh, that sounds like very bad practice... The 'standard' way of doing
> > something like that is to flag REQ_SPECIAL and put whatever structure
> > you want in ->special.
>
> Hmm .. I though SPECIAL was "just" to ensure ordering of requests and
> I went to some lengths to ensure that if I receive a request then we
> start diverting incoming requests to an alternate queue until we have
> treated all the requests already on the device queue! Then we ack the
> special and pass the requests back from the alternate queue. You are
> telling me that I needn't have bothered since I'm the only one who
> could generate a special? Owww.

No, ->special is for the driver receiving the request, I don't know what
kind of bastard driver you are trying to create :-). If you pass it on,
you cannot use it.

But your code just using the after-last bit is still severly broken, you
must not play tricks like that.

Actually, sounds like you are attempting to create an io stack at the
wrong level. Why aren't you just hooking into ->make_request_fn()
instead? You are in for all sorts of pains doing what you describe
above.

--
Jens Axboe

2004-03-18 16:06:29

by Peter T. Breuer

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

"Also sprach Jens Axboe:"
> On Thu, Mar 18 2004, Peter T. Breuer wrote:
> > Hmm .. I though SPECIAL was "just" to ensure ordering of requests and
> > I went to some lengths to ensure that if I receive a request then we
> > start diverting incoming requests to an alternate queue until we have
> > treated all the requests already on the device queue! Then we ack the
> > special and pass the requests back from the alternate queue. You are
> > telling me that I needn't have bothered since I'm the only one who
> > could generate a special? Owww.
>
> No, ->special is for the driver receiving the request,

I mean that I thought that if ->flags & REQ_SPECIAL were set, then
we were obliged to flush the driver request queue before treating
this request, and I also thought that the driver was liable to receive
such things from somewhere else.

What I now suppose from your words above is that

a) ->flags & REQ_SPECIAL is something I can set all on my own.
b) if I do (or even if I don't?), the ->special field is mine to use?
(I am surprised since I dimly recall spotting it used for
something).

:).

> I don't know what
> kind of bastard driver you are trying to create :-). If you pass it on,
> you cannot use it.
>
> But your code just using the after-last bit is still severly broken, you
> must not play tricks like that.

It gets even worse, since I was using the very highest bits of ->flag
for a sequence counter. Very successfully, I may add. Is there any
other place I can simply add to the request a counter for the number of
(write) requests having gone through the driver? The idea is that the
driver is the only one who knows the real time order in which requests
are received, so it must be the one to stamp them, and if somebody else
wants to receive these and preserve order, this sequence number must be
observed. But I didn't see anywhere to put them.

Are you going to say, set REQ_SPECIAL on everything and add all the
real request info and a bit more to ->special? I suspect you are!

> Actually, sounds like you are attempting to create an io stack at the
> wrong level. Why aren't you just hooking into ->make_request_fn()

I don't quite understand. Using a different make_request_fn for these
"special" requests is useful and indeed I do use one. But an "io
stack"?

> instead? You are in for all sorts of pains doing what you describe
> above.

Well, everything was working fine apart from death after revalidation!
But I am attempting a rewrite now in accordance with what you say,
which is why I am happy for elaboration/confirmation of (a-b) above.

I presume I can't get "special" requests off of anywhere but the READ
and WRITE free lists via blk_get_request? It would be easier to debug
if there were a separate list.


Peter

2004-03-18 16:16:54

by Jens Axboe

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

On Thu, Mar 18 2004, Peter T. Breuer wrote:
> "Also sprach Jens Axboe:"
> > On Thu, Mar 18 2004, Peter T. Breuer wrote:
> > > Hmm .. I though SPECIAL was "just" to ensure ordering of requests and
> > > I went to some lengths to ensure that if I receive a request then we
> > > start diverting incoming requests to an alternate queue until we have
> > > treated all the requests already on the device queue! Then we ack the
> > > special and pass the requests back from the alternate queue. You are
> > > telling me that I needn't have bothered since I'm the only one who
> > > could generate a special? Owww.
> >
> > No, ->special is for the driver receiving the request,
>
> I mean that I thought that if ->flags & REQ_SPECIAL were set, then
> we were obliged to flush the driver request queue before treating
> this request, and I also thought that the driver was liable to receive
> such things from somewhere else.

I dunno where you get those crazy ideas, REQ_SPECIAL has absolutely
nothing to do with that.

> What I now suppose from your words above is that
>
> a) ->flags & REQ_SPECIAL is something I can set all on my own.
> b) if I do (or even if I don't?), the ->special field is mine to use?
> (I am surprised since I dimly recall spotting it used for
> something).

You may use REQ_SPECIAL bit as you see fit, and ->special as well. You
don't have to use them together, must do though. However, as I said
earlier, if you push these requests on to someone else request queue,
you must not fiddle with REQ_SPECIAL and/or ->special. In that case you
cannot touch/use more than what the block layer already does.

> > I don't know what
> > kind of bastard driver you are trying to create :-). If you pass it on,
> > you cannot use it.
> >
> > But your code just using the after-last bit is still severly broken, you
> > must not play tricks like that.
>
> It gets even worse, since I was using the very highest bits of ->flag
> for a sequence counter. Very successfully, I may add. Is there any

Oh god that's horrible.

> other place I can simply add to the request a counter for the number of
> (write) requests having gone through the driver? The idea is that the
> driver is the only one who knows the real time order in which requests
> are received, so it must be the one to stamp them, and if somebody else
> wants to receive these and preserve order, this sequence number must be
> observed. But I didn't see anywhere to put them.
>
> Are you going to say, set REQ_SPECIAL on everything and add all the
> real request info and a bit more to ->special? I suspect you are!

That's where to put it, indeed. But if you pass it on...

> > Actually, sounds like you are attempting to create an io stack at the
> > wrong level. Why aren't you just hooking into ->make_request_fn()
>
> I don't quite understand. Using a different make_request_fn for these
> "special" requests is useful and indeed I do use one. But an "io
> stack"?

Sounds like you are acting as a middle man, hence an io stack. And the
place to do the stacking is at make_request_fn time, not at request_fn
time. If you happen to get it working, it'll be fragile at best.

> > instead? You are in for all sorts of pains doing what you describe
> > above.
>
> Well, everything was working fine apart from death after revalidation!
> But I am attempting a rewrite now in accordance with what you say,
> which is why I am happy for elaboration/confirmation of (a-b) above.
>
> I presume I can't get "special" requests off of anywhere but the READ
> and WRITE free lists via blk_get_request? It would be easier to debug
> if there were a separate list.

SPECIAL isn't a request type, it's not anything actually (it's to use
for the driver for whatever he wants it to mean). But typically drivers
us it to indicate that they have attached a driver_rq structure to
->special.

--
Jens Axboe

2004-03-18 18:11:17

by Peter T. Breuer

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

"Also sprach Jens Axboe:"
> > I mean that I thought that if ->flags & REQ_SPECIAL were set, then
> > we were obliged to flush the driver request queue before treating
> > this request, and I also thought that the driver was liable to receive
> > such things from somewhere else.
>
> I dunno where you get those crazy ideas, REQ_SPECIAL has absolutely
> nothing to do with that.

Sorry about that. Perhaps I saw it in 2.5.


> You may use REQ_SPECIAL bit as you see fit, and ->special as well. You
> don't have to use them together, must do though. However, as I said
> earlier, if you push these requests on to someone else request queue,
> you must not fiddle with REQ_SPECIAL and/or ->special. In that case you
> cannot touch/use more than what the block layer already does.

Well hooray. All seems to be working fine now that I shifted the burden
to ->special and stopped playing with ->flags (touch wood). Even
revalidation is working AOK as far as I can tell. I'll reenable that
read of the first block a-la-floppy to see if it causes some extra magic.

Many thanks!


> > Are you going to say, set REQ_SPECIAL on everything and add all the
> > real request info and a bit more to ->special? I suspect you are!
>
> That's where to put it, indeed. But if you pass it on...


Peter

2004-03-18 19:05:59

by Jens Axboe

[permalink] [raw]
Subject: Re: floppy driver 2.6.3 question

On Thu, Mar 18 2004, Peter T. Breuer wrote:
> > You may use REQ_SPECIAL bit as you see fit, and ->special as well. You
> > don't have to use them together, must do though. However, as I said
> > earlier, if you push these requests on to someone else request queue,
> > you must not fiddle with REQ_SPECIAL and/or ->special. In that case you
> > cannot touch/use more than what the block layer already does.
>
> Well hooray. All seems to be working fine now that I shifted the burden
> to ->special and stopped playing with ->flags (touch wood). Even
> revalidation is working AOK as far as I can tell. I'll reenable that
> read of the first block a-la-floppy to see if it causes some extra magic.
>
> Many thanks!

No problem, I'm happy it worked out for you.

--
Jens Axboe