LinuxLists.cc - Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

2001-02-06 13:58:23

Subject: Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

>Hi,
>
>On Mon, Feb 05, 2001 at 08:01:45PM +0530, [email protected] wrote:
>>
>> >It's the very essence of readahead that we wake up the earlier buffers
>> >as soon as they become available, without waiting for the later ones
>> >to complete, so we _need_ this multiple completion concept.
>>
>> I can understand this in principle, but when we have a single request
going
>> down to the device that actually fills in multiple buffers, do we get
>> notified (interrupted) by the device before all the data in that request
>> got transferred ?
>
>It depends on the device driver. Different controllers will have
>different maximum transfer size. For IDE, for example, we get wakeups
>all over the place. For SCSI, it depends on how many scatter-gather
>entries the driver can push into a single on-the-wire request. Exceed
>that limit and the driver is forced to open a new scsi mailbox, and
>you get independent completion signals for each such chunk.

I see. I remember Jens Axboe mentioning something like this with IDE.
So, in this case, you want every such chunk to check if its completed
filling up a buffer and then trigger a wakeup on that ?
But, does this also mean that in such a case combining requests beyond this
limit doesn't really help ? (Reordering requests to get contiguity would
help of course in terms of seek times, I guess, but not merging beyond this
limit)

>> >Which is exactly why we have one kiobuf per higher-level buffer, and
>> >we chain together kiobufs when we need to for a long request, but we
>> >still get the independent completion notifiers.
>>
>> As I mentioned above, the alternative is to have the i/o completion
related
>> linkage information within the wakeup structures instead. That way, it
>> doesn't matter to the lower level driver what higher level structure we
>> have above (maybe buffer heads, may be page cache structures, may be
>> kiobufs). We only chain together memory descriptors for the buffers
during
>> the io.
>
>You forgot IO failures: it is essential, once the IO completes, to
>know exactly which higher-level structures completed successfully and
>which did not. The low-level drivers have to have access to the
>independent completion notifications for this to work.
>
No, I didn't forget IO failures; just that I expect the wait structure
containing the wakeup function to be embedded in a cev structure that
contains a pointer to the wait_queue_head field in the higher level
structure. The rest is for the wakeup function to interpret (it can always
access the other fields in the higher level structure - just like
list_entry() does)

Later I realized that instead of having multiple wakeup functions queued on
the low level structures wait queue, its perhaps better to just sort of
turn the cev_wait structure upside down (entry on the lower level
structure's queue should link to the parent entries instead).

2001-02-06 14:08:14

by Jens Axboe

[permalink] [raw]

Subject: Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

On Tue, Feb 06 2001, [email protected] wrote:
> >It depends on the device driver. Different controllers will have
> >different maximum transfer size. For IDE, for example, we get wakeups
> >all over the place. For SCSI, it depends on how many scatter-gather
> >entries the driver can push into a single on-the-wire request. Exceed
> >that limit and the driver is forced to open a new scsi mailbox, and
> >you get independent completion signals for each such chunk.

SCSI does not build a request bigger than the low level driver
can handle. If you exceed the scatter count in a single request,
you just stop and fire of that request, later on restarting I/O
on the remainder.

> I see. I remember Jens Axboe mentioning something like this with IDE.
> So, in this case, you want every such chunk to check if its completed
> filling up a buffer and then trigger a wakeup on that ?

Yes. Which is why dealing with buffer heads are so nice in this
regard, you never have problems with ending I/O on a single "piece".

> But, does this also mean that in such a case combining requests beyond this
> limit doesn't really help ? (Reordering requests to get contiguity would
> help of course in terms of seek times, I guess, but not merging beyond this
> limit)

There's a slight benefit in building bigger requests than the driver
can handle, in that you can have more I/O pending on the queue. It's
not worth spending too much time on though.

--
Jens Axboe