2001-02-01 05:03:12

by Suparna Bhattacharya

Subject: Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains



>My first comment is that this looks very heavyweight indeed. Isn't it
>just over-engineered?

Yes, I know it is, in its current form (sigh!).

But at the same time, I do not want to give up (not yet, at least) on
trying to arrive at something that can serve the objectives, and yet be
simple in principle and lightweight too. I feel the need may grow as we
have more filter layers coming in, and as async i/o and even i/o
cancellation usage increases. And it may not be just with kiobufs ...

I took a second pass at it last night based on Ben's wait queue
extensions. Will write that up in a separate note after this. Do let me
know if it seems like any improvement at all.

>We _do_ need the ability to stack completion events, but as far as the
>kiobuf work goes, my current thoughts are to do that by stacking
>lightweight "clone" kiobufs.

Would that work with stackable filesystems?

>
>The idea is that completion needs to pass upwards (a)
>bytes-transferred, and (b) errno, to satisfy the caller: everything
>else, including any private data, can be hooked by the caller off the
>kiobuf private data (or in fact the caller's private data can embed
>the clone kiobuf).
>
>A clone kiobuf is a simple header, nothing more, nothing less: it
>shares the same page vector as its parent kiobuf. It has private
>length/offset fields, so (for example) a LVM driver can carve the
>parent kiobuf into multiple non-overlapping children, all sharing the
>same page list but each one actually referencing only a small region
>of the whole.
>
>That ought to clean up a great deal of the problems of passing kiobufs
>through soft raid, LVM or loop drivers.
>
>I am tempted to add fields to allow the children of a kiobuf to be
>tracked and identified, but I'm really not sure it's necessary so I'll
>hold off for now. We already have the "io-count" field which
>enumerates sub-ios, so we can define each child to count as one such
>sub-io; and adding a parent kiobuf reference to each kiobuf makes a
>lot of sense if we want to make it easy to pass callbacks up the
>stack. More than that seems unnecessary for now.
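
To make the proposal above concrete, here is a rough userspace sketch of a clone kiobuf and of completion passing upwards. It is only an illustration under stated assumptions: struct sketch_kiobuf, sketch_clone(), sketch_sub_io_done() and sketch_complete() are hypothetical names, not the real struct kiobuf or any kernel API, and each clone is assumed to be submitted as exactly one device request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; NOT the kernel's struct kiobuf. */
struct sketch_kiobuf {
    void **pages;                  /* page vector, shared with the parent */
    size_t nr_pages;
    size_t offset;                 /* private byte offset into the page vector */
    size_t length;                 /* private byte length of this view */
    struct sketch_kiobuf *parent;  /* NULL for the original kiobuf */
    int io_count;                  /* outstanding sub-ios */
    size_t transferred;            /* (a) bytes transferred */
    int errno_val;                 /* (b) first error seen, 0 if none */
    void (*end_io)(struct sketch_kiobuf *);  /* optional callback */
};

/* Carve [offset, offset + length) out of the parent. The child is only a
 * header: it shares the page vector and counts as one sub-io of the parent. */
static void sketch_clone(struct sketch_kiobuf *parent,
                         struct sketch_kiobuf *child,
                         size_t offset, size_t length)
{
    assert(offset <= parent->length && length <= parent->length - offset);
    child->pages = parent->pages;
    child->nr_pages = parent->nr_pages;
    child->offset = parent->offset + offset;
    child->length = length;
    child->parent = parent;
    child->io_count = 1;           /* assumption: one device request per clone */
    child->transferred = 0;
    child->errno_val = 0;
    child->end_io = NULL;
    parent->io_count++;
}

static void sketch_complete(struct sketch_kiobuf *kio);

/* Record the result of one finished sub-io; complete kio when none remain. */
static void sketch_sub_io_done(struct sketch_kiobuf *kio, size_t bytes, int err)
{
    kio->transferred += bytes;
    if (err != 0 && kio->errno_val == 0)
        kio->errno_val = err;
    if (--kio->io_count == 0)
        sketch_complete(kio);
}

/* Run the callback, then pass bytes-transferred and errno up to the parent. */
static void sketch_complete(struct sketch_kiobuf *kio)
{
    if (kio->end_io)
        kio->end_io(kio);
    if (kio->parent)
        sketch_sub_io_done(kio->parent, kio->transferred, kio->errno_val);
}
```

In this model an LVM-like layer would call sketch_clone() once per non-overlapping region, and the lower layer would call sketch_sub_io_done() on each clone when its request finishes; the parent completes when its io_count drops to zero.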

Being able to track the children of a kiobuf would help with I/O
cancellation (e.g. to pull sub-ios off their request queues if I/O
cancellation for the parent kiobuf was issued). Not essential, I guess, in
general, but useful in some situations.
With clone kiobufs as described, there is no direct way to reach a clone
given the original kiobuf (without adding some indexing scheme).
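
As a rough illustration of the kind of tracking meant here, the sketch below links each clone onto a list hanging off its parent, so a cancel on the parent can find the sub-ios that are still queued. Everything in it is hypothetical (struct sketch_kio, the state enum, sketch_add_child(), sketch_cancel()); it also assumes that only sub-ios still sitting on a request queue can be pulled, and that in-flight ones are left to complete normally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states of a sub-io; QUEUED means still on a request queue. */
enum sketch_state {
    SKETCH_QUEUED,
    SKETCH_IN_FLIGHT,
    SKETCH_DONE,
    SKETCH_CANCELLED
};

/* Hypothetical header: only the fields needed for child tracking. */
struct sketch_kio {
    struct sketch_kio *parent;
    struct sketch_kio *first_child;   /* head of this kio's child list */
    struct sketch_kio *next_sibling;  /* next clone of the same parent */
    enum sketch_state state;
};

/* Link a clone onto its parent's child list (prepends to the list). */
static void sketch_add_child(struct sketch_kio *parent, struct sketch_kio *child)
{
    assert(child->parent == NULL);    /* a clone has exactly one parent */
    child->parent = parent;
    child->next_sibling = parent->first_child;
    parent->first_child = child;
}

/* Cancel the parent: mark every still-queued child as cancelled.
 * Returns the number of sub-ios that were pulled. */
static int sketch_cancel(struct sketch_kio *parent)
{
    int pulled = 0;
    struct sketch_kio *c;

    for (c = parent->first_child; c != NULL; c = c->next_sibling) {
        if (c->state == SKETCH_QUEUED) {
            c->state = SKETCH_CANCELLED;  /* stands in for dequeueing the request */
            pulled++;
        }
    }
    return pulled;
}
```

The cost is two extra pointers per kiobuf header, which is the trade-off against keeping clones as minimal as possible.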

>
>--Stephen




2001-02-01 12:21:49

by Stephen C. Tweedie

Subject: Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

Hi,

On Thu, Feb 01, 2001 at 10:25:22AM +0530, [email protected] wrote:
>
> >We _do_ need the ability to stack completion events, but as far as the
> >kiobuf work goes, my current thoughts are to do that by stacking
> >lightweight "clone" kiobufs.
>
> Would that work with stackable filesystems ?

Only if the filesystems were using VFS interfaces which used kiobufs.
Right now, the only filesystem using kiobufs is XFS, and it only
passes them down to the block device layer, not to other filesystems.

> Being able to track the children of a kiobuf would help with I/O
> cancellation (e.g. to pull sub-ios off their request queues if I/O
> cancellation for the parent kiobuf was issued). Not essential, I guess, in
> general, but useful in some situations.

What exactly is the justification for IO cancellation? It really
upsets the normal flow of control through the IO stack to have
voluntary cancellation semantics.

--Stephen

2001-02-01 19:34:10

by Chaitanya Tumuluri

Subject: Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

On Thu, 1 Feb 2001, Stephen C. Tweedie wrote:
> Hi,
>
> On Thu, Feb 01, 2001 at 10:25:22AM +0530, [email protected] wrote:
> >
> > Being able to track the children of a kiobuf would help with I/O
> > cancellation (e.g. to pull sub-ios off their request queues if I/O
> > cancellation for the parent kiobuf was issued). Not essential, I guess, in
> > general, but useful in some situations.
>
> What exactly is the justification for IO cancellation? It really
> upsets the normal flow of control through the IO stack to have
> voluntary cancellation semantics.
>
XFS does something called a "forced shutdown" of the filesystem in which
it requires outstanding I/Os issued against file data to be cancelled.
This is triggered by (among other things) errors in writing out file
metadata. I'm cc'ing Steve Lord so he can provide more information.

Of course, I was thinking along the lines of an API for flushing the requests
out of the elevator at that time ... didn't get too far with it, though.

Cheers,
-Chait.