2008-06-18 15:59:37

by J. Bruce Fields

Subject: Re: [PATCH] knfsd: nfsd: Handle ERESTARTSYS from syscalls.

On Tue, Jun 17, 2008 at 05:03:01PM +1000, Neil Brown wrote:
> On Monday June 16, [email protected] wrote:
> > On Mon, 2008-06-16 at 11:09 -0400, Chuck Lever wrote:
> >
> > > I think an error reply is much better than no reply in nearly every
> > > case. NFS3ERR_JUKEBOX/NFS4ERR_DELAY is an interesting idea, but
> > > something else again will probably be required for v4.1 with sessions.
> >
> > NFS3ERR_JUKEBOX/NFS4ERR_DELAY may be inappropriate if the nfs daemon has
> > already started handling the RPC call, since you may be interrupting a
> > non-idempotent operation.
>
> If the filesystem allows you to interrupt a non-idempotent operation
> part way through, then the filesystem is doing something very wrong.
>
> The observed behaviour is that multiple 32K writes are outstanding
> (in different nfsd threads) when a signal is delivered to each nfsd.
>
> OCFS2 appears to be serialising these writes.
>
> One of the writes completes returning a length that is less than 32K.
> This length is returned to the client. A quick look at the client
> code suggests that it complains with a printk, and tries to write the
> remainder, which seems correct.
>
> The other writes all complete with ERESTARTSYS. Presumably they
> haven't started at all. If they had, you might expect a partial
> return from them too.
>
> So far, what OCFS2 is doing seems credible and doesn't leave us in an
> awkward position with respect to incomplete idempotent operations.
>
> I cannot be certain, but I'm willing to believe that OCFS2 only
> returns ERESTARTSYS when the operation hasn't been performed at all
> (or has been wound-back to the starting condition).
>
> I agree that NFS3ERR_JUKEBOX is more appropriate than no reply, but I
> don't think there is any reason to suspect that will not be
> sufficient.

OK. Want to send a replacement patch?

--b.
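
For readers following the thread, here is a rough user-space sketch of the
mapping being discussed: an interrupted operation is reported to the client
as NFS3ERR_JUKEBOX (try again later) rather than being dropped without a
reply. This is not the patch under review. The function name and the
SKETCH_ERESTARTSYS macro are made up for illustration, the macro mirrors the
kernel-internal value 512, and lumping every other error into NFS3ERR_IO is a
simplification. The status values are the ones from RFC 1813.

```c
#include <errno.h>

/* Assumption: mirrors the kernel-internal ERESTARTSYS value, which is
 * not exported through the user-space <errno.h>. */
#define SKETCH_ERESTARTSYS 512

/* NFSv3 status codes as defined in RFC 1813. */
enum sketch_nfsstat3 {
	SKETCH_NFS3_OK         = 0,
	SKETCH_NFS3ERR_IO      = 5,
	SKETCH_NFS3ERR_JUKEBOX = 10008
};

/*
 * Hypothetical helper: translate a negative errno-style return from a
 * filesystem call into an NFSv3 status.
 *
 * -ERESTARTSYS is taken to mean "the operation did not happen (or was
 * wound back)", so the client is asked to retry later. Everything else
 * is collapsed to NFS3ERR_IO here, which a real server would not do.
 */
static enum sketch_nfsstat3 sketch_errno_to_nfs3(int err)
{
	if (err >= 0)
		return SKETCH_NFS3_OK;	/* success, possibly a short write */

	switch (-err) {
	case SKETCH_ERESTARTSYS:
		return SKETCH_NFS3ERR_JUKEBOX;
	case EIO:
	default:
		return SKETCH_NFS3ERR_IO;
	}
}
```

A short write (a non-negative count below the requested 32K) falls through
the success path, so the client still sees the partial length and can write
the remainder itself, as described in the quoted text.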