2009-10-14 22:19:12

by Trond Myklebust

[permalink] [raw]
Subject: Re: [pnfs] [PATCH 1/1] nfs41: resolve some race conditions with queued SEQUENCE operations when unmounting

On Wed, 2009-10-14 at 15:09 -0700, Alexandros Batsakis wrote:
> On Wed, Oct 14, 2009 at 2:57 PM, Trond Myklebust
> <[email protected]> wrote:
> > On Wed, 2009-10-14 at 17:53 -0400, Trond Myklebust wrote:
> >> On Wed, 2009-10-14 at 14:50 -0700, Alexandros Batsakis wrote:
> >> > a) nfs41_sequence_done() called after destroy_session() that leads to
> >> > a NULL pointer dereference
> >> > b) a BADSESSION reply to a sequence operation triggers a
> >> > reset_session() at the same time with destroy_session() (called by
> >> > umount) that leads to another NULL pointer dereference.
> >>
> >> This would mean that nfs41_sequence_done is being called _after_ the
> >> nfs_client (and hence the session) has been destroyed. That sounds like
> >> the real bug that needs to be fixed.
> >
> > Correction: it means that nfs41_sequence_done is being called after the
> > superblock that "owns" those rpc calls has been destroyed. (Which is a
> > bug... :-))
> >
>
> Agreed. FWIW and from a conceptual point of view, the patch above is a
> bit orthogonal to that as it deals with the problem within the session
> scope. It treats the umount just as a session destroyer that happens
> to always destroy the super-block in the current
> one-session-per-client implementation. The latter may change, but the
> patch will remain relevant (obviously with few adjustments).

I completely disagree. If your code may end up starting the state
manager after the nfs_client is destroyed, then you _will_ corrupt
significant swathes of memory.

> Anyway, as long as we fix the bug I am happy :)

I don't think you have...

Trond