2008-02-20 18:52:57

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [Fwd: Re: Linux 2.6.25-rc2]

On Tue, Feb 19, 2008 at 07:00:51PM +1100, Greg Banks wrote:
> Tom Tucker wrote:
> > Bruce:
> >
> > I'll take a look...
> >
> > Tom
> >
> > On Mon, 2008-02-18 at 12:45 -0500, J. Bruce Fields wrote:
> >
> >> On Sun, Feb 17, 2008 at 04:02:45PM -0500, Trond Myklebust wrote:
> >>
> >>> Hi Bruce,
> >>>
> >>> Here is a question for you.
> >>>
> >>> Why does svc_close_all() get away with deleting xprt->xpt_ready
> >>> without holding the pool->sp_lock?
> >>>
> >> >From a quick look--I think the intention is that the code that calls it
> >> (in svc_destroy()) is only called after all other server threads have
> >> exited, and that there can't be anyone else monkeying with that service
> >> any more. But I haven't verified that really carefully.
> >>
> That's certainly the intention. The serv->sv_nrthreads fields is used
> as a refcount, which counts 1 for each nfsd thread and sometimes 1 to
> guard some short-term manipulations. This refcount should not drop to
> zero until the last thread exits. So by the time svc_close_all() is
> called, no thread can be looking at a pool's sp_sockets list. Each
> xprt could still be racily added to that list from softirq mode data
> ready handlers calling svc_xprt_enqueue(), but only until the xprt's
> xpo_detach method is called (which removes any data ready callbacks).
> At that point, no code should be modifying the xpt_ready field, and
> it may or may not be used to link the xprt into some pool->sp_sockets
> but we don't care because all the pools are about to be destroyed
> anyway.

OK, thanks very much for the confirmation. It looks like they've
finally decided the problem was actually with the memory allocator:

http://bugzilla.kernel.org/show_bug.cgi?id=9973

so I think we're OK.

> That's the way it's been working since 2.6.19, and I don't think any
> of Tom's patches changed that.
>
> I can think of a couple of things that could be wrong:
>
> * serv->sv_nrthreads is used in a few places, and there might
> be bugs in that which are getting that count wrong (when I left
> the code, all increments and decrements of that field went through
> svc_get() and svc_destroy(), but other changes have crept in).
>
> * Currently running data ready callbacks might be racing with xpo_detach.
> Moving that call inside the spin_lock_bh() critical section just
> after it might help.

But if you were unsure about these two things then I suppose it might be
worth doing a closer audit and seeing if there's any refactoring or
documentation that could be added to make the rules clearer.

--b.