2010-02-26 22:33:47

by NeilBrown

Subject: [PATCH] sunrpc: remove unnecessary svc_xprt_put


[I found this while looking for the current refcount problem
that triggers a warning in svc_recv. This isn't that bug
but is a different refcount bug - NB]

The 'struct svc_deferred_req's on the xpt_deferred queue do not
own a reference to the owning xprt. This is seen in svc_revisit
which is where things are added to this queue. dr->xprt is set to
NULL and the reference to the xprt is put.

So when this list is cleaned up in svc_delete_xprt, we mustn't
put the reference.

Also, replace the 'for' with a 'while' which is arguably
simpler and more likely to compile efficiently.

Cc: Tom Tucker <[email protected]>
Signed-off-by: NeilBrown <[email protected]>

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 7d1f9e9..4f30336 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -889,11 +889,8 @@ void svc_delete_xprt(struct svc_xprt *xprt)
 	if (test_bit(XPT_TEMP, &xprt->xpt_flags))
 		serv->sv_tmpcnt--;

-	for (dr = svc_deferred_dequeue(xprt); dr;
-	     dr = svc_deferred_dequeue(xprt)) {
-		svc_xprt_put(xprt);
+	while ((dr = svc_deferred_dequeue(xprt)) != NULL)
 		kfree(dr);
-	}

 	svc_xprt_put(xprt);
 	spin_unlock_bh(&serv->sv_lock);
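
A minimal userspace sketch of the imbalance the patch description talks about, not kernel code. The toy_* names and the plain int counter are assumptions made for illustration; they only model xpt_ref, the xpt_deferred queue, and the cleanup loop in svc_delete_xprt.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a request sitting on the xpt_deferred queue. */
struct toy_deferred {
	struct toy_deferred *next;
};

/* Toy stand-in for an svc_xprt: a counter plus the deferred queue. */
struct toy_xprt {
	int refcount;
	struct toy_deferred *deferred;
};

/* Models svc_defer() followed by svc_revisit() for one request. */
void toy_defer_and_revisit(struct toy_xprt *xprt)
{
	struct toy_deferred *dr = malloc(sizeof(*dr));

	assert(dr != NULL);
	xprt->refcount++;		/* deferral: the request takes a reference */
	dr->next = xprt->deferred;	/* revisit: the request is queued ... */
	xprt->deferred = dr;
	xprt->refcount--;		/* ... and its reference is dropped */
}

/*
 * Models the cleanup loop. put_per_request != 0 selects the pre-patch
 * behaviour of one extra put per queued request. Returns the final count.
 */
int toy_delete(struct toy_xprt *xprt, int put_per_request)
{
	struct toy_deferred *dr;

	while ((dr = xprt->deferred) != NULL) {
		xprt->deferred = dr->next;
		if (put_per_request)
			xprt->refcount--;
		free(dr);
	}
	xprt->refcount--;		/* the final put in the delete path */
	return xprt->refcount;
}

/* Start with one reference, queue nr_deferred requests, then delete. */
int toy_final_refcount(int nr_deferred, int put_per_request)
{
	struct toy_xprt xprt = { .refcount = 1, .deferred = NULL };
	int i;

	for (i = 0; i < nr_deferred; i++)
		toy_defer_and_revisit(&xprt);
	return toy_delete(&xprt, put_per_request);
}
```

In this model, toy_final_refcount(3, 0) ends at 0, while toy_final_refcount(3, 1) ends at -3: one put too many for every request that was still queued.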


2010-02-26 22:43:40

by J. Bruce Fields

Subject: Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put

On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
>
> [I found this while looking for the current refcount problem
> that triggers a warning in svc_recv. This isn't that bug
> but is a different refcount bug - NB]
>
> The 'struct svc_deferred_req's on the xpt_deferred queue do not
> own a reference to the owning xprt. This is seen in svc_revisit
> which is where things are added to this queue. dr->xprt is set to
> NULL and the reference to the xprt is put.
>
> So when this list is cleaned up in svc_delete_xprt, we mustn't
> put the reference.
>
> Also, replace the 'for' with a 'while' which is arguably
> simpler and more likely to compile efficiently.

OK, thanks, queuing up for 2.6.34 and stable.

--b.

>
> Cc: Tom Tucker <[email protected]>
> Signed-off-by: NeilBrown <[email protected]>
>
> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> index 7d1f9e9..4f30336 100644
> --- a/net/sunrpc/svc_xprt.c
> +++ b/net/sunrpc/svc_xprt.c
> @@ -889,11 +889,8 @@ void svc_delete_xprt(struct svc_xprt *xprt)
>  	if (test_bit(XPT_TEMP, &xprt->xpt_flags))
>  		serv->sv_tmpcnt--;
>
> -	for (dr = svc_deferred_dequeue(xprt); dr;
> -	     dr = svc_deferred_dequeue(xprt)) {
> -		svc_xprt_put(xprt);
> +	while ((dr = svc_deferred_dequeue(xprt)) != NULL)
>  		kfree(dr);
> -	}
>
>  	svc_xprt_put(xprt);
>  	spin_unlock_bh(&serv->sv_lock);

2010-02-26 22:53:15

by J. Bruce Fields

Subject: Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put

On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
>
> [I found this while looking for the current refcount problem
> that triggers a warning in svc_recv. This isn't that bug
> but is a different refcount bug - NB]

And thanks very much for looking into that, I'm worried.... Seems to
have appeared some time between v2.6.31 and v2.6.32.2. On a quick skim
commits in that range that struck me as worth a second look included
8f55f3c0a013, b0401d725334, and the 4.1 backchannel patches (3ddc8bf5f3
and preceding).

Oh, and I also have some very rough notes from when I looked at this
before, in case there's anything useful.

--b.

Re: 2.6.32.2 - WARNING: at lib/kref.c:43 kref_get+0x23/0x2b()

Seen on: 2.6.32.2, 2.6.32.6, 2.6.32.8; probably was OK on 2.6.29.6 and
2.6.31.

Is the warning actually warning about anything that's a problem, or can
that counter be zero by design? Yes, it's actually a problem.

Is probably svc_xprt_get(xprt) in svc_recv() (only obvious kref_get I
found on a quick glance through svc_recv).
Double-check:
svc_recv+0x305/0x7e6
Note next bug is on putting a socket (that we probably
shouldn't have!?):
- BUG_ON(inode->i_state == I_CLEAR).
- Implies clear_inode() was previously called on
it.
- stack includes kref_put() call in
svc_xprt_release, which is indeed put of same
xpt_ref field that svc_xprt_get() gets.

So, most probable explanation:
- We still had a dangling reference to an xprt after putting
one. So we ended up doing another get/put pair on it later
and trying to free the same socket twice.
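
A minimal userspace sketch of that explanation, under stated assumptions: the toy_kref type and its warnings/releases counters are hypothetical and only model the behaviour in the report, where a get on a zero count triggers the warning and a put that reaches zero runs the release path.

```c
#include <assert.h>

/* Toy model of a kref with bookkeeping for what went wrong. */
struct toy_kref {
	int refcount;
	int warnings;	/* number of gets that saw a zero count */
	int releases;	/* number of times the release path ran */
};

void toy_kref_init(struct toy_kref *k)
{
	k->refcount = 1;
	k->warnings = 0;
	k->releases = 0;
}

/* Models a get: taking a reference on a zero count is what gets warned about. */
void toy_kref_get(struct toy_kref *k)
{
	if (k->refcount == 0)
		k->warnings++;
	k->refcount++;
}

/* Models a put: the release path runs whenever the count reaches zero. */
void toy_kref_put(struct toy_kref *k)
{
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		k->releases++;
}

/* One legitimate final put, then a get/put pair through a stale pointer. */
struct toy_kref toy_stale_get_put(void)
{
	struct toy_kref k;

	toy_kref_init(&k);
	toy_kref_put(&k);	/* last real reference dropped: first release */
	toy_kref_get(&k);	/* stale user: count was zero, so this warns */
	toy_kref_put(&k);	/* count reaches zero again: second release */
	return k;
}
```

In this model the stale get/put pair produces one warning and two releases, which matches the pattern of a kref_get warning followed by a second attempt to free the same socket.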

So, plan: look for svc_xprt_puts (after checking for other stray uses of
xpt_ref) and verify that they're all legit. And gets while we're at it:

Ignore svc_rdma for now. Those reporters that answered weren't
using rdma.

Most puts outside of rdma are in svc_xprt.c:

- svc_xprt_release (unconditional): 0 to caller (put matched
  with removal from rq_xprt)
- svc_check_conn_limits(): 0 to caller
    - takes an xprt off a sv_tempsocks list, gets it (and
      sets XPT_CLOSE) before dropping sv_lock, then enqueues
      and puts. (Note: enqueue will get, and assign to
      rq_xprt, if thread found.)
- svc_age_temp_xprts: 0 to caller
    - same pattern as svc_check_conn_limits().
- svc_delete_xprt: 0 to caller (put matched with removal from
  xpt_list)
    - if test_and_set_bit of XPT_DEAD succeeds, will
      svc_xprt_put(), after calling xpo_detach, then (under
      sv_lock) removing xpt_list.
    - ALSO unconditionally puts once for each deferred
      request it finds associated with this request. Is
      that right? Yup: svc_defer() gets on success, when
      assigning dr->xprt.
- svc_close_xprt:
    - sets XPT_CLOSE, then if test_and_set_bit of XPT_BUSY
      succeeds, gets xprt, deletes, clears BUSY, puts.
- revisit:
    - puts associated xprt unconditionally.

Also some puts are in fs/nfsd/nfsctl.c, fs/nfsd/nfs4state.c,
fs/lockd/svc.c:

nfsctl.c:
fs/nfsd/nfsctl.c:__write_ports_delxprt():
- svc_find_xprt() gets a reference; if found:
svc_close_xprt, svc_xprt_put. OK.
nfs4state.c:
free_client: svc_xprt_put(clp->cl_cb_xprt);
Looks basically correct: we take reference when we
assign that in nfsd4_create_session.

Hm. Note we copy pointer to clp->cl_cb_xprt without
taking reference? The client holds a reference,
though. Looking at cb_xprt use in client xprt code, I
can't see any references taken or dropped. This all
looks fine.

lockd/svc.c: create_lock_listener() looks innocuous.

2010-02-27 00:40:59

by Tom Tucker

Subject: Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put

J. Bruce Fields wrote:
> On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
>
>> [I found this while looking for the current refcount problem
>> that triggers a warning in svc_recv. This isn't that bug
>> but is a different refcount bug - NB]
>>
>
>

I seem to recall that we added that reference for a reason. There was
an issue with unmount while there were deferrals pending. That's why the
reference was added.

Tom

> And thanks very much for looking into that, I'm worried.... Seems to
> have appeared some time between v2.6.31 and v2.6.32.2. On a quick skim
> commits in that range that struck me as worth a second look included
> 8f55f3c0a013, b0401d725334, and the 4.1 backchannel patches (3ddc8bf5f3
> and preceding).
>
> Oh, and I also have some very rough notes from when I looked at this
> before, in case there's anything useful.
>
> --b.
>
> Re: 2.6.32.2 - WARNING: at lib/kref.c:43 kref_get+0x23/0x2b()
>
> Seen on: 2.6.32.2, 2.6.32.6, 2.6.32.8; probably was OK on 2.6.29.6 and
> 2.6.31.
>
> Is the warning actually warning about anything that's a problem, or can
> that counter be zero by design? Yes, it's actually a problem.
>
> Is probably svc_xprt_get(xprt) in svc_recv() (only obvious kref_get I
> found on a quick glance through svc_recv).
> Double-check:
> svc_recv+0x305/0x7e6
> Note next bug is on putting a socket (that we probably
> shouldn't have!?):
> - BUG_ON(inode->i_state == I_CLEAR).
> - Implies clear_inode() was previously called on
> it.
> - stack includes kref_put() call in
> svc_xprt_release, which is indeed put of same
> xpt_ref field that svc_xprt_get() gets.
>
> So, most probable explanation:
> - We still had a dangling reference to an xprt after putting
> one. So we ended up doing another get/put pair on it later
> and trying to free the same socket twice.
>
> So, plan: look for svc_xprt_puts (after checking for other stray uses of
> xpt_ref) and verify that they're all legit. And gets while we're at it:
>
> Ignore svc_rdma for now. Those reporters that answered weren't
> using rdma.
>
> Most puts outside of rdma are in svc_xprt.c:
>
> - svc_xprt_release (unconditional): 0 to caller (put matched
> with removal from rq_xprt)
> - svc_check_conn_limits(): 0 to caller
> - takes an xprt off a sv_tempsocks list, gets it (and
> sets XPT_CLOSE) before dropping sv_lock, then enqueues
> and puts. (Note: enqueue will get, and assign to
> rq_xprt, if thread found.)
> - svc_age_temp_xprts: 0 to caller
> - same pattern as svc_check_conn_limits().
> - svc_delete_xprt: 0 to caller (put matched with removal from
> xpt_list)
> - if test_and_set_bit of XPT_DEAD succeeds, will
> svc_xprt_put(), after calling xpo_detach, then (under
> sv_lock) removing xpt_list.
> - ALSO unconditionally puts once for each deferred
> request it finds associated with this request. Is
> that right? Yup: svc_defer() gets on success, when
> assigning dr->xprt.
> - svc_close_xprt:
> - sets XPT_CLOSE, then if test_and_set_bit of XPT_BUSY
> succeeds, gets xprt, deletes, clears BUSY, puts.
> - revisit:
> - puts associated xprt unconditionally.
>
> Also some puts are in fs/nfsd/nfsctl.c, fs/nfsd/nfs4state.c,
> fs/lockd/svc.c:
>
> nfsctl.c:
> fs/nfsd/nfsctl.c:__write_ports_delxprt():
> - svc_find_xprt() gets a reference; if found:
> svc_close_xprt, svc_xprt_put. OK.
> nfs4state.c:
> free_client: svc_xprt_put(clp->cl_cb_xprt);
> Looks basically correct: we take reference when we
> assign that in nfsd4_create_session.
>
> Hm. Note we copy pointer to clp->cl_cb_xprt without
> taking reference? The client holds a reference,
> though. Looking at cb_xprt use in client xprt code, I
> can't see any references taken or dropped. This all
> looks fine.
>
> lockd/svc.c: create_lock_listener() looks innocuous.
>


2010-02-27 01:35:46

by NeilBrown

Subject: Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put

On Fri, 26 Feb 2010 18:40:58 -0600
Tom Tucker <[email protected]> wrote:

> J. Bruce Fields wrote:
> > On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
> >
> >> [I found this while looking for the current refcount problem
> >> that triggers a warning in svc_recv. This isn't that bug
> >> but is a different refcount bug - NB]
> >>
> >
> >
>
> I seem to recall that we added that reference for a reason. There was
> an issue with unmount while there were deferrals pending. That's why the
> reference was added.
>
> Tom

What reference?
What I (thought I) found was code that was dropping a reference which it
didn't hold. Are you saying that it is supposed to be holding a reference
here, but isn't, or that it really is holding a reference here and I didn't
see it?

And just for completeness, my understanding of the refcounting here is:

A counted reference is held on an svc_xprt when:
  - a 'struct svc_rqst' refers to it through ->rq_xprt
  - a 'cache_deferred_req' refers to it through ->xprt
    This only happens while the req is waiting to be
    revisited, and is in the hash table and on the lru.
    Once the req gets revisited (svc_revisit) ->xprt
    is set to NULL and the reference is dropped.
  - XPT_DEAD is *not* set. So the refcount is initialised
    to '1' to reflect this, and this ref is dropped
    when we set XPT_DEAD.
  - there are a few transient references in svc_xprt.c
    which very clearly have matched 'get' and 'put'.
  - svc_find_xprt returns a counted reference. This is
    called once in lockd and once in nfsd, and both
    calls drop the ref correctly.

Whenever we drop a counted ref that was stored in a pointer, we set that
pointer to NULL.
So if there was a race where two threads both get a reference from a pointer
and then drop that reference, you would expect that slightly different timing
would cause one of those threads to get a NULL from the pointer, dereference
it, and crash. There are no important tests-for-NULL on either of the
pointers in question, so that wouldn't be protecting us from a crash. But
we don't see that crash, so there cannot be a race there.

So: The refcount cannot possibly be zero in svc_recv :-)
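
The "set the pointer to NULL when dropping the ref" rule can be sketched in userspace as follows. This is an illustration only: toy_holder stands in for whatever stores the pointer (such as ->rq_xprt or dr->xprt), and the NULL check in toy_detach is an assumption added so the sketch reports the second drop instead of crashing, which is what the argument above expects the real code to do.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an svc_xprt: just the counter. */
struct toy_xprt {
	int refcount;
};

/* Toy stand-in for anything that stores a counted pointer to the xprt. */
struct toy_holder {
	struct toy_xprt *xprt;
};

/* Store the pointer and take the reference that goes with it. */
void toy_attach(struct toy_holder *h, struct toy_xprt *xprt)
{
	assert(h->xprt == NULL);
	xprt->refcount++;
	h->xprt = xprt;
}

/*
 * Clear the pointer and drop its reference.
 * Returns 0 on success, -1 if the pointer was already cleared.
 */
int toy_detach(struct toy_holder *h)
{
	struct toy_xprt *xprt = h->xprt;

	if (xprt == NULL)
		return -1;	/* a second drop is detected, not repeated */
	h->xprt = NULL;
	xprt->refcount--;
	return 0;
}
```

A second toy_detach on the same holder returns -1 and leaves the count alone, so a double drop through the same pointer cannot pass silently in this model.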

I just noticed some slightly odd code later in svc_recv:

	if (XPT_LISTENER && !XPT_CLOSE) {
		...
	} else if (!XPT_CLOSE) {
		...
		->xpo_recvfrom()
	}
	if (XPT_CLOSE) {
		...
		svc_delete_xprt()
	}

So if XPT_CLOSE is set while xpo_recvfrom is being called, which I think
is possible, and if ->xpo_recvfrom returns non-zero, then we end up
processing a request on a dead socket, which doesn't sound like the right
thing to do. I don't think it can cause the present problem, but
it looks wrong. That last 'if' should just be an 'else'.
I guess that would effectively reverse b0401d7253, though - not that
that patch seems entirely right to me - if there is a problem I probably
would have fixed it differently, though I'm not sure how.
So maybe change "if (XPT_CLOSE)" to "if (len <= 0 && XPT_CLOSE)" ???

NeilBrown
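
The two candidate conditions for the last 'if' can be compared with a small truth-table style sketch. It is a model under stated assumptions, not the svc_recv code: the toy_* helpers are hypothetical, close_set stands for XPT_CLOSE being set by the time the final test runs, and len is the value returned by ->xpo_recvfrom().

```c
#include <stdbool.h>

/*
 * Would the tail of the receive path delete the transport?
 * guard_on_len selects the suggested "len <= 0 && XPT_CLOSE" test;
 * otherwise the plain "XPT_CLOSE" test is modelled.
 */
bool toy_deletes_xprt(int len, bool close_set, bool guard_on_len)
{
	if (guard_on_len)
		return len <= 0 && close_set;
	return close_set;
}

/*
 * True when a request was received (len > 0) and would go on to be
 * processed even though the transport was deleted first.
 */
bool toy_request_on_dead_xprt(int len, bool close_set, bool guard_on_len)
{
	return len > 0 && toy_deletes_xprt(len, close_set, guard_on_len);
}
```

With the plain test, a positive len combined with a late XPT_CLOSE gives a request on a deleted transport; with the guarded test that case does not delete, and the transport is left for a later pass to close.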


2010-02-27 02:38:25

by Tom Tucker

Subject: Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put

Neil Brown wrote:
> On Fri, 26 Feb 2010 18:40:58 -0600
> Tom Tucker <[email protected]> wrote:
>
>
>> J. Bruce Fields wrote:
>>
>>> On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
>>>
>>>
>>>> [I found this while looking for the current refcount problem
>>>> that triggers a warning in svc_recv. This isn't that bug
>>>> but is a different refcount bug - NB]
>>>>
>>>>
>>>
>>>
>> I seem to recall that we added that reference for a reason. There was
>> an issue with unmount while there were deferrals pending. That's why the
>> reference was added.
>>
>> Tom
>>
>
> What reference?
> What I (thought I) found was code that was dropping a reference which it
> didn't hold. Are you saying that it is supposed to be holding a reference
> here, but isn't, or that it really is holding a reference here and I didn't
> see it?
>

Here's the commit that I was thinking of...
22945e4a1c7454c97f5d8aee1ef526c83fef3223

I think this change adds the bug that you are now fixing. It fixed one
problem, but added another that you have now resolved.

What do you guys think?

Thanks,
Tom
> And just for completeness, my understanding of the refcounting here is:
>
> A counted reference is held on an svc_xprt when:
> - a 'struct svc_rqst' refers to it through ->rq_xprt
> - a 'cache_deferred_req' refers to it through ->xprt
> This only happens while the req is waiting to be
> revisited, and is in the hash table and on the lru.
> Once the req gets revisited (svc_revisit) ->xprt
> is set to NULL and the reference is dropped.
> - XPT_DEAD is *not* set. So the refcount is initialised
> to '1' to reflect this, and this ref is dropped
> when we set XPT_DEAD.
> - there are a few transient references in svc_xprt.c
> which very clearly have matched 'get' and 'put'.
> - svc_find_xprt returns a counted reference. This is
> called once in lockd and once in nfsd, and both
> calls drop the ref correctly.
>
> Whenever we drop a counted ref that was stored in a pointer, we set that
> pointer to NULL.
> So if there was a race where two threads both get a reference from a pointer
> and then drop that reference, you would expect that slightly different timing
> would cause one of those threads to get a NULL from the pointer, dereference
> it, and crash. There are no important tests-for-NULL on either of the
> pointers in question, so that wouldn't be protecting us from a crash. But
> we don't see that crash, so there cannot be a race there.
>
> So: The refcount cannot possibly be zero in svc_recv :-)
>
> I just noticed some slightly odd code later in svc_recv:
>
> 	if (XPT_LISTENER && !XPT_CLOSE) {
> 		...
> 	} else if (!XPT_CLOSE) {
> 		...
> 		->xpo_recvfrom()
> 	}
> 	if (XPT_CLOSE) {
> 		...
> 		svc_delete_xprt()
> 	}
>
> So if XPT_CLOSE is set while xpo_recvfrom is being called, which I think
> is possible, and if ->xpo_recvfrom returns non-zero, then we end up
> processing a request on a dead socket, which doesn't sound like the right
> thing to do. I don't think it can cause the present problem, but
> it looks wrong. That last 'if' should just be an 'else'.
> I guess that would effectively reverse b0401d7253, though - not that
> that patch seems entirely right to me - if there is a problem I probably
> would have fixed it differently, though I'm not sure how.
> So maybe change "if (XPT_CLOSE)" to "if (len <= 0 && XPT_CLOSE)" ???
>
> NeilBrown
>


2010-03-01 14:43:33

by J. Bruce Fields

Subject: Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put

On Mon, Mar 01, 2010 at 03:23:10PM +1100, Neil Brown wrote:
> On Fri, 26 Feb 2010 20:38:25 -0600
> Tom Tucker <[email protected]> wrote:
>
> > Neil Brown wrote:
> > > On Fri, 26 Feb 2010 18:40:58 -0600
> > > Tom Tucker <[email protected]> wrote:
> > >
> > >
> > >> J. Bruce Fields wrote:
> > >>
> > >>> On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
> > >>>
> > >>>
> > >>>> [I found this while looking for the current refcount problem
> > >>>> that triggers a warning in svc_recv. This isn't that bug
> > >>>> but is a different refcount bug - NB]
> > >>>>
> > >>>>
> > >>>
> > >>>
> > >> I seem to recall that we added that reference for a reason. There was
> > >> an issue with unmount while there were deferrals pending. That's why the
> > >> reference was added.
> > >>
> > >> Tom
> > >>
> > >
> > > What reference?
> > > What I (thought I) found was code that was dropping a reference which it
> > > didn't hold. Are you saying that it is supposed to be holding a reference
> > > here, but isn't, or that it really is holding a reference here and I didn't
> > > see it?
> > >
> >
> > Here's the commit that I was thinking of...
> > 22945e4a1c7454c97f5d8aee1ef526c83fef3223
> >
> > I think this change adds the bug that you are now fixing. It fixed one
> > problem, but added another that you have now resolved.
> >
> > What do you guys think?
>
> Yes, I see what you are saying.
>
> I agree that commit did fix a problem, but inadvertently introduced a new one.

Agreed. So it looks to me like there's nothing additional here to fix.
(Correct me if I'm overlooking something.)

--b.

2010-03-01 04:23:19

by NeilBrown

Subject: Re: [PATCH] sunrpc: remove unnecessary svc_xprt_put

On Fri, 26 Feb 2010 20:38:25 -0600
Tom Tucker <[email protected]> wrote:

> Neil Brown wrote:
> > On Fri, 26 Feb 2010 18:40:58 -0600
> > Tom Tucker <[email protected]> wrote:
> >
> >
> >> J. Bruce Fields wrote:
> >>
> >>> On Sat, Feb 27, 2010 at 09:33:40AM +1100, Neil Brown wrote:
> >>>
> >>>
> >>>> [I found this while looking for the current refcount problem
> >>>> that triggers a warning in svc_recv. This isn't that bug
> >>>> but is a different refcount bug - NB]
> >>>>
> >>>>
> >>>
> >>>
> >> I seem to recall that we added that reference for a reason. There was
> >> an issue with unmount while there were deferrals pending. That's why the
> >> reference was added.
> >>
> >> Tom
> >>
> >
> > What reference?
> > What I (thought I) found was code that was dropping a reference which it
> > didn't hold. Are you saying that it is supposed to be holding a reference
> > here, but isn't, or that it really is holding a reference here and I didn't
> > see it?
> >
>
> Here's the commit that I was thinking of...
> 22945e4a1c7454c97f5d8aee1ef526c83fef3223
>
> I think this change adds the bug that you are now fixing. It fixed one
> problem, but added another that you have now resolved.
>
> What do you guys think?

Yes, I see what you are saying.

I agree that commit did fix a problem, but inadvertently introduced a new one.

Thanks,
NeilBrown


>
> Thanks,
> Tom
> > And just for completeness, my understanding of the refcounting here is:
> >
> > A counted reference is held on an svc_xprt when:
> > - a 'struct svc_rqst' refers to it through ->rq_xprt
> > - a 'cache_deferred_req' refers to it through ->xprt
> > This only happens while the req is waiting to be
> > revisited, and is in the hash table and on the lru.
> > Once the req gets revisited (svc_revisit) ->xprt
> > is set to NULL and the reference is dropped.
> > - XPT_DEAD is *not* set. So the refcount is initialised
> > to '1' to reflect this, and this ref is dropped
> > when we set XPT_DEAD.
> > - there are a few transient references in svc_xprt.c
> > which very clearly have matched 'get' and 'put'.
> > - svc_find_xprt returns a counted reference. This is
> > called once in lockd and once in nfsd, and both
> > calls drop the ref correctly.
> >
> > Whenever we drop a counted ref that was stored in a pointer, we set that
> > pointer to NULL.
> > So if there was a race where two threads both get a reference from a pointer
> > and then drop that reference, you would expect that slightly different timing
> > would cause one of those threads to get a NULL from the pointer, dereference
> > it, and crash. There are no important tests-for-NULL on either of the
> > pointers in question, so that wouldn't be protecting us from a crash. But
> > we don't see that crash, so there cannot be a race there.
> >
> > So: The refcount cannot possibly be zero in svc_recv :-)
> >
> > I just noticed some slightly odd code later in svc_recv:
> >
> > 	if (XPT_LISTENER && !XPT_CLOSE) {
> > 		...
> > 	} else if (!XPT_CLOSE) {
> > 		...
> > 		->xpo_recvfrom()
> > 	}
> > 	if (XPT_CLOSE) {
> > 		...
> > 		svc_delete_xprt()
> > 	}
> >
> > So if XPT_CLOSE is set while xpo_recvfrom is being called, which I think
> > is possible, and if ->xpo_recvfrom returns non-zero, then we end up
> > processing a request on a dead socket, which doesn't sound like the right
> > thing to do. I don't think it can cause the present problem, but
> > it looks wrong. That last 'if' should just be an 'else'.
> > I guess that would effectively reverse b0401d7253, though - not that
> > that patch seems entirely right to me - if there is a problem I probably
> > would have fixed it differently, though I'm not sure how.
> > So maybe change "if (XPT_CLOSE)" to "if (len <= 0 && XPT_CLOSE)" ???
> >
> > NeilBrown
> >
>