2016-11-12 11:08:52

by Benjamin Coddington

Subject: CLOSE/OPEN race

I've been seeing the following on a modified version of generic/089
that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.

1. Client has open stateid A, sends a CLOSE
2. Client sends OPEN with same owner
3. Client sends another OPEN with same owner
4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B sequence 2)
5. Client does LOCK,LOCKU,FREE_STATEID from B.2
6. Client gets a reply to CLOSE in 1
7. Client gets reply to OPEN in 2, stateid is B.1
8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop

The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that the OPEN
response in 7 is able to update the open_stateid even though it has a lower
sequence number.

I think this case could be handled by never updating the open_stateid if the
stateids match but the sequence number of the new state is less than the
current open_state.
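By "less than" I mean the usual wraparound-safe seqid comparison. A minimal
sketch of the compare (this is effectively what nfs4_stateid_is_newer() in
fs/nfs/nfs4proc.c already computes on the seqid field):

static bool seqid_is_newer(u32 s1, u32 s2)
{
	/* true if s1 is strictly newer than s2, allowing for wraparound */
	return (s32)(s1 - s2) > 0;
}

Under that rule the OPEN reply in 7 would be refused: B.1 matches B.2 in
the "other" field, but seqid 1 is not newer than seqid 2.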

Any thoughts?

Ben


2016-11-12 12:54:14

by Jeff Layton

Subject: Re: CLOSE/OPEN race

On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
> I've been seeing the following on a modified version of generic/089
> that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.
>
> 1. Client has open stateid A, sends a CLOSE
> 2. Client sends OPEN with same owner
> 3. Client sends another OPEN with same owner
> 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B sequence 2)
> 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
> 6. Client gets a reply to CLOSE in 1
> 7. Client gets reply to OPEN in 2, stateid is B.1
> 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
>
> The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that the OPEN
> response in 7 is able to update the open_stateid even though it has a lower
> sequence number.
>
> I think this case could be handled by never updating the open_stateid if the
> stateids match but the sequence number of the new state is less than the
> current open_state.
>

What kernel is this on?

Yes, that seems wrong. The client should be picking B.2 for the open
stateid to use. I think that decision of whether to take a seqid is made
in nfs_need_update_open_stateid. The logic in there looks correct to me
at first glance though.

--
Jeff Layton <[email protected]>

2016-11-12 15:31:47

by Benjamin Coddington

Subject: Re: CLOSE/OPEN race

On 12 Nov 2016, at 7:54, Jeff Layton wrote:

> On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
>> I've been seeing the following on a modified version of generic/089
>> that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.
>>
>> 1. Client has open stateid A, sends a CLOSE
>> 2. Client sends OPEN with same owner
>> 3. Client sends another OPEN with same owner
>> 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B
>> sequence 2)
>> 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
>> 6. Client gets a reply to CLOSE in 1
>> 7. Client gets reply to OPEN in 2, stateid is B.1
>> 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
>>
>> The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that
>> the OPEN
>> response in 7 is able to update the open_stateid even though it has a
>> lower
>> sequence number.
>>
>> I think this case could be handled by never updating the open_stateid
>> if the
>> stateids match but the sequence number of the new state is less than
>> the
>> current open_state.
>>
>
> What kernel is this on?

On v4.9-rc2 with a couple fixups. Without them, I can't test long enough to
reproduce this race. I don't think any of those are involved in this
problem, though.

> Yes, that seems wrong. The client should be picking B.2 for the open
> stateid to use. I think that decision of whether to take a seqid is
> made
> in nfs_need_update_open_stateid. The logic in there looks correct to
> me
> at first glance though.

nfs_need_update_open_stateid() will return true if NFS_OPEN_STATE is unset.
That's the precondition set up by steps 1-6. Perhaps it should not update
the stateid if they match but the sequence number is less, and still set
NFS_OPEN_STATE once more. That will fix _this_ case. Are there other cases
where that would be a problem?
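Something like this, roughly (an untested sketch of just the decision, not
the actual v4.9 body of nfs_need_update_open_stateid()):

static bool nfs_need_update_open_stateid(struct nfs4_state *state,
		nfs4_stateid *stateid)
{
	if (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
	    !nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
		/* An OPEN reply that lost a race: keep the stateid we
		 * already have, but re-set NFS_OPEN_STATE in case the
		 * CLOSE reply in 6 cleared it. */
		set_bit(NFS_OPEN_STATE, &state->flags);
		return false;
	}
	return true;
}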

Ben

2016-11-12 16:52:45

by Jeff Layton

Subject: Re: CLOSE/OPEN race

On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
> On 12 Nov 2016, at 7:54, Jeff Layton wrote:
>
> >
> > On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
> > >
> > > I've been seeing the following on a modified version of generic/089
> > > that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.
> > >
> > > 1. Client has open stateid A, sends a CLOSE
> > > 2. Client sends OPEN with same owner
> > > 3. Client sends another OPEN with same owner
> > > 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B
> > > sequence 2)
> > > 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
> > > 6. Client gets a reply to CLOSE in 1
> > > 7. Client gets reply to OPEN in 2, stateid is B.1
> > > 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
> > >
> > > The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that
> > > the OPEN
> > > response in 7 is able to update the open_stateid even though it has a
> > > lower
> > > sequence number.
> > >
> > > I think this case could be handled by never updating the open_stateid
> > > if the
> > > stateids match but the sequence number of the new state is less than
> > > the
> > > current open_state.
> > >
> >
> > What kernel is this on?
>
> On v4.9-rc2 with a couple fixups. Without them, I can't test long
> enough to
> reproduce this race. I don't think any of those are involved in this
> problem, though.
>
> >
> > Yes, that seems wrong. The client should be picking B.2 for the open
> > stateid to use. I think that decision of whether to take a seqid is
> > made
> > in nfs_need_update_open_stateid. The logic in there looks correct to
> > me
> > at first glance though.
>
> nfs_need_update_open_stateid() will return true if NFS_OPEN_STATE is
> unset.
> That's the precondition set up by steps 1-6. Perhaps it should not
> update
> the stateid if they match but the sequence number is less, and still set
> NFS_OPEN_STATE once more. That will fix _this_ case. Are there other
> cases
> where that would be a problem?
>
> Ben

That seems wrong. The only close was sent in step 1, and that was for a
completely different stateid (A rather than B). It seems likely that
that is where the bug is.

--
Jeff Layton <[email protected]>

2016-11-12 18:03:05

by Benjamin Coddington

Subject: Re: CLOSE/OPEN race



On 12 Nov 2016, at 11:52, Jeff Layton wrote:

> On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
>> On 12 Nov 2016, at 7:54, Jeff Layton wrote:
>>
>>>
>>> On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
>>>>
>>>> I've been seeing the following on a modified version of generic/089
>>>> that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.
>>>>
>>>> 1. Client has open stateid A, sends a CLOSE
>>>> 2. Client sends OPEN with same owner
>>>> 3. Client sends another OPEN with same owner
>>>> 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B
>>>> sequence 2)
>>>> 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
>>>> 6. Client gets a reply to CLOSE in 1
>>>> 7. Client gets reply to OPEN in 2, stateid is B.1
>>>> 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
>>>>
>>>> The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that
>>>> the OPEN
>>>> response in 7 is able to update the open_stateid even though it has a
>>>> lower
>>>> sequence number.
>>>>
>>>> I think this case could be handled by never updating the open_stateid
>>>> if the
>>>> stateids match but the sequence number of the new state is less than
>>>> the
>>>> current open_state.
>>>>
>>>
>>> What kernel is this on?
>>
>> On v4.9-rc2 with a couple fixups. Without them, I can't test long
>> enough to
>> reproduce this race. I don't think any of those are involved in this
>> problem, though.
>>
>>>
>>> Yes, that seems wrong. The client should be picking B.2 for the open
>>> stateid to use. I think that decision of whether to take a seqid is
>>> made
>>> in nfs_need_update_open_stateid. The logic in there looks correct to
>>> me
>>> at first glance though.
>>
>> nfs_need_update_open_stateid() will return true if NFS_OPEN_STATE is
>> unset.
>> That's the precondition set up by steps 1-6. Perhaps it should not
>> update
>> the stateid if they match but the sequence number is less, and still set
>> NFS_OPEN_STATE once more. That will fix _this_ case. Are there other
>> cases
>> where that would be a problem?
>>
>> Ben
>
> That seems wrong.

I'm not sure what you mean: what seems wrong?

> The only close was sent in step 1, and that was for a
> completely different stateid (A rather than B). It seems likely that
> that is where the bug is.

I'm still not sure what point you're trying to make..

Even though the close was sent in step 1, the response wasn't processed
until step 6..

2016-11-12 18:16:32

by Trond Myklebust

Subject: Re: CLOSE/OPEN race

> On Nov 12, 2016, at 06:08, Benjamin Coddington <[email protected]> wrote:
>
> I've been seeing the following on a modified version of generic/089
> that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.
>
> 1. Client has open stateid A, sends a CLOSE
> 2. Client sends OPEN with same owner
> 3. Client sends another OPEN with same owner
> 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B sequence 2)
> 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
> 6. Client gets a reply to CLOSE in 1
> 7. Client gets reply to OPEN in 2, stateid is B.1
> 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
>
> The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that the OPEN
> response in 7 is able to update the open_stateid even though it has a lower
> sequence number.

Hmm… We probably should not do that if the stateid.other field of A (i.e. the
one supplied as the argument to CLOSE) does not match the stateid.other of B.
In fact, the reply in (4), where the stateid changes to B, should be the
thing that resets the OPEN state.

2016-11-12 18:47:02

by Benjamin Coddington

Subject: Re: CLOSE/OPEN race

On 12 Nov 2016, at 13:16, Trond Myklebust wrote:

>
>> On Nov 12, 2016, at 06:08, Benjamin Coddington <[email protected]>
>> wrote:
>>
>> I've been seeing the following on a modified version of generic/089
>> that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.
>>
>> 1. Client has open stateid A, sends a CLOSE
>> 2. Client sends OPEN with same owner
>> 3. Client sends another OPEN with same owner
>> 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B
>> sequence 2)
>> 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
>> 6. Client gets a reply to CLOSE in 1
>> 7. Client gets reply to OPEN in 2, stateid is B.1
>> 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
>>
>> The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that
>> the OPEN
>> response in 7 is able to update the open_stateid even though it has a
>> lower
>> sequence number.
>
> Hmm… We probably should not do that if the stateid.other field of A
> (i.e.
> the one supplied as the argument to CLOSE) does not match the
> stateid.other of B. In fact, the reply in (4), where the stateid
> changes
> to B, should be the thing that resets the OPEN state.

At 4, by reset the OPEN state, do you mean we should expect and wait for a
pending close to complete, or we should start over with a brand new state?
Maybe something else..

The reason the NFS_OPEN_STATE flag is cleared in 6 is that all the state
counts (n_rdwr, n_rdonly, n_wronly) are zero. There could be a "pending
open" flag or counter that would help 6 not clear NFS_OPEN_STATE..

2016-11-12 21:16:00

by Jeff Layton

Subject: Re: CLOSE/OPEN race

On Sat, 2016-11-12 at 13:03 -0500, Benjamin Coddington wrote:
>
> On 12 Nov 2016, at 11:52, Jeff Layton wrote:
>
> >
> > On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
> > >
> > > On 12 Nov 2016, at 7:54, Jeff Layton wrote:
> > >
> > > >
> > > >
> > > > On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
> > > > >
> > > > >
> > > > > I've been seeing the following on a modified version of generic/089
> > > > > that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.
> > > > >
> > > > > 1. Client has open stateid A, sends a CLOSE
> > > > > 2. Client sends OPEN with same owner
> > > > > 3. Client sends another OPEN with same owner
> > > > > 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B
> > > > > sequence 2)
> > > > > 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
> > > > > 6. Client gets a reply to CLOSE in 1
> > > > > 7. Client gets reply to OPEN in 2, stateid is B.1
> > > > > 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
> > > > >
> > > > > The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that
> > > > > the OPEN
> > > > > response in 7 is able to update the open_stateid even though it has a
> > > > > lower
> > > > > sequence number.
> > > > >
> > > > > I think this case could be handled by never updating the open_stateid
> > > > > if the
> > > > > stateids match but the sequence number of the new state is less than
> > > > > the
> > > > > current open_state.
> > > > >
> > > >
> > > > What kernel is this on?
> > >
> > > On v4.9-rc2 with a couple fixups. Without them, I can't test long
> > > enough to
> > > reproduce this race. I don't think any of those are involved in this
> > > problem, though.
> > >
> > > >
> > > >
> > > > Yes, that seems wrong. The client should be picking B.2 for the open
> > > > stateid to use. I think that decision of whether to take a seqid is
> > > > made
> > > > in nfs_need_update_open_stateid. The logic in there looks correct to
> > > > me
> > > > at first glance though.
> > >
> > > nfs_need_update_open_stateid() will return true if NFS_OPEN_STATE is
> > > unset.
> > > That's the precondition set up by steps 1-6. Perhaps it should not
> > > update
> > > the stateid if they match but the sequence number is less, and still set
> > > NFS_OPEN_STATE once more. That will fix _this_ case. Are there other
> > > cases
> > > where that would be a problem?
> > >
> > > Ben
> >
> > That seems wrong.
>
> I'm not sure what you mean: what seems wrong?
>

Sorry, it seems wrong that the client would issue the LOCK with B.1
there.

> >
> > The only close was sent in step 1, and that was for a
> > completely different stateid (A rather than B). It seems likely that
> > that is where the bug is.
>
> I'm still not sure what point you're trying to make..
>
> Even though the close was sent in step 1, the response wasn't processed
> until step 6..

Not really a point per se, I was just saying where I think the bug might
be...

When you issue a CLOSE, you issue it vs. a particular stateid (stateid
"A" in this case). Once the open stateid has been superseded by "B", the
closing of "A" should have no effect.

Perhaps nfs_clear_open_stateid needs to check and see whether the open
stateid has been superseded before doing its thing?
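Something like this helper, maybe (an untested sketch; the name is made up,
and arg_stateid would be the stateid the CLOSE was actually sent with):

static bool nfs_open_stateid_superseded(const struct nfs4_state *state,
		const nfs4_stateid *arg_stateid)
{
	/* If a new OPEN has installed a stateid with a different "other"
	 * field since the CLOSE was sent, the CLOSE reply should leave
	 * the open state alone. */
	return !nfs4_stateid_match_other(arg_stateid, &state->open_stateid);
}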

--
Jeff Layton <[email protected]>

2016-11-13 02:56:10

by Jeff Layton

Subject: Re: CLOSE/OPEN race

On Sat, 2016-11-12 at 16:16 -0500, Jeff Layton wrote:
> On Sat, 2016-11-12 at 13:03 -0500, Benjamin Coddington wrote:
> >
> > On 12 Nov 2016, at 11:52, Jeff Layton wrote:
> >
> > >
> > > On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
> > > >
> > > > On 12 Nov 2016, at 7:54, Jeff Layton wrote:
> > > >
> > > > >
> > > > >
> > > > > On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
> > > > > >
> > > > > >
> > > > > > I've been seeing the following on a modified version of generic/089
> > > > > > that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.
> > > > > >
> > > > > > 1. Client has open stateid A, sends a CLOSE
> > > > > > 2. Client sends OPEN with same owner
> > > > > > 3. Client sends another OPEN with same owner
> > > > > > 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B
> > > > > > sequence 2)
> > > > > > 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
> > > > > > 6. Client gets a reply to CLOSE in 1
> > > > > > 7. Client gets reply to OPEN in 2, stateid is B.1
> > > > > > 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
> > > > > >
> > > > > > The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that
> > > > > > the OPEN
> > > > > > response in 7 is able to update the open_stateid even though it has a
> > > > > > lower
> > > > > > sequence number.
> > > > > >
> > > > > > I think this case could be handled by never updating the open_stateid
> > > > > > if the
> > > > > > stateids match but the sequence number of the new state is less than
> > > > > > the
> > > > > > current open_state.
> > > > > >
> > > > >
> > > > > What kernel is this on?
> > > >
> > > > On v4.9-rc2 with a couple fixups. Without them, I can't test long
> > > > enough to
> > > > reproduce this race. I don't think any of those are involved in this
> > > > problem, though.
> > > >
> > > > >
> > > > >
> > > > > Yes, that seems wrong. The client should be picking B.2 for the open
> > > > > stateid to use. I think that decision of whether to take a seqid is
> > > > > made
> > > > > in nfs_need_update_open_stateid. The logic in there looks correct to
> > > > > me
> > > > > at first glance though.
> > > >
> > > > nfs_need_update_open_stateid() will return true if NFS_OPEN_STATE is
> > > > unset.
> > > > That's the precondition set up by steps 1-6. Perhaps it should not
> > > > update
> > > > the stateid if they match but the sequence number is less, and still set
> > > > NFS_OPEN_STATE once more. That will fix _this_ case. Are there other
> > > > cases
> > > > where that would be a problem?
> > > >
> > > > Ben
> > >
> > > That seems wrong.
> >
> > I'm not sure what you mean: what seems wrong?
> >
>
> Sorry, it seems wrong that the client would issue the LOCK with B.1
> there.
>
> > >
> > > The only close was sent in step 1, and that was for a
> > > completely different stateid (A rather than B). It seems likely that
> > > that is where the bug is.
> >
> > I'm still not sure what point you're trying to make..
> >
> > Even though the close was sent in step 1, the response wasn't processed
> > until step 6..
>
> Not really a point per-se, I was just saying where I think the bug might
> be...
>
> When you issue a CLOSE, you issue it vs. a particular stateid (stateid
> "A" in this case). Once the open stateid has been superseded by "B", the
> closing of "A" should have no effect.
>
> Perhaps nfs_clear_open_stateid needs to check and see whether the open
> stateid has been superseded before doing its thing?
>

Ok, I see something that might be a problem in this call in
nfs4_close_done:

       nfs_clear_open_stateid(state, &calldata->arg.stateid,
                        res_stateid, calldata->arg.fmode);

Note that we pass two nfs4_stateids to this call. The first is the
stateid that got sent in the CLOSE call, and the second is the stateid
that came back in the CLOSE response.

RFC5661 and RFC7530 both indicate that the stateid in a CLOSE response
should be ignored.

So, I think a patch like this may be in order. As to whether it will
fix this bug, I sort of doubt it, but it might not hurt to test it out?

----------------------8<--------------------------

[RFC PATCH] nfs: properly ignore the stateid in a CLOSE response

Signed-off-by: Jeff Layton
---
 fs/nfs/nfs4proc.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 7897826d7c51..58413bd0aae2 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1451,7 +1451,6 @@ static void nfs_resync_open_stateid_locked(struct nfs4_state *state)
 }
 
 static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
-		nfs4_stateid *arg_stateid,
 		nfs4_stateid *stateid, fmode_t fmode)
 {
 	clear_bit(NFS_O_RDWR_STATE, &state->flags);
@@ -1467,12 +1466,8 @@ static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
 		clear_bit(NFS_O_WRONLY_STATE, &state->flags);
 		clear_bit(NFS_OPEN_STATE, &state->flags);
 	}
-	if (stateid == NULL)
-		return;
 	/* Handle races with OPEN */
-	if (!nfs4_stateid_match_other(arg_stateid, &state->open_stateid) ||
-	    (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
-	     !nfs4_stateid_is_newer(stateid, &state->open_stateid))) {
+	if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {
 		nfs_resync_open_stateid_locked(state);
 		return;
 	}
@@ -1482,11 +1477,10 @@ static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
 }
 
 static void nfs_clear_open_stateid(struct nfs4_state *state,
-		nfs4_stateid *arg_stateid,
 		nfs4_stateid *stateid, fmode_t fmode)
 {
 	write_seqlock(&state->seqlock);
-	nfs_clear_open_stateid_locked(state, arg_stateid, stateid, fmode);
+	nfs_clear_open_stateid_locked(state, stateid, fmode);
 	write_sequnlock(&state->seqlock);
 	if (test_bit(NFS_STATE_RECLAIM_NOGRACE, &state->flags))
 		nfs4_schedule_state_manager(state->owner->so_server->nfs_client);
@@ -3042,7 +3036,6 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
 	struct nfs4_closedata *calldata = data;
 	struct nfs4_state *state = calldata->state;
 	struct nfs_server *server = NFS_SERVER(calldata->inode);
-	nfs4_stateid *res_stateid = NULL;
 
 	dprintk("%s: begin!\n", __func__);
 	if (!nfs4_sequence_done(task, &calldata->res.seq_res))
@@ -3053,7 +3046,6 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
 	 */
 	switch (task->tk_status) {
 	case 0:
-		res_stateid = &calldata->res.stateid;
 		if (calldata->roc)
 			pnfs_roc_set_barrier(state->inode,
 					calldata->roc_barrier);
@@ -3081,7 +3073,7 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
 		}
 	}
 	nfs_clear_open_stateid(state, &calldata->arg.stateid,
-			res_stateid, calldata->arg.fmode);
+			calldata->arg.fmode);
 out_release:
 	nfs_release_seqid(calldata->arg.seqid);
 	nfs_refresh_inode(calldata->inode, calldata->res.fattr);
--
2.9.3

2016-11-13 03:09:21

by Jeff Layton

Subject: Re: CLOSE/OPEN race

On Sat, 2016-11-12 at 18:16 +0000, Trond Myklebust wrote:
> > On Nov 12, 2016, at 06:08, Benjamin Coddington <[email protected]> wrote:
> >
> > I've been seeing the following on a modified version of generic/089
> > that gets the client stuck sending LOCK with NFS4ERR_OLD_STATEID.
> >
> > 1. Client has open stateid A, sends a CLOSE
> > 2. Client sends OPEN with same owner
> > 3. Client sends another OPEN with same owner
> > 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B sequence 2)
> > 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
> > 6. Client gets a reply to CLOSE in 1
> > 7. Client gets reply to OPEN in 2, stateid is B.1
> > 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
> >
> > The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so that the OPEN
> > response in 7 is able to update the open_stateid even though it has a lower
> > sequence number.
>
> Hmm… We probably should not do that if the stateid.other field of A (i.e. the one supplied as the argument to CLOSE) does not match the stateid.other of B.
> In fact, the reply in (4), where the stateid changes to B, should be the thing that resets the OPEN state.

It looks like that's already the case in nfs_clear_open_stateid_locked,
though I don't think we ought to be doing anything with the stateid in
the CLOSE response. I sent a draft patch in another part of this
thread, but I don't quite see how that would cause this problem.
--
Jeff Layton <[email protected]>

2016-11-13 13:34:14

by Benjamin Coddington

Subject: Re: CLOSE/OPEN race



On 12 Nov 2016, at 21:56, Jeff Layton wrote:

> On Sat, 2016-11-12 at 16:16 -0500, Jeff Layton wrote:
>> On Sat, 2016-11-12 at 13:03 -0500, Benjamin Coddington wrote:
>>>
>>> On 12 Nov 2016, at 11:52, Jeff Layton wrote:
>>>
>>>>
>>>> On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
>>>>>
>>>>> On 12 Nov 2016, at 7:54, Jeff Layton wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
>>>>>>>
>>>>>>>
>>>>>>> I've been seeing the following on a modified version of
>>>>>>> generic/089
>>>>>>> that gets the client stuck sending LOCK with
>>>>>>> NFS4ERR_OLD_STATEID.
>>>>>>>
>>>>>>> 1. Client has open stateid A, sends a CLOSE
>>>>>>> 2. Client sends OPEN with same owner
>>>>>>> 3. Client sends another OPEN with same owner
>>>>>>> 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B
>>>>>>> sequence 2)
>>>>>>> 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
>>>>>>> 6. Client gets a reply to CLOSE in 1
>>>>>>> 7. Client gets reply to OPEN in 2, stateid is B.1
>>>>>>> 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
>>>>>>>
>>>>>>> The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so
>>>>>>> that
>>>>>>> the OPEN
>>>>>>> response in 7 is able to update the open_stateid even though it
>>>>>>> has a
>>>>>>> lower
>>>>>>> sequence number.
>>>>>>>
>>>>>>> I think this case could be handled by never updating the
>>>>>>> open_stateid
>>>>>>> if the
>>>>>>> stateids match but the sequence number of the new state is less
>>>>>>> than
>>>>>>> the
>>>>>>> current open_state.
>>>>>>>
>>>>>>
>>>>>> What kernel is this on?
>>>>>
>>>>> On v4.9-rc2 with a couple fixups. Without them, I can't test long
>>>>> enough to
>>>>> reproduce this race. I don't think any of those are involved in
>>>>> this
>>>>> problem, though.
>>>>>
>>>>>>
>>>>>>
>>>>>> Yes, that seems wrong. The client should be picking B.2 for the
>>>>>> open
>>>>>> stateid to use. I think that decision of whether to take a seqid
>>>>>> is
>>>>>> made
>>>>>> in nfs_need_update_open_stateid. The logic in there looks
>>>>>> correct to
>>>>>> me
>>>>>> at first glance though.
>>>>>
>>>>> nfs_need_update_open_stateid() will return true if NFS_OPEN_STATE
>>>>> is
>>>>> unset.
>>>>> That's the precondition set up by steps 1-6. Perhaps it should
>>>>> not
>>>>> update
>>>>> the stateid if they match but the sequence number is less, and
>>>>> still set
>>>>> NFS_OPEN_STATE once more. That will fix _this_ case. Are there
>>>>> other
>>>>> cases
>>>>> where that would be a problem?
>>>>>
>>>>> Ben
>>>>
>>>> That seems wrong.
>>>
>>> I'm not sure what you mean: what seems wrong?
>>>
>>
>> Sorry, it seems wrong that the client would issue the LOCK with B.1
>> there.
>>
>>>>
>>>> The only close was sent in step 1, and that was for a
>>>> completely different stateid (A rather than B). It seems likely
>>>> that
>>>> that is where the bug is.
>>>
>>> I'm still not sure what point you're trying to make..
>>>
>>> Even though the close was sent in step 1, the response wasn't
>>> processed
>>> until step 6..
>>
>> Not really a point per-se, I was just saying where I think the bug
>> might
>> be...
>>
>> When you issue a CLOSE, you issue it vs. a particular stateid
>> (stateid
>> "A" in this case). Once the open stateid has been superseded by "B",
>> the
>> closing of "A" should have no effect.
>>
>> Perhaps nfs_clear_open_stateid needs to check and see whether the
>> open
>> stateid has been superseded before doing its thing?
>>
>
> Ok, I see something that might be a problem in this call in
> nfs4_close_done:
>
>        nfs_clear_open_stateid(state, &calldata->arg.stateid,
>                         res_stateid,
> calldata->arg.fmode);
>
> Note that we pass two nfs4_stateids to this call. The first is the
> stateid that got sent in the CLOSE call, and the second is the stateid
> that came back in the CLOSE response.
>
> RFC5661 and RFC7530 both indicate that the stateid in a CLOSE response
> should be ignored.
>
> So, I think a patch like this may be in order. As to whether it will
> fix this bug, I sort of doubt it, but it might not hurt to test it
> out?

Doesn't this undo the fix in 4a1e2feb9d24 ("NFSv4.1: Fix a protocol issue
with CLOSE stateids")?

I don't think it will help with the race either, since I think the issue is
that the race above unsets NFS_OPEN_STATE, then that is reason enough to
overwrite the existing open_state, even if it is going to overwrite with an
older stateid. What close does with its stateid is unimportant to that
problem.

In the interest of science, I'll kick off a test with this patch. It should
take a few hours to reproduce, but I'm also fairly busy today, so I'll
report back this evening most likely. Thanks for the patch.

Ben

> ----------------------8<--------------------------
>
> [RFC PATCH] nfs: properly ignore the stateid in a CLOSE response
>
> Signed-off-by: Jeff Layton
> ---
>  fs/nfs/nfs4proc.c | 14 +++-----------
>  1 file changed, 3 insertions(+), 11 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 7897826d7c51..58413bd0aae2 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -1451,7 +1451,6 @@ static void nfs_resync_open_stateid_locked(struct nfs4_state *state)
>  }
> 
>  static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
> -		nfs4_stateid *arg_stateid,
>  		nfs4_stateid *stateid, fmode_t fmode)
>  {
>  	clear_bit(NFS_O_RDWR_STATE, &state->flags);
> @@ -1467,12 +1466,8 @@ static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
>  		clear_bit(NFS_O_WRONLY_STATE, &state->flags);
>  		clear_bit(NFS_OPEN_STATE, &state->flags);
>  	}
> -	if (stateid == NULL)
> -		return;
>  	/* Handle races with OPEN */
> -	if (!nfs4_stateid_match_other(arg_stateid, &state->open_stateid) ||
> -	    (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
> -	     !nfs4_stateid_is_newer(stateid, &state->open_stateid))) {
> +	if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {
>  		nfs_resync_open_stateid_locked(state);
>  		return;
>  	}
> @@ -1482,11 +1477,10 @@ static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
>  }
> 
>  static void nfs_clear_open_stateid(struct nfs4_state *state,
> -		nfs4_stateid *arg_stateid,
>  		nfs4_stateid *stateid, fmode_t fmode)
>  {
>  	write_seqlock(&state->seqlock);
> -	nfs_clear_open_stateid_locked(state, arg_stateid, stateid, fmode);
> +	nfs_clear_open_stateid_locked(state, stateid, fmode);
>  	write_sequnlock(&state->seqlock);
>  	if (test_bit(NFS_STATE_RECLAIM_NOGRACE, &state->flags))
>  		nfs4_schedule_state_manager(state->owner->so_server->nfs_client);
> @@ -3042,7 +3036,6 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
>  	struct nfs4_closedata *calldata = data;
>  	struct nfs4_state *state = calldata->state;
>  	struct nfs_server *server = NFS_SERVER(calldata->inode);
> -	nfs4_stateid *res_stateid = NULL;
> 
>  	dprintk("%s: begin!\n", __func__);
>  	if (!nfs4_sequence_done(task, &calldata->res.seq_res))
> @@ -3053,7 +3046,6 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
>  	 */
>  	switch (task->tk_status) {
>  	case 0:
> -		res_stateid = &calldata->res.stateid;
>  		if (calldata->roc)
>  			pnfs_roc_set_barrier(state->inode,
>  					calldata->roc_barrier);
> @@ -3081,7 +3073,7 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
>  		}
>  	}
>  	nfs_clear_open_stateid(state, &calldata->arg.stateid,
> -			res_stateid, calldata->arg.fmode);
> +			calldata->arg.fmode);
>  out_release:
>  	nfs_release_seqid(calldata->arg.seqid);
>  	nfs_refresh_inode(calldata->inode, calldata->res.fattr);
> --
> 2.9.3
>

2016-11-13 14:22:42

by Benjamin Coddington

Subject: Re: CLOSE/OPEN race

On 13 Nov 2016, at 8:34, Benjamin Coddington wrote:
>
> On 12 Nov 2016, at 21:56, Jeff Layton wrote:
>
>> On Sat, 2016-11-12 at 16:16 -0500, Jeff Layton wrote:
>>> On Sat, 2016-11-12 at 13:03 -0500, Benjamin Coddington wrote:
>>>>
>>>> On 12 Nov 2016, at 11:52, Jeff Layton wrote:
>>>>
>>>>>
>>>>> On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
>>>>>>
>>>>>> On 12 Nov 2016, at 7:54, Jeff Layton wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> I've been seeing the following on a modified version of
>>>>>>>> generic/089
>>>>>>>> that gets the client stuck sending LOCK with
>>>>>>>> NFS4ERR_OLD_STATEID.
>>>>>>>>
>>>>>>>> 1. Client has open stateid A, sends a CLOSE
>>>>>>>> 2. Client sends OPEN with same owner
>>>>>>>> 3. Client sends another OPEN with same owner
>>>>>>>> 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B
>>>>>>>> sequence 2)
>>>>>>>> 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
>>>>>>>> 6. Client gets a reply to CLOSE in 1
>>>>>>>> 7. Client gets reply to OPEN in 2, stateid is B.1
>>>>>>>> 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a
>>>>>>>> loop
>>>>>>>>
>>>>>>>> The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so
>>>>>>>> that
>>>>>>>> the OPEN
>>>>>>>> response in 7 is able to update the open_stateid even though it
>>>>>>>> has a
>>>>>>>> lower
>>>>>>>> sequence number.
>>>>>>>>
>>>>>>>> I think this case could be handled by never updating the
>>>>>>>> open_stateid
>>>>>>>> if the
>>>>>>>> stateids match but the sequence number of the new state is less
>>>>>>>> than
>>>>>>>> the
>>>>>>>> current open_state.
>>>>>>>>
>>>>>>>
>>>>>>> What kernel is this on?
>>>>>>
>>>>>> On v4.9-rc2 with a couple fixups. Without them, I can't test
>>>>>> long
>>>>>> enough to
>>>>>> reproduce this race. I don't think any of those are involved in
>>>>>> this
>>>>>> problem, though.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yes, that seems wrong. The client should be picking B.2 for the
>>>>>>> open
>>>>>>> stateid to use. I think that decision of whether to take a seqid
>>>>>>> is
>>>>>>> made
>>>>>>> in nfs_need_update_open_stateid. The logic in there looks
>>>>>>> correct to
>>>>>>> me
>>>>>>> at first glance though.
>>>>>>
>>>>>> nfs_need_update_open_stateid() will return true if NFS_OPEN_STATE
>>>>>> is
>>>>>> unset.
>>>>>> That's the precondition set up by steps 1-6. Perhaps it should
>>>>>> not
>>>>>> update
>>>>>> the stateid if they match but the sequence number is less, and
>>>>>> still set
>>>>>> NFS_OPEN_STATE once more. That will fix _this_ case. Are there
>>>>>> other
>>>>>> cases
>>>>>> where that would be a problem?
>>>>>>
>>>>>> Ben
>>>>>
>>>>> That seems wrong.
>>>>
>>>> I'm not sure what you mean: what seems wrong?
>>>>
>>>
>>> Sorry, it seems wrong that the client would issue the LOCK with B.1
>>> there.
>>>
>>>>>
>>>>> The only close was sent in step 1, and that was for a
>>>>> completely different stateid (A rather than B). It seems likely
>>>>> that
>>>>> that is where the bug is.
>>>>
>>>> I'm still not sure what point you're trying to make..
>>>>
>>>> Even though the close was sent in step 1, the response wasn't
>>>> processed
>>>> until step 6..
>>>
>>> Not really a point per-se, I was just saying where I think the bug
>>> might
>>> be...
>>>
>>> When you issue a CLOSE, you issue it vs. a particular stateid
>>> (stateid
>>> "A" in this case). Once the open stateid has been superseded by "B",
>>> the
>>> closing of "A" should have no effect.
>>>
>>> Perhaps nfs_clear_open_stateid needs to check and see whether the
>>> open
>>> stateid has been superseded before doing its thing?
>>>
>>
>> Ok, I see something that might be a problem in this call in
>> nfs4_close_done:
>>
>>        nfs_clear_open_stateid(state, &calldata->arg.stateid,
>>                         res_stateid,
>> calldata->arg.fmode);
>>
>> Note that we pass two nfs4_stateids to this call. The first is the
>> stateid that got sent in the CLOSE call, and the second is the
>> stateid
>> that came back in the CLOSE response.
>>
>> RFC5661 and RFC7530 both indicate that the stateid in a CLOSE
>> response
>> should be ignored.
>>
>> So, I think a patch like this may be in order. As to whether it will
>> fix this bug, I sort of doubt it, but it might not hurt to test it
>> out?
>
> Doesn't this undo the fix in 4a1e2feb9d24 ("NFSv4.1: Fix a protocol
> issue
> with CLOSE stateids")?
>
> I don't think it will help with the race either, since I think the
> issue is
> that the race above unsets NFS_OPEN_STATE, then that is reason enough
> to
> overwrite the existing open_state, even if it is going to overwrite
> with an
> older stateid. What close does with its stateid is unimportant to
> that
> problem.
>
> In the interest of science, I'll kick off a test with this patch. It
> should
> take a few hours to reproduce, but I'm also fairly busy today, so I'll
> report back this evening most likely. Thanks for the patch.

I got lucky and it reproduced in about 10 minutes with this patch.

Ben


2016-11-13 14:33:31

by Jeff Layton

Subject: Re: CLOSE/OPEN race

On Sun, 2016-11-13 at 08:34 -0500, Benjamin Coddington wrote:
>
> On 12 Nov 2016, at 21:56, Jeff Layton wrote:
>
> > On Sat, 2016-11-12 at 16:16 -0500, Jeff Layton wrote:
> > > On Sat, 2016-11-12 at 13:03 -0500, Benjamin Coddington wrote:
> > > >
> > > > On 12 Nov 2016, at 11:52, Jeff Layton wrote:
> > > >
> > > > >
> > > > > On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
> > > > > >
> > > > > > On 12 Nov 2016, at 7:54, Jeff Layton wrote:
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > I've been seeing the following on a modified version of 
> > > > > > > > generic/089
> > > > > > > > that gets the client stuck sending LOCK with 
> > > > > > > > NFS4ERR_OLD_STATEID.
> > > > > > > >
> > > > > > > > 1. Client has open stateid A, sends a CLOSE
> > > > > > > > 2. Client sends OPEN with same owner
> > > > > > > > 3. Client sends another OPEN with same owner
> > > > > > > > 4. Client gets a reply to OPEN in 3, stateid is B.2 (stateid B
> > > > > > > > sequence 2)
> > > > > > > > 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
> > > > > > > > 6. Client gets a reply to CLOSE in 1
> > > > > > > > 7. Client gets reply to OPEN in 2, stateid is B.1
> > > > > > > > 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in a loop
> > > > > > > >
> > > > > > > > The CLOSE response in 6 causes us to clear NFS_OPEN_STATE, so 
> > > > > > > > that
> > > > > > > > the OPEN
> > > > > > > > response in 7 is able to update the open_stateid even though it 
> > > > > > > > has a
> > > > > > > > lower
> > > > > > > > sequence number.
> > > > > > > >
> > > > > > > > I think this case could be handled by never updating the 
> > > > > > > > open_stateid
> > > > > > > > if the
> > > > > > > > stateids match but the sequence number of the new state is less 
> > > > > > > > than
> > > > > > > > the
> > > > > > > > current open_state.
> > > > > > > >
> > > > > > >
> > > > > > > What kernel is this on?
> > > > > >
> > > > > > On v4.9-rc2 with a couple fixups. Without them, I can't test long
> > > > > > enough to
> > > > > > reproduce this race. I don't think any of those are involved in 
> > > > > > this
> > > > > > problem, though.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Yes, that seems wrong. The client should be picking B.2 for the 
> > > > > > > open
> > > > > > > stateid to use. I think that decision of whether to take a seqid 
> > > > > > > is
> > > > > > > made
> > > > > > > in nfs_need_update_open_stateid. The logic in there looks 
> > > > > > > correct to
> > > > > > > me
> > > > > > > at first glance though.
> > > > > >
> > > > > > nfs_need_update_open_stateid() will return true if NFS_OPEN_STATE 
> > > > > > is
> > > > > > unset.
> > > > > > That's the precondition set up by steps 1-6. Perhaps it should 
> > > > > > not
> > > > > > update
> > > > > > the stateid if they match but the sequence number is less, and 
> > > > > > still set
> > > > > > NFS_OPEN_STATE once more. That will fix _this_ case. Are there 
> > > > > > other
> > > > > > cases
> > > > > > where that would be a problem?
> > > > > >
> > > > > > Ben
> > > > >
> > > > > That seems wrong.
> > > >
> > > > I'm not sure what you mean: what seems wrong?
> > > >
> > >
> > > Sorry, it seems wrong that the client would issue the LOCK with B.1
> > > there.
> > >
> > > > >
> > > > > The only close was sent in step 1, and that was for a
> > > > > completely different stateid (A rather than B). It seems likely 
> > > > > that
> > > > > that is where the bug is.
> > > >
> > > > I'm still not sure what point you're trying to make..
> > > >
> > > > Even though the close was sent in step 1, the response wasn't 
> > > > processed
> > > > until step 6..
> > >
> > > Not really a point per-se, I was just saying where I think the bug 
> > > might
> > > be...
> > >
> > > When you issue a CLOSE, you issue it vs. a particular stateid 
> > > (stateid
> > > "A" in this case). Once the open stateid has been superseded by "B", 
> > > the
> > > closing of "A" should have no effect.
> > >
> > > Perhaps nfs_clear_open_stateid needs to check and see whether the 
> > > open
> > > stateid has been superseded before doing its thing?
> > >
> >
> > Ok, I see something that might be a problem in this call in
> > nfs4_close_done:
> >
> >        nfs_clear_open_stateid(state, &calldata->arg.stateid,
> >                         res_stateid, 
> > calldata->arg.fmode);
> >
> > Note that we pass two nfs4_stateids to this call. The first is the
> > stateid that got sent in the CLOSE call, and the second is the stateid
> > that came back in the CLOSE response.
> >
> > RFC5661 and RFC7530 both indicate that the stateid in a CLOSE response
> > should be ignored.
> >
> > So, I think a patch like this may be in order. As to whether it will
> > fix this bug, I sort of doubt it, but it might not hurt to test it 
> > out?
>
> Doesn't this undo the fix in 4a1e2feb9d24 ("NFSv4.1: Fix a protocol 
> issue
> with CLOSE stateids")?
>

I don't think so. The (updated) specs are pretty clear that the stateid
in a CLOSE response is not to be trusted in any way. I think we're
better off just relying on the stateid that was sent in the request.

On a related note, we probably need to fix knfsd not to send anything in
there as well, though we'll need to think about older clients when we do.

Thanks for testing the patch. I'm not too surprised this didn't help
here though. It didn't really change the logic appreciably, unless you
have a server sending weird stateids in the CLOSE response.

> I don't think it will help with the race either, since I think the issue 
> is
> that the race above unsets NFS_OPEN_STATE, then that is reason enough to
> overwrite the existing open_state, even if it is going to overwrite with 
> an
> older stateid. What close does with its stateid is unimportant to that
> problem.
>
> In the interest of science, I'll kick off a test with this patch. It 
> should
> take a few hours to reproduce, but I'm also fairly busy today, so I'll
> report back this evening most likely. Thanks for the patch.
>
> Ben
>
> > ----------------------8<--------------------------
> >
> > [RFC PATCH] nfs: properly ignore the stateid in a CLOSE response
> >
> > Signed-off-by: Jeff Layton
> > ---
> >  fs/nfs/nfs4proc.c | 14 +++-----------
> >  1 file changed, 3 insertions(+), 11 deletions(-)
> >
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index 7897826d7c51..58413bd0aae2 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -1451,7 +1451,6 @@ static void nfs_resync_open_stateid_locked(struct nfs4_state *state)
> >  }
> >
> >  static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
> > -		nfs4_stateid *arg_stateid,
> >  		nfs4_stateid *stateid, fmode_t fmode)
> >  {
> >  	clear_bit(NFS_O_RDWR_STATE, &state->flags);
> > @@ -1467,12 +1466,8 @@ static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
> >  		clear_bit(NFS_O_WRONLY_STATE, &state->flags);
> >  		clear_bit(NFS_OPEN_STATE, &state->flags);
> >  	}
> > -	if (stateid == NULL)
> > -		return;
> >  	/* Handle races with OPEN */
> > -	if (!nfs4_stateid_match_other(arg_stateid, &state->open_stateid) ||
> > -	    (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
> > -	     !nfs4_stateid_is_newer(stateid, &state->open_stateid))) {
> > +	if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {
> >  		nfs_resync_open_stateid_locked(state);
> >  		return;
> >  	}
> > @@ -1482,11 +1477,10 @@ static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
> >  }
> >
> >  static void nfs_clear_open_stateid(struct nfs4_state *state,
> > -		nfs4_stateid *arg_stateid,
> >  		nfs4_stateid *stateid, fmode_t fmode)
> >  {
> >  	write_seqlock(&state->seqlock);
> > -	nfs_clear_open_stateid_locked(state, arg_stateid, stateid, fmode);
> > +	nfs_clear_open_stateid_locked(state, stateid, fmode);
> >  	write_sequnlock(&state->seqlock);
> >  	if (test_bit(NFS_STATE_RECLAIM_NOGRACE, &state->flags))
> >  		nfs4_schedule_state_manager(state->owner->so_server->nfs_client);
> > @@ -3042,7 +3036,6 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
> >  	struct nfs4_closedata *calldata = data;
> >  	struct nfs4_state *state = calldata->state;
> >  	struct nfs_server *server = NFS_SERVER(calldata->inode);
> > -	nfs4_stateid *res_stateid = NULL;
> >
> >  	dprintk("%s: begin!\n", __func__);
> >  	if (!nfs4_sequence_done(task, &calldata->res.seq_res))
> > @@ -3053,7 +3046,6 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
> >  	 */
> >  	switch (task->tk_status) {
> >  	case 0:
> > -		res_stateid = &calldata->res.stateid;
> >  		if (calldata->roc)
> >  			pnfs_roc_set_barrier(state->inode,
> >  					calldata->roc_barrier);
> > @@ -3081,7 +3073,7 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
> >  		}
> >  	}
> >  	nfs_clear_open_stateid(state, &calldata->arg.stateid,
> > -			res_stateid, calldata->arg.fmode);
> > +			calldata->arg.fmode);
> >  out_release:
> >  	nfs_release_seqid(calldata->arg.seqid);
> >  	nfs_refresh_inode(calldata->inode, calldata->res.fattr);
> > --
> > 2.9.3
> >

--
Jeff Layton <[email protected]>

2016-11-13 14:47:52

by Trond Myklebust

Subject: Re: CLOSE/OPEN race

On Sat, 2016-11-12 at 21:56 -0500, Jeff Layton wrote:
> Ok, I see something that might be a problem in this call in
> nfs4_close_done:
>
>        nfs_clear_open_stateid(state, &calldata->arg.stateid,
>                         res_stateid, calldata->arg.fmode);
>
> Note that we pass two nfs4_stateids to this call. The first is the
> stateid that got sent in the CLOSE call, and the second is the stateid
> that came back in the CLOSE response.
>
> RFC5661 and RFC7530 both indicate that the stateid in a CLOSE response
> should be ignored.
>
> So, I think a patch like this may be in order. As to whether it will
> fix this bug, I sort of doubt it, but it might not hurt to test it out?
>
[...]
> 	/* Handle races with OPEN */
> -	if (!nfs4_stateid_match_other(arg_stateid, &state->open_stateid) ||
> -	    (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
> -	     !nfs4_stateid_is_newer(stateid, &state->open_stateid))) {
> +	if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {

No. I think what we should be doing here is

1) if (!nfs4_stateid_match_other(arg_stateid, &state->open_stateid)) then
just ignore the result and return immediately, because it applies to a
completely different stateid.

2) if (nfs4_stateid_match_other(stateid, &state->open_stateid)
&& !nfs4_stateid_is_newer(stateid, &state->open_stateid)) then resync,
because this was likely an OPEN_DOWNGRADE that has raced with one or
more OPEN calls.

Note that the reason why we've been careful about this previously is
because RFC3530 did not enforce the "seqid:other" definition of
stateids. Now that RFC7530 section 9.1.4.2 has strengthened the
definition of stateids for NFSv4.0, we can assume the above holds for
all existing minor versions of NFSv4.

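Concretely, 1) and 2) would amount to something like this at the top of
nfs_clear_open_stateid_locked() (an untested sketch; the helper name is made
up here, arg_stateid is the stateid the CLOSE/OPEN_DOWNGRADE was sent with,
and stateid is the one returned in the reply):

static bool nfs_close_applies_locked(struct nfs4_state *state,
		const nfs4_stateid *arg_stateid,
		const nfs4_stateid *stateid)
{
	/* 1) CLOSE of a stateid we no longer hold: ignore it entirely */
	if (!nfs4_stateid_match_other(arg_stateid, &state->open_stateid))
		return false;
	/* 2) same stateid but not newer: likely an OPEN_DOWNGRADE reply
	 * that raced with one or more OPENs, so resync with the server */
	if (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
	    !nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
		nfs_resync_open_stateid_locked(state);
		return false;
	}
	return true;
}
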
2016-11-14 14:54:19

by Benjamin Coddington

Subject: Re: CLOSE/OPEN race


On 13 Nov 2016, at 9:47, Trond Myklebust wrote:

> On Sat, 2016-11-12 at 21:56 -0500, Jeff Layton wrote:
>> On Sat, 2016-11-12 at 16:16 -0500, Jeff Layton wrote:
>>>
>>> On Sat, 2016-11-12 at 13:03 -0500, Benjamin Coddington wrote:
>>>>
>>>>
>>>> On 12 Nov 2016, at 11:52, Jeff Layton wrote:
>>>>
>>>>>
>>>>>
>>>>> On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
>>>>>>
>>>>>>
>>>>>> On 12 Nov 2016, at 7:54, Jeff Layton wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I've been seeing the following on a modified version of
>>>>>>>> generic/089
>>>>>>>> that gets the client stuck sending LOCK with
>>>>>>>> NFS4ERR_OLD_STATEID.
>>>>>>>>
>>>>>>>> 1. Client has open stateid A, sends a CLOSE
>>>>>>>> 2. Client sends OPEN with same owner
>>>>>>>> 3. Client sends another OPEN with same owner
>>>>>>>> 4. Client gets a reply to OPEN in 3, stateid is B.2
>>>>>>>> (stateid B
>>>>>>>> sequence 2)
>>>>>>>> 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
>>>>>>>> 6. Client gets a reply to CLOSE in 1
>>>>>>>> 7. Client gets reply to OPEN in 2, stateid is B.1
>>>>>>>> 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in
>>>>>>>> a loop
>>>>>>>>
>>>>>>>> The CLOSE response in 6 causes us to clear
>>>>>>>> NFS_OPEN_STATE, so that
>>>>>>>> the OPEN
>>>>>>>> response in 7 is able to update the open_stateid even
>>>>>>>> though it has a
>>>>>>>> lower
>>>>>>>> sequence number.
>>>>>>>>
>>>>>>>> I think this case could be handled by never updating the
>>>>>>>> open_stateid
>>>>>>>> if the
>>>>>>>> stateids match but the sequence number of the new state
>>>>>>>> is less than
>>>>>>>> the
>>>>>>>> current open_state.
>>>>>>>>
>>>>>>>
>>>>>>> What kernel is this on?
>>>>>>
>>>>>> On v4.9-rc2 with a couple fixups.  Without them, I can't test
>>>>>> long
>>>>>> enough to
>>>>>> reproduce this race.  I don't think any of those are involved
>>>>>> in this
>>>>>> problem, though.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yes, that seems wrong. The client should be picking B.2 for
>>>>>>> the open
>>>>>>> stateid to use. I think that decision of whether to take a
>>>>>>> seqid is
>>>>>>> made
>>>>>>> in nfs_need_update_open_stateid. The logic in there looks
>>>>>>> correct to
>>>>>>> me
>>>>>>> at first glance though.
>>>>>>
>>>>>> nfs_need_update_open_stateid() will return true if
>>>>>> NFS_OPEN_STATE is
>>>>>> unset.
>>>>>> That's the precondition set up by steps 1-6.  Perhaps it
>>>>>> should not
>>>>>> update
>>>>>> the stateid if they match but the sequence number is less,
>>>>>> and still set
>>>>>> NFS_OPEN_STATE once more.  That will fix _this_ case.  Are
>>>>>> there other
>>>>>> cases
>>>>>> where that would be a problem?
>>>>>>
>>>>>> Ben
>>>>>
>>>>> That seems wrong.
>>>>
>>>> I'm not sure what you mean: what seems wrong?
>>>>
>>>
>>> Sorry, it seems wrong that the client would issue the LOCK with B.1
>>> there.
>>>
>>>>
>>>>>
>>>>>
>>>>> The only close was sent in step 1, and that was for a
>>>>> completely different stateid (A rather than B). It seems likely
>>>>> that
>>>>> that is where the bug is.
>>>>
>>>> I'm still not sure what point you're trying to make..
>>>>
>>>> Even though the close was sent in step 1, the response wasn't
>>>> processed
>>>> until step 6..
>>>
>>> Not really a point per se, I was just saying where I think the bug
>>> might be...
>>>
>>> When you issue a CLOSE, you issue it vs. a particular stateid
>>> (stateid
>>> "A" in this case). Once the open stateid has been superseded by
>>> "B", the
>>> closing of "A" should have no effect.
>>>
>>> Perhaps nfs_clear_open_stateid needs to check and see whether the
>>> open
>>> stateid has been superseded before doing its thing?
>>>
>>
>> Ok, I see something that might be a problem in this call in
>> nfs4_close_done:
>>
>>        nfs_clear_open_stateid(state, &calldata->arg.stateid,
>>                               res_stateid, calldata->arg.fmode);
>>
>> Note that we pass two nfs4_stateids to this call. The first is the
>> stateid that got sent in the CLOSE call, and the second is the
>> stateid
>> that came back in the CLOSE response.
>>
>> RFC5661 and RFC7530 both indicate that the stateid in a CLOSE
>> response
>> should be ignored.
>>
>> So, I think a patch like this may be in order. As to whether it will
>> fix this bug, I sort of doubt it, but it might not hurt to test it
>> out?
>>
>> ----------------------8<--------------------------
>>
>> [RFC PATCH] nfs: properly ignore the stateid in a CLOSE response
>>
>> Signed-off-by: Jeff Layton 
>> ---
>>  fs/nfs/nfs4proc.c | 14 +++-----------
>>  1 file changed, 3 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 7897826d7c51..58413bd0aae2 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -1451,7 +1451,6 @@ static void nfs_resync_open_stateid_locked(struct nfs4_state *state)
>>  }
>>  
>>  static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
>> -		nfs4_stateid *arg_stateid,
>>  		nfs4_stateid *stateid, fmode_t fmode)
>>  {
>>  	clear_bit(NFS_O_RDWR_STATE, &state->flags);
>> @@ -1467,12 +1466,8 @@ static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
>>  		clear_bit(NFS_O_WRONLY_STATE, &state->flags);
>>  		clear_bit(NFS_OPEN_STATE, &state->flags);
>>  	}
>> -	if (stateid == NULL)
>> -		return;
>>  	/* Handle races with OPEN */
>> -	if (!nfs4_stateid_match_other(arg_stateid, &state->open_stateid) ||
>> -	    (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
>> -	    !nfs4_stateid_is_newer(stateid, &state->open_stateid))) {
>> +	if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {
>
> No. I think what we should be doing here is
>
> 1) if (nfs4_stateid_match_other(arg_stateid, &state->open_stateid)
> then

You must mean (!nfs4_stateid_match_other(arg_stateid,
&state->open_stateid)

> just ignore the result and return immediately, because it applies to a
> completely different stateid.
>
> 2) if (nfs4_stateid_match_other(stateid, &state->open_stateid)
> && !nfs4_stateid_is_newer(stateid, &state->open_stateid))) then
> resync,
> because this was likely an OPEN_DOWNGRADE that has raced with one or
> more OPEN calls.

OK, but these do not help the originally reported race because at the
time that the CLOSE response handling does a resync - I presume
nfs_resync_open_stateid_locked() - all the state counters are zero,
which bypasses resetting NFS_OPEN_STATE, which, if unset, allows any
stateid to update the open_stateid.
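
The early return in question would be in a helper shaped roughly like
this (a sketch reconstructed from the descriptions above, so the exact
body is an assumption):

	static void nfs_resync_open_stateid_locked(struct nfs4_state *state)
	{
		/* After the last CLOSE all of the open counters are
		 * zero, so we return before NFS_OPEN_STATE is set
		 * again - the bypass described above. */
		if (!(state->n_rdonly || state->n_wronly || state->n_rdwr))
			return;
		if (state->n_rdonly)
			set_bit(NFS_O_RDONLY_STATE, &state->flags);
		if (state->n_wronly)
			set_bit(NFS_O_WRONLY_STATE, &state->flags);
		if (state->n_rdwr)
			set_bit(NFS_O_RDWR_STATE, &state->flags);
		set_bit(NFS_OPEN_STATE, &state->flags);
	}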

I need something like:

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index a230764b7d07..c9aa166c45aa 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1425,8 +1425,13 @@ static void nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
 static bool nfs_need_update_open_stateid(struct nfs4_state *state,
 		const nfs4_stateid *stateid, nfs4_stateid *freeme)
 {
-	if (test_and_set_bit(NFS_OPEN_STATE, &state->flags) == 0)
+	if (test_and_set_bit(NFS_OPEN_STATE, &state->flags) == 0) {
+		if (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
+		    !nfs4_stateid_is_newer(stateid, &state->open_stateid))
+			return false;
 		return true;
+	}
+
 	if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {
 		nfs4_stateid_copy(freeme, &state->open_stateid);
 		nfs_test_and_clear_all_open_stateid(state);
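
If I read that right, replaying the originally reported sequence: the
CLOSE reply in 6 clears NFS_OPEN_STATE; the OPEN reply in 7 then takes
the first branch, finds that B.1 matches the "other" part of the stored
open_stateid but is not newer than B.2, and returns false, while
test_and_set_bit() has already set NFS_OPEN_STATE again, so the stale
update is skipped and the next LOCK goes out with B.2.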

I've got that in testing, and I'll let it run for a day or so.

Another way to simplify CLOSE/OPEN races might be to never reuse an
openowner if we send a close.  Any opens after a close would have to
use a new mutated openowner, so the state always proceeds from open to
close and never back again.  Any reason not to do that?

Ben

2016-11-14 16:29:39

by Trond Myklebust

[permalink] [raw]
Subject: Re: CLOSE/OPEN race

On Mon, 2016-11-14 at 09:53 -0500, Benjamin Coddington wrote:
> On 13 Nov 2016, at 9:47, Trond Myklebust wrote:
> 
[...]
> > No. I think what we should be doing here is
> > 
> > 1) if (nfs4_stateid_match_other(arg_stateid, &state->open_stateid)
> > then
> 
> You must mean (!nfs4_stateid_match_other(arg_stateid,
> &state->open_stateid)

Sorry. Yes.

> > just ignore the result and return immediately, because it applies
> > to a completely different stateid.
> > 
> > 2) if (nfs4_stateid_match_other(stateid, &state->open_stateid)
> > && !nfs4_stateid_is_newer(stateid, &state->open_stateid))) then
> > resync, because this was likely an OPEN_DOWNGRADE that has raced
> > with one or more OPEN calls.
> 
> OK, but these do not help the originally reported race because at the
> time that the CLOSE response handling does a resync - I presume
> nfs_resync_open_stateid_locked() - all the state counters are zero,
> which bypasses resetting NFS_OPEN_STATE, which, if unset, allows any
> stateid to update the open_stateid.

Please see the 2 patches I just posted. They should fix the problem.

2016-11-14 18:40:28

by Benjamin Coddington

[permalink] [raw]
Subject: Re: CLOSE/OPEN race

On 14 Nov 2016, at 11:29, Trond Myklebust wrote:
>
> Please see the 2 patches I just posted. They should fix the
> problem.

Seems like they do.  Thanks!  I'll leave them spinning in that test
overnight to really give them a workout.

Ben