2017-10-16 08:49:59

by Stanislav Kinsburskiy

[permalink] [raw]
Subject: [RFC} NFS client issue with exclusive creation when server died in the middle

Hi,

We discovered an issue with NFSv4.0 mount.
The server (it was Ganesha) crashed, or was killed by the OOM killer, during an exclusive open-create operation, after the file had actually been created but before any response was sent to the client.
The server was restarted (with a grace period), and the client's next attempt to create the file, once the server was ready, failed with EEXIST.
This is probably because, for each open request, the client allocates a new opendata and puts a fresh jiffies value into the verifier (in nfs4_opendata_alloc) for that request.
Does this sound like a client issue?

Thanks in advance,
Stanislav Kinsburskii




2017-10-16 14:57:53

by Trond Myklebust

[permalink] [raw]
Subject: Re: [RFC} NFS client issue with exclusive creation when server died in the middle

On Mon, 2017-10-16 at 10:49 +0200, Stanislav Kinsburskiy wrote:
> Hi,
>
> We discovered an issue with NFSv4.0 mount.
> Server has crashed (or killed by OOM; it was Ganesha) on exclusive
> open-create operation after file was actually created, but no
> response was send to the client.
> Server was restarted (with grace period), and next clients attempt to
> create a file after server is ready fails with EEXIST.
> This is, probably, because for each open request client creates
> opendata and puts new jiffies value as the verifier (in
> nfs4_opendata_alloc) in the request.
> Does it sound like a client issue?
>

Hi Stanislav,

If it didn't receive a reply, the client should be reusing the same
opendata for the resend of the operation after state recover is
complete. Doesn't it?

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com


2017-10-16 15:18:42

by Stanislav Kinsburskiy

[permalink] [raw]
Subject: Re: [RFC} NFS client issue with exclusive creation when server died in the middle



16.10.2017 16:57, Trond Myklebust wrote:
> On Mon, 2017-10-16 at 10:49 +0200, Stanislav Kinsburskiy wrote:
>> Hi,
>>
>> We discovered an issue with NFSv4.0 mount.
>> Server has crashed (or killed by OOM; it was Ganesha) on exclusive
>> open-create operation after file was actually created, but no
>> response was send to the client.
>> Server was restarted (with grace period), and next clients attempt to
>> create a file after server is ready fails with EEXIST.
>> This is, probably, because for each open request client creates
>> opendata and puts new jiffies value as the verifier (in
>> nfs4_opendata_alloc) in the request.
>> Does it sound like a client issue?
>>
>
> Hi Stanislav,
>
> If it didn't receive a reply, the client should be reusing the same
> opendata for the resend of the operation after state recover is
> complete. Doesn't it?
>

Hi Trond,

Well, yes, it should. But it looks like it doesn't, unfortunately. At least, that's what we saw on the server side.
It was a git clone operation, and the server crashed somewhere in the middle.
Then the server was migrated (we have shared storage), but migration uses the grace logic, so it is equivalent to a server restart.
This is what we saw during the clone:

nfs server 192.168.56.201:/0200000000000003: resource temporarily unavailable (jukebox)
nfs server 192.168.56.201:/0200000000000003: resource temporarily unavailable (jukebox)
nfs server 192.168.56.201:/0200000000000003: resource temporarily unavailable (jukebox)
nfs server 192.168.56.201:/0200000000000003: resource temporarily unavailable (jukebox)
nfs server 192.168.56.201:/0200000000000003: resource available again

Then git failed with the following:

error: unable to create file src/test/cli/ceph-authtool/list-empty-bin.t (File exists)

because Ganesha received an OPEN operation with createmode4 = exclusive4 and with the verifier that is initialized by the NFS client:

verf[0] = jiffies;
verf[1] = current->pid;
memcpy(p->o_arg.u.verifier.data, verf, sizeof(p->o_arg.u.verifier.data));

And this verifier didn't match the one stored for the file that had already been created.
So our assumption is that the verifier has changed, because otherwise Ganesha would have returned 0.
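
To make the EEXIST outcome concrete, here is a minimal sketch in C of the exclusive-create check described above: the server keeps the verifier from the request that created the file, accepts a later request carrying the same verifier as a replay, and rejects one carrying a different verifier. All toy_* names are hypothetical; this is neither Ganesha nor kernel code, and packing {jiffies, pid} into 8 bytes only mirrors the client snippet quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_VERF_SIZE 8 /* an NFSv4 verifier is 8 opaque bytes */

enum toy_status { TOY_OK = 0, TOY_EXIST = 1 };

/* Hypothetical server-side record: does the file exist, and which
 * verifier did the request that created it carry? */
struct toy_file {
    int exists;
    unsigned char verf[TOY_VERF_SIZE];
};

/* Build a verifier the way the quoted client code does: { jiffies, pid }. */
static void toy_make_verf(unsigned char out[TOY_VERF_SIZE],
                          uint32_t jiffies_val, uint32_t pid)
{
    uint32_t v[2] = { jiffies_val, pid };

    memcpy(out, v, TOY_VERF_SIZE);
}

/* Exclusive create: the first request creates the file and stores the
 * verifier; a replay with the same verifier succeeds; any other
 * verifier means "someone else created it", so the result is EXIST. */
static enum toy_status toy_exclusive_create(struct toy_file *f,
                                            const unsigned char verf[TOY_VERF_SIZE])
{
    assert(f != NULL);
    if (!f->exists) {
        f->exists = 1;
        memcpy(f->verf, verf, TOY_VERF_SIZE);
        return TOY_OK;
    }
    if (memcmp(f->verf, verf, TOY_VERF_SIZE) == 0)
        return TOY_OK;
    return TOY_EXIST;
}
```

In this model, replaying the create with the original verifier returns TOY_OK, while a retry built from a later jiffies value returns TOY_EXIST, which matches the "File exists" error git reported.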

Below is my understanding (hopefully correct):
1) the client started an open/create request in nfs4_do_open (where there is a loop around _nfs4_do_open)
2) it received a non-fatal error from the server (NFS4ERR_BAD_STATEID or other)
3) it repeated the open/create operation, but with a new verifier, which is allocated in:

_nfs4_do_open
nfs4_opendata_alloc

whereas in such a case the client has to use the old verifier. But the old opendata had already been released when the open/create request failed.
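
The suspected difference between the two behaviours can be sketched as a small stand-alone C model. It assumes, as in the steps above, that each retry allocates a new opendata; the toy_* names, the pid value 42, and the jiffies increments are all hypothetical, and this is not the kernel's actual retry code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_VERF_SIZE 8

/* Hypothetical model of per-call open state: only the verifier matters here. */
struct toy_opendata {
    unsigned char verf[TOY_VERF_SIZE];
};

static uint32_t toy_jiffies = 1000; /* stand-in for the kernel tick counter */

/* Mirrors the quoted snippet: verifier = { jiffies, pid }. */
static void toy_opendata_init(struct toy_opendata *p, uint32_t pid)
{
    uint32_t verf[2] = { toy_jiffies, pid };

    assert(p != NULL);
    memcpy(p->verf, verf, sizeof(p->verf));
}

/* Run `attempts` open attempts and return 1 if the last attempt carries
 * the same verifier as the first one, 0 otherwise. */
static int toy_verifier_stable(int attempts, int reuse_opendata)
{
    struct toy_opendata first, cur;
    int i;

    toy_opendata_init(&first, 42);
    cur = first;
    for (i = 1; i < attempts; i++) {
        toy_jiffies += 250; /* time passes while the server recovers */
        if (!reuse_opendata)
            toy_opendata_init(&cur, 42); /* new opendata, new verifier */
        /* otherwise cur, and with it the original verifier, is kept */
    }
    return memcmp(first.verf, cur.verf, TOY_VERF_SIZE) == 0;
}
```

With reuse_opendata set, the verifier survives any number of retries; without it, the second attempt already differs because jiffies has advanced, which is the situation the server would answer with EEXIST.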

What do you think about it?