2012-05-22 16:10:19

by Andy Adamson

[permalink] [raw]
Subject: [PATCH 1/4] NFSv4.1 just use nfs_put_client in filelayout release

From: Andy Adamson <[email protected]>

Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/nfs4filelayout.c | 12 ++----------
1 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 474c630..8df91be 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -101,9 +101,6 @@ static void filelayout_reset_write(struct nfs_write_data *data)
&hdr->pages,
hdr->completion_ops);
}
- /* balance nfs_get_client in filelayout_write_pagelist */
- nfs_put_client(data->ds_clp);
- data->ds_clp = NULL;
}

static void filelayout_reset_read(struct nfs_read_data *data)
@@ -125,9 +122,6 @@ static void filelayout_reset_read(struct nfs_read_data *data)
&hdr->pages,
hdr->completion_ops);
}
- /* balance nfs_get_client in filelayout_read_pagelist */
- nfs_put_client(data->ds_clp);
- data->ds_clp = NULL;
}

static int filelayout_async_handle_error(struct rpc_task *task,
@@ -326,8 +320,7 @@ static void filelayout_read_release(void *data)
{
struct nfs_read_data *rdata = data;

- if (!test_bit(NFS_IOHDR_REDO, &rdata->header->flags))
- nfs_put_client(rdata->ds_clp);
+ nfs_put_client(rdata->ds_clp);
rdata->header->mds_ops->rpc_release(data);
}

@@ -424,8 +417,7 @@ static void filelayout_write_release(void *data)
{
struct nfs_write_data *wdata = data;

- if (!test_bit(NFS_IOHDR_REDO, &wdata->header->flags))
- nfs_put_client(wdata->ds_clp);
+ nfs_put_client(wdata->ds_clp);
wdata->header->mds_ops->rpc_release(data);
}

--
1.7.7.6



2012-05-22 18:07:11

by Andy Adamson

[permalink] [raw]
Subject: Re: [PATCH 4/4] NFSv4.1 do not release page on commit mismatch

On Tue, May 22, 2012 at 1:46 PM, Myklebust, Trond
<[email protected]> wrote:
> On Tue, 2012-05-22 at 16:30 +0000, Adamson, Andy wrote:
>> On May 22, 2012, at 12:24 PM, Myklebust, Trond wrote:
>>
>> > On Tue, 2012-05-22 at 08:09 -0400, [email protected] wrote:
>> >> From: Andy Adamson <[email protected]>
>> >>
>> >> Signed-off-by: Andy Adamson <[email protected]>
>> >> ---
>> >> fs/nfs/write.c | ? ?2 ++
>> >> 1 files changed, 2 insertions(+), 0 deletions(-)
>> >>
>> >> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
>> >> index e6fe3d6..c7295de 100644
>> >> --- a/fs/nfs/write.c
>> >> +++ b/fs/nfs/write.c
>> >> @@ -1555,6 +1555,8 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data)
>> >> ? ? ? ? ? ?/* We have a mismatch. Write the page again */
>> >> ? ? ? ? ? ?dprintk(" mismatch\n");
>> >> ? ? ? ? ? ?nfs_mark_request_dirty(req);
>> >> + ? ? ? ? ?nfs_unlock_request(req);
>> >> + ? ? ? ? ?continue;
>> >> ? ?next:
>> >> ? ? ? ? ? ?nfs_unlock_and_release_request(req);
>> >> ? ?}
>> >
>> > What is this patch trying to fix? As far as I can see it will lead to a
>> > reference leak.
>>
>> The release of the page drops the refcount to zero. Since it is a mismatch, we want to continue to use the page, so nfs_page_find_request_locked is called we get the WARNING in kref_get, and an Oops in various places in pnfs_update_layout.
>>
>> Here is the output of added printk's.
>>
>> -->Andy
>>
>> May 21 18:25:33 fedora-64-2 kernel: [ ?152.988921] NFS: ? ? ? commit (0:35/2 4096@184320)
>> May 21 18:25:33 fedora-64-2 kernel: [ ?152.988923] ?mismatch req ffff880042dbb800
>> May 21 18:25:33 fedora-64-2 kernel: [ ?152.988927] nfs_release_request WB req ffff880042dbb800 wb_kref 1
>> May 21 18:25:33 fedora-64-2 kernel: [ ?152.988932] put_lseg: lseg ffff880042d95380 ref 4 valid 0
>> May 21 18:25:33 fedora-64-2 kernel: [ ?152.988945] nfs_page_find_request_locked req ffff88003e033800
>> May 21 18:25:33 fedora-64-2 kernel: [ ?152.988949] nfs_page_find_request_locked WB req ffff88003e033800 wb_kref 0
>> May 21 18:25:33 fedora-64-2 kernel: [ ?152.988953] ------------[ cut here ]------------
>> May 21 18:25:33 fedora-64-2 kernel: [ ?152.989082] WARNING: at include/linux/kref.h:41 kref_get+0x20/0x2c [nfs]()
>
> Are these commit-to-ds requests?

Yes

-->Andy

> As far as I can see, there is a bug in
> transfer_commit_list() in that it locks the requests, but does not
> reference them.
>
> Fred?
>
> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> [email protected]
> http://www.netapp.com
>

2012-05-23 13:35:26

by Adamson, Andy

[permalink] [raw]
Subject: Re: [PATCH 4/4] NFSv4.1 do not release page on commit mismatch


On May 22, 2012, at 4:40 PM, Myklebust, Trond wrote:

> On Tue, 2012-05-22 at 14:07 -0400, Andy Adamson wrote:
>> On Tue, May 22, 2012 at 1:46 PM, Myklebust, Trond
>> <[email protected]> wrote:
>>> On Tue, 2012-05-22 at 16:30 +0000, Adamson, Andy wrote:
>>>> On May 22, 2012, at 12:24 PM, Myklebust, Trond wrote:
>>>>
>>>>> On Tue, 2012-05-22 at 08:09 -0400, [email protected] wrote:
>>>>>> From: Andy Adamson <[email protected]>
>>>>>>
>>>>>> Signed-off-by: Andy Adamson <[email protected]>
>>>>>> ---
>>>>>> fs/nfs/write.c | 2 ++
>>>>>> 1 files changed, 2 insertions(+), 0 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
>>>>>> index e6fe3d6..c7295de 100644
>>>>>> --- a/fs/nfs/write.c
>>>>>> +++ b/fs/nfs/write.c
>>>>>> @@ -1555,6 +1555,8 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data)
>>>>>> /* We have a mismatch. Write the page again */
>>>>>> dprintk(" mismatch\n");
>>>>>> nfs_mark_request_dirty(req);
>>>>>> + nfs_unlock_request(req);
>>>>>> + continue;
>>>>>> next:
>>>>>> nfs_unlock_and_release_request(req);
>>>>>> }
>>>>>
>>>>> What is this patch trying to fix? As far as I can see it will lead to a
>>>>> reference leak.
>>>>
>>>> The release of the page drops the refcount to zero. Since it is a mismatch, we want to continue to use the page, so nfs_page_find_request_locked is called we get the WARNING in kref_get, and an Oops in various places in pnfs_update_layout.
>>>>
>>>> Here is the output of added printk's.
>>>>
>>>> -->Andy
>>>>
>>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988921] NFS: commit (0:35/2 4096@184320)
>>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988923] mismatch req ffff880042dbb800
>>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988927] nfs_release_request WB req ffff880042dbb800 wb_kref 1
>>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988932] put_lseg: lseg ffff880042d95380 ref 4 valid 0
>>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988945] nfs_page_find_request_locked req ffff88003e033800
>>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988949] nfs_page_find_request_locked WB req ffff88003e033800 wb_kref 0
>>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988953] ------------[ cut here ]------------
>>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.989082] WARNING: at include/linux/kref.h:41 kref_get+0x20/0x2c [nfs]()
>>>
>>> Are these commit-to-ds requests?
>>
>> Yes
>
> OK, so how about the following fix:



Yep - works just fine. Thanks!

-->Andy

>
> 8<---------------------------------------------------------------
> From 53b8ee346463946f88b3e1639d688c384df1166c Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Tue, 22 May 2012 16:36:27 -0400
> Subject: [PATCH] NFSv4.1: Fix a bad reference count issue in the pNFS commit
> code
>
> filelayout_scan_commit_lists needs to bump the reference count on
> the struct nfs_page just like nfs_scan_commit_list().
>
> Reported-by: Andy Adamson <[email protected]>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/nfs/nfs4filelayout.c | 1 +
> 1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
> index 474c630..33849d3 100644
> --- a/fs/nfs/nfs4filelayout.c
> +++ b/fs/nfs/nfs4filelayout.c
> @@ -1106,6 +1106,7 @@ transfer_commit_list(struct list_head *src, struct list_head *dst,
> list_for_each_entry_safe(req, tmp, src, wb_list) {
> if (!nfs_lock_request(req))
> continue;
> + kref_get(&req->wb_kref);
> if (cond_resched_lock(cinfo->lock))
> list_safe_reset_next(req, tmp, wb_list);
> nfs_request_remove_commit_list(req, cinfo);
> --
> 1.7.7.6
>
>
> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> [email protected]
> http://www.netapp.com
>


2012-05-22 16:10:21

by Andy Adamson

[permalink] [raw]
Subject: [PATCH 2/4] NFSv4.1 skip rpc_call_done only on disconnected DS slot_table_waitq tasks

From: Andy Adamson <[email protected]>

We reset all I/O on a disconnected data server through the pgio layer indicated
by the NFS_IOHDR_REDO flag.

Differentiate between on-the-wire tasks returning with an error which must
call rpc_call_done and tasks woken from the data server slot_table_waitq
waiting for a session slot with a status of zero which call rpc_exit in
rpc_prepare and need to skip rpc_call_done.

Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/nfs4filelayout.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 8df91be..db6f2a6 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -302,7 +302,8 @@ static void filelayout_read_call_done(struct rpc_task *task, void *data)

dprintk("--> %s task->tk_status %d\n", __func__, task->tk_status);

- if (test_bit(NFS_IOHDR_REDO, &rdata->header->flags))
+ if (test_bit(NFS_IOHDR_REDO, &rdata->header->flags) &&
+ task->tk_status == 0)
return;

/* Note this may cause RPC to be resent */
@@ -399,7 +400,8 @@ static void filelayout_write_call_done(struct rpc_task *task, void *data)
{
struct nfs_write_data *wdata = data;

- if (test_bit(NFS_IOHDR_REDO, &wdata->header->flags))
+ if (test_bit(NFS_IOHDR_REDO, &wdata->header->flags) &&
+ task->tk_status == 0)
return;

/* Note this may cause RPC to be resent */
--
1.7.7.6


2012-05-22 16:12:16

by Andy Adamson

[permalink] [raw]
Subject: Re: [PATCH 1/4] NFSv4.1 Just use nfs_put_client in filelayout release

Sorry - this is a repeat...

-->Andy

On Tue, May 22, 2012 at 8:09 AM, <[email protected]> wrote:
> From: Andy Adamson <[email protected]>
>
> Signed-off-by: Andy Adamson <[email protected]>
> ---
> ?fs/nfs/nfs4filelayout.c | ? 12 ++----------
> ?1 files changed, 2 insertions(+), 10 deletions(-)
>
> diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
> index 474c630..8df91be 100644
> --- a/fs/nfs/nfs4filelayout.c
> +++ b/fs/nfs/nfs4filelayout.c
> @@ -101,9 +101,6 @@ static void filelayout_reset_write(struct nfs_write_data *data)
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?&hdr->pages,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?hdr->completion_ops);
> ? ? ? ?}
> - ? ? ? /* balance nfs_get_client in filelayout_write_pagelist */
> - ? ? ? nfs_put_client(data->ds_clp);
> - ? ? ? data->ds_clp ? ? = NULL;
> ?}
>
> ?static void filelayout_reset_read(struct nfs_read_data *data)
> @@ -125,9 +122,6 @@ static void filelayout_reset_read(struct nfs_read_data *data)
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?&hdr->pages,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?hdr->completion_ops);
> ? ? ? ?}
> - ? ? ? /* balance nfs_get_client in filelayout_read_pagelist */
> - ? ? ? nfs_put_client(data->ds_clp);
> - ? ? ? data->ds_clp = NULL;
> ?}
>
> ?static int filelayout_async_handle_error(struct rpc_task *task,
> @@ -326,8 +320,7 @@ static void filelayout_read_release(void *data)
> ?{
> ? ? ? ?struct nfs_read_data *rdata = data;
>
> - ? ? ? if (!test_bit(NFS_IOHDR_REDO, &rdata->header->flags))
> - ? ? ? ? ? ? ? nfs_put_client(rdata->ds_clp);
> + ? ? ? nfs_put_client(rdata->ds_clp);
> ? ? ? ?rdata->header->mds_ops->rpc_release(data);
> ?}
>
> @@ -424,8 +417,7 @@ static void filelayout_write_release(void *data)
> ?{
> ? ? ? ?struct nfs_write_data *wdata = data;
>
> - ? ? ? if (!test_bit(NFS_IOHDR_REDO, &wdata->header->flags))
> - ? ? ? ? ? ? ? nfs_put_client(wdata->ds_clp);
> + ? ? ? nfs_put_client(wdata->ds_clp);
> ? ? ? ?wdata->header->mds_ops->rpc_release(data);
> ?}
>
> --
> 1.7.7.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html

2012-05-22 16:10:21

by Andy Adamson

[permalink] [raw]
Subject: [PATCH 1/4] NFSv4.1 Just use nfs_put_client in filelayout release

From: Andy Adamson <[email protected]>

Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/nfs4filelayout.c | 12 ++----------
1 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 474c630..8df91be 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -101,9 +101,6 @@ static void filelayout_reset_write(struct nfs_write_data *data)
&hdr->pages,
hdr->completion_ops);
}
- /* balance nfs_get_client in filelayout_write_pagelist */
- nfs_put_client(data->ds_clp);
- data->ds_clp = NULL;
}

static void filelayout_reset_read(struct nfs_read_data *data)
@@ -125,9 +122,6 @@ static void filelayout_reset_read(struct nfs_read_data *data)
&hdr->pages,
hdr->completion_ops);
}
- /* balance nfs_get_client in filelayout_read_pagelist */
- nfs_put_client(data->ds_clp);
- data->ds_clp = NULL;
}

static int filelayout_async_handle_error(struct rpc_task *task,
@@ -326,8 +320,7 @@ static void filelayout_read_release(void *data)
{
struct nfs_read_data *rdata = data;

- if (!test_bit(NFS_IOHDR_REDO, &rdata->header->flags))
- nfs_put_client(rdata->ds_clp);
+ nfs_put_client(rdata->ds_clp);
rdata->header->mds_ops->rpc_release(data);
}

@@ -424,8 +417,7 @@ static void filelayout_write_release(void *data)
{
struct nfs_write_data *wdata = data;

- if (!test_bit(NFS_IOHDR_REDO, &wdata->header->flags))
- nfs_put_client(wdata->ds_clp);
+ nfs_put_client(wdata->ds_clp);
wdata->header->mds_ops->rpc_release(data);
}

--
1.7.7.6


2012-05-22 16:24:27

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 4/4] NFSv4.1 do not release page on commit mismatch

T24gVHVlLCAyMDEyLTA1LTIyIGF0IDA4OjA5IC0wNDAwLCBhbmRyb3NAbmV0YXBwLmNvbSB3cm90
ZToNCj4gRnJvbTogQW5keSBBZGFtc29uIDxhbmRyb3NAbmV0YXBwLmNvbT4NCj4gDQo+IFNpZ25l
ZC1vZmYtYnk6IEFuZHkgQWRhbXNvbiA8YW5kcm9zQG5ldGFwcC5jb20+DQo+IC0tLQ0KPiAgZnMv
bmZzL3dyaXRlLmMgfCAgICAyICsrDQo+ICAxIGZpbGVzIGNoYW5nZWQsIDIgaW5zZXJ0aW9ucygr
KSwgMCBkZWxldGlvbnMoLSkNCj4gDQo+IGRpZmYgLS1naXQgYS9mcy9uZnMvd3JpdGUuYyBiL2Zz
L25mcy93cml0ZS5jDQo+IGluZGV4IGU2ZmUzZDYuLmM3Mjk1ZGUgMTAwNjQ0DQo+IC0tLSBhL2Zz
L25mcy93cml0ZS5jDQo+ICsrKyBiL2ZzL25mcy93cml0ZS5jDQo+IEBAIC0xNTU1LDYgKzE1NTUs
OCBAQCBzdGF0aWMgdm9pZCBuZnNfY29tbWl0X3JlbGVhc2VfcGFnZXMoc3RydWN0IG5mc19jb21t
aXRfZGF0YSAqZGF0YSkNCj4gIAkJLyogV2UgaGF2ZSBhIG1pc21hdGNoLiBXcml0ZSB0aGUgcGFn
ZSBhZ2FpbiAqLw0KPiAgCQlkcHJpbnRrKCIgbWlzbWF0Y2hcbiIpOw0KPiAgCQluZnNfbWFya19y
ZXF1ZXN0X2RpcnR5KHJlcSk7DQo+ICsJCW5mc191bmxvY2tfcmVxdWVzdChyZXEpOw0KPiArCQlj
b250aW51ZTsNCj4gIAluZXh0Og0KPiAgCQluZnNfdW5sb2NrX2FuZF9yZWxlYXNlX3JlcXVlc3Qo
cmVxKTsNCj4gIAl9DQoNCldoYXQgaXMgdGhpcyBwYXRjaCB0cnlpbmcgdG8gZml4PyBBcyBmYXIg
YXMgSSBjYW4gc2VlIGl0IHdpbGwgbGVhZCB0byBhDQpyZWZlcmVuY2UgbGVhay4NCg0KLS0gDQpU
cm9uZCBNeWtsZWJ1c3QNCkxpbnV4IE5GUyBjbGllbnQgbWFpbnRhaW5lcg0KDQpOZXRBcHANClRy
b25kLk15a2xlYnVzdEBuZXRhcHAuY29tDQp3d3cubmV0YXBwLmNvbQ0KDQo=

2012-05-22 17:46:21

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 4/4] NFSv4.1 do not release page on commit mismatch

T24gVHVlLCAyMDEyLTA1LTIyIGF0IDE2OjMwICswMDAwLCBBZGFtc29uLCBBbmR5IHdyb3RlOg0K
PiBPbiBNYXkgMjIsIDIwMTIsIGF0IDEyOjI0IFBNLCBNeWtsZWJ1c3QsIFRyb25kIHdyb3RlOg0K
PiANCj4gPiBPbiBUdWUsIDIwMTItMDUtMjIgYXQgMDg6MDkgLTA0MDAsIGFuZHJvc0BuZXRhcHAu
Y29tIHdyb3RlOg0KPiA+PiBGcm9tOiBBbmR5IEFkYW1zb24gPGFuZHJvc0BuZXRhcHAuY29tPg0K
PiA+PiANCj4gPj4gU2lnbmVkLW9mZi1ieTogQW5keSBBZGFtc29uIDxhbmRyb3NAbmV0YXBwLmNv
bT4NCj4gPj4gLS0tDQo+ID4+IGZzL25mcy93cml0ZS5jIHwgICAgMiArKw0KPiA+PiAxIGZpbGVz
IGNoYW5nZWQsIDIgaW5zZXJ0aW9ucygrKSwgMCBkZWxldGlvbnMoLSkNCj4gPj4gDQo+ID4+IGRp
ZmYgLS1naXQgYS9mcy9uZnMvd3JpdGUuYyBiL2ZzL25mcy93cml0ZS5jDQo+ID4+IGluZGV4IGU2
ZmUzZDYuLmM3Mjk1ZGUgMTAwNjQ0DQo+ID4+IC0tLSBhL2ZzL25mcy93cml0ZS5jDQo+ID4+ICsr
KyBiL2ZzL25mcy93cml0ZS5jDQo+ID4+IEBAIC0xNTU1LDYgKzE1NTUsOCBAQCBzdGF0aWMgdm9p
ZCBuZnNfY29tbWl0X3JlbGVhc2VfcGFnZXMoc3RydWN0IG5mc19jb21taXRfZGF0YSAqZGF0YSkN
Cj4gPj4gCQkvKiBXZSBoYXZlIGEgbWlzbWF0Y2guIFdyaXRlIHRoZSBwYWdlIGFnYWluICovDQo+
ID4+IAkJZHByaW50aygiIG1pc21hdGNoXG4iKTsNCj4gPj4gCQluZnNfbWFya19yZXF1ZXN0X2Rp
cnR5KHJlcSk7DQo+ID4+ICsJCW5mc191bmxvY2tfcmVxdWVzdChyZXEpOw0KPiA+PiArCQljb250
aW51ZTsNCj4gPj4gCW5leHQ6DQo+ID4+IAkJbmZzX3VubG9ja19hbmRfcmVsZWFzZV9yZXF1ZXN0
KHJlcSk7DQo+ID4+IAl9DQo+ID4gDQo+ID4gV2hhdCBpcyB0aGlzIHBhdGNoIHRyeWluZyB0byBm
aXg/IEFzIGZhciBhcyBJIGNhbiBzZWUgaXQgd2lsbCBsZWFkIHRvIGENCj4gPiByZWZlcmVuY2Ug
bGVhay4NCj4gDQo+IFRoZSByZWxlYXNlIG9mIHRoZSBwYWdlIGRyb3BzIHRoZSByZWZjb3VudCB0
byB6ZXJvLiBTaW5jZSBpdCBpcyBhIG1pc21hdGNoLCB3ZSB3YW50IHRvIGNvbnRpbnVlIHRvIHVz
ZSB0aGUgcGFnZSwgc28gbmZzX3BhZ2VfZmluZF9yZXF1ZXN0X2xvY2tlZCBpcyBjYWxsZWQgd2Ug
Z2V0IHRoZSBXQVJOSU5HIGluIGtyZWZfZ2V0LCBhbmQgYW4gT29wcyBpbiB2YXJpb3VzIHBsYWNl
cyBpbiBwbmZzX3VwZGF0ZV9sYXlvdXQuDQo+IA0KPiBIZXJlIGlzIHRoZSBvdXRwdXQgb2YgYWRk
ZWQgcHJpbnRrJ3MuDQo+IA0KPiAtLT5BbmR5DQo+IA0KPiBNYXkgMjEgMTg6MjU6MzMgZmVkb3Jh
LTY0LTIga2VybmVsOiBbICAxNTIuOTg4OTIxXSBORlM6ICAgICAgIGNvbW1pdCAoMDozNS8yIDQw
OTZAMTg0MzIwKQ0KPiBNYXkgMjEgMTg6MjU6MzMgZmVkb3JhLTY0LTIga2VybmVsOiBbICAxNTIu
OTg4OTIzXSAgbWlzbWF0Y2ggcmVxIGZmZmY4ODAwNDJkYmI4MDANCj4gTWF5IDIxIDE4OjI1OjMz
IGZlZG9yYS02NC0yIGtlcm5lbDogWyAgMTUyLjk4ODkyN10gbmZzX3JlbGVhc2VfcmVxdWVzdCBX
QiByZXEgZmZmZjg4MDA0MmRiYjgwMCB3Yl9rcmVmIDENCj4gTWF5IDIxIDE4OjI1OjMzIGZlZG9y
YS02NC0yIGtlcm5lbDogWyAgMTUyLjk4ODkzMl0gcHV0X2xzZWc6IGxzZWcgZmZmZjg4MDA0MmQ5
NTM4MCByZWYgNCB2YWxpZCAwDQo+IE1heSAyMSAxODoyNTozMyBmZWRvcmEtNjQtMiBrZXJuZWw6
IFsgIDE1Mi45ODg5NDVdIG5mc19wYWdlX2ZpbmRfcmVxdWVzdF9sb2NrZWQgcmVxIGZmZmY4ODAw
M2UwMzM4MDANCj4gTWF5IDIxIDE4OjI1OjMzIGZlZG9yYS02NC0yIGtlcm5lbDogWyAgMTUyLjk4
ODk0OV0gbmZzX3BhZ2VfZmluZF9yZXF1ZXN0X2xvY2tlZCBXQiByZXEgZmZmZjg4MDAzZTAzMzgw
MCB3Yl9rcmVmIDANCj4gTWF5IDIxIDE4OjI1OjMzIGZlZG9yYS02NC0yIGtlcm5lbDogWyAgMTUy
Ljk4ODk1M10gLS0tLS0tLS0tLS0tWyBjdXQgaGVyZSBdLS0tLS0tLS0tLS0tDQo+IE1heSAyMSAx
ODoyNTozMyBmZWRvcmEtNjQtMiBrZXJuZWw6IFsgIDE1Mi45ODkwODJdIFdBUk5JTkc6IGF0IGlu
Y2x1ZGUvbGludXgva3JlZi5oOjQxIGtyZWZfZ2V0KzB4MjAvMHgyYyBbbmZzXSgpDQoNCkFyZSB0
aGVzZSBjb21taXQtdG8tZHMgcmVxdWVzdHM/IEFzIGZhciBhcyBJIGNhbiBzZWUsIHRoZXJlIGlz
IGEgYnVnIGluDQp0cmFuc2Zlcl9jb21taXRfbGlzdCgpIGluIHRoYXQgaXQgbG9ja3MgdGhlIHJl
cXVlc3RzLCBidXQgZG9lcyBub3QNCnJlZmVyZW5jZSB0aGVtLg0KDQpGcmVkPw0KDQotLSANClRy
b25kIE15a2xlYnVzdA0KTGludXggTkZTIGNsaWVudCBtYWludGFpbmVyDQoNCk5ldEFwcA0KVHJv
bmQuTXlrbGVidXN0QG5ldGFwcC5jb20NCnd3dy5uZXRhcHAuY29tDQoNCg==

2012-05-22 16:10:21

by Andy Adamson

[permalink] [raw]
Subject: [PATCH 3/4] NFSv4.1 fix null state reference in filelayout_async_handle_error

From: Andy Adamson <[email protected]>

Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/nfs4filelayout.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index db6f2a6..cd3d1ce 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -193,8 +193,8 @@ static int filelayout_async_handle_error(struct rpc_task *task,
* layout is destroyed and a new valid layout is obtained.
*/
set_bit(NFS_LAYOUT_INVALID,
- &NFS_I(state->inode)->layout->plh_flags);
- pnfs_destroy_layout(NFS_I(state->inode));
+ &NFS_I(inode)->layout->plh_flags);
+ pnfs_destroy_layout(NFS_I(inode));
rpc_wake_up(&tbl->slot_tbl_waitq);
goto reset;
/* RPC connection errors */
@@ -208,7 +208,7 @@ static int filelayout_async_handle_error(struct rpc_task *task,
dprintk("%s DS connection error %d\n", __func__,
task->tk_status);
if (!filelayout_test_devid_invalid(devid))
- _pnfs_return_layout(state->inode);
+ _pnfs_return_layout(inode);
filelayout_mark_devid_invalid(devid);
rpc_wake_up(&tbl->slot_tbl_waitq);
nfs4_ds_disconnect(clp);
--
1.7.7.6


2012-05-22 16:30:30

by Adamson, Andy

[permalink] [raw]
Subject: Re: [PATCH 4/4] NFSv4.1 do not release page on commit mismatch


On May 22, 2012, at 12:24 PM, Myklebust, Trond wrote:

> On Tue, 2012-05-22 at 08:09 -0400, [email protected] wrote:
>> From: Andy Adamson <[email protected]>
>>
>> Signed-off-by: Andy Adamson <[email protected]>
>> ---
>> fs/nfs/write.c | 2 ++
>> 1 files changed, 2 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/nfs/write.c b/fs/nfs/write.c
>> index e6fe3d6..c7295de 100644
>> --- a/fs/nfs/write.c
>> +++ b/fs/nfs/write.c
>> @@ -1555,6 +1555,8 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data)
>> /* We have a mismatch. Write the page again */
>> dprintk(" mismatch\n");
>> nfs_mark_request_dirty(req);
>> + nfs_unlock_request(req);
>> + continue;
>> next:
>> nfs_unlock_and_release_request(req);
>> }
>
> What is this patch trying to fix? As far as I can see it will lead to a
> reference leak.

The release of the page drops the refcount to zero. Since it is a mismatch, we want to continue to use the page, so nfs_page_find_request_locked is called we get the WARNING in kref_get, and an Oops in various places in pnfs_update_layout.

Here is the output of added printk's.

-->Andy

May 21 18:25:33 fedora-64-2 kernel: [ 152.988921] NFS: commit (0:35/2 4096@184320)
May 21 18:25:33 fedora-64-2 kernel: [ 152.988923] mismatch req ffff880042dbb800
May 21 18:25:33 fedora-64-2 kernel: [ 152.988927] nfs_release_request WB req ffff880042dbb800 wb_kref 1
May 21 18:25:33 fedora-64-2 kernel: [ 152.988932] put_lseg: lseg ffff880042d95380 ref 4 valid 0
May 21 18:25:33 fedora-64-2 kernel: [ 152.988945] nfs_page_find_request_locked req ffff88003e033800
May 21 18:25:33 fedora-64-2 kernel: [ 152.988949] nfs_page_find_request_locked WB req ffff88003e033800 wb_kref 0
May 21 18:25:33 fedora-64-2 kernel: [ 152.988953] ------------[ cut here ]------------
May 21 18:25:33 fedora-64-2 kernel: [ 152.989082] WARNING: at include/linux/kref.h:41 kref_get+0x20/0x2c [nfs]()




May 21 18:25:33 fedora-64-2 kernel: [ 152.989484] kernel BUG at fs/nfs/pnfs.c:584!
May 21 18:25:33 fedora-64-2 kernel: [ 152.989488] invalid opcode: 0000 [#1] SMP
May 21 18:25:33 fedora-64-2 kernel: [ 152.989493] CPU 1
May 21 18:25:33 fedora-64-2 kernel: [ 152.989495] Modules linked in: nfs_layout_nfsv41_files nfs auth_rpcgss nfs_acl fuse lockd rfcomm bnep ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ip6table_filter ip6_tables snd_ens1371 gameport snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device btusb snd_pcm joydev bluetooth rfkill ppdev snd_timer vmw_balloon microcode snd parport_pc parport soundcore snd_page_alloc e1000 i2c_piix4 i2c_core shpchp uinput sunrpc mptspi mptscsih mptbase scsi_transport_spi [last unloaded: scsi_wait_scan]
May 21 18:25:33 fedora-64-2 kernel: [ 152.989552]
May 21 18:25:33 fedora-64-2 kernel: [ 152.989557] Pid: 1595, comm: flush-0:35 Tainted: G W 3.4.0-rc7+ #5 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
May 21 18:25:33 fedora-64-2 kernel: [ 152.989566] RIP: 0010:[<ffffffffa02c7377>] [<ffffffffa02c7377>] pnfs_update_layout+0x460/0x6a6 [nfs]

>
> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> [email protected]
> http://www.netapp.com
>


2012-05-22 20:40:43

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [PATCH 4/4] NFSv4.1 do not release page on commit mismatch

T24gVHVlLCAyMDEyLTA1LTIyIGF0IDE0OjA3IC0wNDAwLCBBbmR5IEFkYW1zb24gd3JvdGU6DQo+
IE9uIFR1ZSwgTWF5IDIyLCAyMDEyIGF0IDE6NDYgUE0sIE15a2xlYnVzdCwgVHJvbmQNCj4gPFRy
b25kLk15a2xlYnVzdEBuZXRhcHAuY29tPiB3cm90ZToNCj4gPiBPbiBUdWUsIDIwMTItMDUtMjIg
YXQgMTY6MzAgKzAwMDAsIEFkYW1zb24sIEFuZHkgd3JvdGU6DQo+ID4+IE9uIE1heSAyMiwgMjAx
MiwgYXQgMTI6MjQgUE0sIE15a2xlYnVzdCwgVHJvbmQgd3JvdGU6DQo+ID4+DQo+ID4+ID4gT24g
VHVlLCAyMDEyLTA1LTIyIGF0IDA4OjA5IC0wNDAwLCBhbmRyb3NAbmV0YXBwLmNvbSB3cm90ZToN
Cj4gPj4gPj4gRnJvbTogQW5keSBBZGFtc29uIDxhbmRyb3NAbmV0YXBwLmNvbT4NCj4gPj4gPj4N
Cj4gPj4gPj4gU2lnbmVkLW9mZi1ieTogQW5keSBBZGFtc29uIDxhbmRyb3NAbmV0YXBwLmNvbT4N
Cj4gPj4gPj4gLS0tDQo+ID4+ID4+IGZzL25mcy93cml0ZS5jIHwgICAgMiArKw0KPiA+PiA+PiAx
IGZpbGVzIGNoYW5nZWQsIDIgaW5zZXJ0aW9ucygrKSwgMCBkZWxldGlvbnMoLSkNCj4gPj4gPj4N
Cj4gPj4gPj4gZGlmZiAtLWdpdCBhL2ZzL25mcy93cml0ZS5jIGIvZnMvbmZzL3dyaXRlLmMNCj4g
Pj4gPj4gaW5kZXggZTZmZTNkNi4uYzcyOTVkZSAxMDA2NDQNCj4gPj4gPj4gLS0tIGEvZnMvbmZz
L3dyaXRlLmMNCj4gPj4gPj4gKysrIGIvZnMvbmZzL3dyaXRlLmMNCj4gPj4gPj4gQEAgLTE1NTUs
NiArMTU1NSw4IEBAIHN0YXRpYyB2b2lkIG5mc19jb21taXRfcmVsZWFzZV9wYWdlcyhzdHJ1Y3Qg
bmZzX2NvbW1pdF9kYXRhICpkYXRhKQ0KPiA+PiA+PiAgICAgICAgICAgIC8qIFdlIGhhdmUgYSBt
aXNtYXRjaC4gV3JpdGUgdGhlIHBhZ2UgYWdhaW4gKi8NCj4gPj4gPj4gICAgICAgICAgICBkcHJp
bnRrKCIgbWlzbWF0Y2hcbiIpOw0KPiA+PiA+PiAgICAgICAgICAgIG5mc19tYXJrX3JlcXVlc3Rf
ZGlydHkocmVxKTsNCj4gPj4gPj4gKyAgICAgICAgICBuZnNfdW5sb2NrX3JlcXVlc3QocmVxKTsN
Cj4gPj4gPj4gKyAgICAgICAgICBjb250aW51ZTsNCj4gPj4gPj4gICAgbmV4dDoNCj4gPj4gPj4g
ICAgICAgICAgICBuZnNfdW5sb2NrX2FuZF9yZWxlYXNlX3JlcXVlc3QocmVxKTsNCj4gPj4gPj4g
ICAgfQ0KPiA+PiA+DQo+ID4+ID4gV2hhdCBpcyB0aGlzIHBhdGNoIHRyeWluZyB0byBmaXg/IEFz
IGZhciBhcyBJIGNhbiBzZWUgaXQgd2lsbCBsZWFkIHRvIGENCj4gPj4gPiByZWZlcmVuY2UgbGVh
ay4NCj4gPj4NCj4gPj4gVGhlIHJlbGVhc2Ugb2YgdGhlIHBhZ2UgZHJvcHMgdGhlIHJlZmNvdW50
IHRvIHplcm8uIFNpbmNlIGl0IGlzIGEgbWlzbWF0Y2gsIHdlIHdhbnQgdG8gY29udGludWUgdG8g
dXNlIHRoZSBwYWdlLCBzbyBuZnNfcGFnZV9maW5kX3JlcXVlc3RfbG9ja2VkIGlzIGNhbGxlZCB3
ZSBnZXQgdGhlIFdBUk5JTkcgaW4ga3JlZl9nZXQsIGFuZCBhbiBPb3BzIGluIHZhcmlvdXMgcGxh
Y2VzIGluIHBuZnNfdXBkYXRlX2xheW91dC4NCj4gPj4NCj4gPj4gSGVyZSBpcyB0aGUgb3V0cHV0
IG9mIGFkZGVkIHByaW50aydzLg0KPiA+Pg0KPiA+PiAtLT5BbmR5DQo+ID4+DQo+ID4+IE1heSAy
MSAxODoyNTozMyBmZWRvcmEtNjQtMiBrZXJuZWw6IFsgIDE1Mi45ODg5MjFdIE5GUzogICAgICAg
Y29tbWl0ICgwOjM1LzIgNDA5NkAxODQzMjApDQo+ID4+IE1heSAyMSAxODoyNTozMyBmZWRvcmEt
NjQtMiBrZXJuZWw6IFsgIDE1Mi45ODg5MjNdICBtaXNtYXRjaCByZXEgZmZmZjg4MDA0MmRiYjgw
MA0KPiA+PiBNYXkgMjEgMTg6MjU6MzMgZmVkb3JhLTY0LTIga2VybmVsOiBbICAxNTIuOTg4OTI3
XSBuZnNfcmVsZWFzZV9yZXF1ZXN0IFdCIHJlcSBmZmZmODgwMDQyZGJiODAwIHdiX2tyZWYgMQ0K
PiA+PiBNYXkgMjEgMTg6MjU6MzMgZmVkb3JhLTY0LTIga2VybmVsOiBbICAxNTIuOTg4OTMyXSBw
dXRfbHNlZzogbHNlZyBmZmZmODgwMDQyZDk1MzgwIHJlZiA0IHZhbGlkIDANCj4gPj4gTWF5IDIx
IDE4OjI1OjMzIGZlZG9yYS02NC0yIGtlcm5lbDogWyAgMTUyLjk4ODk0NV0gbmZzX3BhZ2VfZmlu
ZF9yZXF1ZXN0X2xvY2tlZCByZXEgZmZmZjg4MDAzZTAzMzgwMA0KPiA+PiBNYXkgMjEgMTg6MjU6
MzMgZmVkb3JhLTY0LTIga2VybmVsOiBbICAxNTIuOTg4OTQ5XSBuZnNfcGFnZV9maW5kX3JlcXVl
c3RfbG9ja2VkIFdCIHJlcSBmZmZmODgwMDNlMDMzODAwIHdiX2tyZWYgMA0KPiA+PiBNYXkgMjEg
MTg6MjU6MzMgZmVkb3JhLTY0LTIga2VybmVsOiBbICAxNTIuOTg4OTUzXSAtLS0tLS0tLS0tLS1b
IGN1dCBoZXJlIF0tLS0tLS0tLS0tLS0NCj4gPj4gTWF5IDIxIDE4OjI1OjMzIGZlZG9yYS02NC0y
IGtlcm5lbDogWyAgMTUyLjk4OTA4Ml0gV0FSTklORzogYXQgaW5jbHVkZS9saW51eC9rcmVmLmg6
NDEga3JlZl9nZXQrMHgyMC8weDJjIFtuZnNdKCkNCj4gPg0KPiA+IEFyZSB0aGVzZSBjb21taXQt
dG8tZHMgcmVxdWVzdHM/DQo+IA0KPiBZZXMNCg0KT0ssIHNvIGhvdyBhYm91dCB0aGUgZm9sbG93
aW5nIGZpeDoNCg0KODwtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0NCkZyb20gNTNiOGVlMzQ2NDYzOTQ2Zjg4YjNlMTYzOWQ2ODhj
Mzg0ZGYxMTY2YyBNb24gU2VwIDE3IDAwOjAwOjAwIDIwMDENCkZyb206IFRyb25kIE15a2xlYnVz
dCA8VHJvbmQuTXlrbGVidXN0QG5ldGFwcC5jb20+DQpEYXRlOiBUdWUsIDIyIE1heSAyMDEyIDE2
OjM2OjI3IC0wNDAwDQpTdWJqZWN0OiBbUEFUQ0hdIE5GU3Y0LjE6IEZpeCBhIGJhZCByZWZlcmVu
Y2UgY291bnQgaXNzdWUgaW4gdGhlIHBORlMgY29tbWl0DQogY29kZQ0KDQpmaWxlbGF5b3V0X3Nj
YW5fY29tbWl0X2xpc3RzIG5lZWRzIHRvIGJ1bXAgdGhlIHJlZmVyZW5jZSBjb3VudCBvbg0KdGhl
IHN0cnVjdCBuZnNfcGFnZSBqdXN0IGxpa2UgbmZzX3NjYW5fY29tbWl0X2xpc3QoKS4NCg0KUmVw
b3J0ZWQtYnk6IEFuZHkgQWRhbXNvbiA8YW5kcm9zQG5ldGFwcC5jb20+DQpTaWduZWQtb2ZmLWJ5
OiBUcm9uZCBNeWtsZWJ1c3QgPFRyb25kLk15a2xlYnVzdEBuZXRhcHAuY29tPg0KLS0tDQogZnMv
bmZzL25mczRmaWxlbGF5b3V0LmMgfCAgICAxICsNCiAxIGZpbGVzIGNoYW5nZWQsIDEgaW5zZXJ0
aW9ucygrKSwgMCBkZWxldGlvbnMoLSkNCg0KZGlmZiAtLWdpdCBhL2ZzL25mcy9uZnM0ZmlsZWxh
eW91dC5jIGIvZnMvbmZzL25mczRmaWxlbGF5b3V0LmMNCmluZGV4IDQ3NGM2MzAuLjMzODQ5ZDMg
MTAwNjQ0DQotLS0gYS9mcy9uZnMvbmZzNGZpbGVsYXlvdXQuYw0KKysrIGIvZnMvbmZzL25mczRm
aWxlbGF5b3V0LmMNCkBAIC0xMTA2LDYgKzExMDYsNyBAQCB0cmFuc2Zlcl9jb21taXRfbGlzdChz
dHJ1Y3QgbGlzdF9oZWFkICpzcmMsIHN0cnVjdCBsaXN0X2hlYWQgKmRzdCwNCiAJbGlzdF9mb3Jf
ZWFjaF9lbnRyeV9zYWZlKHJlcSwgdG1wLCBzcmMsIHdiX2xpc3QpIHsNCiAJCWlmICghbmZzX2xv
Y2tfcmVxdWVzdChyZXEpKQ0KIAkJCWNvbnRpbnVlOw0KKwkJa3JlZl9nZXQoJnJlcS0+d2Jfa3Jl
Zik7DQogCQlpZiAoY29uZF9yZXNjaGVkX2xvY2soY2luZm8tPmxvY2spKQ0KIAkJCWxpc3Rfc2Fm
ZV9yZXNldF9uZXh0KHJlcSwgdG1wLCB3Yl9saXN0KTsNCiAJCW5mc19yZXF1ZXN0X3JlbW92ZV9j
b21taXRfbGlzdChyZXEsIGNpbmZvKTsNCi0tIA0KMS43LjcuNg0KDQoNCi0tIA0KVHJvbmQgTXlr
bGVidXN0DQpMaW51eCBORlMgY2xpZW50IG1haW50YWluZXINCg0KTmV0QXBwDQpUcm9uZC5NeWts
ZWJ1c3RAbmV0YXBwLmNvbQ0Kd3d3Lm5ldGFwcC5jb20NCg0K

2012-05-22 16:10:21

by Andy Adamson

[permalink] [raw]
Subject: [PATCH 4/4] NFSv4.1 do not release page on commit mismatch

From: Andy Adamson <[email protected]>

Signed-off-by: Andy Adamson <[email protected]>
---
fs/nfs/write.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index e6fe3d6..c7295de 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1555,6 +1555,8 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data)
/* We have a mismatch. Write the page again */
dprintk(" mismatch\n");
nfs_mark_request_dirty(req);
+ nfs_unlock_request(req);
+ continue;
next:
nfs_unlock_and_release_request(req);
}
--
1.7.7.6