2019-05-13 07:04:01

by Yihao Wu

[permalink] [raw]
Subject: [PATCH v2 0/2] Fix two bugs CB_NOTIFY_LOCK failing to wake a water

This patch set fix bugs related to CB_NOTIFY_LOCK. These bugs cause
poor performance in locking operation. And this patchset should also fix
the failure when running xfstests generic/089 on a NFSv4.1 filesystem.

Yihao Wu (2):
NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled

fs/nfs/nfs4proc.c | 31 ++++++++++++-------------------
1 file changed, 12 insertions(+), 19 deletions(-)

--
1.8.3.1


2019-05-13 07:04:01

by Yihao Wu

[permalink] [raw]
Subject: [PATCH v2 1/2] NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter

Commit b7dbcc0e433f "NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiter"
found this bug. However it didn't fix it.

This commit replaces schedule_timeout() with wait_woken() and
default_wake_function() with woken_wake_function() in function
nfs4_retry_setlk() and nfs4_wake_lock_waiter(). wait_woken() uses
memory barriers in its implementation to avoid potential race condition
when putting a process into sleeping state and then waking it up.

Fixes: a1d617d8f134 ("nfs: allow blocking locks to be awoken by lock callbacks")
Cc: [email protected] #4.9+
Signed-off-by: Yihao Wu <[email protected]>
---
fs/nfs/nfs4proc.c | 23 +++++++----------------
1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index c29cbef..f9ed6b5 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -6932,7 +6932,6 @@ struct nfs4_lock_waiter {
struct task_struct *task;
struct inode *inode;
struct nfs_lowner *owner;
- bool notified;
};

static int
@@ -6954,13 +6953,13 @@ struct nfs4_lock_waiter {
/* Make sure it's for the right inode */
if (nfs_compare_fh(NFS_FH(waiter->inode), &cbnl->cbnl_fh))
return 0;
-
- waiter->notified = true;
}

/* override "private" so we can use default_wake_function */
wait->private = waiter->task;
- ret = autoremove_wake_function(wait, mode, flags, key);
+ ret = woken_wake_function(wait, mode, flags, key);
+ if (ret)
+ list_del_init(&wait->entry);
wait->private = waiter;
return ret;
}
@@ -6979,8 +6978,7 @@ struct nfs4_lock_waiter {
.s_dev = server->s_dev };
struct nfs4_lock_waiter waiter = { .task = current,
.inode = state->inode,
- .owner = &owner,
- .notified = false };
+ .owner = &owner};
wait_queue_entry_t wait;

/* Don't bother with waitqueue if we don't expect a callback */
@@ -6993,21 +6991,14 @@ struct nfs4_lock_waiter {
add_wait_queue(q, &wait);

while(!signalled()) {
- waiter.notified = false;
status = nfs4_proc_setlk(state, cmd, request);
if ((status != -EAGAIN) || IS_SETLK(cmd))
break;

status = -ERESTARTSYS;
- spin_lock_irqsave(&q->lock, flags);
- if (waiter.notified) {
- spin_unlock_irqrestore(&q->lock, flags);
- continue;
- }
- set_current_state(TASK_INTERRUPTIBLE);
- spin_unlock_irqrestore(&q->lock, flags);
-
- freezable_schedule_timeout(NFS4_LOCK_MAXTIMEOUT);
+ freezer_do_not_count();
+ wait_woken(&wait, TASK_INTERRUPTIBLE, NFS4_LOCK_MAXTIMEOUT);
+ freezer_count();
}

finish_wait(q, &wait);
--
1.8.3.1

2019-05-13 07:04:38

by Yihao Wu

[permalink] [raw]
Subject: [PATCH v2 2/2] NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled

When a waiter is waked by CB_NOTIFY_LOCK, it will retry
nfs4_proc_setlk(). The waiter may fail to nfs4_proc_setlk() and sleep
again. However, the waiter is already removed from clp->cl_lock_waitq
when handling CB_NOTIFY_LOCK in nfs4_wake_lock_waiter(). So any
subsequent CB_NOTIFY_LOCK won't wake this waiter anymore. We should
put the waiter back to clp->cl_lock_waitq before retrying.

Cc: [email protected] #4.9+
Signed-off-by: Yihao Wu <[email protected]>
---
fs/nfs/nfs4proc.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index f9ed6b5..218c086 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -6988,20 +6988,22 @@ struct nfs4_lock_waiter {
init_wait(&wait);
wait.private = &waiter;
wait.func = nfs4_wake_lock_waiter;
- add_wait_queue(q, &wait);

while(!signalled()) {
+ add_wait_queue(q, &wait);
status = nfs4_proc_setlk(state, cmd, request);
- if ((status != -EAGAIN) || IS_SETLK(cmd))
+ if ((status != -EAGAIN) || IS_SETLK(cmd)) {
+ finish_wait(q, &wait);
break;
+ }

status = -ERESTARTSYS;
freezer_do_not_count();
wait_woken(&wait, TASK_INTERRUPTIBLE, NFS4_LOCK_MAXTIMEOUT);
freezer_count();
+ finish_wait(q, &wait);
}

- finish_wait(q, &wait);
return status;
}
#else /* !CONFIG_NFS_V4_1 */
--
1.8.3.1

2019-05-13 14:08:20

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] Fix two bugs CB_NOTIFY_LOCK failing to wake a water

On Mon, 2019-05-13 at 14:49 +0800, Yihao Wu wrote:
> This patch set fix bugs related to CB_NOTIFY_LOCK. These bugs cause
> poor performance in locking operation. And this patchset should also fix
> the failure when running xfstests generic/089 on a NFSv4.1 filesystem.
>
> Yihao Wu (2):
> NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
> NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled
>
> fs/nfs/nfs4proc.c | 31 ++++++++++++-------------------
> 1 file changed, 12 insertions(+), 19 deletions(-)
>

Looks good to me. Nice catch, btw!

Reviewed-by: Jeff Layton <[email protected]>

2019-05-16 08:03:14

by Yihao Wu

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] Fix two bugs CB_NOTIFY_LOCK failing to wake a water

On 2019/5/13 9:36 PM, Jeff Layton wrote:
> On Mon, 2019-05-13 at 14:49 +0800, Yihao Wu wrote:
>> This patch set fix bugs related to CB_NOTIFY_LOCK. These bugs cause
>> poor performance in locking operation. And this patchset should also fix
>> the failure when running xfstests generic/089 on a NFSv4.1 filesystem.
>>
>> Yihao Wu (2):
>> NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
>> NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled
>>
>> fs/nfs/nfs4proc.c | 31 ++++++++++++-------------------
>> 1 file changed, 12 insertions(+), 19 deletions(-)
>>
>
> Looks good to me. Nice catch, btw!
>
> Reviewed-by: Jeff Layton <[email protected]>
>

Thank you for your reviewing, Jeff!

And it seems I forgot to CC maintainers of fs/nfs. So I added you to the CC
list, Trond and Anna. Does this patchset need further reviewing?

Sorry to bother you.

Thanks,
Yihao Wu

2019-05-16 11:50:21

by Jeffrey Layton

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] Fix two bugs CB_NOTIFY_LOCK failing to wake a water

On Thu, 2019-05-16 at 16:01 +0800, Yihao Wu wrote:
> On 2019/5/13 9:36 PM, Jeff Layton wrote:
> > On Mon, 2019-05-13 at 14:49 +0800, Yihao Wu wrote:
> > > This patch set fix bugs related to CB_NOTIFY_LOCK. These bugs cause
> > > poor performance in locking operation. And this patchset should also fix
> > > the failure when running xfstests generic/089 on a NFSv4.1 filesystem.
> > >
> > > Yihao Wu (2):
> > > NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
> > > NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled
> > >
> > > fs/nfs/nfs4proc.c | 31 ++++++++++++-------------------
> > > 1 file changed, 12 insertions(+), 19 deletions(-)
> > >
> >
> > Looks good to me. Nice catch, btw!
> >
> > Reviewed-by: Jeff Layton <[email protected]>
> >
>
> Thank you for your reviewing, Jeff!
>
> And it seems I forgot to CC maintainers of fs/nfs. So I added you to the CC
> list, Trond and Anna. Does this patchset need further reviewing?
>
> Sorry to bother you.
>

This should probably go through Anna's tree, and I assume she'll pick it
up once she and/or Trond have had a chance to review it.

Thanks,
--
Jeff Layton <[email protected]>

2019-05-17 09:44:12

by Joseph Qi

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter

Hi Yihao,

On 19/5/13 14:57, Yihao Wu wrote:
> Commit b7dbcc0e433f "NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiter"
> found this bug. However it didn't fix it.
>
> This commit replaces schedule_timeout() with wait_woken() and
> default_wake_function() with woken_wake_function() in function
> nfs4_retry_setlk() and nfs4_wake_lock_waiter(). wait_woken() uses
> memory barriers in its implementation to avoid potential race condition
> when putting a process into sleeping state and then waking it up.
>
> Fixes: a1d617d8f134 ("nfs: allow blocking locks to be awoken by lock callbacks")
> Cc: [email protected] #4.9+
> Signed-off-by: Yihao Wu <[email protected]>
> ---
> fs/nfs/nfs4proc.c | 23 +++++++----------------
> 1 file changed, 7 insertions(+), 16 deletions(-)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index c29cbef..f9ed6b5 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -6932,7 +6932,6 @@ struct nfs4_lock_waiter {
> struct task_struct *task;
> struct inode *inode;
> struct nfs_lowner *owner;
> - bool notified;
> };
>
> static int
> @@ -6954,13 +6953,13 @@ struct nfs4_lock_waiter {
> /* Make sure it's for the right inode */
> if (nfs_compare_fh(NFS_FH(waiter->inode), &cbnl->cbnl_fh))
> return 0;
> -
> - waiter->notified = true;
> }
>
> /* override "private" so we can use default_wake_function */
> wait->private = waiter->task;
> - ret = autoremove_wake_function(wait, mode, flags, key);
> + ret = woken_wake_function(wait, mode, flags, key);
> + if (ret)
> + list_del_init(&wait->entry);
> wait->private = waiter;
> return ret;
> }
> @@ -6979,8 +6978,7 @@ struct nfs4_lock_waiter {
> .s_dev = server->s_dev };
> struct nfs4_lock_waiter waiter = { .task = current,
> .inode = state->inode,
> - .owner = &owner,
> - .notified = false };
> + .owner = &owner};
> wait_queue_entry_t wait;
>
> /* Don't bother with waitqueue if we don't expect a callback */
> @@ -6993,21 +6991,14 @@ struct nfs4_lock_waiter {
> add_wait_queue(q, &wait);
>
> while(!signalled()) {
> - waiter.notified = false;
> status = nfs4_proc_setlk(state, cmd, request);
> if ((status != -EAGAIN) || IS_SETLK(cmd))
> break;
>
> status = -ERESTARTSYS;
> - spin_lock_irqsave(&q->lock, flags);
> - if (waiter.notified) {
> - spin_unlock_irqrestore(&q->lock, flags);
> - continue;
> - }
> - set_current_state(TASK_INTERRUPTIBLE);
> - spin_unlock_irqrestore(&q->lock, flags);
> -
> - freezable_schedule_timeout(NFS4_LOCK_MAXTIMEOUT);
> + freezer_do_not_count();
> + wait_woken(&wait, TASK_INTERRUPTIBLE, NFS4_LOCK_MAXTIMEOUT);
> + freezer_count();

Since now variable 'flags' is not used anymore, we have to delete it as well.
Otherwise there is a compile warning “unused variable ‘flags’”.

Thanks,
Joseph

> }
>
> finish_wait(q, &wait);
>

2019-05-17 14:06:30

by Yihao Wu

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter

On 2019/5/17 5:22 PM, Joseph Qi wrote:
> Hi Yihao,
>
> On 19/5/13 14:57, Yihao Wu wrote:
>> Commit b7dbcc0e433f "NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiter"
>> found this bug. However it didn't fix it.
>>
>> This commit replaces schedule_timeout() with wait_woken() and
>> default_wake_function() with woken_wake_function() in function
>> nfs4_retry_setlk() and nfs4_wake_lock_waiter(). wait_woken() uses
>> memory barriers in its implementation to avoid potential race condition
>> when putting a process into sleeping state and then waking it up.
>>
>> Fixes: a1d617d8f134 ("nfs: allow blocking locks to be awoken by lock callbacks")
>> Cc: [email protected] #4.9+
>> Signed-off-by: Yihao Wu <[email protected]>
>> ---
>> fs/nfs/nfs4proc.c | 23 +++++++----------------
>> 1 file changed, 7 insertions(+), 16 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index c29cbef..f9ed6b5 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -6932,7 +6932,6 @@ struct nfs4_lock_waiter {
>> struct task_struct *task;
>> struct inode *inode;
>> struct nfs_lowner *owner;
>> - bool notified;
>> };
>>
>> static int
>> @@ -6954,13 +6953,13 @@ struct nfs4_lock_waiter {
>> /* Make sure it's for the right inode */
>> if (nfs_compare_fh(NFS_FH(waiter->inode), &cbnl->cbnl_fh))
>> return 0;
>> -
>> - waiter->notified = true;
>> }
>>
>> /* override "private" so we can use default_wake_function */
>> wait->private = waiter->task;
>> - ret = autoremove_wake_function(wait, mode, flags, key);
>> + ret = woken_wake_function(wait, mode, flags, key);
>> + if (ret)
>> + list_del_init(&wait->entry);
>> wait->private = waiter;
>> return ret;
>> }
>> @@ -6979,8 +6978,7 @@ struct nfs4_lock_waiter {
>> .s_dev = server->s_dev };
>> struct nfs4_lock_waiter waiter = { .task = current,
>> .inode = state->inode,
>> - .owner = &owner,
>> - .notified = false };
>> + .owner = &owner};
>> wait_queue_entry_t wait;
>>
>> /* Don't bother with waitqueue if we don't expect a callback */
>> @@ -6993,21 +6991,14 @@ struct nfs4_lock_waiter {
>> add_wait_queue(q, &wait);
>>
>> while(!signalled()) {
>> - waiter.notified = false;
>> status = nfs4_proc_setlk(state, cmd, request);
>> if ((status != -EAGAIN) || IS_SETLK(cmd))
>> break;
>>
>> status = -ERESTARTSYS;
>> - spin_lock_irqsave(&q->lock, flags);
>> - if (waiter.notified) {
>> - spin_unlock_irqrestore(&q->lock, flags);
>> - continue;
>> - }
>> - set_current_state(TASK_INTERRUPTIBLE);
>> - spin_unlock_irqrestore(&q->lock, flags);
>> -
>> - freezable_schedule_timeout(NFS4_LOCK_MAXTIMEOUT);
>> + freezer_do_not_count();
>> + wait_woken(&wait, TASK_INTERRUPTIBLE, NFS4_LOCK_MAXTIMEOUT);
>> + freezer_count();
>
> Since now variable 'flags' is not used anymore, we have to delete it as well.
> Otherwise there is a compile warning “unused variable ‘flags’”.
>
> Thanks,
> Joseph

Thank you Joseph. I'll remove unused 'flags' in PATCH v3.

Thanks,
Yihao Wu

>
>> }
>>
>> finish_wait(q, &wait);
>>

2019-05-21 17:57:57

by Yihao Wu

[permalink] [raw]
Subject: [PATCH v3 1/2] NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter

Commit b7dbcc0e433f "NFSv4.1: Fix a race where CB_NOTIFY_LOCK fails to wake a waiter"
found this bug. However it didn't fix it.

This commit replaces schedule_timeout() with wait_woken() and
default_wake_function() with woken_wake_function() in function
nfs4_retry_setlk() and nfs4_wake_lock_waiter(). wait_woken() uses
memory barriers in its implementation to avoid potential race condition
when putting a process into sleeping state and then waking it up.

Fixes: a1d617d8f134 ("nfs: allow blocking locks to be awoken by lock callbacks")
Cc: [email protected] #4.9+
Signed-off-by: Yihao Wu <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
---
v2->v3: remove unused variable flags in nfs4_retry_setlk()

fs/nfs/nfs4proc.c | 24 +++++++-----------------
1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index c29cbef..5f89bbd 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -6932,7 +6932,6 @@ struct nfs4_lock_waiter {
struct task_struct *task;
struct inode *inode;
struct nfs_lowner *owner;
- bool notified;
};

static int
@@ -6954,13 +6953,13 @@ struct nfs4_lock_waiter {
/* Make sure it's for the right inode */
if (nfs_compare_fh(NFS_FH(waiter->inode), &cbnl->cbnl_fh))
return 0;
-
- waiter->notified = true;
}

/* override "private" so we can use default_wake_function */
wait->private = waiter->task;
- ret = autoremove_wake_function(wait, mode, flags, key);
+ ret = woken_wake_function(wait, mode, flags, key);
+ if (ret)
+ list_del_init(&wait->entry);
wait->private = waiter;
return ret;
}
@@ -6969,7 +6968,6 @@ struct nfs4_lock_waiter {
nfs4_retry_setlk(struct nfs4_state *state, int cmd, struct file_lock *request)
{
int status = -ERESTARTSYS;
- unsigned long flags;
struct nfs4_lock_state *lsp = request->fl_u.nfs4_fl.owner;
struct nfs_server *server = NFS_SERVER(state->inode);
struct nfs_client *clp = server->nfs_client;
@@ -6979,8 +6977,7 @@ struct nfs4_lock_waiter {
.s_dev = server->s_dev };
struct nfs4_lock_waiter waiter = { .task = current,
.inode = state->inode,
- .owner = &owner,
- .notified = false };
+ .owner = &owner};
wait_queue_entry_t wait;

/* Don't bother with waitqueue if we don't expect a callback */
@@ -6993,21 +6990,14 @@ struct nfs4_lock_waiter {
add_wait_queue(q, &wait);

while(!signalled()) {
- waiter.notified = false;
status = nfs4_proc_setlk(state, cmd, request);
if ((status != -EAGAIN) || IS_SETLK(cmd))
break;

status = -ERESTARTSYS;
- spin_lock_irqsave(&q->lock, flags);
- if (waiter.notified) {
- spin_unlock_irqrestore(&q->lock, flags);
- continue;
- }
- set_current_state(TASK_INTERRUPTIBLE);
- spin_unlock_irqrestore(&q->lock, flags);
-
- freezable_schedule_timeout(NFS4_LOCK_MAXTIMEOUT);
+ freezer_do_not_count();
+ wait_woken(&wait, TASK_INTERRUPTIBLE, NFS4_LOCK_MAXTIMEOUT);
+ freezer_count();
}

finish_wait(q, &wait);
--
1.8.3.1