2008-01-04 22:51:17

by Wendy Cheng

Subject: [PATCH] Fix NLM reference count panic

This is not as exciting as Jeff Garzik's new NFSv4 server, but a painful
chore needs to be done. I will start to push the patches for the NLM lock
failover code next week. However, instead of sending a large batch of
patches all at once, I would like to submit them one by one, where they
can be functionally separated. This is to avoid the tedious rebase work I
have been doing for the past, (oh no), two years ??!!!

The first one should be simple enough to not need much explanation ....

-- Wendy




Attachments:
001-nlm_f_count.patch (955.00 B)

2008-01-04 23:01:00

by Wendy Cheng

Subject: Re: [PATCH] Fix NLM reference count panic

Forgot the Signed-off-by line ... here it is:
Signed-off-by: S. Wendy Cheng <[email protected]>
>
> ------------------------------------------------------------------------
>
> This fixes the incorrect fclose call inside nlm_traverse_files(), where
> a posix lock could still be held by an NFS client. The problem showed up
> as a kernel panic inside locks_remove_flock() (fs/locks.c:2034) during
> the fclose call, due to NFS-NLM locks still hanging on the
> inode->i_flock list.
>
> Also see: http://people.redhat.com/wcheng/Patches/NFS/NLM/001.txt
>
> svcsubs.c | 3 +--
> 1 files changed, 1 insertion(+), 2 deletions(-)
>
> --- gfs2-nmw/fs/lockd/svcsubs.c 2007-04-10 11:59:09.000000000 -0400
> +++ linux/fs/lockd/svcsubs.c 2007-04-18 10:01:23.000000000 -0400
> @@ -250,8 +250,7 @@ nlm_traverse_files(struct nlm_host *host
> mutex_lock(&nlm_file_mutex);
> file->f_count--;
> /* No more references to this file. Let go of it. */
> - if (list_empty(&file->f_blocks) && !file->f_locks
> - && !file->f_shares && !file->f_count) {
> + if (!nlm_file_inuse(file)) {
> hlist_del(&file->f_list);
> nlmsvc_ops->fclose(file->f_file);
> kfree(file);
>
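
For reference, nlm_file_inuse() already lives in fs/lockd/svcsubs.c; it
folds the old inline checks together with a fresh walk of the inode's
lock list. Roughly (paraphrased, not verbatim - check the tree for the
exact version):

static int
nlm_file_inuse(struct nlm_file *file)
{
	struct inode	 *inode = nlmsvc_file_inode(file);
	struct file_lock *fl;

	/* the same checks the old condition made inline ... */
	if (file->f_count || !list_empty(&file->f_blocks) || file->f_shares)
		return 1;

	/* ... plus a rescan of i_flock for locks lockd owns, instead
	 * of trusting the cached file->f_locks value */
	for (fl = inode->i_flock; fl; fl = fl->fl_next) {
		if (fl->fl_lmops == &nlmsvc_lock_operations)
			return 1;
	}

	file->f_locks = 0;
	return 0;
}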


2008-01-04 23:24:23

by J. Bruce Fields

Subject: Re: [PATCH] Fix NLM reference count panic

On Fri, Jan 04, 2008 at 05:48:49PM -0500, Wendy Cheng wrote:
> This fixes the incorrect fclose call inside nlm_traverse_files(), where
> a posix lock could still be held by an NFS client. The problem showed up
> as a kernel panic inside locks_remove_flock() (fs/locks.c:2034) during
> the fclose call, due to NFS-NLM locks still hanging on the
> inode->i_flock list.
>
> Also see: http://people.redhat.com/wcheng/Patches/NFS/NLM/001.txt

Next time it'd be best just to include the referred-to text in the
message.

> svcsubs.c | 3 +--
> 1 files changed, 1 insertion(+), 2 deletions(-)
>
> --- gfs2-nmw/fs/lockd/svcsubs.c 2007-04-10 11:59:09.000000000 -0400
> +++ linux/fs/lockd/svcsubs.c 2007-04-18 10:01:23.000000000 -0400
> @@ -250,8 +250,7 @@ nlm_traverse_files(struct nlm_host *host
> mutex_lock(&nlm_file_mutex);
> file->f_count--;
> /* No more references to this file. Let go of it. */
> - if (list_empty(&file->f_blocks) && !file->f_locks
> - && !file->f_shares && !file->f_count) {
> + if (!nlm_file_inuse(file)) {
> hlist_del(&file->f_list);
> nlmsvc_ops->fclose(file->f_file);
> kfree(file);

This just replaces the file->f_locks check by a search of the inode's
lock list.

What confuses me here is that the nlm_inspect_file() call just above
already did that search, and set file->f_locks accordingly. The only
difference is that now we've acquired the nlm_file_mutex. I don't
understand yet how that makes a difference.
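
For context, the loop around this hunk looks roughly like the following
(condensed, not verbatim):

	file->f_count++;
	mutex_unlock(&nlm_file_mutex);

	/* traverses blocks, shares, then locks; nlm_traverse_locks()
	 * rescans inode->i_flock and recomputes file->f_locks */
	if (nlm_inspect_file(host, file, match))
		ret = 1;

	mutex_lock(&nlm_file_mutex);
	file->f_count--;
	/* the check your patch changes runs here, with the mutex
	 * retaken but f_locks already computed above */
	if (list_empty(&file->f_blocks) && !file->f_locks
	 && !file->f_shares && !file->f_count) {
		...
	}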

--b.

2008-01-05 04:46:08

by Wendy Cheng

Subject: Re: [PATCH] Fix NLM reference count panic

J. Bruce Fields wrote:

>On Fri, Jan 04, 2008 at 05:48:49PM -0500, Wendy Cheng wrote:
>
>>This fixes the incorrect fclose call inside nlm_traverse_files(), where
>>a posix lock could still be held by an NFS client. The problem showed up
>>as a kernel panic inside locks_remove_flock() (fs/locks.c:2034) during
>>the fclose call, due to NFS-NLM locks still hanging on the
>>inode->i_flock list.
>>
>>Also see: http://people.redhat.com/wcheng/Patches/NFS/NLM/001.txt
>
>Next time it'd be best just to include the referred-to text in the
>message.
>

Sorry, forgot to remove it - it was for internal review purposes. I was a
little bit careless with this patch.

>
>> svcsubs.c | 3 +--
>> 1 files changed, 1 insertion(+), 2 deletions(-)
>>
>>--- gfs2-nmw/fs/lockd/svcsubs.c 2007-04-10 11:59:09.000000000 -0400
>>+++ linux/fs/lockd/svcsubs.c 2007-04-18 10:01:23.000000000 -0400
>>@@ -250,8 +250,7 @@ nlm_traverse_files(struct nlm_host *host
>> mutex_lock(&nlm_file_mutex);
>> file->f_count--;
>> /* No more references to this file. Let go of it. */
>>- if (list_empty(&file->f_blocks) && !file->f_locks
>>- && !file->f_shares && !file->f_count) {
>>+ if (!nlm_file_inuse(file)) {
>> hlist_del(&file->f_list);
>> nlmsvc_ops->fclose(file->f_file);
>> kfree(file);
>
>This just replaces the file->f_locks check by a search of the inode's
>lock list.
>
>What confuses me here is that the nlm_inspect_file() call just above
>already did that search, and set file->f_locks accordingly. The only
>difference is that now we've acquired the nlm_file_mutex. I don't
>understand yet how that makes a difference.
>
You're right. I got the patch sequence wrong. It will cause a panic only
when we selectively unlock nlm locks under my "unlock" patch, as shown
below. See how it returns without doing nlm_traverse_locks() ... Let's
combine this patch into the big unlock patch so we won't have any
confusion. The unlock patch will be submitted on Monday after this
weekend's sanity-check test run. In short, I withdraw this patch... Wendy

nlm_inspect_file(struct nlm_host *host, struct nlm_file *file,
nlm_host_match_fn_t match)
{
+ /* Cluster failover has timing constraints. There is a slight
+ * performance hit if nlm_fo_unlock_match() is implemented as
+ * a match fn (since it will be invoked for each block, share,
+ * and lock later when the lists are traversed). Instead, we
+ * add path-matching logic into the following unlikely clause.
+ * If matches, the dummy nlmsvc_fo_match will always return
+ * true.
+ */
+ dprintk("nlm_inspect_files: file=%p\n", file);
+ if (unlikely(match == nlmsvc_fo_match)) {
+ if (!nlmsvc_fo_unlock_match((void *)host, file))
+ return 0;
+ fo_printk("nlm_fo find lock file entry (0x%p)\n", file);
+ }
+
nlmsvc_traverse_blocks(host, file, match);
nlmsvc_traverse_shares(host, file, match);
return nlm_traverse_locks(host, file, match);
@@ -369,3 +451,35 @@ nlmsvc_invalidate_all(void)
*/
nlm_traverse_files(NULL, nlmsvc_is_client);
}


2008-01-05 06:05:12

by J. Bruce Fields

Subject: Re: [PATCH] Fix NLM reference count panic

On Sat, Jan 05, 2008 at 12:03:07AM -0500, Wendy Cheng wrote:
> J. Bruce Fields wrote:
>
>> On Fri, Jan 04, 2008 at 05:48:49PM -0500, Wendy Cheng wrote:
>> This just replaces the file->f_locks check by a search of the inode's
>> lock list.
>>
>> What confuses me here is that the nlm_inspect_file() call just above
>> already did that search, and set file->f_locks accordingly. The only
>> difference is that now we've acquired the nlm_file_mutex. I don't
>> understand yet how that makes a difference.
>>
>>
> You're right. I got the patch sequence wrong. It will cause a panic only
> when we selectively unlock nlm locks under my "unlock" patch, as shown
> below. See how it returns without doing nlm_traverse_locks() ... Let's
> combine this patch into the big unlock patch so we won't have any
> confusion. The unlock patch will be submitted on Monday after this
> weekend's sanity-check test run. In short, I withdraw this patch... Wendy

OK. I'll admit to still being a little confused--but with luck it'll
all make sense when I see the whole thing. Have a good weekend!

--b.

> nlm_inspect_file(struct nlm_host *host, struct nlm_file *file,
> nlm_host_match_fn_t match)
> {
> + /* Cluster failover has timing constraints. There is a slight
> + * performance hit if nlm_fo_unlock_match() is implemented as
> + * a match fn (since it will be invoked for each block, share,
> + * and lock later when the lists are traversed). Instead, we
> + * add path-matching logic into the following unlikely clause.
> + * If matches, the dummy nlmsvc_fo_match will always return
> + * true.
> + */
> + dprintk("nlm_inspect_files: file=%p\n", file);
> + if (unlikely(match == nlmsvc_fo_match)) {
> + if (!nlmsvc_fo_unlock_match((void *)host, file))
> + return 0;
> + fo_printk("nlm_fo find lock file entry (0x%p)\n", file);
> + }
> +
> nlmsvc_traverse_blocks(host, file, match);
> nlmsvc_traverse_shares(host, file, match);
> return nlm_traverse_locks(host, file, match);
> @@ -369,3 +451,35 @@ nlmsvc_invalidate_all(void)
> */
> nlm_traverse_files(NULL, nlmsvc_is_client);
> }
>

2008-01-05 17:29:21

by Wendy Cheng

Subject: Re: [PATCH] Fix NLM reference count panic

J. Bruce Fields wrote:

>On Sat, Jan 05, 2008 at 12:03:07AM -0500, Wendy Cheng wrote:
>
>>J. Bruce Fields wrote:
>>
>>>On Fri, Jan 04, 2008 at 05:48:49PM -0500, Wendy Cheng wrote:
>>>This just replaces the file->f_locks check by a search of the inode's
>>>lock list.
>>>
>>>What confuses me here is that the nlm_inspect_file() call just above
>>>already did that search, and set file->f_locks accordingly. The only
>>>difference is that now we've acquired the nlm_file_mutex. I don't
>>>understand yet how that makes a difference.
>>
>>You're right. I got the patch sequence wrong. It will cause a panic only
>>when we selectively unlock nlm locks under my "unlock" patch, as shown
>>below. See how it returns without doing nlm_traverse_locks() ... Let's
>>combine this patch into the big unlock patch so we won't have any
>>confusion. The unlock patch will be submitted on Monday after this
>>weekend's sanity-check test run. In short, I withdraw this patch... Wendy
>
>OK. I'll admit to still being a little confused--but with luck it'll
>all make sense when I see the whole thing. Have a good weekend!
>
Actually, the *existing* logic (that is, without my changes) is awkward.
It uses file->f_locks to carry the result of the unlock operation,
leaving the non-zero f_locks value hanging there if the unlock fails. It
only gets reset back to zero the next time nlm_inspect_file is called.
Code like this is a good place for subtle race conditions that could
result in file close failures. Just a complaint - I don't have solid
proof that it has caused trouble, though.
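
To spell out the pattern, nlm_traverse_locks() does roughly the
following (condensed, not verbatim):

again:
	file->f_locks = 0;
	for (fl = inode->i_flock; fl; fl = fl->fl_next) {
		if (fl->fl_lmops != &nlmsvc_lock_operations)
			continue;

		file->f_locks++;	/* count every nlm lock we pass */

		if (match((struct nlm_host *) fl->fl_owner, host)) {
			struct file_lock lock = *fl;

			lock.fl_type  = F_UNLCK;
			lock.fl_start = 0;
			lock.fl_end   = OFFSET_MAX;
			/* if the unlock fails we return right here,
			 * leaving f_locks at a stale non-zero value
			 * that nothing clears until the next rescan */
			if (vfs_lock_file(file->f_file, F_SETLK, &lock, NULL) < 0)
				return 1;
			goto again;
		}
	}
	return 0;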

Anyway, I will combine this small patch into the big unlock patch. You
have a good weekend too!

-- Wendy

2008-01-05 17:36:04

by J. Bruce Fields

Subject: Re: [PATCH] Fix NLM reference count panic

On Sat, Jan 05, 2008 at 12:46:21PM -0500, Wendy Cheng wrote:
> Actually, the *existing* logic (that is, without my changes) is awkward.
> It uses file->f_locks to carry the result of the unlock operation,
> leaving the non-zero f_locks value hanging there if the unlock fails. It
> only gets reset back to zero the next time nlm_inspect_file is called.
> Code like this is a good place for subtle race conditions that could
> result in file close failures. Just a complaint - I don't have solid
> proof that it has caused trouble, though.

Absolutely agreed. It looks very fragile to me.

--b.