2022-12-22 06:38:41

by Wang Yugui

[permalink] [raw]
Subject: a nfsd_file_free panic when shudown

Hi,

a nfsd_file_free panic when shudown.

This is a kernel 6.1.1 with some nfsd 6.2 pathes.
It happened when os shutdown.

but the reproducer is yet not clear.
we just gather the info of the attachment file.

the path list that applied to kernel 6.1.1.
Subject: nfsd: don't call nfsd_file_put from client states seqfile display
Subject: NFSD: Pass the target nfsd_file to nfsd_commit()
Subject: NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file
Subject: NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection
Subject: nfsd: fix up the filecache laundrette scheduling
Subject: NFSD: Flesh out a documenting comment for filecache.c
Subject: NFSD: Clean up nfs4_preprocess_stateid_op() call sites
Subject: NFSD: Trace stateids returned via DELEGRETURN
Subject: NFSD: Trace delegation revocations
Subject: NFSD: Use const pointers as parameters to fh_ helpers
Subject: NFSD: Update file_hashtbl() helpers
Subject: NFSD: Clean up nfsd4_init_file()
Subject: NFSD: Add a nfsd4_file_hash_remove() helper
Subject: NFSD: Clean up find_or_add_file()
Subject: NFSD: Refactor find_file()
Subject: NFSD: Use rhashtable for managing nfs4_file objects
Subject: NFSD: Fix licensing header in filecache.c
Subject: nfsd: remove the pages_flushed statistic from filecache
Subject: nfsd: reorganize filecache.c
Subject: NFSD: Add an nfsd_file_fsync tracepoint
Subject: nfsd: rework refcounting in filecache
Subject: nfsd: under NFSv4.1, fix double svc_xprt_put on rpc_create failure
Subject: NFSD: fix use-after-free in __nfs42_ssc_open()


It happened just after 'Subject: nfsd: rework refcounting in filecache'
is added, so this patch maybe related.

Best Regards
Wang Yugui ([email protected])
2022/12/22


Attachments:
rpviewer.png (63.90 kB)

2022-12-22 08:16:17

by Wang Yugui

[permalink] [raw]
Subject: Re: a nfsd_file_free panic when shudown

Hi,

> a nfsd_file_free panic when shudown.
>
> This is a kernel 6.1.1 with some nfsd 6.2 pathes.
> It happened when os shutdown.
>
> but the reproducer is yet not clear.
> we just gather the info of the attachment file.

Now I can reproduce it.
1) 'tail -f xxx' to keep a file is open from nfs4 client
2) 'shutdow -r now' the nfs server.

more panic info in the attachment files (panic-2.txt/panic-3.txt)

Best Regards
Wang Yugui ([email protected])
2022/12/22

>
> the path list that applied to kernel 6.1.1.
> Subject: nfsd: don't call nfsd_file_put from client states seqfile display
> Subject: NFSD: Pass the target nfsd_file to nfsd_commit()
> Subject: NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file
> Subject: NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection
> Subject: nfsd: fix up the filecache laundrette scheduling
> Subject: NFSD: Flesh out a documenting comment for filecache.c
> Subject: NFSD: Clean up nfs4_preprocess_stateid_op() call sites
> Subject: NFSD: Trace stateids returned via DELEGRETURN
> Subject: NFSD: Trace delegation revocations
> Subject: NFSD: Use const pointers as parameters to fh_ helpers
> Subject: NFSD: Update file_hashtbl() helpers
> Subject: NFSD: Clean up nfsd4_init_file()
> Subject: NFSD: Add a nfsd4_file_hash_remove() helper
> Subject: NFSD: Clean up find_or_add_file()
> Subject: NFSD: Refactor find_file()
> Subject: NFSD: Use rhashtable for managing nfs4_file objects
> Subject: NFSD: Fix licensing header in filecache.c
> Subject: nfsd: remove the pages_flushed statistic from filecache
> Subject: nfsd: reorganize filecache.c
> Subject: NFSD: Add an nfsd_file_fsync tracepoint
> Subject: nfsd: rework refcounting in filecache
> Subject: nfsd: under NFSv4.1, fix double svc_xprt_put on rpc_create failure
> Subject: NFSD: fix use-after-free in __nfs42_ssc_open()
>
>
> It happened just after 'Subject: nfsd: rework refcounting in filecache'
> is added, so this patch maybe related.
>
> Best Regards
> Wang Yugui ([email protected])
> 2022/12/22

--
Wang Yugui <[email protected]>


Attachments:
panic-2.txt (10.76 kB)
panic-3.txt (9.97 kB)
Download all attachments

2022-12-22 09:43:52

by Wang Yugui

[permalink] [raw]
Subject: Re: a nfsd_file_free panic when shudown

Hi, Jeff Layton

> Hi,
> > a nfsd_file_free panic when shudown.
> >
> > This is a kernel 6.1.1 with some nfsd 6.2 pathes.
> > It happened when os shutdown.
> >
> > but the reproducer is yet not clear.
> > we just gather the info of the attachment file.
>
> Now I can reproduce it.
> 1) 'tail -f xxx' to keep a file is open from nfs4 client
> 2) 'shutdow -r now' the nfs server.
>
> more panic info in the attachment files (panic-2.txt/panic-3.txt)
>
> Best Regards
> Wang Yugui ([email protected])
> 2022/12/22
>
> >
> > the path list that applied to kernel 6.1.1.
> > Subject: nfsd: don't call nfsd_file_put from client states seqfile display
> > Subject: NFSD: Pass the target nfsd_file to nfsd_commit()
> > Subject: NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file
> > Subject: NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection
> > Subject: nfsd: fix up the filecache laundrette scheduling
> > Subject: NFSD: Flesh out a documenting comment for filecache.c
> > Subject: NFSD: Clean up nfs4_preprocess_stateid_op() call sites
> > Subject: NFSD: Trace stateids returned via DELEGRETURN
> > Subject: NFSD: Trace delegation revocations
> > Subject: NFSD: Use const pointers as parameters to fh_ helpers
> > Subject: NFSD: Update file_hashtbl() helpers
> > Subject: NFSD: Clean up nfsd4_init_file()
> > Subject: NFSD: Add a nfsd4_file_hash_remove() helper
> > Subject: NFSD: Clean up find_or_add_file()
> > Subject: NFSD: Refactor find_file()
> > Subject: NFSD: Use rhashtable for managing nfs4_file objects
> > Subject: NFSD: Fix licensing header in filecache.c
> > Subject: nfsd: remove the pages_flushed statistic from filecache
> > Subject: nfsd: reorganize filecache.c
> > Subject: NFSD: Add an nfsd_file_fsync tracepoint
> > Subject: nfsd: rework refcounting in filecache
> > Subject: nfsd: under NFSv4.1, fix double svc_xprt_put on rpc_create failure
> > Subject: NFSD: fix use-after-free in __nfs42_ssc_open()
> >
> >
> > It happened just after 'Subject: nfsd: rework refcounting in filecache'
> > is added, so this patch maybe related.

It is confirmed that this panic is caused by the patch
'nfsd: rework refcounting in filecache'.

the problem is 100% reproduced when this patch is applied.
the problem is yet not reproduced when this patch is not applied.

Best Regards
Wang Yugui ([email protected])
2022/12/22


2022-12-22 11:24:20

by Jeff Layton

[permalink] [raw]
Subject: Re: a nfsd_file_free panic when shudown

On Thu, 2022-12-22 at 17:38 +0800, Wang Yugui wrote:
> Hi, Jeff Layton
>
> > Hi,
> > > a nfsd_file_free panic when shudown.
> > >
> > > This is a kernel 6.1.1 with some nfsd 6.2 pathes.
> > > It happened when os shutdown.
> > >
> > > but the reproducer is yet not clear.
> > > we just gather the info of the attachment file.
> >
> > Now I can reproduce it.
> > 1) 'tail -f xxx' to keep a file is open from nfs4 client
> > 2) 'shutdow -r now' the nfs server.
> >
> > more panic info in the attachment files (panic-2.txt/panic-3.txt)
> >
> > Best Regards
> > Wang Yugui ([email protected])
> > 2022/12/22
> >
> > >
> > > the path list that applied to kernel 6.1.1.
> > > Subject: nfsd: don't call nfsd_file_put from client states seqfile display
> > > Subject: NFSD: Pass the target nfsd_file to nfsd_commit()
> > > Subject: NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file
> > > Subject: NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection
> > > Subject: nfsd: fix up the filecache laundrette scheduling
> > > Subject: NFSD: Flesh out a documenting comment for filecache.c
> > > Subject: NFSD: Clean up nfs4_preprocess_stateid_op() call sites
> > > Subject: NFSD: Trace stateids returned via DELEGRETURN
> > > Subject: NFSD: Trace delegation revocations
> > > Subject: NFSD: Use const pointers as parameters to fh_ helpers
> > > Subject: NFSD: Update file_hashtbl() helpers
> > > Subject: NFSD: Clean up nfsd4_init_file()
> > > Subject: NFSD: Add a nfsd4_file_hash_remove() helper
> > > Subject: NFSD: Clean up find_or_add_file()
> > > Subject: NFSD: Refactor find_file()
> > > Subject: NFSD: Use rhashtable for managing nfs4_file objects
> > > Subject: NFSD: Fix licensing header in filecache.c
> > > Subject: nfsd: remove the pages_flushed statistic from filecache
> > > Subject: nfsd: reorganize filecache.c
> > > Subject: NFSD: Add an nfsd_file_fsync tracepoint
> > > Subject: nfsd: rework refcounting in filecache
> > > Subject: nfsd: under NFSv4.1, fix double svc_xprt_put on rpc_create failure
> > > Subject: NFSD: fix use-after-free in __nfs42_ssc_open()
> > >
> > >
> > > It happened just after 'Subject: nfsd: rework refcounting in filecache'
> > > is added, so this patch maybe related.
>
> It is confirmed that this panic is caused by the patch
> 'nfsd: rework refcounting in filecache'.
>
> the problem is 100% reproduced when this patch is applied.
> the problem is yet not reproduced when this patch is not applied.
>
> Best Regards
> Wang Yugui ([email protected])
> 2022/12/22
>

Thanks for the bug report! This patch seems to fix the problem for me.
Can you test it and let me know whether it also fixes it for you? If so,
I'll send this to Chuck and the list for inclusion soon.

Thanks!
--?
Jeff Layton <[email protected]>

--------------------8<--------------------------

[PATCH] nfsd: shut down the NFSv4 state objects before the filecache

Currently, we shut down the filecache before trying to clean up the
stateids that depend on it, which can lead to an oops at shutdown time
due to a refcount imbalance. Change the shutdown procedures to bring
down the state handling infrastructure prior to shutting down the
filecache.

Reported-by: Wang Yugui <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
---
fs/nfsd/nfssvc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 56fba1cba3af..325d3d3f1211 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -453,8 +453,8 @@ static void nfsd_shutdown_net(struct net *net)
{
struct nfsd_net *nn = net_generic(net, nfsd_net_id);

- nfsd_file_cache_shutdown_net(net);
nfs4_state_shutdown_net(net);
+ nfsd_file_cache_shutdown_net(net);
if (nn->lockd_up) {
lockd_down(net);
nn->lockd_up = false;
--
2.38.1


2022-12-22 13:17:30

by Wang Yugui

[permalink] [raw]
Subject: Re: a nfsd_file_free panic when shudown

Hi,


> On Thu, 2022-12-22 at 17:38 +0800, Wang Yugui wrote:
> > Hi, Jeff Layton
> >
> > > Hi,
> > > > a nfsd_file_free panic when shudown.
> > > >
> > > > This is a kernel 6.1.1 with some nfsd 6.2 pathes.
> > > > It happened when os shutdown.
> > > >
> > > > but the reproducer is yet not clear.
> > > > we just gather the info of the attachment file.
> > >
> > > Now I can reproduce it.
> > > 1) 'tail -f xxx' to keep a file is open from nfs4 client
> > > 2) 'shutdow -r now' the nfs server.
> > >
> > > more panic info in the attachment files (panic-2.txt/panic-3.txt)
> > >
> > > >
> > > > It happened just after 'Subject: nfsd: rework refcounting in filecache'
> > > > is added, so this patch maybe related.
> >
> > It is confirmed that this panic is caused by the patch
> > 'nfsd: rework refcounting in filecache'.
> >
> > the problem is 100% reproduced when this patch is applied.
> > the problem is yet not reproduced when this patch is not applied.
> >
> > Best Regards
> > Wang Yugui ([email protected])
> > 2022/12/22
> >
>
> Thanks for the bug report! This patch seems to fix the problem for me.
> Can you test it and let me know whether it also fixes it for you? If so,
> I'll send this to Chuck and the list for inclusion soon.

This new patch fix the problem here too. Thanks a lot.

Best Regards
Wang Yugui ([email protected])
2022/12/22


>
> Thanks!
> --?
> Jeff Layton <[email protected]>
>
> --------------------8<--------------------------
>
> [PATCH] nfsd: shut down the NFSv4 state objects before the filecache
>
> Currently, we shut down the filecache before trying to clean up the
> stateids that depend on it, which can lead to an oops at shutdown time
> due to a refcount imbalance. Change the shutdown procedures to bring
> down the state handling infrastructure prior to shutting down the
> filecache.
>
> Reported-by: Wang Yugui <[email protected]>
> Signed-off-by: Jeff Layton <[email protected]>
> ---
> fs/nfsd/nfssvc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 56fba1cba3af..325d3d3f1211 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -453,8 +453,8 @@ static void nfsd_shutdown_net(struct net *net)
> {
> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>
> - nfsd_file_cache_shutdown_net(net);
> nfs4_state_shutdown_net(net);
> + nfsd_file_cache_shutdown_net(net);
> if (nn->lockd_up) {
> lockd_down(net);
> nn->lockd_up = false;
> --
> 2.38.1
>