2015-02-04 17:08:52

by Bruno Prémont

[permalink] [raw]
Subject: Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

On Fri, 30 January 2015 Trond Myklebust wrote:
> On Sun, 2015-01-25 at 16:55 -0500, Trond Myklebust wrote:
> > On Sun, Jan 25, 2015 at 4:06 PM, Bruno Prémont wrote:
> > > On a system running home-brown container (mntns, utsns, pidns, netns)
> > > with NFS mount-point bind-mounted into the container I hit the following
> > > trace if nfs filesystem is first umount()ed in init ns and then later
> > > umounted from container when the container exists.
> > >
> > > [51397.767310] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> > > [51397.770671] IP: [<ffffffff81828173>] rpc_new_client+0x193/0x2b0
> > > [51397.773967] PGD 0
> > > [51397.777218] Oops: 0000 [#1] SMP
> > > [51397.780490] Modules linked in:
> > > [51397.783751] CPU: 0 PID: 1711 Comm: image-starter Not tainted 3.19.0-rc2-kvm+ #7
> > > [51397.787123] Hardware name: Gigabyte Technology Co., Ltd. GA-A75M-UD2H/GA-A75M-UD2H, BIOS F6 09/28/2012
> > > [51397.790606] task: ffff8800c9fcbac0 ti: ffff8801fe754000 task.ti: ffff8801fe754000
> > > [51397.794149] RIP: 0010:[<ffffffff81828173>] [<ffffffff81828173>] rpc_new_client+0x193/0x2b0
> > > [51397.797798] RSP: 0018:ffff8801fe757908 EFLAGS: 00010246
> > > [51397.801444] RAX: 0000000000000000 RBX: ffff88009dafb240 RCX: 0000000000000180
> > > [51397.805174] RDX: 000000000000bae0 RSI: 0000000000001770 RDI: ffff88009dafb308
> > > [51397.808913] RBP: ffff8801fe757948 R08: ffff88009daf92d8 R09: ffff88009dafb458
> > > [51397.812673] R10: ffff88009dafb458 R11: ffff88020ec15bc0 R12: ffff8801fe757a40
> > > [51397.816456] R13: ffffffff81b9d800 R14: ffff8800c6e31030 R15: 0000000000000000
> > > [51397.820270] FS: 00007f335a3a1700(0000) GS:ffff88020ec00000(0000) knlGS:00000000f7287700
> > > [51397.824168] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > [51397.828066] CR2: 0000000000000008 CR3: 00000001fc54d000 CR4: 00000000000007f0
> > > [51397.832017] Stack:
> > > [51397.835924] 000000000000000a ffffffff81b9d770 ffffffff81826450 ffff8801fe757a40
> > > [51397.840023] ffff8801fe757a40 ffff8800cf08d500 ffffffff81826450 ffffffff820f4728
> > > [51397.844130] ffff8801fe757978 ffffffff81828815 ffff8801fe757978 ffffffff8182aad8
> > > [51397.848224] Call Trace:
> > > [51397.852221] [<ffffffff81826450>] ? call_start+0x20/0x20
> > > [51397.856273] [<ffffffff81826450>] ? call_start+0x20/0x20
> > > [51397.860295] [<ffffffff81828815>] rpc_create_xprt+0x15/0xb0
> > > [51397.864324] [<ffffffff8182aad8>] ? xprt_create_transport+0x108/0x1b0
> > > [51397.868428] [<ffffffff81828971>] rpc_create+0xc1/0x190
> > > [51397.872574] [<ffffffff81111c86>] ? internal_add_timer+0x66/0x80
> > > [51397.876733] [<ffffffff81113a99>] ? mod_timer+0x109/0x1e0
> > > [51397.880877] [<ffffffff8183a19e>] rpcb_create+0x6e/0x90
> > > [51397.884999] [<ffffffff8183a71a>] rpcb_getport_async+0x15a/0x330
> > > [51397.889118] [<ffffffff8182f1da>] ? rpc_malloc+0x3a/0x70
> > > [51397.893240] [<ffffffff811af8d2>] ? __kmalloc+0xc2/0x170
> > > [51397.897354] [<ffffffff81826830>] ? call_reserveresult+0x110/0x110
> > > [51397.901490] [<ffffffff81826450>] ? call_start+0x20/0x20
> > > [51397.905606] [<ffffffff81826450>] ? call_start+0x20/0x20
> > > [51397.909662] [<ffffffff8182648e>] call_bind+0x3e/0x40
> > > [51397.913709] [<ffffffff8182fa99>] __rpc_execute+0x79/0x330
> > > [51397.917778] [<ffffffff818327bd>] rpc_execute+0x5d/0xa0
> > > [51397.921871] [<ffffffff818286cb>] rpc_run_task+0x6b/0x90
> > > [51397.925989] [<ffffffff8182872e>] rpc_call_sync+0x3e/0xa0
> > > [51397.930108] [<ffffffff8127fe29>] nsm_mon_unmon+0xb9/0xd0
> > > [51397.934191] [<ffffffff8110e2a0>] ? call_rcu_bh+0x20/0x20
> > > [51397.938235] [<ffffffff8128018c>] nsm_unmonitor+0x8c/0x140
> > > [51397.942309] [<ffffffff8127bc43>] nlm_destroy_host_locked+0x63/0xa0
> > > [51397.946442] [<ffffffff8127c03c>] nlmclnt_release_host+0x7c/0x130
> > > [51397.950591] [<ffffffff81279645>] nlmclnt_done+0x15/0x30
> > > [51397.954773] [<ffffffff81241862>] nfs_destroy_server+0x12/0x20
> > > [51397.958934] [<ffffffff81242372>] nfs_free_server+0x22/0xa0
> > > [51397.963053] [<ffffffff8124cadd>] nfs_kill_super+0x1d/0x30
> > > [51397.967158] [<ffffffff811c2e2c>] deactivate_locked_super+0x4c/0x70
> > > [51397.971286] [<ffffffff811c33f9>] deactivate_super+0x49/0x70
> > > [51397.975398] [<ffffffff811ddafe>] cleanup_mnt+0x3e/0x90
> > > [51397.979499] [<ffffffff811ddb9d>] __cleanup_mnt+0xd/0x10
> > > [51397.983598] [<ffffffff810e04cc>] task_work_run+0xbc/0xe0
> > > [51397.987697] [<ffffffff810c8f95>] do_exit+0x295/0xaf0
> > > [51397.991812] [<ffffffff811c2239>] ? ____fput+0x9/0x10
> > > [51397.995937] [<ffffffff810e04b4>] ? task_work_run+0xa4/0xe0
> > > [51398.000070] [<ffffffff810c986a>] do_group_exit+0x3a/0xa0
> > > [51398.004201] [<ffffffff810c98df>] SyS_exit_group+0xf/0x10
> > > [51398.008315] [<ffffffff8185e8d2>] system_call_fastpath+0x12/0x17
> > > [51398.012438] Code: 43 78 48 8d bb c8 00 00 00 48 89 7b 70 48 8b 30 e8 63 2d 01 00 c7 03 01 00 00 00 65 48 8b 04 25 00 aa 00 00 48 8b 80 c0 09 00 00 <4c> 8b 68 08 49 83 c5 45 4
> > > [51398.022378] RIP [<ffffffff81828173>] rpc_new_client+0x193/0x2b0
> > > [51398.026732] RSP <ffff8801fe757908>
> > > [51398.031025] CR2: 0000000000000008
> > > [51398.035326] ---[ end trace b701b037bc457620 ]---
> > > [51398.058223] Fixing recursive fault but reboot is needed!
> >
> > We should rather change rpcb_create() to pass the nodename from the
> > parent. The point is that the rpc_clnt->cl_nodename is used in various
> > different contexts (for instance in the AUTH_SYS credential) where it
> > isn't always appropriate to have it set to an empty string.
>
> I was rather hoping that Bruno would fix up his patch and resend, but
> since other reports of the same bug are now surfacing... Please could
> you all check if something like the following patch fixes it.

This patch works for me, so
Tested-by: Bruno Prémont <[email protected]>

Now I get just the following complaint in dmesg on shutdown:
[ 1940.173201] lockd: cannot unmonitor nfs.home
^^^^^^^^ name of NFS server

This complaint did not happen with my "empty string" name
patch.

Thanks,
Bruno


> Thanks
> Trond
>
> 8<---------------------------------------------------------------------
> From 87557df0ca2241da0edac558286650fdb081152c Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Fri, 30 Jan 2015 18:12:28 -0500
> Subject: [PATCH] SUNRPC: NULL utsname dereference on NFS umount during
> namespace cleanup
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Fix an Oopsable condition when nsm_mon_unmon is called as part of the
> namespace cleanup, which now apparently happens after the utsname
> has been freed.
>
> Link: http://lkml.kernel.org/r/[email protected]
> Reported-by: Bruno Prémont <[email protected]>
> Cc: [email protected] # 3.18
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> fs/lockd/mon.c | 13 +++++++++----
> include/linux/sunrpc/clnt.h | 3 ++-
> net/sunrpc/clnt.c | 12 +++++++-----
> net/sunrpc/rpcb_clnt.c | 8 ++++++--
> 4 files changed, 24 insertions(+), 12 deletions(-)
>
> diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
> index 1cc6ec51e6b1..47a32b6d9b90 100644
> --- a/fs/lockd/mon.c
> +++ b/fs/lockd/mon.c
> @@ -65,7 +65,7 @@ static inline struct sockaddr *nsm_addr(const struct nsm_handle *nsm)
> return (struct sockaddr *)&nsm->sm_addr;
> }
>
> -static struct rpc_clnt *nsm_create(struct net *net)
> +static struct rpc_clnt *nsm_create(struct net *net, const char *nodename)
> {
> struct sockaddr_in sin = {
> .sin_family = AF_INET,
> @@ -77,6 +77,7 @@ static struct rpc_clnt *nsm_create(struct net *net)
> .address = (struct sockaddr *)&sin,
> .addrsize = sizeof(sin),
> .servername = "rpc.statd",
> + .nodename = nodename,
> .program = &nsm_program,
> .version = NSM_VERSION,
> .authflavor = RPC_AUTH_NULL,
> @@ -102,7 +103,7 @@ out:
> return clnt;
> }
>
> -static struct rpc_clnt *nsm_client_get(struct net *net)
> +static struct rpc_clnt *nsm_client_get(struct net *net, const char *nodename)
> {
> struct rpc_clnt *clnt, *new;
> struct lockd_net *ln = net_generic(net, lockd_net_id);
> @@ -111,7 +112,7 @@ static struct rpc_clnt *nsm_client_get(struct net *net)
> if (clnt != NULL)
> goto out;
>
> - clnt = new = nsm_create(net);
> + clnt = new = nsm_create(net, nodename);
> if (IS_ERR(clnt))
> goto out;
>
> @@ -190,19 +191,23 @@ int nsm_monitor(const struct nlm_host *host)
> struct nsm_res res;
> int status;
> struct rpc_clnt *clnt;
> + const char *nodename = NULL;
>
> dprintk("lockd: nsm_monitor(%s)\n", nsm->sm_name);
>
> if (nsm->sm_monitored)
> return 0;
>
> + if (host->h_rpcclnt)
> + nodename = host->h_rpcclnt->cl_nodename;
> +
> /*
> * Choose whether to record the caller_name or IP address of
> * this peer in the local rpc.statd's database.
> */
> nsm->sm_mon_name = nsm_use_hostnames ? nsm->sm_name : nsm->sm_addrbuf;
>
> - clnt = nsm_client_get(host->net);
> + clnt = nsm_client_get(host->net, nodename);
> if (IS_ERR(clnt)) {
> status = PTR_ERR(clnt);
> dprintk("lockd: failed to create NSM upcall transport, "
> diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
> index d86acc63b25f..598ba80ec30c 100644
> --- a/include/linux/sunrpc/clnt.h
> +++ b/include/linux/sunrpc/clnt.h
> @@ -57,7 +57,7 @@ struct rpc_clnt {
> const struct rpc_timeout *cl_timeout; /* Timeout strategy */
>
> int cl_nodelen; /* nodename length */
> - char cl_nodename[UNX_MAXNODENAME];
> + char cl_nodename[UNX_MAXNODENAME+1];
> struct rpc_pipe_dir_head cl_pipedir_objects;
> struct rpc_clnt * cl_parent; /* Points to parent of clones */
> struct rpc_rtt cl_rtt_default;
> @@ -112,6 +112,7 @@ struct rpc_create_args {
> struct sockaddr *saddress;
> const struct rpc_timeout *timeout;
> const char *servername;
> + const char *nodename;
> const struct rpc_program *program;
> u32 prognumber; /* overrides program->number */
> u32 version;
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index 05da12a33945..3f5d4d48f0cb 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -286,10 +286,8 @@ static struct rpc_xprt *rpc_clnt_set_transport(struct rpc_clnt *clnt,
>
> static void rpc_clnt_set_nodename(struct rpc_clnt *clnt, const char *nodename)
> {
> - clnt->cl_nodelen = strlen(nodename);
> - if (clnt->cl_nodelen > UNX_MAXNODENAME)
> - clnt->cl_nodelen = UNX_MAXNODENAME;
> - memcpy(clnt->cl_nodename, nodename, clnt->cl_nodelen);
> + clnt->cl_nodelen = strlcpy(clnt->cl_nodename,
> + nodename, sizeof(clnt->cl_nodename));
> }
>
> static int rpc_client_register(struct rpc_clnt *clnt,
> @@ -365,6 +363,7 @@ static struct rpc_clnt * rpc_new_client(const struct rpc_create_args *args,
> const struct rpc_version *version;
> struct rpc_clnt *clnt = NULL;
> const struct rpc_timeout *timeout;
> + const char *nodename = args->nodename;
> int err;
>
> /* sanity check the name before trying to print it */
> @@ -420,8 +419,10 @@ static struct rpc_clnt * rpc_new_client(const struct rpc_create_args *args,
>
> atomic_set(&clnt->cl_count, 1);
>
> + if (nodename == NULL)
> + nodename = utsname()->nodename;
> /* save the nodename */
> - rpc_clnt_set_nodename(clnt, utsname()->nodename);
> + rpc_clnt_set_nodename(clnt, nodename);
>
> err = rpc_client_register(clnt, args->authflavor, args->client_name);
> if (err)
> @@ -576,6 +577,7 @@ static struct rpc_clnt *__rpc_clone_client(struct rpc_create_args *args,
> if (xprt == NULL)
> goto out_err;
> args->servername = xprt->servername;
> + args->nodename = clnt->cl_nodename;
>
> new = rpc_new_client(args, xprt, clnt);
> if (IS_ERR(new)) {
> diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
> index 05202012bcfc..cf5770d8f49a 100644
> --- a/net/sunrpc/rpcb_clnt.c
> +++ b/net/sunrpc/rpcb_clnt.c
> @@ -355,7 +355,8 @@ out:
> return result;
> }
>
> -static struct rpc_clnt *rpcb_create(struct net *net, const char *hostname,
> +static struct rpc_clnt *rpcb_create(struct net *net, const char *nodename,
> + const char *hostname,
> struct sockaddr *srvaddr, size_t salen,
> int proto, u32 version)
> {
> @@ -365,6 +366,7 @@ static struct rpc_clnt *rpcb_create(struct net *net, const char *hostname,
> .address = srvaddr,
> .addrsize = salen,
> .servername = hostname,
> + .nodename = nodename,
> .program = &rpcb_program,
> .version = version,
> .authflavor = RPC_AUTH_UNIX,
> @@ -740,7 +742,9 @@ void rpcb_getport_async(struct rpc_task *task)
> dprintk("RPC: %5u %s: trying rpcbind version %u\n",
> task->tk_pid, __func__, bind_version);
>
> - rpcb_clnt = rpcb_create(xprt->xprt_net, xprt->servername, sap, salen,
> + rpcb_clnt = rpcb_create(xprt->xprt_net,
> + clnt->cl_nodename,
> + xprt->servername, sap, salen,
> xprt->prot, bind_version);
> if (IS_ERR(rpcb_clnt)) {
> status = PTR_ERR(rpcb_clnt);


2015-02-04 19:06:52

by Trond Myklebust

[permalink] [raw]
Subject: Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

On Wed, Feb 4, 2015 at 12:08 PM, Bruno Prémont
<[email protected]> wrote:
>
> On Fri, 30 January 2015 Trond Myklebust wrote:
> > On Sun, 2015-01-25 at 16:55 -0500, Trond Myklebust wrote:
> > > On Sun, Jan 25, 2015 at 4:06 PM, Bruno Prémont wrote:
> > > > On a system running home-brown container (mntns, utsns, pidns, netns)
> > > > with NFS mount-point bind-mounted into the container I hit the following
> > > > trace if nfs filesystem is first umount()ed in init ns and then later
> > > > umounted from container when the container exists.
> > > >
> > > > [51397.767310] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> > > > [51397.770671] IP: [<ffffffff81828173>] rpc_new_client+0x193/0x2b0
> > > > [51397.773967] PGD 0
> > > > [51397.777218] Oops: 0000 [#1] SMP
> > > > [51397.780490] Modules linked in:
> > > > [51397.783751] CPU: 0 PID: 1711 Comm: image-starter Not tainted 3.19.0-rc2-kvm+ #7
> > > > [51397.787123] Hardware name: Gigabyte Technology Co., Ltd. GA-A75M-UD2H/GA-A75M-UD2H, BIOS F6 09/28/2012
> > > > [51397.790606] task: ffff8800c9fcbac0 ti: ffff8801fe754000 task.ti: ffff8801fe754000
> > > > [51397.794149] RIP: 0010:[<ffffffff81828173>] [<ffffffff81828173>] rpc_new_client+0x193/0x2b0
> > > > [51397.797798] RSP: 0018:ffff8801fe757908 EFLAGS: 00010246
> > > > [51397.801444] RAX: 0000000000000000 RBX: ffff88009dafb240 RCX: 0000000000000180
> > > > [51397.805174] RDX: 000000000000bae0 RSI: 0000000000001770 RDI: ffff88009dafb308
> > > > [51397.808913] RBP: ffff8801fe757948 R08: ffff88009daf92d8 R09: ffff88009dafb458
> > > > [51397.812673] R10: ffff88009dafb458 R11: ffff88020ec15bc0 R12: ffff8801fe757a40
> > > > [51397.816456] R13: ffffffff81b9d800 R14: ffff8800c6e31030 R15: 0000000000000000
> > > > [51397.820270] FS: 00007f335a3a1700(0000) GS:ffff88020ec00000(0000) knlGS:00000000f7287700
> > > > [51397.824168] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > > [51397.828066] CR2: 0000000000000008 CR3: 00000001fc54d000 CR4: 00000000000007f0
> > > > [51397.832017] Stack:
> > > > [51397.835924] 000000000000000a ffffffff81b9d770 ffffffff81826450 ffff8801fe757a40
> > > > [51397.840023] ffff8801fe757a40 ffff8800cf08d500 ffffffff81826450 ffffffff820f4728
> > > > [51397.844130] ffff8801fe757978 ffffffff81828815 ffff8801fe757978 ffffffff8182aad8
> > > > [51397.848224] Call Trace:
> > > > [51397.852221] [<ffffffff81826450>] ? call_start+0x20/0x20
> > > > [51397.856273] [<ffffffff81826450>] ? call_start+0x20/0x20
> > > > [51397.860295] [<ffffffff81828815>] rpc_create_xprt+0x15/0xb0
> > > > [51397.864324] [<ffffffff8182aad8>] ? xprt_create_transport+0x108/0x1b0
> > > > [51397.868428] [<ffffffff81828971>] rpc_create+0xc1/0x190
> > > > [51397.872574] [<ffffffff81111c86>] ? internal_add_timer+0x66/0x80
> > > > [51397.876733] [<ffffffff81113a99>] ? mod_timer+0x109/0x1e0
> > > > [51397.880877] [<ffffffff8183a19e>] rpcb_create+0x6e/0x90
> > > > [51397.884999] [<ffffffff8183a71a>] rpcb_getport_async+0x15a/0x330
> > > > [51397.889118] [<ffffffff8182f1da>] ? rpc_malloc+0x3a/0x70
> > > > [51397.893240] [<ffffffff811af8d2>] ? __kmalloc+0xc2/0x170
> > > > [51397.897354] [<ffffffff81826830>] ? call_reserveresult+0x110/0x110
> > > > [51397.901490] [<ffffffff81826450>] ? call_start+0x20/0x20
> > > > [51397.905606] [<ffffffff81826450>] ? call_start+0x20/0x20
> > > > [51397.909662] [<ffffffff8182648e>] call_bind+0x3e/0x40
> > > > [51397.913709] [<ffffffff8182fa99>] __rpc_execute+0x79/0x330
> > > > [51397.917778] [<ffffffff818327bd>] rpc_execute+0x5d/0xa0
> > > > [51397.921871] [<ffffffff818286cb>] rpc_run_task+0x6b/0x90
> > > > [51397.925989] [<ffffffff8182872e>] rpc_call_sync+0x3e/0xa0
> > > > [51397.930108] [<ffffffff8127fe29>] nsm_mon_unmon+0xb9/0xd0
> > > > [51397.934191] [<ffffffff8110e2a0>] ? call_rcu_bh+0x20/0x20
> > > > [51397.938235] [<ffffffff8128018c>] nsm_unmonitor+0x8c/0x140
> > > > [51397.942309] [<ffffffff8127bc43>] nlm_destroy_host_locked+0x63/0xa0
> > > > [51397.946442] [<ffffffff8127c03c>] nlmclnt_release_host+0x7c/0x130
> > > > [51397.950591] [<ffffffff81279645>] nlmclnt_done+0x15/0x30
> > > > [51397.954773] [<ffffffff81241862>] nfs_destroy_server+0x12/0x20
> > > > [51397.958934] [<ffffffff81242372>] nfs_free_server+0x22/0xa0
> > > > [51397.963053] [<ffffffff8124cadd>] nfs_kill_super+0x1d/0x30
> > > > [51397.967158] [<ffffffff811c2e2c>] deactivate_locked_super+0x4c/0x70
> > > > [51397.971286] [<ffffffff811c33f9>] deactivate_super+0x49/0x70
> > > > [51397.975398] [<ffffffff811ddafe>] cleanup_mnt+0x3e/0x90
> > > > [51397.979499] [<ffffffff811ddb9d>] __cleanup_mnt+0xd/0x10
> > > > [51397.983598] [<ffffffff810e04cc>] task_work_run+0xbc/0xe0
> > > > [51397.987697] [<ffffffff810c8f95>] do_exit+0x295/0xaf0
> > > > [51397.991812] [<ffffffff811c2239>] ? ____fput+0x9/0x10
> > > > [51397.995937] [<ffffffff810e04b4>] ? task_work_run+0xa4/0xe0
> > > > [51398.000070] [<ffffffff810c986a>] do_group_exit+0x3a/0xa0
> > > > [51398.004201] [<ffffffff810c98df>] SyS_exit_group+0xf/0x10
> > > > [51398.008315] [<ffffffff8185e8d2>] system_call_fastpath+0x12/0x17
> > > > [51398.012438] Code: 43 78 48 8d bb c8 00 00 00 48 89 7b 70 48 8b 30 e8 63 2d 01 00 c7 03 01 00 00 00 65 48 8b 04 25 00 aa 00 00 48 8b 80 c0 09 00 00 <4c> 8b 68 08 49 83 c5 45 4
> > > > [51398.022378] RIP [<ffffffff81828173>] rpc_new_client+0x193/0x2b0
> > > > [51398.026732] RSP <ffff8801fe757908>
> > > > [51398.031025] CR2: 0000000000000008
> > > > [51398.035326] ---[ end trace b701b037bc457620 ]---
> > > > [51398.058223] Fixing recursive fault but reboot is needed!
> > >
> > > We should rather change rpcb_create() to pass the nodename from the
> > > parent. The point is that the rpc_clnt->cl_nodename is used in various
> > > different contexts (for instance in the AUTH_SYS credential) where it
> > > isn't always appropriate to have it set to an empty string.
> >
> > I was rather hoping that Bruno would fix up his patch and resend, but
> > since other reports of the same bug are now surfacing... Please could
> > you all check if something like the following patch fixes it.
>
> This patch works for me, so
> Tested-by: Bruno Prémont <[email protected]>
>
> Now I get just the following complaint in dmesg on shutdown:
> [ 1940.173201] lockd: cannot unmonitor nfs.home
> ^^^^^^^^ name of NFS server
>
> This complaint did not happen with my "empty string" name
> patch.

Are there any clues from rpc.statd in your syslog that might help to
explain the error?

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]

2015-02-04 21:52:37

by Bruno Prémont

[permalink] [raw]
Subject: Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

On Wed, 4 Feb 2015 14:06:48 Trond Myklebust wrote:
> On Wed, Feb 4, 2015 at 12:08 PM, Bruno Prémont wrote:
> > On Fri, 30 January 2015 Trond Myklebust wrote:
> > > On Sun, 2015-01-25 at 16:55 -0500, Trond Myklebust wrote:
> > > > On Sun, Jan 25, 2015 at 4:06 PM, Bruno Prémont wrote:
> > > > > On a system running home-brown container (mntns, utsns,
> > > > > pidns, netns) with NFS mount-point bind-mounted into the
> > > > > container I hit the following trace if nfs filesystem is
> > > > > first umount()ed in init ns and then later umounted from
> > > > > container when the container exists.
> > > > >
> > > > > <snip trace>
> > > >
> > > > We should rather change rpcb_create() to pass the nodename from
> > > > the parent. The point is that the rpc_clnt->cl_nodename is used
> > > > in various different contexts (for instance in the AUTH_SYS
> > > > credential) where it isn't always appropriate to have it set to
> > > > an empty string.
> > >
> > > I was rather hoping that Bruno would fix up his patch and resend,
> > > but since other reports of the same bug are now surfacing...
> > > Please could you all check if something like the following patch
> > > fixes it.
> >
> > This patch works for me, so
> > Tested-by: Bruno Prémont <[email protected]>
> >
> > Now I get just the following complaint in dmesg on shutdown:
> > [ 1940.173201] lockd: cannot unmonitor nfs.home
> > ^^^^^^^^ name of NFS
> > server
> >
> > This complaint did not happen with my "empty string" name
> > patch.
>
> Are there any clues from rpc.statd in your syslog that might help to
> explain the error?

I would say there is no more running rpc.statd at that time.

My shutdown process looks about as follows (it's not necessarily the
optimal ordering):

- umount nfs mountpoints from root mntns
- stop nfsclient
- stop rpc.statd
- stop rpcbind
- stop containers (with bind-mounted nfs mountpoints from root mntns
that get implicitly umounted on mntns release)

Bruno

2015-02-04 22:29:56

by Trond Myklebust

[permalink] [raw]
Subject: Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

On Wed, Feb 4, 2015 at 4:52 PM, Bruno Prémont <[email protected]> wrote:
> On Wed, 4 Feb 2015 14:06:48 Trond Myklebust wrote:
>> On Wed, Feb 4, 2015 at 12:08 PM, Bruno Prémont wrote:
>> > On Fri, 30 January 2015 Trond Myklebust wrote:
>> > > On Sun, 2015-01-25 at 16:55 -0500, Trond Myklebust wrote:
>> > > > On Sun, Jan 25, 2015 at 4:06 PM, Bruno Prémont wrote:
>> > > > > On a system running home-brown container (mntns, utsns,
>> > > > > pidns, netns) with NFS mount-point bind-mounted into the
>> > > > > container I hit the following trace if nfs filesystem is
>> > > > > first umount()ed in init ns and then later umounted from
>> > > > > container when the container exists.
>> > > > >
>> > > > > <snip trace>
>> > > >
>> > > > We should rather change rpcb_create() to pass the nodename from
>> > > > the parent. The point is that the rpc_clnt->cl_nodename is used
>> > > > in various different contexts (for instance in the AUTH_SYS
>> > > > credential) where it isn't always appropriate to have it set to
>> > > > an empty string.
>> > >
>> > > I was rather hoping that Bruno would fix up his patch and resend,
>> > > but since other reports of the same bug are now surfacing...
>> > > Please could you all check if something like the following patch
>> > > fixes it.
>> >
>> > This patch works for me, so
>> > Tested-by: Bruno Prémont <[email protected]>
>> >
>> > Now I get just the following complaint in dmesg on shutdown:
>> > [ 1940.173201] lockd: cannot unmonitor nfs.home
>> > ^^^^^^^^ name of NFS
>> > server
>> >
>> > This complaint did not happen with my "empty string" name
>> > patch.
>>
>> Are there any clues from rpc.statd in your syslog that might help to
>> explain the error?
>
> I would say there is no more running rpc.statd at that time.
>
> My shutdown process looks about as follows (it's not necessarily the
> optimal ordering):
>
> - umount nfs mountpoints from root mntns
> - stop nfsclient
> - stop rpc.statd
> - stop rpcbind
> - stop containers (with bind-mounted nfs mountpoints from root mntns
> that get implicitly umounted on mntns release)
>

In that case, the above message can be taken as a likely good
indication that the kernel is working as expected. If there is no
rpc.statd to talk to, then the unmonitor RPC call is expected to fail.

--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]