Return-Path: linux-nfs-owner@vger.kernel.org
Received: from szxga02-in.huawei.com ([119.145.14.65]:43216 "EHLO
        szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751747Ab3LJDPm (ORCPT );
        Mon, 9 Dec 2013 22:15:42 -0500
Message-ID: <52A686A7.8060208@huawei.com>
Date: Tue, 10 Dec 2013 11:12:39 +0800
From: Weng Meiling
MIME-Version: 1.0
To: "J. Bruce Fields"
CC: , , , ,
Subject: NFSd 3.13 bug (Was "Re: [PATCH 3.4 9/9] nfsd: use the current net ns in write_threads() and write_ports()")
References: <1386136415-30976-1-git-send-email-wengmeiling.weng@huawei.com>
        <1386136415-30976-10-git-send-email-wengmeiling.weng@huawei.com>
        <20131204212532.GB19452@fieldses.org>
In-Reply-To: <20131204212532.GB19452@fieldses.org>
Content-Type: text/plain; charset="ISO-8859-1"
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

Hi guys,

When I tested NFS in a different network namespace with the 3.13-rc2
kernel, I triggered a kernel panic.

On 2013/12/5 5:25, J. Bruce Fields wrote:
> On Wed, Dec 04, 2013 at 01:53:35PM +0800, Weng Meiling wrote:
>> Upstream commit f7fb86c6e639360ad9c253cec534819ef928a674 (nfsd: use
>> "init_net" for portmapper) introduced a bug.
>>
>> Starting NFSd in a non init_net network namespace will lead to
>> NULL pointer deference. Because RPCBIND client will be NULL when register
>> RPC service with the local portmapper in svc_addsock().
>>
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
>> IP: [] call_start+0x10/0x30 [sunrpc]
>> ...
>> Pid: 27770, comm: rpc.nfsd ...
>> RIP: 0010:[] [] call_start+0x10/0x30 [sunrpc]
>> ...
>> [] __rpc_execute+0x91/0x160 [sunrpc]
>> [] rpc_execute+0x71/0x80 [sunrpc]
>> [] rpc_run_task+0x89/0xa0 [sunrpc]
>> [] rpc_call_sync+0x3d/0x70 [sunrpc]
>> [] rpcb_register+0xa6/0xd0 [sunrpc]
>> [] __svc_register+0x1ae/0x1c0 [sunrpc]
>> [] ? cache_alloc_refill+0x85/0x290
>> [] svc_register+0x8f/0xc0 [sunrpc]
>> [] ? kmem_cache_alloc_trace+0xc3/0x1d0
>> [] svc_setup_socket+0x1a8/0x2c0 [sunrpc]
>> [] ? read_tsc+0x16/0x40
>> [] svc_addsock+0x118/0x1c0 [sunrpc]
>> [] ? do_gettimeofday+0x15/0x50
>> [] ? nfsd_create_serv+0xdc/0x150 [nfsd]
>> [] ? simple_strtoull+0x2c/0x50
>> [] __write_ports+0x1fe/0x230 [nfsd]
>> [] write_ports+0x37/0x60 [nfsd]
>> [] ? __write_ports+0x230/0x230 [nfsd]
>> [] nfsctl_transaction_write+0x72/0x90 [nfsd]
>> [] vfs_write+0xcb/0x130
>> [] sys_write+0x50/0x90
>>
>> Fix it by using the current's network namespace so NFSd uses the
>> consistent net ns all the time.
>
> Everything else looks like a straightforward backport, but doing this
> differently from upstream makes me nervous. Don't we also want to take
> 11f779421a39b86da8a523d97e5fd3477878d44f "nfsd: containerize NFSd
> filesystem" ? (Stanislav?)
>
> --b.
>

I backported the patch 11f779421a39b86da8a523d97e5fd3477878d44f
"nfsd: containerize NFSd filesystem" and tested it, but I hit a bug.
The bug still exists in the 3.13 kernel. Here is what I did:

The steps:

step 1: start the NFS server in the init_net net ns
    # service nfsserver start

step 2: stop the NFS server in a non init_net net ns
    # ip netns add test
    # ip netns list
    test
    # ip netns exec test service nfsserver stop

step 3: start the NFS server again in the non init_net net ns
    # ip netns exec test service nfsserver start

Step 3 triggers the kernel panic.

The reason seems to be that "ip netns exec" creates a new mount
namespace, and changes made in the new mount namespace don't propagate
to other namespaces. So when the NFS server is stopped in step 2, the
NFSD filesystem isn't unmounted. When the NFS server is restarted in
step 3, the NFSD filesystem is not remounted. As a result, the net ns of
the NFSD filesystem superblock is still init_net, and the RPCBIND client
is NULL when registering the RPC service with the local portmapper in
svc_addsock().

Do you have any ideas about this problem?
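
One way to check the explanation above is to look at the mount table
from the original shell after step 2. The following is only a minimal
sketch, assuming the NFSD filesystem is listed in /proc/mounts with
filesystem type "nfsd". The helper names fs_mounted and report_nfsd are
made up for illustration and are not part of nfs-utils or iproute2.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a filesystem of type "$1" is listed
# in the mount table "$2" (defaults to /proc/mounts of the caller).
fs_mounted() {
    # Field 3 of each /proc/mounts line is the filesystem type.
    awk -v t="$1" '$3 == t { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Hypothetical helper: print the NFSD filesystem state as seen in the
# given mount table (defaults to the caller's mount namespace).
report_nfsd() {
    if fs_mounted nfsd "${1:-/proc/mounts}"; then
        echo "nfsd filesystem: mounted"
    else
        echo "nfsd filesystem: not mounted"
    fi
}
```

If the explanation is right, calling report_nfsd in the original shell
after step 2 should still report the filesystem as mounted, because the
unmount done by the stop script only happened in the mount namespace
that "ip netns exec" created.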

The detailed call trace:

[ 497.554677] BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
[ 497.554687] IP: [] call_start+0x10/0x30 [sunrpc]
[ 497.554707] PGD 0
[ 497.554711] Oops: 0000 [#1] SMP
[ 497.554716] Modules linked in: nfsd lockd nfs_acl auth_rpcgss sunrpc oid_registry edd af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave loop dm_mod e1000e iTCO_wdt iTCO_vendor_support i2c_i801 bnx2 ipv6 lpc_ich i7core_edac edac_core acpi_cpufreq ehci_pci button ses enclosure serio_raw sg rtc_cmos mfd_core ptp hid_generic pps_core i2c_core pcspkr ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif crct10dif_common processor thermal_sys hwmon scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
[ 497.554788] CPU: 2 PID: 7837 Comm: rpc.nfsd Not tainted 3.13.0-rc2-0.1-default+ #1
[ 497.554793] Hardware name: Huawei Technologies Co., Ltd. Tecal RH2285 /BC11BTSA , BIOS CTSAV036 04/27/2011
[ 497.554800] task: ffff8800ba76e2d0 ti: ffff88043e8e8000 task.ti: ffff88043e8e8000
[ 497.554805] RIP: 0010:[] [] call_start+0x10/0x30 [sunrpc]
[ 497.554819] RSP: 0018:ffff88043e8e9aa8 EFLAGS: 00010202
[ 497.554823] RAX: ffffffffa033f4b8 RBX: ffff8800bb030040 RCX: 0000000000000034
[ 497.554828] RDX: 0000000000000000 RSI: ffff8800bb0300b0 RDI: ffff8800bb030040
[ 497.554832] RBP: ffff88043e8e9aa8 R08: 0040000000000000 R09: 0200000000000000
[ 497.554836] R10: 0000000000000000 R11: ffff8802348fe040 R12: ffff8800bb030040
[ 497.554841] R13: ffffffffa031a160 R14: 0000000000000000 R15: ffffffffa031a160
[ 497.554846] FS: 00007f2fa0536700(0000) GS:ffff88023fc40000(0000) knlGS:0000000000000000
[ 497.554851] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 497.554855] CR2: 0000000000000058 CR3: 0000000434e30000 CR4: 00000000000007e0
[ 497.554859] Stack:
[ 497.554862] ffff88043e8e9af8 ffffffffa0323f61 ffff00066c0a0100 ffff8800bb0300b0
[ 497.554871] 000000003e8e9ae8 ffff8800bb030040 ffff8800bb030040 0000000000000000
[ 497.554878] 0000000000000000 0000000000000002 ffff88043e8e9b28 ffffffffa03240ed
[ 497.554886] Call Trace:
[ 497.554902] [] __rpc_execute+0xa1/0x190 [sunrpc]
[ 497.554918] [] rpc_execute+0x9d/0xc0 [sunrpc]
[ 497.554930] [] rpc_run_task+0x89/0xa0 [sunrpc]
[ 497.554943] [] rpc_call_sync+0x3e/0xa0 [sunrpc]
[ 497.554961] [] rpcb_register_call+0x37/0x60 [sunrpc]
[ 497.554979] [] rpcb_register+0x9c/0xb0 [sunrpc]
[ 497.554996] [] __svc_register+0x1ae/0x1c0 [sunrpc]
[ 497.555012] [] svc_register+0x90/0xe0 [sunrpc]
[ 497.555029] [] svc_setup_socket+0x1e7/0x300 [sunrpc]
[ 497.555038] [] ? __getnstimeofday+0x43/0xd0
[ 497.555055] [] svc_addsock+0xca/0x1e0 [sunrpc]
[ 497.555068] [] ? nfsd_create_serv+0x111/0x180 [nfsd]
[ 497.555075] [] ? simple_strtol+0xe/0x30
[ 497.555084] [] ? get_int+0x57/0x70 [nfsd]
[ 497.555094] [] __write_ports+0x119/0x140 [nfsd]
[ 497.555103] [] write_ports+0x7a/0xb0 [nfsd]
[ 497.555112] [] ? __write_ports+0x140/0x140 [nfsd]
[ 497.555122] [] nfsctl_transaction_write+0x6a/0x80 [nfsd]
[ 497.555129] [] vfs_write+0xc7/0x1e0
[ 497.555134] [] SyS_write+0x5d/0xa0
[ 497.555142] [] system_call_fastpath+0x16/0x1b
[ 497.555146] Code: 00 00 00 01 55 48 89 e5 75 0d 48 c7 47 50 60 a1 31 a0 b8 01 00 00 00 c9 c3 66 90 48 8b 47 28 48 8b 57 18 55 83 40 20 01 48 89 e5 <48> 8b 42 58 83 40 1c 01 48 c7 47 50 f0 a1 31 a0 c9 c3 66 66 66
[ 497.555189] RIP [] call_start+0x10/0x30 [sunrpc]
[ 497.555200] RSP
[ 497.555203] CR2: 0000000000000058
[ 497.555208] ---[ end trace 34ca8d40727792e2 ]---

>>
>> Signed-off-by: Weng Meiling
>> ---
>>  fs/nfsd/nfsctl.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
>> index 1d74af2..4ff0db9 100644
>> --- a/fs/nfsd/nfsctl.c
>> +++ b/fs/nfsd/nfsctl.c
>> @@ -15,6 +15,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>
>>  #include "idmap.h"
>>  #include "nfsd.h"
>> @@ -389,7 +390,7 @@ static ssize_t write_threads(struct file *file, char *buf, size_t size)
>>  {
>>          char *mesg = buf;
>>          int rv;
>> -        struct net *net = &init_net;
>> +        struct net *net = current->nsproxy->net_ns;
>>
>>          if (size > 0) {
>>                  int newthreads;
>> @@ -857,7 +858,7 @@ static ssize_t __write_ports(struct file *file, char *buf, size_t size,
>>  static ssize_t write_ports(struct file *file, char *buf, size_t size)
>>  {
>>          ssize_t rv;
>> -        struct net *net = &init_net;
>> +        struct net *net = current->nsproxy->net_ns;
>>
>>          mutex_lock(&nfsd_mutex);
>>          rv = __write_ports(file, buf, size, net);
>> --
>> 1.8.2.2
>>
>>
>
> .
>