Return-Path: Received: from mail-qt0-f178.google.com ([209.85.216.178]:33756 "EHLO mail-qt0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753728AbdHWKdo (ORCPT ); Wed, 23 Aug 2017 06:33:44 -0400 Received: by mail-qt0-f178.google.com with SMTP id d15so5758799qta.0 for ; Wed, 23 Aug 2017 03:33:44 -0700 (PDT) Message-ID: <1503484422.5294.0.camel@redhat.com> Subject: Re: [PATCH v4] net: sunrpc: svcsock: fix NULL-pointer exception From: Jeff Layton To: Vadim Lomovtsev , trond.myklebust@primarydata.com, anna.schumaker@netapp.com, bfields@fieldses.org, davem@davemloft.net, linux-nfs@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, pabeni@redhat.com Date: Wed, 23 Aug 2017 06:33:42 -0400 In-Reply-To: <20170823102403.GF1443@dhcp187-32.khw.lab.eng.bos.redhat.com> References: <1503305779-6064-1-git-send-email-vlomovts@redhat.com> <1503314587-15420-1-git-send-email-vlomovts@redhat.com> <20170823102403.GF1443@dhcp187-32.khw.lab.eng.bos.redhat.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Sender: linux-nfs-owner@vger.kernel.org List-ID: On Wed, 2017-08-23 at 06:24 -0400, Vadim Lomovtsev wrote: > Hi all, > > Any comments on this ? > > WBR, > Vadim > > On Mon, Aug 21, 2017 at 07:23:07AM -0400, Vadim Lomovtsev wrote: > > While running nfs/connectathon tests kernel NULL-pointer exception > > has been observed due to races in svcsock.c. > > > > Race is appear when kernel accepts connection by kernel_accept > > (which creates new socket) and start queuing ingress packets > > to new socket. This happens in ksoftirq context which could run > > concurrently on a different core while new socket setup is not done yet. > > > > The fix is to re-order socket user data init sequence and add > > write/read barrier calls to be sure that we got proper values > > for callback pointers before actually calling them. > > > > Test results: nfs/connectathon reports '0' failed tests for about 200+ iterations. > > > > Crash log: > > ---<-snip->--- > > [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual address 00000000 > > [ 6708.647093] pgd = ffff0000094e0000 > > [ 6708.650497] [00000000] *pgd=0000010ffff90003, *pud=0000010ffff90003, *pmd=0000010ffff80003, *pte=0000000000000000 > > [ 6708.660761] Internal error: Oops: 86000005 [#1] SMP > > [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops > > [ 6708.736446] ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: stap_3c300909c5b3f46dcacd49aab3334af_87021] > > [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: G W OE 4.11.0-4.el7.aarch64 #1 > > [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 2017 > > [ 6708.767910] task: ffff810006842e80 task.stack: ffff81000689c000 > > [ 6708.773822] PC is at 0x0 > > [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc] > > [ 6708.781866] pc : [<0000000000000000>] lr : [] pstate: 60000145 > > [ 6708.789248] sp : ffff810ffbad3900 > > [ 6708.792551] x29: ffff810ffbad3900 x28: ffff000008c73d58 > > [ 6708.797853] x27: 0000000000000000 x26: ffff81000bbe1e00 > > [ 6708.803156] x25: 0000000000000020 x24: ffff800f7410bf28 > > [ 6708.808458] x23: ffff000008c63000 x22: ffff000008c63000 > > [ 6708.813760] x21: ffff800f7410bf28 x20: ffff81000bbe1e00 > > [ 6708.819063] x19: ffff810012412400 x18: 00000000d82a9df2 > > [ 6708.824365] x17: 0000000000000000 x16: 0000000000000000 > > [ 6708.829667] x15: 0000000000000000 x14: 0000000000000001 > > [ 6708.834969] x13: 0000000000000000 x12: 722e736f622e676e > > [ 6708.840271] x11: 00000000f814dd99 x10: 0000000000000000 > > [ 6708.845573] x9 : 7374687225000000 x8 : 0000000000000000 > > [ 6708.850875] x7 : 0000000000000000 x6 : 0000000000000000 > > [ 6708.856177] x5 : 0000000000000028 x4 : 0000000000000000 > > [ 6708.861479] x3 : 0000000000000000 x2 : 00000000e5000000 > > [ 6708.866781] x1 : 0000000000000000 x0 : ffff81000bbe1e00 > > [ 6708.872084] > > [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0xffff81000689c000) > > [ 6708.880341] Stack: (0xffff810ffbad3900 to 0xffff8100068a0000) > > [ 6708.886075] Call trace: > > [ 6708.888513] Exception stack(0xffff810ffbad3710 to 0xffff810ffbad3840) > > [ 6708.894942] 3700: ffff810012412400 0001000000000000 > > [ 6708.902759] 3720: ffff810ffbad3900 0000000000000000 0000000060000145 ffff800f79300000 > > [ 6708.910577] 3740: ffff000009274d00 00000000000003ea 0000000000000015 ffff000008c63000 > > [ 6708.918395] 3760: ffff810ffbad3830 ffff800f79300000 000000000000004d 0000000000000000 > > [ 6708.926212] 3780: ffff810ffbad3890 ffff0000080f88dc ffff800f79300000 000000000000004d > > [ 6708.934030] 37a0: ffff800f7930093c ffff000008c63000 0000000000000000 0000000000000140 > > [ 6708.941848] 37c0: ffff000008c2c000 0000000000040b00 ffff81000bbe1e00 0000000000000000 > > [ 6708.949665] 37e0: 00000000e5000000 0000000000000000 0000000000000000 0000000000000028 > > [ 6708.957483] 3800: 0000000000000000 0000000000000000 0000000000000000 7374687225000000 > > [ 6708.965300] 3820: 0000000000000000 00000000f814dd99 722e736f622e676e 0000000000000000 > > [ 6708.973117] [< (null)>] (null) > > [ 6708.977824] [] tcp_data_queue+0x754/0xc5c > > [ 6708.983386] [] tcp_rcv_established+0x1a0/0x67c > > [ 6708.989384] [] tcp_v4_do_rcv+0x15c/0x22c > > [ 6708.994858] [] tcp_v4_rcv+0xaf0/0xb58 > > [ 6709.000077] [] ip_local_deliver_finish+0x10c/0x254 > > [ 6709.006419] [] ip_local_deliver+0xf0/0xfc > > [ 6709.011980] [] ip_rcv_finish+0x208/0x3a4 > > [ 6709.017454] [] ip_rcv+0x2dc/0x3c8 > > [ 6709.022328] [] __netif_receive_skb_core+0x2f8/0xa0c > > [ 6709.028758] [] __netif_receive_skb+0x38/0x84 > > [ 6709.034580] [] netif_receive_skb_internal+0x68/0xdc > > [ 6709.041010] [] napi_gro_receive+0xcc/0x1a8 > > [ 6709.046690] [] nicvf_cq_intr_handler+0x59c/0x730 [nicvf] > > [ 6709.053559] [] nicvf_poll+0x38/0xb8 [nicvf] > > [ 6709.059295] [] net_rx_action+0x2f8/0x464 > > [ 6709.064771] [] __do_softirq+0x11c/0x308 > > [ 6709.070164] [] irq_exit+0x12c/0x174 > > [ 6709.075206] [] __handle_domain_irq+0x78/0xc4 > > [ 6709.081027] [] gic_handle_irq+0x94/0x190 > > [ 6709.086501] Exception stack(0xffff81000689fdf0 to 0xffff81000689ff20) > > [ 6709.092929] fde0: 0000810ff2ec0000 ffff000008c10000 > > [ 6709.100747] fe00: ffff000008c70ef4 0000000000000001 0000000000000000 ffff810ffbad9b18 > > [ 6709.108565] fe20: ffff810ffbad9c70 ffff8100169d3800 ffff810006843ab0 ffff81000689fe80 > > [ 6709.116382] fe40: 0000000000000bd0 0000ffffdf979cd0 183f5913da192500 0000ffff8a254ce4 > > [ 6709.124200] fe60: 0000ffff8a254b78 0000aaab10339808 0000000000000000 0000ffff8a0c2a50 > > [ 6709.132018] fe80: 0000ffffdf979b10 ffff000008d6d450 ffff000008c10000 ffff000008d6d000 > > [ 6709.139836] fea0: 0000000000000054 ffff000008cd3dbc 0000000000000000 0000000000000000 > > [ 6709.147653] fec0: 0000000000000000 0000000000000000 0000000000000000 ffff81000689ff20 > > [ 6709.155471] fee0: ffff000008085240 ffff81000689ff20 ffff000008085244 0000000060000145 > > [ 6709.163289] ff00: ffff81000689ff10 ffff00000813f1e4 ffffffffffffffff ffff00000813f238 > > [ 6709.171107] [] el1_irq+0xb4/0x140 > > [ 6709.175976] [] arch_cpu_idle+0x44/0x11c > > [ 6709.181368] [] default_idle_call+0x20/0x30 > > [ 6709.187020] [] do_idle+0x158/0x1e4 > > [ 6709.191973] [] cpu_startup_entry+0x2c/0x30 > > [ 6709.197624] [] secondary_start_kernel+0x13c/0x160 > > [ 6709.203878] [<0000000001bc71c4>] 0x1bc71c4 > > [ 6709.207967] Code: bad PC value > > [ 6709.211061] SMP: stopping secondary CPUs > > [ 6709.218830] Starting crashdump kernel... > > [ 6709.222749] Bye! > > ---<-snip>--- > > > > Signed-off-by: Vadim Lomovtsev > > Reviewed-by: Jeff Layton > > --- > > net/sunrpc/svcsock.c | 22 ++++++++++++++++++++-- > > 1 file changed, 20 insertions(+), 2 deletions(-) > > > > diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c > > index 2b720fa..e185001 100644 > > --- a/net/sunrpc/svcsock.c > > +++ b/net/sunrpc/svcsock.c > > @@ -421,6 +421,9 @@ static void svc_data_ready(struct sock *sk) > > dprintk("svc: socket %p(inet %p), busy=%d\n", > > svsk, sk, > > test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)); > > + > > + /* Refer to svc_setup_socket() for details. */ > > + rmb(); > > svsk->sk_odata(sk); > > if (!test_and_set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags)) > > svc_xprt_enqueue(&svsk->sk_xprt); > > @@ -437,6 +440,9 @@ static void svc_write_space(struct sock *sk) > > if (svsk) { > > dprintk("svc: socket %p(inet %p), write_space busy=%d\n", > > svsk, sk, test_bit(XPT_BUSY, &svsk->sk_xprt.xpt_flags)); > > + > > + /* Refer to svc_setup_socket() for details. */ > > + rmb(); > > svsk->sk_owspace(sk); > > svc_xprt_enqueue(&svsk->sk_xprt); > > } > > @@ -760,8 +766,12 @@ static void svc_tcp_listen_data_ready(struct sock *sk) > > dprintk("svc: socket %p TCP (listen) state change %d\n", > > sk, sk->sk_state); > > > > - if (svsk) > > + if (svsk) { > > + /* Refer to svc_setup_socket() for details. */ > > + rmb(); > > svsk->sk_odata(sk); > > + } > > + > > /* > > * This callback may called twice when a new connection > > * is established as a child socket inherits everything > > @@ -794,6 +804,8 @@ static void svc_tcp_state_change(struct sock *sk) > > if (!svsk) > > printk("svc: socket %p: no user data\n", sk); > > else { > > + /* Refer to svc_setup_socket() for details. */ > > + rmb(); > > svsk->sk_ostate(sk); > > if (sk->sk_state != TCP_ESTABLISHED) { > > set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags); > > @@ -1381,12 +1393,18 @@ static struct svc_sock *svc_setup_socket(struct svc_serv *serv, > > return ERR_PTR(err); > > } > > > > - inet->sk_user_data = svsk; > > svsk->sk_sock = sock; > > svsk->sk_sk = inet; > > svsk->sk_ostate = inet->sk_state_change; > > svsk->sk_odata = inet->sk_data_ready; > > svsk->sk_owspace = inet->sk_write_space; > > + /* > > + * This barrier is necessary in order to prevent race condition > > + * with svc_data_ready(), svc_listen_data_ready() and others > > + * when calling callbacks above. > > + */ > > + wmb(); > > + inet->sk_user_data = svsk; > > > > /* Initialize the socket */ > > if (sock->type == SOCK_DGRAM) > > -- > > 1.8.3.1 > > I think this one looks fine. Bruce, do you mind picking this one up? It might also be reasonable for stable... -- Jeff Layton