Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp1884835ybz; Thu, 30 Apr 2020 07:12:08 -0700 (PDT) X-Google-Smtp-Source: APiQypL8qTGZAN2LDXB/9wS6dypzns/UL0xifIX+SOnSXWWEUx9N5vWBTioF7oJDc5WpTTcbeCJu X-Received: by 2002:a17:906:4a4e:: with SMTP id a14mr2900877ejv.363.1588255928350; Thu, 30 Apr 2020 07:12:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1588255928; cv=none; d=google.com; s=arc-20160816; b=zsEDusIH5pB0F8ECbmAi/70z6hmE+dOWktwDprHvGXiKW4tf13n8nOdITRFJzZXCdh avwuMqIARGso4jqTT4xvwlRYAFMXF9dRlrJilULxroFAJOt/76eokPMN1u+BxYN6WEDT YodTuB9UOc2knrhFow+/toRcv/gaEMhM60lOkMq2MWXj4D5/Of0CqdAh/1Na6CFiB3+5 EPGsBij8omNqZgmZvs2sVCBTGbbCsfuL4hdII8PZ4d00cSV6GNa643pOIWLgfGOlNjiW fuPtUzYDRdDq0wyA4D4OoEDaKtBSeFEQabGHP3RY9jxyo/9Q5KD78O7qc/p2ToJI3Dpd TpAA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=tR6dKM9dZEypBRDwx4WLpNex8jHnVkU+6lkZbofGtN4=; b=udEyY7GfUh3k9fI4v94JRqtenGn69li1g+KhHIO2Fr3cIX5jkXyuha5gLi+/fUajYJ my0yrx6kFhu4mvwrESCCvJx0s2Yt1b0lJPTRkMUx0fnPlLtsaQCBJLPMQ8OWHR/xoqOj KNMAa/RJMpUKUcdQHmcdhcp44nUxWYFHhqWnFkF6IjT0j5bM0qm85HMsBlI++PB04Kuw bQsd6zKKJG0wrU7eXqbzGN/pwIlwu5HVNE7cJdC2F+4TJjErSvCP5t67S/sSZs+Pu4Vv 1U1u///psIZVKnBPLMoLmujEnoTausUk6uA+J5kw/Oft1RvslgyqiUKJfo8QXiIvf9sA ZdEw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=iImgsOB2; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id u16si5750933edd.86.2020.04.30.07.11.41; Thu, 30 Apr 2020 07:12:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=iImgsOB2; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728084AbgD3OLX (ORCPT + 99 others); Thu, 30 Apr 2020 10:11:23 -0400 Received: from mail.kernel.org ([198.145.29.99]:59640 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728038AbgD3NvZ (ORCPT ); Thu, 30 Apr 2020 09:51:25 -0400 Received: from sasha-vm.mshome.net (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 8570C20774; Thu, 30 Apr 2020 13:51:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1588254684; bh=ZvFIe6HkwNls+CEZHNHEL9ixoiOWbo6LZBpF7nEzXUU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iImgsOB2IOXbxZNgKPqWpBIrI3p9VVm6jKs2y3yvoon6pHbNePLbLn0p9wNw9kA2C 9pWksbHYcmMYBLHVuSIZ10x4VZU8Pl9d4H7+ieHoGztfEg8HXKgX7NdQogeo5FMFXj y3+4g9AC0393kfiYGNL6O1tXm0eyRPKjBHUmbNHE= From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Chuck Lever , Sasha Levin , linux-nfs@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH AUTOSEL 5.6 35/79] svcrdma: Fix trace point use-after-free race Date: Thu, 30 Apr 2020 09:49:59 -0400 Message-Id: <20200430135043.19851-35-sashal@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200430135043.19851-1-sashal@kernel.org> References: <20200430135043.19851-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org From: Chuck Lever [ Upstream commit e28b4fc652c1830796a4d3e09565f30c20f9a2cf ] I hit this while testing nfsd-5.7 with kernel memory debugging enabled on my server: Mar 30 13:21:45 klimt kernel: BUG: unable to handle page fault for address: ffff8887e6c279a8 Mar 30 13:21:45 klimt kernel: #PF: supervisor read access in kernel mode Mar 30 13:21:45 klimt kernel: #PF: error_code(0x0000) - not-present page Mar 30 13:21:45 klimt kernel: PGD 3601067 P4D 3601067 PUD 87c519067 PMD 87c3e2067 PTE 800ffff8193d8060 Mar 30 13:21:45 klimt kernel: Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI Mar 30 13:21:45 klimt kernel: CPU: 2 PID: 1933 Comm: nfsd Not tainted 5.6.0-rc6-00040-g881e87a3c6f9 #1591 Mar 30 13:21:45 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015 Mar 30 13:21:45 klimt kernel: RIP: 0010:svc_rdma_post_chunk_ctxt+0xab/0x284 [rpcrdma] Mar 30 13:21:45 klimt kernel: Code: c1 83 34 02 00 00 29 d0 85 c0 7e 72 48 8b bb a0 02 00 00 48 8d 54 24 08 4c 89 e6 48 8b 07 48 8b 40 20 e8 5a 5c 2b e1 41 89 c6 <8b> 45 20 89 44 24 04 8b 05 02 e9 01 00 85 c0 7e 33 e9 5e 01 00 00 Mar 30 13:21:45 klimt kernel: RSP: 0018:ffffc90000dfbdd8 EFLAGS: 00010286 Mar 30 13:21:45 klimt kernel: RAX: 0000000000000000 RBX: ffff8887db8db400 RCX: 0000000000000030 Mar 30 13:21:45 klimt kernel: RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000246 Mar 30 13:21:45 klimt kernel: RBP: ffff8887e6c27988 R08: 0000000000000000 R09: 0000000000000004 Mar 30 13:21:45 klimt kernel: R10: ffffc90000dfbdd8 R11: 00c068ef00000000 R12: ffff8887eb4e4a80 Mar 30 13:21:45 klimt kernel: R13: ffff8887db8db634 R14: 0000000000000000 R15: ffff8887fc931000 Mar 30 13:21:45 klimt kernel: FS: 0000000000000000(0000) GS:ffff88885bd00000(0000) knlGS:0000000000000000 Mar 30 13:21:45 klimt kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Mar 30 13:21:45 klimt kernel: CR2: ffff8887e6c279a8 CR3: 000000081b72e002 CR4: 00000000001606e0 Mar 30 13:21:45 klimt kernel: Call Trace: Mar 30 13:21:45 klimt kernel: ? svc_rdma_vec_to_sg+0x7f/0x7f [rpcrdma] Mar 30 13:21:45 klimt kernel: svc_rdma_send_write_chunk+0x59/0xce [rpcrdma] Mar 30 13:21:45 klimt kernel: svc_rdma_sendto+0xf9/0x3ae [rpcrdma] Mar 30 13:21:45 klimt kernel: ? nfsd_destroy+0x51/0x51 [nfsd] Mar 30 13:21:45 klimt kernel: svc_send+0x105/0x1e3 [sunrpc] Mar 30 13:21:45 klimt kernel: nfsd+0xf2/0x149 [nfsd] Mar 30 13:21:45 klimt kernel: kthread+0xf6/0xfb Mar 30 13:21:45 klimt kernel: ? kthread_queue_delayed_work+0x74/0x74 Mar 30 13:21:45 klimt kernel: ret_from_fork+0x3a/0x50 Mar 30 13:21:45 klimt kernel: Modules linked in: ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue ib_umad ib_ipoib mlx4_ib sb_edac x86_pkg_temp_thermal iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel glue_helper crypto_simd cryptd pcspkr rpcrdma i2c_i801 rdma_ucm lpc_ich mfd_core ib_iser rdma_cm iw_cm ib_cm mei_me raid0 libiscsi mei sg scsi_transport_iscsi ioatdma wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter nfsd nfs_acl lockd auth_rpcgss grace sunrpc ip_tables xfs libcrc32c mlx4_en sd_mod sr_mod cdrom mlx4_core crc32c_intel igb nvme i2c_algo_bit ahci i2c_core libahci nvme_core dca libata t10_pi qedr dm_mirror dm_region_hash dm_log dm_mod dax qede qed crc8 ib_uverbs ib_core Mar 30 13:21:45 klimt kernel: CR2: ffff8887e6c279a8 Mar 30 13:21:45 klimt kernel: ---[ end trace 87971d2ad3429424 ]--- It's absolutely not safe to use resources pointed to by the @send_wr argument of ib_post_send() _after_ that function returns. Those resources are typically freed by the Send completion handler, which can run before ib_post_send() returns. Thus the trace points currently around ib_post_send() in the server's RPC/RDMA transport are a hazard, even when they are disabled. Rearrange them so that they touch the Work Request only _before_ ib_post_send() is invoked. Fixes: bd2abef33394 ("svcrdma: Trace key RDMA API events") Fixes: 4201c7464753 ("svcrdma: Introduce svc_rdma_send_ctxt") Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin --- include/trace/events/rpcrdma.h | 50 +++++++++++++++++++-------- net/sunrpc/xprtrdma/svc_rdma_rw.c | 3 +- net/sunrpc/xprtrdma/svc_rdma_sendto.c | 16 +++++---- 3 files changed, 46 insertions(+), 23 deletions(-) diff --git a/include/trace/events/rpcrdma.h b/include/trace/events/rpcrdma.h index c0e4c93324f57..fa14adf242353 100644 --- a/include/trace/events/rpcrdma.h +++ b/include/trace/events/rpcrdma.h @@ -1699,17 +1699,15 @@ DECLARE_EVENT_CLASS(svcrdma_sendcomp_event, TRACE_EVENT(svcrdma_post_send, TP_PROTO( - const struct ib_send_wr *wr, - int status + const struct ib_send_wr *wr ), - TP_ARGS(wr, status), + TP_ARGS(wr), TP_STRUCT__entry( __field(const void *, cqe) __field(unsigned int, num_sge) __field(u32, inv_rkey) - __field(int, status) ), TP_fast_assign( @@ -1717,12 +1715,11 @@ TRACE_EVENT(svcrdma_post_send, __entry->num_sge = wr->num_sge; __entry->inv_rkey = (wr->opcode == IB_WR_SEND_WITH_INV) ? wr->ex.invalidate_rkey : 0; - __entry->status = status; ), - TP_printk("cqe=%p num_sge=%u inv_rkey=0x%08x status=%d", + TP_printk("cqe=%p num_sge=%u inv_rkey=0x%08x", __entry->cqe, __entry->num_sge, - __entry->inv_rkey, __entry->status + __entry->inv_rkey ) ); @@ -1787,26 +1784,23 @@ TRACE_EVENT(svcrdma_wc_receive, TRACE_EVENT(svcrdma_post_rw, TP_PROTO( const void *cqe, - int sqecount, - int status + int sqecount ), - TP_ARGS(cqe, sqecount, status), + TP_ARGS(cqe, sqecount), TP_STRUCT__entry( __field(const void *, cqe) __field(int, sqecount) - __field(int, status) ), TP_fast_assign( __entry->cqe = cqe; __entry->sqecount = sqecount; - __entry->status = status; ), - TP_printk("cqe=%p sqecount=%d status=%d", - __entry->cqe, __entry->sqecount, __entry->status + TP_printk("cqe=%p sqecount=%d", + __entry->cqe, __entry->sqecount ) ); @@ -1902,6 +1896,34 @@ DECLARE_EVENT_CLASS(svcrdma_sendqueue_event, DEFINE_SQ_EVENT(full); DEFINE_SQ_EVENT(retry); +TRACE_EVENT(svcrdma_sq_post_err, + TP_PROTO( + const struct svcxprt_rdma *rdma, + int status + ), + + TP_ARGS(rdma, status), + + TP_STRUCT__entry( + __field(int, avail) + __field(int, depth) + __field(int, status) + __string(addr, rdma->sc_xprt.xpt_remotebuf) + ), + + TP_fast_assign( + __entry->avail = atomic_read(&rdma->sc_sq_avail); + __entry->depth = rdma->sc_sq_depth; + __entry->status = status; + __assign_str(addr, rdma->sc_xprt.xpt_remotebuf); + ), + + TP_printk("addr=%s sc_sq_avail=%d/%d status=%d", + __get_str(addr), __entry->avail, __entry->depth, + __entry->status + ) +); + #endif /* _TRACE_RPCRDMA_H */ #include diff --git a/net/sunrpc/xprtrdma/svc_rdma_rw.c b/net/sunrpc/xprtrdma/svc_rdma_rw.c index 48fe3b16b0d9c..a59912e2666d3 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_rw.c +++ b/net/sunrpc/xprtrdma/svc_rdma_rw.c @@ -323,8 +323,6 @@ static int svc_rdma_post_chunk_ctxt(struct svc_rdma_chunk_ctxt *cc) if (atomic_sub_return(cc->cc_sqecount, &rdma->sc_sq_avail) > 0) { ret = ib_post_send(rdma->sc_qp, first_wr, &bad_wr); - trace_svcrdma_post_rw(&cc->cc_cqe, - cc->cc_sqecount, ret); if (ret) break; return 0; @@ -337,6 +335,7 @@ static int svc_rdma_post_chunk_ctxt(struct svc_rdma_chunk_ctxt *cc) trace_svcrdma_sq_retry(rdma); } while (1); + trace_svcrdma_sq_post_err(rdma, ret); set_bit(XPT_CLOSE, &xprt->xpt_flags); /* If even one was posted, there will be a completion. */ diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c index f3f108090aa45..612770aaf729b 100644 --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c @@ -310,15 +310,17 @@ int svc_rdma_send(struct svcxprt_rdma *rdma, struct ib_send_wr *wr) } svc_xprt_get(&rdma->sc_xprt); + trace_svcrdma_post_send(wr); ret = ib_post_send(rdma->sc_qp, wr, NULL); - trace_svcrdma_post_send(wr, ret); - if (ret) { - set_bit(XPT_CLOSE, &rdma->sc_xprt.xpt_flags); - svc_xprt_put(&rdma->sc_xprt); - wake_up(&rdma->sc_send_wait); - } - break; + if (ret) + break; + return 0; } + + trace_svcrdma_sq_post_err(rdma, ret); + set_bit(XPT_CLOSE, &rdma->sc_xprt.xpt_flags); + svc_xprt_put(&rdma->sc_xprt); + wake_up(&rdma->sc_send_wait); return ret; } -- 2.20.1