From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Chuck Lever, Anna Schumaker
Subject: [PATCH 4.16 106/196] xprtrdma: Fix latency regression on NUMA NFS/RDMA clients
Date: Sun, 22 Apr 2018 15:52:06 +0200
Message-Id: <20180422135109.726959401@linuxfoundation.org>
In-Reply-To: <20180422135104.278511750@linuxfoundation.org>
References: <20180422135104.278511750@linuxfoundation.org>
X-Mailer: git-send-email 2.17.0
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
X-Mailing-List: linux-kernel@vger.kernel.org

4.16-stable review patch. If anyone has any objections, please let me know.
------------------

From: Chuck Lever

commit 6720a89933739cb8dec748cd253f7c8df2c0ae4d upstream.

With v4.15, on one of my NFS/RDMA clients I measured nearly a
doubling of the latency of small read and write system calls. There
was no change in server round-trip time. The extra latency appears
across the whole RPC execution path.

"git bisect" settled on commit ccede7598588 ("xprtrdma: Spread reply
processing over more CPUs").

After some experimentation, I found that leaving the WQ bound and
allowing the scheduler to pick the dispatch CPU seems to eliminate
the long latencies, and it does not introduce any new regressions.

The fix is implemented by reverting only the part of commit
ccede7598588 ("xprtrdma: Spread reply processing over more CPUs")
that dispatches RPC replies specifically on the CPU where the
matching RPC call was made.

Interestingly, saving the CPU number and later queuing reply
processing there was effective _only_ for NFS READ and WRITE
requests. On my NUMA client, in-kernel RPC reply processing for
asynchronous RPCs was dispatched on the same CPU where the RPC call
was made, as expected. However, synchronous RPCs seem to get their
replies dispatched on some CPU other than the one where the call was
placed, every time.

Fixes: ccede7598588 ("xprtrdma: Spread reply processing over ... ")
Signed-off-by: Chuck Lever
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Anna Schumaker
Signed-off-by: Greg Kroah-Hartman

---
 net/sunrpc/xprtrdma/rpc_rdma.c  |    2 +-
 net/sunrpc/xprtrdma/transport.c |    2 --
 net/sunrpc/xprtrdma/xprt_rdma.h |    1 -
 3 files changed, 1 insertion(+), 4 deletions(-)

--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -1366,7 +1366,7 @@ void rpcrdma_reply_handler(struct rpcrdm
 
 	trace_xprtrdma_reply(rqst->rq_task, rep, req, credits);
 
-	queue_work_on(req->rl_cpu, rpcrdma_receive_wq, &rep->rr_work);
+	queue_work(rpcrdma_receive_wq, &rep->rr_work);
 	return;
 
 out_badstatus:
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -52,7 +52,6 @@
 #include <linux/slab.h>
 #include <linux/seq_file.h>
 #include <linux/sunrpc/addr.h>
-#include <linux/smp.h>
 
 #include "xprt_rdma.h"
 
@@ -651,7 +650,6 @@ xprt_rdma_allocate(struct rpc_task *task
 	if (!rpcrdma_get_recvbuf(r_xprt, req, rqst->rq_rcvsize, flags))
 		goto out_fail;
 
-	req->rl_cpu = smp_processor_id();
 	req->rl_connect_cookie = 0;	/* our reserved value */
 	rpcrdma_set_xprtdata(rqst, req);
 	rqst->rq_buffer = req->rl_sendbuf->rg_base;
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -334,7 +334,6 @@ enum {
 struct rpcrdma_buffer;
 struct rpcrdma_req {
 	struct list_head	rl_list;
-	int			rl_cpu;
 	unsigned int		rl_connect_cookie;
 	struct rpcrdma_buffer	*rl_buffer;
 	struct rpcrdma_rep	*rl_reply;
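
For readers less familiar with the workqueue API involved: the core of
the change is the switch from queue_work_on(), which forces the work
item onto a saved CPU, to queue_work(), which lets the workqueue
subsystem choose the dispatch CPU. Below is a minimal sketch of the
two dispatch styles; example_wq, example_work, saved_cpu, and the
handler are illustrative names and not part of xprtrdma -- only the
queue_work()/queue_work_on() semantics mirror the code above.

/* Illustrative sketch only -- not part of the patch. */
#include <linux/module.h>
#include <linux/smp.h>
#include <linux/workqueue.h>

static struct workqueue_struct *example_wq;
static struct work_struct example_work;
static int saved_cpu;

static void example_reply_worker(struct work_struct *work)
{
	/* Reply processing runs here, on whichever CPU the
	 * workqueue subsystem dispatched this work item to.
	 */
}

/* Pre-fix style: remember the call-side CPU at allocation time and
 * force reply processing back onto exactly that CPU later, even if
 * the scheduler has since moved the caller elsewhere.
 */
static void dispatch_pinned(void)
{
	queue_work_on(saved_cpu, example_wq, &example_work);
}

/* Post-fix style: keep the workqueue bound (per-CPU) but let the
 * workqueue subsystem pick the dispatch CPU -- normally the CPU
 * that queues the work.
 */
static void dispatch_unpinned(void)
{
	queue_work(example_wq, &example_work);
}

static int __init example_init(void)
{
	example_wq = alloc_workqueue("example_wq", WQ_MEM_RECLAIM, 0);
	if (!example_wq)
		return -ENOMEM;
	INIT_WORK(&example_work, example_reply_worker);

	/* raw_ variant: this context is preemptible, and the exact
	 * CPU number only matters for illustration here.
	 */
	saved_cpu = raw_smp_processor_id();

	dispatch_pinned();		/* pre-fix style */
	flush_workqueue(example_wq);	/* let it finish before re-queuing */
	dispatch_unpinned();		/* post-fix style */
	return 0;
}

static void __exit example_exit(void)
{
	destroy_workqueue(example_wq);	/* flushes any pending work */
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");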