Date: Fri, 22 Feb 2019 11:18:00 -0700
From: Jason Gunthorpe
To: Håkon Bugge
Cc: Yishai Hadas, Doug Ledford, jackm@dev.mellanox.co.il, majd@mellanox.com, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] IB/mlx4: Increase the timeout for CM cache
Message-ID: <20190222181800.GA9524@ziepe.ca>
In-Reply-To: <20190217144512.1171546-1-haakon.bugge@oracle.com>

On Sun, Feb 17, 2019 at 03:45:12PM +0100, Håkon Bugge wrote:
> Using CX-3 virtual functions, either from a bare-metal machine or
> pass-through from a VM, MAD packets are proxied through the PF driver.
>
> Since the VF drivers have separate name spaces for MAD Transaction Ids
> (TIDs), the PF driver has to re-map the TIDs and keep the book keeping
> in a cache.
>
> Following the RDMA Connection Manager (CM) protocol, it is clear when
> an entry has to be evicted from the cache. But life is not perfect,
> remote peers may die or be rebooted. Hence, there is a timeout to wipe
> out a cache entry, when the PF driver assumes the remote peer has gone.
>
> During workloads where a high number of QPs are destroyed concurrently,
> an excessive number of CM DREQ retries has been observed.
>
> The problem can be demonstrated in a bare-metal environment, where two
> nodes have instantiated 8 VFs each. This uses dual-ported HCAs, so we
> have 16 vPorts per physical server.
>
> 64 processes are associated with each vPort and create and destroy
> one QP for each of the remote 64 processes. That is, 1024 QPs per
> vPort, all in all 16K QPs. The QPs are created/destroyed using the
> CM.
>
> When tearing down these 16K QPs, excessive CM DREQ retries (and
> duplicates) are observed. With some cat/paste/awk wizardry on the
> infiniband_cm sysfs, we observe as sum of the 16 vPorts on one of the
> nodes:
>
> cm_rx_duplicates:
>       dreq   2102
> cm_rx_msgs:
>       drep   1989
>       dreq   6195
>        rep   3968
>        req   4224
>        rtu   4224
> cm_tx_msgs:
>       drep   4093
>       dreq  27568
>        rep   4224
>        req   3968
>        rtu   3968
> cm_tx_retries:
>       dreq  23469
>
> Note that the active/passive side is equally distributed between the
> two nodes.
>
> Enabling pr_debug in cm.c gives tons of:
>
> [171778.814239] mlx4_ib_multiplex_cm_handler: id{slave: 1,sl_cm_id: 0xd393089f} is NULL!
>
> By increasing the CM_CLEANUP_CACHE_TIMEOUT from 5 to 30 seconds, the
> tear-down phase of the application is reduced from approximately 90 to
> 50 seconds.
> Retries/duplicates are also significantly reduced:
>
> cm_rx_duplicates:
>       dreq   2460
> []
> cm_tx_retries:
>       dreq   3010
>        req     47
>
> Increasing the timeout further didn't help, as these duplicates and
> retries stem from a too short CMA timeout, which was 20 (~4 seconds)
> on the systems. By increasing the CMA timeout to 22 (~17 seconds), the
> numbers fell down to about 10 for both of them.
>
> Adjustment of the CMA timeout is not part of this commit.
>
> Signed-off-by: Håkon Bugge
> Acked-by: Jack Morgenstein
> ---

Applied to for-next

Thanks,
Jason
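
For reference, the CMA timeout figures quoted above (20 is roughly 4 seconds, 22 is roughly 17 seconds) match the usual InfiniBand timeout encoding, where the value is an exponent and the duration is 4.096 microseconds times two to that power. The sketch below only illustrates that arithmetic. It assumes that encoding applies to the CMA timeout discussed here, and the names `IB_TIMEOUT_BASE_US` and `ib_timeout_seconds` are hypothetical helpers, not part of the kernel or of this patch.

```python
# Sketch: convert an InfiniBand-style timeout exponent into seconds.
# Assumption (not stated in the patch): the CMA timeout is encoded as
# 4.096 us * 2**exponent, the common IB timeout encoding.

IB_TIMEOUT_BASE_US = 4.096  # hypothetical constant: base unit in microseconds


def ib_timeout_seconds(exponent: int) -> float:
    """Return the timeout in seconds for the given exponent."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    return IB_TIMEOUT_BASE_US * (2 ** exponent) / 1_000_000


if __name__ == "__main__":
    # The two values mentioned in the quoted commit message.
    for exp in (20, 22):
        print(f"timeout {exp}: {ib_timeout_seconds(exp):.2f} s")
```

Under that assumption, each step of the exponent doubles the timeout, so moving from 20 to 22 quadruples it.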