Date: Thu, 13 Jun 2019 14:23:55 -0300
From: Jason Gunthorpe
To: Håkon Bugge
Cc: Doug Ledford, Leon Romanovsky, Parav Pandit, Steve Wise, OFED mailing list, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] RDMA/cma: Make CM response timeout and # CM retries configurable
Message-ID: <20190613172355.GF22901@ziepe.ca>
References: <20190226075722.1692315-1-haakon.bugge@oracle.com> <174ccd37a9ffa05d0c7c03fe80ff7170a9270824.camel@redhat.com> <67B4F337-4C3A-4193-B1EF-42FD4765CBB7@oracle.com>
In-Reply-To: <67B4F337-4C3A-4193-B1EF-42FD4765CBB7@oracle.com>

On Thu, Jun 13, 2019 at 06:58:30PM +0200, Håkon Bugge wrote:

> If you refer to the backlog parameter in rdma_listen(), I cannot see
> it being used at all for IB.
>
> For CX-3, which is paravirtualized wrt. MAD packets, it is the proxy
> UD receive queue length for the PF driver that can be construed as a
> backlog.

No, in IB you can drop UD packets if your RQ is full - so the proxy RQ
is really part of the overall RQ on QP1.

The backlog starts once packets are taken off the RQ and begin the
connection accept processing.

> Customer configures #VMs and different workload may lead to way
> different number of CM connections. The proxying of MAD packet
> through the PF driver has a finite packet rate. With 64 VMs, 10.000
> QPs on each, all going down due to a switch failing or similar, you
> have 640.000 DREQs to be sent, and with the finite packet rate of MAD
> packets through the PF, this takes more than the current CM
> timeout. And then you re-transmit and increase the burden of the PF
> proxying.

I feel like the performance of all this proxying is too low to support
such a large workload :(

Can it be improved?

Jason
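
For readers following the numbers in the quoted scenario, here is a rough back-of-envelope sketch of why 640,000 DREQs can outlast a CM timeout. It is not taken from the thread. It assumes the usual IB CM timeout encoding of 4.096 us * 2^n, an exponent of 20 (believed to be the hard-coded response timeout in cma.c at the time, but treat it as an assumption), and a purely hypothetical proxy rate of 50,000 MAD packets per second. The function names are illustrative only.

```python
def cm_timeout_seconds(exponent: int) -> float:
    """Decode an IB CM timeout exponent: 4.096 us * 2**exponent."""
    return 4.096e-6 * (2 ** exponent)


def drain_seconds(num_vms: int, qps_per_vm: int, mad_rate_pps: float) -> float:
    """Time to push one DREQ per QP through a proxy with a fixed packet rate."""
    return (num_vms * qps_per_vm) / mad_rate_pps


# Scenario from the quoted text: 64 VMs with 10,000 QPs each.
dreqs = 64 * 10_000

# Assumption: timeout exponent 20, roughly 4.3 seconds.
timeout = cm_timeout_seconds(20)

# Hypothetical PF proxy throughput; the thread gives no actual figure.
assumed_rate_pps = 50_000.0
drain = drain_seconds(64, 10_000, assumed_rate_pps)

# Packet rate the proxy would need to send every DREQ once within one timeout.
min_rate_pps = dreqs / timeout

print(f"DREQs to send:          {dreqs}")
print(f"CM timeout (assumed):   {timeout:.2f} s")
print(f"Drain time at 50k pps:  {drain:.2f} s")
print(f"Rate needed to keep up: {min_rate_pps:.0f} pps")
```

Under these assumed numbers the drain time exceeds the timeout, so retransmissions would start before the first round of DREQs has gone out, which is the feedback effect the quoted text describes.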