Received: by 2002:ac0:b08d:0:0:0:0:0 with SMTP id l13csp2332813imc; Sat, 23 Feb 2019 00:51:59 -0800 (PST) X-Google-Smtp-Source: AHgI3IZ7CgOxFOyOef2CYR0aHaI5Mk4EhJUirIfaH3V2v7v9bjnTX1vcsrbCnMj7d6zKtDUw+6BE X-Received: by 2002:a63:3548:: with SMTP id c69mr8126276pga.256.1550911919892; Sat, 23 Feb 2019 00:51:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1550911919; cv=none; d=google.com; s=arc-20160816; b=u82rJtR4TC7PQAtOuZbEdJJAQqIOQeOf6mtfSnCjiwvQyFtSD2ifViimRQqqyuEOmW 6cASGunfkMAkB9qHSjXISHZBjAu/fNfIngticJ+oOpsn5AL5BVsUAg/OgLXal9RaiLFp bsmi8Jp4zEfBcANVsu+w7evzj53D9b40chrSp3gJqtGVUCo5dm/W6yt1xesL1G92Sg5v pVq2k/M2Y5R5633JhG7atvpefqyOM9PirDhCztvWT3yNHvs9u4PWDn6EjAaLf/WB3TT5 5L25iiIIbLk7cYjQDUXXSZo5HGI8Pewnn093+AbLuVvO2MmuN9k9jmgsWEWE83i0Zdng 1zhw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=4n3jGD0QAwWh/S69spuxkcgSD4O6MjTkej2FLjpciYU=; b=y5MZXow53Em6r/Xlk6HxLMgdktaYWWumVc8lsp1ONAc01u7p8mxlQv8SF8/qUaa62X Cv1b9aafcJOIpE+dVMcy9kqa1KPKzN7dJwfgRITfnwKHS96BL6c4RXIx+6d1Eos9ZJ0y L6mqHzHGEmYnWiuopw05ZPybP4vx18U+rZYSHte2qAx+soefJ4DJO0hy1SoAAlqRobtE gmabNG/FsSZgLJUEhR0JuPr6TzgRc0Vxfr7XXQQfTm/SjBS501sf+foUR4ccyQE61ODd Ay9tH16ebPbICFlvUqmHjIfKg6Jx2QO2P5s6+tL7g18Md4OJzK9Z9l4u+dwlHaG0r502 wivQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=G1LgigA0; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id w189si3423245pgd.486.2019.02.23.00.51.44; Sat, 23 Feb 2019 00:51:59 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=G1LgigA0; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727758AbfBWItg (ORCPT + 99 others); Sat, 23 Feb 2019 03:49:36 -0500 Received: from mail.kernel.org ([198.145.29.99]:45658 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727045AbfBWItg (ORCPT ); Sat, 23 Feb 2019 03:49:36 -0500 Received: from localhost (unknown [77.138.135.184]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 98E8B20684; Sat, 23 Feb 2019 08:49:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1550911775; bh=JtSt8PVWRAnZLISQFVTcjSj49kQls2AezvlmrYh9PQM=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=G1LgigA0xCkzqmYhkdqn9rxwqty6YsO/z2rhyclc0HN5F0dLd7UhskIV9YFY3zR1k TvAAFZJDm7xES9bv+nnk+XQjPA3ntYT0iKMrzID/iHKGqVkKKAuD8PHJD3+tIeEAVI eYw8rKrikEbSbHM58UqWZA1PA79P3T3+U8wkFLXM= Date: Sat, 23 Feb 2019 10:49:30 +0200 From: Leon Romanovsky To: Doug Ledford Cc: Steve Wise , Jason Gunthorpe , =?iso-8859-1?Q?H=E5kon?= Bugge , Parav Pandit , "linux-rdma@vger.kernel.org" , linux-kernel@vger.kernel.org Subject: Re: [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable Message-ID: <20190223084930.GJ23561@mtr-leonro.mtl.com> References: <20190217170909.1178575-1-haakon.bugge@oracle.com> <20190222163637.GA9819@ziepe.ca> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Zrag5V6pnZGjLKiw" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --Zrag5V6pnZGjLKiw Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Feb 22, 2019 at 12:51:55PM -0500, Doug Ledford wrote: > > > > On Feb 22, 2019, at 12:14 PM, Steve Wise = wrote: > > > > > > On 2/22/2019 10:36 AM, Jason Gunthorpe wrote: > >> On Sun, Feb 17, 2019 at 06:09:09PM +0100, H=C3=A5kon Bugge wrote: > >>> During certain workloads, the default CM response timeout is too > >>> short, leading to excessive retries. Hence, make it configurable > >>> through sysctl. While at it, also make number of CM retries > >>> configurable. > >>> > >>> The defaults are not changed. > >>> > >>> Signed-off-by: H=C3=A5kon Bugge > >>> drivers/infiniband/core/cma.c | 51 ++++++++++++++++++++++++++++++----- > >>> 1 file changed, 44 insertions(+), 7 deletions(-) > >>> > >>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/= cma.c > >>> index c43512752b8a..ce99e1cd1029 100644 > >>> +++ b/drivers/infiniband/core/cma.c > >>> @@ -43,6 +43,7 @@ > >>> #include > >>> #include > >>> #include > >>> +#include > >>> #include > >>> > >>> #include > >>> @@ -68,13 +69,46 @@ MODULE_AUTHOR("Sean Hefty"); > >>> MODULE_DESCRIPTION("Generic RDMA CM Agent"); > >>> MODULE_LICENSE("Dual BSD/GPL"); > >>> > >>> -#define CMA_CM_RESPONSE_TIMEOUT 20 > >>> #define CMA_QUERY_CLASSPORT_INFO_TIMEOUT 3000 > >>> -#define CMA_MAX_CM_RETRIES 15 > >>> #define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24) > >>> #define CMA_IBOE_PACKET_LIFETIME 18 > >>> #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP > >>> > >>> +#define CMA_DFLT_CM_RESPONSE_TIMEOUT 20 > >>> +static int cma_cm_response_timeout =3D CMA_DFLT_CM_RESPONSE_TIMEOUT; > >>> +static int cma_cm_response_timeout_min =3D 8; > >>> +static int cma_cm_response_timeout_max =3D 31; > >>> +#undef CMA_DFLT_CM_RESPONSE_TIMEOUT > >>> + > >>> +#define CMA_DFLT_MAX_CM_RETRIES 15 > >>> +static int cma_max_cm_retries =3D CMA_DFLT_MAX_CM_RETRIES; > >>> +static int cma_max_cm_retries_min =3D 1; > >>> +static int cma_max_cm_retries_max =3D 100; > >>> +#undef CMA_DFLT_MAX_CM_RETRIES > >>> + > >>> +static struct ctl_table_header *cma_ctl_table_hdr; > >>> +static struct ctl_table cma_ctl_table[] =3D { > >>> + { > >>> + .procname =3D "cma_cm_response_timeout", > >>> + .data =3D &cma_cm_response_timeout, > >>> + .maxlen =3D sizeof(cma_cm_response_timeout), > >>> + .mode =3D 0644, > >>> + .proc_handler =3D proc_dointvec_minmax, > >>> + .extra1 =3D &cma_cm_response_timeout_min, > >>> + .extra2 =3D &cma_cm_response_timeout_max, > >>> + }, > >>> + { > >>> + .procname =3D "cma_max_cm_retries", > >>> + .data =3D &cma_max_cm_retries, > >>> + .maxlen =3D sizeof(cma_max_cm_retries), > >>> + .mode =3D 0644, > >>> + .proc_handler =3D proc_dointvec_minmax, > >>> + .extra1 =3D &cma_max_cm_retries_min, > >>> + .extra2 =3D &cma_max_cm_retries_max, > >>> + }, > >>> + { } > >>> +}; > >> Is sysctl the right approach here? Should it be rdma tool instead? > >> > >> Jason > > > > There are other rdma sysctls currently: net.rdma_ucm.max_backlog and > > net.iw_cm.default_backlog. The core network stack seems to use sysctl > > and not ip tool to set basically globals. > > > > To use rdma tool, we'd have to have some concept of a "module" object, I > > guess. IE there's dev, link, and resource rdma tool objects currently. > > But these cma timeout settings are really not per dev, link, nor a > > resource. Maybe we have just a "core" object: rdma core set > > cma_max_cm_retries min 8 max 30. > > I don=E2=80=99t know, I think you make a fairly good argument for leaving= it as a sysctl. We have infrastructure in place for admins to set persist= ent sysctl settings. The per device/link settings need something different= because link names and such can change. Since these are globals, I=E2=80= =99d leave them where they are. > I have patches from Parav which extend rdmatool to set global to whole stack parameters, something like "rdma system ...", so the option to set through rdmatool global parameters for modules e.g. "rdma system cma se= t ..." exists. But I'm not sure if it is right thing to do. > -- > Doug Ledford > GPG KeyID: B826A3330E572FDD > Key fingerprint =3D AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD > > > > > > > --Zrag5V6pnZGjLKiw Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIcBAEBAgAGBQJccQkaAAoJEORje4g2clinKGMP/1WTMBxq8GTQD2lnIFTMyrzC YeXQ9H7MAvM+4aQ45KU2F85hf5dJEBoayu2+aacCphwme3YApLEh4UmKKtiPjJ0q 6N5/NO7Ru6Li3nI5kXzQka5AMnAPIoX9YaSDF8jOKQHWYSb2c2sse0Jld+Dt4iYl Qi1aK7ZSJT3pF5En2nAAwPXbmRk80bFFpocCTkLqhJEJ3Zt73K9l3+LP0B3LS16r C784cabgxkvbPkkc67YDLzixnuuTwsDu819drTi4edp87J5tHyj/BCtZclNu+ZiT 15y65Ma+wW0SLt9BBl+K5KwqX3JfqoM7oX47eGrWpe1C+9kdaDSdCFe0SeN/wxSg /FL8ucw551hwYwKx5yPF38Jfkt7kJv/NlhBbOmqucTDCIGd6us0F5e4ixJhh5hA7 wBV6tQESgtugWAdiHz6Pu1yAWXiOQuoekhmGxNf33Yt5Fe4QRs8i+uj+iIrLhQ8B mj9kpIFmR17w9OYhVU118wFuDiERM6LuVkyPN1jaVWwIXfvoSt44e/4gqkLW+Htk ioqjLbuzuu5zzzcUP4X2gsm//vJreglT6EjhMgDiTtWkOdSDn3hQj3Pm7aMhUYXm fnrdhkSCgGFdbStKL1EYwjC4rJE97cjM3YUqzEtAmwWmBz3d+vD3sF7hoUVYgk0B eKC4tCVyUQ0Ao4Cquefr =Rw17 -----END PGP SIGNATURE----- --Zrag5V6pnZGjLKiw--