Subject: Re: [PATCH rdma-next 00/10] Enable relaxed ordering for ULPs
To: Jason Gunthorpe, Chuck Lever III
Cc: Christoph Hellwig, Leon Romanovsky, Doug Ledford, Leon Romanovsky,
    Adit Ranadive, Anna Schumaker, Ariel Elior, Avihai Horon,
    Bart Van Assche, Bernard Metzler, "David S. Miller",
    Dennis Dalessandro, Devesh Sharma, Faisal Latif, Jack Wang,
    Jakub Kicinski, Bruce Fields, Jens Axboe, Karsten Graul, Keith Busch,
    Lijun Ou, CIFS, LKML, Linux NFS Mailing List,
    "linux-nvme@lists.infradead.org", linux-rdma,
    "linux-s390@vger.kernel.org", Max Gurtovoy, Max Gurtovoy,
    "Md. Haris Iqbal", Michael Guralnik, Michal Kalderon,
    Mike Marciniszyn, Naresh Kumar PBS, Linux-Net, Potnuri Bharat Teja,
    "rds-devel@oss.oracle.com", Sagi Grimberg,
    "samba-technical@lists.samba.org", Santosh Shilimkar, Selvin Xavier,
    Shiraz Saleem, Somnath Kotur, Sriharsha Basavapatna, Steve French,
    Trond Myklebust, VMware PV-Drivers, Weihang Li, Yishai Hadas,
    Zhu Yanjun
References: <20210405052404.213889-1-leon@kernel.org>
    <20210405134115.GA22346@lst.de>
    <20210405200739.GB7405@nvidia.com>
    <20210406114952.GH7405@nvidia.com>
From: Tom Talpey
Date: Fri, 9 Apr 2021 10:26:21 -0400
In-Reply-To: <20210406114952.GH7405@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/6/2021 7:49 AM, Jason Gunthorpe wrote:
> On Mon, Apr 05, 2021 at 11:42:31PM +0000, Chuck Lever III wrote:
>
>> We need to get a better idea what correctness testing has been done,
>> and whether positive correctness testing results can be replicated
>> on a variety of platforms.
>
> RO has been rolling out slowly on mlx5 over a few years and storage
> ULPs are the last to change. eg the mlx5 ethernet driver has had RO
> turned on for a long time, userspace HPC applications have been using
> it for a while now too.

I'd love to see RO be used more, it was always something the RDMA
specs supported and carefully architected for. My only concern is
that it's difficult to get right, especially when the platforms
have been running strictly-ordered for so long. The ULPs need
testing, and a lot of it.

> We know there are platforms with broken RO implementations (like
> Haswell) but the kernel is supposed to globally turn off RO on all
> those cases. I'd be a bit surprised if we discover any more from this
> series.
>
> On the other hand there are platforms that get huge speed ups from
> turning this on, AMD is one example, there are a bunch in the ARM
> world too.

My belief is that the biggest risk is from situations where completions
are batched, and therefore polling is used to detect them without
interrupts (which explicitly). The RO pipeline will completely reorder
DMA writes, and consumers which infer ordering from memory contents
may break. This can even apply within the provider code, which may
attempt to poll WR and CQ structures, and be tripped up.

The Mellanox adapter, itself, historically has strict in-order DMA
semantics, and while it's great to relax that, changing it by default
for all consumers is something to consider very cautiously.

> Still, obviously people should test on the platforms they have.

Yes, and "test" should be taken seriously, with a focus on ULP data
integrity. Speedups will mean nothing if the data is ever damaged.

Tom.
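
The polling hazard described above can be illustrated with a toy model.
This is only a sketch under stated assumptions: it is not PCIe, verbs,
or mlx5 code, and every name in it (PAYLOAD_ADDR, FLAG_ADDR,
device_writes, poll_consumer) is hypothetical. It assumes a device that
issues a payload write followed by a "done" marker write, and a consumer
that infers completion purely from memory contents by polling that
marker. Delivering the same two writes in a different order is used as
a stand-in for relaxed ordering.

```python
# Toy model: a consumer that infers completion from memory contents.
# All addresses and function names are hypothetical, for illustration only.

PAYLOAD_ADDR = 0  # assumed location of the data buffer
FLAG_ADDR = 1     # assumed location of the "done" marker the consumer polls


def device_writes(payload):
    """Return the writes in the order the device issues them:
    payload first, completion marker last."""
    return [(PAYLOAD_ADDR, payload), (FLAG_ADDR, 1)]


def poll_consumer(writes_in_arrival_order, stale=b"stale"):
    """Apply writes one at a time in arrival order, polling the marker
    after each one. Return the payload the consumer reads at the moment
    it first sees the marker set, or None if it never sees it."""
    memory = {PAYLOAD_ADDR: stale, FLAG_ADDR: 0}
    for addr, value in writes_in_arrival_order:
        memory[addr] = value
        if memory[FLAG_ADDR] == 1:
            # The consumer trusts the marker and reads the buffer now.
            return memory[PAYLOAD_ADDR]
    return None


writes = device_writes(b"fresh")

# Strict ordering: writes arrive in issue order.
in_order = poll_consumer(writes)

# Reordered delivery: the marker lands before the payload.
reordered = poll_consumer(list(reversed(writes)))

print("in order: ", in_order)
print("reordered:", reordered)
```

In this model the in-order consumer reads the new payload, while the
reordered consumer sees the marker set and reads the old buffer
contents. That silent stale read is the kind of data-integrity failure
ULP testing would need to look for.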