From: Logan Gunthorpe <logang@deltatee.com>
Date: Thu, 3 May 2018 09:59:54 -0600
To: Christian König <christian.koenig@amd.com>, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org, linux-nvdimm@lists.01.org, linux-block@vger.kernel.org
Cc: Stephen Bates <sbates@raithlin.com>, Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>, Keith Busch <keith.busch@intel.com>, Sagi Grimberg <sagi@grimberg.me>, Bjorn Helgaas <bhelgaas@google.com>, Jason Gunthorpe <jgg@mellanox.com>, Max Gurtovoy <maxg@mellanox.com>, Dan Williams <dan.j.williams@intel.com>, Jérôme Glisse <jglisse@redhat.com>, Benjamin Herrenschmidt <benh@kernel.crashing.org>, Alex Williamson <alex.williamson@redhat.com>
In-Reply-To: <3e4e0126-f444-8d88-6793-b5eb97c61f76@amd.com>
References: <20180423233046.21476-1-logang@deltatee.com> <805645c1-ea40-2e57-88eb-5dd34e579b2e@deltatee.com> <3e4e0126-f444-8d88-6793-b5eb97c61f76@amd.com>
Subject: Re: [PATCH v4 00/14] Copy Offload in NVMe Fabrics with P2P PCI Memory

On 03/05/18 03:05 AM, Christian König wrote:
> Ok, I'm still missing the big picture here. First question is: what is
> the P2PDMA provider?

Well, there's some pretty good documentation in the patchset for this, but in short: a provider is a device that provides some kind of P2P resource (i.e. BAR memory, or perhaps a doorbell register -- only memory is supported at this time).

> Second question is how do you want to handle things when devices are not
> behind the same root port (which is perfectly possible in the cases I
> deal with)?

I think we need to implement a whitelist. If both root ports are in the whitelist and are on the same bus, then we return a larger distance instead of -1.

> Third question: why multiple clients? That feels a bit like you are
> pushing something special to your use case into the common PCI
> subsystem, which usually isn't a good idea.

No, I think this will be pretty standard.
In the simple general case you are going to have one provider and at least two clients (one which writes the memory and one which reads it). However, one client is likely, but not necessarily, the same device as the provider.

In the NVMe-oF case, we might have N clients: 1 RDMA device and N-1 block devices. The code doesn't care which device provides the memory: it could be the RDMA device, one or all of the block devices, or, in theory, a completely separate device with P2P-able memory. However, it does require that all devices involved are accessible per pci_p2pdma_distance(), or it won't use P2P transactions.

I could also imagine other use cases: e.g. an RDMA NIC sends data to a GPU for processing, and then sends the data to an NVMe device for storage (or vice versa). In this case we have 3 clients and one provider.

> As far as I can see we need a function which returns the distance between
> an initiator and a target device. This function then returns -1 if the
> transaction can't be made and a positive value otherwise.

If you need to make a simpler convenience function for your use case, I'm not against it.

> We also need to give the direction of the transaction and have a
> whitelist of root complex PCI-IDs which can handle P2P transactions from
> different ports for a certain DMA direction.

Yes. In the NVMe-oF case we need all devices to be able to DMA in both directions, so we did not need the DMA direction. But I can see this being useful once we add the whitelist.

Logan