From: "Stephen Bates"
To: Sagi Grimberg, Jason Gunthorpe
CC: Logan Gunthorpe, Christoph Hellwig, "James E.J. Bottomley", "Martin K. Petersen", Jens Axboe, Steve Wise, Max Gurtovoy, Dan Williams, Keith Busch, "linux-pci@vger.kernel.org", "linux-scsi@vger.kernel.org", "linux-nvme@lists.infradead.org", "linux-rdma@vger.kernel.org", "linux-nvdimm@ml01.01.org", "linux-kernel@vger.kernel.org"
Subject: Re: [RFC 6/8] nvmet: Be careful about using iomem accesses when dealing with p2pmem
Date: Fri, 7 Apr 2017 11:19:30 +0000
Message-ID: <3E85B4D4-9EBC-4299-8209-2D8740947764@raithlin.com>
In-Reply-To: <4df229d8-8124-664a-9bc4-6401bc034be1@grimberg.me>

On 2017-04-06, 6:33 AM, "Sagi Grimberg" wrote:

> Say it's connected via 2 legs, the bar is accessed from leg A and the
> data from the disk comes via leg B. In this case, the data is heading
> towards the p2p device via leg B (might be congested), the completion
> goes directly to the RC, and then the host issues a read from the
> bar via leg A. I don't understand what can guarantee ordering here.
>
> Stephen told me that this still guarantees ordering, but I honestly
> can't understand how, perhaps someone can explain to me in a simple
> way that I can understand.

Sagi

As long as leg A, leg B and the RC are all connected to the same switch, ordering will be preserved (I think many other topologies also work). Here is how it would work for the problem case you are concerned about (which is a read from the NVMe drive).

1. The disk device DMAs the data out to the p2pmem device via a string of PCIe MemWr TLPs.

2. The disk device writes to the completion queue (in system memory) via a MemWr TLP.

3. The last of the MemWrs from step 1 might have been stalled in the PCIe switch due to congestion, but if so they are stalled in the egress path of the switch for the p2pmem port.

4. The RC determines the IO is complete when the TLP associated with step 2 updates the memory associated with the CQ. It then issues some operation to read the p2pmem.

5. Regardless of whether the MemRd TLP comes from the RC or from another device connected to the switch, it is queued in the egress queue for the p2pmem port behind the last DMA TLP (from step 1). PCIe ordering ensures that this MemRd cannot overtake the MemWr (reads can never pass writes). Therefore the MemRd can never get to the p2pmem device until after the last DMA MemWr has.

I hope this helps!

Stephen
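
P.S. For anyone who prefers to see the argument from step 5 as code, here is a minimal toy model of the switch egress queue facing the p2pmem port. It is only a sketch under simplifying assumptions: the names `TLP`, `legal_candidates` and `drain` are made up for illustration, and the model ignores relaxed ordering, ID-based ordering, completions and flow-control credits. It encodes just two rules: a MemRd may not pass a MemWr queued ahead of it, while a MemWr may pass an earlier MemRd.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TLP:
    kind: str  # "MemWr" (posted) or "MemRd" (non-posted)
    tag: str   # hypothetical label, used only to make the output readable


def legal_candidates(queue):
    """Return indices of TLPs that may be sent next under the toy rules.

    - The oldest MemWr is always eligible (it may pass earlier MemRds).
    - A MemRd is eligible only if no MemWr sits ahead of it in the queue.
    """
    candidates = []
    seen_write = False
    for i, tlp in enumerate(queue):
        if tlp.kind == "MemWr":
            if not seen_write:
                candidates.append(i)
            seen_write = True
        elif not seen_write:
            candidates.append(i)
    return candidates


def drain(queue, prefer="MemRd"):
    """Empty the egress queue, picking the preferred kind whenever legal.

    With prefer="MemRd" this acts as an adversary that tries as hard as
    the rules allow to deliver reads early.
    """
    pending = list(queue)
    delivered = []
    while pending:
        idxs = legal_candidates(pending)
        pick = next((i for i in idxs if pending[i].kind == prefer), idxs[0])
        delivered.append(pending.pop(pick))
    return delivered


# Steps 1 to 5: DMA writes from the disk are stalled in the egress queue,
# then the MemRd issued after the CQ update is queued behind them.
p2pmem_egress = [
    TLP("MemWr", "data-0"),
    TLP("MemWr", "data-1"),
    TLP("MemWr", "data-2"),
    TLP("MemRd", "rc-read"),
]

if __name__ == "__main__":
    print([t.tag for t in drain(p2pmem_egress)])
```

Even with the scheduler biased towards reads, the MemRd in this model is delivered only after the last data MemWr, which is the property the ordering argument relies on. Running `drain` with `prefer="MemWr"` on a queue where a read is ahead of a write shows the asymmetry: the write may overtake the read, never the other way round.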