From: Casey Leedom
To: Bjorn Helgaas, Ding Tianhong
CC: ashok.raj@intel.com, bhelgaas@google.com, Michael Werner, Ganesh GR,
    asit.k.mallick@intel.com, patrick.j.cramer@intel.com,
    Suravee.Suthikulpanit@amd.com, Bob.Shaw@amd.com,
    l.stach@pengutronix.de, amira@mellanox.com,
    gabriele.paoloni@huawei.com, David.Laight@aculab.com,
    jeffrey.t.kirsher@intel.com, catalin.marinas@arm.com,
    will.deacon@arm.com, mark.rutland@arm.com, robin.murphy@arm.com,
    davem@davemloft.net, alexander.duyck@gmail.com,
    linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
    linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
    linuxarm@huawei.com
Subject: Re: [PATCH v9 1/4] PCI: Add new PCIe Fabric End Node flag,
    PCI_DEV_FLAGS_NO_RELAXED_ORDERING
Date: Wed, 9 Aug 2017 01:40:01 +0000
References: <1501917313-9812-1-git-send-email-dingtianhong@huawei.com>
    <1501917313-9812-2-git-send-email-dingtianhong@huawei.com>
    <20170808232200.GO16580@bhelgaas-glaptop.roam.corp.google.com>
In-Reply-To: <20170808232200.GO16580@bhelgaas-glaptop.roam.corp.google.com>
Sender:
    linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

| From: Bjorn Helgaas
| Sent: Tuesday, August 8, 2017 4:22 PM
|
| This needs to include a link to the Intel spec
| (https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf,
| sec 3.9.1).

  In the commit message or as a comment?  Regardless, I agree.  It's always
nice to be able to go back and see what the official documentation says.
However, that said, links on the internet are ... fragile as time goes by, so
we might want to simply quote section 3.9.1 in the commit message since it's
relatively short:

    3.9.1 Optimizing PCIe Performance for Accesses Toward Coherent Memory
          and Toward MMIO Regions (P2P)

    In order to maximize performance for PCIe devices in the processors
    listed in Table 3-6 below, the software should determine whether the
    accesses are toward coherent memory (system memory) or toward MMIO
    regions (P2P access to other devices). If the access is toward MMIO
    region, then software can command HW to set the RO bit in the TLP
    header, as this would allow hardware to achieve maximum throughput for
    these types of accesses. For accesses toward coherent memory, software
    can command HW to clear the RO bit in the TLP header (no RO), as this
    would allow hardware to achieve maximum throughput for these types of
    accesses.

    Table 3-6. Intel Processor CPU RP Device IDs for Processors Optimizing
               PCIe Performance

    Processor                                 CPU RP Device IDs

    Intel Xeon processors based on            6F01H-6F0EH
    Broadwell microarchitecture

    Intel Xeon processors based on            2F01H-2F0EH
    Haswell microarchitecture

| It should also include a pointer to the AMD erratum, if available, or
| at least some reference to how we know it doesn't obey the rules.

  Getting an ACK from AMD seems like a forlorn cause at this point.
My contact was Bob Shaw and he stopped responding to my messages almost a year
ago, saying that all of AMD's energies were being redirected towards upcoming
x86 products (likely Ryzen as we now know).  As far as I can tell AMD has
walked away from their A1100 (AKA "Seattle") ARM SoC.

  On the specific issue, I can certainly write up something even more
extensive than I wrote up for the comment in drivers/pci/quirks.c.  Please
review the comment I wrote up and tell me if you'd like something even more
detailed -- I'm usually accused of writing comments which are too long, so
this would be a new one on me ... :-)

| Ashok, thanks for chiming in.  Now that you have, I have a few more
| questions for you:

  I can answer a few of these:

|  - Is the above doc the one you mentioned as being now public?

  Yes.  Ashok worked with me to the extent he was allowed prior to the
publishing of the public technical note, but he couldn't say much.  (Believe
it or not, it is possible to say less than the quoted section above.)  When
the note was published, Patrick Cramer sent me the note about it and pointed
me at section 3.9.1.

|  - Is this considered a hardware erratum?

  I certainly consider it a Hardware Bug.  And I'm really hoping that Ashok
will be able to find a "Chicken Bit" which allows the broken feature to be
turned off.  Remember, the Relaxed Ordering Attribute on a Transaction Layer
Packet is simply a HINT.  It is perfectly reasonable for a compliant
implementation to simply ignore the Relaxed Ordering Attribute on an incoming
TLP Request.  The sole responsibility of a compliant implementation is to
return the exact same Relaxed Ordering and No Snoop Attributes in any TLP
Response.  (The rules for the ID-Based Ordering Attribute are more complex.)

  Earlier Intel Root Complexes did exactly this: they ignored the Relaxed
Ordering Attribute and there was no performance difference for
using/not-using it.
It's pretty obvious that an attempt was made to implement optimizations
surrounding the use of Relaxed Ordering and they didn't work.

|  - If so, is there a pointer to that as well?

  Intel is historically tight-lipped about admitting any bugs/errata in their
products.  I'm guessing that the above quoted Section 3.9.1 is likely to be
all we ever get.  The language above regarding TLPs targeting Coherent Shared
Memory is basically as much of an admission that they got it wrong as we're
going to get.  But heck, maybe we'll get lucky ...  Especially with regard to
the hoped for "Chicken Bit" ...

|  - If this is not considered an erratum, can you provide any guidance
|    about how an OS should determine when it should use RO?

  Software?  We don't need no stinking software!  Sorry, I couldn't resist.

| Relying on a list of device IDs in an optimization manual is OK for an
| erratum, but if it's *not* an erratum, it seems like a hole in the specs
| because as far as I know there's no generic way for the OS to discover
| whether to use RO.

  Well, here's to hoping that Ashok and/or Patrick are able to offer more
detailed information ...

Casey
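
  For reference, the Device ID check implied by the quoted Table 3-6 could be
sketched in plain userspace C roughly as below.  This is only an illustration
under stated assumptions, not the quirk code from the patch: the helper name
rp_in_table_3_6() and the range structure are hypothetical, 0x8086 is Intel's
PCI Vendor ID, and the two ranges are copied from the table as quoted above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_ID_INTEL 0x8086	/* Intel's PCI Vendor ID */

/* Inclusive range of Root Port Device IDs (hypothetical helper type). */
struct rp_id_range {
	uint16_t first;
	uint16_t last;
};

/* Ranges taken from Table 3-6 as quoted from section 3.9.1. */
static const struct rp_id_range table_3_6[] = {
	{ 0x6f01, 0x6f0e },	/* Xeon, Broadwell microarchitecture */
	{ 0x2f01, 0x2f0e },	/* Xeon, Haswell microarchitecture */
};

/*
 * Hypothetical helper: returns true when the given Root Port is one of
 * those listed in Table 3-6, i.e. one for which section 3.9.1 suggests
 * clearing the RO bit on accesses toward coherent memory.
 */
static bool rp_in_table_3_6(uint16_t vendor, uint16_t device)
{
	size_t i;

	if (vendor != VENDOR_ID_INTEL)
		return false;

	for (i = 0; i < sizeof(table_3_6) / sizeof(table_3_6[0]); i++) {
		if (device >= table_3_6[i].first &&
		    device <= table_3_6[i].last)
			return true;
	}

	return false;
}
```

  A caller would pass the Vendor ID and Device ID read from the Root Port's
configuration space.  A range table keeps the sketch short; it says nothing
about how the OS should discover this generically, which is the open question
above.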