Subject: Re: [PATCH v8 1/4] PCI: Add new PCIe Fabric End Node flag, PCI_DEV_FLAGS_NO_RELAXED_ORDERING
To: Casey Leedom, "ashok.raj@intel.com", "bhelgaas@google.com",
    "helgaas@kernel.org", Michael Werner, Ganesh GR,
    "asit.k.mallick@intel.com", "patrick.j.cramer@intel.com",
    "Suravee.Suthikulpanit@amd.com", "Bob.Shaw@amd.com",
    "l.stach@pengutronix.de", "amira@mellanox.com",
    "gabriele.paoloni@huawei.com", "David.Laight@aculab.com",
    "jeffrey.t.kirsher@intel.com", "catalin.marinas@arm.com",
    "will.deacon@arm.com", "mark.rutland@arm.com", "robin.murphy@arm.com",
    "davem@davemloft.net", "alexander.duyck@gmail.com",
    "linux-arm-kernel@lists.infradead.org", "netdev@vger.kernel.org",
    "linux-pci@vger.kernel.org", "linux-kernel@vger.kernel.org",
    "linuxarm@huawei.com"
References: <1501767889-7772-1-git-send-email-dingtianhong@huawei.com>
    <1501767889-7772-2-git-send-email-dingtianhong@huawei.com>
From: Ding Tianhong
Date: Sat, 5 Aug 2017 14:28:43 +0800

On 2017/8/5 5:06,
Casey Leedom wrote:
> | From: Ding Tianhong
> | Sent: Thursday, August 3, 2017 6:44 AM
> |
> | diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> | index 6967c6b..1e1cdbe 100644
> | --- a/drivers/pci/quirks.c
> | +++ b/drivers/pci/quirks.c
> | @@ -4016,6 +4016,44 @@ static void quirk_tw686x_class(struct pci_dev *pdev)
> |                               quirk_tw686x_class);
> |
> |  /*
> | + * Some devices have problems with Transaction Layer Packets with the Relaxed
> | + * Ordering Attribute set.  Such devices should mark themselves and other
> | + * Device Drivers should check before sending TLPs with RO set.
> | + */
> | +static void quirk_relaxedordering_disable(struct pci_dev *dev)
> | +{
> | +	dev->dev_flags |= PCI_DEV_FLAGS_NO_RELAXED_ORDERING;
> | +}
> | +
> | +/*
> | + * Intel E5-26xx Root Complex has a Flow Control Credit issue which can
> | + * cause performance problems with Upstream Transaction Layer Packets with
> | + * Relaxed Ordering set.
> | + */
> | +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f02, PCI_CLASS_NOT_DEFINED, 8,
> | +                              quirk_relaxedordering_disable);
> | +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f04, PCI_CLASS_NOT_DEFINED, 8,
> | +                              quirk_relaxedordering_disable);
> | +DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, 0x6f08, PCI_CLASS_NOT_DEFINED, 8,
> | +                              quirk_relaxedordering_disable);
> | + ...
>
> It looks like this is missing the set of Root Complex IDs that were noted in
> the document to which Patrick Cramer sent us a reference:
>
>   https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf
>
> In section 3.9.1 we have:
>
>     3.9.1 Optimizing PCIe Performance for Accesses Toward Coherent Memory
>           and Toward MMIO Regions (P2P)
>
>     In order to maximize performance for PCIe devices in the processors
>     listed in Table 3-6 below, the software should determine whether the
>     accesses are toward coherent memory (system memory) or toward MMIO
>     regions (P2P access to other devices).
>     If the access is toward MMIO region, then software can command HW to
>     set the RO bit in the TLP header, as this would allow hardware to
>     achieve maximum throughput for these types of accesses. For accesses
>     toward coherent memory, software can command HW to clear the RO bit
>     in the TLP header (no RO), as this would allow hardware to achieve
>     maximum throughput for these types of accesses.
>
>     Table 3-6. Intel Processor CPU RP Device IDs for Processors Optimizing
>                PCIe Performance
>
>     Processor                            CPU RP Device IDs
>
>     Intel Xeon processors based on       6F01H-6F0EH
>     Broadwell microarchitecture
>
>     Intel Xeon processors based on       2F01H-2F0EH
>     Haswell microarchitecture
>
> The PCI Device IDs you have there are the first ones that I guessed at
> having the performance problem with Relaxed Ordering.  We now apparently
> have a complete list from Intel.
>
> I don't want to phrase this as a "NAK" because you've gone around the
> mulberry bush a bunch of times already.  So maybe just go with what you've
> got in version 8 of your patch and then do a follow on patch to complete the
> table?
>

Casey:

Thanks for the good catch. I found that Ashok had already noticed this
three months ago, and I am sorry I missed it. It has been a really long
discussion on this problem, but don't worry, it is not much work to fix.
I will send a v9 version. :)

Ding

> Casey
> .
>
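
For illustration only (this is not part of the thread or of any posted
patch): a minimal user-space C sketch of the device-ID check implied by
Table 3-6 as quoted above. The helper name intel_rp_in_table_3_6() and the
VENDOR_ID_INTEL macro are hypothetical names introduced here. The ranges
0x6f01-0x6f0e (Broadwell) and 0x2f01-0x2f0e (Haswell) are taken from the
quoted table, and 0x8086 is Intel's PCI vendor ID. A kernel patch would
presumably register one fixup per ID with DECLARE_PCI_FIXUP_CLASS_EARLY, as
in the quoted diff, rather than use a range test like this one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Intel's PCI vendor ID; the macro name is local to this sketch. */
#define VENDOR_ID_INTEL 0x8086u

/*
 * Hypothetical helper: returns true if (vendor, device) identifies one of
 * the CPU Root Port device IDs listed in Table 3-6 of the quoted Intel
 * optimization manual, i.e. a Root Port for which Relaxed Ordering toward
 * coherent memory is reported to hurt performance.
 */
bool intel_rp_in_table_3_6(uint16_t vendor, uint16_t device)
{
	if (vendor != VENDOR_ID_INTEL)
		return false;

	/* Intel Xeon, Broadwell microarchitecture: 6F01H-6F0EH */
	if (device >= 0x6f01 && device <= 0x6f0e)
		return true;

	/* Intel Xeon, Haswell microarchitecture: 2F01H-2F0EH */
	if (device >= 0x2f01 && device <= 0x2f0e)
		return true;

	return false;
}
```

With this check, the three IDs visible in the quoted v8 diff (0x6f02,
0x6f04, 0x6f08) all fall inside the Broadwell range, while the whole
Haswell range 0x2f01-0x2f0e is only covered once the table is completed.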