Date: Thu, 31 Aug 2017 10:01:30 -0600
From: Alex Williamson
To: Jan Glauber
Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, david.daney@cavium.com, Jon Masters, Robert Richter, linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org
Subject: Re: [PATCH v3 3/3] PCI: Avoid slot reset for Cavium cn8xxx root ports
Message-ID: <20170831100130.5c8a922e@w520.home>
In-Reply-To: <20170831094052.GA15906@hc>
References: <20170830142454.10971-1-jglauber@cavium.com> <20170830142454.10971-4-jglauber@cavium.com> <20170830084012.19d91759@w520.home> <20170831094052.GA15906@hc>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 31 Aug 2017 11:40:52 +0200
Jan Glauber wrote:

> On Wed, Aug 30, 2017 at 08:40:12AM -0600, Alex Williamson wrote:
> > On Wed, 30 Aug 2017 16:24:54 +0200
> > Jan Glauber wrote:
> >
> > > Root ports of cn8xxx do not function after a slot reset when used with
> > > some e1000e and LSI HBA devices. Add a quirk to prevent slot reset on
> > > these root ports.
> > >
> > > Signed-off-by: Jan Glauber
> > > ---
> > >  drivers/pci/quirks.c | 16 ++++++++++++++++
> > >  1 file changed, 16 insertions(+)
> > >
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > index 85191b8..6679971 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -845,6 +845,22 @@ static void quirk_cavium_sriov_rnm_link(struct pci_dev *dev)
> > >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CAVIUM, 0xa018, quirk_cavium_sriov_rnm_link);
> > >  #endif
> > >
> > > +/*
> > > + * Root port on some Cavium CN8xxx chips do not successfully complete
> > > + * a bus reset when used with certain types of child devices. Config
> > > + * space access to the child may quit responding. Flag all devices under
> > > + * the secondary bus as non-resettable.
> > > + */
> > > +static void quirk_CN8xxx_secondary_bus(struct pci_dev *dev)
> > > +{
> > > +	struct pci_dev *pdev;
> > > +
> > > +	dev_warn(&dev->dev, "Cavium CN8xxx quirk detected; reset for devices on secondary bus disabled\n");
> > > +	list_for_each_entry(pdev, &dev->subordinate->devices, bus_list)
> > > +		pdev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
> > > +}
> > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_CN8xxx_secondary_bus);
> > > +
> > >  /*
> > >   * Some settings of MMRBC can lead to data corruption so block changes.
> > >   * See AMD 8131 HyperTransport PCI-X Tunnel Revision Guide
> >
> >
> > This doesn't seem reliable, doesn't the user just need to remove and
> > reprobe the slot and the device would re-appear without this flag set?
>
> No, I tried before to disable the slot with "echo 0 > /sys/bus/pci/slots/3/power"
> but that does not work as it is not supported.
>
> I'm not familiar with the quirk types, would another one be better
> suited here (even if we don't have the problem you described)?

The scenario I'm mentioning is to "echo 1 >
/sys/bus/pci/devices//remove", then "echo > /sys/bus/pci/rescan".
This would break the ordering implicit in using a fixup defined for the
root port. It seems like it'd make a lot more sense to add a test on
the parent bridge more similar to how the bus reset works. It's not
the subordinate devices imposing the no-bus-reset flag, it's the bridge
device and the objects and code should support and reflect that.
Thanks,

Alex
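
To make the difference between the two approaches concrete, here is a minimal user-space sketch in C. It is not kernel code and not the patch that was eventually merged: the `model_*` structs, the `MODEL_FLAG_NO_BUS_RESET` bit, and every function name are hypothetical stand-ins for `struct pci_dev`, `struct pci_bus`, and `PCI_DEV_FLAGS_NO_BUS_RESET`. It assumes, as in the remove/rescan scenario above, that a rescanned child comes back as a fresh object with no flags set while the root port object is untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PCI_DEV_FLAGS_NO_BUS_RESET. */
#define MODEL_FLAG_NO_BUS_RESET (1u << 0)

struct model_dev;

/* Simplified bus: only tracks the bridge that leads to it. */
struct model_bus {
	struct model_dev *self;	/* upstream bridge, NULL on a root bus */
};

/* Simplified device: flags plus the bus it sits on. */
struct model_dev {
	unsigned int flags;
	struct model_bus *bus;
};

/* Approach of the posted quirk: only the child's own flag is consulted. */
static bool child_flag_blocks_reset(const struct model_dev *dev)
{
	return (dev->flags & MODEL_FLAG_NO_BUS_RESET) != 0;
}

/* Suggested direction: the flag lives on the bridge and is tested there. */
static bool bridge_aware_blocks_reset(const struct model_dev *dev)
{
	if (dev->flags & MODEL_FLAG_NO_BUS_RESET)
		return true;
	if (dev->bus && dev->bus->self &&
	    (dev->bus->self->flags & MODEL_FLAG_NO_BUS_RESET))
		return true;
	return false;
}

/* Models remove + rescan: the child is recreated with clean flags. */
static struct model_dev rescan_child(struct model_bus *bus)
{
	struct model_dev fresh = { .flags = 0, .bus = bus };
	return fresh;
}

/* Walks through the scenario; returns 0 if the model behaves as described. */
int model_selftest(void)
{
	struct model_dev port = { .flags = MODEL_FLAG_NO_BUS_RESET, .bus = NULL };
	struct model_bus secondary = { .self = &port };
	struct model_dev child = rescan_child(&secondary);

	assert(!child_flag_blocks_reset(&child));	/* per-child flag was lost */
	assert(bridge_aware_blocks_reset(&child));	/* bridge flag still applies */
	return 0;
}
```

In this model the per-child flag disappears as soon as the child is recreated, whereas a check that walks up to the bridge keeps refusing the reset no matter how often the child is removed and rescanned.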