Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755295Ab3FENW2 (ORCPT ); Wed, 5 Jun 2013 09:22:28 -0400 Received: from g1t0027.austin.hp.com ([15.216.28.34]:6422 "EHLO g1t0027.austin.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754976Ab3FENWZ (ORCPT ); Wed, 5 Jun 2013 09:22:25 -0400 Message-ID: <1370438520.20051.115.camel@ejdallLaptop> Subject: Re: [PATCH v2 3/3] PCI/AER: Provide reset_link for firmware first root port From: Betty Dall To: Bjorn Helgaas Cc: Chen Gong , "Rafael J. Wysocki" , Huang Ying , "linux-acpi@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "linux-pci@vger.kernel.org" Date: Wed, 05 Jun 2013 07:22:00 -0600 In-Reply-To: References: <1369924769-17183-1-git-send-email-betty.dall@hp.com> <1369924769-17183-4-git-send-email-betty.dall@hp.com> <20130604083644.GD20448@gchen.bj.intel.com> <1370381931.20051.101.camel@ejdallLaptop> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.2.2- Content-Transfer-Encoding: 7bit Mime-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7064 Lines: 155 On Tue, 2013-06-04 at 16:15 -0600, Bjorn Helgaas wrote: > On Tue, Jun 4, 2013 at 3:38 PM, Betty Dall wrote: > > On Tue, 2013-06-04 at 04:36 -0400, Chen Gong wrote: > >> On Thu, May 30, 2013 at 08:39:29AM -0600, Betty Dall wrote: > >> > Date: Thu, 30 May 2013 08:39:29 -0600 > >> > From: Betty Dall > >> > To: rjw@sisk.pl, bhelgaas@google.com > >> > Cc: ying.huang@intel.com, linux-acpi@vger.kernel.org, > >> > linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Betty Dall > >> > > >> > Subject: [PATCH v2 3/3] PCI/AER: Provide reset_link for firmware first root > >> > port > >> > X-Mailer: git-send-email 1.7.7.6 > >> > > >> > The firmware first path does not install the aerdrv root port > >> > service driver, so the firmware first root port does not have > >> > a reset_link callback. When a firmware first root port does not have > >> > a reset_link callback, use a new default reset_link similar to what > >> > we already do for the default_downstream_reset_link(). The firmware > >> > first default reset_link brings the root port out of SBR if firmware > >> > left it in SBR. > >> > > >> > Changes since v1: > >> > Removed incorrect setting of p2p_ctrl after the read. > >> > > >> > Signed-off-by: Betty Dall > >> > --- > >> > > >> > drivers/pci/pcie/aer/aerdrv_core.c | 36 ++++++++++++++++++++++++++++++++++++ > >> > 1 files changed, 36 insertions(+), 0 deletions(-) > >> > > >> > > >> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c > >> > index 8ec8b4f..72f76cd 100644 > >> > --- a/drivers/pci/pcie/aer/aerdrv_core.c > >> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c > >> > @@ -413,6 +413,39 @@ static pci_ers_result_t default_downstream_reset_link(struct pci_dev *dev) > >> > return PCI_ERS_RESULT_RECOVERED; > >> > } > >> > > >> > +/** > >> > + * default_ff_root_port_reset_link - default reset function for firmware > >> > + * first Root Port > >> > + * @dev: pointer to root port's pci_dev data structure > >> > + * > >> > + * Invoked when performing link reset at Root Port w/ no aer driver. > >> > + * This happens through the firmware first path. Firmware may leave > >> > + * the Root Port in SBR and it is the OS responsiblity to bring it out > >> > + * of SBR. > >> > + */ > >> > +static pci_ers_result_t default_ff_root_port_reset_link(struct pci_dev *dev) > >> > +{ > >> > + u16 p2p_ctrl; > >> > + > >> > + /* Read Secondary Bus Reset */ > >> > + pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &p2p_ctrl); > >> > + > >> > + /* De-assert Secondary Bus Reset, if it is set */ > >> > + if (p2p_ctrl & PCI_BRIDGE_CTL_BUS_RESET) { > >> > + p2p_ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET; > >> > + pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl); > >> > + > >> > + /* > >> > + * System software must wait for at least 100ms from the end > >> > + * of a reset of one or more device before it is permitted > >> > + * to issue Configuration Requests to those devices. > >> > + */ > >> > + msleep(200); > >> > + dev_dbg(&dev->dev, "Root Port link has been reset\n"); > >> > + } > >> > + return PCI_ERS_RESULT_RECOVERED; > >> > +} > >> > >> I don't think this function is OK. > >> 1) You don't really reset the 2nd Bus but just checking its status. > >> I think you should have following steps to reset 2nd Bus: > >> > >> a. Assert 2nd Bus Reset > >> b. wait for some time until this message has been broadcasted well > >> c. De-assert 2nd Bus Reset > >> d. wait for Trst time > >> > >> IOW, since we have aer_do_secondary_bus_reset to perform secondary bus > >> reset, why you repeat it again? > >> > >> 2) msleep(200) is too long for kernel. You'd better yield the CPU when > >> sleep. > > > > The firmware first path currently has no reset_link. I want to make a > > minimal change since resetting the link could be considered firmware's > > job in the firmware first path. This change is to just check if SBR is > > set, and bring the link out of reset only if it is in SBR. This way, if > > a another firmware first platform is already resetting the link, it > > won't be done twice. I took the msleep and the code from > > aer_do_sercondary_bus_reset() as you noticed. > > I understand the desire to make a minimal change, and we certainly > don't want to make changes bigger than they need to be. > > However, my goal is really to have clean, maintainable code at the > end. If that means bigger patches where you can't test every affected > platform, that's a risk I'm willing to take. > > In this particular case, we could try to limit the change to just the > platforms you can test, as you've done. But it looks to me like we > actually want to reset the link in *all* cases, regardless of whether > firmware-first is involved. We're handling a fatal error for a > device. That device might be below a PCIe switch (where we currently > reset the link), or it might be directly below a root port (where we > currently don't). Why should those cases be different? I don't think > the device or the driver should be able to tell the difference. > > My *guess* is that this is an oversight in the original code, and that > doing the link reset for both downstream ports and root ports will > make error handling for devices below root ports work better. > > The *last* thing I want is code littered with special cases that are > only there because "they only fix the platforms I could actually > test." It's just about impossible to clean those up later. > > Bjorn That makes sense to me. I will work on a new version of the patch and test it today. -Betty > >> > + > >> > static int find_aer_service_iter(struct device *device, void *data) > >> > { > >> > struct pcie_port_service_driver *service_driver, **drv; > >> > @@ -460,6 +493,9 @@ static pci_ers_result_t reset_link(struct pci_dev *dev) > >> > status = driver->reset_link(udev); > >> > } else if (pci_pcie_type(udev) == PCI_EXP_TYPE_DOWNSTREAM) { > >> > status = default_downstream_reset_link(udev); > >> > + } else if (pci_pcie_type(udev) == PCI_EXP_TYPE_ROOT_PORT && > >> > + pcie_aer_get_firmware_first(udev)) { > >> > + status = default_ff_root_port_reset_link(udev); > >> > } else { > >> > dev_printk(KERN_DEBUG, &dev->dev, > >> > "no link-reset support at upstream device %s\n", > >> > -- > >> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > >> > the body of a message to majordomo@vger.kernel.org > >> > More majordomo info at http://vger.kernel.org/majordomo-info.html > >> > Please read the FAQ at http://www.tux.org/lkml/ > > > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/