Received: by 2002:ac0:a594:0:0:0:0:0 with SMTP id m20-v6csp681421imm; Fri, 11 May 2018 04:52:36 -0700 (PDT) X-Google-Smtp-Source: AB8JxZqt/YlimQPqQ+V5+jLwIIAKRovzaQabpzvcsRJ+3/y/k6sphWyVzR5MWy0PRlJzrYc0vy3G X-Received: by 2002:a62:9099:: with SMTP id q25-v6mr5287454pfk.66.1526039556491; Fri, 11 May 2018 04:52:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1526039556; cv=none; d=google.com; s=arc-20160816; b=er5wMXZrKr2FDsp9u/Dca61KrpmN9S+QKR4mU181s82A7TZcuA3Zza06eHQ2l2iy23 CpcHMvf3SvPI/JTZD2Pm0MNn0D7rOCEJ4ZYsaUi7S5RfHdmyHcE1a/kMsbJ6uVqJsFDB oLUoQmFkQ2TZxqkicQvaW4E987jW4k8j2iF2Gr7VO9Qnk9nB0PNIvJPg1e9N1DEHGGku mVI4cnHiUd6KgUCBOldgyhJULYi3OisP4dwBkrhevQ4zTBVxA+cK/6C0ozbSQGYnwLOq x+y9lumFNDKIakIXdGLlYntRPeXAH+JmWm5Ufmb/apbImcu/kAa1N9F+pUOgw5oBayp9 yaPg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:message-id:references :in-reply-to:subject:to:from:date:content-transfer-encoding :mime-version:dkim-signature:dkim-signature :arc-authentication-results; bh=Dl5VOu1NCg/KgYG/mUMFS0peQ9aG2ejGDvNVVf2/qkk=; b=ZsbtmRn+RbkV3EO6lPrd/wo+HDWyy9Y2gQkv2JXK8eLeblycrkte6OAdHIikQ0W8ya bcdd3Il1cdGsI7UKHoHsiaQlo3nfTnhyuDyaBbbxQs3QJKHqvGLlAIpulfU5oorTuVgl QPmq64g9SECCUmlfvbvKF+zTvB+ePoBYh4zfCVLJCPv+7CRoIOCPrT+2SCfNZyZ6GyMT ufzwEhf0jpWzzLRs6bXDMhi2sRX8wyNHo4o8jpFYQItoE4Bw3k6WI6WUS8deXVJ1twaY SlvC71ZjjCniWwCObiXe3mfuOjDMP+mIb0mSz0q5eaFoWJU8FIJOIqLtnuSp9Q+lQ6xn SGFA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@codeaurora.org header.s=default header.b=TqUo0j0A; dkim=pass header.i=@codeaurora.org header.s=default header.b=Gs1lHLm+; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id t16-v6si3025082pfe.225.2018.05.11.04.52.20; Fri, 11 May 2018 04:52:36 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@codeaurora.org header.s=default header.b=TqUo0j0A; dkim=pass header.i=@codeaurora.org header.s=default header.b=Gs1lHLm+; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752868AbeEKLwM (ORCPT + 99 others); Fri, 11 May 2018 07:52:12 -0400 Received: from smtp.codeaurora.org ([198.145.29.96]:57898 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752507AbeEKLwK (ORCPT ); Fri, 11 May 2018 07:52:10 -0400 Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 779F66079C; Fri, 11 May 2018 11:52:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1526039530; bh=pUal0fYb2nnsxHDCbqjcqm3hO/vnpNmmEuRj3sb7uvE=; h=Date:From:To:Subject:In-Reply-To:References:From; b=TqUo0j0A9NXu0dUDoUg+ffslyLvwN7nmFJ313hINTlHelqjOsEbrSk9GKBG575G14 ZGLkq7Yl9lVH8Fd3Y4KhGNMfURrrolrTRAfBAFEFfqBvEFbjjivRRA9aqyYP1CVU1b 8UW/fmrz39Ujz993L2IW3peddrFI4/uocWhRewSo= X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on pdx-caf-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.8 required=2.0 tests=ALL_TRUSTED,BAYES_00, DKIM_SIGNED,T_DKIM_INVALID autolearn=no autolearn_force=no version=3.4.0 Received: from mail.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.codeaurora.org (Postfix) with ESMTP id CFFFC60541; Fri, 11 May 2018 11:52:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1526039528; bh=pUal0fYb2nnsxHDCbqjcqm3hO/vnpNmmEuRj3sb7uvE=; h=Date:From:To:Subject:In-Reply-To:References:From; b=Gs1lHLm+/kKsfNKu2a22iADncAoQjpW3uomqv3V7tBFp+Suogxsy9pNaYNAYaJnDq F5xOQGOKPl5dDPWMMPMGa6A9EM/HSO/IlGeN4mgbgAr0VPcy/jJwGEBIziYg38ddFk Xtqe9WoPOpFMXG1+4fOzf9lumo6cRXBXXizBvSAM= MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Fri, 11 May 2018 17:22:08 +0530 From: poza@codeaurora.org To: Bjorn Helgaas , Philippe Ombredanne , Thomas Gleixner , Greg Kroah-Hartman , Kate Stewart , linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Dongdong Liu , Keith Busch , Wei Zhang , Sinan Kaya , Timur Tabi Subject: Re: [PATCH v16 8/9] PCI/DPC: Unify and plumb error handling into DPC In-Reply-To: <1526035408-31328-9-git-send-email-poza@codeaurora.org> References: <1526035408-31328-1-git-send-email-poza@codeaurora.org> <1526035408-31328-9-git-send-email-poza@codeaurora.org> Message-ID: X-Sender: poza@codeaurora.org User-Agent: Roundcube Webmail/1.2.5 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2018-05-11 16:13, Oza Pawandeep wrote: > DPC driver implements link_reset callback, and calls > pci_do_fatal_recovery(). > > Which follows standard path of ERR_FATAL recovery. > > Signed-off-by: Oza Pawandeep > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > index 5e8857a..6af7595 100644 > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -354,7 +354,7 @@ static inline resource_size_t > pci_resource_alignment(struct pci_dev *dev, > void pci_enable_acs(struct pci_dev *dev); > > /* PCI error reporting and recovery */ > -void pcie_do_fatal_recovery(struct pci_dev *dev); > +void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service); > void pcie_do_nonfatal_recovery(struct pci_dev *dev); > > bool pcie_wait_for_link(struct pci_dev *pdev, bool active); > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c > b/drivers/pci/pcie/aer/aerdrv_core.c > index fdfc474..36e622d 100644 > --- a/drivers/pci/pcie/aer/aerdrv_core.c > +++ b/drivers/pci/pcie/aer/aerdrv_core.c > @@ -254,7 +254,7 @@ static void handle_error_source(struct pcie_device > *aerdev, > } else if (info->severity == AER_NONFATAL) > pcie_do_nonfatal_recovery(dev); > else if (info->severity == AER_FATAL) > - pcie_do_fatal_recovery(dev); > + pcie_do_fatal_recovery(dev, PCIE_PORT_SERVICE_AER); > } > > #ifdef CONFIG_ACPI_APEI_PCIEAER > @@ -321,7 +321,7 @@ static void aer_recover_work_func(struct > work_struct *work) > if (entry.severity == AER_NONFATAL) > pcie_do_nonfatal_recovery(pdev); > else if (entry.severity == AER_FATAL) > - pcie_do_fatal_recovery(pdev); > + pcie_do_fatal_recovery(pdev, PCIE_PORT_SERVICE_AER); > pci_dev_put(pdev); > } > } > diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c > index 80ec384..5680c13 100644 > --- a/drivers/pci/pcie/dpc.c > +++ b/drivers/pci/pcie/dpc.c > @@ -73,29 +73,31 @@ static void dpc_wait_link_inactive(struct dpc_dev > *dpc) > pcie_wait_for_link(pdev, false); > } > > -static void dpc_work(struct work_struct *work) > +static pci_ers_result_t dpc_reset_link(struct pci_dev *pdev) > { > - struct dpc_dev *dpc = container_of(work, struct dpc_dev, work); > - struct pci_dev *dev, *temp, *pdev = dpc->dev->port; > - struct pci_bus *parent = pdev->subordinate; > - u16 cap = dpc->cap_pos, ctl; > - > - pci_lock_rescan_remove(); > - list_for_each_entry_safe_reverse(dev, temp, &parent->devices, > - bus_list) { > - pci_dev_get(dev); > - pci_dev_set_disconnected(dev, NULL); > - if (pci_has_subordinate(dev)) > - pci_walk_bus(dev->subordinate, > - pci_dev_set_disconnected, NULL); > - pci_stop_and_remove_bus_device(dev); > - pci_dev_put(dev); > - } > - pci_unlock_rescan_remove(); > - > + struct dpc_dev *dpc; > + struct pcie_device *pciedev; > + struct device *devdpc; > + u16 cap, ctl; > + > + /* > + * DPC disables the Link automatically in hardware, so it has > + * already been reset by the time we get here. > + */ > + > + devdpc = pcie_port_find_device(pdev, PCIE_PORT_SERVICE_DPC); > + pciedev = to_pcie_device(devdpc); > + dpc = get_service_data(pciedev); > + cap = dpc->cap_pos; > + > + /* > + * Waiting until the link is inactive, then clearing DPC > + * trigger status to allow the port to leave DPC. > + */ > dpc_wait_link_inactive(dpc); > + > if (dpc->rp_extensions && dpc_wait_rp_inactive(dpc)) > - return; > + return PCI_ERS_RESULT_DISCONNECT; > if (dpc->rp_extensions && dpc->rp_pio_status) { > pci_write_config_dword(pdev, cap + PCI_EXP_DPC_RP_PIO_STATUS, > dpc->rp_pio_status); > @@ -108,6 +110,17 @@ static void dpc_work(struct work_struct *work) > pci_read_config_word(pdev, cap + PCI_EXP_DPC_CTL, &ctl); > pci_write_config_word(pdev, cap + PCI_EXP_DPC_CTL, > ctl | PCI_EXP_DPC_CTL_INT_EN); > + > + return PCI_ERS_RESULT_RECOVERED; > +} > + > +static void dpc_work(struct work_struct *work) > +{ > + struct dpc_dev *dpc = container_of(work, struct dpc_dev, work); > + struct pci_dev *pdev = dpc->dev->port; > + > + /* From DPC point of view error is always FATAL. */ > + pcie_do_fatal_recovery(pdev, PCIE_PORT_SERVICE_DPC); > } > > static void dpc_process_rp_pio_error(struct dpc_dev *dpc) > @@ -288,6 +301,7 @@ static struct pcie_port_service_driver dpcdriver = > { > .service = PCIE_PORT_SERVICE_DPC, > .probe = dpc_probe, > .remove = dpc_remove, > + .reset_link = dpc_reset_link, > }; > > static int __init dpc_service_init(void) > diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c > index 33a16b1..29ff148 100644 > --- a/drivers/pci/pcie/err.c > +++ b/drivers/pci/pcie/err.c > @@ -185,7 +185,7 @@ static pci_ers_result_t default_reset_link(struct > pci_dev *dev) > return PCI_ERS_RESULT_RECOVERED; > } > > -static pci_ers_result_t reset_link(struct pci_dev *dev) > +static pci_ers_result_t reset_link(struct pci_dev *dev, u32 service) > { > struct pci_dev *udev; > pci_ers_result_t status; > @@ -200,7 +200,7 @@ static pci_ers_result_t reset_link(struct pci_dev > *dev) > } > > /* Use the aer driver of the component firstly */ > - driver = pcie_port_find_service(udev, PCIE_PORT_SERVICE_AER); > + driver = pcie_port_find_service(udev, service); > > if (driver && driver->reset_link) { > status = driver->reset_link(udev); > @@ -287,7 +287,7 @@ static pci_ers_result_t > broadcast_error_message(struct pci_dev *dev, > * followed by re-enumeration of devices. > */ > > -void pcie_do_fatal_recovery(struct pci_dev *dev) > +void pcie_do_fatal_recovery(struct pci_dev *dev, u32 service) > { > struct pci_dev *udev; > struct pci_bus *parent; > @@ -313,7 +313,7 @@ void pcie_do_fatal_recovery(struct pci_dev *dev) > pci_dev_put(pdev); > } > > - result = reset_link(udev); > + result = reset_link(udev, service); > > if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { > /* > diff --git a/include/linux/aer.h b/include/linux/aer.h > index 8f87bbe..0c506fe 100644 > --- a/include/linux/aer.h > +++ b/include/linux/aer.h > @@ -14,6 +14,7 @@ > #define AER_NONFATAL 0 > #define AER_FATAL 1 > #define AER_CORRECTABLE 2 > +#define DPC_FATAL 4 > > struct pci_dev; Hi Bjorn, I have addressed all the comments, and I hope we are heading towards closure. I just figure that pcie_do_fatal_recovery (which is getting executed for DPC as well) if (result == PCI_ERS_RESULT_RECOVERED) { if (pcie_wait_for_link(udev, true)) pci_rescan_bus(udev->bus); pci_info(dev, "Device recovery successful\n"); } I have to correct it to if (service==AER && result == PCI_ERS_RESULT_RECOVERED) { if (pcie_wait_for_link(udev, true)) pci_rescan_bus(udev->bus); pci_info(dev, "Device recovery successful\n"); } rest of the things look okay to me. If you have any more comments, I can fix them with this one and hopefully v17 will be the final one. PS: I am going though the code more, and we can have some follow up patches (probably some cleanup) for e.g. pcie_portdrv_slot_reset() checks if (dev->error_state == pci_channel_io_frozen) { dev->state_saved = true; pci_restore_state(dev); pcie_portdrv_restore_config(dev); pci_enable_pcie_error_reporting(dev); } but now since the ERR_FATAL path does not call pcie_portdrv_slot_reset the check is meaning less. besides driver's shut_down callbacks might want to handle pci_channel_io_frozen, but that something left to be discussed later. So, I am not touching dev->error_state anywhere as of now in ERR_FATAL case. Regards, Oza.