From: Oza Pawandeep
To: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman, Kate Stewart, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya, Timur Tabi
Cc: Oza Pawandeep
Subject: [PATCH v15 3/9] PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices
Date: Thu, 3 May 2018 01:03:52 -0400
Message-Id: <1525323838-1735-4-git-send-email-poza@codeaurora.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1525323838-1735-1-git-send-email-poza@codeaurora.org>
References: <1525323838-1735-1-git-send-email-poza@codeaurora.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

This patch changes how ERR_FATAL is handled: the affected devices are
removed, the link is reset, and the devices are then re-enumerated.

Errors are now handled as follows:

  ERR_NONFATAL => call driver recovery entry points
  ERR_FATAL    => remove and re-enumerate

Please refer to Documentation/PCI/pci-error-recovery.txt for more details.

Signed-off-by: Oza Pawandeep

diff --git a/drivers/pci/pcie/aer/aerdrv.c b/drivers/pci/pcie/aer/aerdrv.c
index 779b387..206f590 100644
--- a/drivers/pci/pcie/aer/aerdrv.c
+++ b/drivers/pci/pcie/aer/aerdrv.c
@@ -330,6 +330,13 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 	reg32 |= ROOT_PORT_INTR_ON_MESG_MASK;
 	pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, reg32);
 
+	/*
+	 * This function is called only on ERR_FATAL now, and since
+	 * the pci_report_resume is called only in ERR_NONFATAL case,
+	 * the clearing part has to be taken care of here.
+	 */
+	aer_error_resume(dev);
+
 	return PCI_ERS_RESULT_RECOVERED;
 }
 
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 0ea5acc..655d4e8 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include "aerdrv.h"
+#include "../../pci.h"
 
 #define PCI_EXP_AER_FLAGS	(PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE | \
 				 PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE)
@@ -474,6 +475,44 @@ static pci_ers_result_t reset_link(struct pci_dev *dev)
 	return status;
 }
 
+static pci_ers_result_t do_fatal_recovery(struct pci_dev *dev, int severity)
+{
+	struct pci_dev *udev;
+	struct pci_bus *parent;
+	struct pci_dev *pdev, *temp;
+	pci_ers_result_t result = PCI_ERS_RESULT_RECOVERED;
+
+	if (severity == AER_FATAL)
+		pci_cleanup_aer_uncorrect_error_status(dev);
+
+	if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
+		udev = dev;
+	else
+		udev = dev->bus->self;
+
+	parent = udev->subordinate;
+	pci_lock_rescan_remove();
+	list_for_each_entry_safe_reverse(pdev, temp, &parent->devices,
+					 bus_list) {
+		pci_dev_get(pdev);
+		pci_dev_set_disconnected(pdev, NULL);
+		if (pci_has_subordinate(pdev))
+			pci_walk_bus(pdev->subordinate,
+				     pci_dev_set_disconnected, NULL);
+		pci_stop_and_remove_bus_device(pdev);
+		pci_dev_put(pdev);
+	}
+
+	result = reset_link(udev);
+	if (result == PCI_ERS_RESULT_RECOVERED)
+		if (pcie_wait_for_link(udev, true))
+			pci_rescan_bus(udev->bus);
+
+	pci_unlock_rescan_remove();
+
+	return result;
+}
+
 /**
  * do_recovery - handle nonfatal/fatal error recovery process
  * @dev: pointer to a pci_dev data structure of agent detecting an error
@@ -485,11 +524,15 @@ static pci_ers_result_t reset_link(struct pci_dev *dev)
  */
 static void do_recovery(struct pci_dev *dev, int severity)
 {
-	pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED;
+	pci_ers_result_t status;
 	enum pci_channel_state state;
 
-	if (severity == AER_FATAL)
-		state = pci_channel_io_frozen;
+	if (severity == AER_FATAL) {
+		status = do_fatal_recovery(dev, severity);
+		if (status != PCI_ERS_RESULT_RECOVERED)
+			goto failed;
+		return;
+	}
 	else
 		state = pci_channel_io_normal;
 
@@ -498,12 +541,6 @@ static void do_recovery(struct pci_dev *dev, int severity)
 				"error_detected",
 				report_error_detected);
 
-	if (severity == AER_FATAL) {
-		result = reset_link(dev);
-		if (result != PCI_ERS_RESULT_RECOVERED)
-			goto failed;
-	}
-
 	if (status == PCI_ERS_RESULT_CAN_RECOVER)
 		status = broadcast_error_message(dev,
 				state,
-- 
2.7.4
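
For readers who want to trace the new control flow outside the kernel, here
is a minimal userspace sketch of the severity split described in the commit
message. It is not the kernel implementation: every name in it
(mock_recovery, mock_fatal_recovery, mock_reset_link, SEV_*, RES_*, trace)
is hypothetical, and the single link_ok flag stands in for both the
reset_link() result and the pcie_wait_for_link() check. It only records the
order of the steps.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's severity and result codes. */
enum severity { SEV_NONFATAL, SEV_FATAL };
enum result { RES_RECOVERED, RES_DISCONNECT };

/* Records the order of the simulated recovery steps. */
static char trace[128];

static void step(const char *name)
{
	if (trace[0] != '\0')
		strcat(trace, ",");
	strcat(trace, name);
}

/* Simulated link reset; link_ok selects success or failure (assumption). */
static enum result mock_reset_link(int link_ok)
{
	step("reset_link");
	return link_ok ? RES_RECOVERED : RES_DISCONNECT;
}

/* Fatal path: remove devices, reset the link, rescan only on success. */
static enum result mock_fatal_recovery(int link_ok)
{
	enum result res;

	step("remove");
	res = mock_reset_link(link_ok);
	if (res == RES_RECOVERED)
		step("rescan");
	return res;
}

/*
 * Dispatch on severity: fatal errors take the remove/reset/rescan path and
 * return early; nonfatal errors go to the driver recovery callbacks.
 */
enum result mock_recovery(enum severity sev, int link_ok)
{
	trace[0] = '\0';
	if (sev == SEV_FATAL)
		return mock_fatal_recovery(link_ok);
	step("driver_callbacks");
	return RES_RECOVERED;
}
```

Calling mock_recovery(SEV_FATAL, 1) leaves "remove,reset_link,rescan" in
trace. With link_ok set to 0 the rescan step is skipped and the failure
result is returned, which corresponds to the "goto failed" branch in
do_recovery().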