From: Oza Pawandeep
To: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman, Kate Stewart, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya, Timur Tabi
Cc: Oza Pawandeep
Subject: [PATCH v15 0/9] Address error and recovery for AER and DPC
Date: Thu, 3 May 2018 01:03:49 -0400
Message-Id: <1525323838-1735-1-git-send-email-poza@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

This patch set
brings in error handling support for DPC.

The current implementation of AER and of error message broadcasting to the
endpoint (EP) driver is tightly coupled and limited to the AER service
driver. It is important to factor out broadcasting and the other link
handling callbacks, so that the callbacks are handled appropriately not
only when AER is triggered, but also when DPC is triggered (e.g. for
ERR_FATAL).

The goal of the patch set is that DPC should handle errors and recovery the
same way AER does, because both ultimately attempt recovery in one way or
another. For that, the error handling and recovery framework has to be
loosely coupled.

This gives uniformity and transparency to the error handling agents, such
as AER and DPC, with respect to recovery and error handling.

So this patch set tries to unify a lot of things between the error agents
and make them behave in a well-defined way, be it error (FATAL, NON_FATAL)
handling or recovery.

FATAL errors are handled with a remove/reset_link/re-enumerate sequence,
while NON_FATAL errors follow the default path.
Documentation/PCI/pci-error-recovery.txt has more details.

Changes since v14:
    Bjorn's comments addressed.
    > simplified the patch set, and moved AER_FATAL handling to the beginning.
    > rebased the code onto 4.17-rc1.

Changes since v13:
    Bjorn's comments addressed.
    > handle FATAL errors by removing devices, followed by re-enumeration.
    > changes in AER and DPC along with the required Documentation.

Changes since v12:
    Bjorn's and Keith's comments addressed.
    > made DPC and AER error handling identical
    > handled the cases for hotplug-enabled systems differently.

Changes since v11:
    Bjorn's comments addressed.
    > renamed pcie-err.c to err.c
    > removed EXPORT_SYMBOL
    > made a generic find_service function in the port driver.
    > removed the mutex patch, as there is no need for a mutex in
      pcie_do_recovery
    > brought in DPC_FATAL in aer.h
    > so now all the error codes (AER and DPC) are unified in aer.h

Changes since v10:
    Christoph Hellwig's, David Laight's and Randy Dunlap's comments addressed.
    > renamed pci_do_recovery to pcie_do_recovery
    > removed inner braces in conditional statements.
    > restructured the code in pci_wait_for_link
    > EXPORT_SYMBOL_GPL

Changes since v9:
    Sinan's comments addressed.
    > removed the unnecessary variable "bool active = true;".

Changes since v8:
    Fixed Kbuild errors.

Changes since v7:
    Rebased the code on pci master
    > https://kernel.googlesource.com/pub/scm/linux/kernel/git/helgaas/pci

Changes since v6:
    Sinan's and Stefan's comments implemented.
    > reordered patches 6 and 7
    > cleaned up

Changes since v5:
    Sinan's and Keith's comments incorporated.
    > made a separate patch for the mutex
    > unified error reporting codes into drivers/pci/pci.h
    > got rid of wait link active/inactive and made a generic function in
      drivers/pci/pci.c

Changes since v4:
    Bjorn's comments incorporated.
    > Renamed only do_recovery.
    > moved things more locally to drivers/pci/pci.h

Changes since v3:
    Bjorn's comments incorporated.
    > Made a separate patch renaming the generic pci_err.c
    > Introduced pci_err.h to contain all the error types and recovery
    > removed all the dependencies on pci.h

Changes since v2:
    Based on feedback from Keith:
    "
    When DPC is triggered due to receipt of an uncorrectable error Message,
    the Requester ID from the Message is recorded in the DPC Error Source ID
    register and that Message is discarded and not forwarded Upstream.
    "
    Removed the patch where AER checks if the DPC service is active

Changes since v1:
    Kbuild errors fixed:
    > pci_find_dpc_dev made static
    > ras_event.h updated
    > pci_find_aer_service call with CONFIG check
    > pci_find_dpc_service call with CONFIG check

Oza Pawandeep (9):
  PCI: Unify wait for link active into generic PCI
  pci-error-recovery: Add AER_FATAL handling
  PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices
  PCI/AER: Rename error recovery to generic PCI naming
  PCI/AER: Factor out error reporting from AER
  PCI/PORTDRV: Implement generic find service
  PCI/PORTDRV: Implement generic find device
  PCI/DPC: Unify and plumb error handling into DPC
  PCI/DPC: Disable ERR_NONFATAL and enable ERR_FATAL for DPC

 Documentation/PCI/pci-error-recovery.txt |  35 ++-
 drivers/pci/hotplug/pciehp_hpc.c         |  20 +-
 drivers/pci/pci.c                        |  29 +++
 drivers/pci/pci.h                        |   4 +
 drivers/pci/pcie/Makefile                |   2 +-
 drivers/pci/pcie/aer/aerdrv.c            |   2 +
 drivers/pci/pcie/aer/aerdrv.h            |  30 ---
 drivers/pci/pcie/aer/aerdrv_core.c       | 317 +--------------------------
 drivers/pci/pcie/dpc.c                   |  58 +++--
 drivers/pci/pcie/err.c                   | 374 +++++++++++++++++++++++++++++++
 drivers/pci/pcie/portdrv.h               |   4 +
 drivers/pci/pcie/portdrv_core.c          |  67 ++++++
 include/linux/aer.h                      |   1 +
 include/uapi/linux/pci_regs.h            |   1 +
 14 files changed, 540 insertions(+), 404 deletions(-)
 create mode 100644 drivers/pci/pcie/err.c

-- 
2.7.4