Date: Thu, 26 Apr 2018 11:00:52 +0530
From: poza@codeaurora.org
To: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman,
 Kate Stewart, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
 Dongdong Liu, Keith Busch, Wei Zhang, Sinan Kaya, Timur Tabi
Subject: Re: [PATCH v14 0/9] Address error and recovery for AER and DPC
In-Reply-To: <1524496993-29799-1-git-send-email-poza@codeaurora.org>
References: <1524496993-29799-1-git-send-email-poza@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018-04-23 20:53, Oza Pawandeep wrote:
> This patch set brings in error handling support for DPC.
>
> The current implementation of AER and error message broadcasting to the
> EP driver is tightly coupled and limited to the AER service driver.
> It is important to factor out broadcasting and other link handling
> callbacks, so that callbacks are handled appropriately not only when AER
> gets triggered, but also when DPC gets triggered (e.g. for ERR_FATAL).
>
> The goal of the patch-set is:
> DPC should handle error handling and recovery similarly to AER, because
> ultimately both attempt recovery in one way or another, and for that
> the error handling and recovery framework has to be loosely coupled.
>
> It achieves uniformity and transparency for the error handling agents,
> such as AER and DPC, with respect to recovery and error handling.
>
> So, this patch-set tries to unify a lot of things between error agents
> and make them behave in a well-defined way, be it error (FATAL,
> NON_FATAL) handling or recovery.
>
> FATAL errors are handled with a remove/reset_link/re-enumerate
> sequence, while NON_FATAL errors follow the default path.
> Documentation/PCI/pci-error-recovery.txt talks more on that.
>
> Changes since v13:
>     Bjorn's comments addressed.
>     > handle FATAL errors with remove devices followed by
>       re-enumeration.
>     > changes in AER and DPC along with required Documentation.
> Changes since v12:
>     Bjorn's and Keith's comments addressed.
>     > Made DPC and AER error handling identical
>     > handled cases for hotplug enabled system differently.
> Changes since v11:
>     Bjorn's comments addressed.
>     > rename pcie-err.c to err.c
>     > removed EXPORT_SYMBOL
>     > made generic find_service function in port driver.
>     > removed mutex patch as no need to have mutex in pcie_do_recovery
>     > brought in DPC_FATAL in aer.h
>     > so now all the error codes (AER and DPC) are unified in aer.h
> Changes since v10:
>     Christoph Hellwig's, David Laight's and Randy Dunlap's
>     comments addressed.
>     > renamed pci_do_recovery to pcie_do_recovery
>     > removed inner braces in conditional statements.
>     > restructured the code in pci_wait_for_link
>     > EXPORT_SYMBOL_GPL
> Changes since v9:
>     Sinan's comments addressed.
>     > bool active = true; unnecessary variable removed.
> Changes since v8:
>     Fixed Kbuild errors.
> Changes since v7:
>     Rebased the code on pci master
>     > https://kernel.googlesource.com/pub/scm/linux/kernel/git/helgaas/pci
> Changes since v6:
>     Sinan's and Stefan's comments implemented.
>     > reordered patch 6 and 7
>     > cleaned up
> Changes since v5:
>     Sinan's and Keith's comments incorporated.
>     > made separate patch for mutex
>     > unified error reporting codes into drivers/pci/pci.h
>     > got rid of wait link active/inactive and
>       made generic function in drivers/pci/pci.c
> Changes since v4:
>     Bjorn's comments incorporated.
>     > Renamed only do_recovery.
>     > moved the things more locally to drivers/pci/pci.h
> Changes since v3:
>     Bjorn's comments incorporated.
>     > Made separate patch renaming generic pci_err.c
>     > Introduce pci_err.h to contain all the error types and recovery
>     > removed all the dependencies on pci.h
> Changes since v2:
>     Based on feedback from Keith:
>     "
>     When DPC is triggered due to receipt of an uncorrectable error
>     Message, the Requester ID from the Message is recorded in the DPC
>     Error Source ID register and that Message is discarded and not
>     forwarded Upstream.
>     "
>     Removed the patch where AER checks if DPC service is active
> Changes since v1:
>     Kbuild errors fixed:
>     > pci_find_dpc_dev made static
>     > ras_event.h updated
>     > pci_find_aer_service call with CONFIG check
>     > pci_find_dpc_service call with CONFIG check
>
> Oza Pawandeep (9):
>   PCI/AER: Rename error recovery to generic PCI naming
>   PCI/AER: Factor out error reporting from AER
>   PCI/PORTDRV: Implement generic find service
>   PCI/PORTDRV: Implement generic find device
>   PCI/DPC: Unify and plumb error handling into DPC
>   PCI: Unify wait for link active into generic PCI
>   PCI/DPC: Disable ERR_NONFATAL for DPC
>   PCI/AER/DPC: Align FATAL error handling for AER and DPC
>   pci-error-recovery: Add AER_FATAL handling
>
>  Documentation/PCI/pci-error-recovery.txt |  35 ++-
>  drivers/pci/hotplug/pciehp_hpc.c         |  20 +-
>  drivers/pci/pci.c                        |  30 +++
>  drivers/pci/pci.h                        |   5 +
>  drivers/pci/pcie/Makefile                |   2 +-
>  drivers/pci/pcie/aer/aerdrv.c            |   2 +
>  drivers/pci/pcie/aer/aerdrv.h            |  30 ---
>  drivers/pci/pcie/aer/aerdrv_core.c       | 317 +-------------------------
>  drivers/pci/pcie/err.c                   | 374 +++++++++++++++++++++++++++++++
>  drivers/pci/pcie/pcie-dpc.c              |  63 +++---
>  drivers/pci/pcie/portdrv.h               |   4 +
>  drivers/pci/pcie/portdrv_core.c          |  69 ++++++
>  include/linux/aer.h                      |   2 +
>  include/uapi/linux/pci_regs.h            |   3 +-
>  14 files changed, 552 insertions(+), 404 deletions(-)
>  create mode 100644 drivers/pci/pcie/err.c

Hi Bjorn,

I know I need to rebase this whole patch-set onto 4.17 now.
But before I do that, could you please review and comment?

Regards,
Oza.
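
For illustration only, and not code from the series: the quoted cover letter
says FATAL errors go through a remove/reset_link/re-enumerate sequence while
NON_FATAL errors follow the default path. A minimal user-space sketch of that
split could look like the following. Every name here (err_severity,
recovery_steps, the step strings) is hypothetical and only stands in for the
real kernel logic.

```c
#include <stddef.h>

/* Hypothetical severity codes, for illustration only. The cover letter
 * says the real AER and DPC error codes are unified in aer.h. */
enum err_severity { ERR_SEV_NONFATAL, ERR_SEV_FATAL };

/* Step names taken from the cover letter's wording; they are labels,
 * not kernel function names. */
static const char *const fatal_steps[] = {
    "remove", "reset_link", "re-enumerate"
};
static const char *const nonfatal_steps[] = {
    "default path"
};

/* Return the ordered recovery steps for a severity and store the number
 * of steps in *count. The same table would apply regardless of whether
 * AER or DPC reported the error, which is the uniformity the cover
 * letter is after. */
const char *const *recovery_steps(enum err_severity sev, size_t *count)
{
    if (sev == ERR_SEV_FATAL) {
        *count = sizeof(fatal_steps) / sizeof(fatal_steps[0]);
        return fatal_steps;
    }
    *count = sizeof(nonfatal_steps) / sizeof(nonfatal_steps[0]);
    return nonfatal_steps;
}
```

A caller would pass the severity and iterate over the returned steps in
order. A table is used here only to keep the sketch short; it says nothing
about how pcie_do_recovery is structured in the actual patches.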