Date: Wed, 03 Jan 2018 11:44:31 +0530
From: poza@codeaurora.org
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Philippe Ombredanne, Thomas Gleixner, Greg Kroah-Hartman, Kate Stewart, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Dongdong Liu, Gabriele Paoloni, Keith Busch, Wei Zhang, Sinan Kaya, Timur Tabi
Subject: Re: [PATCH v2 0/4] Address error and recovery for AER and DPC
In-Reply-To: <20180102190215.GC6211@bhelgaas-glaptop.roam.corp.google.com>
References: <1514532259-19383-1-git-send-email-poza@codeaurora.org> <20180102190215.GC6211@bhelgaas-glaptop.roam.corp.google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018-01-03 00:32, Bjorn Helgaas wrote:
> On Fri, Dec 29, 2017 at 12:54:15PM +0530, Oza Pawandeep wrote:
>> This patch set brings in support for DPC and AER to co-exist and not to
>> race for recovery.
>>
>> The current implementation of AER and error message broadcasting to the
>> EP driver is tightly coupled and limited to AER service driver.
>> It is important to factor out broadcasting and other link handling
>> callbacks. So that not only when AER gets triggered, but also when DPC get
>> triggered, or both get triggered simultaneously (for e.g. ERR_FATAL),
>> callbacks are handled appropriately.
>> having modularized the code, the race between AER and DPC is handled
>> gracefully.
>> for e.g.
>> when DPC is active and kicked in, AER should not attempt to do
>> recovery, because DPC takes care of it.
>
> High-level question:
>
> We have some convoluted code in negotiate_os_control() and
> aer_service_init() that (I think) essentially disables AER unless the
> platform firmware grants us permission to use it.
>
> The last implementation note in PCIe r3.1, sec 6.2.10 says
>
>   DPC may be controlled in some configurations by platform firmware
>   and in other configurations by the operating system. DPC
>   functionality is strongly linked with the functionality in Advanced
>   Error Reporting. To avoid conflicts over whether platform firmware
>   or the operating system have control of DPC, it is recommended that
>   platform firmware and operating systems always link the control of
>   DPC to the control of Advanced Error Reporting.
>
> I read that as suggesting that we should enable DPC support in Linux
> if and only if we also enable AER. But I don't see anything in DPC
> that looks like that. Should there be something there? Should DPC be
> restructured so it's enabled and handled inside the AER driver instead
> of being a separate driver?
>
> Bjorn

The whole idea of factoring out the error handling and plugging it back
into DPC is to enable DPC to participate synchronously in the
pcie_port_service_driver hooks.

AER and DPC are both port service drivers, so it makes sense for DPC to
be able to do as much with those callbacks as AER currently does. But
today those callbacks are tightly coupled to the AER driver.

With the callbacks factored out, DPC and AER can each act independently
in their own space and gain more control, and, if needed, both can
synchronize through the callbacks.

Regards,
Oza.
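
The factoring discussed in this thread can be illustrated with a small user-space sketch. It is not code from the patch set or from the kernel: every name below (port_dev, common_err_recover, aer_handle_fatal, dpc_handle_trigger, dpc_release) is hypothetical, and the model assumes only what the thread states, namely that recovery lives in one shared routine and that AER skips recovery while DPC is active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model. None of these names exist in the
 * kernel or in the patch set; they only illustrate the idea. */

enum err_source { SRC_AER, SRC_DPC };

struct port_dev {
    bool dpc_active;            /* true while DPC owns recovery */
    int recoveries;             /* times the shared recovery path ran */
    enum err_source last_owner; /* which service ran it last */
};

/* Shared recovery path, factored out so that either service can
 * call it instead of it being private to AER. */
static void common_err_recover(struct port_dev *dev, enum err_source src)
{
    assert(dev != NULL);
    dev->recoveries++;
    dev->last_owner = src;
}

/* AER entry point: defer if DPC has already taken over recovery.
 * Returns true only when AER itself ran the shared recovery. */
bool aer_handle_fatal(struct port_dev *dev)
{
    assert(dev != NULL);
    if (dev->dpc_active)
        return false;
    common_err_recover(dev, SRC_AER);
    return true;
}

/* DPC entry point: mark DPC as active, then run the same shared
 * recovery routine that AER uses. */
void dpc_handle_trigger(struct port_dev *dev)
{
    assert(dev != NULL);
    dev->dpc_active = true;
    common_err_recover(dev, SRC_DPC);
}

/* Called once DPC has finished, so AER may recover again. */
void dpc_release(struct port_dev *dev)
{
    assert(dev != NULL);
    dev->dpc_active = false;
}
```

In this model, calling aer_handle_fatal() after dpc_handle_trigger() and before dpc_release() returns false and leaves the recovery count unchanged, which is the "AER should not attempt recovery while DPC is active" behaviour described in the cover letter. A real implementation would also need locking around the shared state, which the sketch omits.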