Date: Sun, 15 Apr 2018 22:17:26 -0500
From: Bjorn Helgaas
To: Sinan Kaya
Cc: Keith Busch, Oza Pawandeep, Bjorn Helgaas, Philippe Ombredanne,
 Thomas Gleixner, Greg Kroah-Hartman, Kate Stewart,
 linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Dongdong Liu,
 Wei Zhang, Timur Tabi, Alex Williamson
Subject: Re: [PATCH v13 6/6] PCI/DPC: Do not do recovery for hotplug enabled system
Message-ID: <20180416031726.GB158153@bhelgaas-glaptop.roam.corp.google.com>
References: <20180410210349.GG54986@bhelgaas-glaptop.roam.corp.google.com>
 <13efe2e8-74c8-acb4-ec58-f79b14a1f182@codeaurora.org>
 <20180412140648.GD145698@bhelgaas-glaptop.roam.corp.google.com>
 <20180412143954.GB4810@localhost.localdomain>
 <20180412150231.GD4810@localhost.localdomain>
 <20180412170911.GA6424@localhost.localdomain>

On Sat, Apr 14, 2018 at 11:53:17AM -0400, Sinan Kaya wrote:
> You indicated that you want to unify the AER and DPC behavior. Let's
> settle on what we want to do one more time. We have been going forth
> and back on the direction.

My thinking is that as much as possible, similar events should be
handled similarly, whether the mechanism is AER, DPC, EEH, etc.
Ideally, drivers shouldn't have to be aware of which mechanism is in
use.

Error recovery includes conventional PCI as well, but right now I
think we're only concerned with PCIe.  The following error types are
from PCIe r4.0, sec 6.2.2:

  ERR_COR
    Corrected by hardware with no software intervention.  Software
    involved for logging only.

    Handled by AER via pci_error_handlers; DPC is never involved.

    Link is unaffected.

  ERR_NONFATAL
    A transaction is unreliable but the link is fully functional.

    If DPC is not supported, handled by AER via pci_error_handlers and
    the link is unaffected.

    If DPC supported, handled by DPC (because we set
    PCI_EXP_DPC_CTL_EN_NONFATAL) via remove/re-enumerate.

  ERR_FATAL
    The link is unreliable.

    If DPC is not supported, handled by AER via pci_error_handlers and
    the link is reset.

    If DPC supported, handled by DPC via remove/re-enumerate.

It doesn't seem right to me that we handle both ERR_NONFATAL and
ERR_FATAL events differently if we happen to have DPC support in a
switch.

Maybe we should consider triggering DPC only on ERR_FATAL?  That would
keep DPC out of the ERR_NONFATAL cases.

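The matrix above, and the effect of triggering DPC only on ERR_FATAL,
can be sketched roughly as follows.  This is plain C for illustration,
not kernel code: the enum, struct, and function names are all made up
here and don't correspond to anything in drivers/pci.  It only encodes
the cases listed above.

```c
#include <stdbool.h>

/* All names below are hypothetical, for illustration only. */
enum err_type { ERR_COR, ERR_NONFATAL, ERR_FATAL };
enum handler { BY_AER, BY_DPC };
enum link_effect { LINK_UNAFFECTED, LINK_RESET, LINK_REMOVE_REENUM };

struct outcome {
	enum handler handler;	/* which mechanism handles the error */
	enum link_effect link;	/* what happens to the link/devices */
};

/* Current behavior, as described in the list above. */
static struct outcome current_policy(enum err_type type, bool dpc_supported)
{
	/* Default: AER via pci_error_handlers, link unaffected. */
	struct outcome o = { BY_AER, LINK_UNAFFECTED };

	switch (type) {
	case ERR_COR:
		/* DPC is never involved. */
		break;
	case ERR_NONFATAL:
		if (dpc_supported) {
			o.handler = BY_DPC;
			o.link = LINK_REMOVE_REENUM;
		}
		break;
	case ERR_FATAL:
		if (dpc_supported) {
			o.handler = BY_DPC;
			o.link = LINK_REMOVE_REENUM;
		} else {
			o.link = LINK_RESET;
		}
		break;
	}
	return o;
}

/*
 * Variant where DPC triggers only on ERR_FATAL: ERR_NONFATAL is
 * handled as if DPC were not supported.
 */
static struct outcome dpc_fatal_only_policy(enum err_type type,
					    bool dpc_supported)
{
	if (type == ERR_NONFATAL)
		return current_policy(type, false);
	return current_policy(type, dpc_supported);
}
```

With the second function, the ERR_NONFATAL outcome no longer depends
on dpc_supported; ERR_FATAL still does.
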
For ERR_FATAL, maybe we should bite the bullet and use
remove/re-enumerate for AER as well as for DPC.  That would be painful
for higher-level software, but if we're willing to accept that pain
for new systems that support DPC, maybe life would be better overall
if it worked the same way on systems without DPC?

Bjorn