Date: Fri, 23 Jul 2021 06:24:22 +0100
From: Christoph Hellwig
To: Bjorn Helgaas
Cc: Kai-Heng
 Feng, Joerg Roedel,
 "open list:PCI ENHANCED ERROR HANDLING (EEH) FOR POWERPC",
 "open list:PCI SUBSYSTEM", open list, Lalithambika Krishnakumar,
 Alex Williamson, Oliver O'Halloran, Bjorn Helgaas, Mika Westerberg,
 Lu Baolu
Subject: Re: [PATCH 1/2] PCI/AER: Disable AER interrupt during suspend
References: <20210722222351.GA354095@bjorn-Precision-5520>
In-Reply-To: <20210722222351.GA354095@bjorn-Precision-5520>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 22, 2021 at 05:23:51PM -0500, Bjorn Helgaas wrote:
> Marking both of these as "not applicable" for now because I don't
> think we really understand what's going on.
>
> Apparently a DMA occurs during suspend or resume and triggers an ACS
> violation.  I don't think such a DMA should occur in the first
> place.
>
> Or maybe, since you say the problem happens right after ACS is enabled
> during resume, we're doing the ACS enable incorrectly?  Although I
> would think we should not be doing DMA at the same time we're enabling
> ACS, either.
>
> If this really is a system firmware issue, both HP and Dell should
> have the knowledge and equipment to figure out what's going on.

DMA on resume sounds really odd.  OTOH the below mentioned case of a DMA
during suspend seems very likely in some setups.  NVMe has the concept of
a host memory buffer (HMB) that allows the PCIe device to use arbitrary
host memory for internal purposes.  Combine this with the "Storage D3"
misfeature in modern x86 platforms that forces a slot into D3cold without
consulting the driver first and you'd see symptoms like this.  Another
case would be the NVMe equivalent of the AER, which could lead to a
completion without host activity.

We now have quirks in the ACPI layer and NVMe to fully shut down the NVMe
controllers on these messed-up systems with the "Storage D3" misfeature,
which should avoid such "spurious" DMAs at the cost of wearing out the
device much faster.
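
To make the trade-off concrete, here is a rough userspace model of the
decision described above.  It is only a sketch and not the kernel code:
every name in it (ctrl_model, pick_suspend_action, spurious_dma_possible)
is made up for illustration, and it assumes the only inputs that matter
are whether the platform forces D3cold behind the driver's back and
whether the controller has an HMB or an outstanding asynchronous event
request.

```c
#include <stdbool.h>

/* Hypothetical model for illustration only; none of these names exist
 * in the kernel. */
enum suspend_action {
	SUSPEND_KEEP_POWER_STATE,	/* low-power state, HMB stays mapped */
	SUSPEND_FULL_SHUTDOWN,		/* tear everything down, more device wear */
};

struct ctrl_model {
	bool platform_forces_d3cold;	/* "Storage D3" style firmware behaviour */
	bool uses_host_mem_buffer;	/* controller was given an HMB */
	bool async_event_pending;	/* async event request still outstanding */
};

/* Quirked platforms get a full shutdown; everything else keeps the
 * cheaper low-power suspend. */
static enum suspend_action pick_suspend_action(const struct ctrl_model *c)
{
	if (c->platform_forces_d3cold)
		return SUSPEND_FULL_SHUTDOWN;
	return SUSPEND_KEEP_POWER_STATE;
}

/* In this model a fully shut down controller never touches host memory.
 * Otherwise an HMB or a pending async event lets the device issue DMA
 * without any host activity. */
static bool spurious_dma_possible(const struct ctrl_model *c,
				  enum suspend_action action)
{
	if (action == SUSPEND_FULL_SHUTDOWN)
		return false;
	return c->uses_host_mem_buffer || c->async_event_pending;
}
```

With platform_forces_d3cold set, the model always picks the full
shutdown, so spurious_dma_possible() returns false even for a controller
with an HMB.  Keeping such a controller in the low-power path instead is
the combination that would produce the symptoms discussed in this thread.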