From: Kai-Heng Feng
Date: Fri, 5 Feb 2021 23:17:32 +0800
Subject: Re: [PATCH 1/2] PCI/AER: Disable AER interrupt during suspend
To: Bjorn Helgaas
Cc: Bjorn Helgaas, Russell Currey, "Oliver O'Halloran", Mika Westerberg,
    Lalithambika Krishnakumar, Lu Baolu, Joerg Roedel, Alex Williamson,
    "open list:PCI ENHANCED ERROR HANDLING (EEH) FOR POWERPC",
    "open list:PCI SUBSYSTEM", open list
In-Reply-To: <20210204232758.GA125392@bjorn-Precision-5520>
References: <20210204232758.GA125392@bjorn-Precision-5520>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Feb 5, 2021 at 7:28 AM Bjorn Helgaas wrote:
>
> [+cc Alex]
>
> On Thu, Jan 28, 2021 at 12:09:37PM +0800, Kai-Heng Feng wrote:
> > On Thu, Jan 28, 2021 at 4:51 AM Bjorn Helgaas wrote:
> > > On Thu, Jan 28, 2021 at 01:31:00AM +0800, Kai-Heng Feng wrote:
> > > > Commit 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt
> > > > in hint") enables ACS, and some platforms lose their NVMe after
> > > > resume from firmware:
> > > > [   50.947816] pcieport 0000:00:1b.0: DPC: containment event, status:0x1f01 source:0x0000
> > > > [   50.947817] pcieport 0000:00:1b.0: DPC: unmasked uncorrectable error detected
> > > > [   50.947829] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
> > > > [   50.947830] pcieport 0000:00:1b.0: device [8086:06ac] error status/mask=00200000/00010000
> > > > [   50.947831] pcieport 0000:00:1b.0: [21] ACSViol (First)
> > > > [   50.947841] pcieport 0000:00:1b.0: AER: broadcast error_detected message
> > > > [   50.947843] nvme nvme0: frozen state error detected, reset controller
> > > >
> > > > It happens right after ACS gets enabled during resume.
> > > >
> > > > To prevent that from happening, disable the AER interrupt on
> > > > system suspend and re-enable it on resume.
> > >
> > > Lots of questions here.  Maybe this is what we'll end up doing, but
> > > I am curious about why the error is reported in the first place.
> > >
> > > Is this a consequence of the link going down and back up?
> >
> > Could be. From the observations, it only happens when firmware
> > suspend (S3) is used.
> > Maybe it happens when the device gets powered up, but I don't have
> > the equipment to debug at the hardware level.
> >
> > If we use a non-firmware suspend method, enabling ACS after resume
> > won't trip AER and DPC.
> >
> > > Is it a consequence of the device doing a DMA when it shouldn't?
> >
> > If it were doing DMA while suspending, the same error should also
> > happen after the NVMe is suspended and before the PCIe port
> > suspends.
> > Furthermore, if a non-firmware suspend method is used, there's no
> > such issue, so it's less likely to be any DMA operation.
> >
> > > Are we doing something in the wrong order during suspend?  Or
> > > maybe resume, since I assume the error is reported during resume?
> >
> > Yes, the error is reported during resume. The suspend/resume order
> > seems fine, as non-firmware suspend doesn't have this issue.
>
> I really feel like we need a better understanding of what's going on
> here.  Disabling the AER interrupt is like closing our eyes and
> pretending that because we don't see it, it didn't happen.
>
> An ACS error is triggered by a DMA, right?  I'm assuming an MMIO
> access from the CPU wouldn't trigger this error.  And it sounds like
> the error is triggered before we even start running the driver after
> resume.
>
> If we're powering up an NVMe device from D3cold and it DMAs before
> the driver touches it, something would be seriously broken.  I doubt
> that's what's happening.  Maybe a device could resume some previously
> programmed DMA after powering up from D3hot.

I am not that familiar with PCIe ACS/AER/DPC, so I can't really answer
the questions you raised. That the PCIe spec doesn't define the
suspend/resume order is also not helping here.

However, I really think it's a system firmware issue. I've seen some
suspend-to-idle platforms where the NVMe can reach D3cold, and those
are unaffected.

> Or maybe the error occurred on suspend, like if the device wasn't
> quiesced or something, but we didn't notice it until resume?  The
> AER error status bits are RW1CS, which means they can be preserved
> across hot/warm/cold resets.
>
> Can you instrument the code to see whether the AER error status bit
> is set before enabling ACS?  I'm not sure that merely enabling ACS (I
> assume you mean pci_std_enable_acs(), where we write PCI_ACS_CTRL)
> should cause an interrupt for a previously-logged error.  I suspect
> that could happen when enabling *AER*, but I wouldn't think it would
> happen when enabling *ACS*.

Diff to print the AER status:
https://bugzilla.kernel.org/show_bug.cgi?id=209149#c11

And dmesg:
https://bugzilla.kernel.org/show_bug.cgi?id=209149#c12

It looks like the reads before suspend and after resume are both fine.

> Does this error happen on multiple machines from different vendors?
> Wondering if it could be a BIOS issue, e.g., BIOS not cleaning up
> after it did something to cause an error.

AFAIK, systems from both HP and Dell are affected. I was told that
Intel's reference platform uses suspend-to-idle, but vendors changed
the sleep method to S3 to get lower power consumption in order to pass
regulation.

Kai-Heng

>
> > > If we *do* take the error, why doesn't DPC recovery work?
> >
> > It works for the root port, but not for the NVMe drive:
> > [   50.947816] pcieport 0000:00:1b.0: DPC: containment event, status:0x1f01 source:0x0000
> > [   50.947817] pcieport 0000:00:1b.0: DPC: unmasked uncorrectable error detected
> > [   50.947829] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
> > [   50.947830] pcieport 0000:00:1b.0: device [8086:06ac] error status/mask=00200000/00010000
> > [   50.947831] pcieport 0000:00:1b.0: [21] ACSViol (First)
> > [   50.947841] pcieport 0000:00:1b.0: AER: broadcast error_detected message
> > [   50.947843] nvme nvme0: frozen state error detected, reset controller
> > [   50.948400] ACPI: EC: event unblocked
> > [   50.948432] xhci_hcd 0000:00:14.0: PME# disabled
> > [   50.948444] xhci_hcd 0000:00:14.0: enabling bus mastering
> > [   50.949056] pcieport 0000:00:1b.0: PME# disabled
> > [   50.949068] pcieport 0000:00:1c.0: PME# disabled
> > [   50.949416] e1000e 0000:00:1f.6: PME# disabled
> > [   50.949463] e1000e 0000:00:1f.6: enabling bus mastering
> > [   50.951606] sd 0:0:0:0: [sda] Starting disk
> > [   50.951610] nvme 0000:01:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> > [   50.951730] nvme nvme0: Removing after probe failure status: -19
> > [   50.952360] nvme nvme0: failed to set APST feature (-19)
> > [   50.971136] snd_hda_intel 0000:00:1f.3: PME# disabled
> > [   51.089330] pcieport 0000:00:1b.0: AER: broadcast resume message
> > [   51.089345] pcieport 0000:00:1b.0: AER: device recovery successful
> >
> > But I think why recovery doesn't work for NVMe is for another
> > discussion...
> > Kai-Heng
> > >
> > > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=209149
> > > > Fixes: 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in hint")
> > > > Signed-off-by: Kai-Heng Feng
> > > > ---
> > > >  drivers/pci/pcie/aer.c | 18 ++++++++++++++++++
> > > >  1 file changed, 18 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > > > index 77b0f2c45bc0..0e9a85530ae6 100644
> > > > --- a/drivers/pci/pcie/aer.c
> > > > +++ b/drivers/pci/pcie/aer.c
> > > > @@ -1365,6 +1365,22 @@ static int aer_probe(struct pcie_device *dev)
> > > >  	return 0;
> > > >  }
> > > >
> > > > +static int aer_suspend(struct pcie_device *dev)
> > > > +{
> > > > +	struct aer_rpc *rpc = get_service_data(dev);
> > > > +
> > > > +	aer_disable_rootport(rpc);
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static int aer_resume(struct pcie_device *dev)
> > > > +{
> > > > +	struct aer_rpc *rpc = get_service_data(dev);
> > > > +
> > > > +	aer_enable_rootport(rpc);
> > > > +	return 0;
> > > > +}
> > > > +
> > > >  /**
> > > >   * aer_root_reset - reset Root Port hierarchy, RCEC, or RCiEP
> > > >   * @dev: pointer to Root Port, RCEC, or RCiEP
> > > > @@ -1437,6 +1453,8 @@ static struct pcie_port_service_driver aerdriver = {
> > > >  	.service	= PCIE_PORT_SERVICE_AER,
> > > >
> > > >  	.probe		= aer_probe,
> > > > +	.suspend	= aer_suspend,
> > > > +	.resume		= aer_resume,
> > > >  	.remove		= aer_remove,
> > > >  };
> > > >
> > > > --
> > > > 2.29.2