Received: by 2002:a05:6a10:af89:0:0:0:0 with SMTP id iu9csp1373719pxb; Sat, 29 Jan 2022 04:21:10 -0800 (PST) X-Google-Smtp-Source: ABdhPJwYLKVGvYCm25Z//Upfj62fymxsr9zxIYlbdXNZJqQmkEe1IBHI/ERkHJylLoDYTQha1gbp X-Received: by 2002:a17:907:2d92:: with SMTP id gt18mr10080552ejc.579.1643458869927; Sat, 29 Jan 2022 04:21:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1643458869; cv=none; d=google.com; s=arc-20160816; b=Kz8ymSgSBBGCFHmw3ORDkNuBJ/XJUTk+94K671FUcVvA9wT9KpAqfYJFRMGd8ZyS5L mf3gerX+n+4db2lXJwGCaS+E7I02oX8zBcN0mTyzEWUrP0zkxOoj0LzP5gV7eRRKWWjw e8PrXOcfcpm/3rKq83eNdWc8buGFFlOpmj9Vcd6t/b7h9nRmYPvGiDyARiEhRUw+rm/g UcKggu+QPqoejMg5PG+7h57YSf8Eb3MOgIK3TFp7F3j1mAukSZmN9B/8YTSGAlhDQxPF u+Q6zdEulnqdaZFxfe8gtDVkVATyNkifQp8NBNh1Bi1z2p2DkQC9bVwYBQUnt+BD4NhO anxA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=o1I8UaIYrM6dC/0Rd0QRoVtyxyLa0HoViEvgtP4OM7I=; b=xCQrEUyaWByiK512TIVEJVMtddjJleg5+tCDiIqeuWFZbF2TgXeZfB6qWWcUdZJPsg bEI4t/z7SBOKAb5J9pF1UBZaOnFcvH5BlomUyu7okPb2Ytgdj5p9Oy6JacceiKcmB6ef h1gtosEra0DghGiCRaCT9e2Wuy1m2R3pdyDQUc2OHxMJgDfndRHZsUZXr8w5NEGkPX5j m5IV3hOyaEJ1G6tSyoKFpvsqSReGfNE77zHTY2IneN3pwVCW2RA7pEpAFlol8X5bDO+e kgITPW6m/nltmlQzK3omB3ncoRDafxkT5qsTrir+aTT/r8aoIrBdGjJBWhBrwCmxk7IK WmVQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@canonical.com header.s=20210705 header.b="nO5Y//Xo"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=canonical.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id hg2si2679295ejc.231.2022.01.29.04.20.45; Sat, 29 Jan 2022 04:21:09 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@canonical.com header.s=20210705 header.b="nO5Y//Xo"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=canonical.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345788AbiA1DaK (ORCPT + 99 others); Thu, 27 Jan 2022 22:30:10 -0500 Received: from smtp-relay-internal-0.canonical.com ([185.125.188.122]:58638 "EHLO smtp-relay-internal-0.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345589AbiA1DaJ (ORCPT ); Thu, 27 Jan 2022 22:30:09 -0500 Received: from mail-oo1-f71.google.com (mail-oo1-f71.google.com [209.85.161.71]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-0.canonical.com (Postfix) with ESMTPS id DA4D04004A for ; Fri, 28 Jan 2022 03:30:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical.com; s=20210705; t=1643340607; bh=o1I8UaIYrM6dC/0Rd0QRoVtyxyLa0HoViEvgtP4OM7I=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=nO5Y//XoBV+zNQYzMG8F9jNmJkui9O+/Cssc0Wltivpqo5x4By6tmVM7LBcd+rMl6 MnQpukQqOw9X8FBmcpq/cIbXVgoj5UAP8pHyIlkAgHSPlDSBDi/dw/vMAnddXbwnwe 61IgiU1OT5lKDJg+OL9Vo3ss1adKX+WCakDJrH83MSHgQl2QXiHRj6p1mCBUh7RoVd TvO+AHBS/UfQ1wqPpHjzEetjs0nKb1wSivtH0ch1W491DmlQTp9PWQ95Q+fFSVYK4u Wa1JElgtRYWw7BBjthmb2KvCnHq2pJTh15M7Gjcc58boe0r25tnNPeYss8hTqLPyJY ugCJL1p0DCS4A== Received: by mail-oo1-f71.google.com with SMTP id n30-20020a4a611e000000b002e519f04f8cso2677869ooc.7 for ; Thu, 27 Jan 2022 19:30:07 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=o1I8UaIYrM6dC/0Rd0QRoVtyxyLa0HoViEvgtP4OM7I=; b=idWxHLoHKBqzRv7ILrWj926AR1KYBfQ24tmA2+Zr9b21zlUjUMlPYaCFw/JL7KHw4q H5pcy1tEGgtGpHhFcFBI9zhDV7ORKAYAS8HASzVl0PTipy54FAZfrS+Ug2Hj+pcAGdhm TcBSIJ+CV3bpGLYfl6SN4InJTjJbjR+QqlbQimq/ZUlVWUlEv7wRxfY30iPKwqpS7ds/ lHX8g73hM6Ipn0Nkc1pLvAXIv0+KVw/ICVbU4V/JsiHD2x/+Ye2N5C7TK5IBit2JzZXa uCkN+bA+2ZBKT782nnQWhH/Anhe9Hm650n4uTIdC9RsmpY0wB1IVS8VmDVmAriL/d8ps Oekg== X-Gm-Message-State: AOAM531/gyhIa27Py/lkQ9xhta6j6G2gEdbMt9gRM2RHY+mOnMSKbJ9+ vbKQ1+cQTldwGlxsrIArdHi2mqm3G0tYQag3zsoy/p3G0W8MXDIeyL+cAoLCv/EJihxTZ2TJg3Q 8IaYi5m4XrVCl/0Wa43fQ0g6isjpWMK05dyyu3RGwQ7eyhjiBu25ET/e/4Q== X-Received: by 2002:a05:6808:191a:: with SMTP id bf26mr9172613oib.111.1643340606601; Thu, 27 Jan 2022 19:30:06 -0800 (PST) X-Received: by 2002:a05:6808:191a:: with SMTP id bf26mr9172601oib.111.1643340606215; Thu, 27 Jan 2022 19:30:06 -0800 (PST) MIME-Version: 1.0 References: <20220127025418.1989642-1-kai.heng.feng@canonical.com> <0259955f-8bbb-1778-f234-398f1356db8b@linux.intel.com> <11891652-40c6-f111-46b7-e96d1729815e@linux.intel.com> In-Reply-To: <11891652-40c6-f111-46b7-e96d1729815e@linux.intel.com> From: Kai-Heng Feng Date: Fri, 28 Jan 2022 11:29:54 +0800 Message-ID: Subject: Re: [PATCH v2 1/2] PCI/AER: Disable AER service when link is in L2/L3 ready, L2 and L3 state To: Lu Baolu Cc: bhelgaas@google.com, mika.westerberg@linux.intel.com, koba.ko@canonical.com, Russell Currey , "Oliver O'Halloran" , Lalithambika Krishnakumar , Joerg Roedel , linuxppc-dev@lists.ozlabs.org, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jan 28, 2022 at 10:57 AM Lu Baolu wrote: > > On 1/27/22 7:14 PM, Kai-Heng Feng wrote: > > On Thu, Jan 27, 2022 at 3:01 PM Lu Baolu wrote: > >> > >> On 2022/1/27 10:54, Kai-Heng Feng wrote: > >>> Commit 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in > >>> hint") enables ACS, and some platforms lose its NVMe after resume from > >>> S3: > >>> [ 50.947816] pcieport 0000:00:1b.0: DPC: containment event, status:0x1f01 source:0x0000 > >>> [ 50.947817] pcieport 0000:00:1b.0: DPC: unmasked uncorrectable error detected > >>> [ 50.947829] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID) > >>> [ 50.947830] pcieport 0000:00:1b.0: device [8086:06ac] error status/mask=00200000/00010000 > >>> [ 50.947831] pcieport 0000:00:1b.0: [21] ACSViol (First) > >>> [ 50.947841] pcieport 0000:00:1b.0: AER: broadcast error_detected message > >>> [ 50.947843] nvme nvme0: frozen state error detected, reset controller > >>> > >>> It happens right after ACS gets enabled during resume. > >>> > >>> There's another case, when Thunderbolt reaches D3cold: > >>> [ 30.100211] pcieport 0000:00:1d.0: AER: Uncorrected (Non-Fatal) error received: 0000:00:1d.0 > >>> [ 30.100251] pcieport 0000:00:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID) > >>> [ 30.100256] pcieport 0000:00:1d.0: device [8086:7ab0] error status/mask=00100000/00004000 > >>> [ 30.100262] pcieport 0000:00:1d.0: [20] UnsupReq (First) > >>> [ 30.100267] pcieport 0000:00:1d.0: AER: TLP Header: 34000000 08000052 00000000 00000000 > >>> [ 30.100372] thunderbolt 0000:0a:00.0: AER: can't recover (no error_detected callback) > >>> [ 30.100401] xhci_hcd 0000:3e:00.0: AER: can't recover (no error_detected callback) > >>> [ 30.100427] pcieport 0000:00:1d.0: AER: device recovery failed > >>> > >>> So disable AER service to avoid the noises from turning power rails > >>> on/off when the device is in low power states (D3hot and D3cold), as > >>> PCIe spec "5.2 Link State Power Management" states that TLP and DLLP > >>> transmission is disabled for a Link in L2/L3 Ready (D3hot), L2 (D3cold > >>> with aux power) and L3 (D3cold). > >>> > >>> Bugzilla:https://bugzilla.kernel.org/show_bug.cgi?id=209149 > >>> Bugzilla:https://bugzilla.kernel.org/show_bug.cgi?id=215453 > >>> Fixes: 50310600ebda ("iommu/vt-d: Enable PCI ACS for platform opt in hint") > >> > >> I don't know what this fix has to do with the commit 50310600ebda. > > > > Commit 50310600ebda only exposed the underlying issue. Do you think > > "Fixes:" tag should change to other commits? > > > >> Commit 50310600ebda only makes sure that PCI ACS is enabled whenever > >> Intel IOMMU is on. Before this commit, PCI ACS could also be enabled > >> and result in the same problem. Or anything I missed? > > > > The system in question didn't enable ACS before commit 50310600ebda. > > This commit exposed the issue on your configuration doesn't mean the > fix should be back ported as far as that commit. I believe if you add > intel-iommu=on in the kernel parameter, the issue still exists even you > revert commit 50310600ebda or checkout a tag before it. That's true. I guess it's better to drop the "Fixes:" tag. Bjorn, should I send another version of it? Kai-Heng > > Best regards, > baolu