Date: Fri, 3 Nov 2023 13:20:53 -0500
From: Bjorn Helgaas
To: Vidya Sagar
Cc: Lorenzo Pieralisi, Vikram Sethi, Thierry Reding, Jonathan Hunter,
    Krishna Thota, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Question: Clearing error bits in the root port post enumeration
Message-ID: <20231103182053.GA160440@bhelgaas>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 31, 2023 at 12:26:31PM +0000, Vidya Sagar wrote:
> Hi folks,
>
> I would like to know your comments on the following scenario where
> we are observing the root port logging errors because of the
> enumeration flow being followed.
>
> DUT information:
> - Has a root port and an endpoint connected to it
> - Uses ECAM mechanism to access the configuration space
> - Booted through ACPI flow
> - Has a Firmware-First approach for handling the errors
> - System is configured to treat Unsupported Requests as
>   AdvisoryNon-Fatal errors
>
> As we all know, when a configuration read request comes in for a
> device number that is not implemented, a UR would be returned as per
> the PCIe spec.
>
> As part of the enumeration flow on DUT, when the kernel reads offset
> 0x0 of B:D:F=0:0:0, the root port responds with its valid Vendor-ID
> and Device-ID values.
> But, when B:D:F=0:1:0 is probed, since there
> is no device present there, the root port responds with an
> Unsupported Request and simultaneously logs the same in the Device
> Status register (i.e. bit-3). Because of it, there is a UR logged
> in the Device Status register of the RP by the time enumeration is
> complete.
>
> In the case of AER capability natively owned by the kernel, the AER
> driver's init call would clear all such pending bits.
>
> Since we are going with the Firmware-First approach, and the system
> is configured to treat Unsupported Requests as AdvisoryNon-Fatal
> errors, only a correctable error interrupt can be raised to the
> Firmware which takes care of clearing the corresponding status
> registers. The firmware can't know about the UnsupReq bit being set
> as the interrupt it received is for a correctable error hence it
> clears only bits related to correctable error.
>
> All these events leave a freshly booted system with the following
> bits set.
>
> Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
> DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend- (UnsupReq)
> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol- (UnsupReq)
>
> Since the reason for UR is well understood at this point, I would
> like to weigh in on the idea of clearing the aforementioned bits in
> the root port once the enumeration is done particularly to cater to
> the configurations where Firmware-First approach is in place.
> Please let me know your comments on this approach.

I think Secondary status (PCI_SEC_STATUS) is always owned by the OS
and is not affected by _OSC negotiation, right?  Linux does basically
nothing with that today, but I think it *could* clear the "Received
Master Abort" bit.

I'm not very familiar with Advisory Non-Fatal errors.  I'm curious
about the UESta situation: why can't firmware know about UnsupReq
being set?
I assume PCI_ERR_COR_ADV_NFAT is the Correctable Error Status bit the
firmware *does* see and clear.

But isn't the whole point of Advisory Non-Fatal errors that an error
that is logged as an Uncorrectable Error and that normally would be
signaled with ERR_NONFATAL is signaled with ERR_COR instead?

So doesn't PCI_ERR_COR_ADV_NFAT being set imply that some
PCI_ERR_UNCOR_STATUS bit must be set as well?  If so, I would think
firmware *could* figure that out and clear that PCI_ERR_UNCOR_STATUS
bit.

Bjorn
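
To make the question above concrete, here is a minimal sketch of what
an ERR_COR handler *could* do, not what any real firmware does.  The
register offsets and bit values mirror include/uapi/linux/pci_regs.h;
struct fake_aer, aer_read(), aer_write_w1c() and handle_err_cor() are
hypothetical names that simulate the AER registers in memory.  It
assumes an advisory error can only come from an uncorrectable bit whose
severity is non-fatal, and that the status registers are
write-1-to-clear.

```c
#include <stdint.h>

/* Values mirror include/uapi/linux/pci_regs.h; redefined here so the
 * sketch builds standalone. */
#define PCI_ERR_UNCOR_STATUS  0x04        /* Uncorrectable Error Status */
#define PCI_ERR_UNCOR_SEVER   0x0c        /* Uncorrectable Error Severity */
#define PCI_ERR_COR_STATUS    0x10        /* Correctable Error Status */
#define PCI_ERR_UNC_UNSUP     0x00100000u /* Unsupported Request */
#define PCI_ERR_COR_ADV_NFAT  0x00002000u /* Advisory Non-Fatal */

/* Hypothetical in-memory stand-in for a root port's AER registers. */
struct fake_aer {
	uint32_t reg[0x40 / 4];
};

static uint32_t aer_read(const struct fake_aer *a, unsigned int off)
{
	return a->reg[off / 4];
}

/* Simulates write-1-to-clear: every bit set in val is cleared. */
static void aer_write_w1c(struct fake_aer *a, unsigned int off, uint32_t val)
{
	a->reg[off / 4] &= ~val;
}

/*
 * Hypothetical handler for an ERR_COR interrupt.  If Advisory Non-Fatal
 * is set, also clear the uncorrectable status bits whose severity is
 * non-fatal (assumption: only those can be signaled as advisory).
 * Returns the uncorrectable bits that were cleared.
 */
static uint32_t handle_err_cor(struct fake_aer *a)
{
	uint32_t cor = aer_read(a, PCI_ERR_COR_STATUS);
	uint32_t cleared = 0;

	if (cor & PCI_ERR_COR_ADV_NFAT) {
		uint32_t unc = aer_read(a, PCI_ERR_UNCOR_STATUS);
		uint32_t sev = aer_read(a, PCI_ERR_UNCOR_SEVER);

		cleared = unc & ~sev;	/* severity bit 0 means non-fatal */
		aer_write_w1c(a, PCI_ERR_UNCOR_STATUS, cleared);
	}

	aer_write_w1c(a, PCI_ERR_COR_STATUS, cor);
	return cleared;
}
```

With PCI_ERR_COR_ADV_NFAT and PCI_ERR_UNC_UNSUP both set and UnsupReq
severity left non-fatal, the handler clears both status registers.
The sketch does not touch UnsupReq in Device Status, which would need
its own clear.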