From: Rajat Jain
Date: Tue, 22 May 2018 16:27:33 -0700
Subject: Re: [PATCH 2/5] PCI/AER: Add sysfs stats for AER capable devices
To: "Alex G."
Cc: Bjorn Helgaas, Jonathan Corbet, Philippe Ombredanne, Kate Stewart,
    Thomas Gleixner, Greg Kroah-Hartman, Frederick Lawler, Oza Pawandeep,
    Keith Busch, Gabriele Paoloni, Thomas Tai, "Steven Rostedt (VMware)",
    linux-pci, linux-doc@vger.kernel.org, Linux Kernel Mailing List,
    Jes Sorensen, Kyle McMartin, Rajat Jain
References: <20180522222805.80314-1-rajatja@google.com>
    <20180522222805.80314-3-rajatja@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 22, 2018 at 3:50 PM, Alex G.
wrote:
>
>
> On 05/22/2018 05:28 PM, Rajat Jain wrote:
>> Add the following AER sysfs stats to represent the counters for each
>> kind of error as seen by the device:
>>
>> dev_total_cor_errs
>> dev_total_fatal_errs
>> dev_total_nonfatal_errs
>>
>> Signed-off-by: Rajat Jain
>> ---
>>  drivers/pci/pci-sysfs.c                |  3 ++
>>  drivers/pci/pci.h                      |  4 +-
>>  drivers/pci/pcie/aer/aerdrv.h          |  1 +
>>  drivers/pci/pcie/aer/aerdrv_errprint.c |  1 +
>>  drivers/pci/pcie/aer/aerdrv_stats.c    | 72 ++++++++++++++++++++++++++
>>  5 files changed, 80 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index 366d93af051d..730f985a3dc9 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -1743,6 +1743,9 @@ static const struct attribute_group *pci_dev_attr_groups[] = {
>>  #endif
>>       &pci_bridge_attr_group,
>>       &pcie_dev_attr_group,
>> +#ifdef CONFIG_PCIEAER
>> +     &aer_stats_attr_group,
>> +#endif
>>       NULL,
>>  };
>
> So if the device is removed as part of recovery, then these get reset,
> right? So if the device fails intermittently, these counters would keep
> getting reset. Is this the intent?

Umm, kind of.

* One argument is that if a PCI device is removed and then
re-enumerated, how do we know it is the same device and has not been
replaced by another device, for example?

Note that the root port counters, which hold the cumulative counts for
all the errors seen, will still have them logged in the situation you
describe.
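To make the reset behaviour discussed above concrete, here is a minimal
userspace sketch. It is not the kernel code from this patch: every name in
it (sim_device, sim_root_port, sim_device_enumerate, sim_report_error) is
hypothetical, and it only models the idea that per-device totals start from
zero on each enumeration while the root port keeps a cumulative total.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical severity classes, mirroring the three counters in the patch. */
enum aer_severity { AER_COR, AER_NONFATAL, AER_FATAL, AER_SEV_COUNT };

struct aer_counters {
	uint64_t total[AER_SEV_COUNT];
};

/* Hypothetical root port: accumulates every error reported below it. */
struct sim_root_port {
	struct aer_counters rootport_totals;
};

/* Hypothetical endpoint: its counters live only as long as the device does. */
struct sim_device {
	struct aer_counters dev_totals;
	struct sim_root_port *port;
};

/* Models (re-)enumeration: the per-device counters start again from zero. */
static void sim_device_enumerate(struct sim_device *dev,
				 struct sim_root_port *port)
{
	memset(&dev->dev_totals, 0, sizeof(dev->dev_totals));
	dev->port = port;
}

/* Models one reported error: bump both the device and the root port total. */
static void sim_report_error(struct sim_device *dev, enum aer_severity sev)
{
	assert((int)sev >= 0 && sev < AER_SEV_COUNT);
	dev->dev_totals.total[sev]++;
	dev->port->rootport_totals.total[sev]++;
}
```

Calling sim_device_enumerate() a second time on the same struct stands in
for removal and re-enumeration during recovery: dev_totals goes back to
zero, while rootport_totals still reflects every error reported so far.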
>
> (snip)
>
>>  /**
>>   * pci_match_one_device - Tell if a PCI device structure has a matching
>> diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h
>> index d8b9fba536ed..b5d5ad6f2c03 100644
>> --- a/drivers/pci/pcie/aer/aerdrv.h
>> +++ b/drivers/pci/pcie/aer/aerdrv.h
>> @@ -87,6 +87,7 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info);
>>  irqreturn_t aer_irq(int irq, void *context);
>>  int pci_aer_stats_init(struct pci_dev *pdev);
>>  void pci_aer_stats_exit(struct pci_dev *pdev);
>> +void pci_dev_aer_stats_incr(struct pci_dev *pdev, struct aer_err_info *info);
>>
>>  #ifdef CONFIG_ACPI_APEI
>>  int pcie_aer_get_firmware_first(struct pci_dev *pci_dev);
>> diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
>> index 21ca5e1b0ded..5e8b98deda08 100644
>> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
>> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
>> @@ -155,6 +155,7 @@ static void __aer_print_error(struct pci_dev *dev,
>>                       pci_err(dev, " [%2d] Unknown Error Bit%s\n",
>>                               i, info->first_error == i ? " (First)" : "");
>>       }
>> +     pci_dev_aer_stats_incr(dev, info);
>
> What about AER errors that are contained by DPC?

Thanks, You are right, this patch does not take care of the DPC. I'll
try to read up on DPC and can integrate it if it turns out to be easy
enough.

Thanks,

Rajat

>
> Alex