Received: by 2002:ab2:6857:0:b0:1ef:ffd0:ce49 with SMTP id l23csp2674877lqp; Mon, 25 Mar 2024 06:22:45 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCXKSwrQQDMZBF9DhZAPoh1JcTSOemxKfOtYemX02G5/LeqBwME+Wbint0hgGKaikq+T//16ka+wAkxBRH9g8tW+NIZdQ8iU3PYav+2maA== X-Google-Smtp-Source: AGHT+IHS8Fw7mstGCRI7hTwtuptvTyTbfu+kuGJA018HCdFfJXjkChfPJ2OG5tLhDhUjvpQlwTOQ X-Received: by 2002:a17:906:79ca:b0:a46:6623:6938 with SMTP id m10-20020a17090679ca00b00a4666236938mr4474519ejo.50.1711372964876; Mon, 25 Mar 2024 06:22:44 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1711372964; cv=pass; d=google.com; s=arc-20160816; b=U9x1yZ5qgdWArhb9D3+kMh/KRDEonqTNkOEDtghXe93h2kZH/E0FYLYXW3mZjr5W1a A9hyQG6PjxlyCx3PtWl0uvhyVs/XVA4uBJRDVSGKMGcz4MHWY2HnQjyv2c/Ngi3pk28d r5bTtnec2iAwJnrof2e0jivyYGAUBYgHSDN4jZK0++QyWwH0tbaNthCHDe1rpyDIeq+j 8cmQG9p0H6jCmv/9fFZdI5dZv4k79Xjltc9hajcP6OY1uis7GiGmgD8aPQT0pqNGjb9a lspGZrAYPmmXoNgeHgBSsZtTTYqIhxuDpjxeKAJFNf7Aix8scgntPB3fLYJIiUdbf4XL SS3w== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:list-unsubscribe:list-subscribe:list-id :precedence:date:message-id:dkim-signature; bh=BqP9MM6vfq4dcNRlrcEB7oXJEDXyTBhxcRr2EQB0mvI=; fh=TyvaHJ7D2VV4Bu1gG+CdIBNcQ6lhIjZLXp2sJR0Lgv4=; b=m4Ymtrb0iCUE8ZbEYp/QXhY+IvLwbIheH0wuTz7i+4cH42y6hCfRYyuh+6MAIi9nTe WtSj+QSrfs90/bHxL5CQSlC0Q7nAdLmfPXxwxRtWs8WrOEQsQxzaAu35bsNaP/0Nhf9N Nwz7/14XY7H4lBhQe9+4mseRwaau8TjOM0zbkr9pgrBgYiD6Fs+7cPLU/2ZiXrETN+rd CldizRkodeSMo/sJ55FHCpkRS9Ez7Y/LUWAUdDMUkL8z5IK8e/26RKf4M2ldMSCpbUcP NnCSF9K0zbQA9Q6yu9hcf3fgaf5cd34b2NMqOI7FpWckktXv4PtyMoLC18MCmyrQBpaH LTiA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="Y6esWeq/"; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-116714-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-116714-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [147.75.80.249]) by mx.google.com with ESMTPS id n25-20020a1709062bd900b00a465f035922si2471110ejg.784.2024.03.25.06.22.44 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Mar 2024 06:22:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-116714-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) client-ip=147.75.80.249; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="Y6esWeq/"; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-116714-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-116714-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 96B921F3A5AA for ; Mon, 25 Mar 2024 13:22:44 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id BB11615667F; Mon, 25 Mar 2024 10:28:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Y6esWeq/" Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1EC40156C4E; Mon, 25 Mar 2024 08:45:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.15 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711356332; cv=none; b=juGrcGJcdSEbqNTiFza+F3y6EH8Ctf/o6yBb3HZ6OY+MpA9x0oTAyQr+JTTamezxaRpUBuKYs7nHuRvcPbXfPsbPXKDBcbE07HoHW6z4BN1d8rgBVirWuzxhA4XgYZ0WuUe2P3kQR3qde71qxMY6ZVHT6gG+1IVXfQ4GnNobdtw= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711356332; c=relaxed/simple; bh=knUdCX/JOXqL4snKEL002fjNHL5rz0BFYkaGhcfpTvA=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=inB2YfXiFArquK4+DBviCWFifZ2UvViaYUUwXtygeLU9uoDolPwAE1vThszhIjjWKo3znwurm/y5yxrVnjiDdoktMK8xU3Aon94BU1PlX+LgJIhGC4kaSyT4l9q5RtEDoS1sdi70+0pF19bU7E9wMgpVwcSGMOqW5UreY/At/U0= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Y6esWeq/; arc=none smtp.client-ip=198.175.65.15 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1711356331; x=1742892331; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=knUdCX/JOXqL4snKEL002fjNHL5rz0BFYkaGhcfpTvA=; b=Y6esWeq/YL90tUTozZ0miM+/Ui38kyaC33sZrPsi1t0Sjk2JPv84OCo3 ZGX7CZhQ7W7g4Ji1WsXOK0KE9/ieHOsnIaFoJaLZCvIQ/4ozOiTAJWADx PLeqhR9XfqutylNOxerrAA3SQMgWzSnouKYEPO2jlO9z6P6mvgC3xAeCm 6QxhOLZta1SmIlsWi4UcC7OsMt7fm88D+tb/vAnoQIHTrZl6D3SXm23MD dQSEl2jhlHMb/xZRsHdswcjEjsU3GCvHL/vVzgPkAXlqTVPRdkEMBHvKh qlogBT+9a89eL8TXObBUz37Q5VarXFk0jR+nG9pyAMkcw6sgOkOnpkso2 w==; X-IronPort-AV: E=McAfee;i="6600,9927,11023"; a="10139249" X-IronPort-AV: E=Sophos;i="6.07,152,1708416000"; d="scan'208";a="10139249" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by orvoesa107.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 01:45:30 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,152,1708416000"; d="scan'208";a="15601892" Received: from zhaohaif-mobl.ccr.corp.intel.com (HELO [10.254.215.245]) ([10.254.215.245]) by fmviesa006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Mar 2024 01:45:27 -0700 Message-ID: <38601aef-b082-463f-8e41-f73a4307de21@linux.intel.com> Date: Mon, 25 Mar 2024 16:45:08 +0800 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCHv3 pci-next 1/2] PCI/AER: correctable error message as KERN_INFO To: Xi Ruoyao , Bjorn Helgaas Cc: Grant Grundler , bhelgaas@google.com, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, mahesh@linux.ibm.com, oohall@gmail.com, rajat.khandelwal@linux.intel.com, rajatja@chromium.org References: <20230918193913.GA203601@bhelgaas> <0a44fd663e93ac5b36865b0080da52d94252791a.camel@xry111.site> From: Ethan Zhao In-Reply-To: <0a44fd663e93ac5b36865b0080da52d94252791a.camel@xry111.site> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit On 3/25/2024 1:19 AM, Xi Ruoyao wrote: > On Mon, 2023-09-18 at 14:39 -0500, Bjorn Helgaas wrote: >> On Mon, Sep 18, 2023 at 07:42:30PM +0800, Xi Ruoyao wrote: >>> ... >>> My workstation suffers from too much correctable AER reporting as well >>> (related to Intel's errata "RPL013: Incorrectly Formed PCIe Packets May >>> Generate Correctable Errors" and/or the motherboard design, I guess). >> We should rate-limit correctable error reporting so it's not >> overwhelming. >> >> At the same time, I'm *also* interested in the cause of these errors, >> in case there's a Linux defect or a hardware erratum that we can work >> around.  Do you have a bug report with any more details, e.g., a dmesg >> log and "sudo lspci -vv" output? > Hi Bjorn, > > Sorry for the *very* late reply (somehow I didn't see the reply at all > before it was removed by my cron job, and now I just savaged it from > lore.kernel.org...) > > The dmesg is like: > > [ 882.456994] pcieport 0000:00:1c.1: AER: Multiple Correctable error message received from 0000:00:1c.1 > [ 882.457002] pcieport 0000:00:1c.1: AER: found no error details for 0000:00:1c.1 > [ 882.457003] pcieport 0000:00:1c.1: AER: Multiple Correctable error message received from 0000:06:00.0 > [ 883.545763] pcieport 0000:00:1c.1: AER: Multiple Correctable error message received from 0000:00:1c.1 > [ 883.545789] pcieport 0000:00:1c.1: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID) > [ 883.545790] pcieport 0000:00:1c.1: device [8086:7a39] error status/mask=00000001/00002000 > [ 883.545792] pcieport 0000:00:1c.1: [ 0] RxErr (First) > [ 883.545794] pcieport 0000:00:1c.1: AER: Error of this Agent is reported first > [ 883.545798] r8169 0000:06:00.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Transmitter ID) > [ 883.545799] r8169 0000:06:00.0: device [10ec:8125] error status/mask=00001101/0000e000 > [ 883.545800] r8169 0000:06:00.0: [ 0] RxErr (First) > [ 883.545801] r8169 0000:06:00.0: [ 8] Rollover > [ 883.545802] r8169 0000:06:00.0: [12] Timeout > [ 883.545815] pcieport 0000:00:1c.1: AER: Correctable error message received from 0000:00:1c.1 > [ 883.545823] pcieport 0000:00:1c.1: AER: found no error details for 0000:00:1c.1 > [ 883.545824] pcieport 0000:00:1c.1: AER: Multiple Correctable error message received from 0000:06:00.0 > > lspci output attached. > > Intel has issued an errata "RPL013" saying: > > "Under complex microarchitectural conditions, the PCIe controller may > transmit an incorrectly formed Transaction Layer Packet (TLP), which > will fail CRC checks. When this erratum occurs, the PCIe end point may > record correctable errors resulting in either a NAK or link recovery. > Intel® has not observed any functional impact due to this erratum." > > But I'm really unsure if it describes my issue. > > Do you think I have some broken hardware and I should replace the CPU > and/or the motherboard (where the r8169 is soldered)? I've noticed that > my 13900K is almost impossible to overclock (despite it's a K), but I've > not encountered any issue other than these AER reporting so far after I > gave up overclocking. Seems there are two r8169 nics on your board, only 0000:06:00.0 reports aer errors, how about another one the 0000:07:00.0 nic ? Thanks, Ethan