Date: Wed, 21 Feb 2018 17:20:40 -0600
From: Bjorn Helgaas
To: George Cherian
Cc: Lukas Wunner, "Rafael J. Wysocki", Mika Westerberg,
    linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
    bhelgaas@google.com, Jayachandran.Nair@cavium.com,
    Robert.Richter@cavium.com, Lorenzo Pieralisi, Huang Ying
Subject: Re: [PATCH] PCI: Add quirk for Cavium Thunder-X2 PCIe erratum #173
Message-ID: <20180221232040.GA52685@bhelgaas-glaptop.roam.corp.google.com>

On Wed, Feb 21, 2018 at 04:25:08PM +0530, George Cherian wrote:
> On 02/21/2018 03:24 PM, Lukas Wunner wrote:
> > On Wed, Feb 21, 2018 at 02:58:13PM +0530, George Cherian wrote:
> > > I will explain the setup used
> > > To the Cavium ThunderX RC the following PLX device is connected.
> > > PLX Technology, Inc. PEX 8747 48-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s)
> > > Switch
> > > There is no device connected downstream to the PLX switch.
> > >
> > > AFAIU the pcie_port driver probes PLX and enters autosuspend after 100ms
> > > since pci_bridge_d3_possible() returns true.
> > >
> > > And later pci_sysfs_init() ends up doing a config access of PLX which fails
> > > with a "synchronous external abort"

Thanks for the details!  This one *should* be fixed by this patch:

  https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=pci/virtualization&id=bf6c089ee2ac67eb22c0ff0ac9cc7f9ccd619d90

Any chance you could try that out?

> > Then you're missing a pci_config_pm_runtime_get() in pci_sysfs_init() or
> > further down in the call stack, rather than a quirk which just papers
> > over the issue.
>
> I have found another configuration where this fails.
> Following is the configuration
> 1) Connected a PCIe Intel i40 card under the root port.
> 2) unbind the i40 driver and bind with vfio-pci driver.
> 3) Run lspci in a loop. "lspci -s xx:xx.xx -vvv"
>
> I get the same synchronous external abort.
> In this case the vfio-pci driver probe it moves the device (i40) to
> D3hot provided disable_idle_d3 is not set. lspci tries to do
> the config_access which fails with synchronous external abort when
> the root port transitions to D3hot.

This one sounds like we're missing something in this path:

  pci_read_config
    pci_config_pm_runtime_get
      if (parent)
        pm_runtime_get_sync
          __pm_runtime_resume(dev, RPM_GET_PUT)
            rpm_resume

It *looks* like rpm_resume() should resume parent devices, i.e., the
root port, but I don't know that code at all.  Maybe Rafael or Lukas
could confirm that?

pci_config_pm_runtime_get() knows that config space is always
accessible unless the device is in D3cold, so if the target device is
in D3hot, it will leave it there.  I assume that if/when rpm_resume()
resumes the parent bridges, it will resume them all the way to D0.

I'm *really* glad you're finding these issues, because on most
platforms we would just silently read invalid data (all ones) and the
caller would have no idea what's going wrong.

Bjorn