Date: Thu, 22 Feb 2018 09:09:47 -0600
From: Bjorn Helgaas
To: George Cherian
Cc: Lukas Wunner, "Rafael J. Wysocki", Mika Westerberg,
 linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
 bhelgaas@google.com, Jayachandran.Nair@cavium.com,
 Robert.Richter@cavium.com, Lorenzo Pieralisi, Huang Ying
Subject: Re: [PATCH] PCI: Add quirk for Cavium Thunder-X2 PCIe erratum #173
Message-ID: <20180222150947.GB52685@bhelgaas-glaptop.roam.corp.google.com>
References: <2323301.ORZpb3hFRe@aspire.rjw.lan>
 <20180216203434.GC11014@bhelgaas-glaptop.roam.corp.google.com>
 <2858019.9TUCWsDpTB@aspire.rjw.lan>
 <20180220015433.GA9656@wunner.de>
 <20180220190037.GB32228@bhelgaas-glaptop.roam.corp.google.com>
 <20180221095435.xe5lmes7mpxca3en@wunner.de>
 <20180221232040.GA52685@bhelgaas-glaptop.roam.corp.google.com>
 <305a9d29-4749-12bd-e0ab-903b58cda134@caviumnetworks.com>
In-Reply-To: <305a9d29-4749-12bd-e0ab-903b58cda134@caviumnetworks.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 22, 2018 at 06:43:34PM +0530, George Cherian wrote:
> On 02/22/2018 04:50 AM, Bjorn Helgaas wrote:
> > On Wed, Feb 21, 2018 at 04:25:08PM +0530, George Cherian wrote:
> > > On 02/21/2018 03:24 PM, Lukas Wunner wrote:
> > > > On Wed, Feb 21, 2018 at 02:58:13PM +0530, George Cherian wrote:
> > > > > I will explain the setup used
> > > > > To the Cavium ThunderX RC the following PLX device is connected.
> > > > > PLX Technology, Inc. PEX 8747 48-Lane, 5-Port PCI Express
> > > > > Gen 3 (8.0 GT/s) Switch
> > > > > There is no device connected downstream to the PLX switch.
> > > > >
> > > > > AFAIU the pcie_port driver probes PLX and enters autosuspend
> > > > > after 100ms since pci_bridge_d3_possible() returns true.
> > > > >
> > > > > And later pci_sysfs_init() ends up doing a config access of
> > > > > PLX which fails with a "synchronous external abort"
> >
> > Thanks for the details!
> >
> > This one *should* be fixed by this patch:
> > https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=pci/virtualization&id=bf6c089ee2ac67eb22c0ff0ac9cc7f9ccd619d90
> >
> > Any chance you could try that out?
>
> I did try your patch and it works fine on the above failing setup.

Thanks for testing it!

> > > I have found another configuration where this fails.
> > > Following is the configuration
> > > 1) Connected a PCIe Intel i40 card under the root port.
> > > 2) unbind the i40 driver and bind with vfio-pci driver.
> > > 3) Run lspci in a loop. "lspci -s xx:xx.xx -vvv"
> > >
> > > I get the same synchronous external abort.
> > > In this case the vfio-pci driver probe moves the device (i40) to
> > > D3hot provided disable_idle_d3 is not set. lspci tries to do
> > > the config_access which fails with synchronous external abort when
> > > the root port transitions to D3hot.
> the stack trace for this issue looks like this
> [] pci_generic_config_read+0x5c/0xf0
> [] pci_user_read_config_dword+0x84/0x110
> [] pci_vpd_read+0x100/0x208
> [] pci_read_vpd+0x50/0x68
> [] read_vpd_attr+0x60/0x80
> [] sysfs_kf_bin_read+0x6c/0xa8
> [] kernfs_fop_read+0xa4/0x1c8
> [] __vfs_read+0x60/0x170
> [] vfs_read+0x8c/0x148
> [] SyS_pread64+0xbc/0xd8
>
> I have tried adding pci_config_pm_runtime_get/put pair inside
> pci_vpd_read(), which I guess might be needed, in case the device goes
> to D3cold. But having said that it didn't fix the problem in our platform.

Your original patch avoids this problem by setting
PCI_DEV_FLAGS_NO_D3 on the root port, so it seems like this must be
somehow related to the root port's state.

I assume this VPD read is on the i40 device, right?

Since you're still seeing the problem even after calling
pci_config_pm_runtime_get(), I assume the root port is still not in D0.
Can you add a little more instrumentation to read PCI_PM_CTRL and
PCI_PM_PPB_EXTENSIONS for the root port and PCI_PM_CTRL for the i40
device right after you call pci_config_pm_runtime_get()?

I don't see anything obviously different between the pci_read_config()
path and the pci_vpd_read() path except for the
pci_config_pm_runtime_get() call that you've already added.

I guess you could try using setpci instead of lspci to see if the
failure only happens in the pci_vpd_read() path.  I assume that will
be the case because lspci probably does config reads before it does
the VPD read, and those initial config reads seemed to work OK.

The VPD path does do config writes in addition to config reads.  Maybe
there's something special about writes, although I don't know what
that would be.

You can tell I'm running out of ideas here :)

Bjorn
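
For reference, the PCI_PM_CTRL / PCI_PM_PPB_EXTENSIONS readout asked
for above can be illustrated with a minimal userspace sketch in C.
This is not kernel code and not part of any patch in this thread.  It
assumes you already hold a snapshot of a device's config space (for
example, bytes read from /sys/bus/pci/devices/<BDF>/config; taking
that snapshot itself issues config reads, so on the affected platform
it may hit the same abort).  The helper names cfg_read16(),
find_pm_cap(), pm_state() and report_pm_state() are hypothetical; the
offsets and masks are copied from include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Values mirror include/uapi/linux/pci_regs.h; redefined here so the
 * sketch builds without kernel headers. */
#define PCI_STATUS              0x06
#define PCI_STATUS_CAP_LIST     0x10
#define PCI_CAPABILITY_LIST     0x34
#define PCI_CAP_ID_PM           0x01
#define PCI_PM_CTRL             4       /* PMCSR, 16 bits */
#define PCI_PM_CTRL_STATE_MASK  0x0003  /* 0 = D0 ... 3 = D3hot */
#define PCI_PM_PPB_EXTENSIONS   6       /* bridge support extensions, 8 bits */

/* Hypothetical helper: little-endian 16-bit read from the snapshot. */
static uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Hypothetical helper: walk the capability list in the snapshot and
 * return the offset of the Power Management capability, or 0. */
static size_t find_pm_cap(const uint8_t *cfg, size_t len)
{
	size_t pos;
	int ttl = 48;	/* bound the walk in case the list loops */

	assert(cfg != NULL);
	if (len < 0x40 || !(cfg_read16(cfg, PCI_STATUS) & PCI_STATUS_CAP_LIST))
		return 0;

	pos = cfg[PCI_CAPABILITY_LIST] & ~3u;
	while (pos >= 0x40 && pos + 8 <= len && ttl--) {
		if (cfg[pos] == PCI_CAP_ID_PM)
			return pos;
		pos = cfg[pos + 1] & ~3u;	/* next-capability pointer */
	}
	return 0;
}

/* Extract the power state field (0..3) from a PMCSR value. */
static unsigned int pm_state(uint16_t pmcsr)
{
	return pmcsr & PCI_PM_CTRL_STATE_MASK;
}

/* Print PMCSR and the PPB extensions byte (the latter is only
 * meaningful for bridges).  Returns the power state, or -1 if the
 * snapshot has no PM capability. */
static int report_pm_state(const char *label, const uint8_t *cfg, size_t len)
{
	static const char *const names[] = { "D0", "D1", "D2", "D3hot" };
	size_t cap = find_pm_cap(cfg, len);
	uint16_t pmcsr;

	if (!cap) {
		printf("%s: no PM capability found\n", label);
		return -1;
	}
	pmcsr = cfg_read16(cfg, cap + PCI_PM_CTRL);
	printf("%s: PMCSR=0x%04x (%s) PPB_EXT=0x%02x\n", label, pmcsr,
	       names[pm_state(pmcsr)], cfg[cap + PCI_PM_PPB_EXTENSIONS]);
	return (int)pm_state(pmcsr);
}
```

Calling report_pm_state() once on a snapshot of the root port and once
on a snapshot of the i40 shows whether either is outside D0.  Inside
the kernel, the equivalent readout would presumably use
pci_read_config_word() and pci_read_config_byte() at dev->pm_cap plus
the same offsets, placed right after the pci_config_pm_runtime_get()
call.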