Date: Fri, 30 Nov 2018 09:19:29 -0700
From: Alex Williamson
To: Bharat Bhushan
Cc: Bjorn Helgaas, "linux-pci@vger.kernel.org",
    Linux Kernel Mailing List, "bharatb.yadav@gmail.com", David Daney,
    "jglauber@cavium.com", "mbroemme@libmpq.org", "chrisrblake93@gmail.com"
Subject: Re: [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus
Message-ID: <20181130091929.3eae4e99@x1.home>
References: <20181127083454.26560-1-Bharat.Bhushan@nxp.com>
    <20181127153356.GA112381@google.com>
    <20181127090830.084fedf1@x1.home>
    <20181129225606.328f7386@x1.home>
Organization: Red Hat
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 30 Nov 2018 06:24:16 +0000
Bharat Bhushan wrote:

> Hi Alex,
>
> > -----Original Message-----
> > From: Alex Williamson
> > Sent: Friday, November 30, 2018 11:26 AM
> > To: Bharat Bhushan
> > Cc: Bjorn Helgaas; linux-pci@vger.kernel.org;
> > Linux Kernel Mailing List; bharatb.yadav@gmail.com; David Daney;
> > jglauber@cavium.com; mbroemme@libmpq.org; chrisrblake93@gmail.com
> > Subject: Re: [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus
> >
> > On Fri, 30 Nov 2018 05:29:47 +0000
> > Bharat Bhushan wrote:
> >
> > > Hi,
> > >
> > > > -----Original Message-----
> > > > From: Bjorn Helgaas
> > > > Sent: Thursday, November 29, 2018 1:46 AM
> > > > To: Bharat Bhushan
> > > > Cc: alex.williamson@redhat.com; Bjorn Helgaas;
> > > > linux-pci@vger.kernel.org; Linux Kernel Mailing List
> > > > <linux-kernel@vger.kernel.org>; bharatb.yadav@gmail.com;
> > > > David Daney; jglauber@cavium.com; mbroemme@libmpq.org;
> > > > chrisrblake93@gmail.com
> > > > Subject: Re: [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus
> > > >
> > > > On Tue, Nov 27, 2018 at 10:32 PM Bharat Bhushan wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Alex Williamson
> > > > > > Sent: Tuesday, November 27, 2018 9:39 PM
> > > > > > To: Bjorn Helgaas
> > > > > > Cc: Bharat Bhushan; linux-pci@vger.kernel.org;
> > > > > > linux-kernel@vger.kernel.org; bharatb.yadav@gmail.com;
> > > > > > David Daney; Jan Glauber; Maik Broemme; Chris Blake
> > > > > > Subject: Re: [PATCH] PCI: Mark NXP LS1088 to avoid bus reset bus
> > > > > >
> > > > > > On Tue, 27 Nov 2018 09:33:56 -0600 Bjorn Helgaas wrote:
> > > > > >
> > > > > > > 4) Is there a hardware erratum for this? If so, please
> > > > > > > include the URL here.
> > > > >
> > > > > No h/w errata as of now.
> > > >
> > > > Does that mean (a) the HW folks agree this is a hardware problem but
> > > > they haven't written an erratum, (b) there is an erratum but it
> > > > isn't public, (c) we don't have any concrete evidence of a hardware
> > > > problem, but things just don't work if we do a bus reset, (d) something
> > > > else?
> > >
> > > I will say it is (c) - not concluded to be hardware h/w issue.
> > >
> > > > > In pci_reset_secondary_bus() I have tried to increase the delay
> > > > > after reset but not helped.
> > > > > Do I need to add delay at some other place as well?
> > > >
> > > > No, I think the place you tried should be enough.
> > > >
> > > > You should also be able to exercise this from user-space by using
> > > > "setpci" to set and clear the Secondary Bus Reset bit in the Bridge
> > > > Control register. Then you can also use setpci to read/write config
> > > > space of the NIC. The kernel would normally read the Vendor and
> > > > Device IDs as the first access to the device during enumeration.
> > > > You also might be able to learn something by using "lspci -vv" on
> > > > the bridge before and after the reset to see if it logs any AER bits
> > > > (if it supports AER) or the other standard error logging bits.
> > >
> > > I tried below sequence for Secondary bus reset and device config space
> > > show 0xff
> > >
> > > root@localhost:~# lspci -x
> > > 0002:00:00.0 PCI bridge: Freescale Semiconductor Inc Device 80c0 (rev 10)
> > > 00: 57 19 c0 80 07 01 10 00 10 00 04 06 08 00 01 00
> > > 10: 00 00 00 00 00 00 00 00 00 01 ff 00 01 01 00 00
> > > 20: 00 40 00 40 f1 ff 01 00 00 00 00 00 00 00 00 00
> > > 30: 00 00 00 00 40 00 00 00 00 00 00 40 63 01 00 00
> > >
> > > 0002:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
> > > 00: 86 80 d3 10 06 04 10 00 00 00 00 02 10 00 00 00
> > > 10: 00 00 0c 40 00 00 00 40 01 00 00 00 00 00 0e 40
> > > 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 1f a0
> > > 30: 00 00 24 40 c8 00 00 00 00 00 00 00 63 01 00 00
> > >
> > > root@localhost:~# setpci -s 0002:00:00.0 0x3e.b=0x40
> > > root@localhost:~# setpci -s 0002:00:00.0 0x3e.b=0x00
> > >
> > > root@localhost:~# lspci -x
> > > 0002:00:00.0 PCI bridge: Freescale Semiconductor Inc Device 80c0 (rev 10)
> > > 00: 57 19 c0 80 07 01 10 00 10 00 04 06 08 00 01 00
> > > 10: 00 00 00 00 00 00 00 00 00 01 ff 00 01 01 00 00
> > > 20: 00 40 00 40 f1 ff 01 00 00 00 00 00 00 00 00 00
> > > 30: 00 00 00 00 40 00 00 00 00 00 00 40 63 01 00 00
> >
> > Just for curiosity sake, what if you re-write the secondary and subordinate
> > bus registers here:
> >
> > # setpci -s 0002:00:00.0 0x19.b=0x01
> > # setpci -s 0002:00:00.0 0x1a.b=0xff
>
> Result is same, here are logs
>
> root@localhost:~# setpci -s 0002:00:00.0 0x3e.b=0x40
> root@localhost:~# setpci -s 0002:00:00.0 0x3e.b=0x00
> root@localhost:~# setpci -s 0002:00:00.0 0x19.b=0x01
> root@localhost:~# setpci -s 0002:00:00.0 0x1a.b=0xff
> root@localhost:~# lspci -x
> 0002:00:00.0 PCI bridge: Freescale Semiconductor Inc Device 80c0 (rev 10)
> 00: 57 19 c0 80 07 01 10 00 10 00 04 06 08 00 01 00
> 10: 00 00 00 00 00 00 00 00 00 01 ff 00 01 01 00 00
> 20: 00 40 00 40 f1 ff 01 00 00 00 00 00 00 00 00 00
> 30: 00 00 00 00 40 00 00 00 00 00 00 40 63 01 00 00
>
> 0002:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection (rev ff)
> 00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> 30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Ok, thanks for scratching my itch.

> > IIRC the users that debugged the AMD bus reset issue re-wrote the entire 64
> > bytes of the bridge config header and then further narrowed the issue down
> > to the two registers above. If one bridge implementation can have such an
> > issue, maybe others do too. Perhaps there's common IP in use.
> >
> > Are you able to test other endpoints besides this e1000e device with
> > this setpci technique? Thanks,
>
> I tried with " Broadcom Limited NetXtreme BCM5722 Gigabit Ethernet PCI Express" I observe same issue.

Personally I'd exhaust talking with your hardware folks before blocking
bus resets at the software level, it seems like a gap in PCIe compliance
of the device. Thanks,

Alex
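
For anyone repeating the setpci experiment above on other endpoints, the before/after comparison of "lspci -x" output can be checked mechanically rather than by eye. What follows is only a minimal sketch, not part of any patch in this thread: the function names (parse_lspci_x, vendor_device, reads_all_ones, reset_asserted) are hypothetical helpers made up for illustration, and the sample text is the post-reset dump quoted above. It assumes the plain "lspci -x" layout shown in the logs (one device line followed by rows of 16 hex bytes) and the standard type 1 header layout, where Bridge Control sits at offset 0x3e and Secondary Bus Reset is bit 6 (0x40), matching the setpci writes used in the thread.

```python
import re

# One row of `lspci -x` output: two-digit hex offset, colon, up to 16 bytes.
HEX_ROW = re.compile(r"^[0-9a-f]{2}:((?: [0-9a-f]{2}){1,16})$")

# Type 1 (bridge) header: Bridge Control register and its reset bit.
BRIDGE_CONTROL = 0x3E
SECONDARY_BUS_RESET = 0x40  # bit 6 of Bridge Control


def parse_lspci_x(text):
    """Return {device address: bytearray of config bytes} from `lspci -x` text."""
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        row = HEX_ROW.match(line)
        if row is None:
            # Any line that is not a hex row is assumed to start a new device.
            current = line.split()[0]
            devices[current] = bytearray()
        elif current is not None:
            devices[current].extend(int(b, 16) for b in row.group(1).split())
    return devices


def vendor_device(config):
    """Decode the little-endian Vendor ID (0x00) and Device ID (0x02)."""
    return (config[0] | config[1] << 8, config[2] | config[3] << 8)


def reads_all_ones(config):
    """True when every byte read back as 0xff, i.e. the device did not respond."""
    return len(config) > 0 and all(b == 0xFF for b in config)


def reset_asserted(bridge_config):
    """True when Secondary Bus Reset is still set in Bridge Control."""
    return bool(bridge_config[BRIDGE_CONTROL] & SECONDARY_BUS_RESET)


# Post-reset dump copied from the log quoted in this thread.
AFTER_RESET = """\
0002:00:00.0 PCI bridge: Freescale Semiconductor Inc Device 80c0 (rev 10)
00: 57 19 c0 80 07 01 10 00 10 00 04 06 08 00 01 00
10: 00 00 00 00 00 00 00 00 00 01 ff 00 01 01 00 00
20: 00 40 00 40 f1 ff 01 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 40 63 01 00 00

0002:01:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection (rev ff)
00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
"""

if __name__ == "__main__":
    for addr, cfg in parse_lspci_x(AFTER_RESET).items():
        state = "no response (all 0xff)" if reads_all_ones(cfg) else "responding"
        print(addr, state)
```

Feeding it the dump taken after the reset flags the endpoint at 0002:01:00.0 as unresponsive while the bridge at 0002:00:00.0 still decodes normally with the reset bit cleared, which is the same observation reported in the logs.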