Date: Tue, 10 Dec 2019 09:28:00 +0200
From: "mika.westerberg@linux.intel.com"
To: Nicholas Johnson
Cc: Bjorn Helgaas, "linux-kernel@vger.kernel.org", "linux-pci@vger.kernel.org"
Subject: Re: Linux v5.5 serious PCI bug
Message-ID: <20191210072800.GY2665@lahna.fi.intel.com>
References: <20191209131239.GP2665@lahna.fi.intel.com>
Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

On Mon, Dec 09, 2019 at 01:33:49PM +0000, Nicholas Johnson wrote:
> On Mon, Dec 09, 2019 at 03:12:39PM +0200, mika.westerberg@linux.intel.com wrote:
> > On Mon, Dec 09, 2019 at 12:34:04PM +0000, Nicholas Johnson wrote:
> > > Hi,
> > >
> > > I have compiled Linux v5.5-rc1 and thought all was good until I
> > > hot-removed a Gigabyte Aorus eGPU from Thunderbolt. The driver for the
> > > GPU was not loaded (blacklisted) so the crash is nothing to do with the
> > > GPU driver.
> > >
> > > We had:
> > > - kernel NULL pointer dereference
> > > - refcount_t: underflow; use-after-free.
> > >
> > > Attaching dmesg for now; will bisect and come back with results.
> >
> > Looks like something related to iommu. Does it work if you disable it?
> > (intel_iommu=off in the command line).
> I thought it could be that, too.
>
> The attachment "dmesg-4" from the original email is with iommu parameters.
> The attachment "dmesg-5" from the original email is with no iommu parameters.
> Attaching here "dmesg-6" with the iommu explicitly set off like you said.
>
> No difference, still broken. Although, with iommu off, there are less stack traces.
>
> Could it be sysfs-related?

Bisect would probably be the best option to find the culprit commit.

There are a couple of commits done for pciehp, so reverting them one by one
may help as well:

  87d0f2a5536f PCI: pciehp: Prevent deadlock on disconnect
  75fcc0ce72e5 PCI: pciehp: Do not disable interrupt twice on suspend
  b94ec12dfaee PCI: pciehp: Refactor infinite loop in pcie_poll_cmd()
  157c1062fcd8 PCI: pciehp: Avoid returning prematurely from sysfs requests
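A rough sketch of how the one-by-one reverts could be scripted, in case it
saves some typing. It reads "one by one" as: start from a clean v5.5-rc1 each
time and revert a single commit on top of it. That reading is an assumption,
as are the branch names and the helper names (revert_plan, run_plan); only the
commit IDs and subjects come from the list above. The script only prints the
git commands unless run_plan() is called against a real kernel tree. Building
the kernel and repeating the hot-remove test still has to be done by hand
after each revert.

```python
import subprocess

# Commit IDs and subjects copied from the list in this mail.
PCIEHP_COMMITS = [
    ("87d0f2a5536f", "PCI: pciehp: Prevent deadlock on disconnect"),
    ("75fcc0ce72e5", "PCI: pciehp: Do not disable interrupt twice on suspend"),
    ("b94ec12dfaee", "PCI: pciehp: Refactor infinite loop in pcie_poll_cmd()"),
    ("157c1062fcd8", "PCI: pciehp: Avoid returning prematurely from sysfs requests"),
]

BASE_TAG = "v5.5-rc1"  # the kernel version reported as broken in this thread


def revert_plan(commit, base=BASE_TAG):
    """Return the git commands that revert a single commit on top of `base`."""
    branch = f"revert-{commit}"  # hypothetical branch name, one per candidate
    return [
        ["git", "checkout", "-b", branch, base],
        ["git", "revert", "--no-edit", commit],
    ]


def run_plan(plan, repo_dir):
    """Run the commands inside a kernel tree, stopping at the first failure."""
    for cmd in plan:
        subprocess.run(cmd, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of touching a repository.
    for commit, subject in PCIEHP_COMMITS:
        print(f"# {subject}")
        for cmd in revert_plan(commit):
            print(" ".join(cmd))
```

If a revert does not apply cleanly, git revert stops with a conflict and
run_plan() raises, so that candidate would need to be handled manually.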