From: Bin Meng
Date: Mon, 8 Oct 2018 17:44:08 +0800
Subject: Re: [PATCH] pci: Add a few new IDs for Intel GPU "spurious interrupt" quirk
To: helgaas@kernel.org
Cc: Bjorn Helgaas, linux-pci, Thomas Jarosch, stable,
 jani.nikula@linux.intel.com, joonas.lahtinen@linux.intel.com,
 rodrigo.vivi@intel.com, intel-gfx@lists.freedesktop.org,
 dri-devel@lists.freedesktop.org, linux-kernel
References: <1537974841-29928-1-git-send-email-bmeng.cn@gmail.com>
 <20180926165721.GA28024@bhelgaas-glaptop.roam.corp.google.com>
 <20181003201244.GG120535@bhelgaas-glaptop.roam.corp.google.com>
In-Reply-To: <20181003201244.GG120535@bhelgaas-glaptop.roam.corp.google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Bjorn,

On Thu, Oct 4, 2018 at 4:12 AM Bjorn Helgaas wrote:
>
> On Thu, Sep 27, 2018 at 10:10:07AM +0800, Bin Meng wrote:
> > On Thu, Sep 27, 2018 at 12:57 AM Bjorn Helgaas wrote:
> > > On Wed, Sep 26, 2018 at 08:14:01AM -0700, Bin Meng wrote:
> > > > Add more PCI IDs to the Intel GPU "spurious interrupt" quirk table,
> > > > which are known to break.
> > >
> > > Do you have a reference for this? Any public bug reports, bugzilla,
> > > Intel spec reference or errata? "Which are known to break" is pretty
> > > vague.
> >
> > Sorry I used wrong words and should have been clearer. These devices
> > are validated to be broken.
> > The test I used is very simple, just
> > unplug the VGA cable and plug it again, and "spurious interrupt" will
> > be seen on the interrupt line of the IGD device. I was not aware of
> > any public bugs filed to Intel, nor seen any errata from Intel.
>
> The original commit, f67fd55fa96f ("PCI: Add quirk for still enabled
> interrupts on Intel Sandy Bridge GPUs"), says some systems "crash"
> (not sure if that means an oops or an actual crash that requires a
> reboot) and on other systems, Linux disables the shared interrupt
> line. I assume disabling the interrupt line keeps devices using that
> line from working, but does not directly cause a crash.
>

Correct, disabling the shared interrupt line keeps all devices using
that line from working, which is the current kernel's behavior without
this quirk handling: it disables the (shared) interrupt line after
100,000+ generated interrupts. The side effect is that other devices
become unusable after that (e.g. USB devices which share the same
interrupt line with the Intel GPU). That's why the original commit,
f67fd55fa96f ("PCI: Add quirk for still enabled interrupts on Intel
Sandy Bridge GPUs"), disables the GPU's interrupt directly, which
should really be done by the VGA BIOS itself (a buggy VBIOS!).

> What specific symptom do you see here? I think it might be useful to
> collect details, e.g., dmesg logs, /proc/interrupts contents, output
> of "sudo lspci -vv", etc., for the systems you're quirking here. I'm
> hoping we can eventually figure out a solution that doesn't require a
> quirk for every new GPU, and maybe that info will help find it.
>

The symptom was described briefly in the original commit f67fd55fa96f
too: the kernel disables the (shared) interrupt line after 100,000+
generated interrupts (can be observed via /proc/interrupts).
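
To make the "disabled after 100,000+ interrupts" behavior concrete,
here is a minimal userspace sketch in C. It is not kernel code. The
names irq_line, note_irq and storm are hypothetical, and the two
thresholds (a window of 100,000 interrupts, of which more than 99,900
went unclaimed) are my understanding of the kernel's spurious-interrupt
check, so please treat them as assumptions.

```c
#include <stdbool.h>

/* Assumed thresholds, modelled on my reading of the kernel's check. */
#define IRQ_WINDOW       100000u  /* interrupts per evaluation window */
#define UNHANDLED_LIMIT   99900u  /* unclaimed count that trips the line */

/* Hypothetical model of one (possibly shared) interrupt line. */
struct irq_line {
	unsigned int count;      /* interrupts seen in the current window */
	unsigned int unhandled;  /* of those, how many no handler claimed */
	bool disabled;           /* set once the line has been shut off */
};

/*
 * Record one interrupt on the line. 'handled' says whether any handler
 * claimed it. Returns true if this call caused the line to be disabled.
 */
static bool note_irq(struct irq_line *line, bool handled)
{
	bool trip;

	if (line->disabled)
		return false;
	if (!handled)
		line->unhandled++;
	if (++line->count < IRQ_WINDOW)
		return false;

	/* Window is full: decide, then start a fresh window. */
	trip = line->unhandled > UNHANDLED_LIMIT;
	line->count = 0;
	line->unhandled = 0;
	if (trip)
		line->disabled = true;
	return trip;
}

/*
 * Deliver n interrupts, all handled or all unhandled. Returns the
 * 1-based index of the interrupt that disabled the line, or 0 if the
 * line is still enabled afterwards.
 */
static unsigned long storm(struct irq_line *line, unsigned long n, bool handled)
{
	unsigned long i;

	for (i = 1; i <= n; i++)
		if (note_irq(line, handled))
			return i;
	return 0;
}
```

With a zero-initialized struct irq_line, a call such as
storm(&line, 200000, false) models the IGD raising interrupts that
nobody claims. In this model the line is shut off for every device
sharing it, which is the failure the quirk avoids by silencing the GPU
itself.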
> > > > See commit f67fd55fa96f ("PCI: Add quirk for still enabled interrupts
> > > > on Intel Sandy Bridge GPUs"), and commit 7c82126a94e6 ("PCI: Add new
> > > > ID for Intel GPU "spurious interrupt" quirk") for some history.
> > > >
> > > > Based on current findings, it is highly possible that all Intel
> > > > 1st/2nd/3rd generation Core processors' IGD has such quirk.
> > >
> > > Can you include a reference to these "current findings"? I assume you
> > > have bug reports that include the device IDs you're adding? If not,
> > > how did you build this list of new IDs?
> >
> > By "current findings" I mean given the IDs we have here, plus previous
> > one added by Thomas, it's highly possible this VGA BIOS bug exists in
> > every 1st/2nd/3rd generation Core processors.
> >
> > > The function comment added by f67fd55fa96f ("PCI: Add quirk for still
> > > enabled interrupts on Intel Sandy Bridge GPUs") suggests that this is
> > > actually a BIOS issue, not a hardware erratum, i.e., I don't see
> > > anything there that suggests a hardware defect.
> > >
> > > But there must be a hole somewhere -- the kernel can't be expected to
> > > disable interrupts in device-specific ways when there's no driver
> > > loaded. Maybe it's simply a BIOS defect or maybe there's some
> > > interrupt or _PRT-related setup we're missing.
> >
> > It's a pure VGA BIOS bug, not the BIOS bug or _PRT etc. The VGA BIOS
> > forgot to turn off the interrupt on these devices.
>
> If this is a VGA BIOS defect, it's not very likely that it will
> magically be fixed for all new Intel GPUs, so in effect it sounds like
> we need to update this list of quirks in Linux every time a new Intel
> GPU comes out. That prospect is a little daunting.
>

I don't have a newer Intel board at hand for testing right now. I can
try to locate one. But as I said, it's highly possible that at least
all 1st/2nd/3rd generation Core processors are affected.
Maybe we can add all the known GPU devices of 1st/2nd/3rd generation
Core processors together for now? For newer GPUs, let's wait until
someone reports the issue again?

> Do you happen to know if Windows has the same problem? I.e., if you
> boot an old version of Windows with a new GPU, and unplug the VGA
> cable, does Windows crash? If Windows can figure out how to handle
> that situation gracefully, Linux should be able to do it, too.
>

I suspect Windows cannot handle it either. Without GPU awareness, the
interrupt line is simply left enabled while no driver claims the
device, which causes issues. I can test this.

> > > > Signed-off-by: Bin Meng
> > > > Cc: # v3.4+
> > > > ---
> > > >
> > > >  drivers/pci/quirks.c | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > > index 6bc27b7..c0673a7 100644
> > > > --- a/drivers/pci/quirks.c
> > > > +++ b/drivers/pci/quirks.c
> > > > @@ -3190,7 +3190,11 @@ static void disable_igfx_irq(struct pci_dev *dev)
> > > >
> > > >       pci_iounmap(dev, regs);
> > > >  }
> > > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0042, disable_igfx_irq);
> > > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0046, disable_igfx_irq);
> > > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x004a, disable_igfx_irq);
> > > >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0102, disable_igfx_irq);
> > > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0106, disable_igfx_irq);
> > > >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x010a, disable_igfx_irq);
> > > >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0152, disable_igfx_irq);
> > > >
> > > > --

Regards,
Bin