Subject: Re: [PATCH 3/3] x86/quirks: Add parameter to clear MSIs early on boot
From: Sinan Kaya
Date: Thu, 18 Oct 2018 16:30:22 -0400
To: "Guilherme G. Piccoli", linux-pci@vger.kernel.org,
    kexec@lists.infradead.org, x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bhelgaas@google.com, dyoung@redhat.com,
    bhe@redhat.com, vgoyal@redhat.com, tglx@linutronix.de, mingo@redhat.com,
    bp@alien8.de, hpa@zytor.com, andi@firstfloor.org, lukas@wunner.de,
    billy.olsen@canonical.com, cascardo@canonical.com,
    ddstreet@canonical.com, fabiomirmar@canonical.com,
    gavin.guo@canonical.com, jay.vosburgh@canonical.com,
    kernel@gpiccoli.net, mfo@canonical.com, shan.gavin@linux.alibaba.com
Message-ID: <50d84d48-eebf-ed91-8148-be727f76883f@kernel.org>
In-Reply-To: <12d6175b-7f09-872a-61c4-700e905579c7@canonical.com>
References: <20181018183721.27467-1-gpiccoli@canonical.com>
    <20181018183721.27467-3-gpiccoli@canonical.com>
    <6fd4e2d2-c0ac-b26d-9a14-0379b4421679@kernel.org>
    <12d6175b-7f09-872a-61c4-700e905579c7@canonical.com>

On 10/18/2018 4:13 PM, Guilherme G. Piccoli wrote:
>> These kind of issues are usually fixed by fixing the network driver's
>> shutdown routine to ensure that MSI interrupts are cleared there.
>
> Sinan, I'm not sure shutdown handlers for drivers are called in panic
> kexec (I remember of an old experiment I did, loading a kernel
> with "kexec -p" didn't trigger the handlers).

AFAIK, all shutdown (not remove) routines are called before launching the
next kernel, even in a crash scenario. It is not safe to start the new
kernel while hardware is doing DMA to system memory and triggering
interrupts.

The shutdown routine in the PCI core used to disable MSI/MSI-X on behalf of
all endpoints, but it was later decided that this is the responsibility of
the endpoint driver.

commit fda78d7a0ead144f4b2cdb582dcba47911f4952c
Author: Prarit Bhargava
Date:   Thu Jan 26 14:07:47 2017 -0500

    PCI/MSI: Stop disabling MSI/MSI-X in pci_device_shutdown()

    The pci_bus_type .shutdown method, pci_device_shutdown(), is called from
    device_shutdown() in the kernel restart and shutdown paths.

    Previously, pci_device_shutdown() called pci_msi_shutdown() and
    pci_msix_shutdown().  This disables MSI and MSI-X, which causes the device
    to fall back to raising interrupts via INTx.  But the driver is still bound
    to the device, it doesn't know about this change, and it likely doesn't
    have an INTx handler, so these INTx interrupts cause "nobody cared"
    warnings like this:

      irq 16: nobody cared (try booting with the "irqpoll" option)
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.2-1.el7_UNSUPPORTED.x86_64 #1
      Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/
      ...

    The MSI disabling code was added by d52877c7b1af ("pci/irq: let
    pci_device_shutdown to call pci_msi_shutdown v2") because a driver left MSI
    enabled and kdump failed because the kexeced kernel wasn't prepared to
    receive the MSI interrupts.

    Subsequent commits 1851617cd2da ("PCI/MSI: Disable MSI at enumeration even
    if kernel doesn't support MSI") and e80e7edc55ba ("PCI/MSI: Initialize MSI
    capability for all architectures") changed the kexeced kernel to disable
    all MSIs itself so it no longer depends on the crashed kernel to clean up
    after itself.

    Stop disabling MSI/MSI-X in pci_device_shutdown().  This resolves the
    "nobody cared" unhandled IRQ issue above.  It also allows PCI serial
    devices, which may rely on the MSI interrupts, to continue outputting
    messages during reboot/shutdown.

    [bhelgaas: changelog, drop pci_msi_shutdown() and pci_msix_shutdown() calls
    altogether]
    Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=187351
    Signed-off-by: Prarit Bhargava
    Signed-off-by: Bjorn Helgaas
    CC: Alex Williamson
    CC: David Arcari
    CC: Myron Stowe
    CC: Lukas Wunner
    CC: Keith Busch
    CC: Mika Westerberg

>
> But this case is even worse, because the NICs were in PCI passthrough
> mode, using vfio. So, they were completely unaware of what happened
> in the host kernel.
>
> Also, this is spec compliant - system reset events should guarantee the
> bits are cleared (although kexec is not exactly a system reset, it's
> similar)
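
For readers who want to see which bits are being talked about, here is a
minimal userspace sketch in Python. It is not the code from the patch under
discussion and not kernel code; it only models the idea on a plain 256-byte
config space image. The function name clear_msi_enable() and the demo values
are made up for illustration. The register offsets and masks follow the PCI
capability layout, with constant names borrowed from the kernel's pci_regs.h:
the walk starts at the capabilities pointer (0x34) and clears the enable bit
in the Message Control word of any MSI (ID 0x05) or MSI-X (ID 0x11)
capability.

```python
import struct

PCI_STATUS = 0x06               # 16-bit Status register
PCI_STATUS_CAP_LIST = 0x10      # Status bit 4: capability list present
PCI_CAPABILITY_LIST = 0x34      # 8-bit pointer to the first capability
PCI_CAP_ID_MSI = 0x05
PCI_CAP_ID_MSIX = 0x11
PCI_MSI_FLAGS = 2               # Message Control offset in both capabilities
PCI_MSI_FLAGS_ENABLE = 0x0001   # MSI Enable, bit 0
PCI_MSIX_FLAGS_ENABLE = 0x8000  # MSI-X Enable, bit 15

# Capability ID -> (label, enable mask in the Message Control word)
ENABLE_BITS = {
    PCI_CAP_ID_MSI: ("MSI", PCI_MSI_FLAGS_ENABLE),
    PCI_CAP_ID_MSIX: ("MSI-X", PCI_MSIX_FLAGS_ENABLE),
}


def clear_msi_enable(cfg):
    """Clear MSI/MSI-X enable bits in a 256-byte config space image.

    cfg is a bytearray standing in for real config space accesses
    (an assumption of this sketch). Returns the labels of the
    capabilities whose enable bit was set and has now been cleared.
    """
    cleared = []
    (status,) = struct.unpack_from("<H", cfg, PCI_STATUS)
    if not status & PCI_STATUS_CAP_LIST:
        return cleared                      # device has no capability list

    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC   # low two bits are reserved
    hops = 0
    while pos >= 0x40 and hops < 48:        # bound the walk against loops
        cap_id, nxt = cfg[pos], cfg[pos + 1]
        if cap_id in ENABLE_BITS:
            label, mask = ENABLE_BITS[cap_id]
            (ctrl,) = struct.unpack_from("<H", cfg, pos + PCI_MSI_FLAGS)
            if ctrl & mask:
                struct.pack_into("<H", cfg, pos + PCI_MSI_FLAGS, ctrl & ~mask)
                cleared.append(label)
        pos = nxt & 0xFC
        hops += 1
    return cleared


if __name__ == "__main__":
    # Hypothetical device: MSI capability at 0x50, MSI-X at 0x70, both enabled.
    cfg = bytearray(256)
    struct.pack_into("<H", cfg, PCI_STATUS, PCI_STATUS_CAP_LIST)
    cfg[PCI_CAPABILITY_LIST] = 0x50
    cfg[0x50], cfg[0x51] = PCI_CAP_ID_MSI, 0x70
    struct.pack_into("<H", cfg, 0x52, 0x0081)
    cfg[0x70], cfg[0x71] = PCI_CAP_ID_MSIX, 0x00
    struct.pack_into("<H", cfg, 0x72, 0x8007)
    print(clear_msi_enable(cfg))
```

Only the enable bits are touched; the other Message Control fields (for
example the MSI-X table size) are left as they were, so a driver that binds
later can still program the capability normally.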