Date: Fri, 17 Jan 2020 11:24:13 +0800
From: Dave Young
To: Khalid Aziz
Cc: Kairui Song, Baoquan He, linux-pci@vger.kernel.org, kexec@lists.infradead.org, Jerry Hoemann, Randy Wright, Linux Kernel Mailing List, Bjorn Helgaas, Deepa Dinamani
Subject: Re: [RFC PATCH] PCI, kdump: Clear bus master bit upon shutdown in kdump kernel
Message-ID: <20200117032413.GA16906@dhcp-128-65.nay.redhat.com>
References: <20200110230003.GB1875851@anatevka.americas.hpqcorp.net> <20200111005041.GB19291@MiWiFi-R3L-srv> <6b56ce15-5a5a-97b7-ded1-1fd88fec26eb@gonehiking.org>
In-Reply-To: <6b56ce15-5a5a-97b7-ded1-1fd88fec26eb@gonehiking.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/15/20 at 02:17pm, Khalid Aziz wrote:
> On 1/15/20 11:05 AM, Kairui Song wrote:
> > On Thu, Jan 16, 2020 at 1:31 AM Khalid Aziz wrote:
> >>
> >> On 1/13/20 10:07 AM, Kairui Song wrote:
> >>> On Sun, Jan 12, 2020 at 2:33 AM Deepa Dinamani wrote:
> >>>>
> >>>>> Hi, there are some previous works about this issue, reset PCI devices
> >>>>> in kdump kernel to stop ongoing DMA:
> >>>>>
> >>>>> [v7,0/5] Reset PCIe devices to address DMA problem on kdump with iommu
> >>>>> https://lore.kernel.org/patchwork/cover/343767/
> >>>>>
> >>>>> [v2] PCI: Reset PCIe devices to stop ongoing DMA
> >>>>> https://lore.kernel.org/patchwork/patch/379191/
> >>>>>
> >>>>> And didn't get merged, that patch are trying to fix some DMAR error
> >>>>> problem, but resetting devices is a bit too destructive, and the
> >>>>> problem is later fixed in IOMMU side. And in most case the DMA seems
> >>>>> harmless, as they targets first kernel's memory and kdump kernel only
> >>>>> live in crash memory.
> >>>>
> >>>> I was going to ask the same. If the kdump kernel had IOMMU on, would
> >>>> that still be a problem?
> >>>
> >>> It will still fail, doing DMA is not a problem, it only go wrong when
> >>> a device's upstream bridge is mistakenly shutdown before the device
> >>> shutdown.
> >>>
> >>>>
> >>>>> Also, by the time kdump kernel is able to scan and reset devices,
> >>>>> there are already a very large time window where things could go
> >>>>> wrong.
> >>>>>
> >>>>> The currently problem observed only happens upon kdump kernel
> >>>>> shutdown, as the upper bridge is disabled before the device is
> >>>>> disabled, so DMA will raise error. It's more like a problem of wrong
> >>>>> device shutting down order.
> >>>>
> >>>> The way it was described earlier "During this time, the SUT sometimes
> >>>> gets a PCI error that raises an NMI." suggests that it isn't really
> >>>> restricted to kexec/kdump.
> >>>> Any attached device without an active driver might attempt spurious or
> >>>> malicious DMA and trigger the same during normal operation.
> >>>> Do you have available some more reporting of what happens during the
> >>>> PCIe error handling?
> >>>
> >>> Let me add more info about this:
> >>>
> >>> On the machine where I can reproduce this issue, the first kernel
> >>> always runs fine, and kdump kernel works fine during dumping the
> >>> vmcore, even if I keep the kdump kernel running for hours, nothing
> >>> goes wrong. If there are DMA during normal operation that will cause
> >>> problem, this should have exposed it.
> >>>
> >>
> >> This is the part that is puzzling me. Error shows up only when kdump
> >> kernel is being shut down. kdump kernel can run for hours without this
> >> issue. What is the operation from downstream device that is resulting in
> >> uncorrectable error - is it indeed a DMA request? Why does that
> >> operation from downstream device not happen until shutdown?
> >>
> >> I just want to make sure we fix the right problem in the right way.
> >>
> >
> > Actually the device could keep sending request with no problem during
> > kdump kernel running. Eg. keep sending DMA, and all DMA targets first
> > kernel's system memory, so kdump runs fine as long as nothing touch
> > the reserved crash memory. And the error is reported by the port, when
> > shutdown it has bus master bit, and downstream request will cause
> > error.
> >
>
> Problem really is there are active devices while kdump kernel is
> running. You did say earlier - "And in most case the DMA seems
> harmless, as they targets first kernel's memory and kdump kernel only
> live in crash memory.". Even if this holds today, it is going to break
> one of these days. There is the "reset_devices" option but that does not
> work if driver is not loaded by kdump kernel. Can we try to shut down
> devices in machine_crash_shutdown() before we start kdump kernel?

It is not a good idea :) We do not add extra logic after a panic because
the kernel is not stable and we want a correct vmcore.

Similar suggestions have been rejected many times.

Thanks
Dave