From: Deepa Dinamani
Date: Tue, 14 Jan 2020 17:16:41 -0800
Subject: Re: [RFC PATCH] PCI, kdump: Clear bus master bit upon shutdown in kdump kernel
To: Kairui Song
Cc: Khalid Aziz, Baoquan He, Jerry Hoemann, Bjorn Helgaas, linux-pci@vger.kernel.org, kexec@lists.infradead.org, Linux Kernel Mailing List, Randy Wright
References: <20200110214217.GA88274@google.com> <20200110230003.GB1875851@anatevka.americas.hpqcorp.net> <20200111005041.GB19291@MiWiFi-R3L-srv>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 13, 2020 at 9:07 AM Kairui Song wrote:
>
> On Sun, Jan 12, 2020 at 2:33 AM Deepa Dinamani wrote:
> >
> > > Hi, there are some previous works about this issue, reset PCI devices
> > > in kdump kernel to stop ongoing DMA:
> > >
> > > [v7,0/5] Reset PCIe devices to address DMA problem on kdump with iommu
> > > https://lore.kernel.org/patchwork/cover/343767/
> > >
> > > [v2] PCI: Reset PCIe devices to stop ongoing DMA
> > > https://lore.kernel.org/patchwork/patch/379191/
> > >
> > > And didn't get merged, that patch are trying to fix some DMAR error
> > > problem, but resetting devices is a bit too destructive, and the
> > > problem is later fixed in IOMMU side. And in most case the DMA seems
> > > harmless, as they targets first kernel's memory and kdump kernel only
> > > live in crash memory.
> >
> > I was going to ask the same. If the kdump kernel had IOMMU on, would
> > that still be a problem?
>
> It will still fail, doing DMA is not a problem, it only go wrong when
> a device's upstream bridge is mistakenly shutdown before the device
> shutdown.
>
> >
> > > Also, by the time kdump kernel is able to scan and reset devices,
> > > there are already a very large time window where things could go
> > > wrong.
> > >
> > > The currently problem observed only happens upon kdump kernel
> > > shutdown, as the upper bridge is disabled before the device is
> > > disabled, so DMA will raise error. It's more like a problem of wrong
> > > device shutting down order.
> >
> > The way it was described earlier "During this time, the SUT sometimes
> > gets a PCI error that raises an NMI." suggests that it isn't really
> > restricted to kexec/kdump.
> > Any attached device without an active driver might attempt spurious or
> > malicious DMA and trigger the same during normal operation.
> > Do you have available some more reporting of what happens during the
> > PCIe error handling?
>
> Let me add more info about this:
>
> On the machine where I can reproduce this issue, the first kernel
> always runs fine, and kdump kernel works fine during dumping the
> vmcore, even if I keep the kdump kernel running for hours, nothing
> goes wrong. If there are DMA during normal operation that will cause
> problem, this should have exposed it.
>
> The problem only occur when kdump kernel try to reboot, no matter how
> long the kdump kernel have been running (few minutes or hours).
> The machine is dead after printing:
> [ 101.438300] reboot: Restarting system
> [ 101.455360] reboot: machine restart
>
> And I can find following logs happened just at that time, in the
> "Integrated Management Log" from the iLO web interface:
> 1254 OS 12/25/2019 09:08 12/25/2019 09:08 1 User Remotely Initiated NMI Switch
> 1253 System Error 12/25/2019 09:08 12/25/2019 09:08 1 An Unrecoverable System Error (NMI) has occurred (Service Information: 0x00000000, 0x00000000)
> 1252 PCI Bus 12/25/2019 09:07 12/25/2019 09:07 1 Uncorrectable PCI Express Error (Embedded device, Bus 0, Device 2, Function 2, Error status 0x00100000)
> 1251 System Error 12/25/2019 09:07 12/25/2019 09:07 1 Unrecoverable System Error (NMI) has occurred. System Firmware will log additional details in a separate IML entry if possible
> 1250 PCI Bus 12/25/2019 09:07 12/25/2019 09:07 1 PCI Bus Error (Slot 0, Bus 0, Device 2, Function 2)
>
> And the topology is:
> [0000:00]-+-00.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMI2
>           +-01.0-[02]--
>           +-01.1-[05]--
>           +-02.0-[06]--+-00.0  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.1  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.2  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.3  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.4  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.5  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            +-00.6  Emulex Corporation OneConnect NIC (Skyhawk)
>           |            \-00.7  Emulex Corporation OneConnect NIC (Skyhawk)
>           +-02.1-[0f]--
>           +-02.2-[07]----00.0  Hewlett-Packard Company Smart Array Gen9 Controllers
>
> It's a bridge reporting the error. It should be an unsupported request
> error, because downstream device is still alive and sending request,
> but the port have bus mastering off. If I manually shutdown the "Smart
> Array" (HPSA) device before kdump reboot, it will always reboot just
> fine.
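
The error status 0x00100000 quoted above can be checked against the AER Uncorrectable Error Status bit layout. The following is a minimal sketch in C, assuming the iLO "Error status" field is the raw value of that register (the log itself does not say so). The masks follow the PCI_ERR_UNC_* definitions in Linux's include/uapi/linux/pci_regs.h; the table and the function aer_uncor_decode are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One named bit of the AER Uncorrectable Error Status register. */
struct aer_bit {
    uint32_t mask;
    const char *name;
};

/* Masks as in the PCI_ERR_UNC_* definitions of Linux's pci_regs.h. */
static const struct aer_bit aer_uncor_bits[] = {
    { 0x00000010u, "Data Link Protocol Error" },
    { 0x00000020u, "Surprise Down" },
    { 0x00001000u, "Poisoned TLP" },
    { 0x00002000u, "Flow Control Protocol Error" },
    { 0x00004000u, "Completion Timeout" },
    { 0x00008000u, "Completer Abort" },
    { 0x00010000u, "Unexpected Completion" },
    { 0x00020000u, "Receiver Overflow" },
    { 0x00040000u, "Malformed TLP" },
    { 0x00080000u, "ECRC Error" },
    { 0x00100000u, "Unsupported Request" },
    { 0x00200000u, "ACS Violation" },
};

/*
 * Hypothetical helper: store the names of the known bits set in `status`
 * into `names` (at most `max` entries) and return how many were stored.
 */
size_t aer_uncor_decode(uint32_t status, const char **names, size_t max)
{
    size_t n = 0;

    assert(names != NULL);
    for (size_t i = 0; i < sizeof(aer_uncor_bits) / sizeof(aer_uncor_bits[0]); i++) {
        if ((status & aer_uncor_bits[i].mask) && n < max)
            names[n++] = aer_uncor_bits[i].name;
    }
    return n;
}
```

Under that assumption, decoding 0x00100000 yields the single entry "Unsupported Request", which is consistent with the reading given in the quoted text.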
>
> And as the patch descriptions said, the HPSA is used in first kernel,
> but didn't get reset in kdump kernel because driver is not loaded.
> When shutting down a bridge, kernel should shutdown downstream device
> first, and then shutdown and clear bus master bit of the bridge. But
> in kdump case, kernel skipped some device shutdown due to driver not
> loaded issue, and kernel don't know they are enabled.
>
> This problem is not limited to HPSA, the NIC listed in above topology
> maybe also make the bridge error out, if HPSA get loaded in kdump
> kernel and NIC get ignored.

It looks like the right answer is for the kernel to handle such cases
gracefully. From what I recall, we can only trust the bus mastering at
root ports. So, it is possible that the endpoint devices can always try
to DMA, but it can be blocked by the root port. So the right fix seems
to be to teach the kernel how to handle these instead of hacking the
shutdown code.

-Deepa
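
For reference, the shutdown-ordering problem described in the quoted text can be modelled in a few lines of C. This is only a sketch under stated assumptions: struct dev, shutdown_subtree and count_blocked_masters are hypothetical names, the structure is a toy model rather than the kernel's struct pci_dev, and "skipping unbound devices" is a simplification of what the thread describes for the kdump kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical, simplified model of a PCI device (not struct pci_dev). */
struct dev {
    const char *name;
    bool bus_master;   /* models the Bus Master Enable bit */
    bool has_driver;   /* false for devices the kdump kernel did not bind */
    struct dev *children[MAX_CHILDREN];
    size_t nchildren;
};

/*
 * Walk the subtree depth-first so that every downstream device is handled
 * before its bridge. With skip_unbound set, devices without a driver are
 * left untouched, which models the kdump case from the thread.
 */
void shutdown_subtree(struct dev *d, bool skip_unbound)
{
    assert(d != NULL);
    for (size_t i = 0; i < d->nchildren; i++)
        shutdown_subtree(d->children[i], skip_unbound);
    if (skip_unbound && !d->has_driver)
        return;
    d->bus_master = false;
}

/*
 * Count devices that can still issue DMA although some bridge above them
 * already has bus mastering off (the situation that produced the error).
 */
size_t count_blocked_masters(const struct dev *d, bool upstream_off)
{
    size_t n = (upstream_off && d->bus_master) ? 1 : 0;
    bool off = upstream_off || !d->bus_master;

    for (size_t i = 0; i < d->nchildren; i++)
        n += count_blocked_masters(d->children[i], off);
    return n;
}
```

With a root port that has a driver and an HPSA endpoint that does not, shutdown_subtree(&port, true) leaves the endpoint mastering behind a port with bus mastering off, and count_blocked_masters reports one such device. Passing false instead clears the endpoint first and the count drops to zero. The model says nothing about which fix is preferable; it only makes the ordering dependency explicit.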