Date: Sat, 11 Jan 2020 08:45:10 +0800
From: Baoquan He
To: Jerry Hoemann
Cc: Khalid Aziz and Shuah Khan, Bjorn Helgaas, Kairui Song,
    linux-pci@vger.kernel.org, kexec@lists.infradead.org,
    linux-kernel@vger.kernel.org, Deepa Dinamani, Randy Wright,
    dyoung@redhat.com
Subject: Re: [RFC PATCH] PCI, kdump: Clear bus master bit upon shutdown in kdump kernel
Message-ID: <20200111004510.GA19291@MiWiFi-R3L-srv>
References: <20200110214217.GA88274@google.com> <20200110230003.GB1875851@anatevka.americas.hpqcorp.net>
In-Reply-To: <20200110230003.GB1875851@anatevka.americas.hpqcorp.net>

On 01/10/20 at 04:00pm, Jerry Hoemann wrote:
> > I am not understanding this failure mode either. That code in
> > pci_device_shutdown() was added originally to address this very issue.
> > The patch 4fc9bbf98fd6 ("PCI: Disable Bus Master only on kexec reboot")
> > shut down any errant DMAs from PCI devices as we kexec a new kernel. In
> > this new patch, this is the same code path that will be taken again when
> > kdump kernel is shutting down. If the errant DMA problem was not fixed
> > by clearing Bus Master bit in this path when kdump kernel was being
> > kexec'd, why does the same code path work the second time around when
> > kdump kernel is shutting down? Is there more going on that we don't
> > understand?
> >
>
> Khalid,
>
> I don't believe we execute that code path in the crash case.
>
> The variable kexec_in_progress is set true in kernel_kexec() before calling
> machine_kexec(). This is the fast reboot case.
>
> I don't see kexec_in_progress set true elsewhere.
>
>
> The code path for crash is different.
>
> For instance, panic() will call
>   -> __crash_kexec() which calls
>     -> machine_kexec().
>
> So the setting of kexec_in_progress is bypassed.

Yeah, it's a different behaviour from the kexec case. I talked to Kairui;
the patch log may not be very clear. Below is a summary of my
understanding of this issue:

~~~~~~~~~~~~~~~~~~~~~~~
Problem:

When a crash is triggered, the system jumps into the kdump kernel to
collect the vmcore and dump it out. After dumping is finished, the kdump
kernel will try to reboot to the normal kernel. The hang happened during
the kdump kernel reboot, when the dump goes over the network, e.g.
ssh/nfs, and the local storage is HPSA.

Root cause:

When configuring network dumping, only network driver modules are added
into the kdump initramfs. However, the HPSA storage PCIe device is enabled
in the 1st kernel, and its status is PCI_D3hot. When the crashed system
jumps to the kdump kernel, we don't shut down any device, for safety and
efficiency. Then during kdump kernel boot-up, the PCI scan finds the hpsa
device and only initializes its status as pci_dev->current_state =
PCI_UNKNOWN. pci_dev->current_state is updated only by the relevant
device driver, which is not loaded here. So the HPSA device never has a
chance to calibrate its status, and can't be shut down by
pci_device_shutdown() called by the reboot service. It's still PCI_D3hot,
and the crash happened when the system tried to shut down its upper
bridge.

Fix:

Here, Kairui uses a quirk to get the PM state and mask off values bigger
than PCI_D3cold. That means all devices will get a PM state of
pci_dev->current_state = PCI_D0 or PCI_D3hot. Finally, during the kdump
reboot stage, this device can be shut down successfully by clearing its
bus master bit.
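
To make the two conditions above concrete, here is a minimal userspace
sketch, not kernel code. It assumes the check in pci_device_shutdown()
has the form "kexec_in_progress && current_state <= PCI_D3hot" and that
the power states are ordered D0 < D1 < D2 < D3hot < D3cold < UNKNOWN;
both are assumptions here. struct model_dev and the model_*() helpers are
hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Power states in the assumed ordering (a model, not the kernel's type). */
enum model_power_state {
	PCI_D0 = 0, PCI_D1, PCI_D2, PCI_D3hot, PCI_D3cold, PCI_UNKNOWN
};

/* Hypothetical stand-in for the two pci_dev fields that matter here. */
struct model_dev {
	enum model_power_state current_state;
	bool bus_master;
};

/* Models the global flag that kernel_kexec() sets. */
static bool kexec_in_progress;

/* Assumed shape of the shutdown check: clear Bus Master only when the
 * flag is set and the tracked state is D3hot or shallower. */
static void model_pci_device_shutdown(struct model_dev *dev)
{
	if (kexec_in_progress && dev->current_state <= PCI_D3hot)
		dev->bus_master = false;
}

/* Fast reboot path: the flag is set before devices are shut down. */
static void model_kernel_kexec(struct model_dev *dev)
{
	kexec_in_progress = true;
	model_pci_device_shutdown(dev);
}

/* Crash path: panic() -> __crash_kexec() -> machine_kexec().
 * No device shutdown happens, so the device is left untouched. */
static void model_crash_kexec(struct model_dev *dev)
{
	(void)dev;
}
```

In this model a device that went through model_crash_kexec() keeps
bus_master set, and a device whose state is PCI_UNKNOWN is skipped by
model_pci_device_shutdown() even when the flag is set. Once the state is
known to be PCI_D0 or PCI_D3hot, the same call clears the bit.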
~~~~~~~~~~~~~~~

About this patch, I think the quirk that gets the active PM state for all
devices may be risky: it will also impact the normal kernel, which doesn't
have this issue.

Wondering if there's any other way to fix or work around it.

Thanks
Baoquan