Date: Sat, 28 Apr 2018 08:56:20 +0800
From: Dave Young
To: Bjorn Helgaas
Cc: Sinan Kaya, linux-pci@vger.kernel.org, Paul Menzel, kexec@lists.infradead.org, linux-kernel@vger.kernel.org, Lukas Wunner, Eric Biederman, Bjorn Helgaas, Vivek Goyal
Subject: Re: pciehp 0000:00:1c.0:pcie004: Timeout on hotplug command 0x1038 (issued 65284 msec ago)
Message-ID: <20180428005620.GB1675@dhcp-128-65.nay.redhat.com>
References: <8770820b-85a0-172b-7230-3a44524e6c9f@molgen.mpg.de> <20180427192207.GG8199@bhelgaas-glaptop.roam.corp.google.com> <20180427211255.GI8199@bhelgaas-glaptop.roam.corp.google.com>
In-Reply-To: <20180427211255.GI8199@bhelgaas-glaptop.roam.corp.google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/27/18 at 04:12pm, Bjorn Helgaas wrote:
> [+cc Eric, Vivek, kexec list]
>
> On Fri, Apr 27, 2018 at 03:34:30PM -0400, Sinan Kaya wrote:
> > On 4/27/2018 3:22 PM, Bjorn Helgaas wrote:
> > > Sinan mooted the idea of using a "no-wait" path of sending the "don't
> > > generate hotplug interrupts" command.  I think we should work on this
> > > idea a little more.  If we're shutting down the whole system, I can't
> > > believe there's much value in *anything* we do in the pciehp_remove()
> > > path.
> > >
> > > Maybe we should just get rid of pciehp_remove() (and probably
> > > pcie_port_remove_service() and the other service driver remove methods)
> > > completely.  That dates from when the service drivers could be modules that
> > > could be potentially unloaded, but unloading them hasn't been possible for
> > > years.
> >
> > Shutdown path is also used for kexec. Leaving hotplug interrupts
> > pending is dangerous for the newly loaded kernel as it leaves
> > spurious interrupts during the new kernel boot.
> >
> > I think we should always disable the hotplug interrupt on shutdown.
> > We might think of not waiting for command-completion as a
> > middle-ground or go to polling path instead of interrupts all the
> > time.
>
> Ah, I forgot about the kexec path.  The kexec path is used for
> crashdump, too, so ideally the newly-loaded kernel would defend itself
> when possible so it doesn't depend on the original kernel doing things
> correctly.

That is true for kdump, but kexec needs device shutdown.

>
> Seems like this question of whether to do things in the original
> kernel or the kexec-ed kernel comes up periodically, but I can never
> remember a definitive answer.  My initial reaction is that it'd be
> nice if we didn't have to do *any* shutdown in the original kernel,
> but I'm sure there are reasons that's not practical.

Devices sometimes assume they are in a good state, as initialized
during the firmware boot phase, so we need a shutdown in the 1st kernel
so that the kexec kernel can boot correctly on those devices.

For kdump, the kernel has already panicked and is not reliable, so we
do as little as we can in the 1st kernel's crash path.  Instead, there
is some special handling for kdump in various drivers to reset the
devices in the 2nd kernel, e.g. when they see the "reset_devices"
kernel parameter.

>
> I copied Eric (kexec maintainer) and Vivek (contact listed in
> Documentation/kdump/kdump.txt) in case they have suggestions or would
> consider some sort of Documentation/ update.
>
> Bjorn
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Thanks
Dave
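
To illustrate the "reset_devices" point above: a minimal userspace sketch,
not kernel code, of checking whether the running (2nd) kernel was booted
with that parameter.  The helper names has_kernel_param() and
booted_with_reset_devices() are made up for this example, and the parsing
is a simplification that assumes whitespace-separated tokens with no
quoted values.  Drivers consult the flag inside the kernel; this only
shows what the parameter looks like from /proc/cmdline.

```python
def has_kernel_param(cmdline: str, name: str) -> bool:
    """Return True if `name` appears as a bare flag or as `name=value`.

    Simplified: splits on whitespace and does not handle quoted values.
    """
    for token in cmdline.split():
        key = token.split("=", 1)[0]
        if key == name:
            return True
    return False


def booted_with_reset_devices(path: str = "/proc/cmdline") -> bool:
    """Check the running kernel's command line for reset_devices (Linux only)."""
    with open(path, encoding="ascii", errors="replace") as f:
        return has_kernel_param(f.read(), "reset_devices")
```

A kdump setup would typically pass the parameter on the capture kernel's
command line, so booted_with_reset_devices() would be expected to return
True there and False on a normal boot.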