Date: Tue, 01 May 2018 17:31:43 +0100
Message-ID: <861sevcp74.wl-marc.zyngier@arm.com>
From: Marc Zyngier
To: Bjorn Helgaas
Cc: Sinan Kaya, Paul Menzel, Dave Young, Lukas Wunner, Eric Biederman, Bjorn Helgaas, Vivek Goyal
Subject: Re: pciehp 0000:00:1c.0:pcie004: Timeout on hotplug command 0x1038 (issued 65284 msec ago)
In-Reply-To: <20180501132554.GA11698@bhelgaas-glaptop.roam.corp.google.com>
References: <20180427211255.GI8199@bhelgaas-glaptop.roam.corp.google.com>
 <20180428005620.GB1675@dhcp-128-65.nay.redhat.com>
 <20180428011845.GC1675@dhcp-128-65.nay.redhat.com>
 <3ebc908fb196168bf0373875ffc5679e@codeaurora.org>
 <20180430211740.GG95643@bhelgaas-glaptop.roam.corp.google.com>
 <7285da70-2c3e-c3b7-62e1-fdbb55a77729@codeaurora.org>
 <3549ffe8-7605-d72c-5c09-1436a4288c7d@codeaurora.org>
 <20180501132554.GA11698@bhelgaas-glaptop.roam.corp.google.com>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (Gojō) APEL/10.8 EasyPG/1.0.0 Emacs/25.1 (aarch64-unknown-linux-gnu) MULE/6.0 (HANACHIRUSATO)
Organization: ARM Ltd
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 01 May 2018 14:25:54 +0100,
Bjorn Helgaas wrote:

Hi Bjorn,

> On Tue, May 01, 2018 at 01:59:20PM +0100, Marc Zyngier wrote:
> > On 01/05/18 13:38, Sinan Kaya wrote:
> > > +Marc,
> > >
> > > On 4/30/2018 5:27 PM, Sinan Kaya wrote:
> > >> On 4/30/2018 5:17 PM, Bjorn Helgaas wrote:
> > >>>> What should we do about this?
> > >>>>
> > >>>> Since there is an actual HW errata involved, should we quirk this
> > >>>> root port and not wait as if remove/shutdown doesn't exist?
> > >>> I was hoping to avoid a quirk because AFAIK all Intel parts have this
> > >>> issue so it will be an ongoing maintenance issue. I tried to avoid
> > >>> the timeout delays, e.g., with 40b960831cfa ("PCI: pciehp: Compute
> > >>> timeout from hotplug command start time").
> > >>>
> > >>> But we still see the alarming messages, so we should probably add a
> > >>> quirk to get rid of those.
> > >>>
> > >>> But I haven't given up on the idea of getting rid of the
> > >>> pciehp_remove() path. I'm not convinced yet that we actually need to
> > >>> do anything to shut this device down. I don't like the assumption
> > >>> that kexec requires this. The kexec is fundamentally just a branch,
> > >>> and anything we do before the branch (i.e., in the old kernel), we
> > >>> should also be able to do after the branch (i.e., in the kexec-ed
> > >>> kernel).
> > >>>
> > >>
> > >> In my experience with kexec, MSI type edge interrupts are harmless.
> > >> You might just see a few unhandled interrupt messages during boot
> > >> if something is pending from the first kernel.
> >
> > Unfortunately, that's not always the case.
> >
> > A number of GICv3/v4 implementations (a very common interrupt controller
> > on ARM servers) cannot be disabled, which means they will keep writing
> > to their pending tables long after kexec will have started the new
> > kernel. And since we don't track memory allocation across kexec, you
> > end-up with significant chances of observing single bit corruption as
> > interrupts carry on being delivered. Oh, and you won't actually be able
> > to take MSIs because you can't even reprogram the damn thing.
> >
> > Yes, this can be considered a HW bug.
> >
> > >> It is the level interrupts that are more concerning. It remains pending
> > >> until the interrupt source is cleared. CPU never returns from the
> > >> interrupt handler to actually continue booting the second kernel.
> > >
> > > This makes me wonder why kexec doesn't disable all interrupt sources by
> > > itself instead of relying on the drivers shutdown routine. Some drivers
> > > don't even have a shutdown callback. Kexec could have done both as another
> > > example. Something like.
> > >
> > > 1. Call shutdown for all drivers if available.
> > > 2. Disable all interrupt sources in the interrupt controller
> > > 3. Start the new kernel.
> >
> > See above. Although you can shut off the end-point and to some extent
> > mask interrupts before jumping into the payload, it is not always
> > possible to go back to a reasonable state where you can take actually MSIs.
>
> This is exactly the sort of thing it would be nice to collect and
> document as part of the background of "why kexec works the way it
> does." It certainly helps explain things that are far from obvious if
> you don't have the background.

I'd certainly be happy to help with it if someone was willing to
kickstart such a document. kexec/kdump is a huge bag of "interesting"
tricks, and it has driven me mad over the past couple of months (I'm
typing this from a laptop that uses kexec as its bootloader, and it is
*not fun*).

	M.

-- 
Jazz is not dead, it just smell funny.