Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757685AbXFLUpw (ORCPT ); Tue, 12 Jun 2007 16:45:52 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1756918AbXFLUpn (ORCPT ); Tue, 12 Jun 2007 16:45:43 -0400 Received: from ogre.sisk.pl ([217.79.144.158]:44327 "EHLO ogre.sisk.pl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756914AbXFLUpm (ORCPT ); Tue, 12 Jun 2007 16:45:42 -0400 From: "Rafael J. Wysocki" To: "Eric W. Biederman" Subject: Re: [PATCH] x86: Document the hotplug code is incompatible with x86 irq handling Date: Tue, 12 Jun 2007 22:52:09 +0200 User-Agent: KMail/1.9.5 Cc: Pavel Machek , Andrew Morton , Andi Kleen , linux-kernel@vger.kernel.org, Neil Brown , Ingo Molnar , "Siddha, Suresh B" References: <20070607140102.GA9094@ucw.cz> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200706122252.10886.rjw@sisk.pl> Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4688 Lines: 130 On Tuesday, 12 June 2007 20:19, Eric W. Biederman wrote: > Pavel Machek writes: > > > Hi! > > > >> I just realized that except for doing the code review and noticing > >> that the current cpu hotplug code is fundamentally incompatible > >> with x86 I haven't done anything about it. So here is my patch > >> to document what is wrong. > >> > >> The current cpu hotplug code requires irqs to be migrated from a cpu > >> outside of irq context. On x86 ioapics simply do not support this, > >> making the code unfixable without major redesign of the generic cpu > >> hotplug code. > >> > >> So this patch makes CPU_HOTPLUG on x86 depend on CONFIG_BROKEN > >> and adds a WARN_ON so people that do enable it are not in doubt about > >> which part of the code is broken, even if it does work for them. > > > > > >> --- a/arch/i386/kernel/irq.c > >> +++ b/arch/i386/kernel/irq.c > >> @@ -312,6 +312,19 @@ void fixup_irqs(cpumask_t map) > >> unsigned int irq; > >> static int warned; > >> > >> + /* > >> + * Function is so wrong at so many levels. > >> + * - We migrate irqs that are directed at the cpu we are > >> + * removing. > > > > Is this about irq pinning? > > Sorry. That should have been: We migrate irqs that are not directed at the > cpu we are removing. (We are migrating irqs when it is unnecessary). > > >> + * - We cannot safely migrate ioapic irqs on x86 except in > >> + * side of irq context. > > > > 'inside'? > > > > Can you be more specific for this one? > > Yes inside. > > An irq migration currently requires two instances of the irq firing to > complete. Once on the source cpu once on the target cpu. > > Migrating irqs while the irq is alive is a royal pain. > > >> + * Since someone probably finds this useful just warn very > >> + * loudly until cpu hotplug is redesigned. > >> + */ > >> + WARN_ON(1); > > > > Ugh, no, this does not warn anyone. This will just make people ask me > > why they see stack trace while suspending... and we are not interested > > in the stack trace, anyway. > > > > printk(KERN_WARNING)? > > > Because you are calling unfixably broken code. That should be a decent > incentive to do something else won't it? Can you please tell me _what_ else can be done? > IOAPICs do not support what the code is doing here. There is lots of > practical evidence including bad experiences and practical tests that > support this. Well, AFAICS, Suresh has tried to debug one failing case recently without any consistent conclusions. I don't know of any other test cases (links, please?). > I suspect the only reason you don't have problems is that the irqs are > already shut down at the source before we get to this code path. You are probably right, but OTOH I've tried the CPU hotplug via the sysfs interface for _many_ times and it has _never_ failed for me. > >> index 5ce9443..a61c4f2 100644 > >> --- a/arch/x86_64/Kconfig > >> +++ b/arch/x86_64/Kconfig > >> @@ -429,7 +429,7 @@ config NR_CPUS > >> > >> config HOTPLUG_CPU > >> bool "Support for suspend on SMP and hot-pluggable CPUs (EXPERIMENTAL)" > >> - depends on SMP && HOTPLUG && EXPERIMENTAL > >> + depends on SMP && HOTPLUG && EXPERIMENTAL && BROKEN > >> help > > > > Great, this will force everyone and their dog to enable broken, making > > broken useless. Please don't. > > CONFIG_BROKEN is quite likely excessive but the code is totally and > unfixably broken. So it still doesn't feel wrong to me. > > Perhaps we should just disable swap suspend on SMP until we get a > design that can be implemented correctly on existing hardware. All kinds of suspend, not only hibernation. > I am not happy with people telling me that we must keep broken code because > people with brand new SMP laptops will scream otherwise. Sorry, but that's how it goes. We've been using that code for more then a year now and I'd expect someone to tell us that it's wrong a bit earlier. If you are telling us to drop it now, then _please_ advise what we can use instead, because we obviously need the functionality. > Since the fundamental problems in this code path don't appear to bite people > very frequently I don't mind waiting while an alternative solution is > debugged and tested, but there is no way it makes sense to keep this > code in service more for more than a kernel release or two. Greetings, Rafael -- "Premature optimization is the root of all evil." - Donald Knuth - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/