Subject: [patch] x86_64: fix boot hang caused by CALGARY_IOMMU_ENABLED_BY_DEFAULT
From: Ingo Molnar <[email protected]>
one of my boxes didnt boot the 2.6.20-rc1-rt0 kernel rpm, it hung during
early bootup. After an hour or two of happy debugging i narrowed it down
to the CALGARY_IOMMU_ENABLED_BY_DEFAULT option, which was freshly added
to 2.6.20 via the x86_64 tree and /enabled by default/.
commit bff6547bb6a4e82c399d74e7fba78b12d2f162ed claims:
[PATCH] Calgary: allow compiling Calgary in but not using it by default
This patch makes it possible to compile Calgary in but not use it by
default. In this mode, use 'iommu=calgary' to activate it.
but the change does not actually practice it:
config CALGARY_IOMMU_ENABLED_BY_DEFAULT
bool "Should Calgary be enabled by default?"
default y
depends on CALGARY_IOMMU
help
Should Calgary be enabled by default? if you choose 'y', Calgary
will be used (if it exists). If you choose 'n', Calgary will not be
used even if it exists. If you choose 'n' and would like to use
Calgary anyway, pass 'iommu=calgary' on the kernel command line.
If unsure, say Y.
it's both 'default y', and says "If unsure, say Y". Clearly not a typo.
disabling this option makes my box boot again. The patch below fixes the
Kconfig entry. Grumble.
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86_64/Kconfig | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
Index: linux/arch/x86_64/Kconfig
===================================================================
--- linux.orig/arch/x86_64/Kconfig
+++ linux/arch/x86_64/Kconfig
@@ -495,14 +495,13 @@ config CALGARY_IOMMU
config CALGARY_IOMMU_ENABLED_BY_DEFAULT
bool "Should Calgary be enabled by default?"
- default y
depends on CALGARY_IOMMU
help
- Should Calgary be enabled by default? if you choose 'y', Calgary
+ Should Calgary be enabled by default? If you choose 'y', Calgary
will be used (if it exists). If you choose 'n', Calgary will not be
used even if it exists. If you choose 'n' and would like to use
Calgary anyway, pass 'iommu=calgary' on the kernel command line.
- If unsure, say Y.
+ If unsure, say N.
# need this always selected by IOMMU for the VIA workaround
config SWIOTLB
On Wed, Dec 20, 2006 at 11:28:46AM +0100, Ingo Molnar wrote:
> config CALGARY_IOMMU_ENABLED_BY_DEFAULT
> bool "Should Calgary be enabled by default?"
> default y
> depends on CALGARY_IOMMU
> help
> Should Calgary be enabled by default? if you choose 'y', Calgary
> will be used (if it exists). If you choose 'n', Calgary will not be
> used even if it exists. If you choose 'n' and would like to use
> Calgary anyway, pass 'iommu=calgary' on the kernel command line.
> If unsure, say Y.
>
> it's both 'default y', and says "If unsure, say Y". Clearly not a
> typo.
I think that it makes sense to have it default y for the mainline
kernel and default n for the distro kernels, which is why I added the
option to make it possible to compile Calgary in but only enable it if
you want to use it. Previously if you compiled it in it would be used,
period. You may disagree, but fundamentally I think the mainline
kernel should be fairly experimental, which means enabling new code by
default.
As to what actually happened, I'm betting your machine has both
Calgary and CalIOC2, the PCI-e version of Calgary, which is not yet
supported by pci-calgary.c. I have a patch for CalIOC2 which is work
in progress and not ready for inclusion yet. Can you please send lspci
-v to confirm?
Cheers,
Muli
* Muli Ben-Yehuda <[email protected]> wrote:
> On Wed, Dec 20, 2006 at 11:28:46AM +0100, Ingo Molnar wrote:
>
> > config CALGARY_IOMMU_ENABLED_BY_DEFAULT
> > bool "Should Calgary be enabled by default?"
> > default y
> > depends on CALGARY_IOMMU
> > help
> > Should Calgary be enabled by default? if you choose 'y', Calgary
> > will be used (if it exists). If you choose 'n', Calgary will not be
> > used even if it exists. If you choose 'n' and would like to use
> > Calgary anyway, pass 'iommu=calgary' on the kernel command line.
> > If unsure, say Y.
> >
> > it's both 'default y', and says "If unsure, say Y". Clearly not a
> > typo.
>
> I think that it makes sense to have it default y for the mainline
> kernel and default n for the distro kernels, which is why I added the
> option to make it possible to compile Calgary in but only enable it if
> you want to use it. Previously if you compiled it in it would be used,
> period. You may disagree, but fundamentally I think the mainline
> kernel should be fairly experimental, which means enabling new code by
> default.
that's a totally wrong attitude - the mainline kernel is /not/
experimental. A distro might or might not enable the new option, but we
just dont enable experimental platform support code via "default y"...
The other problem is that the changelog entry says that it's off by
default, while in reality the new option switched this code on for my
box, and broke it.
> As to what actually happened, I'm betting your machine has both
> Calgary and CalIOC2, the PCI-e version of Calgary, which is not yet
> supported by pci-calgary.c. [...]
no, what happened is what i described in my second patch. That 'new
code' which was default-enabled had a bug which locked up my box.
Ingo
On Wed, Dec 20, 2006 at 05:23:38PM +0100, Ingo Molnar wrote:
> > I think that it makes sense to have it default y for the mainline
> > kernel and default n for the distro kernels, which is why I added the
> > option to make it possible to compile Calgary in but only enable it if
> > you want to use it. Previously if you compiled it in it would be used,
> > period. You may disagree, but fundamentally I think the mainline
> > kernel should be fairly experimental, which means enabling new code by
> > default.
>
> that's a totally wrong attitude - the mainline kernel is /not/
> experimental. A distro might or might not enable the new option, but
> we just dont enable experimental platform support code via "default
> y"...
I disagree, it seems to me most "experimental platform support code"
is simply enabled because it doesn't even have a CONFIG option (c.f.,
recent genirq and IO-APIC breakage on x86-64). With regards to this
specific option, you might even say that not defaulting to 'y' here
would be a regression in behaviour against previous released kernels,
which used Calgary if it was compiled in, no questions asked. So at
least in that sense, instructing the user to select y if unsure and
default y are appropriate.
> The other problem is that the changelog entry says that it's off by
> default, while in reality the new option switched this code on for
> my box, and broke it.
Sorry about that (both the wrong changelog entry and the fact that it
broke your box).
> > As to what actually happened, I'm betting your machine has both
> > Calgary and CalIOC2, the PCI-e version of Calgary, which is not yet
> > supported by pci-calgary.c. [...]
>
> no, what happened is what i described in my second patch. That 'new
> code' which was default-enabled had a bug which locked up my box.
Yes, I realized that once I've read your other mail.
Cheers,
Muli
* Muli Ben-Yehuda <[email protected]> wrote:
> > > it would be used, period. You may disagree, but fundamentally I
> > > think the mainline kernel should be fairly experimental, which
> > > means enabling new code by default.
> >
> > that's a totally wrong attitude - the mainline kernel is /not/
> > experimental. A distro might or might not enable the new option, but
> > we just dont enable experimental platform support code via "default
> > y"...
>
> I disagree, it seems to me most "experimental platform support code"
> is simply enabled because it doesn't even have a CONFIG option (c.f.,
> recent genirq and IO-APIC breakage on x86-64). With regards to this
> specific option, you might even say that not defaulting to 'y' here
> would be a regression in behaviour against previous released kernels,
> which used Calgary if it was compiled in, no questions asked. So at
> least in that sense, instructing the user to select y if unsure and
> default y are appropriate.
i still disagree. We should be /a lot/ more careful in adding patches
that change platform behavior.
i happen to run a distro kernel, and in fact i'm maintaining the -rt yum
repository which offers a distro kernel and which also tracks mainline
very closely.
i agree with you that Calgary support, if it was enabled (which it is in
my config), would run through the calgary_detect() function. The
breakage was not caused by the default-y patch, it was caused by the
other patch preceding it:
commit b34e90b8f0f30151349134f87b5dc6ef75a5218c
Date: Thu Dec 7 02:14:06 2006 +0100
[PATCH] Calgary: use BIOS supplied BBARs and topology information
I think in the future it would be better to annotate the introduction of
new, widely used codepaths via KERN_DEBUG printouts, something along the
lines of:
printk(KERN_DEBUG "calgary: running new EBDA code.\n");
...
printk(KERN_DEBUG "calgary: done.\n");
That way "debug ignore_loglevel console=tty" bootopions would have
uncovered the source of this hang. Just like i can use the
initcall_debug boot option to figure out where the bootup hangs.
but i still /strongly/ disagree with your attitude that mainline is
'experimental' and hence there's nothing to see here, move over.
> > The other problem is that the changelog entry says that it's off by
> > default, while in reality the new option switched this code on for
> > my box, and broke it.
>
> Sorry about that (both the wrong changelog entry and the fact that it
> broke your box).
there's really no need to apologize, i probably broke your box a few
orders of magnitude more times than you broke mine ;)
Nevertheless my point is that we /have/ to be more supportive of early
adopter distro kernels (Dave Jones says that the same bug has hit Fedora
rawhide too), and have to be doubly careful about anything that goes
into code that is run by /everyone/. Especially if it's in a hard to
debug place like early bootup code.
This incident i think shows that bisection and testing in -mm doesnt
always cut it (bisection-testing is quite laborous with a large distro
kernel), and i think we could do more technological measures in this
area to lower the bar of entry for users to help us narrow down such
bugs. Such as a KERN_DEBUG policy for annotating new code in the bootup
path.
Ingo
On Thu, Dec 21, 2006 at 11:37:02AM +0100, Ingo Molnar wrote:
> I think in the future it would be better to annotate the introduction of
> new, widely used codepaths via KERN_DEBUG printouts, something along the
> lines of:
>
> printk(KERN_DEBUG "calgary: running new EBDA code.\n");
> ...
> printk(KERN_DEBUG "calgary: done.\n");
>
> That way "debug ignore_loglevel console=tty" bootopions would have
> uncovered the source of this hang. Just like i can use the
> initcall_debug boot option to figure out where the bootup hangs.
>
> but i still /strongly/ disagree with your attitude that mainline is
> 'experimental' and hence there's nothing to see here, move over.
We can agree to disagree about how "experimental" mainline should
be. The bottom line is that the patch that broke your boot was the
"obviously correct" thing to do, and even if mainline was "rock
stable", whatever that means, we still would've put it in because it
fixed boot on a different set of machines... Basically we have to
query the BIOS for Calgary register window and bus setup informationn,
what we had before was a brittle hack (replicate what the BIOS did to
get the same values) that eventually broke when the BIOS - surprise,
surprise - did something differently. Of course that doesn't mean the
bug you ran into shouldn't have been caught earlier.
> > > The other problem is that the changelog entry says that it's off by
> > > default, while in reality the new option switched this code on for
> > > my box, and broke it.
> >
> > Sorry about that (both the wrong changelog entry and the fact that it
> > broke your box).
>
> there's really no need to apologize, i probably broke your box a few
> orders of magnitude more times than you broke mine ;)
You got that right :-) but still, I should've caught the potential
infinite loop with bad BIOS inputs when reviewing the patch. For that
I do apologize.
> Nevertheless my point is that we /have/ to be more supportive of
> early adopter distro kernels (Dave Jones says that the same bug has
> hit Fedora rawhide too), and have to be doubly careful about
> anything that goes into code that is run by /everyone/. Especially
> if it's in a hard to debug place like early bootup code.
No argument about this.
> This incident i think shows that bisection and testing in -mm doesnt
> always cut it (bisection-testing is quite laborous with a large
> distro kernel), and i think we could do more technological measures
> in this area to lower the bar of entry for users to help us narrow
> down such bugs. Such as a KERN_DEBUG policy for annotating new code
> in the bootup path.
Agreed.
Cheers,
Muli
* Muli Ben-Yehuda <[email protected]> wrote:
> > but i still /strongly/ disagree with your attitude that mainline is
> > 'experimental' and hence there's nothing to see here, move over.
>
> We can agree to disagree about how "experimental" mainline should be.
> [...]
there's not much to disagree about. Mainline early bootup _must not
break_, and if it breaks, it must be easy for the tester to figure it
out. Simple as that. If it ever breaks and the user cannot give us other
feedback but: "my laptop hung", we screwed up the process!
once the system has booted up into a minimal state, up to the stage
where say netconsole works, we've got an exponentially increasing number
of measures to find /all the other bugs/. But early bootup is like
sacred. It's not experimental at all. Really. Having a system that
doesnt even boot and gives no feedback at all is an absolute showstopper
and a lost tester to us.
if we need draconian measures such as having two versions of early
bootup code present in the kernel and a runtime boot switch to trigger
between the old-trusted and the new-unknown one [perhaps even
automatically, via the help of Grub] then so it be - but we cannot
tolerate hung bootups.
Ingo