2011-06-15 08:12:18

by Petr Tesařík

Subject: bug: kernel 3.0-rc3 not relocatable on i386?

Hi all,

it seems that the 3.0-rc3 kernel is not relocatable on i386. I get
warnings about jiffies being an absolute symbol, and indeed, when GRUB
loads the kernel at a non-default address, jiffies is not relocated.

In my example the kernel is configured with
CONFIG_PHYSICAL_START=0x1000000
CONFIG_PHYSICAL_ALIGN=0x200000
CONFIG_RELOCATABLE=y
and loaded at 0x200000 by GRUB.

Booting fails when checking whether the timer works, because do_timer()
increments jiffies_64, but timer_irq_works() checks jiffies. The code
looks like this:

c13daab7: 8b 3d 40 7a 39 c1 mov 0xc1397a40,%edi

but arch/x86/boot/compressed/vmlinux.relocs does not contain c13daaba.
Consequently, timer_irq_works() reads the wrong memory location and
fails, causing a panic:

kernel panic: IO-APIC + timer doesn't work! Boot with apic=debug and
send a report. Then try booting with the 'noapic' option.

Needless to say, the kernel freezes a few initcalls later when booted
with noapic, because IO-APIC worked fine, in fact. I verified that by
inserting a debugging printk() in do_timer(), and I also verified with
that printk() that the address of jiffies_64 and the address of jiffies
differ at run time.

Any idea how to fix this?

Petr Tesarik
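
The arithmetic behind the report can be sketched as follows. This is an
illustration only: it assumes the kernel runs exactly where GRUB placed it
(0x200000 is already aligned to CONFIG_PHYSICAL_ALIGN) and that relocation
is a uniform shift by the difference between the load address and
CONFIG_PHYSICAL_START. The helper name relocate() is hypothetical and not
part of the kernel.

```python
# Values taken from the report above.
CONFIG_PHYSICAL_START = 0x1000000  # address the kernel was linked for
LOAD_ADDRESS = 0x200000            # where GRUB actually placed it
LINKED_JIFFIES = 0xC1397A40        # operand of the mov in timer_irq_works()


def relocate(linked_addr: int, load_addr: int, link_addr: int) -> int:
    """Shift a 32-bit link-time address by (load_addr - link_addr).

    Assumption: every relocated reference moves by the same delta.
    """
    delta = load_addr - link_addr
    return (linked_addr + delta) & 0xFFFFFFFF


if __name__ == "__main__":
    fixed = relocate(LINKED_JIFFIES, LOAD_ADDRESS, CONFIG_PHYSICAL_START)
    # Address the mov would have to use once the relocation is applied.
    print(hex(fixed))  # 0xc0597a40
```

Under these assumptions an unrelocated operand keeps pointing 0xe00000
bytes above the location that do_timer() actually updates, which is
consistent with the two addresses differing at run time.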


2011-06-15 09:21:24

by Maarten Lankhorst

Subject: Re: bug: kernel 3.0-rc3 not relocatable on i386?

Hi Petr,

2011/6/15 Petr Tesarik <[email protected]>:
> Hi all,
>
> it seems that the 3.0-rc3 kernel is not relocatable on i386. I get
> warnings about jiffies being an absolute symbol, and indeed, when GRUB
> loads the kernel at a non-default address, jiffies is not relocated.
>
> In my example the kernel is configured with
> CONFIG_PHYSICAL_START=0x1000000
> CONFIG_PHYSICAL_ALIGN=0x200000
> CONFIG_RELOCATABLE=y
> and loaded at 0x200000 by GRUB.
>
> Booting fails when checking whether the timer works, because do_timer()
> increments jiffies_64, but timer_irq_works() checks jiffies. The code
> looks like this:
>
> c13daab7:       8b 3d 40 7a 39 c1       mov    0xc1397a40,%edi
>
> but arch/x86/boot/compressed/vmlinux.relocs does not contain c13daaba.
> Consequently, timer_irq_works() reads the wrong memory location and
> fails, causing a panic:
>
> kernel panic: IO-APIC + timer doesn't work! Boot with apic=debug and
> send a report.  Then try booting with the 'noapic' option.
>
> Needless to say, the kernel freezes a few initcalls later when booted
> with noapic, because IO-APIC worked fine, in fact. I verified that by
> inserting a debugging printk() in do_timer(), and I also verified with
> that printk() that the address of jiffies_64 and the address of jiffies
> differ at run time.
>
> Any idea how to fix this?
Does reverting this commit fix it?

commit 8c49d9a74bac5ea3f18480307057241b808fcc0c
Author: Andy Lutomirski <[email protected]>
Date: Mon May 23 09:31:24 2011 -0400

x86-64: Clean up vdso/kernel shared variables

~Maarten

2011-06-15 09:36:48

by Maarten Lankhorst

Subject: Re: bug: kernel 3.0-rc3 not relocatable on i386?

Hi Petr,

Op 15-06-11 10:12, Petr Tesarik schreef:
> Hi all,
>
> it seems that the 3.0-rc3 kernel is not relocatable on i386. I get
> warnings about jiffies being an absolute symbol, and indeed, when GRUB
> loads the kernel at a non-default address, jiffies is not relocated.
>
> In my example the kernel is configured with
> CONFIG_PHYSICAL_START=0x1000000
> CONFIG_PHYSICAL_ALIGN=0x200000
> CONFIG_RELOCATABLE=y
> and loaded at 0x200000 by GRUB.
>
> Booting fails when checking whether the timer works, because do_timer()
> increments jiffies_64, but timer_irq_works() checks jiffies. The code
> looks like this:
>
> c13daab7: 8b 3d 40 7a 39 c1 mov 0xc1397a40,%edi
>
> but arch/x86/boot/compressed/vmlinux.relocs does not contain c13daaba.
> Consequently, timer_irq_works() reads the wrong memory location and
> fails, causing a panic:
>
> kernel panic: IO-APIC + timer doesn't work! Boot with apic=debug and
> send a report. Then try booting with the 'noapic' option.
>
> Needless to say, the kernel freezes a few initcalls later when booted
> with noapic, because IO-APIC worked fine, in fact. I verified that by
> inserting a debugging printk() in do_timer(), and I also verified with
> that printk() that the address of jiffies_64 and the address of jiffies
> differ at run time.

Can you try this patch?

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 89aed99..49e666e 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -34,12 +34,11 @@ OUTPUT_FORMAT(CONFIG_OUTPUT_FORMAT, CONFIG_OUTPUT_FORMAT, CONFIG_OUTPUT_FORMAT)
 #ifdef CONFIG_X86_32
 OUTPUT_ARCH(i386)
 ENTRY(phys_startup_32)
-jiffies = jiffies_64;
 #else
 OUTPUT_ARCH(i386:x86-64)
 ENTRY(phys_startup_64)
-jiffies_64 = jiffies;
 #endif
+jiffies_64 = jiffies;

 #if defined(CONFIG_X86_64) && defined(CONFIG_DEBUG_RODATA)
 /*

2011-06-15 09:58:07

by Petr Tesařík

Subject: Re: bug: kernel 3.0-rc3 not relocatable on i386?

Maarten Lankhorst píše v St 15. 06. 2011 v 11:36 +0200:
> Hi Petr,
>
> Op 15-06-11 10:12, Petr Tesarik schreef:
> > Hi all,
> >
> > it seems that the 3.0-rc3 kernel is not relocatable on i386. I get
> > warnings about jiffies being an absolute symbol, and indeed, when GRUB
> > loads the kernel at a non-default address, jiffies is not relocated.
> >
> > In my example the kernel is configured with
> > CONFIG_PHYSICAL_START=0x1000000
> > CONFIG_PHYSICAL_ALIGN=0x200000
> > CONFIG_RELOCATABLE=y
> > and loaded at 0x200000 by GRUB.
> >
> > Booting fails when checking whether the timer works, because do_timer()
> > increments jiffies_64, but timer_irq_works() checks jiffies. The code
> > looks like this:
> >
> > c13daab7: 8b 3d 40 7a 39 c1 mov 0xc1397a40,%edi
> >
> > but arch/x86/boot/compressed/vmlinux.relocs does not contain c13daaba.
> > Consequently, timer_irq_works() reads the wrong memory location and
> > fails, causing a panic:
> >
> > kernel panic: IO-APIC + timer doesn't work! Boot with apic=debug and
> > send a report. Then try booting with the 'noapic' option.
> >
> > Needless to say, the kernel freezes a few initcalls later when booted
> > with noapic, because IO-APIC worked fine, in fact. I verified that by
> > inserting a debugging printk() in do_timer(), and I also verified with
> > that printk() that the address of jiffies_64 and the address of jiffies
> > differ at run time.
>
> Can you try this patch?

Hi Maarten,

thanks for the quick reply, but no, I won't even try. It can't work,
because jiffies is undefined in the final vmlinux link on 32-bit, so I
believe the link will simply fail because of undefined symbols. ;)

Petr

> diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
> index 89aed99..49e666e 100644
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -34,12 +34,11 @@ OUTPUT_FORMAT(CONFIG_OUTPUT_FORMAT, CONFIG_OUTPUT_FORMAT, CONFIG_OUTPUT_FORMAT)
> #ifdef CONFIG_X86_32
> OUTPUT_ARCH(i386)
> ENTRY(phys_startup_32)
> -jiffies = jiffies_64;
> #else
> OUTPUT_ARCH(i386:x86-64)
> ENTRY(phys_startup_64)
> -jiffies_64 = jiffies;
> #endif
> +jiffies_64 = jiffies;
>
> #if defined(CONFIG_X86_64) && defined(CONFIG_DEBUG_RODATA)
> /*
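
For readers unfamiliar with the alias in the quoted linker script: on
32-bit, `jiffies = jiffies_64;` places the 32-bit jiffies at the same
address as the 64-bit counter, so on little-endian i386 it reads the low
word of jiffies_64. The sketch below only models that layout with a byte
buffer; the names tick() and read_jiffies() are hypothetical stand-ins,
not kernel functions.

```python
import struct

# Stand-in for the 8 bytes of memory that hold jiffies_64.
memory = bytearray(8)


def tick(mem: bytearray) -> None:
    """Increment the 64-bit little-endian counter at offset 0."""
    (value,) = struct.unpack_from("<Q", mem, 0)
    struct.pack_into("<Q", mem, 0, (value + 1) & 0xFFFFFFFFFFFFFFFF)


def read_jiffies(mem: bytearray) -> int:
    """Read a 32-bit little-endian value from the same offset 0."""
    (low,) = struct.unpack_from("<I", mem, 0)
    return low


if __name__ == "__main__":
    for _ in range(3):
        tick(memory)
    print(read_jiffies(memory))  # 3
```

The two views agree only while they share an address. If one reference
is relocated and the other is not, the reader sees memory that the
writer never touches, which matches the symptom described in this
thread.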

2011-06-15 10:01:22

by Petr Tesařík

Subject: Re: bug: kernel 3.0-rc3 not relocatable on i386?

Maarten Lankhorst píše v St 15. 06. 2011 v 11:21 +0200:
> Hi Petr,
>
> 2011/6/15 Petr Tesarik <[email protected]>:
> > Hi all,
> >
> > it seems that the 3.0-rc3 kernel is not relocatable on i386. I get
> > warnings about jiffies being an absolute symbol, and indeed, when GRUB
> > loads the kernel at a non-default address, jiffies is not relocated.
> >
> > In my example the kernel is configured with
> > CONFIG_PHYSICAL_START=0x1000000
> > CONFIG_PHYSICAL_ALIGN=0x200000
> > CONFIG_RELOCATABLE=y
> > and loaded at 0x200000 by GRUB.
> >
> > Booting fails when checking whether the timer works, because do_timer()
> > increments jiffies_64, but timer_irq_works() checks jiffies. The code
> > looks like this:
> >
> > c13daab7: 8b 3d 40 7a 39 c1 mov 0xc1397a40,%edi
> >
> > but arch/x86/boot/compressed/vmlinux.relocs does not contain c13daaba.
> > Consequently, timer_irq_works() reads the wrong memory location and
> > fails, causing a panic:
> >
> > kernel panic: IO-APIC + timer doesn't work! Boot with apic=debug and
> > send a report. Then try booting with the 'noapic' option.
> >
> > Needless to say, the kernel freezes a few initcalls later when booted
> > with noapic, because IO-APIC worked fine, in fact. I verified that by
> > inserting a debugging printk() in do_timer(), and I also verified with
> > that printk() that the address of jiffies_64 and the address of jiffies
> > differ at run time.
> >
> > Any idea how to fix this?
> Does reverting this commit fix it?

Isn't this related to VDSO? I've got no troubles with the VDSO. It's
just that the kernel assumes a fixed location of jiffies (in the kernel
direct mapping), so it cannot be relocated.

OTOH this must have worked the other day, so searching for the commit
that broke it is a good hint. I'll try it here, too.

Petr

> commit 8c49d9a74bac5ea3f18480307057241b808fcc0c
> Author: Andy Lutomirski <[email protected]>
> Date: Mon May 23 09:31:24 2011 -0400
>
> x86-64: Clean up vdso/kernel shared variables
>
> ~Maarten

2011-06-15 10:07:31

by Petr Tesařík

Subject: Re: bug: kernel 3.0-rc3 not relocatable on i386?

Petr Tesarik píše v St 15. 06. 2011 v 12:01 +0200:
> Maarten Lankhorst píše v St 15. 06. 2011 v 11:21 +0200:
> > Hi Petr,
> >
> > 2011/6/15 Petr Tesarik <[email protected]>:
> > > Hi all,
> > >
> > > it seems that the 3.0-rc3 kernel is not relocatable on i386. I get
> > > warnings about jiffies being an absolute symbol, and indeed, when GRUB
> > > loads the kernel at a non-default address, jiffies is not relocated.
> > >
> > > In my example the kernel is configured with
> > > CONFIG_PHYSICAL_START=0x1000000
> > > CONFIG_PHYSICAL_ALIGN=0x200000
> > > CONFIG_RELOCATABLE=y
> > > and loaded at 0x200000 by GRUB.
> > >
> > > Booting fails when checking whether the timer works, because do_timer()
> > > increments jiffies_64, but timer_irq_works() checks jiffies. The code
> > > looks like this:
> > >
> > > c13daab7: 8b 3d 40 7a 39 c1 mov 0xc1397a40,%edi
> > >
> > > but arch/x86/boot/compressed/vmlinux.relocs does not contain c13daaba.
> > > Consequently, timer_irq_works() reads the wrong memory location and
> > > fails, causing a panic:
> > >
> > > kernel panic: IO-APIC + timer doesn't work! Boot with apic=debug and
> > > send a report. Then try booting with the 'noapic' option.
> > >
> > > Needless to say, the kernel freezes a few initcalls later when booted
> > > with noapic, because IO-APIC worked fine, in fact. I verified that by
> > > inserting a debugging printk() in do_timer(), and I also verified with
> > > that printk() that the address of jiffies_64 and the address of jiffies
> > > differ at run time.
> > >
> > > Any idea how to fix this?
> > Does reverting this commit fix it?
>
> Isn't this related to VDSO? I've got no troubles with the VDSO. It's
> just that the kernel assumes a fixed location of jiffies (in the kernel
> direct mapping), so it cannot be relocated.
>
> OTOH this must have worked the other day, so searching for the commit
> that broke it is a good hint. I'll try it here, too.

Ah, it turns out this is in fact reported here:

http://sourceware.org/bugzilla/show_bug.cgi?id=12327

But the patch was reverted by commit
6b35eb9ddcddde7b510726de03fae071178f1ec4, so these binutils have been
broken again since January.

Yes, I've got binutils-2.21 here. :/

Petr

2011-06-15 10:28:09

by Maarten Lankhorst

Subject: Re: bug: kernel 3.0-rc3 not relocatable on i386?

Op 15-06-11 12:07, Petr Tesarik schreef:
> Ah, it turns out this is in fact reported here:
>
> http://sourceware.org/bugzilla/show_bug.cgi?id=12327
>
> But the patch was reverted by commit
> 6b35eb9ddcddde7b510726de03fae071178f1ec4, so these binutils have been
> broken again since January.
>
> Yes, I've got binutils-2.21 here. :/
Ah, my bad for the stray patch, though; that commit looked like the only
thing even slightly related to recent breakage. I guess upgrading binutils
triggered it. :)

~Maarten

2011-06-15 11:17:23

by Andrew Lutomirski

Subject: Re: bug: kernel 3.0-rc3 not relocatable on i386?

On Wed, Jun 15, 2011 at 4:12 AM, Petr Tesarik <[email protected]> wrote:
> Hi all,
>
> it seems that the 3.0-rc3 kernel is not relocatable on i386. I get
> warnings about jiffies being an absolute symbol, and indeed, when GRUB
> loads the kernel at a non-default address, jiffies is not relocated.
>
> In my example the kernel is configured with
> CONFIG_PHYSICAL_START=0x1000000
> CONFIG_PHYSICAL_ALIGN=0x200000
> CONFIG_RELOCATABLE=y
> and loaded at 0x200000 by GRUB.
>
> Booting fails when checking whether the timer works, because do_timer()
> increments jiffies_64, but timer_irq_works() checks jiffies. The code
> looks like this:
>
> c13daab7:       8b 3d 40 7a 39 c1       mov    0xc1397a40,%edi
>
> but arch/x86/boot/compressed/vmlinux.relocs does not contain c13daaba.
> Consequently, timer_irq_works() reads the wrong memory location and
> fails, causing a panic:
>
> kernel panic: IO-APIC + timer doesn't work! Boot with apic=debug and
> send a report.  Then try booting with the 'noapic' option.
>
> Needless to say, the kernel freezes a few initcalls later when booted
> with noapic, because IO-APIC worked fine, in fact. I verified that by
> inserting a debugging printk() in do_timer(), and I also verified with
> that printk() that the address of jiffies_64 and the address of jiffies
> differ at run time.
>
> Any idea how to fix this?

This could be a regression in
8c49d9a74bac5ea3f18480307057241b808fcc0c, but I haven't spotted it
yet. I'm having trouble reproducing this, though: I see the
relocation in the output of relocs --text.

Can you send me your .config? I'll fiddle with it.

--Andy
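
One way to compare the two observations is to scan the text output of
the relocs tool for any entry that falls inside the instruction in
question. The sketch below is a rough illustration: it assumes the
output has been saved to a file and that each entry appears as a
0x-prefixed 32-bit hex value somewhere on a line. That format is an
assumption, and the helper names are hypothetical.

```python
import re

# Matches a 0x-prefixed hex value of at most 32 bits.
_HEX32 = re.compile(r"0x([0-9a-fA-F]{1,8})\b")


def reloc_addresses(lines):
    """Collect every 32-bit hex value found in the given text lines."""
    addrs = set()
    for line in lines:
        for match in _HEX32.finditer(line):
            addrs.add(int(match.group(1), 16))
    return addrs


def covers(addrs, insn_addr, insn_len):
    """True if any relocation lies within [insn_addr, insn_addr + insn_len)."""
    return any(insn_addr <= a < insn_addr + insn_len for a in addrs)


if __name__ == "__main__":
    import sys

    # Usage: python check_relocs.py <saved relocs text output>
    with open(sys.argv[1]) as fh:
        found = reloc_addresses(fh)
    # The mov at c13daab7 is 6 bytes long (8b 3d 40 7a 39 c1).
    print(covers(found, 0xC13DAAB7, 6))
```

Checking the whole 6-byte instruction rather than a single address
avoids depending on exactly which byte offset the tool records for the
operand.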

2011-06-15 12:26:31

by Petr Tesařík

Subject: Re: bug: kernel 3.0-rc3 not relocatable on i386?

Petr Tesarik píše v St 15. 06. 2011 v 12:07 +0200:
> Petr Tesarik píše v St 15. 06. 2011 v 12:01 +0200:
> > Maarten Lankhorst píše v St 15. 06. 2011 v 11:21 +0200:
> > > Hi Petr,
> > >
> > > 2011/6/15 Petr Tesarik <[email protected]>:
> > > > Hi all,
> > > >
> > > > it seems that the 3.0-rc3 kernel is not relocatable on i386. I get
> > > > warnings about jiffies being an absolute symbol, and indeed, when GRUB
> > > > loads the kernel at a non-default address, jiffies is not relocated.
> > > >
> > > > In my example the kernel is configured with
> > > > CONFIG_PHYSICAL_START=0x1000000
> > > > CONFIG_PHYSICAL_ALIGN=0x200000
> > > > CONFIG_RELOCATABLE=y
> > > > and loaded at 0x200000 by GRUB.
> > > >
> > > > Booting fails when checking whether the timer works, because do_timer()
> > > > increments jiffies_64, but timer_irq_works() checks jiffies. The code
> > > > looks like this:
> > > >
> > > > c13daab7: 8b 3d 40 7a 39 c1 mov 0xc1397a40,%edi
> > > >
> > > > but arch/x86/boot/compressed/vmlinux.relocs does not contain c13daaba.
> > > > Consequently, timer_irq_works() reads the wrong memory location and
> > > > fails, causing a panic:
> > > >
> > > > kernel panic: IO-APIC + timer doesn't work! Boot with apic=debug and
> > > > send a report. Then try booting with the 'noapic' option.
> > > >
> > > > Needless to say, the kernel freezes a few initcalls later when booted
> > > > with noapic, because IO-APIC worked fine, in fact. I verified that by
> > > > inserting a debugging printk() in do_timer(), and I also verified with
> > > > that printk() that the address of jiffies_64 and the address of jiffies
> > > > differ at run time.
> > > >
> > > > Any idea how to fix this?
> > > Does reverting this commit fix it?
> >
> > Isn't this related to VDSO? I've got no troubles with the VDSO. It's
> > just that the kernel assumes a fixed location of jiffies (in the kernel
> > direct mapping), so it cannot be relocated.
> >
> > OTOH this must have worked the other day, so searching for the commit
> > that broke it is a good hint. I'll try it here, too.
>
> Ah, it turns out this is in fact reported here:
>
> http://sourceware.org/bugzilla/show_bug.cgi?id=12327
>
> But the patch was reverted by commit
> 6b35eb9ddcddde7b510726de03fae071178f1ec4, so these binutils have been
> broken again since January.

I tried to understand what was wrong with the approach taken by Shaohua
and what caused a problem for Markus Trippelsdorf, but it seems the
revert didn't help in any way: I don't get build failures. I get a
non-booting kernel, and the traces didn't give me any clue what was
wrong.

Anyway, what's wrong with the generic approach suggested by hpa?

http://linux.derkeiler.com/Mailing-Lists/Kernel/2011-01/msg06976.html

Petr