2013-09-30 14:40:03

by Mel Gorman

Subject: [REGRESSION] jump label safety checks break automatic numa balancing

With CONFIG_NUMA_BALANCING=y and booting with numa_balancing=enable
there is a crash very early in the lifetime of the system. By setting
earlyprintk=ttyS0,115200 the error is visible and looks something like
this

[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.11.0-vanilla+ root=/dev/sda5 reboot=pci console=tty0 console=ttyS0,115200 numa_balancing=enable earlyprintk=ttyS0,115200
[ 0.000000] Unexpected op at task_numa_fault+0x1d/0xa0 [ffffffff81085ded] (0f 1f 44 00 00) arch/x86/kernel/jump_label.c:53
PANIC: early exception 06 rip 10:ffffffff815b2663 error 0 cr2 ffff88107ffff000
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.11.0-vanilla+ #23
[ 0.000000] Hardware name: Dell Inc. PowerEdge R810/0TT6JF, BIOS 2.7.4 04/26/2012
[ 0.000000] ffffffff81009220 ffffffff81a01e10 ffffffff815b7e7b 00000000000003f8
[ 0.000000] ffffffff81085ded ffffffff81a01ec8 ffffffff81ad4197 6a2f6c656e72656b
[ 0.000000] 2f3638782f686372 000000000000012b 6562616c5f706d75 ffffffff81c68444
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff81009220>] ? alternatives_text_reserved+0x80/0x80
[ 0.000000] [<ffffffff815b7e7b>] dump_stack+0x55/0x86
[ 0.000000] [<ffffffff81085ded>] ? task_numa_fault+0x1d/0xa0
[ 0.000000] [<ffffffff81ad4197>] early_idt_handler+0x77/0xa4
[ 0.000000] [<ffffffff815b2663>] ? bug_at+0x45/0x47
[ 0.000000] [<ffffffff815b2663>] ? bug_at+0x45/0x47
[ 0.000000] [<ffffffff81006db6>] __jump_label_transform.isra.0+0x136/0x150
[ 0.000000] [<ffffffff81006ea7>] arch_jump_label_transform_static+0x77/0xc0
[ 0.000000] [<ffffffff81af8596>] jump_label_init+0x81/0xaf
[ 0.000000] [<ffffffff81ad4c02>] start_kernel+0x161/0x3ce
[ 0.000000] [<ffffffff81ad48a0>] ? repair_env_string+0x5e/0x5e
[ 0.000000] [<ffffffff81ad45a5>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81ad469f>] x86_64_start_kernel+0xf8/0xfc
[ 0.000000] RIP 0x46

Bisection identified this as the problem commit.

9c85f3bdf400665eecf62658a9106501f6a77a13 is the first bad commit
commit 9c85f3bdf400665eecf62658a9106501f6a77a13
Author: Steven Rostedt <[email protected]>
Date: Thu Jan 26 18:38:07 2012 -0500

x86/jump-label: Add safety checks to jump label conversions

I did no further investigation yet in case this is already a known
problem.

--
Mel Gorman
SUSE Labs


2013-10-04 15:03:58

by Steven Rostedt

Subject: Re: [REGRESSION] jump label safety checks break automatic numa balancing


FYI, please remove my redhat email from your address book. I don't read
my RH email when I travel (which I've been doing a lot lately).


On Fri, 04 Oct 2013 10:44:00 -0400
Mel Gorman <[email protected]> wrote:

> With CONFIG_NUMA_BALANCING=y and booting with numa_balancing=enable
> there is a crash very early in the lifetime of the system. By setting
> earlyprintk=ttyS0,115200 the error is visible and looks something like
> this
>
> [ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.11.0-vanilla+ root=/dev/sda5 reboot=pci console=tty0 console=ttyS0,115200 numa_balancing=enable earlyprintk=ttyS0,115200
> [ 0.000000] Unexpected op at task_numa_fault+0x1d/0xa0 [ffffffff81085ded] (0f 1f 44 00 00) arch/x86/kernel/jump_label.c:53
> PANIC: early exception 06 rip 10:ffffffff815b2663 error 0 cr2 ffff88107ffff000

What's at ffffffff815b2663?

> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.11.0-vanilla+ #23
> [ 0.000000] Hardware name: Dell Inc. PowerEdge R810/0TT6JF, BIOS 2.7.4 04/26/2012
> [ 0.000000] ffffffff81009220 ffffffff81a01e10 ffffffff815b7e7b 00000000000003f8
> [ 0.000000] ffffffff81085ded ffffffff81a01ec8 ffffffff81ad4197 6a2f6c656e72656b
> [ 0.000000] 2f3638782f686372 000000000000012b 6562616c5f706d75 ffffffff81c68444
> [ 0.000000] Call Trace:
> [ 0.000000] [<ffffffff81009220>] ? alternatives_text_reserved+0x80/0x80
> [ 0.000000] [<ffffffff815b7e7b>] dump_stack+0x55/0x86
> [ 0.000000] [<ffffffff81085ded>] ? task_numa_fault+0x1d/0xa0
> [ 0.000000] [<ffffffff81ad4197>] early_idt_handler+0x77/0xa4
> [ 0.000000] [<ffffffff815b2663>] ? bug_at+0x45/0x47
> [ 0.000000] [<ffffffff815b2663>] ? bug_at+0x45/0x47
> [ 0.000000] [<ffffffff81006db6>] __jump_label_transform.isra.0+0x136/0x150
> [ 0.000000] [<ffffffff81006ea7>] arch_jump_label_transform_static+0x77/0xc0
> [ 0.000000] [<ffffffff81af8596>] jump_label_init+0x81/0xaf
> [ 0.000000] [<ffffffff81ad4c02>] start_kernel+0x161/0x3ce
> [ 0.000000] [<ffffffff81ad48a0>] ? repair_env_string+0x5e/0x5e
> [ 0.000000] [<ffffffff81ad45a5>] x86_64_start_reservations+0x2a/0x2c
> [ 0.000000] [<ffffffff81ad469f>] x86_64_start_kernel+0xf8/0xfc
> [ 0.000000] RIP 0x46
>
> Bisection identified this as the problem commit.
>
> 9c85f3bdf400665eecf62658a9106501f6a77a13 is the first bad commit
> commit 9c85f3bdf400665eecf62658a9106501f6a77a13
> Author: Steven Rostedt <[email protected]>
> Date: Thu Jan 26 18:38:07 2012 -0500
>
> x86/jump-label: Add safety checks to jump label conversions
>
> I did no further investigation yet in case this is already a known
> problem.
>

We had a similar bug like this with Xen. It turned out that jump
labels were being used before they were initialized, and that is a real
bug too: the jump labels do not get converted until initialization, so
why would something convert one before then?
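
For readers unfamiliar with the check that fires here, the idea can be modelled in user space. The following is only a minimal sketch, not the kernel's code: `checked_patch`, `site_nop`, `site_jmp` and `SITE_LEN` are hypothetical names invented for illustration, and the jump encoding uses an arbitrary offset. The only detail taken from the report is the byte pattern 0f 1f 44 00 00 printed in the "Unexpected op" line.

```c
#include <stdio.h>
#include <string.h>

#define SITE_LEN 5

/* The 5-byte NOP printed in the panic message above. */
static const unsigned char site_nop[SITE_LEN] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Illustrative "jmp rel32" encoding; the offset is arbitrary. */
static const unsigned char site_jmp[SITE_LEN] = { 0xe9, 0x10, 0x00, 0x00, 0x00 };

/*
 * Hypothetical stand-in for the patching step (not the kernel function):
 * rewrite the site only if it currently holds the bytes the caller
 * expects. Returns 0 on success, -1 if the safety check trips.
 */
static int checked_patch(unsigned char *site,
                         const unsigned char *expect,
                         const unsigned char *repl)
{
    if (memcmp(site, expect, SITE_LEN) != 0) {
        fprintf(stderr, "Unexpected op (%02x %02x %02x %02x %02x)\n",
                site[0], site[1], site[2], site[3], site[4]);
        return -1;
    }
    memcpy(site, repl, SITE_LEN);
    return 0;
}
```

With a site that still holds `site_nop`, calling `checked_patch(site, site_nop, site_jmp)` succeeds, while calling it with `expect` set to `site_jmp` returns -1 and prints the same byte pattern as the report. That second case is one way the patcher's idea of a site and the actual code could disagree if a key's state changed before the sites were set up; whether that is what happened here is not established in this thread.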

-- Steve

2013-10-07 08:42:42

by Mel Gorman

Subject: Re: [REGRESSION] jump label safety checks break automatic numa balancing

On Fri, Oct 04, 2013 at 11:03:56AM -0400, Steven Rostedt wrote:
>
> FYI, please remove my redhat email from your address book. I don't read
> my RH email when I travel (which I've been doing a lot lately).
>
>
> On Fri, 04 Oct 2013 10:44:00 -0400
> Mel Gorman <[email protected]> wrote:
>
> > With CONFIG_NUMA_BALANCING=y and booting with numa_balancing=enable
> > there is a crash very early in the lifetime of the system. By setting
> > earlyprintk=ttyS0,115200 the error is visible and looks something like
> > this
> >
> > [ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.11.0-vanilla+ root=/dev/sda5 reboot=pci console=tty0 console=ttyS0,115200 numa_balancing=enable earlyprintk=ttyS0,115200
> > [ 0.000000] Unexpected op at task_numa_fault+0x1d/0xa0 [ffffffff81085ded] (0f 1f 44 00 00) arch/x86/kernel/jump_label.c:53
> > PANIC: early exception 06 rip 10:ffffffff815b2663 error 0 cr2 ffff88107ffff000
>
> What's at ffffffff815b2663?
>

kernel/sched/fair.c:5901

--
Mel Gorman
SUSE Labs

2013-10-08 10:33:47

by Raghavendra KT

Subject: Re: [REGRESSION] jump label safety checks break automatic numa balancing

On Fri, Oct 4, 2013 at 8:33 PM, Steven Rostedt <[email protected]> wrote:
CCing my IBM id
> >
> > Bisection identified this as the problem commit.
> >
> > 9c85f3bdf400665eecf62658a9106501f6a77a13 is the first bad commit
> > commit 9c85f3bdf400665eecf62658a9106501f6a77a13
> > Author: Steven Rostedt <[email protected]>
> > Date: Thu Jan 26 18:38:07 2012 -0500
> >
> > x86/jump-label: Add safety checks to jump label conversions
> >
> > I did no further investigation yet in case this is already a known
> > problem.
> >
>
> We had a similar bug like this with Xen. It turned out that jump
> labels were being used before they were initialized, and that is a real
> bug too: the jump labels do not get converted until initialization, so
> why would something convert one before then?

Just FYI,

I also faced a KVM hang with CONFIG_PARAVIRT_SPINLOCKS=y, which uses jump labels.

The bisection led to:

# good: [8d7551eb1916832f2a5b27346edf24e7b2382f67] Merge tag 'cris-for-3.12' of git://jni.nu/cris
git bisect good 8d7551eb1916832f2a5b27346edf24e7b2382f67
# good: [fb40d7a8994a3cc7a1e1c1f3258ea8662a366916] x86/jump-label: Show where and what was wrong on errors
git bisect good fb40d7a8994a3cc7a1e1c1f3258ea8662a366916
# good: [8876dd78d9f0cd317d55dc19e5cd17194af15b52] hwmon: (ina2xx) Remove casting the return value which is a void pointer
git bisect good 8876dd78d9f0cd317d55dc19e5cd17194af15b52
# bad: [442e0973e9273ae8832abd70f52efde8b8326178] Merge branch 'x86/jumplabel' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 442e0973e9273ae8832abd70f52efde8b8326178

I confirmed that reverting three hunks from the jump label merge
solved the problem.
Disabling jump labels resulted in an eventual halt of all the VCPUs
when paravirt spinlocks are in use.
But I think the proper fix is to split kvm_guest_spinlock_init, similar
to what Xen did.

I am testing the fix and will post it soon.