2014-06-05 05:03:18

by Jet Chen

Subject: [rcu] BUG: unable to handle kernel NULL pointer dereference at (null)

Hi Paul,

The 0day kernel testing robot got the dmesg below, and the first bad commit is

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/dev
commit a558e309f99cf8b809691417324c6770e10bb614
Author: Paul E. McKenney <[email protected]>
AuthorDate: Wed Jun 4 13:46:03 2014 -0700
Commit: Paul E. McKenney <[email protected]>
CommitDate: Wed Jun 4 15:07:01 2014 -0700

rcu: Bind grace-period kthreads to non-NO_HZ_FULL CPUs

Binding the grace-period kthreads to the timekeeping CPU resulted in
significant performance decreases for some workloads. For more detail,
see:

https://lkml.org/lkml/2014/6/3/395 for benchmark numbers
https://lkml.org/lkml/2014/6/4/218 for CPU statistics

This commit avoids this issue for many configurations by binding the
grace-period kthreads to all the non-NO_HZ_FULL CPUs, not just to the
sole timekeeping CPU.

Reported-by: Jet Chen <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>

+------------------------------------------------------+------------+------------+
|                                                      | b5ec6ac529 | a558e309f9 |
+------------------------------------------------------+------------+------------+
| boot_successes                                       | 78         | 0          |
| boot_failures                                        | 2          | 30         |
| BUG:kernel_test_crashed                              | 2          |            |
| BUG:unable_to_handle_kernel_NULL_pointer_dereference | 0          | 29         |
| Oops                                                 | 0          | 30         |
| RIP:__bitmap_equal                                   | 0          | 30         |
| Kernel_panic-not_syncing:Fatal_exception             | 0          | 30         |
| backtrace:set_cpus_allowed_ptr                       | 0          | 30         |
+------------------------------------------------------+------------+------------+

[ 0.252885] pci 0000:00:02.0: Boot video device
[ 0.253561] PCI: CLS 0 bytes, default 64
[ 0.254253] Unpacking initramfs...
[ 0.254266] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 0.254289] IP: [<ffffffff813d93c1>] __bitmap_equal+0x51/0x90
[ 0.254290] PGD 0
[ 0.254291] Oops: 0000 [#1] SMP
[ 0.254294] Modules linked in:
[ 0.254300] CPU: 1 PID: 7 Comm: rcu_sched Not tainted 3.15.0-rc1-00084-ga558e30 #1
[ 0.254300] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 0.254301] task: ffff880013783b00 ti: ffff880013788000 task.ti: ffff880013788000
[ 0.254303] RIP: 0010:[<ffffffff813d93c1>] [<ffffffff813d93c1>] __bitmap_equal+0x51/0x90
[ 0.254306] RSP: 0000:ffff880013789dd0 EFLAGS: 00010046
[ 0.254306] RAX: 0000000000000000 RBX: ffff880013783b00 RCX: 0000000000000000
[ 0.254307] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff880013783e08
[ 0.254307] RBP: ffff880013789dd0 R08: 0000000000000282 R09: 0000000000000000
[ 0.254308] R10: 0000000000000000 R11: 0000000000000005 R12: ffffffff81c7fc60
[ 0.254308] R13: 0000000000000000 R14: ffffffff81c5f400 R15: ffff880013b149c0
[ 0.254309] FS: 0000000000000000(0000) GS:ffff880013b00000(0000) knlGS:0000000000000000
[ 0.254309] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 0.254310] CR2: 0000000000000000 CR3: 0000000001bfd000 CR4: 00000000000006e0
[ 0.254316] Stack:
[ 0.254317] ffff880013789e20 ffffffff8109c60d 0000000000000282 ffff880013789e78
[ 0.254318] ffff880013789e20 ffffffff81c5f300 ffffffff81c7fc60 ffff880013b0eb20
[ 0.254319] ffffffff81c5f400 ffffffff81c5f300 ffff880013789ec8 ffffffff810ceb3c
[ 0.254319] Call Trace:
[ 0.254338] [<ffffffff8109c60d>] set_cpus_allowed_ptr+0x3d/0x110
[ 0.254344] [<ffffffff810ceb3c>] rcu_gp_kthread+0xec/0x8d0
[ 0.254349] [<ffffffff810b0410>] ? abort_exclusive_wait+0xb0/0xb0
[ 0.254350] [<ffffffff810cea50>] ? rcu_process_callbacks+0x5d0/0x5d0
[ 0.254356] [<ffffffff8108b852>] kthread+0xd2/0xf0
[ 0.254358] [<ffffffff8108b780>] ? kthread_create_on_node+0x180/0x180
[ 0.254372] [<ffffffff81831e7c>] ret_from_fork+0x7c/0xb0
[ 0.254374] [<ffffffff8108b780>] ? kthread_create_on_node+0x180/0x180
[ 0.254383] Code: 1f 00 4c 8b 44 07 08 48 83 c0 08 4c 3b 04 06 75 49 83 c1 01 44 39 c9 75 e9 f6 c2 3f b8 01 00 00 00 74 30 89 d0 48 63 c9 c1 f8 1f <48> 8b 34 ce 48 33 34 cf c1 e8 1a 8d 0c 02 83 e1 3f 29 c1 b8 01
[ 0.254385] RIP [<ffffffff813d93c1>] __bitmap_equal+0x51/0x90
[ 0.254385] RSP <ffff880013789dd0>
[ 0.254386] CR2: 0000000000000000
[ 0.254397] ---[ end trace 685126bdf0ef28ff ]---
[ 0.254398] Kernel panic - not syncing: Fatal exception

The dmesg for the parent commit is attached as well, to help confirm whether this is a noise error.


git bisect start a558e309f99cf8b809691417324c6770e10bb614 c9eaa447e77efe77b7fa4c953bd62de8297fd6c5 --
git bisect good 812cae8dd849450dcdf4bcf2b180c46a6b9dc165 # 09:08 20+ 1 rcu: Handle obsolete references to TINY_PREEMPT_RCU
git bisect good 2486acf2d73ff255fadc92a4ee45b6c466ac15ec # 09:15 20+ 0 torture: Avoid format string leak to thead name
git bisect good aaf0bec87d07b6ccede533c6eaac3dcd7eed7638 # 09:28 20+ 0 MAINTAINERS: Add "R:" designated-reviewers tag
git bisect good 8f23f609aea25eef7a70a81749c1cc01f908b34c # 09:39 20+ 0 rcu: Add designated reviewers for RCU
git bisect good b5ec6ac52922763165166f6aa6ffde948d4dd067 # 09:51 20+ 1 Update RCU maintainership
# first bad commit: [a558e309f99cf8b809691417324c6770e10bb614] rcu: Bind grace-period kthreads to non-NO_HZ_FULL CPUs
git bisect good b5ec6ac52922763165166f6aa6ffde948d4dd067 # 09:56 60+ 2 Update RCU maintainership
git bisect bad a558e309f99cf8b809691417324c6770e10bb614 # 09:56 0- 30 rcu: Bind grace-period kthreads to non-NO_HZ_FULL CPUs
git bisect good 54539cd217d687d9acf385eab22ec02b3f7a86a0 # 10:10 60+ 1 Merge branch 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
git bisect good d0b2e95e0bccbe33652d76f4979603447de2e048 # 10:24 60+ 0 Add linux-next specific files for 20140604
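
The "first bad commit" line in a log like the one above can be pulled out mechanically. A minimal sketch, assuming the log has been saved to a file; the function name first_bad_commit is hypothetical and is not part of the 0day tooling:

```bash
#!/bin/bash
# Sketch only: print the 40-hex-digit commit ID found on the
# "# first bad commit: [<sha>] <subject>" line of a saved bisect log.
# The function name is illustrative, not an existing tool.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]\{40\}\)\].*/\1/p' "$1"
}
```

For example, git show "$(first_bad_commit bisect.log)" would then display the offending commit, where bisect.log is an assumed file name for the saved log.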


This script may reproduce the error.

-----------------------------------------------------------------------------
#!/bin/bash
# Boot the given kernel image under QEMU/KVM with a minimal Yocto initrd.

kernel=${1:?usage: $0 <kernel-image>}
initrd=yocto-minimal-x86_64.cgz

# Fetch the initrd once. Use the raw/ URL: the blob/ URL returns an HTML page.
wget --no-clobber "https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd"

# QEMU options: 2 CPUs, 256M of RAM, console on the serial port (stdio).
kvm=(
	qemu-system-x86_64 -cpu kvm64 -enable-kvm
	-kernel "$kernel"
	-initrd "$initrd"
	-smp 2
	-m 256M
	-net nic,vlan=0,macaddr=00:00:00:00:00:00,model=virtio
	-net user,vlan=0
	-net nic,vlan=1,model=e1000
	-net user,vlan=1
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-serial stdio
	-display none
	-monitor null
)

# Kernel command-line parameters, joined with spaces when passed to QEMU.
append=(
	debug
	sched_debug
	apic=debug
	ignore_loglevel
	sysrq_always_enabled
	panic=10
	prompt_ramdisk=0
	earlyprintk=ttyS0,115200
	console=ttyS0,115200
	console=tty0
	vga=normal
	root=/dev/ram0
	rw
)

"${kvm[@]}" --append "${append[*]}"
-----------------------------------------------------------------------------
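
Since the script above only "may" reproduce the error, it can help to capture the serial console output and check it for the failure signature. A minimal sketch, assuming the console output was saved to a file (for example with tee); the function name classify_boot_log and the file names are hypothetical:

```bash
#!/bin/bash
# Sketch only: classify one captured console log by the messages
# quoted in this report. Name and categories are illustrative.
classify_boot_log() {
    local log=$1
    if grep -q 'BUG: unable to handle kernel NULL pointer dereference' "$log"; then
        echo "null-deref"
    elif grep -q 'Kernel panic - not syncing' "$log"; then
        echo "panic"
    else
        echo "no-oops"
    fi
}
```

Assuming the reproduction script is saved as reproduce.sh, one run could be checked with: ./reproduce.sh bzImage | tee console.log; classify_boot_log console.log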

Thanks,
Jet


Attachments:
dmesg-yocto-ivb44-33:20140605090019:x86_64-rhel:3.15.0-rc1-00084-ga558e30:1 (33.94 kB)
Attached Message Part (86.00 B)
config-3.15.0-rc1-00084-ga558e30 (120.41 kB)
dmesg-quantal-ivb44-108:20140605095447:x86_64-rhel:3.15.0-rc1-00083-gb5ec6ac:1 (48.16 kB)