2011-05-15 10:57:16

by Marcin Slusarz

[permalink] [raw]
Subject: early kernel crash when kmemleak is enabled

Hi

Yesterday I enabled kmemleak for the first time and it greeted me with a panic :).
It's fully reproducible in qemu (for some reason only when kvm is enabled).
Dmesg, .config and disassembly of __queue_work (which it dies in) below.
Booting with kmemleak=off does not crash the kernel.

[ 0.000000] Linux version 2.6.39-rc4-nv+ (marcin@joi) (gcc version 4.4.5 (Gentoo 4.4.5 p1.0, pie-0.4.5) ) #697 SMP PREEMPT Sun May 15 12:30:37 CEST 2011
[ 0.000000] Command line: console=ttyS0
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
[ 0.000000] BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 0000000007ffd000 (usable)
[ 0.000000] BIOS-e820: 0000000007ffd000 - 0000000008000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI 2.4 present.
[ 0.000000] No AGP bridge found
[ 0.000000] last_pfn = 0x7ffd max_arch_pfn = 0x400000000
[ 0.000000] PAT not supported by CPU.
[ 0.000000] init_memory_mapping: 0000000000000000-0000000007ffd000
[ 0.000000] ACPI: RSDP 00000000000fdbb0 00014 (v00 BOCHS )
[ 0.000000] ACPI: RSDT 0000000007ffde30 00034 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001)
[ 0.000000] ACPI: FACP 0000000007fffe70 00074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001)
[ 0.000000] ACPI: DSDT 0000000007ffdfd0 01E22 (v01 BXPC BXDSDT 00000001 INTL 20090123)
[ 0.000000] ACPI: FACS 0000000007fffe00 00040
[ 0.000000] ACPI: SSDT 0000000007ffdf90 00037 (v01 BOCHS BXPCSSDT 00000001 BXPC 00000001)
[ 0.000000] ACPI: APIC 0000000007ffdeb0 00072 (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001)
[ 0.000000] ACPI: HPET 0000000007ffde70 00038 (v01 BOCHS BXPCHPET 00000001 BXPC 00000001)
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0x00000010 -> 0x00001000
[ 0.000000] DMA32 0x00001000 -> 0x00100000
[ 0.000000] Normal empty
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[2] active PFN ranges
[ 0.000000] 0: 0x00000010 -> 0x0000009f
[ 0.000000] 0: 0x00000100 -> 0x00007ffd
[ 0.000000] ACPI: PM-Timer IO Port: 0xb008
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[ 0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.000000] Allocating PCI resources starting at 8000000 (gap: 8000000:f7fc0000)
[ 0.000000] setup_percpu: NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:1 nr_node_ids:1
[ 0.000000] PERCPU: Embedded 26 pages/cpu @ffff880007c00000 s76224 r8192 d22080 u2097152
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 32199
[ 0.000000] Kernel command line: console=ttyS0
[ 0.000000] PID hash table entries: 512 (order: 0, 4096 bytes)
[ 0.000000] Dentry cache hash table entries: 16384 (order: 5, 131072 bytes)
[ 0.000000] Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
[ 0.000000] Checking aperture...
[ 0.000000] No AGP bridge found
[ 0.000000] Memory: 112972k/131060k available (4866k kernel code, 452k absent, 17636k reserved, 5923k data, 1740k init)
[ 0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] Preemptable hierarchical RCU implementation.
[ 0.000000] NR_IRQS:512
[ 0.000000] Console: colour VGA+ 80x25
[ 0.000000] console [ttyS0] enabled
[ 0.000000] Fast TSC calibration using PIT
[ 0.000000] Detected 2672.813 MHz processor.
[ 0.020003] Calibrating delay loop (skipped), value calculated using timer frequency.. 5345.62 BogoMIPS (lpj=26728130)
[ 0.021929] pid_max: default: 32768 minimum: 301
[ 0.023380] Mount-cache hash table entries: 256
[ 0.030625] mce: CPU supports 10 MCE banks
[ 0.031614] SMP alternatives: switching to UP code
[ 0.040891] debug: unmapping init memory ffffffff81c3e000..ffffffff81c43000
[ 0.042167] ACPI: Core revision 20110316
[ 0.077518] Setting APIC routing to flat
[ 0.079415] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.100047] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 0.101416] IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a
[ 0.102390] PGD 0
[ 0.102751] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 0.103711] last sysfs file:
[ 0.104216] CPU 0
[ 0.104535] Modules linked in:
[ 0.105117]
[ 0.105379] Pid: 1, comm: swapper Not tainted 2.6.39-rc4-nv+ #697 Bochs Bochs
[ 0.106605] RIP: 0010:[<ffffffff810854d1>] [<ffffffff810854d1>] __queue_work+0x29/0x41a
[ 0.108001] RSP: 0018:ffff880007c03c90 EFLAGS: 00010246
[ 0.108912] RAX: 0000000000000000 RBX: ffffffff81a4fa80 RCX: ffff880007c03bb0
[ 0.110000] RDX: 0000000000000003 RSI: ffffffff81e41de8 RDI: 0000000000000001
[ 0.110000] RBP: ffff880007c03ce0 R08: ffffffff81e41de0 R09: 000000000000005a
[ 0.110000] R10: ffffea0000000008 R11: 0000000000000000 R12: 0000000000000000
[ 0.110000] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88000794b150
[ 0.110000] FS: 0000000000000000(0000) GS:ffff880007c00000(0000) knlGS:0000000000000000
[ 0.110000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 0.110000] CR2: 0000000000000000 CR3: 0000000001a23000 CR4: 00000000000006f0
[ 0.110000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.110000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 0.110000] Process swapper (pid: 1, threadinfo ffff880007ac2000, task ffff880007ac8000)
[ 0.110000] Stack:
[ 0.110000] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.110000] 0000000000000000 0000000000000000 ffffffff81a4fa80 ffff88000794b000
[ 0.110000] 0000000000000000 ffff88000794b150 ffff880007c03cf0 ffffffff81085910
[ 0.110000] Call Trace:
[ 0.110000] <IRQ>
[ 0.110000] [<ffffffff81085910>] queue_work_on+0x16/0x1d
[ 0.110000] [<ffffffff81085abc>] queue_work+0x29/0x55
[ 0.110000] [<ffffffff81085afb>] schedule_work+0x13/0x15
[ 0.110000] [<ffffffff81242de1>] free_object+0x90/0x95
[ 0.110000] [<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3
[ 0.110000] [<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d
[ 0.110000] [<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d
[ 0.110000] [<ffffffff8110890c>] kmem_cache_free+0x64/0x12c
[ 0.110000] [<ffffffff8110bd14>] free_object_rcu+0x68/0x6d
[ 0.110000] [<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9
[ 0.110000] [<ffffffff81095c9f>] ? tick_handle_periodic+0x1f/0x6c
[ 0.110000] [<ffffffff810b5a5a>] rcu_process_callbacks+0x7b/0x83
[ 0.110000] [<ffffffff810733b2>] __do_softirq+0x117/0x207
[ 0.110000] [<ffffffff810b05d3>] ? handle_irq_event+0x47/0x5c
[ 0.110000] [<ffffffff814bd0cc>] call_softirq+0x1c/0x30
[ 0.110000] [<ffffffff81034bc4>] do_softirq+0x38/0x80
[ 0.110000] [<ffffffff810730ed>] irq_exit+0x4e/0xa0
[ 0.110000] [<ffffffff8103429a>] do_IRQ+0x97/0xae
[ 0.110000] [<ffffffff814b6853>] common_interrupt+0x13/0x13
[ 0.110000] <EOI>
[ 0.110000] [<ffffffff814b92b8>] ? add_preempt_count+0x62/0xc5
[ 0.110000] [<ffffffff8123d28f>] delay_tsc+0x78/0xcb
[ 0.110000] [<ffffffff8123d32a>] __const_udelay+0x25/0x27
[ 0.110000] [<ffffffff81aaae16>] timer_irq_works+0x3c/0x77
[ 0.110000] [<ffffffff81aab422>] setup_IO_APIC+0x337/0x755
[ 0.110000] [<ffffffff81aa956c>] native_smp_prepare_cpus+0x3a0/0x451
[ 0.110000] [<ffffffff814b64b9>] ? _raw_spin_unlock_irq+0x19/0x34
[ 0.110000] [<ffffffff81a9e52b>] kernel_init+0x4e/0x135
[ 0.110000] [<ffffffff8123e0fe>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[ 0.110000] [<ffffffff814bcfd4>] kernel_thread_helper+0x4/0x10
[ 0.110000] [<ffffffff81063d70>] ? finish_task_switch+0x5a/0xcb
[ 0.110000] [<ffffffff814b64b9>] ? _raw_spin_unlock_irq+0x19/0x34
[ 0.110000] [<ffffffff814b68b1>] ? retint_restore_args+0xe/0xe
[ 0.110000] [<ffffffff81a9e4dd>] ? parse_early_options+0x20/0x20
[ 0.110000] [<ffffffff814bcfd0>] ? gs_change+0xb/0xb
[ 0.110000] Code: c9 c3 55 48 89 e5 41 57 41 56 41 55 49 89 f5 41 54 48 c7 c6 a0 b7 a3 81 53 41 89 fc 48 83 ec 28 48 89 d3 48 89 d7 e8 63 d7 1b 00
[ 0.110000] f6 45 00 40 0f 84 6b 01 00 00 b8 09 00 00 00 83 3d 28 10 a0
[ 0.110000] RIP [<ffffffff810854d1>] __queue_work+0x29/0x41a
[ 0.110000] RSP <ffff880007c03c90>
[ 0.110000] CR2: 0000000000000000
[ 0.110005] ---[ end trace 4eaa2a86a8e2da22 ]---
[ 0.110788] Kernel panic - not syncing: Fatal exception in interrupt
[ 0.120003] Pid: 1, comm: swapper Tainted: G D 2.6.39-rc4-nv+ #697
[ 0.121184] Call Trace:
[ 0.121609] <IRQ> [<ffffffff814b2e58>] panic+0x96/0x1a1
[ 0.122558] [<ffffffff814b7667>] oops_end+0x81/0x8e
[ 0.123417] [<ffffffff81052ac3>] no_context+0x1f7/0x206
[ 0.124352] [<ffffffff81052c62>] __bad_area_nosemaphore+0x190/0x1b3
[ 0.125427] [<ffffffff81052c93>] bad_area_nosemaphore+0xe/0x10
[ 0.126436] [<ffffffff814b8f80>] do_page_fault+0x265/0x48a
[ 0.127377] [<ffffffff8123e13a>] ? trace_hardirqs_off_thunk+0x3a/0x6c
[ 0.128488] [<ffffffff814b6acf>] page_fault+0x1f/0x30
[ 0.130004] [<ffffffff810854d1>] ? __queue_work+0x29/0x41a
[ 0.130965] [<ffffffff81085910>] queue_work_on+0x16/0x1d
[ 0.131884] [<ffffffff81085abc>] queue_work+0x29/0x55
[ 0.132762] [<ffffffff81085afb>] schedule_work+0x13/0x15
[ 0.133685] [<ffffffff81242de1>] free_object+0x90/0x95
[ 0.134572] [<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3
[ 0.135678] [<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d
[ 0.136850] [<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d
[ 0.137835] [<ffffffff8110890c>] kmem_cache_free+0x64/0x12c
[ 0.138810] [<ffffffff8110bd14>] free_object_rcu+0x68/0x6d
[ 0.139786] [<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9
[ 0.140004] [<ffffffff81095c9f>] ? tick_handle_periodic+0x1f/0x6c
[ 0.141106] [<ffffffff810b5a5a>] rcu_process_callbacks+0x7b/0x83
[ 0.142185] [<ffffffff810733b2>] __do_softirq+0x117/0x207
[ 0.143167] [<ffffffff810b05d3>] ? handle_irq_event+0x47/0x5c
[ 0.144199] [<ffffffff814bd0cc>] call_softirq+0x1c/0x30
[ 0.145142] [<ffffffff81034bc4>] do_softirq+0x38/0x80
[ 0.146062] [<ffffffff810730ed>] irq_exit+0x4e/0xa0
[ 0.146940] [<ffffffff8103429a>] do_IRQ+0x97/0xae
[ 0.147784] [<ffffffff814b6853>] common_interrupt+0x13/0x13
[ 0.148768] <EOI> [<ffffffff814b92b8>] ? add_preempt_count+0x62/0xc5
[ 0.150791] [<ffffffff8123d28f>] delay_tsc+0x78/0xcb
[ 0.151645] [<ffffffff8123d32a>] __const_udelay+0x25/0x27
[ 0.152594] [<ffffffff81aaae16>] timer_irq_works+0x3c/0x77
[ 0.153545] [<ffffffff81aab422>] setup_IO_APIC+0x337/0x755
[ 0.154533] [<ffffffff81aa956c>] native_smp_prepare_cpus+0x3a0/0x451
[ 0.155627] [<ffffffff814b64b9>] ? _raw_spin_unlock_irq+0x19/0x34
[ 0.156679] [<ffffffff81a9e52b>] kernel_init+0x4e/0x135
[ 0.157590] [<ffffffff8123e0fe>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[ 0.158680] [<ffffffff814bcfd4>] kernel_thread_helper+0x4/0x10
[ 0.160004] [<ffffffff81063d70>] ? finish_task_switch+0x5a/0xcb
[ 0.161038] [<ffffffff814b64b9>] ? _raw_spin_unlock_irq+0x19/0x34
[ 0.162087] [<ffffffff814b68b1>] ? retint_restore_args+0xe/0xe
[ 0.163086] [<ffffffff81a9e4dd>] ? parse_early_options+0x20/0x20
[ 0.164130] [<ffffffff814bcfd0>] ? gs_change+0xb/0xb


(gdb) disassemble __queue_work
Dump of assembler code for function __queue_work:
0x0000000000085318 <+0>: push %rbp
0x0000000000085319 <+1>: mov %rsp,%rbp
0x000000000008531c <+4>: push %r15
0x000000000008531e <+6>: push %r14
0x0000000000085320 <+8>: push %r13
0x0000000000085322 <+10>: mov %rsi,%r13
0x0000000000085325 <+13>: push %r12
0x0000000000085327 <+15>: mov $0x0,%rsi
0x000000000008532e <+22>: push %rbx
0x000000000008532f <+23>: mov %edi,%r12d
0x0000000000085332 <+26>: sub $0x28,%rsp
0x0000000000085336 <+30>: mov %rdx,%rbx
0x0000000000085339 <+33>: mov %rdx,%rdi
0x000000000008533c <+36>: callq 0x85341 <__queue_work+41> <----------- this line ?!
0x0000000000085341 <+41>: testb $0x40,0x0(%r13)
0x0000000000085346 <+46>: je 0x854b7 <__queue_work+415>
0x000000000008534c <+52>: mov $0x9,%eax
0x0000000000085351 <+57>: cmpl $0x0,0x0(%rip) # 0x85358 <__queue_work+64>
0x0000000000085358 <+64>: js 0x8537b <__queue_work+99>
0x000000000008535a <+66>: xor %edx,%edx
0x000000000008535c <+68>: mov $0x8,%esi
0x0000000000085361 <+73>: mov 0x0(%rip),%rdi # 0x85368 <__queue_work+80>
0x0000000000085368 <+80>: callq 0x8536d <__queue_work+85>
0x000000000008536d <+85>: mov $0x8,%edx
0x0000000000085372 <+90>: cmp 0x0(%rip),%eax # 0x85378 <__queue_work+96>
0x0000000000085378 <+96>: cmovge %edx,%eax
0x000000000008537b <+99>: mov %eax,-0x34(%rbp)
0x000000000008537e <+102>: mov $0x0,%r14
0x0000000000085385 <+109>: mov $0x8,%r15d
0x000000000008538b <+115>: mov %gs:0x0,%rax
0x0000000000085394 <+124>: mov %rax,-0x40(%rbp)
0x0000000000085398 <+128>: jmpq 0x8547d <__queue_work+357>
0x000000000008539d <+133>: jne 0x853a8 <__queue_work+144>
0x000000000008539f <+135>: mov $0x0,%rdx
0x00000000000853a6 <+142>: jmp 0x853b6 <__queue_work+158>
0x00000000000853a8 <+144>: mov -0x34(%rbp),%eax
0x00000000000853ab <+147>: mov %r14,%rdx
0x00000000000853ae <+150>: add 0x0(,%rax,8),%rdx
0x00000000000853b6 <+158>: mov %rdx,%rdi
0x00000000000853b9 <+161>: mov %rdx,-0x48(%rbp)
0x00000000000853bd <+165>: callq 0x853c2 <__queue_work+170>
0x00000000000853c2 <+170>: xor %edi,%edi
0x00000000000853c4 <+172>: mov %rax,%rsi
0x00000000000853c7 <+175>: mov -0x48(%rbp),%rdx
0x00000000000853cb <+179>: mov 0x48(%rdx,%rdi,1),%rcx
0x00000000000853d0 <+184>: jmp 0x853f5 <__queue_work+221>
0x00000000000853d2 <+186>: mov %rcx,-0x48(%rbp)
0x00000000000853d6 <+190>: mov %rdx,%rdi
0x00000000000853d9 <+193>: callq 0x853de <__queue_work+198>
0x00000000000853de <+198>: mov -0x48(%rbp),%rcx
0x00000000000853e2 <+202>: mov 0x18(%rcx),%rax
0x00000000000853e6 <+206>: cmp %r13,0x8(%rax)
0x00000000000853ea <+210>: sete %al
0x00000000000853ed <+213>: jmpq 0x85489 <__queue_work+369>
0x00000000000853f2 <+218>: mov %rax,%rcx
0x00000000000853f5 <+221>: test %rcx,%rcx
0x00000000000853f8 <+224>: je 0x8540c <__queue_work+244>
0x00000000000853fa <+226>: mov (%rcx),%rax
0x00000000000853fd <+229>: mov -0x40(%rbp),%r8
0x0000000000085401 <+233>: prefetcht0 (%rax)
0x0000000000085404 <+236>: cmp %r8,0x30(%rcx)
0x0000000000085408 <+240>: jne 0x853f2 <__queue_work+218>
0x000000000008540a <+242>: jmp 0x853d2 <__queue_work+186>
0x000000000008540c <+244>: add $0x8,%rdi
0x0000000000085410 <+248>: cmp $0x200,%rdi
0x0000000000085417 <+255>: jne 0x853cb <__queue_work+179>
0x0000000000085419 <+257>: mov %rdx,%rdi
0x000000000008541c <+260>: callq 0x85421 <__queue_work+265>
0x0000000000085421 <+265>: mov -0x34(%rbp),%edx
0x0000000000085424 <+268>: mov $0x9,%eax
0x0000000000085429 <+273>: cmp 0x0(%rip),%edx # 0x8542f <__queue_work+279>
0x000000000008542f <+279>: jge 0x8547a <__queue_work+354>
0x0000000000085431 <+281>: cmp $0x8,%edx
0x0000000000085434 <+284>: jne 0x85457 <__queue_work+319>
0x0000000000085436 <+286>: cmpb $0x1,0x0(%rip) # 0x8543d <__queue_work+293>
0x000000000008543d <+293>: je 0x85457 <__queue_work+319>
0x000000000008543f <+295>: mov $0x6b,%esi
0x0000000000085444 <+300>: mov $0x0,%rdi
0x000000000008544b <+307>: callq 0x85450 <__queue_work+312>
0x0000000000085450 <+312>: movb $0x1,0x0(%rip) # 0x85457 <__queue_work+319>
0x0000000000085457 <+319>: mov -0x34(%rbp),%edx
0x000000000008545a <+322>: mov $0x8,%esi
0x000000000008545f <+327>: inc %edx
0x0000000000085461 <+329>: mov 0x0(%rip),%rdi # 0x85468 <__queue_work+336>
0x0000000000085468 <+336>: movslq %edx,%rdx
0x000000000008546b <+339>: callq 0x85470 <__queue_work+344>
0x0000000000085470 <+344>: cmp 0x0(%rip),%eax # 0x85476 <__queue_work+350>
0x0000000000085476 <+350>: cmovge %r15d,%eax
0x000000000008547a <+354>: mov %eax,-0x34(%rbp)
0x000000000008547d <+357>: cmpl $0x8,-0x34(%rbp)
0x0000000000085481 <+361>: jbe 0x8539d <__queue_work+133>
0x0000000000085487 <+367>: xor %eax,%eax
0x0000000000085489 <+369>: test %al,%al
0x000000000008548b <+371>: jne 0x854b7 <__queue_work+415>
0x000000000008548d <+373>: cmpb $0x1,0x0(%rip) # 0x85494 <__queue_work+380>
0x0000000000085494 <+380>: je 0x85723 <__queue_work+1035>
0x000000000008549a <+386>: mov $0x3e2,%esi
0x000000000008549f <+391>: mov $0x0,%rdi
0x00000000000854a6 <+398>: callq 0x854ab <__queue_work+403>
0x00000000000854ab <+403>: movb $0x1,0x0(%rip) # 0x854b2 <__queue_work+410>
0x00000000000854b2 <+410>: jmpq 0x85723 <__queue_work+1035>
0x00000000000854b7 <+415>: mov 0x0(%r13),%eax
0x00000000000854bb <+419>: test $0x2,%al
0x00000000000854bd <+421>: jne 0x85571 <__queue_work+601>
0x00000000000854c3 <+427>: cmp $0x8,%r12d
0x00000000000854c7 <+431>: jne 0x854df <__queue_work+455>
0x00000000000854c9 <+433>: mov %gs:0x0,%r12d
0x00000000000854d2 <+442>: mov $0x0,%r14
0x00000000000854d9 <+449>: cmp $0x8,%r12d
0x00000000000854dd <+453>: je 0x854f1 <__queue_work+473>
0x00000000000854df <+455>: mov $0x0,%r14
0x00000000000854e6 <+462>: mov %r12d,%edx
0x00000000000854e9 <+465>: add 0x0(,%rdx,8),%r14
0x00000000000854f1 <+473>: test $0x1,%al
0x00000000000854f3 <+475>: je 0x85564 <__queue_work+588>
0x00000000000854f5 <+477>: mov %rbx,%rdi
0x00000000000854f8 <+480>: callq 0x8243c <get_work_gcwq>
0x00000000000854fd <+485>: test %rax,%rax
0x0000000000085500 <+488>: je 0x85564 <__queue_work+588>
0x0000000000085502 <+490>: cmp %r14,%rax
0x0000000000085505 <+493>: je 0x85564 <__queue_work+588>
0x0000000000085507 <+495>: mov %rax,%rdi
0x000000000008550a <+498>: mov %rax,-0x48(%rbp)
0x000000000008550e <+502>: callq 0x85513 <__queue_work+507>
0x0000000000085513 <+507>: mov %rbx,%rcx
0x0000000000085516 <+510>: mov %rax,%r15
0x0000000000085519 <+513>: shr $0xb,%rcx
0x000000000008551d <+517>: mov %rbx,%rax
0x0000000000085520 <+520>: mov -0x48(%rbp),%rdx
0x0000000000085524 <+524>: shr $0x5,%rax
0x0000000000085528 <+528>: lea (%rcx,%rax,1),%rax
0x000000000008552c <+532>: and $0x3f,%eax
0x000000000008552f <+535>: mov 0x48(%rdx,%rax,8),%rax
0x0000000000085534 <+540>: test %rax,%rax
0x0000000000085537 <+543>: je 0x85559 <__queue_work+577>
0x0000000000085539 <+545>: mov (%rax),%rcx
0x000000000008553c <+548>: cmp %rbx,0x10(%rax)
0x0000000000085540 <+552>: prefetcht0 (%rcx)
0x0000000000085543 <+555>: je 0x8554a <__queue_work+562>
0x0000000000085545 <+557>: mov %rcx,%rax
0x0000000000085548 <+560>: jmp 0x85534 <__queue_work+540>
0x000000000008554a <+562>: mov 0x18(%rax),%rax
0x000000000008554e <+566>: cmp %r13,0x8(%rax)
0x0000000000085552 <+570>: jne 0x85559 <__queue_work+577>
0x0000000000085554 <+572>: mov %rdx,%r14
0x0000000000085557 <+575>: jmp 0x85587 <__queue_work+623>
0x0000000000085559 <+577>: mov %r15,%rsi
0x000000000008555c <+580>: mov %rdx,%rdi
0x000000000008555f <+583>: callq 0x85564 <__queue_work+588>
0x0000000000085564 <+588>: mov %r14,%rdi
0x0000000000085567 <+591>: callq 0x8556c <__queue_work+596>
0x000000000008556c <+596>: mov %rax,%r15
0x000000000008556f <+599>: jmp 0x85587 <__queue_work+623>
0x0000000000085571 <+601>: mov $0x0,%rdi
0x0000000000085578 <+608>: mov $0x0,%r14
0x000000000008557f <+615>: callq 0x85584 <__queue_work+620>
0x0000000000085584 <+620>: mov %rax,%r15
0x0000000000085587 <+623>: mov 0x28(%r14),%eax
0x000000000008558b <+627>: testb $0x2,0x0(%r13)
0x0000000000085590 <+632>: jne 0x855aa <__queue_work+658>
0x0000000000085592 <+634>: cmp 0x0(%rip),%eax # 0x85598 <__queue_work+640>
0x0000000000085598 <+640>: jae 0x855b5 <__queue_work+669>
0x000000000008559a <+642>: mov 0x8(%r13),%r13
0x000000000008559e <+646>: mov %eax,%eax
0x00000000000855a0 <+648>: add 0x0(,%rax,8),%r13
0x00000000000855a8 <+656>: jmp 0x855b8 <__queue_work+672>
0x00000000000855aa <+658>: cmp $0x8,%eax
0x00000000000855ad <+661>: jne 0x855b5 <__queue_work+669>
0x00000000000855af <+663>: mov 0x8(%r13),%r13
0x00000000000855b3 <+667>: jmp 0x855b8 <__queue_work+672>
0x00000000000855b5 <+669>: xor %r13d,%r13d
0x00000000000855b8 <+672>: cmpl $0x0,0x0(%rip) # 0x855bf <__queue_work+679>
0x00000000000855bf <+679>: je 0x8561c <__queue_work+772>
0x00000000000855c1 <+681>: mov %gs:0x0,%rax
0x00000000000855ca <+690>: incl -0x1fbc(%rax)
0x00000000000855d0 <+696>: mov 0x0(%rip),%rax # 0x855d7 <__queue_work+703>
0x00000000000855d7 <+703>: test %rax,%rax
0x00000000000855da <+706>: je 0x855fd <__queue_work+741>
0x00000000000855dc <+708>: mov 0x8(%rax),%rdi
0x00000000000855e0 <+712>: mov %rbx,%rcx
0x00000000000855e3 <+715>: mov %rax,-0x48(%rbp)
0x00000000000855e7 <+719>: mov %r13,%rdx
0x00000000000855ea <+722>: mov %r12d,%esi
0x00000000000855ed <+725>: callq *(%rax)
0x00000000000855ef <+727>: mov -0x48(%rbp),%rax
0x00000000000855f3 <+731>: add $0x10,%rax
0x00000000000855f7 <+735>: cmpq $0x0,(%rax)
0x00000000000855fb <+739>: jmp 0x855da <__queue_work+706>
0x00000000000855fd <+741>: mov %gs:0x0,%rax
0x0000000000085606 <+750>: decl -0x1fbc(%rax)
0x000000000008560c <+756>: mov -0x1fc8(%rax),%rax
0x0000000000085613 <+763>: test $0x8,%al
0x0000000000085615 <+765>: je 0x8561c <__queue_work+772>
0x0000000000085617 <+767>: callq 0x8561c <__queue_work+772>
0x000000000008561c <+772>: lea 0x8(%rbx),%rax
0x0000000000085620 <+776>: cmp %rax,0x8(%rbx)
0x0000000000085624 <+780>: je 0x8562a <__queue_work+786>
0x0000000000085626 <+782>: ud2a
0x0000000000085628 <+784>: jmp 0x85628 <__queue_work+784>
0x000000000008562a <+786>: movslq 0x10(%r13),%rax
0x000000000008562e <+790>: incl 0x18(%r13,%rax,4)
0x0000000000085633 <+795>: mov 0x10(%r13),%ecx
0x0000000000085637 <+799>: mov 0x54(%r13),%eax
0x000000000008563b <+803>: shl $0x5,%ecx
0x000000000008563e <+806>: cmp 0x58(%r13),%eax
0x0000000000085642 <+810>: jge 0x85706 <__queue_work+1006>
0x0000000000085648 <+816>: cmpl $0x0,0x0(%rip) # 0x8564f <__queue_work+823>
0x000000000008564f <+823>: je 0x856ae <__queue_work+918>
0x0000000000085651 <+825>: mov %gs:0x0,%rax
0x000000000008565a <+834>: incl -0x1fbc(%rax)
0x0000000000085660 <+840>: mov 0x0(%rip),%r12 # 0x85667 <__queue_work+847>
0x0000000000085667 <+847>: test %r12,%r12
0x000000000008566a <+850>: je 0x85689 <__queue_work+881>
0x000000000008566c <+852>: mov 0x8(%r12),%rdi
0x0000000000085671 <+857>: mov %rbx,%rsi
0x0000000000085674 <+860>: mov %ecx,-0x48(%rbp)
0x0000000000085677 <+863>: callq *(%r12)
0x000000000008567b <+867>: add $0x10,%r12
0x000000000008567f <+871>: mov -0x48(%rbp),%ecx
0x0000000000085682 <+874>: cmpq $0x0,(%r12)
0x0000000000085687 <+879>: jmp 0x8566a <__queue_work+850>
0x0000000000085689 <+881>: mov %gs:0x0,%rax
0x0000000000085692 <+890>: decl -0x1fbc(%rax)
0x0000000000085698 <+896>: mov -0x1fc8(%rax),%rax
0x000000000008569f <+903>: test $0x8,%al
0x00000000000856a1 <+905>: je 0x856ae <__queue_work+918>
0x00000000000856a3 <+907>: mov %ecx,-0x48(%rbp)
0x00000000000856a6 <+910>: callq 0x856ab <__queue_work+915>
0x00000000000856ab <+915>: mov -0x48(%rbp),%ecx
0x00000000000856ae <+918>: incl 0x54(%r13)
0x00000000000856b2 <+922>: mov 0x8(%r13),%rax
0x00000000000856b6 <+926>: lea 0x18(%r14),%rsi
0x00000000000856ba <+930>: testb $0x10,(%rax)
0x00000000000856bd <+933>: jne 0x856c4 <__queue_work+940>
0x00000000000856bf <+935>: mov %rsi,%rdx
0x00000000000856c2 <+938>: jmp 0x8570d <__queue_work+1013>
0x00000000000856c4 <+940>: mov 0x18(%r14),%rax
0x00000000000856c8 <+944>: sub $0x8,%rax
0x00000000000856cc <+948>: jmp 0x856ef <__queue_work+983>
0x00000000000856ce <+950>: mov (%rax),%r8
0x00000000000856d1 <+953>: xor %eax,%eax
0x00000000000856d3 <+955>: test $0x4,%r8b
0x00000000000856d7 <+959>: je 0x856e2 <__queue_work+970>
0x00000000000856d9 <+961>: mov %r8,%rax
0x00000000000856dc <+964>: and $0xfffffffffffffe00,%rax
0x00000000000856e2 <+970>: mov 0x8(%rax),%rax
0x00000000000856e6 <+974>: testb $0x10,(%rax)
0x00000000000856e9 <+977>: je 0x856ff <__queue_work+999>
0x00000000000856eb <+979>: lea -0x8(%rdi),%rax
0x00000000000856ef <+983>: mov 0x8(%rax),%rdi
0x00000000000856f3 <+987>: lea 0x8(%rax),%rdx
0x00000000000856f7 <+991>: prefetcht0 (%rdi)
0x00000000000856fa <+994>: cmp %rsi,%rdx
0x00000000000856fd <+997>: jne 0x856ce <__queue_work+950>
0x00000000000856ff <+999>: orl $0x10,0x2c(%r14)
0x0000000000085704 <+1004>: jmp 0x8570d <__queue_work+1013>
0x0000000000085706 <+1006>: or $0x2,%ecx
0x0000000000085709 <+1009>: lea 0x60(%r13),%rdx
0x000000000008570d <+1013>: mov %rbx,%rsi
0x0000000000085710 <+1016>: mov %r13,%rdi
0x0000000000085713 <+1019>: callq 0x82e4a <insert_work>
0x0000000000085718 <+1024>: mov %r15,%rsi
0x000000000008571b <+1027>: mov %r14,%rdi
0x000000000008571e <+1030>: callq 0x85723 <__queue_work+1035>
0x0000000000085723 <+1035>: add $0x28,%rsp
0x0000000000085727 <+1039>: pop %rbx
0x0000000000085728 <+1040>: pop %r12
0x000000000008572a <+1042>: pop %r13
0x000000000008572c <+1044>: pop %r14
0x000000000008572e <+1046>: pop %r15
0x0000000000085730 <+1048>: leaveq
0x0000000000085731 <+1049>: retq
End of assembler dump.

Ideas, patches welcomed.

Marcin


Attachments:
(No filename) (27.27 kB)
.config (59.15 kB)
Download all attachments

2011-05-19 13:42:25

by Tejun Heo

[permalink] [raw]
Subject: Re: early kernel crash when kmemleak is enabled

Hello,

On Sun, May 15, 2011 at 12:55:05PM +0200, Marcin Slusarz wrote:
> [ 0.100047] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 0.101416] IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a
...
> [ 0.110000] Call Trace:
> [ 0.110000] <IRQ>
> [ 0.110000] [<ffffffff81085910>] queue_work_on+0x16/0x1d
> [ 0.110000] [<ffffffff81085abc>] queue_work+0x29/0x55
> [ 0.110000] [<ffffffff81085afb>] schedule_work+0x13/0x15
> [ 0.110000] [<ffffffff81242de1>] free_object+0x90/0x95
> [ 0.110000] [<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3
> [ 0.110000] [<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d
> [ 0.110000] [<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d
> [ 0.110000] [<ffffffff8110890c>] kmem_cache_free+0x64/0x12c
> [ 0.110000] [<ffffffff8110bd14>] free_object_rcu+0x68/0x6d
> [ 0.110000] [<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9
> [ 0.110000] [<ffffffff81095c9f>] ? tick_handle_periodic+0x1f/0x6c
> [ 0.110000] [<ffffffff810b5a5a>] rcu_process_callbacks+0x7b/0x83
> [ 0.110000] [<ffffffff810733b2>] __do_softirq+0x117/0x207
> [ 0.110000] [<ffffffff810b05d3>] ? handle_irq_event+0x47/0x5c
> [ 0.110000] [<ffffffff814bd0cc>] call_softirq+0x1c/0x30
> [ 0.110000] [<ffffffff81034bc4>] do_softirq+0x38/0x80
> [ 0.110000] [<ffffffff810730ed>] irq_exit+0x4e/0xa0
> [ 0.110000] [<ffffffff8103429a>] do_IRQ+0x97/0xae
> [ 0.110000] [<ffffffff814b6853>] common_interrupt+0x13/0x13

I can reproduce this reliably with your config too. From a quick
glance, the cause seems to be debug objects using RCU callback
free_object() to free objects, which ends up being called before
workqueue is initialized. The offending object type is "rcu_head" and
turning off CONFIG_DEBUG_OBJECTS_RCU_HEAD makes the problem go away.

Any ideas on how to fix this?

Thanks.

--
tejun

2011-05-19 13:48:55

by Catalin Marinas

[permalink] [raw]
Subject: Re: early kernel crash when kmemleak is enabled

On Thu, 2011-05-19 at 14:42 +0100, Tejun Heo wrote:
> Hello,
>
> On Sun, May 15, 2011 at 12:55:05PM +0200, Marcin Slusarz wrote:
> > [ 0.100047] BUG: unable to handle kernel NULL pointer dereference at (null)
> > [ 0.101416] IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a
> ...
> > [ 0.110000] Call Trace:
> > [ 0.110000] <IRQ>
> > [ 0.110000] [<ffffffff81085910>] queue_work_on+0x16/0x1d
> > [ 0.110000] [<ffffffff81085abc>] queue_work+0x29/0x55
> > [ 0.110000] [<ffffffff81085afb>] schedule_work+0x13/0x15
> > [ 0.110000] [<ffffffff81242de1>] free_object+0x90/0x95
> > [ 0.110000] [<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3
> > [ 0.110000] [<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d
> > [ 0.110000] [<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d
> > [ 0.110000] [<ffffffff8110890c>] kmem_cache_free+0x64/0x12c
> > [ 0.110000] [<ffffffff8110bd14>] free_object_rcu+0x68/0x6d
> > [ 0.110000] [<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9
> > [ 0.110000] [<ffffffff81095c9f>] ? tick_handle_periodic+0x1f/0x6c
> > [ 0.110000] [<ffffffff810b5a5a>] rcu_process_callbacks+0x7b/0x83
> > [ 0.110000] [<ffffffff810733b2>] __do_softirq+0x117/0x207
> > [ 0.110000] [<ffffffff810b05d3>] ? handle_irq_event+0x47/0x5c
> > [ 0.110000] [<ffffffff814bd0cc>] call_softirq+0x1c/0x30
> > [ 0.110000] [<ffffffff81034bc4>] do_softirq+0x38/0x80
> > [ 0.110000] [<ffffffff810730ed>] irq_exit+0x4e/0xa0
> > [ 0.110000] [<ffffffff8103429a>] do_IRQ+0x97/0xae
> > [ 0.110000] [<ffffffff814b6853>] common_interrupt+0x13/0x13
>
> I can reproduce this reliably with your config too. From a quick
> glance, the cause seems to be debug objects using RCU callback
> free_object() to free objects, which ends up being called before
> workqueue is initialized. The offending object type is "rcu_head" and
> turning off CONFIG_DEBUG_OBJECTS_RCU_HEAD makes the problem go away.
>
> Any ideas on how to fix this?

Thanks for tracking this down. Untested (I can add a log afterwards):

diff --git a/init/main.c b/init/main.c
index 4a9479e..48df882 100644
--- a/init/main.c
+++ b/init/main.c
@@ -580,8 +580,8 @@ asmlinkage void __init start_kernel(void)
#endif
page_cgroup_init();
enable_debug_pagealloc();
- kmemleak_init();
debug_objects_mem_init();
+ kmemleak_init();
setup_per_cpu_pageset();
numa_policy_init();
if (late_time_init)

--
Catalin

2011-05-19 13:54:31

by Tejun Heo

[permalink] [raw]
Subject: Re: early kernel crash when kmemleak is enabled

On Thu, May 19, 2011 at 02:48:44PM +0100, Catalin Marinas wrote:
> Thanks for tracking this down. Untested (I can add a log afterwards):
>
> diff --git a/init/main.c b/init/main.c
> index 4a9479e..48df882 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -580,8 +580,8 @@ asmlinkage void __init start_kernel(void)
> #endif
> page_cgroup_init();
> enable_debug_pagealloc();
> - kmemleak_init();
> debug_objects_mem_init();
> + kmemleak_init();
> setup_per_cpu_pageset();
> numa_policy_init();
> if (late_time_init)

Heh, that was swift. Yeap, seems to work here. Please feel free to
add my Tested-by.

Thank you.

--
tejun

2011-05-19 14:09:08

by Catalin Marinas

[permalink] [raw]
Subject: Re: early kernel crash when kmemleak is enabled

On Thu, 2011-05-19 at 14:54 +0100, Tejun Heo wrote:
> On Thu, May 19, 2011 at 02:48:44PM +0100, Catalin Marinas wrote:
> > Thanks for tracking this down. Untested (I can add a log afterwards):
> >
> > diff --git a/init/main.c b/init/main.c
> > index 4a9479e..48df882 100644
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -580,8 +580,8 @@ asmlinkage void __init start_kernel(void)
> > #endif
> > page_cgroup_init();
> > enable_debug_pagealloc();
> > - kmemleak_init();
> > debug_objects_mem_init();
> > + kmemleak_init();
> > setup_per_cpu_pageset();
> > numa_policy_init();
> > if (late_time_init)
>
> Heh, that was swift. Yeap, seems to work here. Please feel free to
> add my Tested-by.

Thanks. I have two other minor kmemleak fixes, so I'll send Linus a pull
request in the next day or so.

--
Catalin

2011-05-27 20:25:15

by Marcin Slusarz

[permalink] [raw]
Subject: Re: early kernel crash when kmemleak is enabled

On Thu, May 19, 2011 at 03:08:53PM +0100, Catalin Marinas wrote:
> On Thu, 2011-05-19 at 14:54 +0100, Tejun Heo wrote:
> > On Thu, May 19, 2011 at 02:48:44PM +0100, Catalin Marinas wrote:
> > > Thanks for tracking this down. Untested (I can add a log afterwards):
> > >
> > > diff --git a/init/main.c b/init/main.c
> > > index 4a9479e..48df882 100644
> > > --- a/init/main.c
> > > +++ b/init/main.c
> > > @@ -580,8 +580,8 @@ asmlinkage void __init start_kernel(void)
> > > #endif
> > > page_cgroup_init();
> > > enable_debug_pagealloc();
> > > - kmemleak_init();
> > > debug_objects_mem_init();
> > > + kmemleak_init();
> > > setup_per_cpu_pageset();
> > > numa_policy_init();
> > > if (late_time_init)
> >
> > Heh, that was swift. Yeap, seems to work here. Please feel free to
> > add my Tested-by.
>
> Thanks. I have two other minor kmemleak fixes, so I'll send Linus a pull
> request in the next day or so.
>

With this patch applied kernel didn't panic, but kmemleak did not work either:

kmemleak: Early log buffer exceeded, please increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE
kmemleak: Kernel memory leak detector disabled

I increased DEBUG_KMEMLEAK_EARLY_LOG_SIZE from 400 to 1000, and it crashed in
exactly the same way:

[ 0.100045] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 0.101303] IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a
[ 0.102210] PGD 0
[ 0.102551] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 0.103459] last sysfs file:
[ 0.103949] CPU 0
[ 0.104260] Modules linked in:
[ 0.104803]
[ 0.105052] Pid: 1, comm: swapper Not tainted 2.6.39-rc4-nv+ #721 Bochs Bochs
[ 0.106193] RIP: 0010:[<ffffffff810854d1>] [<ffffffff810854d1>] __queue_work+0x29/0x41a
[ 0.107472] RSP: 0018:ffff880007c03c90 EFLAGS: 00010246
[ 0.108310] RAX: 0000000000000000 RBX: ffffffff81a4fa80 RCX: ffff880007c03bb0
[ 0.109423] RDX: 0000000000000003 RSI: ffffffff81e5ade8 RDI: 0000000000000001
[ 0.110000] RBP: ffff880007c03ce0 R08: ffffffff81e5ade0 R09: 000000000000005a
[ 0.110000] R10: ffffea0000000008 R11: 0000000000000000 R12: 0000000000000000
[ 0.110000] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880007946480
[ 0.110000] FS: 0000000000000000(0000) GS:ffff880007c00000(0000) knlGS:0000000000000000
[ 0.110000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 0.110000] CR2: 0000000000000000 CR3: 0000000001a23000 CR4: 00000000000006f0
[ 0.110000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.110000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 0.110000] Process swapper (pid: 1, threadinfo ffff880007ac2000, task ffff880007ac8000)
[ 0.110000] Stack:
[ 0.110000] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.110000] 0000000000000000 0000000000000000 ffffffff81a4fa80 ffff880007946330
[ 0.110000] 0000000000000000 ffff880007946480 ffff880007c03cf0 ffffffff81085910
[ 0.110000] Call Trace:
[ 0.110000] <IRQ>
[ 0.110000] [<ffffffff81085910>] queue_work_on+0x16/0x1d
[ 0.110000] [<ffffffff81085abc>] queue_work+0x29/0x55
[ 0.110000] [<ffffffff81085afb>] schedule_work+0x13/0x15
[ 0.110000] [<ffffffff81242de1>] free_object+0x90/0x95
[ 0.110000] [<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3
[ 0.110000] [<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d
[ 0.110000] [<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d
[ 0.110000] [<ffffffff8110890c>] kmem_cache_free+0x64/0x12c
[ 0.110000] [<ffffffff8110bd14>] free_object_rcu+0x68/0x6d
[ 0.110000] [<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9
[ 0.110000] [<ffffffff81095c9f>] ? tick_handle_periodic+0x1f/0x6c
[ 0.110000] [<ffffffff810b5a5a>] rcu_process_callbacks+0x7b/0x83
[ 0.110000] [<ffffffff810733b2>] __do_softirq+0x117/0x207
[ 0.110000] [<ffffffff810b05d3>] ? handle_irq_event+0x47/0x5c
[ 0.110000] [<ffffffff814bd0cc>] call_softirq+0x1c/0x30
[ 0.110000] [<ffffffff81034bc4>] do_softirq+0x38/0x80
[ 0.110000] [<ffffffff810730ed>] irq_exit+0x4e/0xa0
[ 0.110000] [<ffffffff8103429a>] do_IRQ+0x97/0xae
[ 0.110000] [<ffffffff814b6853>] common_interrupt+0x13/0x13
[ 0.110000] <EOI>
[ 0.110000] [<ffffffff8123d25f>] ? delay_tsc+0x48/0xcb
[ 0.110000] [<ffffffff8123d32a>] __const_udelay+0x25/0x27
[ 0.110000] [<ffffffff81aaae16>] timer_irq_works+0x3c/0x77
[ 0.110000] [<ffffffff81aab422>] setup_IO_APIC+0x337/0x755
[ 0.110000] [<ffffffff81aa956c>] native_smp_prepare_cpus+0x3a0/0x451
[ 0.110000] [<ffffffff814b64b9>] ? _raw_spin_unlock_irq+0x19/0x34
[ 0.110000] [<ffffffff81a9e52b>] kernel_init+0x4e/0x135
[ 0.110000] [<ffffffff8123e0fe>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[ 0.110000] [<ffffffff814bcfd4>] kernel_thread_helper+0x4/0x10
[ 0.110000] [<ffffffff81063d70>] ? finish_task_switch+0x5a/0xcb
[ 0.110000] [<ffffffff814b64b9>] ? _raw_spin_unlock_irq+0x19/0x34
[ 0.110000] [<ffffffff814b68b1>] ? retint_restore_args+0xe/0xe
[ 0.110000] [<ffffffff81a9e4dd>] ? parse_early_options+0x20/0x20
[ 0.110000] [<ffffffff814bcfd0>] ? gs_change+0xb/0xb
[ 0.110000] Code: c9 c3 55 48 89 e5 41 57 41 56 41 55 49 89 f5 41 54 48 c7 c6 a0 b7 a3 81 53 41 89 fc 48 83 ec 28 48 89 d3 48 89 d7 e8 63 d7 1b 00
[ 0.110000] f6 45 00 40 0f 84 6b 01 00 00 b8 09 00 00 00 83 3d 28 10 a0
[ 0.110000] RIP [<ffffffff810854d1>] __queue_work+0x29/0x41a
[ 0.110000] RSP <ffff880007c03c90>
[ 0.110000] CR2: 0000000000000000
[ 0.110005] ---[ end trace 4eaa2a86a8e2da22 ]---
[ 0.110766] Kernel panic - not syncing: Fatal exception in interrupt


The problem is: debugobjects want to use workqueues (system_wq actually), but they
are initialized much later in a boot process.

Attached patch fixes this issue for me.


diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 9d86e45..a78b7c6 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -198,7 +198,7 @@ static void free_object(struct debug_obj *obj)
* initialized:
*/
if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
- sched = !work_pending(&debug_obj_work);
+ sched = keventd_up() && !work_pending(&debug_obj_work);
hlist_add_head(&obj->node, &obj_pool);
obj_pool_free++;
obj_pool_used--;

2011-05-27 20:38:06

by Thomas Gleixner

[permalink] [raw]
Subject: Re: early kernel crash when kmemleak is enabled

On Fri, 27 May 2011, Marcin Slusarz wrote:
> On Thu, May 19, 2011 at 03:08:53PM +0100, Catalin Marinas wrote:
> > On Thu, 2011-05-19 at 14:54 +0100, Tejun Heo wrote:
> > > On Thu, May 19, 2011 at 02:48:44PM +0100, Catalin Marinas wrote:
> > > > Thanks for tracking this down. Untested (I can add a log afterwards):
> > > >
> > > > diff --git a/init/main.c b/init/main.c
> > > > index 4a9479e..48df882 100644
> > > > --- a/init/main.c
> > > > +++ b/init/main.c
> > > > @@ -580,8 +580,8 @@ asmlinkage void __init start_kernel(void)
> > > > #endif
> > > > page_cgroup_init();
> > > > enable_debug_pagealloc();
> > > > - kmemleak_init();
> > > > debug_objects_mem_init();
> > > > + kmemleak_init();
> > > > setup_per_cpu_pageset();
> > > > numa_policy_init();
> > > > if (late_time_init)
> > >
> > > Heh, that was swift. Yeap, seems to work here. Please feel free to
> > > add my Tested-by.
> >
> > Thanks. I have two other minor kmemleak fixes, so I'll send Linus a pull
> > request in the next day or so.
> >
>
> With this patch applied kernel didn't panic, but kmemleak did not work either:
>
> kmemleak: Early log buffer exceeded, please increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE
> kmemleak: Kernel memory leak detector disabled
>
> I increased DEBUG_KMEMLEAK_EARLY_LOG_SIZE from 400 to 1000, and it crashed in
> exactly the same way:

...

> The problem is: debugobjects want to use workqueues (system_wq actually), but they
> are initialized much later in a boot process.
>
> Attached patch fixes this issue for me.
>
>
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 9d86e45..a78b7c6 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -198,7 +198,7 @@ static void free_object(struct debug_obj *obj)
> * initialized:
> */
> if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
> - sched = !work_pending(&debug_obj_work);
> + sched = keventd_up() && !work_pending(&debug_obj_work);
> hlist_add_head(&obj->node, &obj_pool);
> obj_pool_free++;
> obj_pool_used--;
>

Sigh, yes. Care to resend with changelog and signed-off-by ?

Thanks,

tglx

2011-05-28 11:23:54

by Marcin Slusarz

[permalink] [raw]
Subject: [PATCH] debugobjects: fix boot crash when both kmemleak and debugobjects are enabled (was: Re: early kernel crash when kmemleak is enabled)

On Fri, May 27, 2011 at 10:37:54PM +0200, Thomas Gleixner wrote:
> On Fri, 27 May 2011, Marcin Slusarz wrote:
> > On Thu, May 19, 2011 at 03:08:53PM +0100, Catalin Marinas wrote:
> > > On Thu, 2011-05-19 at 14:54 +0100, Tejun Heo wrote:
> > > > On Thu, May 19, 2011 at 02:48:44PM +0100, Catalin Marinas wrote:
> > > > > Thanks for tracking this down. Untested (I can add a log afterwards):
> > > > >
> > > > > diff --git a/init/main.c b/init/main.c
> > > > > index 4a9479e..48df882 100644
> > > > > --- a/init/main.c
> > > > > +++ b/init/main.c
> > > > > @@ -580,8 +580,8 @@ asmlinkage void __init start_kernel(void)
> > > > > #endif
> > > > > page_cgroup_init();
> > > > > enable_debug_pagealloc();
> > > > > - kmemleak_init();
> > > > > debug_objects_mem_init();
> > > > > + kmemleak_init();
> > > > > setup_per_cpu_pageset();
> > > > > numa_policy_init();
> > > > > if (late_time_init)
> > > >
> > > > Heh, that was swift. Yeap, seems to work here. Please feel free to
> > > > add my Tested-by.
> > >
> > > Thanks. I have two other minor kmemleak fixes, so I'll send Linus a pull
> > > request in the next day or so.
> > >
> >
> > With this patch applied kernel didn't panic, but kmemleak did not work either:
> >
> > kmemleak: Early log buffer exceeded, please increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE
> > kmemleak: Kernel memory leak detector disabled
> >
> > I increased DEBUG_KMEMLEAK_EARLY_LOG_SIZE from 400 to 1000, and it crashed in
> > exactly the same way:
>
> ...
>
> > The problem is: debugobjects want to use workqueues (system_wq actually), but they
> > are initialized much later in a boot process.
> >
> > Attached patch fixes this issue for me.
> >
> >
> > diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> > index 9d86e45..a78b7c6 100644
> > --- a/lib/debugobjects.c
> > +++ b/lib/debugobjects.c
> > @@ -198,7 +198,7 @@ static void free_object(struct debug_obj *obj)
> > * initialized:
> > */
> > if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
> > - sched = !work_pending(&debug_obj_work);
> > + sched = keventd_up() && !work_pending(&debug_obj_work);
> > hlist_add_head(&obj->node, &obj_pool);
> > obj_pool_free++;
> > obj_pool_used--;
> >
>
> Sigh, yes. Care to resend with changelog and signed-off-by ?
>

Sure.

---
From: Marcin Slusarz <[email protected]>
Subject: [PATCH] debugobjects: fix boot crash when both kmemleak and debugobjects are enabled

order of initialization look like this:

...
debugobjects
kmemleak
...(lots of other subsystems)...
workqueues (through early initcall)
...

debugobjects use schedule_work for batch freeing of its data and kmemleak
heavily use debugobjects, so when it comes to freeing and workqueues were
not initialized yet, kernel crashes:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a
PGD 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
(...)
Pid: 1, comm: swapper Not tainted 2.6.39-rc4-nv+ #721 Bochs Bochs
RIP: 0010:[<ffffffff810854d1>] [<ffffffff810854d1>] __queue_work+0x29/0x41a
(...)
Call Trace:
<IRQ>
[<ffffffff81085910>] queue_work_on+0x16/0x1d
[<ffffffff81085abc>] queue_work+0x29/0x55
[<ffffffff81085afb>] schedule_work+0x13/0x15
[<ffffffff81242de1>] free_object+0x90/0x95
[<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3
[<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d
[<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d
[<ffffffff8110890c>] kmem_cache_free+0x64/0x12c
[<ffffffff8110bd14>] free_object_rcu+0x68/0x6d
[<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9
[<ffffffff81095c9f>] ? tick_handle_periodic+0x1f/0x6c
[<ffffffff810b5a5a>] rcu_process_callbacks+0x7b/0x83
[<ffffffff810733b2>] __do_softirq+0x117/0x207
[<ffffffff810b05d3>] ? handle_irq_event+0x47/0x5c
[<ffffffff814bd0cc>] call_softirq+0x1c/0x30
[<ffffffff81034bc4>] do_softirq+0x38/0x80
[<ffffffff810730ed>] irq_exit+0x4e/0xa0
[<ffffffff8103429a>] do_IRQ+0x97/0xae
[<ffffffff814b6853>] common_interrupt+0x13/0x13
<EOI>
[<ffffffff8123d25f>] ? delay_tsc+0x48/0xcb
[<ffffffff8123d32a>] __const_udelay+0x25/0x27
[<ffffffff81aaae16>] timer_irq_works+0x3c/0x77
[<ffffffff81aab422>] setup_IO_APIC+0x337/0x755
[<ffffffff81aa956c>] native_smp_prepare_cpus+0x3a0/0x451
[<ffffffff814b64b9>] ? _raw_spin_unlock_irq+0x19/0x34
[<ffffffff81a9e52b>] kernel_init+0x4e/0x135
[<ffffffff8123e0fe>] ? trace_hardirqs_on_thunk+0x3a/0x3c
[<ffffffff814bcfd4>] kernel_thread_helper+0x4/0x10
[<ffffffff81063d70>] ? finish_task_switch+0x5a/0xcb
[<ffffffff814b64b9>] ? _raw_spin_unlock_irq+0x19/0x34
[<ffffffff814b68b1>] ? retint_restore_args+0xe/0xe
[<ffffffff81a9e4dd>] ? parse_early_options+0x20/0x20
[<ffffffff814bcfd0>] ? gs_change+0xb/0xb
Code: c9 c3 55 48 89 e5 41 57 41 56 41 55 49 89 f5 41 54 48 c7 c6 a0 b7 a3 81 53 41 89 fc 48 83 ec 28 48 89 d3 48 89 d7 e8 63 d7 1b 00
f6 45 00 40 0f 84 6b 01 00 00 b8 09 00 00 00 83 3d 28 10 a0
RIP [<ffffffff810854d1>] __queue_work+0x29/0x41a
RSP <ffff880007c03c90>
CR2: 0000000000000000
---[ end trace 4eaa2a86a8e2da22 ]---
Kernel panic - not syncing: Fatal exception in interrupt

...because system_wq is NULL.

Fix it by checking if workqueues susbystem was initialized before using.

Signed-off-by: Marcin Slusarz <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: [email protected]
---
lib/debugobjects.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 9d86e45..a78b7c6 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -198,7 +198,7 @@ static void free_object(struct debug_obj *obj)
* initialized:
*/
if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
- sched = !work_pending(&debug_obj_work);
+ sched = keventd_up() && !work_pending(&debug_obj_work);
hlist_add_head(&obj->node, &obj_pool);
obj_pool_free++;
obj_pool_used--;
--
1.7.4.1

2011-06-19 20:43:40

by Marcin Slusarz

[permalink] [raw]
Subject: Re: [PATCH] debugobjects: fix boot crash when both kmemleak and debugobjects are enabled (was: Re: early kernel crash when kmemleak is enabled)

On Sat, May 28, 2011 at 01:23:42PM +0200, Marcin Slusarz wrote:
> On Fri, May 27, 2011 at 10:37:54PM +0200, Thomas Gleixner wrote:
> > On Fri, 27 May 2011, Marcin Slusarz wrote:
> > > On Thu, May 19, 2011 at 03:08:53PM +0100, Catalin Marinas wrote:
> > > > On Thu, 2011-05-19 at 14:54 +0100, Tejun Heo wrote:
> > > > > On Thu, May 19, 2011 at 02:48:44PM +0100, Catalin Marinas wrote:
> > > > > > Thanks for tracking this down. Untested (I can add a log afterwards):
> > > > > >
> > > > > > diff --git a/init/main.c b/init/main.c
> > > > > > index 4a9479e..48df882 100644
> > > > > > --- a/init/main.c
> > > > > > +++ b/init/main.c
> > > > > > @@ -580,8 +580,8 @@ asmlinkage void __init start_kernel(void)
> > > > > > #endif
> > > > > > page_cgroup_init();
> > > > > > enable_debug_pagealloc();
> > > > > > - kmemleak_init();
> > > > > > debug_objects_mem_init();
> > > > > > + kmemleak_init();
> > > > > > setup_per_cpu_pageset();
> > > > > > numa_policy_init();
> > > > > > if (late_time_init)
> > > > >
> > > > > Heh, that was swift. Yeap, seems to work here. Please feel free to
> > > > > add my Tested-by.
> > > >
> > > > Thanks. I have two other minor kmemleak fixes, so I'll send Linus a pull
> > > > request in the next day or so.
> > > >
> > >
> > > With this patch applied kernel didn't panic, but kmemleak did not work either:
> > >
> > > kmemleak: Early log buffer exceeded, please increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE
> > > kmemleak: Kernel memory leak detector disabled
> > >
> > > I increased DEBUG_KMEMLEAK_EARLY_LOG_SIZE from 400 to 1000, and it crashed in
> > > exactly the same way:
> >
> > ...
> >
> > > The problem is: debugobjects want to use workqueues (system_wq actually), but they
> > > are initialized much later in a boot process.
> > >
> > > Attached patch fixes this issue for me.
> > >
> > >
> > > diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> > > index 9d86e45..a78b7c6 100644
> > > --- a/lib/debugobjects.c
> > > +++ b/lib/debugobjects.c
> > > @@ -198,7 +198,7 @@ static void free_object(struct debug_obj *obj)
> > > * initialized:
> > > */
> > > if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
> > > - sched = !work_pending(&debug_obj_work);
> > > + sched = keventd_up() && !work_pending(&debug_obj_work);
> > > hlist_add_head(&obj->node, &obj_pool);
> > > obj_pool_free++;
> > > obj_pool_used--;
> > >
> >
> > Sigh, yes. Care to resend with changelog and signed-off-by ?
> >
>
> Sure.
>
> ---
> From: Marcin Slusarz <[email protected]>
> Subject: [PATCH] debugobjects: fix boot crash when both kmemleak and debugobjects are enabled
>
> order of initialization look like this:
>
> ...
> debugobjects
> kmemleak
> ...(lots of other subsystems)...
> workqueues (through early initcall)
> ...
>
> debugobjects use schedule_work for batch freeing of its data and kmemleak
> heavily use debugobjects, so when it comes to freeing and workqueues were
> not initialized yet, kernel crashes:
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a
> PGD 0
> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> (...)
> Pid: 1, comm: swapper Not tainted 2.6.39-rc4-nv+ #721 Bochs Bochs
> RIP: 0010:[<ffffffff810854d1>] [<ffffffff810854d1>] __queue_work+0x29/0x41a
> (...)
> Call Trace:
> <IRQ>
> [<ffffffff81085910>] queue_work_on+0x16/0x1d
> [<ffffffff81085abc>] queue_work+0x29/0x55
> [<ffffffff81085afb>] schedule_work+0x13/0x15
> [<ffffffff81242de1>] free_object+0x90/0x95
> [<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3
> [<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d
> [<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d
> [<ffffffff8110890c>] kmem_cache_free+0x64/0x12c
> [<ffffffff8110bd14>] free_object_rcu+0x68/0x6d
> [<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9
> [<ffffffff81095c9f>] ? tick_handle_periodic+0x1f/0x6c
> [<ffffffff810b5a5a>] rcu_process_callbacks+0x7b/0x83
> [<ffffffff810733b2>] __do_softirq+0x117/0x207
> [<ffffffff810b05d3>] ? handle_irq_event+0x47/0x5c
> [<ffffffff814bd0cc>] call_softirq+0x1c/0x30
> [<ffffffff81034bc4>] do_softirq+0x38/0x80
> [<ffffffff810730ed>] irq_exit+0x4e/0xa0
> [<ffffffff8103429a>] do_IRQ+0x97/0xae
> [<ffffffff814b6853>] common_interrupt+0x13/0x13
> <EOI>
> [<ffffffff8123d25f>] ? delay_tsc+0x48/0xcb
> [<ffffffff8123d32a>] __const_udelay+0x25/0x27
> [<ffffffff81aaae16>] timer_irq_works+0x3c/0x77
> [<ffffffff81aab422>] setup_IO_APIC+0x337/0x755
> [<ffffffff81aa956c>] native_smp_prepare_cpus+0x3a0/0x451
> [<ffffffff814b64b9>] ? _raw_spin_unlock_irq+0x19/0x34
> [<ffffffff81a9e52b>] kernel_init+0x4e/0x135
> [<ffffffff8123e0fe>] ? trace_hardirqs_on_thunk+0x3a/0x3c
> [<ffffffff814bcfd4>] kernel_thread_helper+0x4/0x10
> [<ffffffff81063d70>] ? finish_task_switch+0x5a/0xcb
> [<ffffffff814b64b9>] ? _raw_spin_unlock_irq+0x19/0x34
> [<ffffffff814b68b1>] ? retint_restore_args+0xe/0xe
> [<ffffffff81a9e4dd>] ? parse_early_options+0x20/0x20
> [<ffffffff814bcfd0>] ? gs_change+0xb/0xb
> Code: c9 c3 55 48 89 e5 41 57 41 56 41 55 49 89 f5 41 54 48 c7 c6 a0 b7 a3 81 53 41 89 fc 48 83 ec 28 48 89 d3 48 89 d7 e8 63 d7 1b 00
> f6 45 00 40 0f 84 6b 01 00 00 b8 09 00 00 00 83 3d 28 10 a0
> RIP [<ffffffff810854d1>] __queue_work+0x29/0x41a
> RSP <ffff880007c03c90>
> CR2: 0000000000000000
> ---[ end trace 4eaa2a86a8e2da22 ]---
> Kernel panic - not syncing: Fatal exception in interrupt
>
> ...because system_wq is NULL.
>
> Fix it by checking if workqueues susbystem was initialized before using.
>
> Signed-off-by: Marcin Slusarz <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Tejun Heo <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: [email protected]
> ---
> lib/debugobjects.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 9d86e45..a78b7c6 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -198,7 +198,7 @@ static void free_object(struct debug_obj *obj)
> * initialized:
> */
> if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
> - sched = !work_pending(&debug_obj_work);
> + sched = keventd_up() && !work_pending(&debug_obj_work);
> hlist_add_head(&obj->node, &obj_pool);
> obj_pool_free++;
> obj_pool_used--;
> --


What's up with this patch? I can't find it in -next...

Marcin

2011-06-20 12:43:48

by Marcin Slusarz

[permalink] [raw]
Subject: [tip:core/urgent] debugobjects: Fix boot crash when kmemleak and debugobjects enabled

Commit-ID: 161b6ae0e067e421b20bb35caf66bdb405c929ac
Gitweb: http://git.kernel.org/tip/161b6ae0e067e421b20bb35caf66bdb405c929ac
Author: Marcin Slusarz <[email protected]>
AuthorDate: Sat, 28 May 2011 13:23:42 +0200
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 20 Jun 2011 14:38:43 +0200

debugobjects: Fix boot crash when kmemleak and debugobjects enabled

Order of initialization look like this:
...
debugobjects
kmemleak
...(lots of other subsystems)...
workqueues (through early initcall)
...

debugobjects use schedule_work for batch freeing of its data and kmemleak
heavily use debugobjects, so when it comes to freeing and workqueues were
not initialized yet, kernel crashes:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a
[<ffffffff81085910>] queue_work_on+0x16/0x1d
[<ffffffff81085abc>] queue_work+0x29/0x55
[<ffffffff81085afb>] schedule_work+0x13/0x15
[<ffffffff81242de1>] free_object+0x90/0x95
[<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3
[<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d
[<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d
[<ffffffff8110890c>] kmem_cache_free+0x64/0x12c
[<ffffffff8110bd14>] free_object_rcu+0x68/0x6d
[<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9
...

because system_wq is NULL.

Fix it by checking if workqueues susbystem was initialized before using.

Signed-off-by: Marcin Slusarz <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Dipankar Sarma <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
lib/debugobjects.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 9d86e45..a78b7c6 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -198,7 +198,7 @@ static void free_object(struct debug_obj *obj)
* initialized:
*/
if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
- sched = !work_pending(&debug_obj_work);
+ sched = keventd_up() && !work_pending(&debug_obj_work);
hlist_add_head(&obj->node, &obj_pool);
obj_pool_free++;
obj_pool_used--;