2016-10-10 15:38:05

by Qian Cai

Subject: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

Not sure if anyone has reported this before. With this kernel config, the kernel panics on every boot so far with today's
mainline master HEAD.

http://people.redhat.com/qcai/tmp/config-kasan-remove

[ 36.318420] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 36.325626] software IO TLB [mem 0x71c7d000-0x75c7d000] (64MB) mapped at [ffff880071c7d000-ffff880075c7cfff]
[ 36.339108] Intel CQM monitoring enabled
[ 36.343507] Intel MBM enabled
[ 36.358713] RAPL PMU: API unit is 2^-32 Joules, 4 fixed counters, 655360 ms ovfl timer
[ 36.367563] RAPL PMU: hw unit of domain pp0-core 2^-14 Joules
[ 36.373984] RAPL PMU: hw unit of domain package 2^-14 Joules
[ 36.380308] RAPL PMU: hw unit of domain dram 2^-14 Joules
[ 36.386337] RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
[ 36.410064] kasan: CONFIG_KASAN_INLINE enabled
[ 36.415042] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 36.423111] general protection fault: 0000 [#1] PREEMPT SMP KASAN
[ 36.429911] Modules linked in:
[ 36.433331] CPU: 48 PID: 1 Comm: swapper/0 Not tainted 4.8.0remove+ #4
[ 36.440616] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRRFSDP1.86B.0271.R00.1510301446 10/30/2015
[ 36.451974] task: ffff880e524d0000 task.stack: ffff880852880000
[ 36.458578] RIP: 0010:[<ffffffff81ea08c0>] [<ffffffff81ea08c0>] device_del+0x80/0x700
[ 36.467431] RSP: 0000:ffff880852887938 EFLAGS: 00010246
[ 36.473357] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff10109e6f101
[ 36.481319] RDX: dffffc0000000000 RSI: 000000000000000b RDI: 0000000000000000
[ 36.489281] RBP: ffff8808528879e8 R08: 0000000000000001 R09: 0000000000000000
[ 36.497243] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880e501b4b00
[ 36.505208] R13: ffff880e31988480 R14: 0000000000000001 R15: ffff880e31988480
[ 36.513171] FS: 0000000000000000(0000) GS:ffff88085ec80000(0000) knlGS:0000000000000000
[ 36.522201] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 36.528613] CR2: 0000000000000000 CR3: 0000000002e0a000 CR4: 00000000003406e0
[ 36.536576] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 36.544537] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 36.552499] Stack:
[ 36.554742] 1ffff1010a510f28 1ffff1010a510f2c ffffffff82d3abe4 ffffffff81a6d060
[ 36.563037] 0000000000000296 0000000041b58ab3 ffffffff82d48cc5 ffffffff81ea0840
[ 36.571329] ffffffff828a3040 ffff880800000000 ffff880852887980 ffffffff82f0ba20
[ 36.579624] Call Trace:
[ 36.582355] [<ffffffff81a6d060>] ? idr_mark_full+0xc0/0xc0
[ 36.588573] [<ffffffff81ea0840>] ? cleanup_glue_dir+0xe0/0xe0
[ 36.595086] [<ffffffff814c228d>] perf_pmu_unregister+0x18d/0x530
[ 36.601890] [<ffffffff826f8811>] ? _raw_spin_unlock+0x31/0x50
[ 36.608393] [<ffffffff8103c54e>] ? uncore_pcibus_to_physid+0x10e/0x1c0
[ 36.615766] [<ffffffff810418ee>] uncore_pci_remove+0x24e/0x440
[ 36.622375] [<ffffffff81b91662>] pci_device_remove+0xa2/0x1e0
[ 36.628888] [<ffffffff81eadd01>] driver_probe_device+0x171/0xd50
[ 36.635688] [<ffffffff81eae8e0>] ? driver_probe_device+0xd50/0xd50
[ 36.642685] [<ffffffff81eaea79>] __driver_attach+0x199/0x1e0
[ 36.649097] [<ffffffff81ea7fc6>] bus_for_each_dev+0x126/0x1e0
[ 36.655607] [<ffffffff81ea7ea0>] ? subsys_dev_iter_exit+0x10/0x10
[ 36.662508] [<ffffffff812103ae>] ? preempt_count_sub+0x5e/0xe0
[ 36.669105] [<ffffffff81eacc1d>] driver_attach+0x3d/0x50
[ 36.675129] [<ffffffff81eabd84>] bus_add_driver+0x554/0x790
[ 36.681444] [<ffffffff81eb067c>] driver_register+0x18c/0x3b0
[ 36.687861] [<ffffffff812b3212>] ? __raw_spin_lock_init+0x32/0x100
[ 36.694854] [<ffffffff81b8bbea>] __pci_register_driver+0x13a/0x1e0
[ 36.701853] [<ffffffff83492467>] intel_uncore_init+0x465/0x54f
[ 36.708459] [<ffffffff83492002>] ? uncore_type_init+0x4d6/0x4d6
[ 36.715165] [<ffffffff81002299>] do_one_initcall+0xa9/0x240
[ 36.721473] [<ffffffff810021f0>] ? initcall_blacklisted+0x180/0x180
[ 36.728568] [<ffffffff811f5a10>] ? parse_args+0x520/0x990
[ 36.734692] [<ffffffff811d5bc2>] ? __usermodehelper_set_disable_depth+0x42/0x50
[ 36.742948] [<ffffffff83485d1f>] kernel_init_freeable+0x540/0x610
[ 36.749845] [<ffffffff834857df>] ? start_kernel+0x70d/0x70d
[ 36.756161] [<ffffffff826f88ad>] ? _raw_spin_unlock_irq+0x3d/0x60
[ 36.763060] [<ffffffff8120eb19>] ? finish_task_switch+0x189/0x6c0
[ 36.769957] [<ffffffff8120eaeb>] ? finish_task_switch+0x15b/0x6c0
[ 36.776857] [<ffffffff826e0060>] ? rest_init+0x160/0x160
[ 36.782875] [<ffffffff826e0073>] kernel_init+0x13/0x120
[ 36.788802] [<ffffffff826e0060>] ? rest_init+0x160/0x160
[ 36.794826] [<ffffffff826f93ba>] ret_from_fork+0x2a/0x40
[ 36.800851] Code: 81 c7 00 f1 f1 f1 f1 c7 40 04 00 07 f4 f4 c7 40 08 f3 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 89 f8 48 c1 e8 03 <80> 3c 10 00 0f 85 1a 06 00 00 48 8b 03 48 89 85 68 ff ff ff 48
[ 36.822549] RIP [<ffffffff81ea08c0>] device_del+0x80/0x700
[ 36.828778] RSP <ffff880852887938>
[ 36.832743] ---[ end trace f3cec3a0c6cb2258 ]---
[ 36.838054] Kernel panic - not syncing: Fatal exception
[ 36.843967] ---[ end Kernel panic - not syncing: Fatal exception
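
The "kasan: GPF could be caused by NULL-ptr deref" line follows from how inline KASAN checks memory: before an access, the compiler emits a load of the shadow byte for that address. The bytes around the fault marker (`48 89 f8 48 c1 e8 03 <80> 3c 10 00`) look like that check: RDI is copied to RAX, shifted right by 3, and a byte is read at RAX+RDX, with RDX holding dffffc0000000000. The sketch below redoes that arithmetic. The constant names and the 48-bit canonical-address test are assumptions based on the usual x86_64 layout, not something stated in the report.

```python
# Assumed x86_64 KASAN parameters; the offset equals RDX in the trace.
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000
KASAN_SHADOW_SCALE_SHIFT = 3
MASK64 = (1 << 64) - 1


def shadow_address(addr: int) -> int:
    """Return the address of the shadow byte that describes `addr`."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & MASK64


def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all equal (48-bit virtual addressing)."""
    top = addr >> 47
    return top in (0, 0x1FFFF)


# RDI in the trace is 0, i.e. device_del() was handed a NULL pointer.
null_shadow = shadow_address(0x0)
# For comparison: the 'task:' pointer from the same trace.
task_shadow = shadow_address(0xFFFF880E524D0000)

print(hex(null_shadow), is_canonical(null_shadow))
print(hex(task_shadow), is_canonical(task_shadow))
```

Under these assumptions the shadow address for a NULL pointer is non-canonical, so the CPU raises a general protection fault instead of a page fault. The same bug would be expected to appear as an ordinary NULL pointer dereference in a build without KASAN.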


2016-10-10 17:10:09

by Rob Herring (Arm)

Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

On Mon, Oct 10, 2016 at 10:37 AM, CAI Qian <[email protected]> wrote:
> Not sure if anyone reported this before. With this kernel config, it is 100% kernel panic so far with today's
> mainline master HEAD.

Looks like it is catching what it is supposed to. Though looking
through the code, I haven't found where the problem is. Do bind and
unbind normally work for this driver?

> http://people.redhat.com/qcai/tmp/config-kasan-remove
>
> [ 36.318420] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> [ 36.325626] software IO TLB [mem 0x71c7d000-0x75c7d000] (64MB) mapped at [ffff880071c7d000-ffff880075c7cfff]
> [ 36.339108] Intel CQM monitoring enabled
> [ 36.343507] Intel MBM enabled
> [ 36.358713] RAPL PMU: API unit is 2^-32 Joules, 4 fixed counters, 655360 ms ovfl timer
> [ 36.367563] RAPL PMU: hw unit of domain pp0-core 2^-14 Joules
> [ 36.373984] RAPL PMU: hw unit of domain package 2^-14 Joules
> [ 36.380308] RAPL PMU: hw unit of domain dram 2^-14 Joules
> [ 36.386337] RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
> [ 36.410064] kasan: CONFIG_KASAN_INLINE enabled
> [ 36.415042] kasan: GPF could be caused by NULL-ptr deref or user memory access
> [ 36.423111] general protection fault: 0000 [#1] PREEMPT SMP KASAN
> [ 36.429911] Modules linked in:
> [ 36.433331] CPU: 48 PID: 1 Comm: swapper/0 Not tainted 4.8.0remove+ #4
> [ 36.440616] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRRFSDP1.86B.0271.R00.1510301446 10/30/2015
> [ 36.451974] task: ffff880e524d0000 task.stack: ffff880852880000
> [ 36.458578] RIP: 0010:[<ffffffff81ea08c0>] [<ffffffff81ea08c0>] device_del+0x80/0x700
> [ 36.467431] RSP: 0000:ffff880852887938 EFLAGS: 00010246
> [ 36.473357] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff10109e6f101
> [ 36.481319] RDX: dffffc0000000000 RSI: 000000000000000b RDI: 0000000000000000
> [ 36.489281] RBP: ffff8808528879e8 R08: 0000000000000001 R09: 0000000000000000
> [ 36.497243] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880e501b4b00
> [ 36.505208] R13: ffff880e31988480 R14: 0000000000000001 R15: ffff880e31988480
> [ 36.513171] FS: 0000000000000000(0000) GS:ffff88085ec80000(0000) knlGS:0000000000000000
> [ 36.522201] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 36.528613] CR2: 0000000000000000 CR3: 0000000002e0a000 CR4: 00000000003406e0
> [ 36.536576] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 36.544537] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 36.552499] Stack:
> [ 36.554742] 1ffff1010a510f28 1ffff1010a510f2c ffffffff82d3abe4 ffffffff81a6d060
> [ 36.563037] 0000000000000296 0000000041b58ab3 ffffffff82d48cc5 ffffffff81ea0840
> [ 36.571329] ffffffff828a3040 ffff880800000000 ffff880852887980 ffffffff82f0ba20
> [ 36.579624] Call Trace:
> [ 36.582355] [<ffffffff81a6d060>] ? idr_mark_full+0xc0/0xc0
> [ 36.588573] [<ffffffff81ea0840>] ? cleanup_glue_dir+0xe0/0xe0
> [ 36.595086] [<ffffffff814c228d>] perf_pmu_unregister+0x18d/0x530
> [ 36.601890] [<ffffffff826f8811>] ? _raw_spin_unlock+0x31/0x50
> [ 36.608393] [<ffffffff8103c54e>] ? uncore_pcibus_to_physid+0x10e/0x1c0
> [ 36.615766] [<ffffffff810418ee>] uncore_pci_remove+0x24e/0x440
> [ 36.622375] [<ffffffff81b91662>] pci_device_remove+0xa2/0x1e0
> [ 36.628888] [<ffffffff81eadd01>] driver_probe_device+0x171/0xd50
> [ 36.635688] [<ffffffff81eae8e0>] ? driver_probe_device+0xd50/0xd50
> [ 36.642685] [<ffffffff81eaea79>] __driver_attach+0x199/0x1e0
> [ 36.649097] [<ffffffff81ea7fc6>] bus_for_each_dev+0x126/0x1e0
> [ 36.655607] [<ffffffff81ea7ea0>] ? subsys_dev_iter_exit+0x10/0x10
> [ 36.662508] [<ffffffff812103ae>] ? preempt_count_sub+0x5e/0xe0
> [ 36.669105] [<ffffffff81eacc1d>] driver_attach+0x3d/0x50
> [ 36.675129] [<ffffffff81eabd84>] bus_add_driver+0x554/0x790
> [ 36.681444] [<ffffffff81eb067c>] driver_register+0x18c/0x3b0
> [ 36.687861] [<ffffffff812b3212>] ? __raw_spin_lock_init+0x32/0x100
> [ 36.694854] [<ffffffff81b8bbea>] __pci_register_driver+0x13a/0x1e0
> [ 36.701853] [<ffffffff83492467>] intel_uncore_init+0x465/0x54f
> [ 36.708459] [<ffffffff83492002>] ? uncore_type_init+0x4d6/0x4d6
> [ 36.715165] [<ffffffff81002299>] do_one_initcall+0xa9/0x240
> [ 36.721473] [<ffffffff810021f0>] ? initcall_blacklisted+0x180/0x180
> [ 36.728568] [<ffffffff811f5a10>] ? parse_args+0x520/0x990
> [ 36.734692] [<ffffffff811d5bc2>] ? __usermodehelper_set_disable_depth+0x42/0x50
> [ 36.742948] [<ffffffff83485d1f>] kernel_init_freeable+0x540/0x610
> [ 36.749845] [<ffffffff834857df>] ? start_kernel+0x70d/0x70d
> [ 36.756161] [<ffffffff826f88ad>] ? _raw_spin_unlock_irq+0x3d/0x60
> [ 36.763060] [<ffffffff8120eb19>] ? finish_task_switch+0x189/0x6c0
> [ 36.769957] [<ffffffff8120eaeb>] ? finish_task_switch+0x15b/0x6c0
> [ 36.776857] [<ffffffff826e0060>] ? rest_init+0x160/0x160
> [ 36.782875] [<ffffffff826e0073>] kernel_init+0x13/0x120
> [ 36.788802] [<ffffffff826e0060>] ? rest_init+0x160/0x160
> [ 36.794826] [<ffffffff826f93ba>] ret_from_fork+0x2a/0x40
> [ 36.800851] Code: 81 c7 00 f1 f1 f1 f1 c7 40 04 00 07 f4 f4 c7 40 08 f3 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 89 f8 48 c1 e8 03 <80> 3c 10 00 0f 85 1a 06 00 00 48 8b 03 48 89 85 68 ff ff ff 48
> [ 36.822549] RIP [<ffffffff81ea08c0>] device_del+0x80/0x700
> [ 36.828778] RSP <ffff880852887938>
> [ 36.832743] ---[ end trace f3cec3a0c6cb2258 ]---
> [ 36.838054] Kernel panic - not syncing: Fatal exception
> [ 36.843967] ---[ end Kernel panic - not syncing: Fatal exception

2016-10-10 17:20:24

by Greg Kroah-Hartman

Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

On Mon, Oct 10, 2016 at 11:37:27AM -0400, CAI Qian wrote:
> Not sure if anyone reported this before. With this kernel config, it is 100% kernel panic so far with today's
> mainline master HEAD.
>
> http://people.redhat.com/qcai/tmp/config-kasan-remove

Oh it breaks things with kasan disabled as well :)

See Laszlo's bug report from a few hours ago; Rob is on it...

2016-10-10 18:15:55

by Rob Herring (Arm)

Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

On Mon, Oct 10, 2016 at 12:20 PM, Greg Kroah-Hartman
<[email protected]> wrote:
> On Mon, Oct 10, 2016 at 11:37:27AM -0400, CAI Qian wrote:
>> Not sure if anyone reported this before. With this kernel config, it is 100% kernel panic so far with today's
>> mainline master HEAD.
>>
>> http://people.redhat.com/qcai/tmp/config-kasan-remove
>
> Oh it breaks things with kasan disabled as well :)
>
> See Laszlo's bug report already a few hours ago, Rob is on it...

I think this one is different though. It has a remove() hook.

Rob
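
To make the remove() remark concrete: the call traces in this thread show pci_device_remove() being called directly from driver_probe_device(), which is consistent with CONFIG_DEBUG_TEST_DRIVER_REMOVE making the driver core probe a device, remove it, and probe it again. The following is only a toy model of that sequence. `FakeDriver`, `really_probe_model`, and the `pmu_dev` field are hypothetical names for illustration, not kernel code, and the reason the device pointer is unset is deliberately left out because the thread does not establish it.

```python
class FakeDriver:
    """Hypothetical driver with probe() and remove() hooks."""

    def __init__(self, remove_checks_for_null: bool):
        self.remove_checks_for_null = remove_checks_for_null
        self.calls = []

    def probe(self, dev: dict) -> int:
        self.calls.append("probe")
        # Model a probe that succeeds but leaves an optional resource unset.
        dev["pmu_dev"] = None
        return 0

    def remove(self, dev: dict) -> None:
        self.calls.append("remove")
        if dev["pmu_dev"] is None and not self.remove_checks_for_null:
            # Stand-in for the crash when a NULL device is torn down.
            raise RuntimeError("remove() used a NULL device")


def really_probe_model(drv: FakeDriver, dev: dict, test_remove: bool) -> int:
    """Probe once; if test_remove is set, remove and probe a second time."""
    while True:
        ret = drv.probe(dev)
        if ret != 0:
            return ret
        if not test_remove:
            return 0
        test_remove = False
        drv.remove(dev)


safe = FakeDriver(remove_checks_for_null=True)
really_probe_model(safe, {}, test_remove=True)

buggy = FakeDriver(remove_checks_for_null=False)
try:
    really_probe_model(buggy, {}, test_remove=True)
    crashed = False
except RuntimeError:
    crashed = True

print(safe.calls, buggy.calls, crashed)
```

In this model a driver without a remove() hook would never reach the failing path, which is one way to read why this report may differ from the earlier one.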

2016-10-10 18:22:30

by Qian Cai

Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic



----- Original Message -----
> From: "Rob Herring" <[email protected]>
> To: "Greg Kroah-Hartman" <[email protected]>
> Cc: "CAI Qian" <[email protected]>, "linux-kernel" <[email protected]>
> Sent: Monday, October 10, 2016 2:15:29 PM
> Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic
>
> On Mon, Oct 10, 2016 at 12:20 PM, Greg Kroah-Hartman
> <[email protected]> wrote:
> > On Mon, Oct 10, 2016 at 11:37:27AM -0400, CAI Qian wrote:
> >> Not sure if anyone reported this before. With this kernel config, it is
> >> 100% kernel panic so far with today's
> >> mainline master HEAD.
> >>
> >> http://people.redhat.com/qcai/tmp/config-kasan-remove
> >
> > Oh it breaks things with kasan disabled as well :)
> >
> > See Laszlo's bug report already a few hours ago, Rob is on it...
>
> I think this one is different though. It has a remove() hook.
FYI, this can also be reproduced without kasan.
CAI Qian

2016-10-10 18:26:48

by Qian Cai

Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic



----- Original Message -----
> From: "Rob Herring" <[email protected]>
> To: "CAI Qian" <[email protected]>
> Cc: "linux-kernel" <[email protected]>, "Greg Kroah-Hartman" <[email protected]>
> Sent: Monday, October 10, 2016 1:09:43 PM
> Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic
>
> On Mon, Oct 10, 2016 at 10:37 AM, CAI Qian <[email protected]> wrote:
> > Not sure if anyone reported this before. With this kernel config, it is
> > 100% kernel panic so far with today's
> > mainline master HEAD.
>
> Looks like it is catching what it is supposed to. Though looking
> through the code, I haven't found where the problem is. Does bind and
> unbind for this normally work?
I am not sure. It just panics at boot. If you tell me which debugging steps
you want run, I can help test it out.
CAI qian

2016-10-10 19:35:22

by Rob Herring (Arm)

Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

On Mon, Oct 10, 2016 at 1:22 PM, CAI Qian <[email protected]> wrote:
>
>
> ----- Original Message -----
>> From: "Rob Herring" <[email protected]>
>> To: "Greg Kroah-Hartman" <[email protected]>
>> Cc: "CAI Qian" <[email protected]>, "linux-kernel" <[email protected]>
>> Sent: Monday, October 10, 2016 2:15:29 PM
>> Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic
>>
>> On Mon, Oct 10, 2016 at 12:20 PM, Greg Kroah-Hartman
>> <[email protected]> wrote:
>> > On Mon, Oct 10, 2016 at 11:37:27AM -0400, CAI Qian wrote:
>> >> Not sure if anyone reported this before. With this kernel config, it is
>> >> 100% kernel panic so far with today's
>> >> mainline master HEAD.
>> >>
>> >> http://people.redhat.com/qcai/tmp/config-kasan-remove
>> >
>> > Oh it breaks things with kasan disabled as well :)
>> >
>> > See Laszlo's bug report already a few hours ago, Rob is on it...
>>
>> I think this one is different though. It has a remove() hook.
> FYI, this can also be reproduced without kasan.

Is the backtrace the same in that case?

Rob

2016-10-10 20:09:51

by Qian Cai

Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic


> Is the backtrace the same in that case?
Very close. I saw "intel" there, so here is the list of those modules on the system.

# lsmod | grep intel
intel_rapl 20480 0
intel_powerclamp 16384 0
kvm_intel 208896 0
kvm 630784 1 kvm_intel
ghash_clmulni_intel 16384 0
aesni_intel 167936 0
lrw 16384 1 aesni_intel
glue_helper 16384 1 aesni_intel
ablk_helper 16384 1 aesni_intel
cryptd 24576 3 ablk_helper,ghash_clmulni_intel,aesni_intel
crc32c_intel 24576 1

[ 17.884926] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 17.893700] IP: [<ffffffff81546ff7>] device_del+0x17/0x280
[ 17.899848] PGD 0
[ 17.902109] Oops: 0000 [#1] PREEMPT SMP
[ 17.906394] Modules linked in:
[ 17.909823] CPU: 68 PID: 1 Comm: swapper/0 Not tainted 4.8.0-remove-nokasan+ #5
[ 17.917985] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRRFSDP1.86B.0271.R00.1510301446 10/30/2015
[ 17.929347] task: ffff8810556c8000 task.stack: ffffc90000078000
[ 17.935955] RIP: 0010:[<ffffffff81546ff7>] [<ffffffff81546ff7>] device_del+0x17/0x280
[ 17.944811] RSP: 0000:ffffc9000007bc00 EFLAGS: 00010286
[ 17.950742] RAX: 0000000000000000 RBX: ffff88085c8e3c00 RCX: 0000000000000001
[ 17.958708] RDX: ffff881059d60000 RSI: 000000000000000b RDI: 0000000000000000
[ 17.966675] RBP: ffffc9000007bc38 R08: 00000000d38c0f63 R09: 0000000000000000
[ 17.974640] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 17.982606] R13: ffff881054099000 R14: 0000000000000001 R15: 0000000000000000
[ 17.990574] FS: 0000000000000000(0000) GS:ffff88105e400000(0000) knlGS:0000000000000000
[ 17.999606] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 18.006022] CR2: 0000000000000000 CR3: 0000000001c06000 CR4: 00000000003406e0
[ 18.013989] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 18.021954] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 18.029919] Stack:
[ 18.032163] 0000000000000000 00000000dd652bd0 ffff88085c8e3c00 ffff88085c8e3c00
[ 18.040475] ffff88085c8e3400 ffff881054099000 0000000000000001 ffffc9000007bc58
[ 18.048788] ffffffff811c9680 ffff88085c8e3c00 ffff88085c8e3400 ffffc9000007bc88
[ 18.057090] Call Trace:
[ 18.059819] [<ffffffff811c9680>] perf_pmu_unregister+0x90/0x150
[ 18.066529] [<ffffffff81017678>] uncore_pci_remove+0xc8/0x160
[ 18.073044] [<ffffffff814428c9>] pci_device_remove+0x39/0xc0
[ 18.079468] [<ffffffff8154bf4e>] driver_probe_device+0xbe/0x4d0
[ 18.086176] [<ffffffff8154c443>] __driver_attach+0xe3/0xf0
[ 18.092399] [<ffffffff8154c360>] ? driver_probe_device+0x4d0/0x4d0
[ 18.099400] [<ffffffff81549b43>] bus_for_each_dev+0x73/0xc0
[ 18.105722] [<ffffffff8154b7de>] driver_attach+0x1e/0x20
[ 18.111752] [<ffffffff8154b290>] bus_add_driver+0x200/0x270
[ 18.118078] [<ffffffff8154d160>] driver_register+0x60/0xe0
[ 18.124303] [<ffffffff81440ee0>] __pci_register_driver+0x60/0x70
[ 18.131117] [<ffffffff81f1e6e1>] intel_uncore_init+0x277/0x2df
[ 18.137728] [<ffffffff81f1e46a>] ? uncore_type_init+0x15f/0x15f
[ 18.144441] [<ffffffff81002190>] do_one_initcall+0x50/0x190
[ 18.150768] [<ffffffff810c5bf1>] ? parse_args+0x2d1/0x490
[ 18.156894] [<ffffffff81f19243>] kernel_init_freeable+0x1ff/0x29e
[ 18.163801] [<ffffffff817dd840>] ? rest_init+0x140/0x140
[ 18.169831] [<ffffffff817dd84e>] kernel_init+0xe/0x100
[ 18.175668] [<ffffffff817e957a>] ret_from_fork+0x2a/0x40
[ 18.181695] Code: e8 cf d4 29 00 5b 5d c3 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 53 49 89 fc 48 83 ec 18 <4c> 8b 2f 65 48 8b 04 25 28 00 00 00 48 89 45 d8 31 c0 48 8b 87
[ 18.203631] RIP [<ffffffff81546ff7>] device_del+0x17/0x280
[ 18.209867] RSP <ffffc9000007bc00>
[ 18.213759] CR2: 0000000000000000
[ 18.217548] ---[ end trace 91188545987fc9d9 ]---
[ 18.222706] Kernel panic - not syncing: Fatal exception
[ 18.228692] ---[ end Kernel panic - not syncing: Fatal exception
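
The register dump above already suggests what went wrong. On x86-64 the first integer argument is passed in RDI, and the fault is at device_del+0x17, right after the function prologue. The marked bytes `<4c> 8b 2f` decode to a load through RDI, and both RDI and CR2 are zero. A small sketch for pulling the registers out of such an oops follows. The regular expression and the helper name `parse_registers` are assumptions for illustration, and only a few lines of the trace are reproduced.

```python
import re

# A few lines copied from the non-KASAN oops above.
OOPS = """\
[ 17.944811] RSP: 0000:ffffc9000007bc00 EFLAGS: 00010286
[ 17.950742] RAX: 0000000000000000 RBX: ffff88085c8e3c00 RCX: 0000000000000001
[ 17.958708] RDX: ffff881059d60000 RSI: 000000000000000b RDI: 0000000000000000
[ 18.006022] CR2: 0000000000000000 CR3: 0000000001c06000 CR4: 00000000003406e0
"""

# Matches "NAME: <16 hex digits>"; segment-prefixed values such as RSP are skipped.
REGISTER = re.compile(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def parse_registers(text: str) -> dict:
    """Map register names to integer values for every match in `text`."""
    return {name: int(value, 16) for name, value in REGISTER.findall(text)}


regs = parse_registers(OOPS)

# First argument NULL and faulting address NULL: consistent with device_del(NULL).
first_arg_is_null = regs["RDI"] == 0 and regs["CR2"] == 0
print(sorted(regs), first_arg_is_null)
```

This only shows that the pointer handed to device_del() was NULL when it was called from perf_pmu_unregister(). It says nothing about why the pointer was never set.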

2016-10-19 14:46:04

by Qian Cai

Subject: [4.9-rc1+] intel_uncore builtin + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

It turns out this is only reproducible when intel_uncore is compiled as a builtin, i.e.,
not as a module. It can still be reproduced with yesterday's mainline.

Here is some information about the system:

Intel Platform: Grantley-R Wildcat Pass CPU: Broadwell-EP, B0.
Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz

[   66.349263] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[   66.356672] software IO TLB [mem 0x71c7d000-0x75c7d000] (64MB) mapped at [ffff880071c7d000-ffff880075c7cfff]
[   66.369911] Intel CQM monitoring enabled
[   66.374445] Intel MBM enabled
[   66.385708] RAPL PMU: API unit is 2^-32 Joules, 4 fixed counters, 655360 ms ovfl timer
[   66.394564] RAPL PMU: hw unit of domain pp0-core 2^-14 Joules
[   66.400991] RAPL PMU: hw unit of domain package 2^-14 Joules
[   66.407317] RAPL PMU: hw unit of domain dram 2^-14 Joules
[   66.413358] RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
[   66.434040] ================================================================================
[   66.443462] UBSAN: Undefined behaviour in drivers/base/core.c:1251:17
[   66.450653] member access within null pointer of type 'struct device'
[   66.457845] CPU: 68 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc1-lockfix+ #48
[   66.465809] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRRFSDP1.86B.0271.R00.1510301446 10/30/2015
[   66.477168]  ffff880847aff798 ffffffff81d370b4 0000000041b58ab3 ffffffff83348dcf
[   66.485469]  ffffffff81d36ff4 ffff880847aff7c0 ffff880847aff770 ffff880e3f9d8000
[   66.493770]  ffffffff82ff8a00 ffffffff8309c5c0 00000000000004e3 000000009091f309
[   66.502073] Call Trace:
[   66.504811]  [<ffffffff81d370b4>] dump_stack+0xc0/0x12c
[   66.510644]  [<ffffffff81d36ff4>] ? _atomic_dec_and_lock+0xc4/0xc4
[   66.517548]  [<ffffffff81e5ac85>] ubsan_epilogue+0xd/0x8a
[   66.523574]  [<ffffffff81e5ae68>] __ubsan_handle_type_mismatch+0x166/0x434
[   66.531253]  [<ffffffff813294dd>] ? get_lock_stats+0x1d/0x120
[   66.537667]  [<ffffffff81e5ad02>] ? ubsan_epilogue+0x8a/0x8a
[   66.543985]  [<ffffffff82241acc>] device_del+0x6fc/0x860
[   66.549917]  [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
[   66.557494]  [<ffffffff822413d0>] ? cleanup_glue_dir+0x140/0x140
[   66.564202]  [<ffffffff8160a6f2>] perf_pmu_unregister+0x142/0x6d0
[   66.571006]  [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
[   66.577619]  [<ffffffff810559f7>] uncore_pmu_unregister+0x67/0xd0
[   66.584422]  [<ffffffff8105ae6c>] uncore_pci_remove+0x32c/0x510
[   66.591025]  [<ffffffff81ec8392>] pci_device_remove+0xb2/0x240
[   66.597539]  [<ffffffff8224fe76>] driver_probe_device+0x146/0xfc0
[   66.604340]  [<ffffffff82250cf0>] ? driver_probe_device+0xfc0/0xfc0
[   66.611334]  [<ffffffff82250ea5>] __driver_attach+0x1b5/0x230
[   66.617749]  [<ffffffff82248e60>] bus_for_each_dev+0x130/0x200
[   66.624264]  [<ffffffff81353300>] ? do_raw_spin_trylock+0x110/0x110
[   66.631258]  [<ffffffff82248d30>] ? subsys_dev_iter_init+0x100/0x100
[   66.638349]  [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
[   66.644959]  [<ffffffff8224eaa2>] driver_attach+0x42/0x70
[   66.650976]  [<ffffffff8224d846>] bus_add_driver+0x406/0x870
[   66.657292]  [<ffffffff822535b9>] driver_register+0x1a9/0x3d0
[   66.663704]  [<ffffffff81352942>] ? __raw_spin_lock_init+0x32/0x120
[   66.670700]  [<ffffffff81ec2a1d>] __pci_register_driver+0x1ad/0x2b0
[   66.677694]  [<ffffffff81ec2870>] ? pci_pm_runtime_idle+0x180/0x180
[   66.684694]  [<ffffffff858f57b5>] intel_uncore_init+0x58d/0x64c
[   66.691300]  [<ffffffff858ed56d>] ? amd_iommu_pc_init+0x16/0x344
[   66.698006]  [<ffffffff858f5228>] ? uncore_type_init+0x5cb/0x5cb
[   66.704710]  [<ffffffff81000587>] do_one_initcall+0xb7/0x2a0
[   66.711025]  [<ffffffff810004d0>] ? initcall_blacklisted+0x1a0/0x1a0
[   66.718116]  [<ffffffff8132687d>] ? up_write+0x7d/0x120
[   66.723949]  [<ffffffff81326800>] ? up_read+0x40/0x40
[   66.729587]  [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
[   66.737165]  [<ffffffff8130db04>] ? __wake_up+0x44/0x50
[   66.743000]  [<ffffffff858e71b9>] kernel_init_freeable+0x68a/0x768
[   66.749900]  [<ffffffff858e6b2f>] ? start_kernel+0x751/0x751
[   66.756219]  [<ffffffff81075ec0>] ? compat_start_thread+0xa0/0xa0
[   66.763013]  [<ffffffff82c704c0>] ? rest_init+0x190/0x190
[   66.769039]  [<ffffffff82c704d3>] kernel_init+0x13/0x140
[   66.774967]  [<ffffffff82c704c0>] ? rest_init+0x190/0x190
[   66.780993]  [<ffffffff82c8b0d7>] ret_from_fork+0x27/0x40
[   66.787019] ================================================================================
[   66.796479] kasan: CONFIG_KASAN_INLINE enabled
[   66.801450] kasan: GPF could be caused by NULL-ptr deref or user memory access
[   66.809525] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[   66.817878] Modules linked in:
[   66.821295] CPU: 68 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc1-lockfix+ #48
[   66.829260] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRRFSDP1.86B.0271.R00.1510301446 10/30/2015
[   66.840618] task: ffff880e3f9d8000 task.stack: ffff880847af8000
[   66.847225] RIP: 0010:[<ffffffff82241466>]  [<ffffffff82241466>] device_del+0x96/0x860
[   66.856076] RSP: 0000:ffff880847aff868  EFLAGS: 00010246
[   66.862002] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   66.869967] RDX: 0000000000000000 RSI: ffffffff82ea0cc0 RDI: ffffed0108f5ff06
[   66.877931] RBP: ffff880847aff920 R08: ffff880e3f9d8000 R09: 0000000000000007
[   66.885894] R10: 0000000000000000 R11: 0000000000000006 R12: ffff880844094930
[   66.893859] R13: 0000000000000001 R14: ffff880844094800 R15: ffff880844095258
[   66.901824] FS:  0000000000000000(0000) GS:ffff880e54e00000(0000) knlGS:0000000000000000
[   66.910853] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.917265] CR2: 0000000000000000 CR3: 000000000360a000 CR4: 00000000003406e0
[   66.925228] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   66.933191] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   66.941154] Stack:
[   66.943396]  ffffffff82c8a5d2 ffff881077f705c0 1ffff10108f5ff13 ffff880847aff920
[   66.951698]  0000000000000000 ffffffff86d346c8 0000000041b58ab3 ffffffff8338e870
[   66.959997]  ffffffff822413d0 ffff880e00000044 ffffffff00000000 ffff880847aff8c0
[   66.968296] Call Trace:
[   66.971025]  [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
[   66.978603]  [<ffffffff822413d0>] ? cleanup_glue_dir+0x140/0x140
[   66.985309]  [<ffffffff8160a6f2>] perf_pmu_unregister+0x142/0x6d0
[   66.992111]  [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
[   66.998720]  [<ffffffff810559f7>] uncore_pmu_unregister+0x67/0xd0
[   67.005523]  [<ffffffff8105ae6c>] uncore_pci_remove+0x32c/0x510
[   67.012131]  [<ffffffff81ec8392>] pci_device_remove+0xb2/0x240
[   67.018641]  [<ffffffff8224fe76>] driver_probe_device+0x146/0xfc0
[   67.025442]  [<ffffffff82250cf0>] ? driver_probe_device+0xfc0/0xfc0
[   67.032437]  [<ffffffff82250ea5>] __driver_attach+0x1b5/0x230
[   67.038852]  [<ffffffff82248e60>] bus_for_each_dev+0x130/0x200
[   67.045361]  [<ffffffff81353300>] ? do_raw_spin_trylock+0x110/0x110
[   67.052357]  [<ffffffff82248d30>] ? subsys_dev_iter_init+0x100/0x100
[   67.059450]  [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
[   67.066056]  [<ffffffff8224eaa2>] driver_attach+0x42/0x70
[   67.072081]  [<ffffffff8224d846>] bus_add_driver+0x406/0x870
[   67.078397]  [<ffffffff822535b9>] driver_register+0x1a9/0x3d0
[   67.084809]  [<ffffffff81352942>] ? __raw_spin_lock_init+0x32/0x120
[   67.091803]  [<ffffffff81ec2a1d>] __pci_register_driver+0x1ad/0x2b0
[   67.098798]  [<ffffffff81ec2870>] ? pci_pm_runtime_idle+0x180/0x180
[   67.105792]  [<ffffffff858f57b5>] intel_uncore_init+0x58d/0x64c
[   67.112399]  [<ffffffff858ed56d>] ? amd_iommu_pc_init+0x16/0x344
[   67.119103]  [<ffffffff858f5228>] ? uncore_type_init+0x5cb/0x5cb
[   67.125806]  [<ffffffff81000587>] do_one_initcall+0xb7/0x2a0
[   67.132124]  [<ffffffff810004d0>] ? initcall_blacklisted+0x1a0/0x1a0
[   67.139215]  [<ffffffff8132687d>] ? up_write+0x7d/0x120
[   67.145046]  [<ffffffff81326800>] ? up_read+0x40/0x40
[   67.150684]  [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
[   67.158262]  [<ffffffff8130db04>] ? __wake_up+0x44/0x50
[   67.164094]  [<ffffffff858e71b9>] kernel_init_freeable+0x68a/0x768
[   67.170992]  [<ffffffff858e6b2f>] ? start_kernel+0x751/0x751
[   67.177310]  [<ffffffff81075ec0>] ? compat_start_thread+0xa0/0xa0
[   67.184111]  [<ffffffff82c704c0>] ? rest_init+0x190/0x190
[   67.190137]  [<ffffffff82c704d3>] kernel_init+0x13/0x140
[   67.196064]  [<ffffffff82c704c0>] ? rest_init+0x190/0x190
[   67.202090]  [<ffffffff82c8b0d7>] ret_from_fork+0x27/0x40
[   67.208115] Code: f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 85 ff 0f 84 69 06 00 00 48 89 da 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 41 06 00 00 48 8b 03 48 89 85 68 ff ff ff 48
[   67.229872] RIP  [<ffffffff82241466>] device_del+0x96/0x860
[   67.236101]  RSP <ffff880847aff868>
[   67.240059] ---[ end trace 69358e866a1e3f6c ]---
[   67.245377] Kernel panic - not syncing: Fatal exception
[   67.251271] ---[ end Kernel panic - not syncing: Fatal exception


----- Original Message -----
> From: "Rob Herring" <[email protected]>
> To: "Greg Kroah-Hartman" <[email protected]>
> Cc: "CAI Qian" <[email protected]>, "linux-kernel" <[email protected]>
> Sent: Monday, October 10, 2016 2:15:29 PM
> Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic
>
> On Mon, Oct 10, 2016 at 12:20 PM, Greg Kroah-Hartman
> <[email protected]> wrote:
> > On Mon, Oct 10, 2016 at 11:37:27AM -0400, CAI Qian wrote:
> >> Not sure if anyone reported this before. With this kernel config, it is
> >> 100% kernel panic so far with today's
> >> mainline master HEAD.
> >>
> >> http://people.redhat.com/qcai/tmp/config-kasan-remove
> >
> > Oh it breaks things with kasan disabled as well :)
> >
> > See Laszlo's bug report already a few hours ago, Rob is on it...
>
> I think this one is different though. It has a remove() hook.
>
> Rob
>

2016-10-19 19:19:54

by Jiri Olsa

Subject: Re: [4.9-rc1+] intel_uncore builtin + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

On Wed, Oct 19, 2016 at 10:45:31AM -0400, CAI Qian wrote:
> It turns out this can only be reproducible when compiled intel_uncore as a builtin, i.e.,
> not compiled it as a module. The can still be reproduced in the yesterday's mainline.
>
> Here is some information about the system,
>
> Intel Platform: Grantley-R Wildcat Pass CPU: Broadwell-EP, B0.
> Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
>
> [   66.349263] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> [   66.356672] software IO TLB [mem 0x71c7d000-0x75c7d000] (64MB) mapped at [ffff880071c7d000-ffff880075c7cfff]
> [   66.369911] Intel CQM monitoring enabled
> [   66.374445] Intel MBM enabled
> [   66.385708] RAPL PMU: API unit is 2^-32 Joules, 4 fixed counters, 655360 ms ovfl timer
> [   66.394564] RAPL PMU: hw unit of domain pp0-core 2^-14 Joules
> [   66.400991] RAPL PMU: hw unit of domain package 2^-14 Joules
> [   66.407317] RAPL PMU: hw unit of domain dram 2^-14 Joules
> [   66.413358] RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
> [   66.434040] ================================================================================
> [   66.443462] UBSAN: Undefined behaviour in drivers/base/core.c:1251:17
> [   66.450653] member access within null pointer of type 'struct device'
> [   66.457845] CPU: 68 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc1-lockfix+ #48
> [   66.465809] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRRFSDP1.86B.0271.R00.1510301446 10/30/2015
> [   66.477168]  ffff880847aff798 ffffffff81d370b4 0000000041b58ab3 ffffffff83348dcf
> [   66.485469]  ffffffff81d36ff4 ffff880847aff7c0 ffff880847aff770 ffff880e3f9d8000
> [   66.493770]  ffffffff82ff8a00 ffffffff8309c5c0 00000000000004e3 000000009091f309
> [   66.502073] Call Trace:
> [   66.504811]  [<ffffffff81d370b4>] dump_stack+0xc0/0x12c
> [   66.510644]  [<ffffffff81d36ff4>] ? _atomic_dec_and_lock+0xc4/0xc4
> [   66.517548]  [<ffffffff81e5ac85>] ubsan_epilogue+0xd/0x8a
> [   66.523574]  [<ffffffff81e5ae68>] __ubsan_handle_type_mismatch+0x166/0x434
> [   66.531253]  [<ffffffff813294dd>] ? get_lock_stats+0x1d/0x120
> [   66.537667]  [<ffffffff81e5ad02>] ? ubsan_epilogue+0x8a/0x8a
> [   66.543985]  [<ffffffff82241acc>] device_del+0x6fc/0x860
> [   66.549917]  [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
> [   66.557494]  [<ffffffff822413d0>] ? cleanup_glue_dir+0x140/0x140
> [   66.564202]  [<ffffffff8160a6f2>] perf_pmu_unregister+0x142/0x6d0
> [   66.571006]  [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
> [   66.577619]  [<ffffffff810559f7>] uncore_pmu_unregister+0x67/0xd0
> [   66.584422]  [<ffffffff8105ae6c>] uncore_pci_remove+0x32c/0x510
> [   66.591025]  [<ffffffff81ec8392>] pci_device_remove+0xb2/0x240
> [   66.597539]  [<ffffffff8224fe76>] driver_probe_device+0x146/0xfc0
> [   66.604340]  [<ffffffff82250cf0>] ? driver_probe_device+0xfc0/0xfc0
> [   66.611334]  [<ffffffff82250ea5>] __driver_attach+0x1b5/0x230
> [   66.617749]  [<ffffffff82248e60>] bus_for_each_dev+0x130/0x200
> [   66.624264]  [<ffffffff81353300>] ? do_raw_spin_trylock+0x110/0x110
> [   66.631258]  [<ffffffff82248d30>] ? subsys_dev_iter_init+0x100/0x100
> [   66.638349]  [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
> [   66.644959]  [<ffffffff8224eaa2>] driver_attach+0x42/0x70
> [   66.650976]  [<ffffffff8224d846>] bus_add_driver+0x406/0x870
> [   66.657292]  [<ffffffff822535b9>] driver_register+0x1a9/0x3d0
> [   66.663704]  [<ffffffff81352942>] ? __raw_spin_lock_init+0x32/0x120
> [   66.670700]  [<ffffffff81ec2a1d>] __pci_register_driver+0x1ad/0x2b0
> [   66.677694]  [<ffffffff81ec2870>] ? pci_pm_runtime_idle+0x180/0x180
> [   66.684694]  [<ffffffff858f57b5>] intel_uncore_init+0x58d/0x64c
> [   66.691300]  [<ffffffff858ed56d>] ? amd_iommu_pc_init+0x16/0x344
> [   66.698006]  [<ffffffff858f5228>] ? uncore_type_init+0x5cb/0x5cb
> [   66.704710]  [<ffffffff81000587>] do_one_initcall+0xb7/0x2a0
> [   66.711025]  [<ffffffff810004d0>] ? initcall_blacklisted+0x1a0/0x1a0
> [   66.718116]  [<ffffffff8132687d>] ? up_write+0x7d/0x120
> [   66.723949]  [<ffffffff81326800>] ? up_read+0x40/0x40
> [   66.729587]  [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
> [   66.737165]  [<ffffffff8130db04>] ? __wake_up+0x44/0x50
> [   66.743000]  [<ffffffff858e71b9>] kernel_init_freeable+0x68a/0x768
> [   66.749900]  [<ffffffff858e6b2f>] ? start_kernel+0x751/0x751
> [   66.756219]  [<ffffffff81075ec0>] ? compat_start_thread+0xa0/0xa0
> [ ? 66.763013] ?[<ffffffff82c704c0>] ? rest_init+0x190/0x190
> [ ? 66.769039] ?[<ffffffff82c704d3>] kernel_init+0x13/0x140
> [ ? 66.774967] ?[<ffffffff82c704c0>] ? rest_init+0x190/0x190
> [ ? 66.780993] ?[<ffffffff82c8b0d7>] ret_from_fork+0x27/0x40
> [ ? 66.787019] ================================================================================
> [ 66.796479] kasan: CONFIG_KASAN_INLINE enabled
> [ 66.801450] kasan: GPF could be caused by NULL-ptr deref or user memory access
> [ 66.809525] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
> [ 66.817878] Modules linked in:
> [ 66.821295] CPU: 68 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc1-lockfix+ #48
> [ 66.829260] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRRFSDP1.86B.0271.R00.1510301446 10/30/2015
> [ 66.840618] task: ffff880e3f9d8000 task.stack: ffff880847af8000
> [ 66.847225] RIP: 0010:[<ffffffff82241466>] [<ffffffff82241466>] device_del+0x96/0x860
> [ 66.856076] RSP: 0000:ffff880847aff868 EFLAGS: 00010246
> [ 66.862002] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 66.869967] RDX: 0000000000000000 RSI: ffffffff82ea0cc0 RDI: ffffed0108f5ff06
> [ 66.877931] RBP: ffff880847aff920 R08: ffff880e3f9d8000 R09: 0000000000000007
> [ 66.885894] R10: 0000000000000000 R11: 0000000000000006 R12: ffff880844094930
> [ 66.893859] R13: 0000000000000001 R14: ffff880844094800 R15: ffff880844095258
> [ 66.901824] FS: 0000000000000000(0000) GS:ffff880e54e00000(0000) knlGS:0000000000000000
> [ 66.910853] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 66.917265] CR2: 0000000000000000 CR3: 000000000360a000 CR4: 00000000003406e0
> [ 66.925228] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 66.933191] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 66.941154] Stack:
> [ 66.943396] ffffffff82c8a5d2 ffff881077f705c0 1ffff10108f5ff13 ffff880847aff920
> [ 66.951698] 0000000000000000 ffffffff86d346c8 0000000041b58ab3 ffffffff8338e870
> [ 66.959997] ffffffff822413d0 ffff880e00000044 ffffffff00000000 ffff880847aff8c0
> [ 66.968296] Call Trace:
> [ 66.971025] [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
> [ 66.978603] [<ffffffff822413d0>] ? cleanup_glue_dir+0x140/0x140
> [ 66.985309] [<ffffffff8160a6f2>] perf_pmu_unregister+0x142/0x6d0
> [ 66.992111] [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
> [ 66.998720] [<ffffffff810559f7>] uncore_pmu_unregister+0x67/0xd0
> [ 67.005523] [<ffffffff8105ae6c>] uncore_pci_remove+0x32c/0x510
> [ 67.012131] [<ffffffff81ec8392>] pci_device_remove+0xb2/0x240
> [ 67.018641] [<ffffffff8224fe76>] driver_probe_device+0x146/0xfc0
> [ 67.025442] [<ffffffff82250cf0>] ? driver_probe_device+0xfc0/0xfc0
> [ 67.032437] [<ffffffff82250ea5>] __driver_attach+0x1b5/0x230
> [ 67.038852] [<ffffffff82248e60>] bus_for_each_dev+0x130/0x200
> [ 67.045361] [<ffffffff81353300>] ? do_raw_spin_trylock+0x110/0x110
> [ 67.052357] [<ffffffff82248d30>] ? subsys_dev_iter_init+0x100/0x100
> [ 67.059450] [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
> [ 67.066056] [<ffffffff8224eaa2>] driver_attach+0x42/0x70
> [ 67.072081] [<ffffffff8224d846>] bus_add_driver+0x406/0x870
> [ 67.078397] [<ffffffff822535b9>] driver_register+0x1a9/0x3d0
> [ 67.084809] [<ffffffff81352942>] ? __raw_spin_lock_init+0x32/0x120
> [ 67.091803] [<ffffffff81ec2a1d>] __pci_register_driver+0x1ad/0x2b0
> [ 67.098798] [<ffffffff81ec2870>] ? pci_pm_runtime_idle+0x180/0x180
> [ 67.105792] [<ffffffff858f57b5>] intel_uncore_init+0x58d/0x64c
> [ 67.112399] [<ffffffff858ed56d>] ? amd_iommu_pc_init+0x16/0x344
> [ 67.119103] [<ffffffff858f5228>] ? uncore_type_init+0x5cb/0x5cb
> [ 67.125806] [<ffffffff81000587>] do_one_initcall+0xb7/0x2a0
> [ 67.132124] [<ffffffff810004d0>] ? initcall_blacklisted+0x1a0/0x1a0
> [ 67.139215] [<ffffffff8132687d>] ? up_write+0x7d/0x120
> [ 67.145046] [<ffffffff81326800>] ? up_read+0x40/0x40
> [ 67.150684] [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
> [ 67.158262] [<ffffffff8130db04>] ? __wake_up+0x44/0x50
> [ 67.164094] [<ffffffff858e71b9>] kernel_init_freeable+0x68a/0x768
> [ 67.170992] [<ffffffff858e6b2f>] ? start_kernel+0x751/0x751
> [ 67.177310] [<ffffffff81075ec0>] ? compat_start_thread+0xa0/0xa0
> [ 67.184111] [<ffffffff82c704c0>] ? rest_init+0x190/0x190
> [ 67.190137] [<ffffffff82c704d3>] kernel_init+0x13/0x140
> [ 67.196064] [<ffffffff82c704c0>] ? rest_init+0x190/0x190
> [ 67.202090] [<ffffffff82c8b0d7>] ret_from_fork+0x27/0x40
> [ 67.208115] Code: f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 85 ff 0f 84 69 06 00 00 48 89 da 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 41 06 00 00 48 8b 03 48 89 85 68 ff ff ff 48
> [ 67.229872] RIP [<ffffffff82241466>] device_del+0x96/0x860
> [ 67.236101] RSP <ffff880847aff868>
> [ 67.240059] ---[ end trace 69358e866a1e3f6c ]---
> [ 67.245377] Kernel panic - not syncing: Fatal exception
> [ 67.251271] ---[ end Kernel panic - not syncing: Fatal exception
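
A note for readers decoding the oops above: with CONFIG_KASAN_INLINE the compiler emits a shadow-memory check before each access, and for a NULL pointer that check itself touches a non-canonical address, which is why this shows up as a general protection fault instead of a plain NULL page fault. The sketch below assumes the usual x86-64 KASAN layout (one shadow byte per 8 bytes, shadow offset 0xdffffc0000000000, which matches RAX in the register dump, and the `48 c1 ea 03` shift by 3 in the Code: bytes). The helper names are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86-64 KASAN constants; the offset matches RAX in the dump above. */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Hypothetical helper: address of the shadow byte checked for 'addr'. */
static uint64_t shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/*
 * Hypothetical helper: with 48-bit virtual addresses an address is
 * canonical when bits 63..47 are all zero or all one.
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 bits 63..47 */

	return top == 0 || top == 0x1ffff;
}
```

With these definitions, `shadow_addr(0)` is 0xdffffc0000000000, which `is_canonical_48()` rejects, so the inline check for the NULL `struct device *` in RBX faults before the real load is ever attempted.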

I think the reason here is that we presume pmu devices are always added,
but we add them only if pmu_bus_running (in perf_event_sysfs_init)
is set, which might happen after the uncore initcall
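
To make the ordering concrete, here is a minimal userspace sketch of the situation described above. It is a toy model, not the kernel code: every `toy_` name is hypothetical, and it leaves out that the real perf_event_sysfs_init() also adds devices for PMUs that were registered earlier. It only shows that a PMU registered before the bus is running has no device, so teardown has to check before touching it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, not the kernel's types. */
struct toy_device {
	int registered;
};

struct toy_pmu {
	struct toy_device *dev;	/* stays NULL until the bus is running */
};

static bool toy_bus_running;	/* models pmu_bus_running */

/* Models registration: a device is attached only once the bus exists. */
static void toy_pmu_register(struct toy_pmu *pmu, struct toy_device *backing)
{
	pmu->dev = NULL;
	if (toy_bus_running) {
		backing->registered = 1;
		pmu->dev = backing;
	}
}

/* Models the guarded teardown: returns true if a device was removed. */
static bool toy_pmu_unregister(struct toy_pmu *pmu)
{
	if (!toy_bus_running)
		return false;	/* no device was ever added */

	pmu->dev->registered = 0;	/* would fault here if dev were NULL */
	pmu->dev = NULL;
	return true;
}
```

Registering while `toy_bus_running` is false and then unregistering returns false without dereferencing `pmu->dev`; dropping the check reproduces the NULL dereference from the trace.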

attached patch fixes the issue for me

jirka


---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index c6e47e97b33f..c2099b799d16 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8871,8 +8871,10 @@ void perf_pmu_unregister(struct pmu *pmu)
 		idr_remove(&pmu_idr, pmu->type);
 	if (pmu->nr_addr_filters)
 		device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
-	device_del(pmu->dev);
-	put_device(pmu->dev);
+	if (pmu_bus_running) {
+		device_del(pmu->dev);
+		put_device(pmu->dev);
+	}
 	free_pmu_context(pmu);
 }
 EXPORT_SYMBOL_GPL(perf_pmu_unregister);

2016-10-19 20:19:22

by Qian Cai

Subject: Re: [4.9-rc1+] intel_uncore builtin + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic


> I think the reason here is that we presume pmu devices are always added,
> but we add them only if pmu_bus_running (in perf_event_sysfs_init)
> is set, which might happen after the uncore initcall
>
> attached patch fixes the issue for me
Tested-by: CAI Qian <[email protected]>
>
> jirka
>
>
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index c6e47e97b33f..c2099b799d16 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -8871,8 +8871,10 @@ void perf_pmu_unregister(struct pmu *pmu)
> idr_remove(&pmu_idr, pmu->type);
> if (pmu->nr_addr_filters)
> device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
> - device_del(pmu->dev);
> - put_device(pmu->dev);
> + if (pmu_bus_running) {
> + device_del(pmu->dev);
> + put_device(pmu->dev);
> + }
> free_pmu_context(pmu);
> }
> EXPORT_SYMBOL_GPL(perf_pmu_unregister);
>

2016-10-20 05:39:49

by Peter Zijlstra

Subject: Re: [4.9-rc1+] intel_uncore builtin + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

On Wed, Oct 19, 2016 at 09:19:43PM +0200, Jiri Olsa wrote:
> I think the reason here is that we presume pmu devices are always added,
> but we add them only if pmu_bus_running (in perf_event_sysfs_init)
> is set, which might happen after the uncore initcall
>
> attached patch fixes the issue for me

Right, we never expected to be unloaded before userspace runs.

Strictly speaking we should only read pmu_bus_running while holding
pmus_lock, that way we're serialized against perf_event_sysfs_init()
flipping it while we're being removed etc..

With the current setup the introduced race is harmless, but who knows
what other crazy these device people will come up with ;-)

> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index c6e47e97b33f..c2099b799d16 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -8871,8 +8871,10 @@ void perf_pmu_unregister(struct pmu *pmu)
> idr_remove(&pmu_idr, pmu->type);
> if (pmu->nr_addr_filters)
> device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
> - device_del(pmu->dev);
> - put_device(pmu->dev);
> + if (pmu_bus_running) {
> + device_del(pmu->dev);
> + put_device(pmu->dev);
> + }
> free_pmu_context(pmu);
> }
> EXPORT_SYMBOL_GPL(perf_pmu_unregister);

2016-10-20 08:58:30

by Jiri Olsa

Subject: Re: [4.9-rc1+] intel_uncore builtin + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

On Thu, Oct 20, 2016 at 07:39:44AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 19, 2016 at 09:19:43PM +0200, Jiri Olsa wrote:
> > I think the reason here is that we presume pmu devices are always added,
> > but we add them only if pmu_bus_running (in perf_event_sysfs_init)
> > is set, which might happen after the uncore initcall
> >
> > attached patch fixes the issue for me
>
> Right, we never expected to be unloaded before userspace runs.
>
> Strictly speaking we should only read pmu_bus_running while holding
> pmus_lock, that way we're serialized against perf_event_sysfs_init()
> flipping it while we're being removed etc..
>
> With the current setup the introduced race is harmless, but who knows
> what other crazy these device people will come up with ;-)
>

right, did not think of that ;-)

also I did not notice the device_remove_file call for pmu->nr_addr_filters,
and we could save one lock/unlock call later.. I'm testing the attached patch
now

thanks,
jirka


---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index c6e47e97b33f..224dffbc3b9b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8581,24 +8581,24 @@ static void update_pmu_context(struct pmu *pmu, struct pmu *old_pmu)
 	}
 }

+/*
+ * The pmus_lock lock must be taken.
+ */
 static void free_pmu_context(struct pmu *pmu)
 {
 	struct pmu *i;

-	mutex_lock(&pmus_lock);
 	/*
 	 * Like a real lame refcount.
 	 */
 	list_for_each_entry(i, &pmus, entry) {
 		if (i->pmu_cpu_context == pmu->pmu_cpu_context) {
 			update_pmu_context(i, pmu);
-			goto out;
+			return;
 		}
 	}

 	free_percpu(pmu->pmu_cpu_context);
-out:
-	mutex_unlock(&pmus_lock);
 }

 /*
@@ -8869,11 +8869,15 @@ void perf_pmu_unregister(struct pmu *pmu)
 	free_percpu(pmu->pmu_disable_count);
 	if (pmu->type >= PERF_TYPE_MAX)
 		idr_remove(&pmu_idr, pmu->type);
-	if (pmu->nr_addr_filters)
-		device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
-	device_del(pmu->dev);
-	put_device(pmu->dev);
+	mutex_lock(&pmus_lock);
+	if (pmu_bus_running) {
+		if (pmu->nr_addr_filters)
+			device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
+		device_del(pmu->dev);
+		put_device(pmu->dev);
+	}
 	free_pmu_context(pmu);
+	mutex_unlock(&pmus_lock);
 }
 EXPORT_SYMBOL_GPL(perf_pmu_unregister);


2016-10-20 09:04:24

by Peter Zijlstra

Subject: Re: [4.9-rc1+] intel_uncore builtin + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

On Thu, Oct 20, 2016 at 10:58:03AM +0200, Jiri Olsa wrote:

> @@ -8869,11 +8869,15 @@ void perf_pmu_unregister(struct pmu *pmu)
> free_percpu(pmu->pmu_disable_count);
> if (pmu->type >= PERF_TYPE_MAX)
> idr_remove(&pmu_idr, pmu->type);
> - if (pmu->nr_addr_filters)
> - device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
> - device_del(pmu->dev);
> - put_device(pmu->dev);
> + mutex_lock(&pmus_lock);
> + if (pmu_bus_running) {
> + if (pmu->nr_addr_filters)
> + device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
> + device_del(pmu->dev);
> + put_device(pmu->dev);
> + }
> free_pmu_context(pmu);
> + mutex_unlock(&pmus_lock);
> }
> EXPORT_SYMBOL_GPL(perf_pmu_unregister);

I think that is still racy..


    unregister:                         sysfs_init:

    mutex_lock(&pmus_lock);
    list_del_rcu(&pmu->entry);
    mutex_unlock(&pmus_lock);

    synchronize_*rcu();

                                        mutex_lock(&pmus_lock);
                                        list_for_each_entry(pmu, &pmus, entry) {
                                                /* add device muck */
                                                /* will _NOT_ see our PMU */
                                        }
                                        pmu_bus_running = 1;
                                        mutex_unlock(&pmus_lock);

    mutex_lock(&pmus_lock);
    if (pmu_bus_running) {
            device_del() /* OOPS */

What you want is to read pmu_bus_running in the same pmus_lock section
as we do the list_del, and then use that local copy later.
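
The suggestion above can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions: a pthread mutex stands in for pmus_lock, a hand-rolled singly linked list stands in for the pmus list, and all `toy_` names are hypothetical. The point is that the flag is sampled in the same critical section that unlinks the PMU, so the local copy agrees with whether the init path ever saw this PMU.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_pmu {
	struct toy_pmu *next;
	bool has_device;	/* set only by the init path below */
};

static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;	/* models pmus_lock */
static struct toy_pmu *toy_list;	/* models the pmus list */
static bool toy_bus_running;		/* models pmu_bus_running */

/* Models the late init path: add devices for listed PMUs, then set the flag. */
static void toy_sysfs_init(void)
{
	pthread_mutex_lock(&toy_lock);
	for (struct toy_pmu *p = toy_list; p; p = p->next)
		p->has_device = true;
	toy_bus_running = true;
	pthread_mutex_unlock(&toy_lock);
}

/*
 * First half of unregister: unlink the PMU and snapshot the flag in one
 * critical section. The caller removes the device later only if the
 * returned snapshot is true.
 */
static bool toy_unlink(struct toy_pmu *pmu)
{
	bool remove_device;

	pthread_mutex_lock(&toy_lock);
	remove_device = toy_bus_running;
	for (struct toy_pmu **pp = &toy_list; *pp; pp = &(*pp)->next) {
		if (*pp == pmu) {
			*pp = pmu->next;
			break;
		}
	}
	pthread_mutex_unlock(&toy_lock);

	return remove_device;
}
```

If `toy_sysfs_init()` runs between `toy_unlink()` and the later device removal, the global flag is already true but the unlinked PMU never got a device; the snapshot returned by `toy_unlink()` is still false, which is the answer the removal path needs.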

2016-10-20 09:43:04

by Jiri Olsa

Subject: Re: [4.9-rc1+] intel_uncore builtin + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

On Thu, Oct 20, 2016 at 11:04:16AM +0200, Peter Zijlstra wrote:
> On Thu, Oct 20, 2016 at 10:58:03AM +0200, Jiri Olsa wrote:
>
> > @@ -8869,11 +8869,15 @@ void perf_pmu_unregister(struct pmu *pmu)
> > free_percpu(pmu->pmu_disable_count);
> > if (pmu->type >= PERF_TYPE_MAX)
> > idr_remove(&pmu_idr, pmu->type);
> > - if (pmu->nr_addr_filters)
> > - device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
> > - device_del(pmu->dev);
> > - put_device(pmu->dev);
> > + mutex_lock(&pmus_lock);
> > + if (pmu_bus_running) {
> > + if (pmu->nr_addr_filters)
> > + device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
> > + device_del(pmu->dev);
> > + put_device(pmu->dev);
> > + }
> > free_pmu_context(pmu);
> > + mutex_unlock(&pmus_lock);
> > }
> > EXPORT_SYMBOL_GPL(perf_pmu_unregister);
>
> I think that is still racy..
>
>
> unregister: sysfs_init:
>
> mutex_lock(&pmus_lock);
> list_del_rcu(&pmu->entry);
> mutex_unlock(&pmus_lock);
>
> synchronize_*rcu();
>
> mutex_lock(&pmus_lock);
> list_for_each_entry(pmu, &pmus, entry) {
> /* add device muck */

ah, I thought this part would add the device back.. but it's
already out of the pmu list.. right :-\

thanks,
jirka

> /* will _NOT_ see our PMU */
> }
> pmus_bus_running = 1;
> mutex_unlock(&pmus_lock);
>
> mutex_lock(&pmus_lock);
> if (pmu_bus_running) {
> device_del() /* OOPS */
>
>
> What you want is to read pmu_bus_running in the same pmus_lock section
> as we do the list_del, and then use that local copy later.

2016-10-20 11:10:16

by Jiri Olsa

Subject: [PATCH] perf: Protect pmu device removal with pmu_bus_running check CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic

On Thu, Oct 20, 2016 at 11:42:59AM +0200, Jiri Olsa wrote:
> On Thu, Oct 20, 2016 at 11:04:16AM +0200, Peter Zijlstra wrote:
> > On Thu, Oct 20, 2016 at 10:58:03AM +0200, Jiri Olsa wrote:
> >
> > > @@ -8869,11 +8869,15 @@ void perf_pmu_unregister(struct pmu *pmu)
> > > free_percpu(pmu->pmu_disable_count);
> > > if (pmu->type >= PERF_TYPE_MAX)
> > > idr_remove(&pmu_idr, pmu->type);
> > > - if (pmu->nr_addr_filters)
> > > - device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
> > > - device_del(pmu->dev);
> > > - put_device(pmu->dev);
> > > + mutex_lock(&pmus_lock);
> > > + if (pmu_bus_running) {
> > > + if (pmu->nr_addr_filters)
> > > + device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
> > > + device_del(pmu->dev);
> > > + put_device(pmu->dev);
> > > + }
> > > free_pmu_context(pmu);
> > > + mutex_unlock(&pmus_lock);
> > > }
> > > EXPORT_SYMBOL_GPL(perf_pmu_unregister);
> >
> > I think that is still racy..
> >
> >
> > unregister: sysfs_init:
> >
> > mutex_lock(&pmus_lock);
> > list_del_rcu(&pmu->entry);
> > mutex_unlock(&pmus_lock);
> >
> > synchronize_*rcu();
> >
> > mutex_lock(&pmus_lock);
> > list_for_each_entry(pmu, &pmus, entry) {
> > /* add device muck */
>
> ah, I thought this part would add the device back.. but it's
> already out of the pmu list.. right :-\

attached fix, thanks

jirka


---
CAI Qian reported crash [1] in uncore device removal related
to CONFIG_DEBUG_TEST_DRIVER_REMOVE option.

The reason for crash is that perf_pmu_unregister tries to remove
pmu device which is not added at this point. We add pmu devices
only after pmu_bus is registered which happens in perf_event_sysfs_init
init call and sets pmu_bus_running flag.

The fix is to get the pmu_bus_running flag state at the point
the pmu is taken out of the pmus list and remove the device
later only if it's set.

[1] https://marc.info/?l=linux-kernel&m=147688837328451

Reported-by: CAI Qian <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
kernel/events/core.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index c6e47e97b33f..a5d2e62faf7e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8855,7 +8855,10 @@ EXPORT_SYMBOL_GPL(perf_pmu_register);

 void perf_pmu_unregister(struct pmu *pmu)
 {
+	int remove_device;
+
 	mutex_lock(&pmus_lock);
+	remove_device = pmu_bus_running;
 	list_del_rcu(&pmu->entry);
 	mutex_unlock(&pmus_lock);

@@ -8869,10 +8872,12 @@ void perf_pmu_unregister(struct pmu *pmu)
 	free_percpu(pmu->pmu_disable_count);
 	if (pmu->type >= PERF_TYPE_MAX)
 		idr_remove(&pmu_idr, pmu->type);
-	if (pmu->nr_addr_filters)
-		device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
-	device_del(pmu->dev);
-	put_device(pmu->dev);
+	if (remove_device) {
+		if (pmu->nr_addr_filters)
+			device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
+		device_del(pmu->dev);
+		put_device(pmu->dev);
+	}
 	free_pmu_context(pmu);
 }
 EXPORT_SYMBOL_GPL(perf_pmu_unregister);
--
2.7.4

2016-10-20 14:31:06

by Qian Cai

Subject: Re: [PATCH] perf: Protect pmu device removal with pmu_bus_running check CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic


> CAI Qian reported crash [1] in uncore device removal related
> to CONFIG_DEBUG_TEST_DRIVER_REMOVE option.
>
> The reason for crash is that perf_pmu_unregister tries to remove
> pmu device which is not added at this point. We add pmu devices
> only after pmu_bus is registered which happens in perf_event_sysfs_init
> init call and sets pmu_bus_running flag.
>
> The fix is to get the pmu_bus_running flag state at the point
> the pmu is taken out of the pmus list and remove the device
> later only if it's set.
>
> [1] https://marc.info/?l=linux-kernel&m=147688837328451
>
> Reported-by: CAI Qian <[email protected]>
> Signed-off-by: Jiri Olsa <[email protected]>

Tested-by: CAI Qian <[email protected]>

Subject: [tip:perf/urgent] perf/core: Protect PMU device removal with a 'pmu_bus_running' check, to fix CONFIG_DEBUG_TEST_DRIVER_REMOVE=y kernel panic

Commit-ID: 0933840acf7b65d6d30a5b6089d882afea57aca3
Gitweb: http://git.kernel.org/tip/0933840acf7b65d6d30a5b6089d882afea57aca3
Author: Jiri Olsa <[email protected]>
AuthorDate: Thu, 20 Oct 2016 13:10:11 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 28 Oct 2016 11:06:25 +0200

perf/core: Protect PMU device removal with a 'pmu_bus_running' check, to fix CONFIG_DEBUG_TEST_DRIVER_REMOVE=y kernel panic

CAI Qian reported a crash in the PMU uncore device removal code,
enabled by the CONFIG_DEBUG_TEST_DRIVER_REMOVE=y option:

https://marc.info/?l=linux-kernel&m=147688837328451

The reason for the crash is that perf_pmu_unregister() tries to remove
a PMU device which is not added at this point. We add PMU devices
only after pmu_bus is registered, which happens in the
perf_event_sysfs_init() call and sets the 'pmu_bus_running' flag.

The fix is to get the 'pmu_bus_running' flag state at the point
the PMU is taken out of the PMU list and remove the device
later only if it's set.

Reported-by: CAI Qian <[email protected]>
Tested-by: CAI Qian <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/20161020111011.GA13361@krava
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/events/core.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index c6e47e9..a5d2e62 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8855,7 +8855,10 @@ EXPORT_SYMBOL_GPL(perf_pmu_register);

 void perf_pmu_unregister(struct pmu *pmu)
 {
+	int remove_device;
+
 	mutex_lock(&pmus_lock);
+	remove_device = pmu_bus_running;
 	list_del_rcu(&pmu->entry);
 	mutex_unlock(&pmus_lock);

@@ -8869,10 +8872,12 @@ void perf_pmu_unregister(struct pmu *pmu)
 	free_percpu(pmu->pmu_disable_count);
 	if (pmu->type >= PERF_TYPE_MAX)
 		idr_remove(&pmu_idr, pmu->type);
-	if (pmu->nr_addr_filters)
-		device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
-	device_del(pmu->dev);
-	put_device(pmu->dev);
+	if (remove_device) {
+		if (pmu->nr_addr_filters)
+			device_remove_file(pmu->dev, &dev_attr_nr_addr_filters);
+		device_del(pmu->dev);
+		put_device(pmu->dev);
+	}
 	free_pmu_context(pmu);
 }
 EXPORT_SYMBOL_GPL(perf_pmu_unregister);