2016-03-01 10:05:57

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Tue, Oct 13, 2015 at 2:57 AM, Steven Rostedt <[email protected]> wrote:
> On Sat, 3 Oct 2015 12:05:42 +0200
> Sedat Dilek <[email protected]> wrote:
>
>> So, at the beginning... dunno WTF is causing the problems - no
>> workaround for CLANG.
>
> Probably need to compile with gcc and with clang and look at the binary
> differences. Or at least what objdump shows.
>

[ Hope to address this issue to the correct people - CCed some people
I taped on their nerves ]

Not sure if I should open a new thread?
Please, some clear statements on this.
Thanks.

The issue is still visible and alive.

I switched over to Linux v4.4.y LTS and CLANG v3.8 (currently "rc3").
My GCC is still v4.9.2.
CONFIG_PARAVIRT=n as suggested by Steven is the only difference when
using CLANG.

I have seen those bug-lines first with Linux v4.2.y.
BTW, my recent testing was done with Linux v4.5-rc6 and my llvmlinux
patchset (see file attachments).

Plugging my Logitech mouse to a USB 2.0 or 3.0 port does not matter.

[ FACT #1: LOG-FILES ]

The simple fact, the bug-lines/outputs are only visible with
CLANG-compiled Linux-kernel but not GCC-compiled ones.
Folks tend to speculate on a compiler-bug - I am not sure, really.

[ FACT #2: TEST-CASE #1 ]

Doing wild cut-n-paste in Firefox crashed all my Linux v4.4.3 kernels
- CLANG and GCC (!) compiled.
Means restarting my X/desktop and landing in LightDM login-manager
here on Ubuntu/precise AMD64.

[ FACT #3: TEST-CASE #2 ]

The most reliable test-case is to simply unplug my external Logitech
USB mouse - saw this by accident.
YES, it was so simple.

--- dmesg_4.5.0-rc6-2-llvmlinux-amd64.txt 2016-02-29
21:23:56.399691975 +0100
+++ dmesg_4.5.0-rc6-2-llvmlinux-amd64_usbmouse-unplugged.txt
2016-02-29 21:28:14.401832240 +0100
@@ -832,3 +832,75 @@
[ 66.529779] PPP BSD Compression module registered
[ 66.563013] PPP Deflate Compression module registered
[ 66.978977] usb 2-1.5: USB disconnect, device number 4
+[ 321.937369] usb 2-1.4: USB disconnect, device number 3
+[ 321.950810] BUG: sleeping function called from invalid context at
kernel/workqueue.c:2785
+[ 321.950816] in_atomic(): 0, irqs_disabled(): 1, pid: 44, name: kworker/2:1
+[ 321.950819] 9 locks held by kworker/2:1/44:
+[ 321.950820] #0: ("usb_hub_wq"){.+.+.+}, at: [<ffffffff810a7c5f>]
process_one_work+0x30f/0x8d0
+[ 321.950830] #1: ((&hub->events)){+.+.+.}, at:
[<ffffffff810a7c8c>] process_one_work+0x33c/0x8d0
+[ 321.950836] #2: (&dev->mutex){......}, at: [<ffffffff816fb0e0>]
hub_event+0x50/0x15b0
+[ 321.950844] #3: (&dev->mutex){......}, at: [<ffffffff816f686f>]
usb_disconnect+0x5f/0x2c0
+[ 321.950849] #4: (&dev->mutex){......}, at: [<ffffffff816245a2>]
device_release_driver+0x22/0x40
+[ 321.950856] #5: (&dev->mutex){......}, at: [<ffffffff816245a2>]
device_release_driver+0x22/0x40
+[ 321.950862] #6: (input_mutex){+.+.+.}, at: [<ffffffff8175171a>]
__input_unregister_device+0x9a/0x190
+[ 321.950869] #7: (&dev->mutex#2){+.+...}, at:
[<ffffffff8174fe27>] input_close_device+0x27/0x70
+[ 321.950875] #8: (hid_open_mut){+.+...}, at: [<ffffffffa0057388>]
usbhid_close+0x28/0xb0 [usbhid]
+[ 321.950883] irq event stamp: 47770
+[ 321.950885] hardirqs last enabled at (47769):
[<ffffffff819426a2>] _raw_spin_unlock_irq+0x32/0x60
+[ 321.950889] hardirqs last disabled at (47770):
[<ffffffff81115bdc>] del_timer_sync+0x3c/0x110
+[ 321.950894] softirqs last enabled at (47112):
[<ffffffff810871a2>] __do_softirq+0x5a2/0x610
+[ 321.950898] softirqs last disabled at (47097):
[<ffffffff8108747c>] irq_exit+0x9c/0x120
+[ 321.950903] CPU: 2 PID: 44 Comm: kworker/2:1 Not tainted
4.5.0-rc6-2-llvmlinux-amd64 #1
+[ 321.950905] Hardware name: SAMSUNG ELECTRONICS CO., LTD.
530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013
+[ 321.950908] Workqueue: usb_hub_wq hub_event
+[ 321.950911] ffff8800d3326948 0000000000000092 0000000000000000
ffff8800d4347568
+[ 321.950915] ffffffff814ba7d4 ffff8800d43474f8 ffff8800d4340cc0
ffff8800d4347568
+[ 321.950919] ffffffff810e28fd 0000000000000092 0000000000000096
ffff8800d43475a8
+[ 321.950923] Call Trace:
+[ 321.950927] [<ffffffff814ba7d4>] dump_stack+0xa4/0xf0
+[ 321.950931] [<ffffffff810e28fd>] ? print_irqtrace_events+0xcd/0xe0
+[ 321.950936] [<ffffffff810bf495>] ___might_sleep+0x295/0x2b0
+[ 321.950939] [<ffffffff810bf18f>] __might_sleep+0x4f/0xc0
+[ 321.950943] [<ffffffff810a12ef>] start_flush_work+0x2f/0x2a0
+[ 321.950946] [<ffffffff810a129c>] flush_work+0x5c/0x80
+[ 321.950949] [<ffffffff810a125a>] ? flush_work+0x1a/0x80
+[ 321.950953] [<ffffffff810e247d>] ? trace_hardirqs_off+0xd/0x10
+[ 321.950956] [<ffffffff810a032a>] ? try_to_grab_pending+0x4a/0x260
+[ 321.950960] [<ffffffff810a1717>] __cancel_work_timer+0x197/0x290
+[ 321.950963] [<ffffffff81115b8d>] ? try_to_del_timer_sync+0xad/0xc0
+[ 321.950966] [<ffffffff810a1578>] cancel_work_sync+0x18/0x20
+[ 321.950970] [<ffffffffa00573d5>] usbhid_close+0x75/0xb0 [usbhid]
+[ 321.950977] [<ffffffffa003a481>] hidinput_close+0x31/0x40 [hid]
+[ 321.950982] [<ffffffffa003a450>] ? hidinput_open+0x40/0x40 [hid]
+[ 321.950985] [<ffffffff8174fe48>] input_close_device+0x48/0x70
+[ 321.950988] [<ffffffff81757f85>] evdev_cleanup+0xd5/0xe0
+[ 321.950990] [<ffffffff81757b0c>] evdev_disconnect+0x2c/0x60
+[ 321.950993] [<ffffffff8175173e>] __input_unregister_device+0xbe/0x190
+[ 321.950996] [<ffffffff8175164a>] input_unregister_device+0x4a/0x80
+[ 321.951001] [<ffffffffa003a29f>] hidinput_disconnect+0x8f/0xc0 [hid]
+[ 321.951007] [<ffffffffa0034a30>] hid_device_remove+0xb0/0x130 [hid]
+[ 321.951010] [<ffffffff816246bd>] __device_release_driver+0xfd/0x190
+[ 321.951014] [<ffffffff816245aa>] device_release_driver+0x2a/0x40
+[ 321.951017] [<ffffffff816224a3>] bus_remove_device+0x153/0x190
+[ 321.951020] [<ffffffff8161f5fb>] device_del+0x1db/0x2b0
+[ 321.951025] [<ffffffffa0033c1c>] hid_destroy_device+0x2c/0x60 [hid]
+[ 321.951029] [<ffffffffa0058537>] usbhid_disconnect+0x67/0x90 [usbhid]
+[ 321.951033] [<ffffffff81706ebf>] usb_unbind_interface+0xbf/0x2b0
+[ 321.951037] [<ffffffff816246bd>] __device_release_driver+0xfd/0x190
+[ 321.951040] [<ffffffff816245aa>] device_release_driver+0x2a/0x40
+[ 321.951043] [<ffffffff816224a3>] bus_remove_device+0x153/0x190
+[ 321.951046] [<ffffffff8161f5fb>] device_del+0x1db/0x2b0
+[ 321.951048] [<ffffffff8161f6fc>] ? device_unregister+0x2c/0x40
+[ 321.951051] [<ffffffff81704f6b>] usb_disable_device+0x10b/0x3b0
+[ 321.951054] [<ffffffff816f6904>] usb_disconnect+0xf4/0x2c0
+[ 321.951057] [<ffffffff816fbc3d>] hub_event+0xbad/0x15b0
+[ 321.951060] [<ffffffff819426a2>] ? _raw_spin_unlock_irq+0x32/0x60
+[ 321.951063] [<ffffffff810a7e0e>] process_one_work+0x4be/0x8d0
+[ 321.951065] [<ffffffff810a7c8c>] ? process_one_work+0x33c/0x8d0
+[ 321.951068] [<ffffffff810a757b>] worker_thread+0x5cb/0x750
+[ 321.951071] [<ffffffff810a6fb0>] ? destroy_worker+0x110/0x110
+[ 321.951074] [<ffffffff810ae1c5>] kthread+0x115/0x120
+[ 321.951077] [<ffffffff810e248d>] ? trace_hardirqs_on+0xd/0x10
+[ 321.951080] [<ffffffff810ae0b0>] ? flush_kthread_worker+0x80/0x80
+[ 321.951084] [<ffffffff819435ff>] ret_from_fork+0x3f/0x70
+[ 321.951087] [<ffffffff810ae0b0>] ? flush_kthread_worker+0x80/0x80

As this touches workqueue and usb_hub_wq...

[ drivers/usb/core/hub.c ]

5214: hub_wq = alloc_workqueue("usb_hub_wq", WQ_FREEZABLE, 0);

...see...

+[ 321.950908] Workqueue: usb_hub_wq hub_event

...I have CCed USB core maintainers ans folks I know they can help.

Here some Linux kernel-config I have enabled to investigate the issue...

[ LINUX-CONFIG ]

$ egrep 'CONFIG_TRACING=|TRACE_IRQFLAGS|IRQSOFF_TRACER|CONTEXT_TRACKING|LOCKDEP=|PROVE_LOCKING'
/tmp/config-4.5.0-rc6-2-llvmlinux-amd64
CONFIG_CONTEXT_TRACKING=y
# CONFIG_CONTEXT_TRACKING_FORCE is not set
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_DEBUG_LOCKDEP=y
CONFIG_TRACE_IRQFLAGS=y
CONFIG_TRACING=y
CONFIG_IRQSOFF_TRACER=y

For more details please, see attached files!

Sorry, this is far beyond a cool bug-report, but I could do it like
this for the moment.

Hope this helps to get some interesting and new ideas.

- Sedat -


Attachments:
dmesg_4.5.0-rc6-2-llvmlinux-amd64_usbmouse-unplugged.txt (61.61 kB)
config-4.5.0-rc6-2-llvmlinux-amd64 (128.97 kB)
4.5.0-rc6-2-llvmlinux-amd64.patch (28.57 kB)
Download all attachments

2016-03-01 15:07:46

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Tue, 1 Mar 2016 11:05:42 +0100
Sedat Dilek <[email protected]> wrote:


> [ FACT #3: TEST-CASE #2 ]
>
> The most reliable test-case is to simply unplug my external Logitech
> USB mouse - saw this by accident.
> YES, it was so simple.

Just to clarify, this happens on gcc and clang?


>
> --- dmesg_4.5.0-rc6-2-llvmlinux-amd64.txt 2016-02-29
> 21:23:56.399691975 +0100
> +++ dmesg_4.5.0-rc6-2-llvmlinux-amd64_usbmouse-unplugged.txt
> 2016-02-29 21:28:14.401832240 +0100
> @@ -832,3 +832,75 @@
> [ 66.529779] PPP BSD Compression module registered
> [ 66.563013] PPP Deflate Compression module registered
> [ 66.978977] usb 2-1.5: USB disconnect, device number 4
> +[ 321.937369] usb 2-1.4: USB disconnect, device number 3
> +[ 321.950810] BUG: sleeping function called from invalid context at
> kernel/workqueue.c:2785
> +[ 321.950816] in_atomic(): 0, irqs_disabled(): 1, pid: 44, name: kworker/2:1
> +[ 321.950819] 9 locks held by kworker/2:1/44:
> +[ 321.950820] #0: ("usb_hub_wq"){.+.+.+}, at: [<ffffffff810a7c5f>]
> process_one_work+0x30f/0x8d0
> +[ 321.950830] #1: ((&hub->events)){+.+.+.}, at:
> [<ffffffff810a7c8c>] process_one_work+0x33c/0x8d0
> +[ 321.950836] #2: (&dev->mutex){......}, at: [<ffffffff816fb0e0>]
> hub_event+0x50/0x15b0
> +[ 321.950844] #3: (&dev->mutex){......}, at: [<ffffffff816f686f>]
> usb_disconnect+0x5f/0x2c0
> +[ 321.950849] #4: (&dev->mutex){......}, at: [<ffffffff816245a2>]
> device_release_driver+0x22/0x40
> +[ 321.950856] #5: (&dev->mutex){......}, at: [<ffffffff816245a2>]
> device_release_driver+0x22/0x40
> +[ 321.950862] #6: (input_mutex){+.+.+.}, at: [<ffffffff8175171a>]
> __input_unregister_device+0x9a/0x190
> +[ 321.950869] #7: (&dev->mutex#2){+.+...}, at:
> [<ffffffff8174fe27>] input_close_device+0x27/0x70
> +[ 321.950875] #8: (hid_open_mut){+.+...}, at: [<ffffffffa0057388>]
> usbhid_close+0x28/0xb0 [usbhid]
> +[ 321.950883] irq event stamp: 47770
> +[ 321.950885] hardirqs last enabled at (47769):
> [<ffffffff819426a2>] _raw_spin_unlock_irq+0x32/0x60
> +[ 321.950889] hardirqs last disabled at (47770):
> [<ffffffff81115bdc>] del_timer_sync+0x3c/0x110

According to lockdep, interrupts were last disabled in del_timer_sync,
and they were never enabled. The numbers in parenthesis show the order
of events. _raw_spin_unlock_irq() at 47769, then del_timer_sync at
47770.

But why did they not get enabled again? We have:

local_irq_save(flags);
lock_map_acquire(&timer->lockdep_map);
lock_map_release(&timer->lockdep_map);
local_irq_restore(flags);

If this caused an issue, then you would have a lockdep splat. But
perhaps something corrupted "flags", and interrupts were not re-enabled?

> +[ 321.950894] softirqs last enabled at (47112):
> [<ffffffff810871a2>] __do_softirq+0x5a2/0x610
> +[ 321.950898] softirqs last disabled at (47097):
> [<ffffffff8108747c>] irq_exit+0x9c/0x120
> +[ 321.950903] CPU: 2 PID: 44 Comm: kworker/2:1 Not tainted
> 4.5.0-rc6-2-llvmlinux-amd64 #1
> +[ 321.950905] Hardware name: SAMSUNG ELECTRONICS CO., LTD.
> 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013
> +[ 321.950908] Workqueue: usb_hub_wq hub_event
> +[ 321.950911] ffff8800d3326948 0000000000000092 0000000000000000
> ffff8800d4347568
> +[ 321.950915] ffffffff814ba7d4 ffff8800d43474f8 ffff8800d4340cc0
> ffff8800d4347568
> +[ 321.950919] ffffffff810e28fd 0000000000000092 0000000000000096
> ffff8800d43475a8
> +[ 321.950923] Call Trace:
> +[ 321.950927] [<ffffffff814ba7d4>] dump_stack+0xa4/0xf0
> +[ 321.950931] [<ffffffff810e28fd>] ? print_irqtrace_events+0xcd/0xe0
> +[ 321.950936] [<ffffffff810bf495>] ___might_sleep+0x295/0x2b0
> +[ 321.950939] [<ffffffff810bf18f>] __might_sleep+0x4f/0xc0

This triggered on the might_sleep() in start_flush_work() because
something left interrupts enable.

Just strange.

-- Steve


> +[ 321.950943] [<ffffffff810a12ef>] start_flush_work+0x2f/0x2a0
> +[ 321.950946] [<ffffffff810a129c>] flush_work+0x5c/0x80
> +[ 321.950949] [<ffffffff810a125a>] ? flush_work+0x1a/0x80
> +[ 321.950953] [<ffffffff810e247d>] ? trace_hardirqs_off+0xd/0x10
> +[ 321.950956] [<ffffffff810a032a>] ? try_to_grab_pending+0x4a/0x260
> +[ 321.950960] [<ffffffff810a1717>] __cancel_work_timer+0x197/0x290
> +[ 321.950963] [<ffffffff81115b8d>] ? try_to_del_timer_sync+0xad/0xc0
> +[ 321.950966] [<ffffffff810a1578>] cancel_work_sync+0x18/0x20
> +[ 321.950970] [<ffffffffa00573d5>] usbhid_close+0x75/0xb0 [usbhid]
> +[ 321.950977] [<ffffffffa003a481>] hidinput_close+0x31/0x40 [hid]
> +[ 321.950982] [<ffffffffa003a450>] ? hidinput_open+0x40/0x40 [hid]
> +[ 321.950985] [<ffffffff8174fe48>] input_close_device+0x48/0x70
> +[ 321.950988] [<ffffffff81757f85>] evdev_cleanup+0xd5/0xe0
> +[ 321.950990] [<ffffffff81757b0c>] evdev_disconnect+0x2c/0x60
> +[ 321.950993] [<ffffffff8175173e>] __input_unregister_device+0xbe/0x190
> +[ 321.950996] [<ffffffff8175164a>] input_unregister_device+0x4a/0x80
> +[ 321.951001] [<ffffffffa003a29f>] hidinput_disconnect+0x8f/0xc0 [hid]
> +[ 321.951007] [<ffffffffa0034a30>] hid_device_remove+0xb0/0x130 [hid]
> +[ 321.951010] [<ffffffff816246bd>] __device_release_driver+0xfd/0x190
> +[ 321.951014] [<ffffffff816245aa>] device_release_driver+0x2a/0x40
> +[ 321.951017] [<ffffffff816224a3>] bus_remove_device+0x153/0x190
> +[ 321.951020] [<ffffffff8161f5fb>] device_del+0x1db/0x2b0
> +[ 321.951025] [<ffffffffa0033c1c>] hid_destroy_device+0x2c/0x60 [hid]
> +[ 321.951029] [<ffffffffa0058537>] usbhid_disconnect+0x67/0x90 [usbhid]
> +[ 321.951033] [<ffffffff81706ebf>] usb_unbind_interface+0xbf/0x2b0
> +[ 321.951037] [<ffffffff816246bd>] __device_release_driver+0xfd/0x190
> +[ 321.951040] [<ffffffff816245aa>] device_release_driver+0x2a/0x40
> +[ 321.951043] [<ffffffff816224a3>] bus_remove_device+0x153/0x190
> +[ 321.951046] [<ffffffff8161f5fb>] device_del+0x1db/0x2b0
> +[ 321.951048] [<ffffffff8161f6fc>] ? device_unregister+0x2c/0x40
> +[ 321.951051] [<ffffffff81704f6b>] usb_disable_device+0x10b/0x3b0
> +[ 321.951054] [<ffffffff816f6904>] usb_disconnect+0xf4/0x2c0
> +[ 321.951057] [<ffffffff816fbc3d>] hub_event+0xbad/0x15b0
> +[ 321.951060] [<ffffffff819426a2>] ? _raw_spin_unlock_irq+0x32/0x60
> +[ 321.951063] [<ffffffff810a7e0e>] process_one_work+0x4be/0x8d0
> +[ 321.951065] [<ffffffff810a7c8c>] ? process_one_work+0x33c/0x8d0
> +[ 321.951068] [<ffffffff810a757b>] worker_thread+0x5cb/0x750
> +[ 321.951071] [<ffffffff810a6fb0>] ? destroy_worker+0x110/0x110
> +[ 321.951074] [<ffffffff810ae1c5>] kthread+0x115/0x120
> +[ 321.951077] [<ffffffff810e248d>] ? trace_hardirqs_on+0xd/0x10
> +[ 321.951080] [<ffffffff810ae0b0>] ? flush_kthread_worker+0x80/0x80
> +[ 321.951084] [<ffffffff819435ff>] ret_from_fork+0x3f/0x70
> +[ 321.951087] [<ffffffff810ae0b0>] ? flush_kthread_worker+0x80/0x80
>

2016-03-01 15:17:10

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Tue, Mar 01, 2016 at 10:07:40AM -0500, Steven Rostedt wrote:
> On Tue, 1 Mar 2016 11:05:42 +0100
> Sedat Dilek <[email protected]> wrote:
>
>
> > [ FACT #3: TEST-CASE #2 ]
> >
> > The most reliable test-case is to simply unplug my external Logitech
> > USB mouse - saw this by accident.
> > YES, it was so simple.
>
> Just to clarify, this happens on gcc and clang?

Just clang from what I gather.

> > --- dmesg_4.5.0-rc6-2-llvmlinux-amd64.txt 2016-02-29
> > 21:23:56.399691975 +0100
> > +++ dmesg_4.5.0-rc6-2-llvmlinux-amd64_usbmouse-unplugged.txt
> > 2016-02-29 21:28:14.401832240 +0100
> > @@ -832,3 +832,75 @@
> > [ 66.529779] PPP BSD Compression module registered
> > [ 66.563013] PPP Deflate Compression module registered
> > [ 66.978977] usb 2-1.5: USB disconnect, device number 4
> > +[ 321.937369] usb 2-1.4: USB disconnect, device number 3
> > +[ 321.950810] BUG: sleeping function called from invalid context at
> > kernel/workqueue.c:2785
> > +[ 321.950816] in_atomic(): 0, irqs_disabled(): 1, pid: 44, name: kworker/2:1
> > +[ 321.950819] 9 locks held by kworker/2:1/44:
> > +[ 321.950820] #0: ("usb_hub_wq"){.+.+.+}, at: [<ffffffff810a7c5f>]
> > process_one_work+0x30f/0x8d0
> > +[ 321.950830] #1: ((&hub->events)){+.+.+.}, at:
> > [<ffffffff810a7c8c>] process_one_work+0x33c/0x8d0
> > +[ 321.950836] #2: (&dev->mutex){......}, at: [<ffffffff816fb0e0>]
> > hub_event+0x50/0x15b0
> > +[ 321.950844] #3: (&dev->mutex){......}, at: [<ffffffff816f686f>]
> > usb_disconnect+0x5f/0x2c0
> > +[ 321.950849] #4: (&dev->mutex){......}, at: [<ffffffff816245a2>]
> > device_release_driver+0x22/0x40
> > +[ 321.950856] #5: (&dev->mutex){......}, at: [<ffffffff816245a2>]
> > device_release_driver+0x22/0x40
> > +[ 321.950862] #6: (input_mutex){+.+.+.}, at: [<ffffffff8175171a>]
> > __input_unregister_device+0x9a/0x190
> > +[ 321.950869] #7: (&dev->mutex#2){+.+...}, at:
> > [<ffffffff8174fe27>] input_close_device+0x27/0x70
> > +[ 321.950875] #8: (hid_open_mut){+.+...}, at: [<ffffffffa0057388>]
> > usbhid_close+0x28/0xb0 [usbhid]
> > +[ 321.950883] irq event stamp: 47770
> > +[ 321.950885] hardirqs last enabled at (47769):
> > [<ffffffff819426a2>] _raw_spin_unlock_irq+0x32/0x60
> > +[ 321.950889] hardirqs last disabled at (47770):
> > [<ffffffff81115bdc>] del_timer_sync+0x3c/0x110
>
> According to lockdep, interrupts were last disabled in del_timer_sync,
> and they were never enabled. The numbers in parenthesis show the order
> of events. _raw_spin_unlock_irq() at 47769, then del_timer_sync at
> 47770.
>
> But why did they not get enabled again? We have:
>
> local_irq_save(flags);
> lock_map_acquire(&timer->lockdep_map);
> lock_map_release(&timer->lockdep_map);
> local_irq_restore(flags);
>
> If this caused an issue, then you would have a lockdep splat. But
> perhaps something corrupted "flags", and interrupts were not re-enabled?

Right, most odd. Sedat, could you provide objdump -D of the relevant
sections of vmlinux ?

2016-03-01 16:00:00

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Tue, 1 Mar 2016, Sedat Dilek wrote:

> On Tue, Oct 13, 2015 at 2:57 AM, Steven Rostedt <[email protected]> wrote:
> > On Sat, 3 Oct 2015 12:05:42 +0200
> > Sedat Dilek <[email protected]> wrote:
> >
> >> So, at the beginning... dunno WTF is causing the problems - no
> >> workaround for CLANG.
> >
> > Probably need to compile with gcc and with clang and look at the binary
> > differences. Or at least what objdump shows.
> >
>
> [ Hope to address this issue to the correct people - CCed some people
> I taped on their nerves ]
>
> Not sure if I should open a new thread?
> Please, some clear statements on this.
> Thanks.
>
> The issue is still visible and alive.

...

> [ FACT #3: TEST-CASE #2 ]
>
> The most reliable test-case is to simply unplug my external Logitech
> USB mouse - saw this by accident.
> YES, it was so simple.
>
> --- dmesg_4.5.0-rc6-2-llvmlinux-amd64.txt 2016-02-29
> 21:23:56.399691975 +0100
> +++ dmesg_4.5.0-rc6-2-llvmlinux-amd64_usbmouse-unplugged.txt
> 2016-02-29 21:28:14.401832240 +0100
> @@ -832,3 +832,75 @@
> [ 66.529779] PPP BSD Compression module registered
> [ 66.563013] PPP Deflate Compression module registered
> [ 66.978977] usb 2-1.5: USB disconnect, device number 4
> +[ 321.937369] usb 2-1.4: USB disconnect, device number 3
> +[ 321.950810] BUG: sleeping function called from invalid context at
> kernel/workqueue.c:2785
> +[ 321.950816] in_atomic(): 0, irqs_disabled(): 1, pid: 44, name: kworker/2:1

> +[ 321.950885] hardirqs last enabled at (47769):
> [<ffffffff819426a2>] _raw_spin_unlock_irq+0x32/0x60
> +[ 321.950889] hardirqs last disabled at (47770):
> [<ffffffff81115bdc>] del_timer_sync+0x3c/0x110
> +[ 321.950894] softirqs last enabled at (47112):
> [<ffffffff810871a2>] __do_softirq+0x5a2/0x610
> +[ 321.950898] softirqs last disabled at (47097):
> [<ffffffff8108747c>] irq_exit+0x9c/0x120
> +[ 321.950903] CPU: 2 PID: 44 Comm: kworker/2:1 Not tainted
> 4.5.0-rc6-2-llvmlinux-amd64 #1
> +[ 321.950905] Hardware name: SAMSUNG ELECTRONICS CO., LTD.
> 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013
> +[ 321.950908] Workqueue: usb_hub_wq hub_event
> +[ 321.950911] ffff8800d3326948 0000000000000092 0000000000000000
> ffff8800d4347568
> +[ 321.950915] ffffffff814ba7d4 ffff8800d43474f8 ffff8800d4340cc0
> ffff8800d4347568
> +[ 321.950919] ffffffff810e28fd 0000000000000092 0000000000000096
> ffff8800d43475a8
> +[ 321.950923] Call Trace:
> +[ 321.950927] [<ffffffff814ba7d4>] dump_stack+0xa4/0xf0
> +[ 321.950931] [<ffffffff810e28fd>] ? print_irqtrace_events+0xcd/0xe0
> +[ 321.950936] [<ffffffff810bf495>] ___might_sleep+0x295/0x2b0
> +[ 321.950939] [<ffffffff810bf18f>] __might_sleep+0x4f/0xc0
> +[ 321.950943] [<ffffffff810a12ef>] start_flush_work+0x2f/0x2a0
> +[ 321.950946] [<ffffffff810a129c>] flush_work+0x5c/0x80
> +[ 321.950949] [<ffffffff810a125a>] ? flush_work+0x1a/0x80
> +[ 321.950953] [<ffffffff810e247d>] ? trace_hardirqs_off+0xd/0x10
> +[ 321.950956] [<ffffffff810a032a>] ? try_to_grab_pending+0x4a/0x260
> +[ 321.950960] [<ffffffff810a1717>] __cancel_work_timer+0x197/0x290
> +[ 321.950963] [<ffffffff81115b8d>] ? try_to_del_timer_sync+0xad/0xc0
> +[ 321.950966] [<ffffffff810a1578>] cancel_work_sync+0x18/0x20

It's possible that this could be a compiler-related error connected
with local_irq_save(). __cancel_work_timer() calls

ret = try_to_grab_pending(work, is_dwork, &flags);

and try_to_grab_pending() starts this way:

static int try_to_grab_pending(struct work_struct *work, bool is_dwork,
unsigned long *flags)
{
struct worker_pool *pool;
struct pool_workqueue *pwq;

local_irq_save(*flags);

Then later on, __cancel_work_timer() does

local_irq_restore(flags);

Maybe CLANG doesn't like local_irq_save() applied to a pointer as
opposed to a stack variable.

As a quick test, try applying the patch below. (Note that there is a
similar construction in kernel/signal.c.)

Alan Stern



Index: usb-4.4/kernel/workqueue.c
===================================================================
--- usb-4.4.orig/kernel/workqueue.c
+++ usb-4.4/kernel/workqueue.c
@@ -1175,12 +1175,14 @@ out_put:
* This function is safe to call from any context including IRQ handler.
*/
static int try_to_grab_pending(struct work_struct *work, bool is_dwork,
- unsigned long *flags)
+ unsigned long *pflags)
{
struct worker_pool *pool;
struct pool_workqueue *pwq;
+ unsigned long flags;

- local_irq_save(*flags);
+ local_irq_save(flags);
+ *pflags = flags;

/* try to steal the timer if it exists */
if (is_dwork) {
@@ -1241,7 +1243,7 @@ static int try_to_grab_pending(struct wo
}
spin_unlock(&pool->lock);
fail:
- local_irq_restore(*flags);
+ local_irq_restore(flags);
if (work_is_canceling(work))
return -ENOENT;
cpu_relax();

2016-03-02 06:25:58

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/1/16, Alan Stern <[email protected]> wrote:
> On Tue, 1 Mar 2016, Sedat Dilek wrote:
>
>> On Tue, Oct 13, 2015 at 2:57 AM, Steven Rostedt <[email protected]>
>> wrote:
>> > On Sat, 3 Oct 2015 12:05:42 +0200
>> > Sedat Dilek <[email protected]> wrote:
>> >
>> >> So, at the beginning... dunno WTF is causing the problems - no
>> >> workaround for CLANG.
>> >
>> > Probably need to compile with gcc and with clang and look at the binary
>> > differences. Or at least what objdump shows.
>> >
>>
>> [ Hope to address this issue to the correct people - CCed some people
>> I taped on their nerves ]
>>
>> Not sure if I should open a new thread?
>> Please, some clear statements on this.
>> Thanks.
>>
>> The issue is still visible and alive.
>
> ...
>
>> [ FACT #3: TEST-CASE #2 ]
>>
>> The most reliable test-case is to simply unplug my external Logitech
>> USB mouse - saw this by accident.
>> YES, it was so simple.
>>
>> --- dmesg_4.5.0-rc6-2-llvmlinux-amd64.txt 2016-02-29
>> 21:23:56.399691975 +0100
>> +++ dmesg_4.5.0-rc6-2-llvmlinux-amd64_usbmouse-unplugged.txt
>> 2016-02-29 21:28:14.401832240 +0100
>> @@ -832,3 +832,75 @@
>> [ 66.529779] PPP BSD Compression module registered
>> [ 66.563013] PPP Deflate Compression module registered
>> [ 66.978977] usb 2-1.5: USB disconnect, device number 4
>> +[ 321.937369] usb 2-1.4: USB disconnect, device number 3
>> +[ 321.950810] BUG: sleeping function called from invalid context at
>> kernel/workqueue.c:2785
>> +[ 321.950816] in_atomic(): 0, irqs_disabled(): 1, pid: 44, name:
>> kworker/2:1
>
>> +[ 321.950885] hardirqs last enabled at (47769):
>> [<ffffffff819426a2>] _raw_spin_unlock_irq+0x32/0x60
>> +[ 321.950889] hardirqs last disabled at (47770):
>> [<ffffffff81115bdc>] del_timer_sync+0x3c/0x110
>> +[ 321.950894] softirqs last enabled at (47112):
>> [<ffffffff810871a2>] __do_softirq+0x5a2/0x610
>> +[ 321.950898] softirqs last disabled at (47097):
>> [<ffffffff8108747c>] irq_exit+0x9c/0x120
>> +[ 321.950903] CPU: 2 PID: 44 Comm: kworker/2:1 Not tainted
>> 4.5.0-rc6-2-llvmlinux-amd64 #1
>> +[ 321.950905] Hardware name: SAMSUNG ELECTRONICS CO., LTD.
>> 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013
>> +[ 321.950908] Workqueue: usb_hub_wq hub_event
>> +[ 321.950911] ffff8800d3326948 0000000000000092 0000000000000000
>> ffff8800d4347568
>> +[ 321.950915] ffffffff814ba7d4 ffff8800d43474f8 ffff8800d4340cc0
>> ffff8800d4347568
>> +[ 321.950919] ffffffff810e28fd 0000000000000092 0000000000000096
>> ffff8800d43475a8
>> +[ 321.950923] Call Trace:
>> +[ 321.950927] [<ffffffff814ba7d4>] dump_stack+0xa4/0xf0
>> +[ 321.950931] [<ffffffff810e28fd>] ? print_irqtrace_events+0xcd/0xe0
>> +[ 321.950936] [<ffffffff810bf495>] ___might_sleep+0x295/0x2b0
>> +[ 321.950939] [<ffffffff810bf18f>] __might_sleep+0x4f/0xc0
>> +[ 321.950943] [<ffffffff810a12ef>] start_flush_work+0x2f/0x2a0
>> +[ 321.950946] [<ffffffff810a129c>] flush_work+0x5c/0x80
>> +[ 321.950949] [<ffffffff810a125a>] ? flush_work+0x1a/0x80
>> +[ 321.950953] [<ffffffff810e247d>] ? trace_hardirqs_off+0xd/0x10
>> +[ 321.950956] [<ffffffff810a032a>] ? try_to_grab_pending+0x4a/0x260
>> +[ 321.950960] [<ffffffff810a1717>] __cancel_work_timer+0x197/0x290
>> +[ 321.950963] [<ffffffff81115b8d>] ? try_to_del_timer_sync+0xad/0xc0
>> +[ 321.950966] [<ffffffff810a1578>] cancel_work_sync+0x18/0x20
>
> It's possible that this could be a compiler-related error connected
> with local_irq_save(). __cancel_work_timer() calls
>
> ret = try_to_grab_pending(work, is_dwork, &flags);
>
> and try_to_grab_pending() starts this way:
>
> static int try_to_grab_pending(struct work_struct *work, bool is_dwork,
> unsigned long *flags)
> {
> struct worker_pool *pool;
> struct pool_workqueue *pwq;
>
> local_irq_save(*flags);
>
> Then later on, __cancel_work_timer() does
>
> local_irq_restore(flags);
>
> Maybe CLANG doesn't like local_irq_save() applied to a pointer as
> opposed to a stack variable.
>
> As a quick test, try applying the patch below. (Note that there is a
> similar construction in kernel/signal.c.)
>
> Alan Stern
>
>
>
> Index: usb-4.4/kernel/workqueue.c
> ===================================================================
> --- usb-4.4.orig/kernel/workqueue.c
> +++ usb-4.4/kernel/workqueue.c
> @@ -1175,12 +1175,14 @@ out_put:
> * This function is safe to call from any context including IRQ handler.
> */
> static int try_to_grab_pending(struct work_struct *work, bool is_dwork,
> - unsigned long *flags)
> + unsigned long *pflags)
> {
> struct worker_pool *pool;
> struct pool_workqueue *pwq;
> + unsigned long flags;
>
> - local_irq_save(*flags);
> + local_irq_save(flags);
> + *pflags = flags;
>
> /* try to steal the timer if it exists */
> if (is_dwork) {
> @@ -1241,7 +1243,7 @@ static int try_to_grab_pending(struct wo
> }
> spin_unlock(&pool->lock);
> fail:
> - local_irq_restore(*flags);
> + local_irq_restore(flags);
> if (work_is_canceling(work))
> return -ENOENT;
> cpu_relax();
>

Hi Alan,

thanks for the quick reply and test-patch.
I adapted it against Linux v4.5-rc6 (see file attachments).

Unfortunately, it does not solve the issue.

With un- and re-plugging my usbmouse I can easily reproduce both
call-traces (see file-attachments).

Regards,
- Sedat -


Attachments:
kernel-workqueue_c-4-5-rc6.diff (870.00 B)
dmesg_4.5.0-rc6-4-llvmlinux-amd64_usbmouse-unplugged-and-replugged.txt (65.34 kB)
Download all attachments

2016-03-02 06:36:38

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/1/16, Alan Stern <[email protected]> wrote:
> On Tue, 1 Mar 2016, Sedat Dilek wrote:
>
>> On Tue, Oct 13, 2015 at 2:57 AM, Steven Rostedt <[email protected]>
>> wrote:
>> > On Sat, 3 Oct 2015 12:05:42 +0200
>> > Sedat Dilek <[email protected]> wrote:
>> >
>> >> So, at the beginning... dunno WTF is causing the problems - no
>> >> workaround for CLANG.
>> >
>> > Probably need to compile with gcc and with clang and look at the binary
>> > differences. Or at least what objdump shows.
>> >
>>
>> [ Hope to address this issue to the correct people - CCed some people
>> I taped on their nerves ]
>>
>> Not sure if I should open a new thread?
>> Please, some clear statements on this.
>> Thanks.
>>
>> The issue is still visible and alive.
>
> ...
>
>> [ FACT #3: TEST-CASE #2 ]
>>
>> The most reliable test-case is to simply unplug my external Logitech
>> USB mouse - saw this by accident.
>> YES, it was so simple.
>>
>> --- dmesg_4.5.0-rc6-2-llvmlinux-amd64.txt 2016-02-29
>> 21:23:56.399691975 +0100
>> +++ dmesg_4.5.0-rc6-2-llvmlinux-amd64_usbmouse-unplugged.txt
>> 2016-02-29 21:28:14.401832240 +0100
>> @@ -832,3 +832,75 @@
>> [ 66.529779] PPP BSD Compression module registered
>> [ 66.563013] PPP Deflate Compression module registered
>> [ 66.978977] usb 2-1.5: USB disconnect, device number 4
>> +[ 321.937369] usb 2-1.4: USB disconnect, device number 3
>> +[ 321.950810] BUG: sleeping function called from invalid context at
>> kernel/workqueue.c:2785
>> +[ 321.950816] in_atomic(): 0, irqs_disabled(): 1, pid: 44, name:
>> kworker/2:1
>
>> +[ 321.950885] hardirqs last enabled at (47769):
>> [<ffffffff819426a2>] _raw_spin_unlock_irq+0x32/0x60
>> +[ 321.950889] hardirqs last disabled at (47770):
>> [<ffffffff81115bdc>] del_timer_sync+0x3c/0x110
>> +[ 321.950894] softirqs last enabled at (47112):
>> [<ffffffff810871a2>] __do_softirq+0x5a2/0x610
>> +[ 321.950898] softirqs last disabled at (47097):
>> [<ffffffff8108747c>] irq_exit+0x9c/0x120
>> +[ 321.950903] CPU: 2 PID: 44 Comm: kworker/2:1 Not tainted
>> 4.5.0-rc6-2-llvmlinux-amd64 #1
>> +[ 321.950905] Hardware name: SAMSUNG ELECTRONICS CO., LTD.
>> 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013
>> +[ 321.950908] Workqueue: usb_hub_wq hub_event
>> +[ 321.950911] ffff8800d3326948 0000000000000092 0000000000000000
>> ffff8800d4347568
>> +[ 321.950915] ffffffff814ba7d4 ffff8800d43474f8 ffff8800d4340cc0
>> ffff8800d4347568
>> +[ 321.950919] ffffffff810e28fd 0000000000000092 0000000000000096
>> ffff8800d43475a8
>> +[ 321.950923] Call Trace:
>> +[ 321.950927] [<ffffffff814ba7d4>] dump_stack+0xa4/0xf0
>> +[ 321.950931] [<ffffffff810e28fd>] ? print_irqtrace_events+0xcd/0xe0
>> +[ 321.950936] [<ffffffff810bf495>] ___might_sleep+0x295/0x2b0
>> +[ 321.950939] [<ffffffff810bf18f>] __might_sleep+0x4f/0xc0
>> +[ 321.950943] [<ffffffff810a12ef>] start_flush_work+0x2f/0x2a0
>> +[ 321.950946] [<ffffffff810a129c>] flush_work+0x5c/0x80
>> +[ 321.950949] [<ffffffff810a125a>] ? flush_work+0x1a/0x80
>> +[ 321.950953] [<ffffffff810e247d>] ? trace_hardirqs_off+0xd/0x10
>> +[ 321.950956] [<ffffffff810a032a>] ? try_to_grab_pending+0x4a/0x260
>> +[ 321.950960] [<ffffffff810a1717>] __cancel_work_timer+0x197/0x290
>> +[ 321.950963] [<ffffffff81115b8d>] ? try_to_del_timer_sync+0xad/0xc0
>> +[ 321.950966] [<ffffffff810a1578>] cancel_work_sync+0x18/0x20
>
> It's possible that this could be a compiler-related error connected
> with local_irq_save(). __cancel_work_timer() calls
>
> ret = try_to_grab_pending(work, is_dwork, &flags);
>
> and try_to_grab_pending() starts this way:
>
> static int try_to_grab_pending(struct work_struct *work, bool is_dwork,
> unsigned long *flags)
> {
> struct worker_pool *pool;
> struct pool_workqueue *pwq;
>
> local_irq_save(*flags);
>
> Then later on, __cancel_work_timer() does
>
> local_irq_restore(flags);
>
> Maybe CLANG doesn't like local_irq_save() applied to a pointer as
> opposed to a stack variable.
>
> As a quick test, try applying the patch below. (Note that there is a
> similar construction in kernel/signal.c.)
>
> Alan Stern
>
>
>
> Index: usb-4.4/kernel/workqueue.c
> ===================================================================
> --- usb-4.4.orig/kernel/workqueue.c
> +++ usb-4.4/kernel/workqueue.c
> @@ -1175,12 +1175,14 @@ out_put:
> * This function is safe to call from any context including IRQ handler.
> */
> static int try_to_grab_pending(struct work_struct *work, bool is_dwork,
> - unsigned long *flags)
> + unsigned long *pflags)
> {
> struct worker_pool *pool;
> struct pool_workqueue *pwq;
> + unsigned long flags;
>
> - local_irq_save(*flags);
> + local_irq_save(flags);
> + *pflags = flags;
>
> /* try to steal the timer if it exists */
> if (is_dwork) {
> @@ -1241,7 +1243,7 @@ static int try_to_grab_pending(struct wo
> }
> spin_unlock(&pool->lock);
> fail:
> - local_irq_restore(*flags);
> + local_irq_restore(flags);
> if (work_is_canceling(work))
> return -ENOENT;
> cpu_relax();
>
>

might_sleep() is invoked in start_flush_work(), but there are no
"flags" passed to local_irq_{disable,enable}.

What about something like this?

@@ -2781,13 +2783,14 @@ static bool start_flush_work(struct
work_struct *work, struct wq_barrier *barr)
struct worker *worker = NULL;
struct worker_pool *pool;
struct pool_workqueue *pwq;
+ unsigned long flags;

might_sleep();

- local_irq_disable();
+ local_irq_disable(flags);
pool = get_work_pool(work);
if (!pool) {
- local_irq_enable();
+ local_irq_enable(flags);
return false;
}

- Sedat -

2016-03-02 08:34:50

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/2/16, Sedat Dilek <[email protected]> wrote:
> On 3/1/16, Alan Stern <[email protected]> wrote:
>> On Tue, 1 Mar 2016, Sedat Dilek wrote:
>>
>>> On Tue, Oct 13, 2015 at 2:57 AM, Steven Rostedt <[email protected]>
>>> wrote:
>>> > On Sat, 3 Oct 2015 12:05:42 +0200
>>> > Sedat Dilek <[email protected]> wrote:
>>> >
>>> >> So, at the beginning... dunno WTF is causing the problems - no
>>> >> workaround for CLANG.
>>> >
>>> > Probably need to compile with gcc and with clang and look at the
>>> > binary
>>> > differences. Or at least what objdump shows.
>>> >
>>>
>>> [ Hope to address this issue to the correct people - CCed some people
>>> I taped on their nerves ]
>>>
>>> Not sure if I should open a new thread?
>>> Please, some clear statements on this.
>>> Thanks.
>>>
>>> The issue is still visible and alive.
>>
>> ...
>>
>>> [ FACT #3: TEST-CASE #2 ]
>>>
>>> The most reliable test-case is to simply unplug my external Logitech
>>> USB mouse - saw this by accident.
>>> YES, it was so simple.
>>>
>>> --- dmesg_4.5.0-rc6-2-llvmlinux-amd64.txt 2016-02-29
>>> 21:23:56.399691975 +0100
>>> +++ dmesg_4.5.0-rc6-2-llvmlinux-amd64_usbmouse-unplugged.txt
>>> 2016-02-29 21:28:14.401832240 +0100
>>> @@ -832,3 +832,75 @@
>>> [ 66.529779] PPP BSD Compression module registered
>>> [ 66.563013] PPP Deflate Compression module registered
>>> [ 66.978977] usb 2-1.5: USB disconnect, device number 4
>>> +[ 321.937369] usb 2-1.4: USB disconnect, device number 3
>>> +[ 321.950810] BUG: sleeping function called from invalid context at
>>> kernel/workqueue.c:2785
>>> +[ 321.950816] in_atomic(): 0, irqs_disabled(): 1, pid: 44, name:
>>> kworker/2:1
>>
>>> +[ 321.950885] hardirqs last enabled at (47769):
>>> [<ffffffff819426a2>] _raw_spin_unlock_irq+0x32/0x60
>>> +[ 321.950889] hardirqs last disabled at (47770):
>>> [<ffffffff81115bdc>] del_timer_sync+0x3c/0x110
>>> +[ 321.950894] softirqs last enabled at (47112):
>>> [<ffffffff810871a2>] __do_softirq+0x5a2/0x610
>>> +[ 321.950898] softirqs last disabled at (47097):
>>> [<ffffffff8108747c>] irq_exit+0x9c/0x120
>>> +[ 321.950903] CPU: 2 PID: 44 Comm: kworker/2:1 Not tainted
>>> 4.5.0-rc6-2-llvmlinux-amd64 #1
>>> +[ 321.950905] Hardware name: SAMSUNG ELECTRONICS CO., LTD.
>>> 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013
>>> +[ 321.950908] Workqueue: usb_hub_wq hub_event
>>> +[ 321.950911] ffff8800d3326948 0000000000000092 0000000000000000
>>> ffff8800d4347568
>>> +[ 321.950915] ffffffff814ba7d4 ffff8800d43474f8 ffff8800d4340cc0
>>> ffff8800d4347568
>>> +[ 321.950919] ffffffff810e28fd 0000000000000092 0000000000000096
>>> ffff8800d43475a8
>>> +[ 321.950923] Call Trace:
>>> +[ 321.950927] [<ffffffff814ba7d4>] dump_stack+0xa4/0xf0
>>> +[ 321.950931] [<ffffffff810e28fd>] ? print_irqtrace_events+0xcd/0xe0
>>> +[ 321.950936] [<ffffffff810bf495>] ___might_sleep+0x295/0x2b0
>>> +[ 321.950939] [<ffffffff810bf18f>] __might_sleep+0x4f/0xc0
>>> +[ 321.950943] [<ffffffff810a12ef>] start_flush_work+0x2f/0x2a0
>>> +[ 321.950946] [<ffffffff810a129c>] flush_work+0x5c/0x80
>>> +[ 321.950949] [<ffffffff810a125a>] ? flush_work+0x1a/0x80
>>> +[ 321.950953] [<ffffffff810e247d>] ? trace_hardirqs_off+0xd/0x10
>>> +[ 321.950956] [<ffffffff810a032a>] ? try_to_grab_pending+0x4a/0x260
>>> +[ 321.950960] [<ffffffff810a1717>] __cancel_work_timer+0x197/0x290
>>> +[ 321.950963] [<ffffffff81115b8d>] ? try_to_del_timer_sync+0xad/0xc0
>>> +[ 321.950966] [<ffffffff810a1578>] cancel_work_sync+0x18/0x20
>>
>> It's possible that this could be a compiler-related error connected
>> with local_irq_save(). __cancel_work_timer() calls
>>
>> ret = try_to_grab_pending(work, is_dwork, &flags);
>>
>> and try_to_grab_pending() starts this way:
>>
>> static int try_to_grab_pending(struct work_struct *work, bool is_dwork,
>> unsigned long *flags)
>> {
>> struct worker_pool *pool;
>> struct pool_workqueue *pwq;
>>
>> local_irq_save(*flags);
>>
>> Then later on, __cancel_work_timer() does
>>
>> local_irq_restore(flags);
>>
>> Maybe CLANG doesn't like local_irq_save() applied to a pointer as
>> opposed to a stack variable.
>>
>> As a quick test, try applying the patch below. (Note that there is a
>> similar construction in kernel/signal.c.)
>>
>> Alan Stern
>>
>>
>>
>> Index: usb-4.4/kernel/workqueue.c
>> ===================================================================
>> --- usb-4.4.orig/kernel/workqueue.c
>> +++ usb-4.4/kernel/workqueue.c
>> @@ -1175,12 +1175,14 @@ out_put:
>> * This function is safe to call from any context including IRQ handler.
>> */
>> static int try_to_grab_pending(struct work_struct *work, bool is_dwork,
>> - unsigned long *flags)
>> + unsigned long *pflags)
>> {
>> struct worker_pool *pool;
>> struct pool_workqueue *pwq;
>> + unsigned long flags;
>>
>> - local_irq_save(*flags);
>> + local_irq_save(flags);
>> + *pflags = flags;
>>
>> /* try to steal the timer if it exists */
>> if (is_dwork) {
>> @@ -1241,7 +1243,7 @@ static int try_to_grab_pending(struct wo
>> }
>> spin_unlock(&pool->lock);
>> fail:
>> - local_irq_restore(*flags);
>> + local_irq_restore(flags);
>> if (work_is_canceling(work))
>> return -ENOENT;
>> cpu_relax();
>>
>>
>
> might_sleep() is invoked in start_flush_work(), but there are no
> "flags" passed to local_irq_{disable,enable}.
>
> What about something like this?
>
> @@ -2781,13 +2783,14 @@ static bool start_flush_work(struct
> work_struct *work, struct wq_barrier *barr)
> struct worker *worker = NULL;
> struct worker_pool *pool;
> struct pool_workqueue *pwq;
> + unsigned long flags;
>
> might_sleep();
>
> - local_irq_disable();
> + local_irq_disable(flags);
> pool = get_work_pool(work);
> if (!pool) {
> - local_irq_enable();
> + local_irq_enable(flags);
> return false;
> }
>

start_flush_work() is the only place where no flags are passed.
That stuff above does not work - no clue how to do that.
Any help?

Then I see there...
...
spin_lock(&pool->lock);
...
spin_unlock_irq(&pool->lock);
...

Should that be spin_lock_irq()?

static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
{
struct worker *worker = NULL;
struct worker_pool *pool;
struct pool_workqueue *pwq;

might_sleep();

local_irq_disable();
pool = get_work_pool(work);
if (!pool) {
local_irq_enable();
return false;
}

spin_lock(&pool->lock); <--- XXX: spin_lock_irq() ???
/* see the comment in try_to_grab_pending() with the same code */
pwq = get_work_pwq(work);
if (pwq) {
if (unlikely(pwq->pool != pool))
goto already_gone;
} else {
worker = find_worker_executing_work(pool, work);
if (!worker)
goto already_gone;
pwq = worker->current_pwq;
}

check_flush_dependency(pwq->wq, work);

insert_wq_barrier(pwq, barr, work, worker);
spin_unlock_irq(&pool->lock);

/*
* If @max_active is 1 or rescuer is in use, flushing another work
* item on the same workqueue may lead to deadlock. Make sure the
* flusher is not running on the same workqueue by verifying write
* access.
*/
if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)
lock_map_acquire(&pwq->wq->lockdep_map);
else
lock_map_acquire_read(&pwq->wq->lockdep_map);
lock_map_release(&pwq->wq->lockdep_map);

return true;
already_gone:
spin_unlock_irq(&pool->lock);
return false;
}

- Sedat -

2016-03-02 08:37:26

by Jiri Kosina

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Wed, 2 Mar 2016, Sedat Dilek wrote:
>
> static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
> {
> struct worker *worker = NULL;
> struct worker_pool *pool;
> struct pool_workqueue *pwq;
>
> might_sleep();
>
> local_irq_disable();
> pool = get_work_pool(work);
> if (!pool) {
> local_irq_enable();
> return false;
> }
>
> spin_lock(&pool->lock); <--- XXX: spin_lock_irq() ???

No, this is fine. IRQs are unconditionally disabled a few lines above.

--
Jiri Kosina
SUSE Labs

2016-03-02 09:11:51

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/2/16, Jiri Kosina <[email protected]> wrote:
> On Wed, 2 Mar 2016, Sedat Dilek wrote:
>>
>> static bool start_flush_work(struct work_struct *work, struct wq_barrier
>> *barr)
>> {
>> struct worker *worker = NULL;
>> struct worker_pool *pool;
>> struct pool_workqueue *pwq;
>>
>> might_sleep();
>>
>> local_irq_disable();
>> pool = get_work_pool(work);
>> if (!pool) {
>> local_irq_enable();
>> return false;
>> }
>>
>> spin_lock(&pool->lock); <--- XXX: spin_lock_irq() ???
>
> No, this is fine. IRQs are unconditionally disabled a few lines above.
>

You are right, I tried with a substitution and that does not matter.

What about passing flags to local_irq_XXX?
And how do I do that?

- Sedat -

> --
> Jiri Kosina
> SUSE Labs
>
>

2016-03-02 15:00:53

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/1/16, Peter Zijlstra <[email protected]> wrote:
> On Tue, Mar 01, 2016 at 10:07:40AM -0500, Steven Rostedt wrote:
>> On Tue, 1 Mar 2016 11:05:42 +0100
>> Sedat Dilek <[email protected]> wrote:
>>
>>
>> > [ FACT #3: TEST-CASE #2 ]
>> >
>> > The most reliable test-case is to simply unplug my external Logitech
>> > USB mouse - saw this by accident.
>> > YES, it was so simple.
>>
>> Just to clarify, this happens on gcc and clang?
>
> Just clang from what I gather.
>

YES, gcc, but I can crash my Xorg, but do not see a pile of nothing in
my dmesg-log.

>> > --- dmesg_4.5.0-rc6-2-llvmlinux-amd64.txt 2016-02-29
>> > 21:23:56.399691975 +0100
>> > +++ dmesg_4.5.0-rc6-2-llvmlinux-amd64_usbmouse-unplugged.txt
>> > 2016-02-29 21:28:14.401832240 +0100
>> > @@ -832,3 +832,75 @@
>> > [ 66.529779] PPP BSD Compression module registered
>> > [ 66.563013] PPP Deflate Compression module registered
>> > [ 66.978977] usb 2-1.5: USB disconnect, device number 4
>> > +[ 321.937369] usb 2-1.4: USB disconnect, device number 3
>> > +[ 321.950810] BUG: sleeping function called from invalid context at
>> > kernel/workqueue.c:2785
>> > +[ 321.950816] in_atomic(): 0, irqs_disabled(): 1, pid: 44, name:
>> > kworker/2:1
>> > +[ 321.950819] 9 locks held by kworker/2:1/44:
>> > +[ 321.950820] #0: ("usb_hub_wq"){.+.+.+}, at: [<ffffffff810a7c5f>]
>> > process_one_work+0x30f/0x8d0
>> > +[ 321.950830] #1: ((&hub->events)){+.+.+.}, at:
>> > [<ffffffff810a7c8c>] process_one_work+0x33c/0x8d0
>> > +[ 321.950836] #2: (&dev->mutex){......}, at: [<ffffffff816fb0e0>]
>> > hub_event+0x50/0x15b0
>> > +[ 321.950844] #3: (&dev->mutex){......}, at: [<ffffffff816f686f>]
>> > usb_disconnect+0x5f/0x2c0
>> > +[ 321.950849] #4: (&dev->mutex){......}, at: [<ffffffff816245a2>]
>> > device_release_driver+0x22/0x40
>> > +[ 321.950856] #5: (&dev->mutex){......}, at: [<ffffffff816245a2>]
>> > device_release_driver+0x22/0x40
>> > +[ 321.950862] #6: (input_mutex){+.+.+.}, at: [<ffffffff8175171a>]
>> > __input_unregister_device+0x9a/0x190
>> > +[ 321.950869] #7: (&dev->mutex#2){+.+...}, at:
>> > [<ffffffff8174fe27>] input_close_device+0x27/0x70
>> > +[ 321.950875] #8: (hid_open_mut){+.+...}, at: [<ffffffffa0057388>]
>> > usbhid_close+0x28/0xb0 [usbhid]
>> > +[ 321.950883] irq event stamp: 47770
>> > +[ 321.950885] hardirqs last enabled at (47769):
>> > [<ffffffff819426a2>] _raw_spin_unlock_irq+0x32/0x60
>> > +[ 321.950889] hardirqs last disabled at (47770):
>> > [<ffffffff81115bdc>] del_timer_sync+0x3c/0x110
>>
>> According to lockdep, interrupts were last disabled in del_timer_sync,
>> and they were never enabled. The numbers in parenthesis show the order
>> of events. _raw_spin_unlock_irq() at 47769, then del_timer_sync at
>> 47770.
>>
>> But why did they not get enabled again? We have:
>>
>> local_irq_save(flags);
>> lock_map_acquire(&timer->lockdep_map);
>> lock_map_release(&timer->lockdep_map);
>> local_irq_restore(flags);
>>
>> If this caused an issue, then you would have a lockdep splat. But
>> perhaps something corrupted "flags", and interrupts were not re-enabled?
>
> Right, most odd. Sedat, could you provide objdump -D of the relevant
> sections of vmlinux ?
>

Can you give some clear instructions - for what shall I look for in special?

I can send you a 300MiB mail-bomb if you like :-).

$ cd linux-git
$ objdump -D vmlinux > ../objdump-D_vmlinux.txt
$ du -m ../objdump-D_vmlinux.txt
294 ../objdump-D_vmlinux.txt

I have now built with the same kernel-config for gcc and clang and
archived the most interesting stuff.
Before sending the stuff out, I would like to have some clear instructions.

Thanks.

- Sedat -

2016-03-02 15:17:26

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Wed, Mar 02, 2016 at 04:00:49PM +0100, Sedat Dilek wrote:
> >
> > Right, most odd. Sedat, could you provide objdump -D of the relevant
> > sections of vmlinux ?
> >
>
> Can you give some clear instructions - for what shall I look for in special?

objdump -D vmlinux | awk '/<[^>]*>:$/ { p=0; } /<del_timer_sync>:/ { p=1; } { if (p) print $0; }'

might be a good start, esp. if the output differs between the LLVM and
GCC cases (+- address muck).

2016-03-02 15:34:35

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/2/16, Peter Zijlstra <[email protected]> wrote:
> On Wed, Mar 02, 2016 at 04:00:49PM +0100, Sedat Dilek wrote:
>> >
>> > Right, most odd. Sedat, could you provide objdump -D of the relevant
>> > sections of vmlinux ?
>> >
>>
>> Can you give some clear instructions - for what shall I look for in
>> special?
>
> objdump -D vmlinux | awk '/<[^>]*>:$/ { p=0; } /<del_timer_sync>:/ { p=1; }
> { if (p) print $0; }'
>
> might be a good start, esp. if the output differs between the LLVM and
> GCC cases (+- address muck).
>

Done as you instructed (small-sized attachment)

Plus some oldschool-grep-ing in the 2nd attached file...

$ grep '>:' objdump-D_vmlinux_peterz.txt
180585-ffffffff8109a320 <mod_delayed_work_on>:
180643:ffffffff8109a3d0 <try_to_grab_pending>:
180873-ffffffff8109a760 <flush_workqueue>:
181267-ffffffff8109ae80 <flush_workqueue_prep_pwqs>:
181423-ffffffff8109b0e0 <drain_workqueue>:
181554-ffffffff8109b2c0 <flush_work>:
181593-ffffffff8109b340 <start_flush_work>:
181756-ffffffff8109b5d0 <cancel_work_sync>:
181772-ffffffff8109b5f0 <__cancel_work_timer>:
181932-ffffffff8109b870 <flush_delayed_work>:
181963-ffffffff8109b8d0 <cancel_delayed_work>:
182027-ffffffff8109b9a0 <set_work_pool_and_clear_pending>:
182062-ffffffff8109ba10 <cancel_delayed_work_sync>:
310375:ffffffff8110f3f0 <del_timer>:
310433-ffffffff8110f4b0 <try_to_del_timer_sync>:
310490-ffffffff8110f570 <del_timer_sync>:

Hope this helps.

- Sedat -


Attachments:
objdump-D_vmlinux_peterz-instructions.txt (4.78 kB)
objdump-D_vmlinux_peterz.txt (117.10 kB)
Download all attachments

2016-03-02 15:53:41

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/2/16, Peter Zijlstra <[email protected]> wrote:
> On Wed, Mar 02, 2016 at 04:00:49PM +0100, Sedat Dilek wrote:
>> >
>> > Right, most odd. Sedat, could you provide objdump -D of the relevant
>> > sections of vmlinux ?
>> >
>>
>> Can you give some clear instructions - for what shall I look for in
>> special?
>
> objdump -D vmlinux | awk '/<[^>]*>:$/ { p=0; } /<del_timer_sync>:/ { p=1; }
> { if (p) print $0; }'
>
> might be a good start, esp. if the output differs between the LLVM and
> GCC cases (+- address muck).
>

Here the most relevant objdumps as single files.

- Sedat -


Attachments:
objdump-D_vmlinux_cancel_work_sync.txt (834.00 B)
objdump-D_vmlinux___cancel_work_timer.txt (10.15 kB)
objdump-D_vmlinux_del_timer.txt (3.44 kB)
objdump-D_vmlinux_del_timer_sync.txt (4.78 kB)
objdump-D_vmlinux_flush_work.txt (2.25 kB)
objdump-D_vmlinux_start_flush_work.txt (10.08 kB)
objdump-D_vmlinux_try_to_del_timer_sync.txt (3.45 kB)
objdump-D_vmlinux_try_to_grab_pending.txt (14.78 kB)
Download all attachments

2016-03-02 15:56:33

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning


[ Resend with reply all, instead of just "reply" ]

On Wed, 2 Mar 2016 16:53:36 +0100
Sedat Dilek <[email protected]> wrote:

> ffffffff8110f3f0 <del_timer>:
> ffffffff8110f3f0: 55 push %rbp
> ffffffff8110f3f1: 48 89 e5 mov %rsp,%rbp
> ffffffff8110f3f4: 41 57 push %r15
> ffffffff8110f3f6: 41 56 push %r14
> ffffffff8110f3f8: 53 push %rbx
> ffffffff8110f3f9: 50 push %rax
> ffffffff8110f3fa: 48 89 fb mov %rdi,%rbx
> ffffffff8110f3fd: e8 ee 6f 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f402: e8 e9 6f 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f407: e8 e4 6f 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f40c: e8 df 6f 80 00 callq ffffffff819163f0 <mcount>

What is this about?

-- Steve

2016-03-02 16:09:04

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/2/16, Steven Rostedt <[email protected]> wrote:
>
> [ Resend with reply all, instead of just "reply" ]
>
> On Wed, 2 Mar 2016 16:53:36 +0100
> Sedat Dilek <[email protected]> wrote:
>
>> ffffffff8110f3f0 <del_timer>:
>> ffffffff8110f3f0: 55 push %rbp
>> ffffffff8110f3f1: 48 89 e5 mov %rsp,%rbp
>> ffffffff8110f3f4: 41 57 push %r15
>> ffffffff8110f3f6: 41 56 push %r14
>> ffffffff8110f3f8: 53 push %rbx
>> ffffffff8110f3f9: 50 push %rax
>> ffffffff8110f3fa: 48 89 fb mov %rdi,%rbx
>> ffffffff8110f3fd: e8 ee 6f 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f402: e8 e9 6f 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f407: e8 e4 6f 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f40c: e8 df 6f 80 00 callq ffffffff819163f0 <mcount>
>
> What is this about?
>

Dunno.

Not sure if this is relevant or not...

[ LINUX-CONFIG ]

$ ./scripts/diffconfig /boot/config-4.4.3-3-iniza-small
/boot/config-4.4.3-3-llvmlinux-amd64
ARCH_HWEIGHT_CFLAGS "-fcall-saved-rdi -fcall-saved-rsi
-fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9
-fcall-saved-r10 -fcall-saved-r11" -> ""

...and my patchset on top of Linux v4.4.3 is attached.

- Sedat -


Attachments:
config-4.4.3-3-llvmlinux-amd64 (126.80 kB)

2016-03-02 16:11:39

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/2/16, Sedat Dilek <[email protected]> wrote:
> On 3/2/16, Steven Rostedt <[email protected]> wrote:
>>
>> [ Resend with reply all, instead of just "reply" ]
>>
>> On Wed, 2 Mar 2016 16:53:36 +0100
>> Sedat Dilek <[email protected]> wrote:
>>
>>> ffffffff8110f3f0 <del_timer>:
>>> ffffffff8110f3f0: 55 push %rbp
>>> ffffffff8110f3f1: 48 89 e5 mov %rsp,%rbp
>>> ffffffff8110f3f4: 41 57 push %r15
>>> ffffffff8110f3f6: 41 56 push %r14
>>> ffffffff8110f3f8: 53 push %rbx
>>> ffffffff8110f3f9: 50 push %rax
>>> ffffffff8110f3fa: 48 89 fb mov %rdi,%rbx
>>> ffffffff8110f3fd: e8 ee 6f 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f402: e8 e9 6f 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f407: e8 e4 6f 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f40c: e8 df 6f 80 00 callq ffffffff819163f0 <mcount>
>>
>> What is this about?
>>
>
> Dunno.
>
> Not sure if this is relevant or not...
>
> [ LINUX-CONFIG ]
>
> $ ./scripts/diffconfig /boot/config-4.4.3-3-iniza-small
> /boot/config-4.4.3-3-llvmlinux-amd64
> ARCH_HWEIGHT_CFLAGS "-fcall-saved-rdi -fcall-saved-rsi
> -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9
> -fcall-saved-r10 -fcall-saved-r11" -> ""
>
> ...and my patchset on top of Linux v4.4.3 is attached.
>

That was my full linux-config and now and really the patchset.
( If you are ill, stay in your bed. )

- Sedat -


Attachments:
4.4.3-3-llvmlinux-amd64.patch (28.47 kB)

2016-03-02 16:21:16

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/2/16, Sedat Dilek <[email protected]> wrote:
> On 3/2/16, Sedat Dilek <[email protected]> wrote:
>> On 3/2/16, Steven Rostedt <[email protected]> wrote:
>>>
>>> [ Resend with reply all, instead of just "reply" ]
>>>
>>> On Wed, 2 Mar 2016 16:53:36 +0100
>>> Sedat Dilek <[email protected]> wrote:
>>>
>>>> ffffffff8110f3f0 <del_timer>:
>>>> ffffffff8110f3f0: 55 push %rbp
>>>> ffffffff8110f3f1: 48 89 e5 mov %rsp,%rbp
>>>> ffffffff8110f3f4: 41 57 push %r15
>>>> ffffffff8110f3f6: 41 56 push %r14
>>>> ffffffff8110f3f8: 53 push %rbx
>>>> ffffffff8110f3f9: 50 push %rax
>>>> ffffffff8110f3fa: 48 89 fb mov %rdi,%rbx
>>>> ffffffff8110f3fd: e8 ee 6f 80 00 callq ffffffff819163f0
>>>> <mcount>
>>>> ffffffff8110f402: e8 e9 6f 80 00 callq ffffffff819163f0
>>>> <mcount>
>>>> ffffffff8110f407: e8 e4 6f 80 00 callq ffffffff819163f0
>>>> <mcount>
>>>> ffffffff8110f40c: e8 df 6f 80 00 callq ffffffff819163f0
>>>> <mcount>
>>>
>>> What is this about?
>>>
>>
>> Dunno.
>>
>> Not sure if this is relevant or not...
>>
>> [ LINUX-CONFIG ]
>>
>> $ ./scripts/diffconfig /boot/config-4.4.3-3-iniza-small
>> /boot/config-4.4.3-3-llvmlinux-amd64
>> ARCH_HWEIGHT_CFLAGS "-fcall-saved-rdi -fcall-saved-rsi
>> -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9
>> -fcall-saved-r10 -fcall-saved-r11" -> ""
>>
>> ...and my patchset on top of Linux v4.4.3 is attached.
>>
>
> That was my full linux-config and now and really the patchset.
> ( If you are ill, stay in your bed. )
>

For the sake of (in)completeness my dmesg after Xorg crashed and my
usbmouse un-/re-plugged.

- Sedat -


Attachments:
dmesg_4.4.3-3-llvmlinux-amd64_xorg-crashed_usbmouse-unplugged-and-replugged.txt (70.07 kB)

2016-03-02 16:24:25

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Wed, Mar 02, 2016 at 04:53:36PM +0100, Sedat Dilek wrote:
> ffffffff8110f570 <del_timer_sync>:
> ffffffff8110f570: 55 push %rbp
> ffffffff8110f571: 48 89 e5 mov %rsp,%rbp
> ffffffff8110f574: 41 57 push %r15
> ffffffff8110f576: 41 56 push %r14
> ffffffff8110f578: 53 push %rbx
> ffffffff8110f579: 48 83 ec 28 sub $0x28,%rsp

stack offset is 0x28 bytes [*]

> ffffffff8110f57d: 48 89 fb mov %rdi,%rbx
> ffffffff8110f580: e8 6b 6e 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f585: e8 66 6e 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f58a: e8 61 6e 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f58f: e8 5c 6e 80 00 callq ffffffff819163f0 <mcount>

Your compiler is on drugs!

> ffffffff8110f594: 9c pushfq
> ffffffff8110f595: 8f 45 e0 popq -0x20(%rbp)

Saves flags in -0x20(%rbp)

> ffffffff8110f598: 4c 8b 7d e0 mov -0x20(%rbp),%r15

And in %r15

/me wonders what's wrong with: popf %r15

> ffffffff8110f59c: e8 4f 6e 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f5a1: e8 4a 6e 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f5a6: fa cli
> ffffffff8110f5a7: e8 84 cb fc ff callq ffffffff810dc130 <trace_hardirqs_off>
> ffffffff8110f5ac: 4c 8d 73 50 lea 0x50(%rbx),%r14
> ffffffff8110f5b0: 48 c7 04 24 b0 f5 10 movq $0xffffffff8110f5b0,(%rsp)
> ffffffff8110f5b7: 81
> ffffffff8110f5b8: 31 f6 xor %esi,%esi
> ffffffff8110f5ba: 31 d2 xor %edx,%edx
> ffffffff8110f5bc: 31 c9 xor %ecx,%ecx
> ffffffff8110f5be: 41 b8 01 00 00 00 mov $0x1,%r8d
> ffffffff8110f5c4: 45 31 c9 xor %r9d,%r9d
> ffffffff8110f5c7: 4c 89 f7 mov %r14,%rdi
> ffffffff8110f5ca: e8 c1 e5 fc ff callq ffffffff810ddb90 <lock_acquire>
> ffffffff8110f5cf: be 01 00 00 00 mov $0x1,%esi
> ffffffff8110f5d4: 48 c7 c2 cf f5 10 81 mov $0xffffffff8110f5cf,%rdx
> ffffffff8110f5db: 4c 89 f7 mov %r14,%rdi
> ffffffff8110f5de: e8 8d 08 fd ff callq ffffffff810dfe70 <lock_release>
> ffffffff8110f5e3: e8 08 6e 80 00 callq ffffffff819163f0 <mcount>

> ffffffff8110f5e8: 4c 89 f8 mov %r15,%rax
> ffffffff8110f5eb: 49 89 c6 mov %rax,%r14

Moves r15 into r14 through rax


> ffffffff8110f5ee: f6 c4 02 test $0x2,%ah
> ffffffff8110f5f1: 75 19 jne ffffffff8110f60c <del_timer_sync+0x9c>
> ffffffff8110f5f3: e8 f8 6d 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f5f8: e8 f3 6d 80 00 callq ffffffff819163f0 <mcount>


> ffffffff8110f5fd: 4c 89 75 d0 mov %r14,-0x30(%rbp)
> ffffffff8110f601: ff 75 d0 pushq -0x30(%rbp)
> ffffffff8110f604: 9d popfq

put r14 into -0x30(rbp) and pushes/pops that, see [*] this is 8 bytes
over stack ?!

> ffffffff8110f605: e8 26 cb fc ff callq ffffffff810dc130 <trace_hardirqs_off>
> ffffffff8110f60a: eb 17 jmp ffffffff8110f623 <del_timer_sync+0xb3>
> ffffffff8110f60c: e8 2f cb fc ff callq ffffffff810dc140 <trace_hardirqs_on>
> ffffffff8110f611: e8 da 6d 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f616: e8 d5 6d 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f61b: 4c 89 75 d8 mov %r14,-0x28(%rbp)
> ffffffff8110f61f: ff 75 d8 pushq -0x28(%rbp)
> ffffffff8110f622: 9d popfq

puts r14 into -0x28(rbp) and pushes/pops that

> ffffffff8110f623: e8 c8 6d 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f628: 65 8b 04 25 d4 ae 00 mov %gs:0xaed4,%eax
> ffffffff8110f62f: 00
> ffffffff8110f630: a9 00 00 0f 00 test $0xf0000,%eax
> ffffffff8110f635: 74 25 je ffffffff8110f65c <del_timer_sync+0xec>
> ffffffff8110f637: f6 43 2a 20 testb $0x20,0x2a(%rbx)
> ffffffff8110f63b: 75 1f jne ffffffff8110f65c <del_timer_sync+0xec>
> ffffffff8110f63d: 48 c7 c7 04 54 c5 81 mov $0xffffffff81c55404,%rdi
> ffffffff8110f644: be 61 04 00 00 mov $0x461,%esi
> ffffffff8110f649: e8 12 c4 f6 ff callq ffffffff8107ba60 <warn_slowpath_null>
> ffffffff8110f64e: eb 0c jmp ffffffff8110f65c <del_timer_sync+0xec>
> ffffffff8110f650: e8 9b 6d 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f655: e8 96 6d 80 00 callq ffffffff819163f0 <mcount>
> ffffffff8110f65a: f3 90 pause
> ffffffff8110f65c: 48 89 df mov %rbx,%rdi
> ffffffff8110f65f: e8 4c fe ff ff callq ffffffff8110f4b0 <try_to_del_timer_sync>
> ffffffff8110f664: 85 c0 test %eax,%eax
> ffffffff8110f666: 78 e8 js ffffffff8110f650 <del_timer_sync+0xe0>
> ffffffff8110f668: 48 83 c4 28 add $0x28,%rsp
> ffffffff8110f66c: 5b pop %rbx
> ffffffff8110f66d: 41 5e pop %r14
> ffffffff8110f66f: 41 5f pop %r15
> ffffffff8110f671: 5d pop %rbp
> ffffffff8110f672: c3 retq
> ffffffff8110f673: 66 66 66 66 2e 0f 1f data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
> ffffffff8110f67a: 84 00 00 00 00 00


That LLVM generate disgusting code, as a contrast, this is what my GCC
makes of this:

ffffffff81155f50 <del_timer_sync>:
ffffffff81155f50: e8 bb 02 9c 00 callq ffffffff81b16210 <__fentry__>
ffffffff81155f55: 55 push %rbp
ffffffff81155f56: 48 89 e5 mov %rsp,%rbp
ffffffff81155f59: 41 55 push %r13
ffffffff81155f5b: 41 54 push %r12
ffffffff81155f5d: 53 push %rbx
ffffffff81155f5e: 48 89 fb mov %rdi,%rbx
ffffffff81155f61: 48 83 ec 08 sub $0x8,%rsp
ffffffff81155f65: 9c pushfq
ffffffff81155f66: 41 5c pop %r12
ffffffff81155f68: fa cli
ffffffff81155f69: e8 72 5c fd ff callq ffffffff8112bbe0 <trace_hardirqs_off>
ffffffff81155f6e: 4c 8d 6b 50 lea 0x50(%rbx),%r13
ffffffff81155f72: 45 31 c9 xor %r9d,%r9d
ffffffff81155f75: 31 c9 xor %ecx,%ecx
ffffffff81155f77: 31 d2 xor %edx,%edx
ffffffff81155f79: 31 f6 xor %esi,%esi
ffffffff81155f7b: 4c 89 ef mov %r13,%rdi
ffffffff81155f7e: 48 c7 04 24 55 5f 15 movq $0xffffffff81155f55,(%rsp)
ffffffff81155f85: 81
ffffffff81155f86: 41 b8 01 00 00 00 mov $0x1,%r8d
ffffffff81155f8c: e8 3f b3 fd ff callq ffffffff811312d0 <lock_acquire>
ffffffff81155f91: 48 c7 c2 55 5f 15 81 mov $0xffffffff81155f55,%rdx
ffffffff81155f98: be 01 00 00 00 mov $0x1,%esi
ffffffff81155f9d: 4c 89 ef mov %r13,%rdi
ffffffff81155fa0: e8 0b b5 fd ff callq ffffffff811314b0 <lock_release>
ffffffff81155fa5: 41 f7 c4 00 02 00 00 test $0x200,%r12d
ffffffff81155fac: 75 52 jne ffffffff81156000 <del_timer_sync+0xb0>
ffffffff81155fae: 41 54 push %r12
ffffffff81155fb0: 9d popfq
ffffffff81155fb1: e8 2a 5c fd ff callq ffffffff8112bbe0 <trace_hardirqs_off>
ffffffff81155fb6: 65 8b 05 5b 66 eb 7e mov %gs:0x7eeb665b(%rip),%eax # c618 <__preempt_count>
ffffffff81155fbd: a9 00 00 0f 00 test $0xf0000,%eax
ffffffff81155fc2: 74 25 je ffffffff81155fe9 <del_timer_sync+0x99>
ffffffff81155fc4: f6 43 2a 20 testb $0x20,0x2a(%rbx)
ffffffff81155fc8: 75 1f jne ffffffff81155fe9 <del_timer_sync+0x99>
ffffffff81155fca: be 61 04 00 00 mov $0x461,%esi
ffffffff81155fcf: 48 c7 c7 09 87 f3 81 mov $0xffffffff81f38709,%rdi
ffffffff81155fd6: e8 15 03 f8 ff callq ffffffff810d62f0 <warn_slowpath_null>
ffffffff81155fdb: 48 89 df mov %rbx,%rdi
ffffffff81155fde: e8 fd fe ff ff callq ffffffff81155ee0 <try_to_del_timer_sync>
ffffffff81155fe3: 85 c0 test %eax,%eax
ffffffff81155fe5: 79 0e jns ffffffff81155ff5 <del_timer_sync+0xa5>
ffffffff81155fe7: f3 90 pause
ffffffff81155fe9: 48 89 df mov %rbx,%rdi
ffffffff81155fec: e8 ef fe ff ff callq ffffffff81155ee0 <try_to_del_timer_sync>
ffffffff81155ff1: 85 c0 test %eax,%eax
ffffffff81155ff3: 78 f2 js ffffffff81155fe7 <del_timer_sync+0x97>
ffffffff81155ff5: 48 83 c4 08 add $0x8,%rsp
ffffffff81155ff9: 5b pop %rbx
ffffffff81155ffa: 41 5c pop %r12
ffffffff81155ffc: 41 5d pop %r13
ffffffff81155ffe: 5d pop %rbp
ffffffff81155fff: c3 retq
ffffffff81156000: e8 4b 8b fd ff callq ffffffff8112eb50 <trace_hardirqs_on>
ffffffff81156005: 41 54 push %r12
ffffffff81156007: 9d popfq
ffffffff81156008: eb ac jmp ffffffff81155fb6 <del_timer_sync+0x66>
ffffffff8115600a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)

2016-03-02 16:35:43

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Wed, 2 Mar 2016 17:24:12 +0100
Peter Zijlstra <[email protected]> wrote:

> On Wed, Mar 02, 2016 at 04:53:36PM +0100, Sedat Dilek wrote:
> > ffffffff8110f570 <del_timer_sync>:
> > ffffffff8110f570: 55 push %rbp
> > ffffffff8110f571: 48 89 e5 mov %rsp,%rbp
> > ffffffff8110f574: 41 57 push %r15
> > ffffffff8110f576: 41 56 push %r14
> > ffffffff8110f578: 53 push %rbx
> > ffffffff8110f579: 48 83 ec 28 sub $0x28,%rsp
>
> stack offset is 0x28 bytes [*]

Actually, isn't it really 0x40 bytes? The stack pushed 3 words (8 bytes
each) before doing the subtract. 0x28 == 40 bytes, 3 * 8 = 24,
40 + 24 = 64 == 0x40.

>
> > ffffffff8110f57d: 48 89 fb mov %rdi,%rbx
> > ffffffff8110f580: e8 6b 6e 80 00 callq ffffffff819163f0 <mcount>
> > ffffffff8110f585: e8 66 6e 80 00 callq ffffffff819163f0 <mcount>
> > ffffffff8110f58a: e8 61 6e 80 00 callq ffffffff819163f0 <mcount>
> > ffffffff8110f58f: e8 5c 6e 80 00 callq ffffffff819163f0 <mcount>
>
> Your compiler is on drugs!

Totally agree!


[..]

>
> > ffffffff8110f5fd: 4c 89 75 d0 mov %r14,-0x30(%rbp)
> > ffffffff8110f601: ff 75 d0 pushq -0x30(%rbp)
> > ffffffff8110f604: 9d popfq
>
> put r14 into -0x30(rbp) and pushes/pops that, see [*] this is 8 bytes
> over stack ?!

But from rbp, the stack is 0x40 bytes. This may be fine.

-- Steve

2016-03-02 16:42:28

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/2/16, Peter Zijlstra <[email protected]> wrote:
> On Wed, Mar 02, 2016 at 04:53:36PM +0100, Sedat Dilek wrote:
>> ffffffff8110f570 <del_timer_sync>:
>> ffffffff8110f570: 55 push %rbp
>> ffffffff8110f571: 48 89 e5 mov %rsp,%rbp
>> ffffffff8110f574: 41 57 push %r15
>> ffffffff8110f576: 41 56 push %r14
>> ffffffff8110f578: 53 push %rbx
>> ffffffff8110f579: 48 83 ec 28 sub $0x28,%rsp
>
> stack offset is 0x28 bytes [*]
>
>> ffffffff8110f57d: 48 89 fb mov %rdi,%rbx
>> ffffffff8110f580: e8 6b 6e 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f585: e8 66 6e 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f58a: e8 61 6e 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f58f: e8 5c 6e 80 00 callq ffffffff819163f0 <mcount>
>
> Your compiler is on drugs!
>
>> ffffffff8110f594: 9c pushfq
>> ffffffff8110f595: 8f 45 e0 popq -0x20(%rbp)
>
> Saves flags in -0x20(%rbp)
>
>> ffffffff8110f598: 4c 8b 7d e0 mov -0x20(%rbp),%r15
>
> And in %r15
>
> /me wonders what's wrong with: popf %r15
>
>> ffffffff8110f59c: e8 4f 6e 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f5a1: e8 4a 6e 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f5a6: fa cli
>> ffffffff8110f5a7: e8 84 cb fc ff callq ffffffff810dc130
>> <trace_hardirqs_off>
>> ffffffff8110f5ac: 4c 8d 73 50 lea 0x50(%rbx),%r14
>> ffffffff8110f5b0: 48 c7 04 24 b0 f5 10 movq $0xffffffff8110f5b0,(%rsp)
>> ffffffff8110f5b7: 81
>> ffffffff8110f5b8: 31 f6 xor %esi,%esi
>> ffffffff8110f5ba: 31 d2 xor %edx,%edx
>> ffffffff8110f5bc: 31 c9 xor %ecx,%ecx
>> ffffffff8110f5be: 41 b8 01 00 00 00 mov $0x1,%r8d
>> ffffffff8110f5c4: 45 31 c9 xor %r9d,%r9d
>> ffffffff8110f5c7: 4c 89 f7 mov %r14,%rdi
>> ffffffff8110f5ca: e8 c1 e5 fc ff callq ffffffff810ddb90
>> <lock_acquire>
>> ffffffff8110f5cf: be 01 00 00 00 mov $0x1,%esi
>> ffffffff8110f5d4: 48 c7 c2 cf f5 10 81 mov $0xffffffff8110f5cf,%rdx
>> ffffffff8110f5db: 4c 89 f7 mov %r14,%rdi
>> ffffffff8110f5de: e8 8d 08 fd ff callq ffffffff810dfe70
>> <lock_release>
>> ffffffff8110f5e3: e8 08 6e 80 00 callq ffffffff819163f0 <mcount>
>
>> ffffffff8110f5e8: 4c 89 f8 mov %r15,%rax
>> ffffffff8110f5eb: 49 89 c6 mov %rax,%r14
>
> Moves r15 into r14 through rax
>
>
>> ffffffff8110f5ee: f6 c4 02 test $0x2,%ah
>> ffffffff8110f5f1: 75 19 jne ffffffff8110f60c
>> <del_timer_sync+0x9c>
>> ffffffff8110f5f3: e8 f8 6d 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f5f8: e8 f3 6d 80 00 callq ffffffff819163f0 <mcount>
>
>
>> ffffffff8110f5fd: 4c 89 75 d0 mov %r14,-0x30(%rbp)
>> ffffffff8110f601: ff 75 d0 pushq -0x30(%rbp)
>> ffffffff8110f604: 9d popfq
>
> put r14 into -0x30(rbp) and pushes/pops that, see [*] this is 8 bytes
> over stack ?!
>
>> ffffffff8110f605: e8 26 cb fc ff callq ffffffff810dc130
>> <trace_hardirqs_off>
>> ffffffff8110f60a: eb 17 jmp ffffffff8110f623
>> <del_timer_sync+0xb3>
>> ffffffff8110f60c: e8 2f cb fc ff callq ffffffff810dc140
>> <trace_hardirqs_on>
>> ffffffff8110f611: e8 da 6d 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f616: e8 d5 6d 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f61b: 4c 89 75 d8 mov %r14,-0x28(%rbp)
>> ffffffff8110f61f: ff 75 d8 pushq -0x28(%rbp)
>> ffffffff8110f622: 9d popfq
>
> puts r14 into -0x28(rbp) and pushes/pops that
>
>> ffffffff8110f623: e8 c8 6d 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f628: 65 8b 04 25 d4 ae 00 mov %gs:0xaed4,%eax
>> ffffffff8110f62f: 00
>> ffffffff8110f630: a9 00 00 0f 00 test $0xf0000,%eax
>> ffffffff8110f635: 74 25 je ffffffff8110f65c
>> <del_timer_sync+0xec>
>> ffffffff8110f637: f6 43 2a 20 testb $0x20,0x2a(%rbx)
>> ffffffff8110f63b: 75 1f jne ffffffff8110f65c
>> <del_timer_sync+0xec>
>> ffffffff8110f63d: 48 c7 c7 04 54 c5 81 mov $0xffffffff81c55404,%rdi
>> ffffffff8110f644: be 61 04 00 00 mov $0x461,%esi
>> ffffffff8110f649: e8 12 c4 f6 ff callq ffffffff8107ba60
>> <warn_slowpath_null>
>> ffffffff8110f64e: eb 0c jmp ffffffff8110f65c
>> <del_timer_sync+0xec>
>> ffffffff8110f650: e8 9b 6d 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f655: e8 96 6d 80 00 callq ffffffff819163f0 <mcount>
>> ffffffff8110f65a: f3 90 pause
>> ffffffff8110f65c: 48 89 df mov %rbx,%rdi
>> ffffffff8110f65f: e8 4c fe ff ff callq ffffffff8110f4b0
>> <try_to_del_timer_sync>
>> ffffffff8110f664: 85 c0 test %eax,%eax
>> ffffffff8110f666: 78 e8 js ffffffff8110f650
>> <del_timer_sync+0xe0>
>> ffffffff8110f668: 48 83 c4 28 add $0x28,%rsp
>> ffffffff8110f66c: 5b pop %rbx
>> ffffffff8110f66d: 41 5e pop %r14
>> ffffffff8110f66f: 41 5f pop %r15
>> ffffffff8110f671: 5d pop %rbp
>> ffffffff8110f672: c3 retq
>> ffffffff8110f673: 66 66 66 66 2e 0f 1f data32 data32 data32 nopw
>> %cs:0x0(%rax,%rax,1)
>> ffffffff8110f67a: 84 00 00 00 00 00
>
>
> That LLVM generate disgusting code, as a contrast, this is what my GCC
> makes of this:
>
> ffffffff81155f50 <del_timer_sync>:
> ffffffff81155f50: e8 bb 02 9c 00 callq ffffffff81b16210
> <__fentry__>
> ffffffff81155f55: 55 push %rbp
> ffffffff81155f56: 48 89 e5 mov %rsp,%rbp
> ffffffff81155f59: 41 55 push %r13
> ffffffff81155f5b: 41 54 push %r12
> ffffffff81155f5d: 53 push %rbx
> ffffffff81155f5e: 48 89 fb mov %rdi,%rbx
> ffffffff81155f61: 48 83 ec 08 sub $0x8,%rsp
> ffffffff81155f65: 9c pushfq
> ffffffff81155f66: 41 5c pop %r12
> ffffffff81155f68: fa cli
> ffffffff81155f69: e8 72 5c fd ff callq ffffffff8112bbe0
> <trace_hardirqs_off>
> ffffffff81155f6e: 4c 8d 6b 50 lea 0x50(%rbx),%r13
> ffffffff81155f72: 45 31 c9 xor %r9d,%r9d
> ffffffff81155f75: 31 c9 xor %ecx,%ecx
> ffffffff81155f77: 31 d2 xor %edx,%edx
> ffffffff81155f79: 31 f6 xor %esi,%esi
> ffffffff81155f7b: 4c 89 ef mov %r13,%rdi
> ffffffff81155f7e: 48 c7 04 24 55 5f 15 movq
> $0xffffffff81155f55,(%rsp)
> ffffffff81155f85: 81
> ffffffff81155f86: 41 b8 01 00 00 00 mov $0x1,%r8d
> ffffffff81155f8c: e8 3f b3 fd ff callq ffffffff811312d0
> <lock_acquire>
> ffffffff81155f91: 48 c7 c2 55 5f 15 81 mov
> $0xffffffff81155f55,%rdx
> ffffffff81155f98: be 01 00 00 00 mov $0x1,%esi
> ffffffff81155f9d: 4c 89 ef mov %r13,%rdi
> ffffffff81155fa0: e8 0b b5 fd ff callq ffffffff811314b0
> <lock_release>
> ffffffff81155fa5: 41 f7 c4 00 02 00 00 test $0x200,%r12d
> ffffffff81155fac: 75 52 jne ffffffff81156000
> <del_timer_sync+0xb0>
> ffffffff81155fae: 41 54 push %r12
> ffffffff81155fb0: 9d popfq
> ffffffff81155fb1: e8 2a 5c fd ff callq ffffffff8112bbe0
> <trace_hardirqs_off>
> ffffffff81155fb6: 65 8b 05 5b 66 eb 7e mov
> %gs:0x7eeb665b(%rip),%eax # c618 <__preempt_count>
> ffffffff81155fbd: a9 00 00 0f 00 test $0xf0000,%eax
> ffffffff81155fc2: 74 25 je ffffffff81155fe9
> <del_timer_sync+0x99>
> ffffffff81155fc4: f6 43 2a 20 testb $0x20,0x2a(%rbx)
> ffffffff81155fc8: 75 1f jne ffffffff81155fe9
> <del_timer_sync+0x99>
> ffffffff81155fca: be 61 04 00 00 mov $0x461,%esi
> ffffffff81155fcf: 48 c7 c7 09 87 f3 81 mov
> $0xffffffff81f38709,%rdi
> ffffffff81155fd6: e8 15 03 f8 ff callq ffffffff810d62f0
> <warn_slowpath_null>
> ffffffff81155fdb: 48 89 df mov %rbx,%rdi
> ffffffff81155fde: e8 fd fe ff ff callq ffffffff81155ee0
> <try_to_del_timer_sync>
> ffffffff81155fe3: 85 c0 test %eax,%eax
> ffffffff81155fe5: 79 0e jns ffffffff81155ff5
> <del_timer_sync+0xa5>
> ffffffff81155fe7: f3 90 pause
> ffffffff81155fe9: 48 89 df mov %rbx,%rdi
> ffffffff81155fec: e8 ef fe ff ff callq ffffffff81155ee0
> <try_to_del_timer_sync>
> ffffffff81155ff1: 85 c0 test %eax,%eax
> ffffffff81155ff3: 78 f2 js ffffffff81155fe7
> <del_timer_sync+0x97>
> ffffffff81155ff5: 48 83 c4 08 add $0x8,%rsp
> ffffffff81155ff9: 5b pop %rbx
> ffffffff81155ffa: 41 5c pop %r12
> ffffffff81155ffc: 41 5d pop %r13
> ffffffff81155ffe: 5d pop %rbp
> ffffffff81155fff: c3 retq
> ffffffff81156000: e8 4b 8b fd ff callq ffffffff8112eb50
> <trace_hardirqs_on>
> ffffffff81156005: 41 54 push %r12
> ffffffff81156007: 9d popfq
> ffffffff81156008: eb ac jmp ffffffff81155fb6
> <del_timer_sync+0x66>
> ffffffff8115600a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
>

OK, I cannot say much to your analysis.

The full make-lines look differently.

GCC for example has '-mfentry -DCC_USING_FENTRY'.

[ CLANG ]

mycompiler -Wp,-MD,kernel/.workqueue.o.d -nostdinc -isystem
/opt/llvm-toolchain-3.8.0rc3/bin/../lib/clang/3.8.0/include -nostdinc
-isystem /opt/llvm-toolchain-3.8.0rc3/bin/../lib/clang/3.8.0/include
-I./arch/x86/include -Iarch/x86/include/generated/uapi
-Iarch/x86/include/generated -Iinclude -I./arch/x86/include/uapi
-Iarch/x86/include/generated/uapi -I./include/uapi
-Iinclude/generated/uapi -include ./include/linux/kconfig.h
-D__KERNEL__ -Qunused-arguments -Wno-unknown-warning-option -Wall
-Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing
-fno-common -Werror-implicit-function-declaration -Wno-format-security
-std=gnu89 -no-integrated-as -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
-mno-avx -m64 -mtune=generic -mno-red-zone -mcmodel=kernel
-funit-at-a-time -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1
-DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1
-DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1
-DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe -Wno-sign-compare
-fno-asynchronous-unwind-tables -O2 -Wframe-larger-than=1024
-fno-stack-protector -Wno-unused-variable
-Wno-format-invalid-specifier -Wno-gnu -Wno-asm-operand-widths
-Wno-initializer-overrides -fno-builtin -Wno-tautological-compare
-mno-global-merge -fno-omit-frame-pointer -fno-optimize-sibling-calls
-pg -Wdeclaration-after-statement -Wno-pointer-sign
-fno-strict-overflow -Werror=implicit-int -Werror=strict-prototypes
-Werror=date-time -Wno-initializer-overrides -Wno-unused-value
-Wno-format -Wno-unknown-warning-option -Wno-sign-compare
-Wno-format-zero-length -Wno-uninitialized -D"KBUILD_STR(s)=#s"
-D"KBUILD_BASENAME=KBUILD_STR(workqueue)"
-D"KBUILD_MODNAME=KBUILD_STR(workqueue)" -c -o kernel/.tmp_workqueue.o
kernel/workqueue.c

mycompiler -Wp,-MD,kernel/time/.timer.o.d -nostdinc -isystem
/opt/llvm-toolchain-3.8.0rc3/bin/../lib/clang/3.8.0/include -nostdinc
-isystem /opt/llvm-toolchain-3.8.0rc3/bin/../lib/clang/3.8.0/include
-I./arch/x86/include -Iarch/x86/include/generated/uapi
-Iarch/x86/include/generated -Iinclude -I./arch/x86/include/uapi
-Iarch/x86/include/generated/uapi -I./include/uapi
-Iinclude/generated/uapi -include ./include/linux/kconfig.h
-D__KERNEL__ -Qunused-arguments -Wno-unknown-warning-option -Wall
-Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing
-fno-common -Werror-implicit-function-declaration -Wno-format-security
-std=gnu89 -no-integrated-as -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
-mno-avx -m64 -mtune=generic -mno-red-zone -mcmodel=kernel
-funit-at-a-time -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1
-DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1
-DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1
-DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe -Wno-sign-compare
-fno-asynchronous-unwind-tables -O2 -Wframe-larger-than=1024
-fno-stack-protector -Wno-unused-variable
-Wno-format-invalid-specifier -Wno-gnu -Wno-asm-operand-widths
-Wno-initializer-overrides -fno-builtin -Wno-tautological-compare
-mno-global-merge -fno-omit-frame-pointer -fno-optimize-sibling-calls
-pg -Wdeclaration-after-statement -Wno-pointer-sign
-fno-strict-overflow -Werror=implicit-int -Werror=strict-prototypes
-Werror=date-time -Wno-initializer-overrides -Wno-unused-value
-Wno-format -Wno-unknown-warning-option -Wno-sign-compare
-Wno-format-zero-length -Wno-uninitialized -D"KBUILD_STR(s)=#s"
-D"KBUILD_BASENAME=KBUILD_STR(timer)"
-D"KBUILD_MODNAME=KBUILD_STR(timer)" -c -o kernel/time/.tmp_timer.o
kernel/time/timer.c

[ GCC ]

mycompiler -Wp,-MD,kernel/.workqueue.o.d -nostdinc -isystem
/usr/lib/gcc/x86_64-linux-gnu/4.9/include -nostdinc -isystem
/usr/lib/gcc/x86_64-linux-gnu/4.9/include -I./arch/x86/include
-Iarch/x86/include/generated/uapi -Iarch/x86/include/generated
-Iinclude -I./arch/x86/include/uapi -Iarch/x86/include/generated/uapi
-I./include/uapi -Iinclude/generated/uapi -include
./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef
-Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
-Werror-implicit-function-declaration -Wno-format-security -std=gnu89
-mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1
-falign-loops=1 -mno-80387 -mno-fp-ret-in-387
-mpreferred-stack-boundary=3 -mtune=generic -mno-red-zone
-mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args
-DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1
-DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1
-DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe
-Wno-sign-compare -fno-asynchronous-unwind-tables
-fno-delete-null-pointer-checks -O2 --param=allow-store-data-races=0
-Wframe-larger-than=1024 -fno-stack-protector
-Wno-unused-but-set-variable -fno-omit-frame-pointer
-fno-optimize-sibling-calls -fno-var-tracking-assignments -pg -mfentry
-DCC_USING_FENTRY -Wdeclaration-after-statement -Wno-pointer-sign
-fno-strict-overflow -fconserve-stack -Werror=implicit-int
-Werror=strict-prototypes -Werror=date-time -DCC_HAVE_ASM_GOTO
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(workqueue)"
-D"KBUILD_MODNAME=KBUILD_STR(workqueue)" -c -o kernel/.tmp_workqueue.o
kernel/workqueue.c

mycompiler -Wp,-MD,kernel/time/.timer.o.d -nostdinc -isystem
/usr/lib/gcc/x86_64-linux-gnu/4.9/include -nostdinc -isystem
/usr/lib/gcc/x86_64-linux-gnu/4.9/include -I./arch/x86/include
-Iarch/x86/include/generated/uapi -Iarch/x86/include/generated
-Iinclude -I./arch/x86/include/uapi -Iarch/x86/include/generated/uapi
-I./include/uapi -Iinclude/generated/uapi -include
./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef
-Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
-Werror-implicit-function-declaration -Wno-format-security -std=gnu89
-mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1
-falign-loops=1 -mno-80387 -mno-fp-ret-in-387
-mpreferred-stack-boundary=3 -mtune=generic -mno-red-zone
-mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args
-DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1
-DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1
-DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe
-Wno-sign-compare -fno-asynchronous-unwind-tables
-fno-delete-null-pointer-checks -O2 --param=allow-store-data-races=0
-Wframe-larger-than=1024 -fno-stack-protector
-Wno-unused-but-set-variable -fno-omit-frame-pointer
-fno-optimize-sibling-calls -fno-var-tracking-assignments -pg -mfentry
-DCC_USING_FENTRY -Wdeclaration-after-statement -Wno-pointer-sign
-fno-strict-overflow -fconserve-stack -Werror=implicit-int
-Werror=strict-prototypes -Werror=date-time -DCC_HAVE_ASM_GOTO
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(timer)"
-D"KBUILD_MODNAME=KBUILD_STR(timer)" -c -o kernel/time/.tmp_timer.o
kernel/time/timer.c

I can try to use the make-line of GCC and re-compile.

- Sedat -

2016-03-02 16:42:56

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Wed, Mar 02, 2016 at 11:35:35AM -0500, Steven Rostedt wrote:
> On Wed, 2 Mar 2016 17:24:12 +0100
> Peter Zijlstra <[email protected]> wrote:
>
> > On Wed, Mar 02, 2016 at 04:53:36PM +0100, Sedat Dilek wrote:
> > > ffffffff8110f570 <del_timer_sync>:
> > > ffffffff8110f570: 55 push %rbp
> > > ffffffff8110f571: 48 89 e5 mov %rsp,%rbp
> > > ffffffff8110f574: 41 57 push %r15
> > > ffffffff8110f576: 41 56 push %r14
> > > ffffffff8110f578: 53 push %rbx
> > > ffffffff8110f579: 48 83 ec 28 sub $0x28,%rsp
> >
> > stack offset is 0x28 bytes [*]
>
> Actually, isn't it really 0x40 bytes? The stack pushed 3 words (8 bytes
> each) before doing the subtract. 0x28 == 40 bytes, 3 * 8 = 24,
> 40 + 24 = 64 == 0x40.

> > > ffffffff8110f5fd: 4c 89 75 d0 mov %r14,-0x30(%rbp)
> > > ffffffff8110f601: ff 75 d0 pushq -0x30(%rbp)
> > > ffffffff8110f604: 9d popfq
> >
> > put r14 into -0x30(rbp) and pushes/pops that, see [*] this is 8 bytes
> > over stack ?!
>
> But from rbp, the stack is 0x40 bytes. This may be fine.

Ah indeed.

2016-03-02 16:52:44

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On 3/2/16, Sedat Dilek <[email protected]> wrote:
> On 3/2/16, Peter Zijlstra <[email protected]> wrote:
>> On Wed, Mar 02, 2016 at 04:53:36PM +0100, Sedat Dilek wrote:
>>> ffffffff8110f570 <del_timer_sync>:
>>> ffffffff8110f570: 55 push %rbp
>>> ffffffff8110f571: 48 89 e5 mov %rsp,%rbp
>>> ffffffff8110f574: 41 57 push %r15
>>> ffffffff8110f576: 41 56 push %r14
>>> ffffffff8110f578: 53 push %rbx
>>> ffffffff8110f579: 48 83 ec 28 sub $0x28,%rsp
>>
>> stack offset is 0x28 bytes [*]
>>
>>> ffffffff8110f57d: 48 89 fb mov %rdi,%rbx
>>> ffffffff8110f580: e8 6b 6e 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f585: e8 66 6e 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f58a: e8 61 6e 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f58f: e8 5c 6e 80 00 callq ffffffff819163f0 <mcount>
>>
>> Your compiler is on drugs!
>>
>>> ffffffff8110f594: 9c pushfq
>>> ffffffff8110f595: 8f 45 e0 popq -0x20(%rbp)
>>
>> Saves flags in -0x20(%rbp)
>>
>>> ffffffff8110f598: 4c 8b 7d e0 mov -0x20(%rbp),%r15
>>
>> And in %r15
>>
>> /me wonders what's wrong with: popf %r15
>>
>>> ffffffff8110f59c: e8 4f 6e 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f5a1: e8 4a 6e 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f5a6: fa cli
>>> ffffffff8110f5a7: e8 84 cb fc ff callq ffffffff810dc130
>>> <trace_hardirqs_off>
>>> ffffffff8110f5ac: 4c 8d 73 50 lea 0x50(%rbx),%r14
>>> ffffffff8110f5b0: 48 c7 04 24 b0 f5 10 movq
>>> $0xffffffff8110f5b0,(%rsp)
>>> ffffffff8110f5b7: 81
>>> ffffffff8110f5b8: 31 f6 xor %esi,%esi
>>> ffffffff8110f5ba: 31 d2 xor %edx,%edx
>>> ffffffff8110f5bc: 31 c9 xor %ecx,%ecx
>>> ffffffff8110f5be: 41 b8 01 00 00 00 mov $0x1,%r8d
>>> ffffffff8110f5c4: 45 31 c9 xor %r9d,%r9d
>>> ffffffff8110f5c7: 4c 89 f7 mov %r14,%rdi
>>> ffffffff8110f5ca: e8 c1 e5 fc ff callq ffffffff810ddb90
>>> <lock_acquire>
>>> ffffffff8110f5cf: be 01 00 00 00 mov $0x1,%esi
>>> ffffffff8110f5d4: 48 c7 c2 cf f5 10 81 mov $0xffffffff8110f5cf,%rdx
>>> ffffffff8110f5db: 4c 89 f7 mov %r14,%rdi
>>> ffffffff8110f5de: e8 8d 08 fd ff callq ffffffff810dfe70
>>> <lock_release>
>>> ffffffff8110f5e3: e8 08 6e 80 00 callq ffffffff819163f0 <mcount>
>>
>>> ffffffff8110f5e8: 4c 89 f8 mov %r15,%rax
>>> ffffffff8110f5eb: 49 89 c6 mov %rax,%r14
>>
>> Moves r15 into r14 through rax
>>
>>
>>> ffffffff8110f5ee: f6 c4 02 test $0x2,%ah
>>> ffffffff8110f5f1: 75 19 jne ffffffff8110f60c
>>> <del_timer_sync+0x9c>
>>> ffffffff8110f5f3: e8 f8 6d 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f5f8: e8 f3 6d 80 00 callq ffffffff819163f0 <mcount>
>>
>>
>>> ffffffff8110f5fd: 4c 89 75 d0 mov %r14,-0x30(%rbp)
>>> ffffffff8110f601: ff 75 d0 pushq -0x30(%rbp)
>>> ffffffff8110f604: 9d popfq
>>
>> put r14 into -0x30(rbp) and pushes/pops that, see [*] this is 8 bytes
>> over stack ?!
>>
>>> ffffffff8110f605: e8 26 cb fc ff callq ffffffff810dc130
>>> <trace_hardirqs_off>
>>> ffffffff8110f60a: eb 17 jmp ffffffff8110f623
>>> <del_timer_sync+0xb3>
>>> ffffffff8110f60c: e8 2f cb fc ff callq ffffffff810dc140
>>> <trace_hardirqs_on>
>>> ffffffff8110f611: e8 da 6d 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f616: e8 d5 6d 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f61b: 4c 89 75 d8 mov %r14,-0x28(%rbp)
>>> ffffffff8110f61f: ff 75 d8 pushq -0x28(%rbp)
>>> ffffffff8110f622: 9d popfq
>>
>> puts r14 into -0x28(rbp) and pushes/pops that
>>
>>> ffffffff8110f623: e8 c8 6d 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f628: 65 8b 04 25 d4 ae 00 mov %gs:0xaed4,%eax
>>> ffffffff8110f62f: 00
>>> ffffffff8110f630: a9 00 00 0f 00 test $0xf0000,%eax
>>> ffffffff8110f635: 74 25 je ffffffff8110f65c
>>> <del_timer_sync+0xec>
>>> ffffffff8110f637: f6 43 2a 20 testb $0x20,0x2a(%rbx)
>>> ffffffff8110f63b: 75 1f jne ffffffff8110f65c
>>> <del_timer_sync+0xec>
>>> ffffffff8110f63d: 48 c7 c7 04 54 c5 81 mov $0xffffffff81c55404,%rdi
>>> ffffffff8110f644: be 61 04 00 00 mov $0x461,%esi
>>> ffffffff8110f649: e8 12 c4 f6 ff callq ffffffff8107ba60
>>> <warn_slowpath_null>
>>> ffffffff8110f64e: eb 0c jmp ffffffff8110f65c
>>> <del_timer_sync+0xec>
>>> ffffffff8110f650: e8 9b 6d 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f655: e8 96 6d 80 00 callq ffffffff819163f0 <mcount>
>>> ffffffff8110f65a: f3 90 pause
>>> ffffffff8110f65c: 48 89 df mov %rbx,%rdi
>>> ffffffff8110f65f: e8 4c fe ff ff callq ffffffff8110f4b0
>>> <try_to_del_timer_sync>
>>> ffffffff8110f664: 85 c0 test %eax,%eax
>>> ffffffff8110f666: 78 e8 js ffffffff8110f650
>>> <del_timer_sync+0xe0>
>>> ffffffff8110f668: 48 83 c4 28 add $0x28,%rsp
>>> ffffffff8110f66c: 5b pop %rbx
>>> ffffffff8110f66d: 41 5e pop %r14
>>> ffffffff8110f66f: 41 5f pop %r15
>>> ffffffff8110f671: 5d pop %rbp
>>> ffffffff8110f672: c3 retq
>>> ffffffff8110f673: 66 66 66 66 2e 0f 1f data32 data32 data32 nopw
>>> %cs:0x0(%rax,%rax,1)
>>> ffffffff8110f67a: 84 00 00 00 00 00
>>
>>
>> That LLVM generate disgusting code, as a contrast, this is what my GCC
>> makes of this:
>>
>> ffffffff81155f50 <del_timer_sync>:
>> ffffffff81155f50: e8 bb 02 9c 00 callq ffffffff81b16210
>> <__fentry__>
>> ffffffff81155f55: 55 push %rbp
>> ffffffff81155f56: 48 89 e5 mov %rsp,%rbp
>> ffffffff81155f59: 41 55 push %r13
>> ffffffff81155f5b: 41 54 push %r12
>> ffffffff81155f5d: 53 push %rbx
>> ffffffff81155f5e: 48 89 fb mov %rdi,%rbx
>> ffffffff81155f61: 48 83 ec 08 sub $0x8,%rsp
>> ffffffff81155f65: 9c pushfq
>> ffffffff81155f66: 41 5c pop %r12
>> ffffffff81155f68: fa cli
>> ffffffff81155f69: e8 72 5c fd ff callq ffffffff8112bbe0
>> <trace_hardirqs_off>
>> ffffffff81155f6e: 4c 8d 6b 50 lea 0x50(%rbx),%r13
>> ffffffff81155f72: 45 31 c9 xor %r9d,%r9d
>> ffffffff81155f75: 31 c9 xor %ecx,%ecx
>> ffffffff81155f77: 31 d2 xor %edx,%edx
>> ffffffff81155f79: 31 f6 xor %esi,%esi
>> ffffffff81155f7b: 4c 89 ef mov %r13,%rdi
>> ffffffff81155f7e: 48 c7 04 24 55 5f 15 movq
>> $0xffffffff81155f55,(%rsp)
>> ffffffff81155f85: 81
>> ffffffff81155f86: 41 b8 01 00 00 00 mov $0x1,%r8d
>> ffffffff81155f8c: e8 3f b3 fd ff callq ffffffff811312d0
>> <lock_acquire>
>> ffffffff81155f91: 48 c7 c2 55 5f 15 81 mov
>> $0xffffffff81155f55,%rdx
>> ffffffff81155f98: be 01 00 00 00 mov $0x1,%esi
>> ffffffff81155f9d: 4c 89 ef mov %r13,%rdi
>> ffffffff81155fa0: e8 0b b5 fd ff callq ffffffff811314b0
>> <lock_release>
>> ffffffff81155fa5: 41 f7 c4 00 02 00 00 test $0x200,%r12d
>> ffffffff81155fac: 75 52 jne ffffffff81156000
>> <del_timer_sync+0xb0>
>> ffffffff81155fae: 41 54 push %r12
>> ffffffff81155fb0: 9d popfq
>> ffffffff81155fb1: e8 2a 5c fd ff callq ffffffff8112bbe0
>> <trace_hardirqs_off>
>> ffffffff81155fb6: 65 8b 05 5b 66 eb 7e mov
>> %gs:0x7eeb665b(%rip),%eax # c618 <__preempt_count>
>> ffffffff81155fbd: a9 00 00 0f 00 test $0xf0000,%eax
>> ffffffff81155fc2: 74 25 je ffffffff81155fe9
>> <del_timer_sync+0x99>
>> ffffffff81155fc4: f6 43 2a 20 testb $0x20,0x2a(%rbx)
>> ffffffff81155fc8: 75 1f jne ffffffff81155fe9
>> <del_timer_sync+0x99>
>> ffffffff81155fca: be 61 04 00 00 mov $0x461,%esi
>> ffffffff81155fcf: 48 c7 c7 09 87 f3 81 mov
>> $0xffffffff81f38709,%rdi
>> ffffffff81155fd6: e8 15 03 f8 ff callq ffffffff810d62f0
>> <warn_slowpath_null>
>> ffffffff81155fdb: 48 89 df mov %rbx,%rdi
>> ffffffff81155fde: e8 fd fe ff ff callq ffffffff81155ee0
>> <try_to_del_timer_sync>
>> ffffffff81155fe3: 85 c0 test %eax,%eax
>> ffffffff81155fe5: 79 0e jns ffffffff81155ff5
>> <del_timer_sync+0xa5>
>> ffffffff81155fe7: f3 90 pause
>> ffffffff81155fe9: 48 89 df mov %rbx,%rdi
>> ffffffff81155fec: e8 ef fe ff ff callq ffffffff81155ee0
>> <try_to_del_timer_sync>
>> ffffffff81155ff1: 85 c0 test %eax,%eax
>> ffffffff81155ff3: 78 f2 js ffffffff81155fe7
>> <del_timer_sync+0x97>
>> ffffffff81155ff5: 48 83 c4 08 add $0x8,%rsp
>> ffffffff81155ff9: 5b pop %rbx
>> ffffffff81155ffa: 41 5c pop %r12
>> ffffffff81155ffc: 41 5d pop %r13
>> ffffffff81155ffe: 5d pop %rbp
>> ffffffff81155fff: c3 retq
>> ffffffff81156000: e8 4b 8b fd ff callq ffffffff8112eb50
>> <trace_hardirqs_on>
>> ffffffff81156005: 41 54 push %r12
>> ffffffff81156007: 9d popfq
>> ffffffff81156008: eb ac jmp ffffffff81155fb6
>> <del_timer_sync+0x66>
>> ffffffff8115600a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
>>
>
> OK, I cannot say much to your analysis.
>
> The full make-lines look differently.
>
> GCC for example has '-mfentry -DCC_USING_FENTRY'.
>
> [ CLANG ]
>
> mycompiler -Wp,-MD,kernel/.workqueue.o.d -nostdinc -isystem
> /opt/llvm-toolchain-3.8.0rc3/bin/../lib/clang/3.8.0/include -nostdinc
> -isystem /opt/llvm-toolchain-3.8.0rc3/bin/../lib/clang/3.8.0/include
> -I./arch/x86/include -Iarch/x86/include/generated/uapi
> -Iarch/x86/include/generated -Iinclude -I./arch/x86/include/uapi
> -Iarch/x86/include/generated/uapi -I./include/uapi
> -Iinclude/generated/uapi -include ./include/linux/kconfig.h
> -D__KERNEL__ -Qunused-arguments -Wno-unknown-warning-option -Wall
> -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing
> -fno-common -Werror-implicit-function-declaration -Wno-format-security
> -std=gnu89 -no-integrated-as -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
> -mno-avx -m64 -mtune=generic -mno-red-zone -mcmodel=kernel
> -funit-at-a-time -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1
> -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1
> -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1
> -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe -Wno-sign-compare
> -fno-asynchronous-unwind-tables -O2 -Wframe-larger-than=1024
> -fno-stack-protector -Wno-unused-variable
> -Wno-format-invalid-specifier -Wno-gnu -Wno-asm-operand-widths
> -Wno-initializer-overrides -fno-builtin -Wno-tautological-compare
> -mno-global-merge -fno-omit-frame-pointer -fno-optimize-sibling-calls
> -pg -Wdeclaration-after-statement -Wno-pointer-sign
> -fno-strict-overflow -Werror=implicit-int -Werror=strict-prototypes
> -Werror=date-time -Wno-initializer-overrides -Wno-unused-value
> -Wno-format -Wno-unknown-warning-option -Wno-sign-compare
> -Wno-format-zero-length -Wno-uninitialized -D"KBUILD_STR(s)=#s"
> -D"KBUILD_BASENAME=KBUILD_STR(workqueue)"
> -D"KBUILD_MODNAME=KBUILD_STR(workqueue)" -c -o kernel/.tmp_workqueue.o
> kernel/workqueue.c
>
> mycompiler -Wp,-MD,kernel/time/.timer.o.d -nostdinc -isystem
> /opt/llvm-toolchain-3.8.0rc3/bin/../lib/clang/3.8.0/include -nostdinc
> -isystem /opt/llvm-toolchain-3.8.0rc3/bin/../lib/clang/3.8.0/include
> -I./arch/x86/include -Iarch/x86/include/generated/uapi
> -Iarch/x86/include/generated -Iinclude -I./arch/x86/include/uapi
> -Iarch/x86/include/generated/uapi -I./include/uapi
> -Iinclude/generated/uapi -include ./include/linux/kconfig.h
> -D__KERNEL__ -Qunused-arguments -Wno-unknown-warning-option -Wall
> -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing
> -fno-common -Werror-implicit-function-declaration -Wno-format-security
> -std=gnu89 -no-integrated-as -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
> -mno-avx -m64 -mtune=generic -mno-red-zone -mcmodel=kernel
> -funit-at-a-time -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1
> -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1
> -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1
> -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe -Wno-sign-compare
> -fno-asynchronous-unwind-tables -O2 -Wframe-larger-than=1024
> -fno-stack-protector -Wno-unused-variable
> -Wno-format-invalid-specifier -Wno-gnu -Wno-asm-operand-widths
> -Wno-initializer-overrides -fno-builtin -Wno-tautological-compare
> -mno-global-merge -fno-omit-frame-pointer -fno-optimize-sibling-calls
> -pg -Wdeclaration-after-statement -Wno-pointer-sign
> -fno-strict-overflow -Werror=implicit-int -Werror=strict-prototypes
> -Werror=date-time -Wno-initializer-overrides -Wno-unused-value
> -Wno-format -Wno-unknown-warning-option -Wno-sign-compare
> -Wno-format-zero-length -Wno-uninitialized -D"KBUILD_STR(s)=#s"
> -D"KBUILD_BASENAME=KBUILD_STR(timer)"
> -D"KBUILD_MODNAME=KBUILD_STR(timer)" -c -o kernel/time/.tmp_timer.o
> kernel/time/timer.c
>
> [ GCC ]
>
> mycompiler -Wp,-MD,kernel/.workqueue.o.d -nostdinc -isystem
> /usr/lib/gcc/x86_64-linux-gnu/4.9/include -nostdinc -isystem
> /usr/lib/gcc/x86_64-linux-gnu/4.9/include -I./arch/x86/include
> -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated
> -Iinclude -I./arch/x86/include/uapi -Iarch/x86/include/generated/uapi
> -I./include/uapi -Iinclude/generated/uapi -include
> ./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef
> -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
> -Werror-implicit-function-declaration -Wno-format-security -std=gnu89
> -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1
> -falign-loops=1 -mno-80387 -mno-fp-ret-in-387
> -mpreferred-stack-boundary=3 -mtune=generic -mno-red-zone
> -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args
> -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1
> -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1
> -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe
> -Wno-sign-compare -fno-asynchronous-unwind-tables
> -fno-delete-null-pointer-checks -O2 --param=allow-store-data-races=0
> -Wframe-larger-than=1024 -fno-stack-protector
> -Wno-unused-but-set-variable -fno-omit-frame-pointer
> -fno-optimize-sibling-calls -fno-var-tracking-assignments -pg -mfentry
> -DCC_USING_FENTRY -Wdeclaration-after-statement -Wno-pointer-sign
> -fno-strict-overflow -fconserve-stack -Werror=implicit-int
> -Werror=strict-prototypes -Werror=date-time -DCC_HAVE_ASM_GOTO
> -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(workqueue)"
> -D"KBUILD_MODNAME=KBUILD_STR(workqueue)" -c -o kernel/.tmp_workqueue.o
> kernel/workqueue.c
>
> mycompiler -Wp,-MD,kernel/time/.timer.o.d -nostdinc -isystem
> /usr/lib/gcc/x86_64-linux-gnu/4.9/include -nostdinc -isystem
> /usr/lib/gcc/x86_64-linux-gnu/4.9/include -I./arch/x86/include
> -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated
> -Iinclude -I./arch/x86/include/uapi -Iarch/x86/include/generated/uapi
> -I./include/uapi -Iinclude/generated/uapi -include
> ./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef
> -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
> -Werror-implicit-function-declaration -Wno-format-security -std=gnu89
> -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1
> -falign-loops=1 -mno-80387 -mno-fp-ret-in-387
> -mpreferred-stack-boundary=3 -mtune=generic -mno-red-zone
> -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args
> -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1
> -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1
> -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe
> -Wno-sign-compare -fno-asynchronous-unwind-tables
> -fno-delete-null-pointer-checks -O2 --param=allow-store-data-races=0
> -Wframe-larger-than=1024 -fno-stack-protector
> -Wno-unused-but-set-variable -fno-omit-frame-pointer
> -fno-optimize-sibling-calls -fno-var-tracking-assignments -pg -mfentry
> -DCC_USING_FENTRY -Wdeclaration-after-statement -Wno-pointer-sign
> -fno-strict-overflow -fconserve-stack -Werror=implicit-int
> -Werror=strict-prototypes -Werror=date-time -DCC_HAVE_ASM_GOTO
> -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(timer)"
> -D"KBUILD_MODNAME=KBUILD_STR(timer)" -c -o kernel/time/.tmp_timer.o
> kernel/time/timer.c
>
> I can try to use the make-line of GCC and re-compile.
>

Look here...

+ mycompiler -Wp,-MD,kernel/time/.timer.o.d -nostdinc -isystem
/usr/lib/gcc/x86_64-linux-gnu/4.9/include -nostdinc -isystem
/usr/lib/gcc/x86_64-linux-gnu/4.9/include -I./arch/x86/include
-Iarch/x86/include/generated/uapi -Iarch/x86/include/generated
-Iinclude -I./arch/x86/include/uapi -Iarch/x86/include/generated/uapi
-I./include/uapi -Iinclude/generated/uapi -include
./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef
-Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
-Werror-implicit-function-declaration -Wno-format-security -std=gnu89
-mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1
-falign-loops=1 -mno-80387 -mno-fp-ret-in-387
-mpreferred-stack-boundary=3 -mtune=generic -mno-red-zone
-mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args
-DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1
-DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1
-DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe
-Wno-sign-compare -fno-asynchronous-unwind-tables
-fno-delete-null-pointer-checks -O2 --param=allow-store-data-races=0
-Wframe-larger-than=1024 -fno-stack-protector
-Wno-unused-but-set-variable -fno-omit-frame-pointer
-fno-optimize-sibling-calls -fno-var-tracking-assignments -pg -mfentry
-DCC_USING_FENTRY -Wdeclaration-after-statement -Wno-pointer-sign
-fno-strict-overflow -fconserve-stack -Werror=implicit-int
-Werror=strict-prototypes -Werror=date-time -DCC_HAVE_ASM_GOTO
-DKBUILD_STR(s)=#s -DKBUILD_BASENAME=KBUILD_STR(timer)
-DKBUILD_MODNAME=KBUILD_STR(timer) -c -o kernel/time/.tmp_timer.o
kernel/time/timer.c
clang-3.8: error: unknown argument: '-mno-80387'
clang-3.8: error: unknown argument: '-mno-fp-ret-in-387'
clang-3.8: error: unknown argument: '-mpreferred-stack-boundary=3'
clang-3.8: error: unknown argument: '-maccumulate-outgoing-args'
clang-3.8: error: unknown argument: '-fno-var-tracking-assignments'
clang-3.8: error: unknown argument: '-mfentry'
clang-3.8: error: unknown argument: '-fconserve-stack'
clang-3.8: warning: optimization flag '-falign-jumps=1' is not supported
clang-3.8: warning: optimization flag '-falign-loops=1' is not supported
clang-3.8: warning: optimization flag
'-fno-delete-null-pointer-checks' is not supported
+ ld -m elf_x86_64 -r kernel/time/.tmp_timer.o -o kernel/time/timer.o
ld: cannot find kernel/time/.tmp_timer.o: No such file or directory

- Sedat -

2016-03-04 16:04:32

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Wed, 2 Mar 2016, Sedat Dilek wrote:

> On 3/1/16, Alan Stern <[email protected]> wrote:
> > On Tue, 1 Mar 2016, Sedat Dilek wrote:
> >
> >> On Tue, Oct 13, 2015 at 2:57 AM, Steven Rostedt <[email protected]>
> >> wrote:
> >> > On Sat, 3 Oct 2015 12:05:42 +0200
> >> > Sedat Dilek <[email protected]> wrote:
> >> >
> >> >> So, at the beginning... dunno WTF is causing the problems - no
> >> >> workaround for CLANG.
> >> >
> >> > Probably need to compile with gcc and with clang and look at the binary
> >> > differences. Or at least what objdump shows.
> >> >
> >>
> >> [ Hope to address this issue to the correct people - CCed some people
> >> I taped on their nerves ]
> >>
> >> Not sure if I should open a new thread?
> >> Please, some clear statements on this.
> >> Thanks.
> >>
> >> The issue is still visible and alive.

I think it would be worthwhile to doublecheck the time at which
interrupts get disabled. Sedat, please try your plug/unplug the USB
mouse test with the patch below.

Alan Stern



Index: usb-4.4/drivers/hid/usbhid/hid-core.c
===================================================================
--- usb-4.4.orig/drivers/hid/usbhid/hid-core.c
+++ usb-4.4/drivers/hid/usbhid/hid-core.c
@@ -1393,8 +1393,11 @@ static void usbhid_disconnect(struct usb

static void hid_cancel_delayed_stuff(struct usbhid_device *usbhid)
{
+ if (raw_irqs_disabled()) pr_info("usbhid irqs disabled A\n");
del_timer_sync(&usbhid->io_retry);
+ if (raw_irqs_disabled()) pr_info("usbhid irqs disabled B\n");
cancel_work_sync(&usbhid->reset_work);
+ if (raw_irqs_disabled()) pr_info("usbhid irqs disabled C\n");
}

static void hid_cease_io(struct usbhid_device *usbhid)

2016-03-05 22:31:03

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Fri, Mar 4, 2016 at 5:04 PM, Alan Stern <[email protected]> wrote:
> On Wed, 2 Mar 2016, Sedat Dilek wrote:
>
>> On 3/1/16, Alan Stern <[email protected]> wrote:
>> > On Tue, 1 Mar 2016, Sedat Dilek wrote:
>> >
>> >> On Tue, Oct 13, 2015 at 2:57 AM, Steven Rostedt <[email protected]>
>> >> wrote:
>> >> > On Sat, 3 Oct 2015 12:05:42 +0200
>> >> > Sedat Dilek <[email protected]> wrote:
>> >> >
>> >> >> So, at the beginning... dunno WTF is causing the problems - no
>> >> >> workaround for CLANG.
>> >> >
>> >> > Probably need to compile with gcc and with clang and look at the binary
>> >> > differences. Or at least what objdump shows.
>> >> >
>> >>
>> >> [ Hope to address this issue to the correct people - CCed some people
>> >> I taped on their nerves ]
>> >>
>> >> Not sure if I should open a new thread?
>> >> Please, some clear statements on this.
>> >> Thanks.
>> >>
>> >> The issue is still visible and alive.
>
> I think it would be worthwhile to doublecheck the time at which
> interrupts get disabled. Sedat, please try your plug/unplug the USB
> mouse test with the patch below.
>
> Alan Stern
>
>
>
> Index: usb-4.4/drivers/hid/usbhid/hid-core.c
> ===================================================================
> --- usb-4.4.orig/drivers/hid/usbhid/hid-core.c
> +++ usb-4.4/drivers/hid/usbhid/hid-core.c
> @@ -1393,8 +1393,11 @@ static void usbhid_disconnect(struct usb
>
> static void hid_cancel_delayed_stuff(struct usbhid_device *usbhid)
> {
> + if (raw_irqs_disabled()) pr_info("usbhid irqs disabled A\n");
> del_timer_sync(&usbhid->io_retry);
> + if (raw_irqs_disabled()) pr_info("usbhid irqs disabled B\n");
> cancel_work_sync(&usbhid->reset_work);
> + if (raw_irqs_disabled()) pr_info("usbhid irqs disabled C\n");
> }
>
> static void hid_cease_io(struct usbhid_device *usbhid)
>

With your patch I get the dmesg attached.

- Sedat -


Attachments:
dmesg_4.4.4-3-llvmlinux-amd64_usbmouse-unplugged-and-replugged.txt (65.16 kB)
config-4.4.4-3-llvmlinux-amd64 (126.80 kB)
4.4.4-3-llvmlinux-amd64.patch (30.72 kB)
Download all attachments

2016-03-06 08:52:00

by Sedat Dilek

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Sat, Mar 5, 2016 at 11:30 PM, Sedat Dilek <[email protected]> wrote:
> On Fri, Mar 4, 2016 at 5:04 PM, Alan Stern <[email protected]> wrote:
>> On Wed, 2 Mar 2016, Sedat Dilek wrote:
>>
>>> On 3/1/16, Alan Stern <[email protected]> wrote:
>>> > On Tue, 1 Mar 2016, Sedat Dilek wrote:
>>> >
>>> >> On Tue, Oct 13, 2015 at 2:57 AM, Steven Rostedt <[email protected]>
>>> >> wrote:
>>> >> > On Sat, 3 Oct 2015 12:05:42 +0200
>>> >> > Sedat Dilek <[email protected]> wrote:
>>> >> >
>>> >> >> So, at the beginning... dunno WTF is causing the problems - no
>>> >> >> workaround for CLANG.
>>> >> >
>>> >> > Probably need to compile with gcc and with clang and look at the binary
>>> >> > differences. Or at least what objdump shows.
>>> >> >
>>> >>
>>> >> [ Hope to address this issue to the correct people - CCed some people
>>> >> I taped on their nerves ]
>>> >>
>>> >> Not sure if I should open a new thread?
>>> >> Please, some clear statements on this.
>>> >> Thanks.
>>> >>
>>> >> The issue is still visible and alive.
>>
>> I think it would be worthwhile to doublecheck the time at which
>> interrupts get disabled. Sedat, please try your plug/unplug the USB
>> mouse test with the patch below.
>>
>> Alan Stern
>>
>>
>>
>> Index: usb-4.4/drivers/hid/usbhid/hid-core.c
>> ===================================================================
>> --- usb-4.4.orig/drivers/hid/usbhid/hid-core.c
>> +++ usb-4.4/drivers/hid/usbhid/hid-core.c
>> @@ -1393,8 +1393,11 @@ static void usbhid_disconnect(struct usb
>>
>> static void hid_cancel_delayed_stuff(struct usbhid_device *usbhid)
>> {
>> + if (raw_irqs_disabled()) pr_info("usbhid irqs disabled A\n");
>> del_timer_sync(&usbhid->io_retry);
>> + if (raw_irqs_disabled()) pr_info("usbhid irqs disabled B\n");
>> cancel_work_sync(&usbhid->reset_work);
>> + if (raw_irqs_disabled()) pr_info("usbhid irqs disabled C\n");
>> }
>>
>> static void hid_cease_io(struct usbhid_device *usbhid)
>>
>
> With your patch I get the dmesg attached.
>

[ CC PaX and David Woodhouse ]

I remembered that CONFIG_FTRACE is not working correct (details see
[1] as explained by PaX).
Thus I disabled it (see 4.4.4-2-llvmlinux-amd64).
That 'mcount' bogus lines are gone.

In a 2nd test-build I turned off CONFIG_PROVE_LOCKING en-plus as it
enables CONFIG_TRACE_IRQFLAGS (4.4.4-4-llvmlinux-amd64).

The call-traces when booting and/or un-/re-plugging my USB-mouse are
still alive in all built clang-compiled Linux-kernels.

dmesg-output for -4 kernel is attached, it includes the usbhid
test-patch from Alan Stern (plus some other CC:stable fixes against
Linux v4.4.4).

Hope this helps.

- Sedat -

[1] https://lists.linuxfoundation.org/pipermail/llvmlinux/2013-July/000429.html
[2] http://cateee.net/lkddb/web-lkddb/FTRACE.html
[3] http://cateee.net/lkddb/web-lkddb/TRACE_IRQFLAGS.html
[4] http://cateee.net/lkddb/web-lkddb/PROVE_LOCKING.html
[5] http://cateee.net/lkddb/web-lkddb/TRACE_IRQFLAGS_SUPPORT.html


Attachments:
objdump-D_vmlinux_del_timer_sync_4.4.4-2-llvmlinux-amd64.txt (3.83 kB)
config-4.4.4-2-llvmlinux-amd64 (125.64 kB)
config-4.4.4-4-llvmlinux-amd64 (125.60 kB)
objdump-D_vmlinux_del_timer_sync_4.4.4-4-llvmlinux-amd64.txt (2.97 kB)
dmesg_4.4.4-4-llvmlinux-amd64_usbmouse-unplugged-and-replugged.txt (73.97 kB)
4.4.4-4-llvmlinux-amd64.patch (31.64 kB)
Download all attachments

2016-03-06 17:23:13

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] usbhid: Fix lockdep unannotated irqs-off warning

On Sat, 5 Mar 2016, Sedat Dilek wrote:

> On Fri, Mar 4, 2016 at 5:04 PM, Alan Stern <[email protected]> wrote:
> > On Wed, 2 Mar 2016, Sedat Dilek wrote:
> >
> >> On 3/1/16, Alan Stern <[email protected]> wrote:
> >> > On Tue, 1 Mar 2016, Sedat Dilek wrote:
> >> >
> >> >> On Tue, Oct 13, 2015 at 2:57 AM, Steven Rostedt <[email protected]>
> >> >> wrote:
> >> >> > On Sat, 3 Oct 2015 12:05:42 +0200
> >> >> > Sedat Dilek <[email protected]> wrote:
> >> >> >
> >> >> >> So, at the beginning... dunno WTF is causing the problems - no
> >> >> >> workaround for CLANG.
> >> >> >
> >> >> > Probably need to compile with gcc and with clang and look at the binary
> >> >> > differences. Or at least what objdump shows.
> >> >> >
> >> >>
> >> >> [ Hope to address this issue to the correct people - CCed some people
> >> >> I taped on their nerves ]
> >> >>
> >> >> Not sure if I should open a new thread?
> >> >> Please, some clear statements on this.
> >> >> Thanks.
> >> >>
> >> >> The issue is still visible and alive.
> >
> > I think it would be worthwhile to doublecheck the time at which
> > interrupts get disabled. Sedat, please try your plug/unplug the USB
> > mouse test with the patch below.
> >
> > Alan Stern
> >
> >
> >
> > Index: usb-4.4/drivers/hid/usbhid/hid-core.c
> > ===================================================================
> > --- usb-4.4.orig/drivers/hid/usbhid/hid-core.c
> > +++ usb-4.4/drivers/hid/usbhid/hid-core.c
> > @@ -1393,8 +1393,11 @@ static void usbhid_disconnect(struct usb
> >
> > static void hid_cancel_delayed_stuff(struct usbhid_device *usbhid)
> > {
> > + if (raw_irqs_disabled()) pr_info("usbhid irqs disabled A\n");
> > del_timer_sync(&usbhid->io_retry);
> > + if (raw_irqs_disabled()) pr_info("usbhid irqs disabled B\n");
> > cancel_work_sync(&usbhid->reset_work);
> > + if (raw_irqs_disabled()) pr_info("usbhid irqs disabled C\n");
> > }
> >
> > static void hid_cease_io(struct usbhid_device *usbhid)
> >
>
> With your patch I get the dmesg attached.

> [ 22.234758] usbhid irqs disabled A
> [ 22.234857] usbhid irqs disabled B
> [ 22.234912] BUG: sleeping function called from invalid context atkernel/workqueue.c:2688

That's a smoking gun. It means everyone has been looking in the wrong
place. Can you provide an objdump listing of usbhid_close()? The
routine starts like this:

void usbhid_close(struct hid_device *hid)
{
struct usbhid_device *usbhid = hid->driver_data;

mutex_lock(&hid_open_mut);

/* protecting hid->open to make sure we don't restart
* data acquistion due to a resumption we no longer
* care about
*/
spin_lock_irq(&usbhid->lock);
if (!--hid->open) {
spin_unlock_irq(&usbhid->lock);
hid_cancel_delayed_stuff(usbhid);

It appears that the spin_unlock_irq() call isn't working.

For extra thoroughness, try putting one of those raw_irqs_disabled()
checks just before and one just after the spin_lock_irq() line above.
Maybe also before the mutex_lock() line.

Alan Stern