2015-04-23 17:07:49

by Josh Boyer

[permalink] [raw]
Subject: Suspicious RCU usage in bridge with Linux v4.0-9362-g1fc149933fd4

Hi All,

We've had a user report the following backtrace from the bridge module
with a recent Linus' tree. Has anything like this been reported yet?
If you have any questions on setup, the user is CC'd.

josh

[ 29.382235] br0: port 1(tap0) entered forwarding state

[ 29.382286] ===============================
[ 29.382315] [ INFO: suspicious RCU usage. ]
[ 29.382344] 4.1.0-0.rc0.git11.1.fc23.x86_64 #1 Not tainted
[ 29.382380] -------------------------------
[ 29.382409] net/bridge/br_private.h:626 suspicious
rcu_dereference_check() usage!
[ 29.382455]
other info that might help us debug this:

[ 29.382507]
rcu_scheduler_active = 1, debug_locks = 0
[ 29.382549] 2 locks held by swapper/0/0:
[ 29.382576] #0: (((&p->forward_delay_timer))){+.-...}, at:
[<ffffffff81139f75>] call_timer_fn+0x5/0x4f0
[ 29.382660] #1: (&(&br->lock)->rlock){+.-...}, at:
[<ffffffffa0450dc1>] br_forward_delay_timer_expired+0x31/0x140
[bridge]
[ 29.382754]
stack backtrace:
[ 29.382787] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.1.0-0.rc0.git11.1.fc23.x86_64 #1
[ 29.382838] Hardware name: LENOVO 422916G/LENOVO, BIOS A1KT53AUS 04/07/2015
[ 29.382882] 0000000000000000 3ebfc20364115825 ffff880666603c48
ffffffff81892d4b
[ 29.382943] 0000000000000000 ffffffff81e124e0 ffff880666603c78
ffffffff8110bcd7
[ 29.383004] ffff8800785c9d00 ffff88065485ac58 ffff880c62002800
ffff880c5fc88ac0
[ 29.383065] Call Trace:
[ 29.383084] <IRQ> [<ffffffff81892d4b>] dump_stack+0x4c/0x65
[ 29.383130] [<ffffffff8110bcd7>] lockdep_rcu_suspicious+0xe7/0x120
[ 29.383178] [<ffffffffa04520f9>] br_fill_ifinfo+0x4a9/0x6a0 [bridge]
[ 29.383225] [<ffffffffa045266b>] br_ifinfo_notify+0x11b/0x4b0 [bridge]
[ 29.383271] [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
[ 29.383320] [<ffffffffa0450de8>]
br_forward_delay_timer_expired+0x58/0x140 [bridge]
[ 29.383371] [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
[ 29.383416] [<ffffffff8113a033>] call_timer_fn+0xc3/0x4f0
[ 29.383454] [<ffffffff81139f75>] ? call_timer_fn+0x5/0x4f0
[ 29.383493] [<ffffffff8110a90f>] ? lock_release_holdtime.part.29+0xf/0x200
[ 29.383541] [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
[ 29.383587] [<ffffffff8113a6a4>] run_timer_softirq+0x244/0x490
[ 29.383629] [<ffffffff810b68cc>] __do_softirq+0xec/0x670
[ 29.383666] [<ffffffff810b70d5>] irq_exit+0x145/0x150
[ 29.383703] [<ffffffff8189f506>] smp_apic_timer_interrupt+0x46/0x60
[ 29.383744] [<ffffffff8189d523>] apic_timer_interrupt+0x73/0x80
[ 29.383782] <EOI> [<ffffffff816f131f>] ? cpuidle_enter_state+0x5f/0x2f0
[ 29.383832] [<ffffffff816f131b>] ? cpuidle_enter_state+0x5b/0x2f0
[ 29.383873] [<ffffffff816f15e7>] cpuidle_enter+0x17/0x20
[ 29.383908] [<ffffffff81103c6f>] cpu_startup_entry+0x36f/0x5f0
[ 29.383949] [<ffffffff81887cbd>] rest_init+0x13d/0x150
[ 29.383986] [<ffffffff821c905b>] start_kernel+0x4d2/0x4f3
[ 29.384023] [<ffffffff821c8120>] ? early_idt_handlers+0x120/0x120
[ 29.384064] [<ffffffff821c8339>] x86_64_start_reservations+0x2a/0x2c
[ 29.384105] [<ffffffff821c8485>] x86_64_start_kernel+0x14a/0x16d
[ 36.100347] IN=br0 OUT=
MAC=33:33:00:00:00:02:52:54:00:12:01:00:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe12:0100
DST=ff02:0000:0000:0000:0000:0000:0000:0002 LEN=56 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0
[ 40.113744] IN=br0 OUT=
MAC=33:33:00:00:00:02:52:54:00:12:01:00:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe12:0100
DST=ff02:0000:0000:0000:0000:0000:0000:0002 LEN=56 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0
[ 44.119727] IN=br0 OUT=
MAC=33:33:00:00:00:02:52:54:00:12:01:00:86:dd
SRC=fe80:0000:0000:0000:5054:00ff:fe12:0100
DST=ff02:0000:0000:0000:0000:0000:0000:0002 LEN=56 TC=0 HOPLIMIT=255
FLOWLBL=0 PROTO=ICMPv6 TYPE=133 CODE=0


2015-04-23 17:35:11

by Sudeep Holla

[permalink] [raw]
Subject: Re: Suspicious RCU usage in bridge with Linux v4.0-9362-g1fc149933fd4

On Thu, Apr 23, 2015 at 6:07 PM, Josh Boyer <[email protected]> wrote:
> Hi All,
>
> We've had a user report the following backtrace from the bridge module
> with a recent Linus' tree. Has anything like this been reported yet?
> If you have any questions on setup, the user is CC'd.
>

I too observed similar backtrace once(not able to reproduce it again)
while I was trying to check inconsistent lock state[1] with lockdep enabled.

Regards,
Sudeep

[1] https://lkml.org/lkml/2015/4/23/329


--->8

===============================
[ INFO: suspicious RCU usage. ]
4.0.0 #269 Not tainted
-------------------------------
include/trace/events/ipi.h:68 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0 #269
Hardware name: ARM-Versatile Express
[<c00151f1>] (unwind_backtrace) from [<c0011971>] (show_stack+0x11/0x14)
[<c0011971>] (show_stack) from [<c05627c7>] (dump_stack+0x73/0x8c)
[<c05627c7>] (dump_stack) from [<c0013e63>] (handle_IPI+0x257/0x410)
[<c0013e63>] (handle_IPI) from [<c000932b>] (gic_handle_irq+0x4f/0x50)
[<c000932b>] (gic_handle_irq) from [<c00121ff>] (__irq_svc+0x3f/0x64)
Exception stack(0xc083bf30 to 0xc083bf78)
bf20: 00000001 00000001 00000000 c0841938
bf40: c083a000 c083d5fc c08ba5c8 c083d598 00000000 00000000 c08b938f c056afc8
bf60: 00000008 c083bf78 c005d157 c000f288 40000133 ffffffff
[<c00121ff>] (__irq_svc) from [<c000f288>] (arch_cpu_idle+0x30/0x34)
[<c000f288>] (arch_cpu_idle) from [<c00555d5>] (cpu_startup_entry+0x339/0x404)
[<c00555d5>] (cpu_startup_entry) from [<c07d1a4f>] (start_kernel+0x32f/0x338)
[<c07d1a4f>] (start_kernel) from [<8000807f>] (0x8000807f)

===============================
[ INFO: suspicious RCU usage. ]
4.0.0 #269 Not tainted
-------------------------------
include/trace/events/ipi.h:84 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.0 #269
Hardware name: ARM-Versatile Express
[<c00151f1>] (unwind_backtrace) from [<c0011971>] (show_stack+0x11/0x14)
[<c0011971>] (show_stack) from [<c05627c7>] (dump_stack+0x73/0x8c)
[<c05627c7>] (dump_stack) from [<c0013f85>] (handle_IPI+0x379/0x410)
[<c0013f85>] (handle_IPI) from [<c000932b>] (gic_handle_irq+0x4f/0x50)
[<c000932b>] (gic_handle_irq) from [<c00121ff>] (__irq_svc+0x3f/0x64)
Exception stack(0xc083bf30 to 0xc083bf78)
bf20: 00000001 00000001 00000000 c0841938
bf40: c083a000 c083d5fc c08ba5c8 c083d598 00000000 00000000 c08b938f c056afc8
bf60: 00000008 c083bf78 c005d157 c000f288 40000133 ffffffff
[<c00121ff>] (__irq_svc) from [<c000f288>] (arch_cpu_idle+0x30/0x34)
[<c000f288>] (arch_cpu_idle) from [<c00555d5>] (cpu_startup_entry+0x339/0x404)
[<c00555d5>] (cpu_startup_entry) from [<c07d1a4f>] (start_kernel+0x32f/0x338)
[<c07d1a4f>] (start_kernel) from [<8000807f>] (0x8000807f)

2015-04-23 22:53:37

by Cong Wang

[permalink] [raw]
Subject: Re: Suspicious RCU usage in bridge with Linux v4.0-9362-g1fc149933fd4

On Thu, Apr 23, 2015 at 10:07 AM, Josh Boyer <[email protected]> wrote:
> [ 29.382235] br0: port 1(tap0) entered forwarding state
>
> [ 29.382286] ===============================
> [ 29.382315] [ INFO: suspicious RCU usage. ]
> [ 29.382344] 4.1.0-0.rc0.git11.1.fc23.x86_64 #1 Not tainted
> [ 29.382380] -------------------------------
> [ 29.382409] net/bridge/br_private.h:626 suspicious
> rcu_dereference_check() usage!
> [ 29.382455]
> other info that might help us debug this:
>
> [ 29.382507]
> rcu_scheduler_active = 1, debug_locks = 0
> [ 29.382549] 2 locks held by swapper/0/0:
> [ 29.382576] #0: (((&p->forward_delay_timer))){+.-...}, at:
> [<ffffffff81139f75>] call_timer_fn+0x5/0x4f0
> [ 29.382660] #1: (&(&br->lock)->rlock){+.-...}, at:
> [<ffffffffa0450dc1>] br_forward_delay_timer_expired+0x31/0x140
> [bridge]
> [ 29.382754]
> stack backtrace:
> [ 29.382787] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
> 4.1.0-0.rc0.git11.1.fc23.x86_64 #1
> [ 29.382838] Hardware name: LENOVO 422916G/LENOVO, BIOS A1KT53AUS 04/07/2015
> [ 29.382882] 0000000000000000 3ebfc20364115825 ffff880666603c48
> ffffffff81892d4b
> [ 29.382943] 0000000000000000 ffffffff81e124e0 ffff880666603c78
> ffffffff8110bcd7
> [ 29.383004] ffff8800785c9d00 ffff88065485ac58 ffff880c62002800
> ffff880c5fc88ac0
> [ 29.383065] Call Trace:
> [ 29.383084] <IRQ> [<ffffffff81892d4b>] dump_stack+0x4c/0x65
> [ 29.383130] [<ffffffff8110bcd7>] lockdep_rcu_suspicious+0xe7/0x120
> [ 29.383178] [<ffffffffa04520f9>] br_fill_ifinfo+0x4a9/0x6a0 [bridge]
> [ 29.383225] [<ffffffffa045266b>] br_ifinfo_notify+0x11b/0x4b0 [bridge]
> [ 29.383271] [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
> [ 29.383320] [<ffffffffa0450de8>]
> br_forward_delay_timer_expired+0x58/0x140 [bridge]
> [ 29.383371] [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
> [ 29.383416] [<ffffffff8113a033>] call_timer_fn+0xc3/0x4f0
> [ 29.383454] [<ffffffff81139f75>] ? call_timer_fn+0x5/0x4f0
> [ 29.383493] [<ffffffff8110a90f>] ? lock_release_holdtime.part.29+0xf/0x200
> [ 29.383541] [<ffffffffa0450d90>] ? br_hold_timer_expired+0x70/0x70 [bridge]
> [ 29.383587] [<ffffffff8113a6a4>] run_timer_softirq+0x244/0x490
> [ 29.383629] [<ffffffff810b68cc>] __do_softirq+0xec/0x670
> [ 29.383666] [<ffffffff810b70d5>] irq_exit+0x145/0x150
> [ 29.383703] [<ffffffff8189f506>] smp_apic_timer_interrupt+0x46/0x60
> [ 29.383744] [<ffffffff8189d523>] apic_timer_interrupt+0x73/0x80
> [ 29.383782] <EOI> [<ffffffff816f131f>] ? cpuidle_enter_state+0x5f/0x2f0
> [ 29.383832] [<ffffffff816f131b>] ? cpuidle_enter_state+0x5b/0x2f0
> [ 29.383873] [<ffffffff816f15e7>] cpuidle_enter+0x17/0x20
> [ 29.383908] [<ffffffff81103c6f>] cpu_startup_entry+0x36f/0x5f0
> [ 29.383949] [<ffffffff81887cbd>] rest_init+0x13d/0x150
> [ 29.383986] [<ffffffff821c905b>] start_kernel+0x4d2/0x4f3
> [ 29.384023] [<ffffffff821c8120>] ? early_idt_handlers+0x120/0x120
> [ 29.384064] [<ffffffff821c8339>] x86_64_start_reservations+0x2a/0x2c
> [ 29.384105] [<ffffffff821c8485>] x86_64_start_kernel+0x14a/0x16d

We are dereferencing a RCU pointer with rtnl assert, but in the timer context
we only have br->lock. It looks like we need to take RCU read lock on that path
rather than asserting rtnl lock.

Thanks for the report.