Hi,
We are getting soft lockup OOPs on Cavium CN88XX (A.K.A. ThunderX),
which is an arm64 implementation.
A typical failure shows multiple threads stuck in mutex operations like
this:
.
.
.
[ 68.909873] Task dump for CPU 18:
[ 68.909876] systemd-udevd R running task 0 537 534 0x00000002
[ 68.909877] Call trace:
[ 68.909880] [<fffffe0000088858>] dump_backtrace+0x0/0x17c
[ 68.909883] [<fffffe00000889f8>] show_stack+0x24/0x2c
[ 68.909885] [<fffffe00000c4210>] sched_show_task+0xb0/0x104
[ 68.909888] [<fffffe00000c682c>] dump_cpu_task+0x48/0x54
[ 68.909890] [<fffffe00000ee5e0>] rcu_dump_cpu_stacks+0x9c/0xec
[ 68.909893] [<fffffe00000f2c9c>] rcu_check_callbacks+0x524/0xa18
[ 68.909896] [<fffffe00000f83a0>] update_process_times+0x44/0x74
[ 68.909899] [<fffffe00001078d4>] tick_sched_timer+0x78/0x1ac
[ 68.909901] [<fffffe00000f8b74>] __hrtimer_run_queues+0x148/0x2d4
[ 68.909903] [<fffffe00000f9464>] hrtimer_interrupt+0xb0/0x1f4
[ 68.909906] [<fffffe000056e6e8>] arch_timer_handler_phys+0x3c/0x48
[ 68.909909] [<fffffe00000e7fd4>] handle_percpu_devid_irq+0xb0/0x1b0
[ 68.909912] [<fffffe00000e33c4>] generic_handle_irq+0x34/0x4c
[ 68.909914] [<fffffe00000e3738>] __handle_domain_irq+0x90/0xfc
[ 68.909916] [<fffffe0000081d80>] gic_handle_irq+0x90/0x18c
[ 68.909918] Exception stack(0xfffffe03f14e3920 to 0xfffffe03f14e3a40)
[ 68.909921] 3920: fffffe03fd5c5800 fffffe0000c55800 fffffe03f14e3a80 fffffe00000dabd8
[ 68.909924] 3940: 00000000a0000145 0000000000000015 fffffe03e9602400 fffffe00002fddb0
[ 68.909927] 3960: 0000000000000000 0000000000000000 fffffe03fd5c5810 fffffe03f14e0000
[ 68.909929] 3980: 0000000000000001 ffffffffff000000 fffffe03db307e38 0000000000000000
[ 68.909932] 39a0: 0000000000737973 00000000ffffffff 0000000000000000 000000003b364d50
[ 68.909935] 39c0: 0000000000000018 ffffffffa99641af 0016fd71b6000000 003b9aca00000000
[ 68.909937] 39e0: fffffe00001f1508 000003ff9b9fd028 000003ffed7a0a10 fffffe03fd5c5800
[ 68.909940] 3a00: fffffe0000c55800 fffffe0000cea1c8 fffffe03fd5a5800 fffffe0000ca2eb0
[ 68.909943] 3a20: 0000000000000015 fffffe03e9602400 fffffe0000cea1c8 fffffe0000712000
[ 68.909945] [<fffffe0000084ce8>] el1_irq+0x68/0xd8
[ 68.909948] [<fffffe00000da03c>] mutex_optimistic_spin+0x9c/0x1d0
[ 68.909951] [<fffffe00006fe4b8>] __mutex_lock_slowpath+0x44/0x158
[ 68.909953] [<fffffe00006fe620>] mutex_lock+0x54/0x58
[ 68.909956] [<fffffe0000265efc>] kernfs_iop_permission+0x38/0x70
[ 68.909959] [<fffffe00001fbf50>] __inode_permission+0x88/0xd8
[ 68.909961] [<fffffe00001fbfd0>] inode_permission+0x30/0x6c
[ 68.909964] [<fffffe00001fe26c>] link_path_walk+0x68/0x4d4
[ 68.909966] [<fffffe00001ffa14>] path_openat+0xb4/0x2bc
[ 68.909968] [<fffffe000020123c>] do_filp_open+0x74/0xd0
[ 68.909971] [<fffffe00001f13e4>] do_sys_open+0x14c/0x228
[ 68.909973] [<fffffe00001f1544>] SyS_openat+0x3c/0x48
[ 68.909976] [<fffffe00000851f0>] el0_svc_naked+0x24/0x28
.
.
.
Reverting 81a43adae3b9 ("locking/mutex: Use acquire/release semantics")
makes the problem go away.
At this point it is unknown if this patch is incorrect, or if the
underlying ARM64 atomic_*_{acquire,release} primitives are defective, or
if the problem lies elsewhere.
I am not requesting any specific action with this e-mail, but wanted to
draw attention to the issue. Undoubtedly we will be able to provide
more detailed information about the issue in the coming days.
Thanks,
David Daney
On Thu, Dec 10, 2015 at 11:43:46AM -0800, David Daney wrote:
> We are getting soft lockup OOPs on Cavium CN88XX (A.K.A. ThunderX), which is
> an arm64 implementation.
[...]
> At this point it is unknown if this patch is incorrect, or if the underlying
> ARM64 atomic_*_{acquire,release} primitives are defective, or if the problem
> lies elsewhere.
Are you using the ll/sc or lse versions of the atomics? In the case of
the former, are they inline or out-of-line (this depends on whether or
not you've selected CONFIG_ARM64_LSE_ATOMICS and whether or not you have
toolchain support)?
Will
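(For context, the distinction Will is asking about is between acquire
exchanges built from exclusive load/store loops, the "ll/sc" flavour, and
the single-instruction ARMv8.1 LSE forms. Below is a minimal, illustrative
user-space sketch of the two shapes; the helper names are invented for this
sketch and are not the kernel's actual macros from arch/arm64/include/asm/,
and the LSE variant needs an ARMv8.1 toolchain, e.g. -march=armv8.1-a.)

#include <stdint.h>

/* ll/sc flavour: load-acquire exclusive, store exclusive, retry on failure. */
static inline uint32_t sketch_xchg_acquire_llsc(uint32_t *ptr, uint32_t newval)
{
        uint32_t old, tmp;

        asm volatile(
        "1:     ldaxr   %w0, %2\n"      /* load old value, acquire semantics */
        "       stxr    %w1, %w3, %2\n" /* try to store newval exclusively */
        "       cbnz    %w1, 1b\n"      /* lost the exclusive monitor: retry */
        : "=&r" (old), "=&r" (tmp), "+Q" (*ptr)
        : "r" (newval)
        : "memory");

        return old;
}

/* LSE flavour (ARMv8.1): a single swap-with-acquire, no retry loop. */
static inline uint32_t sketch_xchg_acquire_lse(uint32_t *ptr, uint32_t newval)
{
        uint32_t old;

        asm volatile(
        "       swpa    %w2, %w0, %1\n" /* atomically swap in newval, acquire */
        : "=&r" (old), "+Q" (*ptr)
        : "r" (newval)
        : "memory");

        return old;
}

(Which of the two shapes the kernel actually emits, and whether the ll/sc
loops are inline or out-of-line, is exactly the CONFIG_ARM64_LSE_ATOMICS
plus toolchain-support question above.)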
On Fri, Dec 11, 2015 at 6:18 AM, Davidlohr Bueso wrote:
>
> On Fri, 11 Dec 2015, Will Deacon wrote:
>
>>I think Andrew meant the atomic_xchg_acquire at the start of osq_lock,
>>as opposed to "compare and swap". In which case, it does look like
>>there's a bug here because there is nothing to order the initialisation
>>of the node fields with publishing of the node, whether that's
>>indirectly as a result of setting the tail to the current CPU or
>>directly as a result of the WRITE_ONCE.
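(For context, the pattern being described looks roughly like the fragment
below; it is paraphrased from kernel/locking/osq_lock.c of that era with the
queueing and unqueue paths elided, so read it as a sketch rather than the
exact source.)

bool osq_lock(struct optimistic_spin_queue *lock)
{
        struct optimistic_spin_node *node = this_cpu_ptr(&osq_node);
        struct optimistic_spin_node *prev;
        int curr = encode_cpu(smp_processor_id());
        int old;

        /* Plain stores initialising the per-CPU node... */
        node->locked = 0;
        node->next = NULL;
        node->cpu = curr;

        /*
         * ...then the node is published by swapping the queue tail.  An
         * _acquire exchange only orders later accesses after the swap; it
         * does nothing to order the plain stores above before it.
         */
        old = atomic_xchg_acquire(&lock->tail, curr);
        if (old == OSQ_UNLOCKED_VAL)
                return true;

        prev = decode_cpu(old);
        node->prev = prev;

        /* ...or the node is published directly via this WRITE_ONCE... */
        WRITE_ONCE(prev->next, node);

        /*
         * ...so a CPU that observes the new tail or prev->next may still
         * see an uninitialised node.  Spin/unqueue code elided.
         */
}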
>
> Sorry I'm late to the party.
>
> Duh, yes, this is obviously bogus. Worse, I recall triggering a similar
> tail initialization issue in osq_lock during some experimental work on
> x86, so this is very much a real point of failure. Ack.
>
>>
>>Andrew, David: does making that atomic_xchg_acquire an atomic_xchg fix
>>things for you?
Yes, that works for me. And yes, that looks like the correct fix.
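(Concretely, the change being tested is the one-liner below in osq_lock();
this is paraphrased and the eventual patch may differ. The fully ordered
exchange keeps the ACQUIRE semantics and additionally provides the
release-style ordering needed to publish the node fields initialised just
above it.)

-       old = atomic_xchg_acquire(&lock->tail, curr);
+       old = atomic_xchg(&lock->tail, curr);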
>>
>>I don't fully grok what 81a43adae3b9 has to do with any of this, so
>>maybe there's another bug too.
>
> I think this is mainly because mutex_optimistic_spin is where the stack shows the lockup, which really translates to c55a6ffa62.
Yes, since mutex_optimistic_spin calls into osq_lock/osq_unlock. And
81a43adae3b9 changed mutex.c, which is where David thought the issue
was, rather than in what mutex_optimistic_spin calls.
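(For the connection between the two commits: the backtrace bottoms out in
the mutex slowpath, whose optimistic-spinning step takes the per-mutex OSQ
before spinning on the owner. A heavily abridged shape, not the exact
kernel/locking/mutex.c source:)

static bool mutex_optimistic_spin(struct mutex *lock /* , ... */)
{
        /*
         * Queue on the mutex's optimistic-spin queue: this is the
         * osq_lock() whose node publication is at issue, even though
         * 81a43adae3b9 itself only touched mutex.c.
         */
        if (!osq_lock(&lock->osq))
                return false;

        /* ...spin on the current owner / try to acquire the mutex... */

        osq_unlock(&lock->osq);
        return true;
}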
Thanks,
Andrew Pinski
>
> Thanks,
> Davidlohr